Outer join returns incorrect results when using sub-selects

Collapse
This topic is closed.
X
X
 
  • Time
  • Show
Clear All
new posts
  • Chris Greening

    Outer join returns incorrect results when using sub-selects

    I'm seeing a very strange problem with outer joins.

    The example below replicates the problem:

    create table data1 (dim1 integer, stat1 float);
    create table data2 (dim1 integer, stat2 float);

    insert into data1 values (1,1);
    insert into data1 values (1,2);
    insert into data1 values (2,2);
    insert into data1 values (2,2);
    insert into data1 values (3,3);
    insert into data1 values (6,3);
    insert into data1 values (7,3);

    insert into data2 values (2,1);
    insert into data2 values (2,2);
    insert into data2 values (3,2);
    insert into data2 values (4,3);
    insert into data2 values (5,3);

    I want to be able to run a query that generates the resultset:

    dim1, sum(stat1), sum(stat2)
    1,3,NULL
    2,4,3
    3,3,2
    4,NULL,3
    5,NULL,3
    6,3,NULL
    7,3,NULL

    I was planning to use the following sql:

    select nvl(s1.dim1, s2.dim1) dim1, stat1, stat2 from
    (
    select dim1, sum(stat1) stat1 from data1 group by dim1
    ) s1
    full outer join
    (
    select dim1, sum(stat2) stat2 from data2 group by dim1
    ) s2 on s1.dim1=s2.dim1
    order by dim1;

    When I run this. Instead of the 4 rows I was expecting I get back:

    dim1,stat1,stat 2
    1,3,NULL
    2,4,3
    3,3,2
    4,NULL,NULL
    5,NULL,NULL
    6,3,NULL
    7,3,NULL

    The values for stat2 seem to be getting lost when there is no value for stat1.

    Anyone have any ideas as to what is going on? Am I doing something stupid?
  • Ed prochak

    #2
    Re: Outer join returns incorrect results when using sub-selects

    chris.greening@ actix.com (Chris Greening) wrote in message news:<57c2930a. 0309260545.3eef 1732@posting.go ogle.com>...[color=blue]
    > I'm seeing a very strange problem with outer joins.
    >
    > The example below replicates the problem:
    >
    > create table data1 (dim1 integer, stat1 float);
    > create table data2 (dim1 integer, stat2 float);
    >
    > insert into data1 values (1,1);
    > insert into data1 values (1,2);
    > insert into data1 values (2,2);
    > insert into data1 values (2,2);
    > insert into data1 values (3,3);
    > insert into data1 values (6,3);
    > insert into data1 values (7,3);
    >
    > insert into data2 values (2,1);
    > insert into data2 values (2,2);
    > insert into data2 values (3,2);
    > insert into data2 values (4,3);
    > insert into data2 values (5,3);
    >
    > I want to be able to run a query that generates the resultset:
    >
    > dim1, sum(stat1), sum(stat2)
    > 1,3,NULL
    > 2,4,3
    > 3,3,2
    > 4,NULL,3
    > 5,NULL,3
    > 6,3,NULL
    > 7,3,NULL
    >
    > I was planning to use the following sql:
    >
    > select nvl(s1.dim1, s2.dim1) dim1, stat1, stat2 from
    > (
    > select dim1, sum(stat1) stat1 from data1 group by dim1
    > ) s1
    > full outer join
    > (
    > select dim1, sum(stat2) stat2 from data2 group by dim1
    > ) s2 on s1.dim1=s2.dim1
    > order by dim1;
    >
    > When I run this. Instead of the 4 rows I was expecting I get back:
    >
    > dim1,stat1,stat 2
    > 1,3,NULL
    > 2,4,3
    > 3,3,2
    > 4,NULL,NULL
    > 5,NULL,NULL
    > 6,3,NULL
    > 7,3,NULL
    >
    > The values for stat2 seem to be getting lost when there is no value for stat1.
    >
    > Anyone have any ideas as to what is going on? Am I doing something stupid?[/color]

    I don't see the problem, but I did notice the nvl() function. A NULL
    shouldn't occur in the join column (at least not with the data you
    presented) so you should just list the key:

    select s1.dim1, stat1, stat2 from

    I doubt that would fix the other columns, but this is going to be one
    of those "intuitive obvious" solutions, i.e., once you know the
    answer, you realize it was "intuitive obvious" 8^)

    Comment

    • Alex Filonov

      #3
      Re: Outer join returns incorrect results when using sub-selects

      chris.greening@ actix.com (Chris Greening) wrote in message news:<57c2930a. 0309260545.3eef 1732@posting.go ogle.com>...[color=blue]
      > I'm seeing a very strange problem with outer joins.
      >
      > The example below replicates the problem:
      >
      > create table data1 (dim1 integer, stat1 float);
      > create table data2 (dim1 integer, stat2 float);
      >
      > insert into data1 values (1,1);
      > insert into data1 values (1,2);
      > insert into data1 values (2,2);
      > insert into data1 values (2,2);
      > insert into data1 values (3,3);
      > insert into data1 values (6,3);
      > insert into data1 values (7,3);
      >
      > insert into data2 values (2,1);
      > insert into data2 values (2,2);
      > insert into data2 values (3,2);
      > insert into data2 values (4,3);
      > insert into data2 values (5,3);
      >
      > I want to be able to run a query that generates the resultset:
      >
      > dim1, sum(stat1), sum(stat2)
      > 1,3,NULL
      > 2,4,3
      > 3,3,2
      > 4,NULL,3
      > 5,NULL,3
      > 6,3,NULL
      > 7,3,NULL
      >
      > I was planning to use the following sql:[/color]
      [color=blue]
      >
      > select nvl(s1.dim1, s2.dim1) dim1, stat1, stat2 from
      > (
      > select dim1, sum(stat1) stat1 from data1 group by dim1
      > ) s1
      > full outer join
      > (
      > select dim1, sum(stat2) stat2 from data2 group by dim1
      > ) s2 on s1.dim1=s2.dim1
      > order by dim1;
      >
      > When I run this. Instead of the 4 rows I was expecting I get back:
      >
      > dim1,stat1,stat 2
      > 1,3,NULL
      > 2,4,3
      > 3,3,2
      > 4,NULL,NULL
      > 5,NULL,NULL
      > 6,3,NULL
      > 7,3,NULL
      >
      > The values for stat2 seem to be getting lost when there is no value for stat1.
      >
      > Anyone have any ideas as to what is going on? Am I doing something stupid?[/color]

      What's RDBMS version? I had similar problem with 8.1.6, which disappeared
      after upgrade to 8.1.7. There was a workaround too, forcing hash outer
      join fixed the query.

      Comment

      • Database Guy

        #4
        Re: Outer join returns incorrect results when using sub-selects

        chris.greening@ actix.com (Chris Greening) wrote in message news:<57c2930a. 0309260545.3eef 1732@posting.go ogle.com>...[color=blue]
        > I'm seeing a very strange problem with outer joins.
        >
        > The example below replicates the problem:
        >
        > create table data1 (dim1 integer, stat1 float);
        > create table data2 (dim1 integer, stat2 float);
        >
        > insert into data1 values (1,1);
        > insert into data1 values (1,2);
        > insert into data1 values (2,2);
        > insert into data1 values (2,2);
        > insert into data1 values (3,3);
        > insert into data1 values (6,3);
        > insert into data1 values (7,3);
        >
        > insert into data2 values (2,1);
        > insert into data2 values (2,2);
        > insert into data2 values (3,2);
        > insert into data2 values (4,3);
        > insert into data2 values (5,3);
        >
        > I want to be able to run a query that generates the resultset:
        >
        > dim1, sum(stat1), sum(stat2)
        > 1,3,NULL
        > 2,4,3
        > 3,3,2
        > 4,NULL,3
        > 5,NULL,3
        > 6,3,NULL
        > 7,3,NULL
        >
        > I was planning to use the following sql:
        >
        > select nvl(s1.dim1, s2.dim1) dim1, stat1, stat2 from
        > (
        > select dim1, sum(stat1) stat1 from data1 group by dim1
        > ) s1
        > full outer join
        > (
        > select dim1, sum(stat2) stat2 from data2 group by dim1
        > ) s2 on s1.dim1=s2.dim1
        > order by dim1;
        >
        > When I run this. Instead of the 4 rows I was expecting I get back:
        >
        > dim1,stat1,stat 2
        > 1,3,NULL
        > 2,4,3
        > 3,3,2
        > 4,NULL,NULL
        > 5,NULL,NULL
        > 6,3,NULL
        > 7,3,NULL
        >
        > The values for stat2 seem to be getting lost when there is no value for stat1.
        >
        > Anyone have any ideas as to what is going on? Am I doing something stupid?[/color]

        No, if you are getting that result with that SQL then you are correct
        and Oracle has a serious bug. Which version?


        DG

        Comment

        • Lori

          #5
          Re: Outer join returns incorrect results when using sub-selects

          select dim1,sum(stat1) stat1,sum(stat2 ) stat2
          from (select distinct dim1, stat1, stat2 from
          data1 full outer join data2 using (dim1)
          order by dim1)
          group by dim1;

          This way will work.

          Lori

          chris.greening@ actix.com (Chris Greening) wrote in message news:<57c2930a. 0309260545.3eef 1732@posting.go ogle.com>...[color=blue]
          > I'm seeing a very strange problem with outer joins.
          >
          > The example below replicates the problem:
          >
          > create table data1 (dim1 integer, stat1 float);
          > create table data2 (dim1 integer, stat2 float);
          >
          > insert into data1 values (1,1);
          > insert into data1 values (1,2);
          > insert into data1 values (2,2);
          > insert into data1 values (2,2);
          > insert into data1 values (3,3);
          > insert into data1 values (6,3);
          > insert into data1 values (7,3);
          >
          > insert into data2 values (2,1);
          > insert into data2 values (2,2);
          > insert into data2 values (3,2);
          > insert into data2 values (4,3);
          > insert into data2 values (5,3);
          >
          > I want to be able to run a query that generates the resultset:
          >
          > dim1, sum(stat1), sum(stat2)
          > 1,3,NULL
          > 2,4,3
          > 3,3,2
          > 4,NULL,3
          > 5,NULL,3
          > 6,3,NULL
          > 7,3,NULL
          >
          > I was planning to use the following sql:
          >
          > select nvl(s1.dim1, s2.dim1) dim1, stat1, stat2 from
          > (
          > select dim1, sum(stat1) stat1 from data1 group by dim1
          > ) s1
          > full outer join
          > (
          > select dim1, sum(stat2) stat2 from data2 group by dim1
          > ) s2 on s1.dim1=s2.dim1
          > order by dim1;
          >
          > When I run this. Instead of the 4 rows I was expecting I get back:
          >
          > dim1,stat1,stat 2
          > 1,3,NULL
          > 2,4,3
          > 3,3,2
          > 4,NULL,NULL
          > 5,NULL,NULL
          > 6,3,NULL
          > 7,3,NULL
          >
          > The values for stat2 seem to be getting lost when there is no value for stat1.
          >
          > Anyone have any ideas as to what is going on? Am I doing something stupid?[/color]

          Comment

          • Chris Greening

            #6
            Re: Outer join returns incorrect results when using sub-selects

            Hi Ed/Alex, thanks for replying

            Ed, here's a simpler example that gives the same problem:

            create table test1 (dim1 integer, stat1 float);
            create table test2 (dim1 integer, stat2 float);

            insert into test1 values (1, 1);
            insert into test2 values (2, 1);
            commit;

            select * from
            (select dim1, sum(stat1) from test1 group by dim1) s1 full outer join
            (select dim1, sum(stat2) from test2 group by dim1) s2 on s1.dim1=s2.dim1 ;

            DIM1 SUM(STAT1) DIM1 SUM(STAT2)
            ---------- ---------- ---------- ----------
            1 1
            2

            As you can see the SUM(STAT2) column is always null.

            If I remove the group by (and the sum) from the subselect I get the correct results:

            select * from
            (select dim1, stat1 from test1) s1 full outer join
            (select dim1, stat2 from test2) s2 on s1.dim1=s2.dim1 ;

            DIM1 STAT1 DIM1 STAT2
            ---------- ---------- ---------- ----------
            1 1
            2 1

            Unfortunately we need the aggregation in our subselects.

            Alex, we are using:

            Oracle9i Enterprise Edition Release 9.2.0.1.0 - Production
            With the Partitioning, OLAP and Oracle Data Mining options
            JServer Release 9.2.0.1.0 - Production

            Thanks
            Chris.

            afilonov@yahoo. com (Alex Filonov) wrote in message news:<336da121. 0309261059.60a3 2c36@posting.go ogle.com>...[color=blue]
            > chris.greening@ actix.com (Chris Greening) wrote in message news:<57c2930a. 0309260545.3eef 1732@posting.go ogle.com>...[color=green]
            > > I'm seeing a very strange problem with outer joins.
            > >
            > > The example below replicates the problem:
            > >
            > > create table data1 (dim1 integer, stat1 float);
            > > create table data2 (dim1 integer, stat2 float);
            > >
            > > insert into data1 values (1,1);
            > > insert into data1 values (1,2);
            > > insert into data1 values (2,2);
            > > insert into data1 values (2,2);[/color][/color]
            ....

            Comment

            • Lori

              #7
              Re: Outer join returns incorrect results when using sub-selects

              Try this :

              select dim1,sum(stat1) stat1,sum(stat2 ) stat2
              from (select distinct dim1, stat1, stat2 from
              data1 full outer join data2 using (dim1)
              order by dim1)
              group by dim1;


              Seems to give the proper output with no nvl surprises.

              Lori

              Comment

              • Alex Filonov

                #8
                Re: Outer join returns incorrect results when using sub-selects

                chris.greening@ actix.com (Chris Greening) wrote in message news:<57c2930a. 0309290033.3978 c229@posting.go ogle.com>...[color=blue]
                > Hi Ed/Alex, thanks for replying
                >
                > Ed, here's a simpler example that gives the same problem:
                >
                > create table test1 (dim1 integer, stat1 float);
                > create table test2 (dim1 integer, stat2 float);
                >
                > insert into test1 values (1, 1);
                > insert into test2 values (2, 1);
                > commit;
                >
                > select * from
                > (select dim1, sum(stat1) from test1 group by dim1) s1 full outer join
                > (select dim1, sum(stat2) from test2 group by dim1) s2 on s1.dim1=s2.dim1 ;
                >
                > DIM1 SUM(STAT1) DIM1 SUM(STAT2)
                > ---------- ---------- ---------- ----------
                > 1 1
                > 2
                >
                > As you can see the SUM(STAT2) column is always null.
                >
                > If I remove the group by (and the sum) from the subselect I get the correct results:
                >[/color]

                Looks like problem was fixed in 9.2.0.3:

                SQL> create table test1 (dim1 integer, stat1 float);
                create table test2 (dim1 integer, stat2 float);

                insert into test1 values (1, 1);
                insert into test2 values (2, 1);
                commit;

                select * from
                (select dim1, sum(stat1) from test1 group by dim1) s1 full outer join
                (select dim1, sum(stat2) from test2 group by dim1) s2 on s1.dim1=s2.dim1 ;


                Table created.

                SQL>
                Table created.

                SQL> SQL>
                1 row created.

                SQL>
                1 row created.

                SQL>
                Commit complete.

                SQL> SQL> 2 3
                DIM1 SUM(STAT1) DIM1 SUM(STAT2)
                ---------- ---------- ---------- ----------
                1 1
                2 1

                You need to upgrade your server.
                [color=blue]
                > select * from
                > (select dim1, stat1 from test1) s1 full outer join
                > (select dim1, stat2 from test2) s2 on s1.dim1=s2.dim1 ;
                >
                > DIM1 STAT1 DIM1 STAT2
                > ---------- ---------- ---------- ----------
                > 1 1
                > 2 1
                >
                > Unfortunately we need the aggregation in our subselects.
                >
                > Alex, we are using:
                >
                > Oracle9i Enterprise Edition Release 9.2.0.1.0 - Production
                > With the Partitioning, OLAP and Oracle Data Mining options
                > JServer Release 9.2.0.1.0 - Production
                >
                > Thanks
                > Chris.
                >
                > afilonov@yahoo. com (Alex Filonov) wrote in message news:<336da121. 0309261059.60a3 2c36@posting.go ogle.com>...[color=green]
                > > chris.greening@ actix.com (Chris Greening) wrote in message news:<57c2930a. 0309260545.3eef 1732@posting.go ogle.com>...[color=darkred]
                > > > I'm seeing a very strange problem with outer joins.
                > > >
                > > > The example below replicates the problem:
                > > >
                > > > create table data1 (dim1 integer, stat1 float);
                > > > create table data2 (dim1 integer, stat2 float);
                > > >
                > > > insert into data1 values (1,1);
                > > > insert into data1 values (1,2);
                > > > insert into data1 values (2,2);
                > > > insert into data1 values (2,2);[/color][/color]
                > ...[/color]

                Comment

                • Chris Greening

                  #9
                  Re: Outer join returns incorrect results when using sub-selects

                  Hi Ed/Alex, thanks for replying

                  Ed, here's a simpler example that gives the same problem:

                  create table test1 (dim1 integer, stat1 float);
                  create table test2 (dim1 integer, stat2 float);

                  insert into test1 values (1, 1);
                  insert into test2 values (2, 1);
                  commit;

                  select * from
                  (select dim1, sum(stat1) from test1 group by dim1) s1 full outer join
                  (select dim1, sum(stat2) from test2 group by dim1) s2 on s1.dim1=s2.dim1 ;

                  DIM1 SUM(STAT1) DIM1 SUM(STAT2)
                  ---------- ---------- ---------- ----------
                  1 1
                  2

                  As you can see the SUM(STAT2) column is always null.

                  If I remove the group by (and the sum) from the subselect I get the correct results:

                  select * from
                  (select dim1, stat1 from test1) s1 full outer join
                  (select dim1, stat2 from test2) s2 on s1.dim1=s2.dim1 ;

                  DIM1 STAT1 DIM1 STAT2
                  ---------- ---------- ---------- ----------
                  1 1
                  2 1

                  Unfortunately we need the aggregation in our subselects.

                  Alex, we are using:

                  Oracle9i Enterprise Edition Release 9.2.0.1.0 - Production
                  With the Partitioning, OLAP and Oracle Data Mining options
                  JServer Release 9.2.0.1.0 - Production

                  Thanks
                  Chris.

                  afilonov@yahoo. com (Alex Filonov) wrote in message news:<336da121. 0309261059.60a3 2c36@posting.go ogle.com>...
                  chris.greening@ actix.com (Chris Greening) wrote in message news:<57c2930a. 0309260545.3eef 1732@posting.go ogle.com>...
                  I'm seeing a very strange problem with outer joins.

                  The example below replicates the problem:

                  create table data1 (dim1 integer, stat1 float);
                  create table data2 (dim1 integer, stat2 float);

                  insert into data1 values (1,1);
                  insert into data1 values (1,2);
                  insert into data1 values (2,2);
                  insert into data1 values (2,2);
                  ....

                  Comment

                  • Alex Filonov

                    #10
                    Re: Outer join returns incorrect results when using sub-selects

                    chris.greening@ actix.com (Chris Greening) wrote in message news:<57c2930a. 0309290033.3978 c229@posting.go ogle.com>...
                    Hi Ed/Alex, thanks for replying
                    >
                    Ed, here's a simpler example that gives the same problem:
                    >
                    create table test1 (dim1 integer, stat1 float);
                    create table test2 (dim1 integer, stat2 float);
                    >
                    insert into test1 values (1, 1);
                    insert into test2 values (2, 1);
                    commit;
                    >
                    select * from
                    (select dim1, sum(stat1) from test1 group by dim1) s1 full outer join
                    (select dim1, sum(stat2) from test2 group by dim1) s2 on s1.dim1=s2.dim1 ;
                    >
                    DIM1 SUM(STAT1) DIM1 SUM(STAT2)
                    ---------- ---------- ---------- ----------
                    1 1
                    2
                    >
                    As you can see the SUM(STAT2) column is always null.
                    >
                    If I remove the group by (and the sum) from the subselect I get the correct results:
                    >
                    Looks like problem was fixed in 9.2.0.3:

                    SQLcreate table test1 (dim1 integer, stat1 float);
                    create table test2 (dim1 integer, stat2 float);

                    insert into test1 values (1, 1);
                    insert into test2 values (2, 1);
                    commit;

                    select * from
                    (select dim1, sum(stat1) from test1 group by dim1) s1 full outer join
                    (select dim1, sum(stat2) from test2 group by dim1) s2 on s1.dim1=s2.dim1 ;


                    Table created.

                    SQL>
                    Table created.

                    SQLSQL>
                    1 row created.

                    SQL>
                    1 row created.

                    SQL>
                    Commit complete.

                    SQLSQL 2 3
                    DIM1 SUM(STAT1) DIM1 SUM(STAT2)
                    ---------- ---------- ---------- ----------
                    1 1
                    2 1

                    You need to upgrade your server.
                    select * from
                    (select dim1, stat1 from test1) s1 full outer join
                    (select dim1, stat2 from test2) s2 on s1.dim1=s2.dim1 ;
                    >
                    DIM1 STAT1 DIM1 STAT2
                    ---------- ---------- ---------- ----------
                    1 1
                    2 1
                    >
                    Unfortunately we need the aggregation in our subselects.
                    >
                    Alex, we are using:
                    >
                    Oracle9i Enterprise Edition Release 9.2.0.1.0 - Production
                    With the Partitioning, OLAP and Oracle Data Mining options
                    JServer Release 9.2.0.1.0 - Production
                    >
                    Thanks
                    Chris.
                    >
                    afilonov@yahoo. com (Alex Filonov) wrote in message news:<336da121. 0309261059.60a3 2c36@posting.go ogle.com>...
                    chris.greening@ actix.com (Chris Greening) wrote in message news:<57c2930a. 0309260545.3eef 1732@posting.go ogle.com>...
                    I'm seeing a very strange problem with outer joins.
                    >
                    The example below replicates the problem:
                    >
                    create table data1 (dim1 integer, stat1 float);
                    create table data2 (dim1 integer, stat2 float);
                    >
                    insert into data1 values (1,1);
                    insert into data1 values (1,2);
                    insert into data1 values (2,2);
                    insert into data1 values (2,2);
                    ...

                    Comment

                    Working...