Improving a SQL statement

Collapse
X
 
  • Time
  • Show
Clear All
new posts
  • wellhole
    New Member
    • Jan 2008
    • 14

    Improving a SQL statement

    Is there any way to improve this statement? A is the header table while B has 0 to many records for each record in A. I'm trying to pick only the latest record in B for the associated seq in A.

    Code:
    select * from A left outer join B on B.seq = A.seq
    and B.order = (select max(order) from B where B.seq = A.seq)
  • lowpaydba
    New Member
    • Oct 2009
    • 5

    #2
    Hi,

    I'm a bit surprised that this query is working in its current form. When I tried to run it i got the following:
    -----------------------------------------------------------------------------------------------
    SQL0338N An ON clause associated with a JOIN operator or in a MERGE statement is not valid.

    Explanation:

    An ON clause associated with a JOIN operator or in a MERGE
    statement is not valid for one of the following reasons.

    o The ON clause cannot include any subqueries.
    o Column references in an ON clause must only reference columns
    of tables that are in the scope of the ON clause.
    o Scalar fullselects are not allowed in the expressions of an
    ON clause.
    o A function referenced in an ON clause of a full outer join
    must be deterministic and have no external action.
    o A dereference operation (->) cannot be used.
    o A SQL function or SQL method cannot be used.
    o The ON clause cannot include an XMLQUERY or XMLEXISTS
    expression.
    ---------------------------------------------------------------------------------------

    1. Are you sure this is a DB2 query? if not then this might be in the wrong place.

    2. Assuming your tables are like that below.

    Code:
    A                                        B                         
    Seq         Order                      Seq    order
    1               1                      1         1
    2               2                      1         2 
                                           1         3
                                           1         4
                                           2         1 
                                           2         2
                                           2         3
                                           2         4 
    
    And the desired output is
    
    Seq         Order
    1                4
    2                4
    
    try..
    -------------------------------------------
    select  r2.seq, r2.ord from 
    (
          select r1.seq, r1.ord  from
          (
    	select distinct b.seq, max(b.ord)  from b   
                       group by b.seq
           ) r1
    left outer join a on r1."ord" = a."ord" 
    )r2;
    -------------------------------------------
    The advantage here is that you are not doing subselects for each row satisfying the join condition. Subselects are expensive their use should be minimized where possible.

    Comment

    • wellhole
      New Member
      • Jan 2008
      • 14

      #3
      I tried something similar, but it runs slower than the original. I'm guessing its because these temporary tables don't use indexes. The query has to return the rows back from both tables, and TABLEA has to be on the left of the outer join which was why I couldn't just use what you wrote above.
      Code:
      select *
      from TABLEA
      left outer join (select * from TABLEB r2                  
                    left outer join (select ceseq mseq, max(ceeord) mord 
                    from TABLEB group by ceseq) r1              
                    on r2.ceseq = r1.mseq and r2.ceeord = mord           
                                  ) r3 on cmseq = ceseq and ceeord = mord

      Comment

      • lowpaydba
        New Member
        • Oct 2009
        • 5

        #4
        Are you using the DB2 Control Center to run this query? If so can you do a visual explain and attach that to the thread, that should show which indexes are used by the db2 optimizer as well as costs of doing joins, sorts etc to help find ways to optimize this query.

        If you do not have access to the DB2 Control Center you can use the command line dynexpln or db2exfmt tools to gather the same information.

        Comment

        • wellhole
          New Member
          • Jan 2008
          • 14

          #5
          There is no control center and those commands don't work on the iSeries and as/400.

          Comment

          • lowpaydba
            New Member
            • Oct 2009
            • 5

            #6
            Info on gathering explain data for AS/400:

            Comment

            • cburnett
              New Member
              • Aug 2007
              • 57

              #7
              In general, I have found that the OLAP functions perform well in this regard. You might want to try something like the following sample code:

              Code:
              create table A(seq int, id char(1));
              insert into A values (1,'A');
              insert into A values (2,'B');
              insert into A values (3,'C');
              
              create table B(seq int, order int);
              insert into B values (1,1);
              insert into B values (1,2);
              insert into B values (1,3);
              insert into B values (1,4);
              insert into B values (2,2);
              insert into B values (2,3);
              insert into B values (2,4);
              insert into B values (2,5);
              
              select distinct A.seq, A.id, max(B.order) over(partition by B.seq) as max_order
                from A
                left outer join
                     B
                  on A.seq = B.seq;

              Comment

              Working...