Performance Complex SQL Issue

Collapse
This topic is closed.
X
X
 
  • Time
  • Show
Clear All
new posts
  • Tim Smith

    Performance Complex SQL Issue

    Hi,
    I have a table ORDER_DETAIL with 22 million rows which has an index
    of
    (person_id, code_id, created_dtt)

    I have another ORDER table with 5 million rows which has an index
    of
    (order_dtt, person_id)

    I have a small CODES table with 1000 rows which allows me to get
    the 50 or so codes I need. My query needs to be something like this:

    select od.person_id, od.code_id
    from order_detail od, order o, codes c
    where o.order_dtt between sysdate-365 and sysdate
    and o.person_id = od.person_id
    and od.code_id in (select code_id from codes where code_type =
    'MYCODE')
    and od.create_dtt between sysdate-365 and sysdate

    But is this using the full index on the ORDER_DETAIL table? Should I
    be using EXISTS in some fashion instead?

    Accessing the ORDER_DETAIL table is a pain because it is so large, the
    code_id's I need are a relatively small number but the date range is
    about 25% of the table - same with the ORDER table. However it is the
    only way I can get to filtering down to the code_id.

    This is a simplification of the problem - but accurate - adding
    additional indexes is not an option.

    thanks!

    Tim
  • Erland Sommarskog

    #2
    Re: Performance Complex SQL Issue

    [posted and mailed, please reply in news]

    Tim Smith (timasmith@hotm ail.com) writes:[color=blue]
    > I have a table ORDER_DETAIL with 22 million rows which has an index of
    > (person_id, code_id, created_dtt)
    >
    > I have another ORDER table with 5 million rows which has an index of
    > (order_dtt, person_id)
    >
    > I have a small CODES table with 1000 rows which allows me to get
    > the 50 or so codes I need. My query needs to be something like this:
    >
    > select od.person_id, od.code_id
    > from order_detail od, order o, codes c
    > where o.order_dtt between sysdate-365 and sysdate
    > and o.person_id = od.person_id
    > and od.code_id in (select code_id from codes where code_type =
    > 'MYCODE')
    > and od.create_dtt between sysdate-365 and sysdate
    >
    > But is this using the full index on the ORDER_DETAIL table? Should I
    > be using EXISTS in some fashion instead?[/color]

    First, what is sysdate supposed to be? I ask because there is nothing
    called sysdate in MS SQL Server. While this forum is for MS SQL Server,
    it has happened before, that people have asked questions that have
    applied to other database engines. While some SQL questions are fairly
    generic, performance questions are often engine specific, because
    different DB engines uses difference strategies.

    The query as written contains a superfluous occurance of codes in
    the FROM clause. This could lead to a cartesian join between codes
    and the rest of the result set.

    The simplest way to involve codes in the query would be:

    select od.person_id, od.code_id
    from order_detail od, order o, codes c
    where o.order_dtt between sysdate-365 and sysdate
    and o.person_id = od.person_id
    and od.code_id = c.code_id
    and c.code_type = 'MYCODE'
    and od.create_dtt between sysdate-365 and sysdate

    Now, assuming that you are using MS SQL Server 2000, I think you would get
    better performance without involving the orders table at all. Had you been
    able to narrow down the range to say 1% of the table with the condition on
    orders.order_dt t, it would be another issue.

    To join orders and order_details, there are three different join
    strategies that SQL Server can use Nested Loops, Merge Join and
    Hash Join. Nested Loops, means that for each matching rows in
    Orders, look up a row in Order Details. When you look up 25% of
    the rows, you will access several data pages more than once, and
    you will get more reads than for a plain table scan on Order Details.

    Merge Join and Hash Join both involves scanning the tables, but only
    doing it once. I don't know if a merge join is possible in this case.

    Of course, here I am discussing the example query you posted. The
    actual query you have may be different.

    --
    Erland Sommarskog, SQL Server MVP, sommar@algonet. se

    Books Online for SQL Server SP3 at
    SQL Server 2025 redefines what's possible for enterprise data. With developer-first features and integration with analytics and AI models, SQL Server 2025 accelerates AI innovation using the data you already have.

    Comment

    Working...