Looking for query plan determination information

Collapse
This topic is closed.
X
X
 
  • Time
  • Show
Clear All
new posts
  • Thomas R. Hummel

    Looking for query plan determination information

    Hi,

    I was just helping a coworker optimize a query. He had two versions:
    one which used UNION for each value for which he was tallying results
    and another query which used GROUP BY. Here is an aproximation of what
    they were:

    Query #1:
    ---------
    SELECT 12 AS [Row],
    ISNULL(SUM(CASE WHEN T.my_date BETWEEN @week_start_dat e AND
    DATEADD(d, 1, @week_start_dat e) THEN 1 ELSE 0 END), 0) AS [Monday],
    ISNULL(SUM(CASE WHEN T.my_date BETWEEN DATEADD(d, 1,
    @week_start_dat e) AND DATEADD(d, 2, @week_start_dat e) THEN 1 ELSE 0
    END), 0) AS [Tuesday]
    FROM My_Table T
    INNER JOIN Another_Table T2 ON T2.col1 = T.col1
    WHERE T.my_date BETWEEN @week_start_dat e AND @week_end_date
    AND T.col2 = 5
    UNION
    SELECT 13 AS [Row],
    ISNULL(SUM(CASE WHEN T.my_date BETWEEN @week_start_dat e AND
    DATEADD(d, 1, @week_start_dat e) THEN 1 ELSE 0 END), 0) AS [Monday],
    ISNULL(SUM(CASE WHEN T.my_date BETWEEN DATEADD(d, 1,
    @week_start_dat e) AND DATEADD(d, 2, @week_start_dat e) THEN 1 ELSE 0
    END), 0) AS [Tuesday]
    FROM My_Table T
    INNER JOIN Another_Table T2 ON T2.col1 = T.col1
    WHERE T.my_date BETWEEN @week_start_dat e AND @week_end_date
    AND T.col2 = 6

    Query #2:
    ---------
    SELECT R.row_num AS [Row],
    ISNULL(SUM(CASE WHEN T.my_date BETWEEN @week_start_dat e AND
    DATEADD(d, 1, @week_start_dat e) THEN 1 ELSE 0 END), 0) AS [Monday],
    ISNULL(SUM(CASE WHEN T.my_date BETWEEN DATEADD(d, 1,
    @week_start_dat e) AND DATEADD(d, 2, @week_start_dat e) THEN 1 ELSE 0
    END), 0) AS [Tuesday]
    FROM My_Table T
    INNER JOIN Another_Table T2 ON T2.col1 = T.col1
    INNER JOIN Report_Rows R ON R.col2 = T.col2
    WHERE T.my_date BETWEEN @week_start_dat e AND @week_end_date
    GROUP BY ALL R.row_num
    ORDER BY R.row_num

    The Report_Rows table in this case would have had two rows mapping row
    12 to a column value of 5 and row 13 to a column value of 6. The
    second query was performing horribly until I noticed the ALL keyword
    in the GROUP BY, which I didn't think was necessary. When I removed
    that it performed more like I expected it to perform.

    Before I had noticed that I was scouring over the query plans and
    couldn't figure out why in one instance the query optimizer chose to
    join My_Table and Another_Table, yet when the ALL keyword was there it
    chose to return all of the records from Another_Table (a rather large
    table) and join it to the Report_Rows table before then joining to
    My_Table, which had the date criteria in the WHERE clause.

    So, if you've read this far without giving up...

    1. Why would the ALL keyword cause this? I understand the
    functionality of ALL, but I still don't see why that caused the
    reordering of the joins.

    2. (more importantly) Are there any good resources that you know of
    that explain how the query optimizer choices its query paths? Do the
    "Inside SQL Server" books go into that much detail? Any good online
    resources?

    Thanks!
    -Tom.
  • Simon Hayes

    #2
    Re: Looking for query plan determination information


    "Thomas R. Hummel" <tom_hummel@hot mail.com> wrote in message
    news:a2c0eeb8.0 309160626.90117 7@posting.googl e.com...[color=blue]
    > Hi,
    >
    > I was just helping a coworker optimize a query. He had two versions:
    > one which used UNION for each value for which he was tallying results
    > and another query which used GROUP BY. Here is an aproximation of what
    > they were:
    >
    > Query #1:
    > ---------
    > SELECT 12 AS [Row],
    > ISNULL(SUM(CASE WHEN T.my_date BETWEEN @week_start_dat e AND
    > DATEADD(d, 1, @week_start_dat e) THEN 1 ELSE 0 END), 0) AS [Monday],
    > ISNULL(SUM(CASE WHEN T.my_date BETWEEN DATEADD(d, 1,
    > @week_start_dat e) AND DATEADD(d, 2, @week_start_dat e) THEN 1 ELSE 0
    > END), 0) AS [Tuesday]
    > FROM My_Table T
    > INNER JOIN Another_Table T2 ON T2.col1 = T.col1
    > WHERE T.my_date BETWEEN @week_start_dat e AND @week_end_date
    > AND T.col2 = 5
    > UNION
    > SELECT 13 AS [Row],
    > ISNULL(SUM(CASE WHEN T.my_date BETWEEN @week_start_dat e AND
    > DATEADD(d, 1, @week_start_dat e) THEN 1 ELSE 0 END), 0) AS [Monday],
    > ISNULL(SUM(CASE WHEN T.my_date BETWEEN DATEADD(d, 1,
    > @week_start_dat e) AND DATEADD(d, 2, @week_start_dat e) THEN 1 ELSE 0
    > END), 0) AS [Tuesday]
    > FROM My_Table T
    > INNER JOIN Another_Table T2 ON T2.col1 = T.col1
    > WHERE T.my_date BETWEEN @week_start_dat e AND @week_end_date
    > AND T.col2 = 6
    >
    > Query #2:
    > ---------
    > SELECT R.row_num AS [Row],
    > ISNULL(SUM(CASE WHEN T.my_date BETWEEN @week_start_dat e AND
    > DATEADD(d, 1, @week_start_dat e) THEN 1 ELSE 0 END), 0) AS [Monday],
    > ISNULL(SUM(CASE WHEN T.my_date BETWEEN DATEADD(d, 1,
    > @week_start_dat e) AND DATEADD(d, 2, @week_start_dat e) THEN 1 ELSE 0
    > END), 0) AS [Tuesday]
    > FROM My_Table T
    > INNER JOIN Another_Table T2 ON T2.col1 = T.col1
    > INNER JOIN Report_Rows R ON R.col2 = T.col2
    > WHERE T.my_date BETWEEN @week_start_dat e AND @week_end_date
    > GROUP BY ALL R.row_num
    > ORDER BY R.row_num
    >
    > The Report_Rows table in this case would have had two rows mapping row
    > 12 to a column value of 5 and row 13 to a column value of 6. The
    > second query was performing horribly until I noticed the ALL keyword
    > in the GROUP BY, which I didn't think was necessary. When I removed
    > that it performed more like I expected it to perform.
    >
    > Before I had noticed that I was scouring over the query plans and
    > couldn't figure out why in one instance the query optimizer chose to
    > join My_Table and Another_Table, yet when the ALL keyword was there it
    > chose to return all of the records from Another_Table (a rather large
    > table) and join it to the Report_Rows table before then joining to
    > My_Table, which had the date criteria in the WHERE clause.
    >
    > So, if you've read this far without giving up...
    >
    > 1. Why would the ALL keyword cause this? I understand the
    > functionality of ALL, but I still don't see why that caused the
    > reordering of the joins.
    >
    > 2. (more importantly) Are there any good resources that you know of
    > that explain how the query optimizer choices its query paths? Do the
    > "Inside SQL Server" books go into that much detail? Any good online
    > resources?
    >
    > Thanks!
    > -Tom.[/color]

    It's almost impossible (at least for me) to know why the optimizer chose a
    particular plan without knowing the table structures, indexes and amount of
    data, and even with that knowledge, it may not be clear at all. So I can't
    say much about your first question, but I can definitely recommend Inside
    SQL Server 2000 for a great explanation of what the optimizer considers when
    it produces a query plan. There's a lot of detail, including how to go about
    using query plans to tune individual queries. Another useful book is
    Advanced Transact SQL for SQL Server 2000, which also explains many of the
    examples with reference to their query plans.

    Simon


    Comment

    • Erland Sommarskog

      #3
      Re: Looking for query plan determination information

      Thomas R. Hummel (tom_hummel@hot mail.com) writes:[color=blue]
      > Before I had noticed that I was scouring over the query plans and
      > couldn't figure out why in one instance the query optimizer chose to
      > join My_Table and Another_Table, yet when the ALL keyword was there it
      > chose to return all of the records from Another_Table (a rather large
      > table) and join it to the Report_Rows table before then joining to
      > My_Table, which had the date criteria in the WHERE clause.[/color]

      I can only echo Simon's reply that without table definitions etc, this
      is difficult to tell. In fact, even with all information available,
      this might be difficult to tell. Understanding the output of a cost-
      based optimizer is by no means an easy task.
      [color=blue]
      > 2. (more importantly) Are there any good resources that you know of
      > that explain how the query optimizer choices its query paths? Do the
      > "Inside SQL Server" books go into that much detail? Any good online
      > resources?[/color]

      Certainly, you learn a great deal from Kalen's book. But I also like to
      add that that experience counts a lot too. And some creative thinking.
      The basic thing to understand is why a table scan may be better than
      an index seek. This is something which also can be extended to joins.
      That is a scan + merge/hash join may be faser than seek + loop join.

      But then there are all such wild things which includes parallelism that
      I find myself understanding only fragments of.

      --
      Erland Sommarskog, SQL Server MVP, sommar@algonet. se

      Books Online for SQL Server SP3 at
      Transform your business with a unified data platform. SQL Server 2019 comes with Apache Spark and Hadoop Distributed File System (HDFS) for intelligence over all your data.

      Comment

      • Thomas R. Hummel

        #4
        Re: Looking for query plan determination information

        Thank you both for the input. I had tried to duplicate the effect with
        test tables, but as you know, the query optimizer takes a lot into
        account and I couldn't find an example that would be practical for
        posting here.

        I will give the two books that Simon suggested a more thorough read.
        I've seen adverts for Kalen's online webinars as well, so perhaps I'll
        look into those.
        [color=blue]
        > But then there are all such wild things which includes parallelism that
        > I find myself understanding only fragments of.[/color]
        I hate to think of something that is complex enough that you have
        trouble understanding it Erland... ;-)

        Thanks again!
        -Tom.

        Comment

        Working...