A Question on query efficiency

Collapse
This topic is closed.
X
X
 
  • Time
  • Show
Clear All
new posts
  • mansoorm

    A Question on query efficiency

    I've a db containing two tables which every item in table one is
    linked to 1--200 items in table 2.
    I want to generate a report like this:

    Item in table 1
    ---------------------------------------
    1st-related Item in table2
    2nd related Item in table2
    3rd related Item in table2
    ....
    Nth related Item in table2
    ----------------------------------------


    to do, I've generated a third table with a text-column to hold the
    report text. Using an small procedure, I'v generated reports for all
    items in table 1 and stored the text in the respective column.

    Now, querying the third table take 100 times longer than joining
    first and second tables.
    the third table contains a single ID field which is a unique index.
    the query for two table scenario and single table scenario are:


    q1: select table2.* from table1 inner join table2 on table1.Id=
    table2.Id where table2.id = <anid>

    q2: select table3.* from table3 where Id = <anid>

    Any Idea about the poor performance of the second query ?
  • nidaar

    #2
    Re: A Question on query efficiency

    To have a better idea for the comparison of the queries, it would be
    nice to have some information on the index structures of table1 and
    table2. I assume they are also indexed based on their id columns. This
    gives the query engine the option to construct the result set at the
    same time when it uses the indexes to join and filter the id columns.
    It's like a covering index and sql server takes good advantage of this
    index combining feature.
    When it comes to table3, there are two cons:
    1. table3 is much more wider, because of the second column. Especially
    if it exceeds 8K (which is the size of each page), then extended
    reads occur to return the value from the second column.
    2. And because the second column is not part of the index, it should
    be read from the disk, while when two narrow tables were joined, as
    mentioned above, the list of related items to form the second column
    of the result set is likely to be already in the cache, when running
    the query.

    If the intenion is having a outline-style report which shows 1-200
    items from table2 for each single item from table1, it would be worth
    to consider this kind of outlining on the client side (e.e. using
    reporting services) instead of creating table3.

    Comment

    • nidaar

      #3
      Re: A Question on query efficiency

      To have a better idea for the comparison of the queries, it would be
      nice to have some information on the index structures of table1 and
      table2. I assume they are also indexed based on their id columns.
      This
      gives the query engine the option to construct the result set at the
      same time when it uses the indexes to join and filter the id columns.
      It's like a covering index and sql server takes good advantage of
      this
      index combining feature.

      When it comes to table3, there are two cons:
      1. table3 is much more wider, because of the text-column to hold the
      report text. Especially if it exceeds 8K (which is the size of each
      page), then extended
      reads occur to return the value from that column.

      2. And because the text-column to hold the
      report text is not part of the index, it should
      be read from the disk, while when two narrow tables were joined, as
      mentioned above, the list of related items to form the report text
      column
      of the result set is likely to be already in the cache, while running
      the query.

      If the intention is having an outlined-style report which shows 1-200
      items from table2 for each single item from table1, it would be worth
      to consider this kind of outlining at middle tier or client side (e.g.
      using
      reporting services) instead of creating table3.

      Comment

      • Erland Sommarskog

        #4
        Re: A Question on query efficiency

        mansoorm (muharram@gmail .com) writes:
        I've a db containing two tables which every item in table one is
        linked to 1--200 items in table 2.
        I want to generate a report like this:
        >
        Item in table 1
        ---------------------------------------
        1st-related Item in table2
        2nd related Item in table2
        3rd related Item in table2
        ...
        Nth related Item in table2
        ----------------------------------------
        >
        >
        to do, I've generated a third table with a text-column to hold the
        report text. Using an small procedure, I'v generated reports for all
        items in table 1 and stored the text in the respective column.
        >
        Now, querying the third table take 100 times longer than joining
        first and second tables.
        the third table contains a single ID field which is a unique index.
        the query for two table scenario and single table scenario are:
        >
        >
        q1: select table2.* from table1 inner join table2 on table1.Id=
        table2.Id where table2.id = <anid>
        >
        q2: select table3.* from table3 where Id = <anid>
        >
        Any Idea about the poor performance of the second query ?
        First of all, it's not obvious why you should have this third table
        in the first place.

        Second, did you define any primary key for table3 that includes the id?



        --
        Erland Sommarskog, SQL Server MVP, esquel@sommarsk og.se

        Links for SQL Server Books Online:
        SQL 2008: http://msdn.microsoft.com/en-us/sqlserver/cc514207.aspx
        SQL 2005: http://msdn.microsoft.com/en-us/sqlserver/bb895970.aspx
        SQL 2000: http://www.microsoft.com/sql/prodinf...ons/books.mspx

        Comment

        • mansoorm

          #5
          Re: A Question on query efficiency

          1- I thought that generating the report text and storing it may speed
          up the report generation upon request.
          2-Yes ; all the tables have their id column defined as primary key.

          Comment

          • mansoorm

            #6
            Re: A Question on query efficiency

            Thanks for the comments. then the disk I/O is the problem ?

            Comment

            • Erland Sommarskog

              #7
              Re: A Question on query efficiency

              mansoorm (muharram@gmail .com) writes:
              1- I thought that generating the report text and storing it may speed
              up the report generation upon request.
              It does not seem like it did.
              2-Yes ; all the tables have their id column defined as primary key.
              I think you need to post the CREATE TABLE statements and your queries.
              Else we will just keep on guessing.
              --
              Erland Sommarskog, SQL Server MVP, esquel@sommarsk og.se

              Links for SQL Server Books Online:
              SQL 2008: http://msdn.microsoft.com/en-us/sqlserver/cc514207.aspx
              SQL 2005: http://msdn.microsoft.com/en-us/sqlserver/bb895970.aspx
              SQL 2000: http://www.microsoft.com/sql/prodinf...ons/books.mspx

              Comment

              • Philipp Post

                #8
                Re: A Question on query efficiency

                Mansoorm,
                1- I thought that generating the report text and storing it may speed up the report generation upon request. <
                The downside is that you always have to populate this third table and
                if the underlying data in table one and two changed, your report is
                outdated.

                Report creators such as microsoft report viewer or crystal reports can
                easily work without such an intermediate table. You would usually pass
                something like

                Table1Value | DependingTable2 Value

                Then a grouping can be done in the front end (i. e. the report
                creator).

                brgds

                Philipp Post

                Comment

                • --CELKO--

                  #9
                  Re: A Question on query efficiency

                  >I've a db containing two tables which every item in table one is linked [sic] to 1--200 items in table 2. <<

                  In RDBMS we have references; links is a concept from older network
                  databases.
                  > I want to generate a report like this:
                  Item in table 1
                  ---------------------------------------
                  1st-related Item in table2
                  2nd related Item in table2 <<

                  Another conceptual error! Tables have no ordering, so ordinal
                  numbering makes no sense. That is from file systems or network
                  databases.
                  > I've generated a third table with a text-column to hold the report text. Using an small procedure, .. <<
                  The purpose of SQL is to return data. Period. It is not for
                  formatting it for display. You are still writing COBOL programs in
                  RDBMS. This is why we have report writers and front end
                  applications.
                  >the third table contains a single ID field [sic: columns are not fields] which is a unique index. <<
                  An index is not exposed to the user; did you mimic a sequential file
                  by using some silly auto-numbering scheme? There is no such thing as
                  a magical, universal "id" data element in a correct data model; each
                  identifier is for a *particular* kind of entity.

                  Please stop writing 1950's COBOL in SQL. What you have done is take
                  an automobile and hitched your old horse to it.

                  Comment

                  • --CELKO--

                    #10
                    Re: A Question on query efficiency

                    > 2-Yes ; all the tables have their id column defined as primary key. <<

                    Yep, you have no idea what RDBMS is and are still writing file systems
                    in SQL. Please stop coding until you can read a book or take a class
                    on the basics.

                    And if this magical, universal "id" is an IDENTITY it is not a column;
                    it is a physical table property, which you should call "physical
                    sequential insertion attempt count" instead.

                    Comment

                    Working...