SQL performance: Nested SELECT vs. INNER JOIN

Collapse
This topic is closed.
X
X
 
  • Time
  • Show
Clear All
new posts
  • Brian

    SQL performance: Nested SELECT vs. INNER JOIN

    Hello All -

    I am wondering if anyone has any thoughts on which is better from a
    performance perspective: a nested Select statement or an Inner Join.


    For example, I could do either of the following:

    SELECT supplier_name
    FROM supplier
    WHERE supplier_ID IN (SELECT supplier_ID FROM products WHERE
    product_ID =22);

    VS.

    SELECT supplier.suppli er_name
    FROM supplier INNER JOIN products ON supplier.suppli er_ID =
    products.suppli er_ID
    WHERE products.produc t_ID = 22;


    Which is faster?

    Thanks.
    Brian
  • Trevor Best

    #2
    Re: SQL performance: Nested SELECT vs. INNER JOIN

    On 25 Sep 2003 11:37:05 -0700 in comp.databases. ms-access,
    brianlappin@yah oo.com (Brian) wrote:
    [color=blue]
    >Hello All -
    >
    >I am wondering if anyone has any thoughts on which is better from a
    >performance perspective: a nested Select statement or an Inner Join.
    >
    >
    >For example, I could do either of the following:[/color]

    (air code below, be sure to post your findings :)

    Dim strSQL(1 to 2) As String
    Dim i as long, j as long
    Dim varStart As variant
    Dim rs As Recordset
    Dim db As Database

    strSQL(1)="SELE CT supplier_name " & _
    "FROM supplier " & _
    "WHERE supplier_ID IN (SELECT supplier_ID FROM products WHERE " & _
    "product_ID =22);"

    strSQL(2)="SELE CT supplier.suppli er_name " & _
    "FROM supplier INNER JOIN products ON supplier.suppli er_ID = " & _
    "products.suppl ier_ID " & _
    "WHERE products.produc t_ID = 22; "

    set db = currentdb
    For i = 1 to 2
    varStart=now()
    for j=1 to 1000
    set rs=db.openrecor dset(strSQL(i), dbopensnapshot)
    rs.close
    set rs=nothing
    next j
    debug.print "Method " & i & " " & DateDiff("s",va rStart,Now())
    Next i
    set db=nothing

    --
    A)bort, R)etry, I)nfluence with large hammer.

    Comment

    • John Winterbottom

      #3
      Re: SQL performance: Nested SELECT vs. INNER JOIN

      "Brian" <brianlappin@ya hoo.com> wrote in message
      news:be1a8433.0 309251037.6218f 39d@posting.goo gle.com...[color=blue]
      > Hello All -
      >
      > I am wondering if anyone has any thoughts on which is better from a
      > performance perspective: a nested Select statement or an Inner Join.
      >
      >
      > For example, I could do either of the following:
      >
      > SELECT supplier_name
      > FROM supplier
      > WHERE supplier_ID IN (SELECT supplier_ID FROM products WHERE
      > product_ID =22);
      >
      > VS.
      >
      > SELECT supplier.suppli er_name
      > FROM supplier INNER JOIN products ON supplier.suppli er_ID =
      > products.suppli er_ID
      > WHERE products.produc t_ID = 22;
      >
      >
      > Which is faster?
      >[/color]


      Try them and see! A clever optimizer should produce identical query plans.
      However, a mistake that I've seen very often is where the subquery generates
      multiple rows for each supplier, (for example, if you wanted to see all
      suppliers that sold a certain product type). In this case query 1 is
      superior, because the query processor can stop searching for rows as soon as
      it finds the first one. <rant>Some beginners would use query 2 and put a
      DISTINCT clause in to eliminate duplicates. Obviously, this is a bad, bad
      thing to do, yet I've seen it done many times. Not only by beginners, but by
      supposedly experienced developers.</rant>

      BTW, I would, in any case, use EXISTS instead of IN, so query 1 would
      become:

      SELECT s.supplier_name
      FROM supplier AS s
      WHERE EXISTS
      (
      SELECT * FROM products AS p
      WHERE p.product_ID =22
      AND p.Supplier_ID = s.Supplier_ID
      )

      but, as I said, query 2 would be just as good.









      Comment

      • John Winterbottom

        #4
        Re: SQL performance: Nested SELECT vs. INNER JOIN

        "John Winterbottom" <john_winterbot tom@hotmail.com > wrote in message
        news:bkvglm$6m6 hq$1@ID-185006.news.uni-berlin.de...[color=blue]
        > "Brian" <brianlappin@ya hoo.com> wrote in message
        > news:be1a8433.0 309251037.6218f 39d@posting.goo gle.com...[color=green]
        > > Hello All -
        > >
        > > I am wondering if anyone has any thoughts on which is better from a
        > > performance perspective: a nested Select statement or an Inner Join.
        > >
        > >
        > > For example, I could do either of the following:
        > >
        > > SELECT supplier_name
        > > FROM supplier
        > > WHERE supplier_ID IN (SELECT supplier_ID FROM products WHERE
        > > product_ID =22);
        > >
        > > VS.
        > >
        > > SELECT supplier.suppli er_name
        > > FROM supplier INNER JOIN products ON supplier.suppli er_ID =
        > > products.suppli er_ID
        > > WHERE products.produc t_ID = 22;
        > >
        > >
        > > Which is faster?
        > >[/color]
        >
        >[/color]


        Just to add to what I was syaing earlier, these two examples both use the
        Northwind database to find suppliers that sell seafood products.

        The right way
        -------------------------------------
        select s.CompanyName
        from Suppliers as s
        where exists
        (
        select * from products p
        inner join categories c on p.CategoryID = c.CategoryID
        where c.CategoryName = "Seafood"
        and p.SupplierID=s. SupplierID
        )
        ----------------------------------------


        The wrong way
        ------------------------------------------------
        SELECT DISTINCT Suppliers.Compa nyName
        FROM Suppliers INNER JOIN (Categories INNER JOIN Products ON
        Categories.Cate goryID = Products.Catego ryID) ON Suppliers.Suppl ierID =
        Products.Suppli erID
        WHERE (((Categories.C ategoryName)="S eafood"))
        ------------------------------------------------
        Last edited by Niheel; Jun 2 '11, 07:55 AM.

        Comment

        • Peter Miller

          #5
          Re: SQL performance: Nested SELECT vs. INNER JOIN


          John,

          On Thu, 25 Sep 2003 15:59:53 -0400, "John Winterbottom"
          <john_winterbot tom@hotmail.com > wrote in comp.databases. ms-access:
          [color=blue]
          >
          >Just to add to what I was syaing earlier, these two examples both use the
          >Northwind database to find suppliers that sell seafood products.
          >
          >The right way
          >-------------------------------------
          >select s.CompanyName
          >from Suppliers as s
          >where exists
          > (
          > select * from products p
          > inner join categories c on p.CategoryID = c.CategoryID
          > where c.CategoryName = "Seafood"
          > and p.SupplierID=s. SupplierID
          > )
          >----------------------------------------
          >
          >
          >The wrong way
          >------------------------------------------------
          >SELECT DISTINCT Suppliers.Compa nyName
          >FROM Suppliers INNER JOIN (Categories INNER JOIN Products ON
          >Categories.Cat egoryID = Products.Catego ryID) ON Suppliers.Suppl ierID =
          >Products.Suppl ierID
          >WHERE (((Categories.C ategoryName)="S eafood"))
          >------------------------------------------------[/color]

          I find it very interesting that you would say this.

          I thought it was well understood and accepted that Jet is very good at
          optimizing multi-table sql statements that utilize joins, and very
          poor at handling subqueries. I would kindly suggest that you test
          your two example queries, because I would hazard a guess that you'll
          find that the performance is the exact opposite of what you expect.

          Of course, with rdbms' more generally, subqueries are fine and often
          preferable in situations like this. But its not that way with Jet.

          Peter Miller
          _______________ _______________ _______________ _______________
          PK Solutions -- Data Recovery for Microsoft Access/Jet/SQL
          Free quotes, Guaranteed lowest prices and best results
          www.pksolutions.com 1.800.987.7716 1.619.839.3900

          Comment

          • John Winterbottom

            #6
            Re: SQL performance: Nested SELECT vs. INNER JOIN

            "Peter Miller" <pmiller@pksolu tions.com> wrote in message
            news:o7k6nvckb8 8hcbghp24b79frl siisepjb1@4ax.c om...[color=blue]
            >
            > John,
            >
            > On Thu, 25 Sep 2003 15:59:53 -0400, "John Winterbottom"
            > <john_winterbot tom@hotmail.com > wrote in comp.databases. ms-access:
            >[color=green]
            > >
            > >Just to add to what I was syaing earlier, these two examples both use the
            > >Northwind database to find suppliers that sell seafood products.
            > >
            > >The right way
            > >-------------------------------------
            > >select s.CompanyName
            > >from Suppliers as s
            > >where exists
            > > (
            > > select * from products p
            > > inner join categories c on p.CategoryID = c.CategoryID
            > > where c.CategoryName = "Seafood"
            > > and p.SupplierID=s. SupplierID
            > > )
            > >----------------------------------------
            > >
            > >
            > >The wrong way
            > >------------------------------------------------
            > >SELECT DISTINCT Suppliers.Compa nyName
            > >FROM Suppliers INNER JOIN (Categories INNER JOIN Products ON
            > >Categories.Cat egoryID = Products.Catego ryID) ON Suppliers.Suppl ierID =
            > >Products.Suppl ierID
            > >WHERE (((Categories.C ategoryName)="S eafood"))
            > >------------------------------------------------[/color]
            >
            > I find it very interesting that you would say this.
            >
            > I thought it was well understood and accepted that Jet is very good at
            > optimizing multi-table sql statements that utilize joins, and very
            > poor at handling subqueries. I would kindly suggest that you test
            > your two example queries, because I would hazard a guess that you'll
            > find that the performance is the exact opposite of what you expect.
            >
            > Of course, with rdbms' more generally, subqueries are fine and often
            > preferable in situations like this. But its not that way with Jet.
            >[/color]


            Right now I can only test in SQL Server - and it's as I said; query 1 has a
            lower cost than query 2. I'd be very surprised if Jet was different. The
            reason is because of how a select works, (or how it shold work). Joe Celko
            has posted this several times - I'll see if I can find it if you like.

            Basically, when you use SELECT DISTINCT...WHER E, (or a GROUP BY clause), the
            query engine first builds a working table with all the rows satisfying the
            WHERE condition. Only then can it aggregate them. This is more expensive. In
            the first example, (using EXISTS), the processor only needs to test for the
            existence of a single row. In other words, if you have 5000 seafood products
            from the same supplier, all it needs to do is find one of them and the
            EXISTS condition is satisfied. The processor can move on to the next
            supplier.














            Comment

            • Peter Miller

              #7
              Re: SQL performance: Nested SELECT vs. INNER JOIN


              On Thu, 25 Sep 2003 16:56:36 -0400, "John Winterbottom"
              <john_winterbot tom@hotmail.com > wrote in comp.databases. ms-access:
              [color=blue]
              >Right now I can only test in SQL Server - and it's as I said; query 1 has a
              >lower cost than query 2. I'd be very surprised if Jet was different. The
              >reason is because of how a select works, (or how it shold work). Joe Celko
              >has posted this several times - I'll see if I can find it if you like.[/color]

              Hey, Joe's the man, but your statement doesn't follow...

              To wit:
              [color=blue]
              >I'd be very surprised if Jet was different.[/color]

              Understood.
              [color=blue]
              >The reason is because of how a select works, (or how it shold work).[/color]

              But that's precisely the point I was making. Jet is fine at joins,
              and weak at subqueries. It's got nothing to do with what 'should' be
              the case. It's been clear through the history of Jet that it is weak
              in this area.
              [color=blue]
              >Basically, when you use SELECT DISTINCT...WHER E, (or a GROUP BY clause), the
              >query engine first builds a working table with all the rows satisfying the
              >WHERE condition. Only then can it aggregate them. This is more expensive. In
              >the first example, (using EXISTS), the processor only needs to test for the
              >existence of a single row. In other words, if you have 5000 seafood products
              >from the same supplier, all it needs to do is find one of them and the
              >EXISTS condition is satisfied. The processor can move on to the next
              >supplier.[/color]

              ....yes, all well understood. But What you'll find with jet is that
              EXISTS clauses don't work well, because they rely on subqueries, and
              subqueries are poorly handled by Jet.

              For example, take your 'correct' example, and break it down again by
              using another exists statement (ie, no joins at all, just a query with
              a subquery and a nested subquery beneath the subquery.

              It is my understanding that Jet actually works the full subquery (or
              subqueries in this case) up from the bottom, and essentially does the
              join implicitly only once the subquery has been processed (to the
              extent possible). At the same time, DISTINCT is nicely optimized to
              an effective TOP 1 / GROUP BY combo (ie, no processing occurs beyond
              finding the first matching instance at each level). So you see little
              penalty in Jet by using DISTINCT and a large penalty for using the
              subquery.

              As I'm sure Celko would agree, despite the obvious benefits of generic
              sql that's 100% standards compliant, certain sql implementations will
              have certain nuances that simply can't be ignored in real-world
              database applications. Jet's poor handling of subqueries is a case in
              point.

              Peter Miller
              _______________ _______________ _______________ _______________
              PK Solutions -- Data Recovery for Microsoft Access/Jet/SQL
              Free quotes, Guaranteed lowest prices and best results
              www.pksolutions.com 1.800.987.7716 1.619.839.3900

              Comment

              • david epsom dot com dot au

                #8
                Re: SQL performance: Nested SELECT vs. INNER JOIN

                With all due respect to Joe Celko, he has never had
                much time for JET...

                (david)


                Comment

                • John Winterbottom

                  #9
                  Re: SQL performance: Nested SELECT vs. INNER JOIN

                  "Peter Miller" <pmiller@pksolu tions.com> wrote in message
                  news:ojl6nvkgj4 30tsifhivnu4fog flp930sos@4ax.c om...[color=blue]
                  >
                  > ...yes, all well understood. But What you'll find with jet is that
                  > EXISTS clauses don't work well, because they rely on subqueries, and
                  > subqueries are poorly handled by Jet.
                  >
                  > For example, take your 'correct' example, and break it down again by
                  > using another exists statement (ie, no joins at all, just a query with
                  > a subquery and a nested subquery beneath the subquery.
                  >
                  > It is my understanding that Jet actually works the full subquery (or
                  > subqueries in this case) up from the bottom, and essentially does the
                  > join implicitly only once the subquery has been processed (to the
                  > extent possible). At the same time, DISTINCT is nicely optimized to
                  > an effective TOP 1 / GROUP BY combo (ie, no processing occurs beyond
                  > finding the first matching instance at each level). So you see little
                  > penalty in Jet by using DISTINCT and a large penalty for using the
                  > subquery.[/color]

                  Peter, you're absolutely right. I tested this and there's no difference
                  whatsoever in execution time for either query running against a Jet
                  database. I haven't looked at the query plans to see if they are the same.
                  I'm pretty amazed at this. If you run the two in SQL Server you'll see a
                  major improvement for query 1.

                  Live and learn. Thanks for the correction Pater, and the explanation. That's
                  twice in one day I've made a claim that's incorrect, so now it's time for me
                  to go and have a beer.














                  Comment

                  • John Winterbottom

                    #10
                    QLRe: SQL performance: Nested SELECT vs. INNER JOIN


                    "david epsom dot com dot au" <david@epsomdot comdotau> wrote in message
                    news:uTSD0X7gDH A.2188@TK2MSFTN GP10.phx.gbl...[color=blue]
                    > With all due respect to Joe Celko, he has never had
                    > much time for JET...
                    >[/color]



                    True. This may be one of the reasons why!

                    I had just assumed, since the designers of Jet had chosen to implement the
                    exists clause, that they would have optimized it as well, (as do Oracle, DB2
                    and SQL Server). Otehrwise what's the point? Oh well, as I said to Peter,
                    live and learn.


                    Comment

                    • Peter Miller

                      #11
                      Re: SQL performance: Nested SELECT vs. INNER JOIN

                      On Thu, 25 Sep 2003 20:06:04 -0400, "John Winterbottom"
                      <john_winterbot tom@hotmail.com > wrote in comp.databases. ms-access:
                      [color=blue]
                      >I tested this and there's no difference
                      >whatsoever in execution time for either query running against a Jet
                      >database.[/color]

                      Hmm. I tested it here too, but found a 33% performance benefit using
                      query 2 over query 1. Did you iterate enough times to get meaningful
                      results? I found that using a 20,000 iteration loop on query 1
                      against the Northwind tables, I saw 29 seconds for query 1 and 19
                      seconds for query 2. If I broke query 1 down into two exists/subquery
                      clauses instead of just one and a join, the time required escalated to
                      38 seconds for 20,000 iterations.

                      In other words, there's very much a difference, and its decidedly
                      faster to join that to use subqueries.
                      [color=blue]
                      >I'm pretty amazed at this. If you run the two in SQL Server you'll see a
                      >major improvement for query 1.[/color]

                      As you would with Oracle, DB2 or Sybase. Just not Jet.

                      Peter Miller
                      _______________ _______________ _______________ _______________
                      PK Solutions -- Data Recovery for Microsoft Access/Jet/SQL
                      Free quotes, Guaranteed lowest prices and best results
                      www.pksolutions.com 1.800.987.7716 1.619.839.3900

                      Comment

                      • david epsom dot com dot au

                        #12
                        Re: SQL performance: Nested SELECT vs. INNER JOIN

                        I like to think that, at least originally, Jet was optimised
                        for naive users...


                        (david)



                        "John Winterbottom" <john_winterbot tom@hotmail.com > wrote in message
                        news:bl00fd$6m8 md$1@ID-185006.news.uni-berlin.de...[color=blue]
                        >
                        > "david epsom dot com dot au" <david@epsomdot comdotau> wrote in message
                        > news:uTSD0X7gDH A.2188@TK2MSFTN GP10.phx.gbl...[color=green]
                        > > With all due respect to Joe Celko, he has never had
                        > > much time for JET...
                        > >[/color]
                        >
                        >
                        >
                        > True. This may be one of the reasons why!
                        >
                        > I had just assumed, since the designers of Jet had chosen to implement the
                        > exists clause, that they would have optimized it as well, (as do Oracle,[/color]
                        DB2[color=blue]
                        > and SQL Server). Otehrwise what's the point? Oh well, as I said to Peter,
                        > live and learn.
                        >
                        >[/color]


                        Comment

                        • Larry  Linson

                          #13
                          Re: SQL performance: Nested SELECT vs. INNER JOIN

                          "david epsom dot com dot au" wrote
                          [color=blue]
                          > With all due respect to Joe Celko, he
                          > has never had much time for JET...[/color]

                          Joe and I have crossed words (not swords) in the past, and then I decided he
                          was an "elitist" who didn't seem to have either much time or respect for
                          anything he considered a mere desktop database. He's certainly not the only
                          one in that category, of course.


                          Comment

                          • rkc

                            #14
                            Re: SQL performance: Nested SELECT vs. INNER JOIN


                            "Larry Linson" <bouncer@localh ost.net> wrote in message
                            news:LkLdb.2489 8$ZR1.20236@nwr ddc01.gnilink.n et...[color=blue]
                            > "david epsom dot com dot au" wrote
                            >[color=green]
                            > > With all due respect to Joe Celko, he
                            > > has never had much time for JET...[/color]
                            >
                            > Joe and I have crossed words (not swords) in the past, and then I decided[/color]
                            he[color=blue]
                            > was an "elitist" who didn't seem to have either much time or respect for
                            > anything he considered a mere desktop database. He's certainly not the[/color]
                            only[color=blue]
                            > one in that category, of course.[/color]


                            It would be pointless to spend time learning the idiosyncrasies of something
                            you
                            have no use for. That's not elitist. That's sensible.




                            Comment

                            • Larry  Linson

                              #15
                              Re: SQL performance: Nested SELECT vs. INNER JOIN

                              It is "elitist" to denigrate a category of products _in your field_ without
                              understanding them and imply that because they aren't something else (in
                              this case, server databases) that they couldn't be worth anyone's time.

                              There are situations in which a server database is the proper solution;
                              there are other situations in which a simple desktop database is the proper
                              solution. If you are '"in the database business", you really ought to be a
                              pragmatist and understand both kinds of tool sufficiently well to make an
                              informed decision as to which one is appropriate.

                              And, you aren't nearly as likely to be thought an insufferable egotist
                              rather than just elitist, either.

                              On the other hand, it might well be pointless for someone in the database
                              business to spend the time learning the idiosyncracies of image processing
                              software.

                              Larry

                              "rkc" <rkc@yabba.dabb a.do.rochester. rr.com> wrote in message
                              news:k1Mdb.1486 9$pe7.11126@twi ster.nyroc.rr.c om...[color=blue]
                              >
                              > "Larry Linson" <bouncer@localh ost.net> wrote in message
                              > news:LkLdb.2489 8$ZR1.20236@nwr ddc01.gnilink.n et...[color=green]
                              > > "david epsom dot com dot au" wrote
                              > >[color=darkred]
                              > > > With all due respect to Joe Celko, he
                              > > > has never had much time for JET...[/color]
                              > >
                              > > Joe and I have crossed words (not swords) in the past, and then I[/color][/color]
                              decided[color=blue]
                              > he[color=green]
                              > > was an "elitist" who didn't seem to have either much time or respect for
                              > > anything he considered a mere desktop database. He's certainly not the[/color]
                              > only[color=green]
                              > > one in that category, of course.[/color]
                              >
                              >
                              > It would be pointless to spend time learning the idiosyncrasies of[/color]
                              something[color=blue]
                              > you
                              > have no use for. That's not elitist. That's sensible.
                              >
                              >
                              >
                              >[/color]


                              Comment

                              Working...