Join efficiency

  • Russ Brown

    Join efficiency

    Hello all,

    Recently a post on this list made me think a bit about the way in which I
    write my queries.

    I have always written queries with ordinary joins in this manner:

    SELECT * FROM a, b WHERE a.x=b.x;

    However, I recently saw an alternative syntax:

    SELECT * FROM a JOIN b ON a.x=b.x;

    Is there any difference between these queries in terms of the speed of
    planning or the quality of the plan ultimately used? I'd imagine that the
    second form provides more information that the planner may be able to use
    to make a better plan (or make a good plan more easily), but I've never
    had any problems with the first form.

    It also seems to me that the second form is more self-documenting, which
    is something I'm always in favour of.

    I'd appreciate anyone's thought/insight.

    Thanks.

    --

    Russell Brown

    ---------------------------(end of broadcast)---------------------------
    TIP 9: the planner will ignore your desire to choose an index scan if your
    joining column's datatypes do not match

  • terry@ashtonwoodshomes.com

    #2
    Re: Join efficiency

    NOTE: The first way cannot support OUTER joins; the second way can. Hence sometimes one has to use
    the second way for at least some of the joins.

    PREVIOUSLY: The second way allowed one to tell the planner a "better way" to join the tables.
    Likewise, it could also let the programmer force the planner into a worse way. Oops!
    NOW: I believe that in the latest version of PostgreSQL (7.4.x) the planner will override the second
    method's requested join order if it knows of a better way and can use it. (Outer joins need to
    be done last, by their nature, and so cannot be changed much; there may be other cases where
    the planner cannot change the requested plan.)

    I am not an expert, but this is what I recall from following the list.

    Terry Fielder
    Manager Software Development and Deployment
    Great Gulf Homes / Ashton Woods Homes
    terry@greatgulfhomes.com
    Fax: (416) 441-9085


    • Richard Huxton

      #3
      Re: Join efficiency

      Russ Brown wrote:
      >
      > I have always written queries with ordinary joins in this manner:
      >
      > SELECT * FROM a, b WHERE a.x=b.x;
      >
      > However, I recently saw an alternative syntax:
      >
      > SELECT * FROM a JOIN b ON a.x=b.x;
      >
      > Is there any difference between these queries in terms of the speed of
      > planning or the quality of the plan ultimately used? I'd imagine that
      > the second form provides more information that the planner may be able
      > to use to make a better plan (or make a good plan more easily), but
      > I've never had any problems with the first form.

      The first form allows PG to plan however it sees fit. The second will
      force the join order to be the same as you specify in the query. This
      doesn't matter here, but might with a more complicated query.

      With v7.4 and higher, I believe this join forcing is configurable
      (join_collapse_limit).

      > It also seems to me that the second form is more self-documenting,
      > which is something I'm always in favour of.

      I tend to prefer the WHERE form, but that might just be me.

      --
      Richard Huxton
      Archonet Ltd

      ---------------------------(end of broadcast)---------------------------
      TIP 7: don't forget to increase your free space map settings


      • Russ Brown

        #4
        (NONE)

        Hi, thanks for your reply,

        On Wed, 1 Sep 2004 08:10:52 -0400, <terry@ashtonwoodshomes.com> wrote:

        > NOTE: The first way cannot support OUTER joins, the second way can.
        > Hence sometimes one has to use the second way for at least some of
        > the joins.

        Yes, I've always done OUTER joins the second way. I suppose it's just the
        way I was taught SQL: I was initially taught how to do 'ordinary' joins
        using the first syntax, and then taught 'LEFT' joins using the second
        syntax when I came to need them (I very much learned SQL 'on the
        job', though I know of people who *always* use OUTER joins in their
        queries). I'd never considered that there was another syntax!

        > PREVIOUSLY: The second way can allow one to tell the planner a "better
        > way" to join the tables.
        > Likewise it can also enable the programmer to force the planner into a
        > worse way. Oops!
        > NOW: I believe that the latest version of postgres (7.4.x) the planner
        > will override the 2nd methods requested join method if it knows of a
        > better way and can do the better way. (Outer joins need to be done
        > last, by the nature of them, and so cannot be changed much, there may
        > be other cases where the planner cannot change the requested plan).

        That being the case, would it be true to say that with recent versions of
        PostgreSQL they both perform identically, meaning the second could be
        considered preferable due to its self-documenting nature (and consistency
        with the OUTER JOIN syntax)?

        --

        Russell Brown

        ---------------------------(end of broadcast)---------------------------
        TIP 4: Don't 'kill -9' the postmaster


        • John Sidney-Woollett

          #5
          Re: Join efficiency

          Does anyone know if there is a postgres shorthand for Oracle's (+)
          notation to denote an outer join?

          eg

          SELECT * from a, b where a.x = b.x (+)

          John Sidney-Woollett


          • Richard Huxton

            #6
            Re: Join efficiency

            John Sidney-Woollett wrote:
            > Does anyone know if there is a postgres shorthand for Oracle's (+)
            > notation to denote an outer join?
            >
            > eg
            >
            > SELECT * from a, b where a.x = b.x (+)

            Just the standard LEFT JOIN ... afaik

            --
            Richard Huxton
            Archonet Ltd

            ---------------------------(end of broadcast)---------------------------
            TIP 2: you can get off all lists at once with the unregister command
            (send "unregister YourEmailAddressHere" to majordomo@postgresql.org)


            • Michael Paesold

              #7
              Re: Join efficiency

              Russ Brown wrote:

              > >> SELECT * FROM a, b WHERE a.x=b.x;

              > >> SELECT * FROM a JOIN b ON a.x=b.x;

              > That being the case, would it be true to say that with recent versions of
              > PostgreSQL they both perform identically, meaning the second could be
              > considered preferable due to its self-documenting nature (and consistency
              > with the OUTER JOIN syntax)?

              Assuming join_collapse_limit is at its default or set higher...

              As far as I can tell from reading the documentation, following the hackers
              list and trying it out myself: yes, both versions should yield the same
              optimized query plan and are therefore equal performance-wise.

              You can just use the one you prefer.
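
              The effect of join_collapse_limit can be checked directly. The
              following is a minimal sketch, not a benchmark: the tables a, b and c
              (each with a single integer column x) are hypothetical, made up to
              match the examples in this thread, and the plans printed will depend
              on your PostgreSQL version, data and statistics.

              ```sql
              -- Hypothetical tables for illustration only.
              CREATE TEMP TABLE a (x integer);
              CREATE TEMP TABLE b (x integer);
              CREATE TEMP TABLE c (x integer);

              -- Show the value currently in effect.
              SHOW join_collapse_limit;

              -- With the limit at 1, explicit inner JOINs are not flattened,
              -- so the planner keeps the join nesting as written.
              SET join_collapse_limit = 1;
              EXPLAIN SELECT * FROM a JOIN b ON a.x = b.x JOIN c ON b.x = c.x;

              -- Back at the default, the planner is free to reorder the joins,
              -- just as it is for the comma-separated FROM form.
              RESET join_collapse_limit;
              EXPLAIN SELECT * FROM a JOIN b ON a.x = b.x JOIN c ON b.x = c.x;
              EXPLAIN SELECT * FROM a, b, c WHERE a.x = b.x AND b.x = c.x;
              ```

              With empty or tiny tables the plans may well look identical in all
              three cases; the setting only matters when the planner would
              otherwise pick a different order.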

              Best Regards,
              Michael Paesold



              • Tom Lane

                #8
                Re: Join efficiency

                "Russ Brown" <postgres@dot4d ot.plus.com> writes:[color=blue]
                > Is there any difference between these queries in terms of the speed of
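                > planning or the quality of the plan ultimately used?[/color]

                http://www.postgresql.org/docs/7.4/s...cit-joins.html
                http://www.postgresql.org/docs/7.3/s...cit-joins.html
                http://www.postgresql.org/docs/7.2/s...cit-joins.html
                http://www.postgresql.org/docs/7.1/s...cit-joins.html

                depending on which version you are using. (I think 7.1-7.3
                are essentially alike, but 7.4 is not.)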

                regards, tom lane

                ---------------------------(end of broadcast)---------------------------
                TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org


                • Russ Brown

                  #9
                  Re: Join efficiency

                  On Wed, 01 Sep 2004 10:31:07 -0400, Tom Lane <tgl@sss.pgh.pa.us> wrote:

                  > "Russ Brown" <postgres@dot4dot.plus.com> writes:
                  >> Is there any difference between these queries in terms of the speed of
                  >> planning or the quality of the plan ultimately used?
                  >
                  > http://www.postgresql.org/docs/7.4/s...cit-joins.html
                  > http://www.postgresql.org/docs/7.3/s...cit-joins.html
                  > http://www.postgresql.org/docs/7.2/s...cit-joins.html
                  > http://www.postgresql.org/docs/7.1/s...cit-joins.html
                  >
                  > depending on which version you are using. (I think 7.1-7.3
                  > are essentially alike, but 7.4 is not.)
                  >
                  > regards, tom lane

                  Thanks for that: very informative.

                  I should have spotted that in the manual myself, though it has been nice
                  reading other people's opinions on the subject too.

                  Regards.

                  --

                  Russell Brown


                  • Laura Vance

                    #10
                    Re: Join efficiency

                    This thread also brings up the question: whatever happened to the *
                    notation of the SQL2 standard for LEFT and RIGHT outer joins?

                    To pull all rows from table 'a' and only those from table 'b' that match
                    the column criteria:
                    SELECT * FROM a, b WHERE a.x*=b.x;

                    To pull all rows from table 'b' and only those from table 'a' that match
                    the column criteria:
                    SELECT * FROM a, b WHERE a.x=*b.x;

                    This notation was always easy to remember when you think of the asterisk
                    as being a wildcard (or 'all') for its side of the expression.
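
                    For comparison, the two queries above can be written with explicit
                    outer-join syntax, which is the form PostgreSQL accepts (as far as
                    I know it does not parse *= or =*). This is a small sketch under
                    assumptions: the tables a and b, their integer column x and the
                    sample rows are all hypothetical.

                    ```sql
                    -- Hypothetical sample data: a holds 1 and 2, b holds 2 and 3.
                    CREATE TEMP TABLE a (x integer);
                    CREATE TEMP TABLE b (x integer);
                    INSERT INTO a VALUES (1);
                    INSERT INTO a VALUES (2);
                    INSERT INTO b VALUES (2);
                    INSERT INTO b VALUES (3);

                    -- Counterpart of "a.x *= b.x": every row of a, with NULLs
                    -- where b has no match. Rows (in some order): (1, NULL), (2, 2).
                    SELECT * FROM a LEFT JOIN b ON a.x = b.x;

                    -- Counterpart of "a.x =* b.x": every row of b, with NULLs
                    -- where a has no match. Rows (in some order): (2, 2), (NULL, 3).
                    SELECT * FROM a RIGHT JOIN b ON a.x = b.x;
                    ```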

                    --
                    Thanks,
                    Laura Vance
                    Systems Engineer
                    Winfree Academy Charter Schools
                    6221 Riverside Dr. Suite 110
                    Irving, Tx 75039
                    Web: www.winfreeacademy.com



                    ---------------------------(end of broadcast)---------------------------
                    TIP 8: explain analyze is your friend


                    • Jeff Boes

                      #11
                      Re: Join efficiency

                      Russ Brown wrote:

                      > Is there any difference between these queries in terms of the speed of
                      > planning or the quality of the plan ultimately used? I'd imagine that
                      > the second form provides more information that the planner may be able
                      > to use to make a better plan (or make a good plan more easily), but
                      > I've never had any problems with the first form.

                      Use EXPLAIN:
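
                      A minimal sketch of such a comparison, assuming two hypothetical
                      tables a and b with an integer column x (these stand in for
                      whatever tables you actually join). No output is shown because
                      the plan depends on version, table sizes and statistics.

                      ```sql
                      -- Hypothetical tables for illustration only.
                      CREATE TEMP TABLE a (x integer);
                      CREATE TEMP TABLE b (x integer);

                      -- Plan for the comma/WHERE form.
                      EXPLAIN SELECT * FROM a, b WHERE a.x = b.x;

                      -- Plan for the explicit JOIN form; compare it with the one above.
                      EXPLAIN SELECT * FROM a JOIN b ON a.x = b.x;
                      ```

                      If the two plans match, the choice between the forms is purely
                      one of style for that query. EXPLAIN ANALYZE additionally runs
                      the query and reports actual timings.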



                      --
                      (Posted from an account used as a SPAM dump. If you really want to get
                      in touch with me, dump the 'jboes' and substitute 'mur'.)
                      ________
                      Jeffery Boes <>< jboes@qtm.net
