Selecting 1 Record per Foreign Key

Collapse
This topic is closed.
X
X
 
  • Time
  • Show
Clear All
new posts
  • Ben Gribaudo

    Selecting 1 Record per Foreign Key

    Hello!

    I am trying to query a MySQL table and retrieve either the earliest or the
    latest message in each thread. The messages are in a table with MsgID being
    the primary key and fThreadID identifying to which thread the message
    belongs.

    Examples below.

    Any ideas on the SQL I need to use here?

    Thank you!
    Ben :-)
    Ben Gribaudo - Baltimore, MD - www.bengribaudo.com

    "For God so loved the world, that he gave his only begotten Son, that
    whosoever believeth in him should not perish, but have everlasting life."
    John 3:16

    Example Table
    =============
    MsgID (primary key) fThreadID ReceivedTimesta mp (DateTime)
    1 99 2004-01-02 11:10:15
    2 99 2004-02-01 01:11:59
    3 99 2004-05-17 18:15:01
    4 102 2004-07-01 11:11:29
    4 103 2004-07-01 11:11:11


    Example Desired Result Set #1
    (earliest msg from each thread)
    =============== =============== =
    MsgID, fThreadID, ReceivedTimesta mp
    1 99 2004-01-02 11:10:15
    4 103 2004-07-01 11:11:11


    Example Desired Result Set #2
    (most recent msg from each thread)
    =============== =============== ====
    MsgID fThreadID ReceivedTimesta mp
    3 99 2004-05-17 18:15:01
    4 102 2004-07-01 11:11:29


  • Ben Gribaudo

    #2
    Re: Selecting 1 Record per Foreign Key

    Hi Bill,

    Thank you for your reply!
    [color=blue]
    > Maybe not likely in the real world, but as far as your
    > database knows, there's nothing preventing it from occurring.[/color]
    And our job, as programmers, is to design our scripts/programs to handle
    such things.
    [color=blue]
    > Is this a typo? Do you mean to list another value such as 5 as the
    > MsgID in the last row?[/color]
    Yes.
    [color=blue]
    > You had three distinct fThreadID values in the example dataset. I would
    > expect three rows in the result set of this query.[/color]
    I meant for fThreadID 103 to be 102, making two unique fThreadIDs. Today
    must be my day for typos. :o)
    [color=blue]
    > I think this is a case that is necessary to solve in two queries:
    > select t1.fThreadID, max(t1.Received Timestamp) as
    > from MyTable t1
    > group by t1.fThreadID;
    >
    > Then loop through that result set and get the MsgID values matching each
    > one:
    > select t2.MsgID
    > from MyTable t2
    > where t2.fThreadID = ? and t2.ReceivedTime stamp = ?
    > limit 1;[/color]
    Thank you. I hadn't thought of doing the second query that way.

    I may have to drop the whole idea of doing this. The goal was to be able to
    sort a set of threads in alphabetical order by the from name of either the
    earliest or last message in that thread for a forum script. I'm not sure if
    the feature will give enough advantage to justify the extra looped querying
    (modifying the object to support it, query performance, etc.).

    Thank you, again, for your help!
    -Ben

    "Bill Karwin" <bill@karwin.co m> wrote in message
    news:cdjkrp0ta9 @enews1.newsguy .com...
    [snip][color=blue]
    > This is one of those tricky cases in SQL, and this sort of problem comes
    > up periodically. You want to aggregate using MIN() or MAX(), use GROUP
    > BY to get a row of output per fThreadID, but you also want an attribute
    > of the row that contained that MAX value. So one could imagine this
    > hypothetical SQL:
    >
    > select
    > (select t2.MsgID
    > from MyTable t2
    > where t2.fThreadID = t1.fThreadID
    > and t2.ReceivedTime stamp = max(t1.Received Timestamp)) as MsgID,
    > t1.fThreadID, max(t1.Received Timestamp)
    > from MyTable t1
    > group by t1.fThreadID;
    >
    > But this makes no sense. The result of the subquery is not guaranteed
    > to be a singleton value, because even though MsgID is a primary key and
    > must be unique, it is possible that two different MsgID's have identical
    > timestamps. Maybe not likely in the real world, but as far as your
    > database knows, there's nothing preventing it from occurring.
    >
    > I think this is a case that is necessary to solve in two queries:
    > select t1.fThreadID, max(t1.Received Timestamp) as
    > from MyTable t1
    > group by t1.fThreadID;
    >
    > Then loop through that result set and get the MsgID values matching each
    > one:
    > select t2.MsgID
    > from MyTable t2
    > where t2.fThreadID = ? and t2.ReceivedTime stamp = ?
    > limit 1;
    >[color=green]
    > > Example Table
    > > =============
    > > MsgID (primary key) fThreadID ReceivedTimesta mp (DateTime)
    > > 1 99 2004-01-02 11:10:15
    > > 2 99 2004-02-01 01:11:59
    > > 3 99 2004-05-17 18:15:01
    > > 4 102 2004-07-01 11:11:29
    > > 4 103 2004-07-01 11:11:11[/color]
    >
    > Is this a typo? Do you mean to list another value such as 5 as the
    > MsgID in the last row?
    >[color=green]
    > > Example Desired Result Set #1
    > > (earliest msg from each thread)
    > > =============== =============== =
    > > MsgID, fThreadID, ReceivedTimesta mp
    > > 1 99 2004-01-02 11:10:15
    > > 4 103 2004-07-01 11:11:11[/color]
    >
    > You had three distinct fThreadID values in the example dataset. I would
    > expect three rows in the result set of this query.[/color]
    [snip]


    Comment

    • Bill Karwin

      #3
      Re: Selecting 1 Record per Foreign Key

      Ben Gribaudo wrote:[color=blue]
      > Hello!
      >
      > I am trying to query a MySQL table and retrieve either the earliest or the
      > latest message in each thread. The messages are in a table with MsgID being
      > the primary key and fThreadID identifying to which thread the message
      > belongs.
      >
      > Examples below.
      >
      > Any ideas on the SQL I need to use here?[/color]

      This is one of those tricky cases in SQL, and this sort of problem comes
      up periodically. You want to aggregate using MIN() or MAX(), use GROUP
      BY to get a row of output per fThreadID, but you also want an attribute
      of the row that contained that MAX value. So one could imagine this
      hypothetical SQL:

      select
      (select t2.MsgID
      from MyTable t2
      where t2.fThreadID = t1.fThreadID
      and t2.ReceivedTime stamp = max(t1.Received Timestamp)) as MsgID,
      t1.fThreadID, max(t1.Received Timestamp)
      from MyTable t1
      group by t1.fThreadID;

      But this makes no sense. The result of the subquery is not guaranteed
      to be a singleton value, because even though MsgID is a primary key and
      must be unique, it is possible that two different MsgID's have identical
      timestamps. Maybe not likely in the real world, but as far as your
      database knows, there's nothing preventing it from occurring.

      I think this is a case that is necessary to solve in two queries:
      select t1.fThreadID, max(t1.Received Timestamp) as
      from MyTable t1
      group by t1.fThreadID;

      Then loop through that result set and get the MsgID values matching each
      one:
      select t2.MsgID
      from MyTable t2
      where t2.fThreadID = ? and t2.ReceivedTime stamp = ?
      limit 1;
      [color=blue]
      > Example Table
      > =============
      > MsgID (primary key) fThreadID ReceivedTimesta mp (DateTime)
      > 1 99 2004-01-02 11:10:15
      > 2 99 2004-02-01 01:11:59
      > 3 99 2004-05-17 18:15:01
      > 4 102 2004-07-01 11:11:29
      > 4 103 2004-07-01 11:11:11[/color]

      Is this a typo? Do you mean to list another value such as 5 as the
      MsgID in the last row?
      [color=blue]
      > Example Desired Result Set #1
      > (earliest msg from each thread)
      > =============== =============== =
      > MsgID, fThreadID, ReceivedTimesta mp
      > 1 99 2004-01-02 11:10:15
      > 4 103 2004-07-01 11:11:11[/color]

      You had three distinct fThreadID values in the example dataset. I would
      expect three rows in the result set of this query.

      Regards,
      Bill K.

      Comment

      • Bill Karwin

        #4
        Re: Selecting 1 Record per Foreign Key

        Ben Gribaudo wrote:[color=blue]
        > I may have to drop the whole idea of doing this. The goal was to be able to
        > sort a set of threads in alphabetical order by the from name of either the
        > earliest or last message in that thread for a forum script.[/color]

        You might want to check out a book called "SQL for Smarties" by Joe
        Celko. He has a few chapters on operating on recursive or heirarchical
        data structures in SQL. Interesting stuff!

        Regards,
        Bill K.

        Comment

        • Isaac Blank

          #5
          Re: Selecting 1 Record per Foreign Key

          "Bill Karwin" <bill@karwin.co m> wrote in message
          news:cdk6kf0tm5 @enews2.newsguy .com...[color=blue]
          > Ben Gribaudo wrote:[color=green]
          > > I may have to drop the whole idea of doing this. The goal was to be able[/color][/color]
          to[color=blue][color=green]
          > > sort a set of threads in alphabetical order by the from name of either[/color][/color]
          the[color=blue][color=green]
          > > earliest or last message in that thread for a forum script.[/color]
          >
          > You might want to check out a book called "SQL for Smarties" by Joe
          > Celko. He has a few chapters on operating on recursive or heirarchical
          > data structures in SQL. Interesting stuff!
          >[/color]
          He's also got a brand new book on that same topic:


          Comment

          • Ben Gribaudo

            #6
            Re: Selecting 1 Record per Foreign Key

            Hi Bill,

            Thank you for your reply!
            [color=blue]
            > Maybe not likely in the real world, but as far as your
            > database knows, there's nothing preventing it from occurring.[/color]
            And our job, as programmers, is to design our scripts/programs to handle
            such things.
            [color=blue]
            > Is this a typo? Do you mean to list another value such as 5 as the
            > MsgID in the last row?[/color]
            Yes.
            [color=blue]
            > You had three distinct fThreadID values in the example dataset. I would
            > expect three rows in the result set of this query.[/color]
            I meant for fThreadID 103 to be 102, making two unique fThreadIDs. Today
            must be my day for typos. :o)
            [color=blue]
            > I think this is a case that is necessary to solve in two queries:
            > select t1.fThreadID, max(t1.Received Timestamp) as
            > from MyTable t1
            > group by t1.fThreadID;
            >
            > Then loop through that result set and get the MsgID values matching each
            > one:
            > select t2.MsgID
            > from MyTable t2
            > where t2.fThreadID = ? and t2.ReceivedTime stamp = ?
            > limit 1;[/color]
            Thank you. I hadn't thought of doing the second query that way.

            I may have to drop the whole idea of doing this. The goal was to be able to
            sort a set of threads in alphabetical order by the from name of either the
            earliest or last message in that thread for a forum script. I'm not sure if
            the feature will give enough advantage to justify the extra looped querying
            (modifying the object to support it, query performance, etc.).

            Thank you, again, for your help!
            -Ben

            "Bill Karwin" <bill@karwin.co m> wrote in message
            news:cdjkrp0ta9 @enews1.newsguy .com...
            [snip][color=blue]
            > This is one of those tricky cases in SQL, and this sort of problem comes
            > up periodically. You want to aggregate using MIN() or MAX(), use GROUP
            > BY to get a row of output per fThreadID, but you also want an attribute
            > of the row that contained that MAX value. So one could imagine this
            > hypothetical SQL:
            >
            > select
            > (select t2.MsgID
            > from MyTable t2
            > where t2.fThreadID = t1.fThreadID
            > and t2.ReceivedTime stamp = max(t1.Received Timestamp)) as MsgID,
            > t1.fThreadID, max(t1.Received Timestamp)
            > from MyTable t1
            > group by t1.fThreadID;
            >
            > But this makes no sense. The result of the subquery is not guaranteed
            > to be a singleton value, because even though MsgID is a primary key and
            > must be unique, it is possible that two different MsgID's have identical
            > timestamps. Maybe not likely in the real world, but as far as your
            > database knows, there's nothing preventing it from occurring.
            >
            > I think this is a case that is necessary to solve in two queries:
            > select t1.fThreadID, max(t1.Received Timestamp) as
            > from MyTable t1
            > group by t1.fThreadID;
            >
            > Then loop through that result set and get the MsgID values matching each
            > one:
            > select t2.MsgID
            > from MyTable t2
            > where t2.fThreadID = ? and t2.ReceivedTime stamp = ?
            > limit 1;
            >[color=green]
            > > Example Table
            > > =============
            > > MsgID (primary key) fThreadID ReceivedTimesta mp (DateTime)
            > > 1 99 2004-01-02 11:10:15
            > > 2 99 2004-02-01 01:11:59
            > > 3 99 2004-05-17 18:15:01
            > > 4 102 2004-07-01 11:11:29
            > > 4 103 2004-07-01 11:11:11[/color]
            >
            > Is this a typo? Do you mean to list another value such as 5 as the
            > MsgID in the last row?
            >[color=green]
            > > Example Desired Result Set #1
            > > (earliest msg from each thread)
            > > =============== =============== =
            > > MsgID, fThreadID, ReceivedTimesta mp
            > > 1 99 2004-01-02 11:10:15
            > > 4 103 2004-07-01 11:11:11[/color]
            >
            > You had three distinct fThreadID values in the example dataset. I would
            > expect three rows in the result set of this query.[/color]
            [snip]


            Comment

            • Bill Karwin

              #7
              Re: Selecting 1 Record per Foreign Key

              Ben Gribaudo wrote:[color=blue]
              > I may have to drop the whole idea of doing this. The goal was to be able to
              > sort a set of threads in alphabetical order by the from name of either the
              > earliest or last message in that thread for a forum script.[/color]

              You might want to check out a book called "SQL for Smarties" by Joe
              Celko. He has a few chapters on operating on recursive or heirarchical
              data structures in SQL. Interesting stuff!

              Regards,
              Bill K.

              Comment

              • Isaac Blank

                #8
                Re: Selecting 1 Record per Foreign Key

                "Bill Karwin" <bill@karwin.co m> wrote in message
                news:cdk6kf0tm5 @enews2.newsguy .com...[color=blue]
                > Ben Gribaudo wrote:[color=green]
                > > I may have to drop the whole idea of doing this. The goal was to be able[/color][/color]
                to[color=blue][color=green]
                > > sort a set of threads in alphabetical order by the from name of either[/color][/color]
                the[color=blue][color=green]
                > > earliest or last message in that thread for a forum script.[/color]
                >
                > You might want to check out a book called "SQL for Smarties" by Joe
                > Celko. He has a few chapters on operating on recursive or heirarchical
                > data structures in SQL. Interesting stuff!
                >[/color]
                He's also got a brand new book on that same topic:


                Comment

                • Ben Gribaudo

                  #9
                  Re: Selecting 1 Record per Foreign Key

                  Thank you all for your help!
                  -Ben


                  Comment

                  • steve

                    #10
                    Re: Re: Selecting 1 Record per Foreign Key

                    "Ben Gribaudo" wrote:[color=blue]
                    > Thank you all for your help!
                    > -Ben[/color]

                    Ben, you said "thread", so I presume you are talking about a
                    discussion group or bulletin board. In that case, you want to
                    "denormaliz e" which means that you would want to keep the id of the
                    latest and earliest posts written to the topics table. This way, you
                    avoid doing the costly on-the-fly query that has been talked about
                    here.

                    So every time, there is a new post, the topics table is updated. If a
                    post is deleted, also the topics table is updated. If this approach
                    is not taken, it would mean serious performance issues, as more lead
                    is introduced.

                    --
                    http://www.dbForumz.com/ This article was posted by author's request
                    Articles individually checked for conformance to usenet standards
                    Topic URL: http://www.dbForumz.com/mySQL-Select...ict131294.html
                    Visit Topic URL to contact author (reg. req'd). Report abuse: http://www.dbForumz.com/eform.php?p=444060

                    Comment

                    • Ben Gribaudo

                      #11
                      Re: Selecting 1 Record per Foreign Key

                      Thank you all for your help!
                      -Ben


                      Comment

                      • Ben Gribaudo

                        #12
                        Re: Re: Selecting 1 Record per Foreign Key

                        Hi Steve,

                        You are correct--this is for discussion group, or, to be exact, a Web
                        interface to some e-mail lists. Thank you for that idea.

                        Do you know of any articles which deal with denormalizing for this
                        application?

                        Thank you!
                        -Ben

                        "steve" wrote...[color=blue]
                        > Ben, you said "thread", so I presume you are talking about a
                        > discussion group or bulletin board. In that case, you want to
                        > "denormaliz e" which means that you would want to keep the id of the
                        > latest and earliest posts written to the topics table. This way, you
                        > avoid doing the costly on-the-fly query that has been talked about
                        > here.[/color]



                        Comment

                        • steve

                          #13
                          Re: Re: Re: Selecting 1 Record per Foreign Key

                          "Ben Gribaudo" wrote:[color=blue]
                          > Hi Steve,
                          >
                          > You are correct--this is for discussion group, or, to be exact, a[/color]
                          Web[color=blue]
                          > interface to some e-mail lists. Thank you for that idea.
                          >
                          > Do you know of any articles which deal with denormalizing for this
                          > application?
                          >
                          > Thank you!
                          > -Ben
                          >
                          > "steve" wrote...[color=green]
                          > > Ben, you said "thread", so I presume you are talking about a
                          > > discussion group or bulletin board. In that case, you want to
                          > > "denormaliz e" which means that you would want to keep the id of[/color]
                          > the[color=green]
                          > > latest and earliest posts written to the topics table. This way,[/color]
                          > you[color=green]
                          > > avoid doing the costly on-the-fly query that has been talked[/color]
                          > about[color=green]
                          > > here.</font>[/color][/color]

                          Ben, don’t know articles. But you can download phpbb, and see their
                          data structures (topics table, posts table), and also look at the
                          script and see how it is done. If you are setting up a discussion
                          group, I suggest using either phpbb or maybe invision (for free code),
                          and start building from there. Bulletin board is a complex piece of
                          code (you’ll see once you get into it), and I would not attempt to
                          build what I can get for free.

                          --
                          http://www.dbForumz.com/ This article was posted by author's request
                          Articles individually checked for conformance to usenet standards
                          Topic URL: http://www.dbForumz.com/mySQL-Select...ict131294.html
                          Visit Topic URL to contact author (reg. req'd). Report abuse: http://www.dbForumz.com/eform.php?p=445076

                          Comment

                          • steve

                            #14
                            Re: Re: Selecting 1 Record per Foreign Key

                            "Ben Gribaudo" wrote:[color=blue]
                            > Thank you all for your help!
                            > -Ben[/color]

                            Ben, you said "thread", so I presume you are talking about a
                            discussion group or bulletin board. In that case, you want to
                            "denormaliz e" which means that you would want to keep the id of the
                            latest and earliest posts written to the topics table. This way, you
                            avoid doing the costly on-the-fly query that has been talked about
                            here.

                            So every time, there is a new post, the topics table is updated. If a
                            post is deleted, also the topics table is updated. If this approach
                            is not taken, it would mean serious performance issues, as more lead
                            is introduced.

                            --
                            http://www.dbForumz.com/ This article was posted by author's request
                            Articles individually checked for conformance to usenet standards
                            Topic URL: http://www.dbForumz.com/mySQL-Select...ict131294.html
                            Visit Topic URL to contact author (reg. req'd). Report abuse: http://www.dbForumz.com/eform.php?p=444060

                            Comment

                            • Ben Gribaudo

                              #15
                              Re: Re: Re: Selecting 1 Record per Foreign Key

                              Steve,

                              Thanks for that idea. Bulletin boards *are* complex! I've already spend
                              about two hundred hours on this and it will probably require another one to
                              two hundred hours. The specs for this system differ a bit from a traditional
                              discussion forum, so this is being programmed from scratch.

                              Thanks again!

                              Ben

                              "steve" wrote...[color=blue]
                              > Ben, don't know articles. But you can download phpbb, and see their
                              > data structures (topics table, posts table), and also look at the
                              > script and see how it is done. If you are setting up a discussion
                              > group, I suggest using either phpbb or maybe invision (for free code),
                              > and start building from there. Bulletin board is a complex piece of
                              > code (you'll see once you get into it), and I would not attempt to
                              > build what I can get for free.[/color]


                              Comment

                              Working...