PhP database design question

Collapse
This topic is closed.
X
X
 
  • Time
  • Show
Clear All
new posts
  • chris.withers@gmail.com

    PhP database design question

    I have built a web portal and was considering adding a 'user mail'
    feature for users to message each other. I'm a bit uncertain on how to
    design the DB, so was wondering if there is a standard to do this. I
    was figuring perhaps a table storing all user messages (but this might
    become very large very fast?) or would a seperate db storing
    sent/saved/recieved etc messages be better?

    Any advice would be appreciated,


    Chris


    Ps I'm unsure if this is the right group, but couldn;t find anything
    concerning php and Dbs which seemed more appropriate.

  • J2be

    #2
    Re: PhP database design question


    <chris.withers@ gmail.com> wrote in message
    news:1143627975 .829456.318230@ t31g2000cwb.goo glegroups.com.. .[color=blue]
    >I have built a web portal and was considering adding a 'user mail'
    > feature for users to message each other. I'm a bit uncertain on how to
    > design the DB, so was wondering if there is a standard to do this. I
    > was figuring perhaps a table storing all user messages (but this might
    > become very large very fast?) or would a seperate db storing
    > sent/saved/recieved etc messages be better?[/color]

    Yes, a separate Table for messages sent,saved,rece ived could give better
    results
    than a whole table with all the messages, expecially with tons of users'
    messages.

    If users messages are really more than expected think about upgrading
    hardware and doing load balancing.


    Regards


    --
    Leonardo Armando Iarrusso - J2Be
    www: http://www.J2be.com - e-mail: info[at]J2Be.com


    Comment

    • Jerry Stuckle

      #3
      Re: PhP database design question

      chris.withers@g mail.com wrote:[color=blue]
      > I have built a web portal and was considering adding a 'user mail'
      > feature for users to message each other. I'm a bit uncertain on how to
      > design the DB, so was wondering if there is a standard to do this. I
      > was figuring perhaps a table storing all user messages (but this might
      > become very large very fast?) or would a seperate db storing
      > sent/saved/recieved etc messages be better?
      >
      > Any advice would be appreciated,
      >
      >
      > Chris
      >
      >
      > Ps I'm unsure if this is the right group, but couldn;t find anything
      > concerning php and Dbs which seemed more appropriate.
      >[/color]

      Hi, Chris,

      First of all, no, this isn't the "wrong group" for the question. I don't know
      what a "right group" would be :-).

      There's no real standard for designing the DB for something like this. It's up
      to you how to do it.

      If I were doing it, I would keep the body of the message in a separate table,
      something like:

      First table:
      msgid
      from
      to
      date
      subject
      status (sent/received/saved)

      Second table:
      msgid
      msgtext

      The reason for keeping the message itself separate is speed. As you note, the
      file could become very big. You don't want the messages every time you scan -
      for instance, when you display the list of messages available to the user.
      Keeping the text in a separate table will speed up these requests, at least with
      MySQL. Of course, displaying the actual message will be slightly slower, but
      you won't be doing that as much as just displaying to/from/subject.

      --
      =============== ===
      Remove the "x" from my email address
      Jerry Stuckle
      JDS Computer Training Corp.
      jstucklex@attgl obal.net
      =============== ===

      Comment

      • Roman Ziak

        #4
        Re: PhP database design question

        Jerry Stuckle wrote:[color=blue]
        > chris.withers@g mail.com wrote:[color=green]
        >> I have built a web portal and was considering adding a 'user mail'
        >> feature for users to message each other. I'm a bit uncertain on how to
        >> design the DB, so was wondering if there is a standard to do this. I
        >> was figuring perhaps a table storing all user messages (but this might
        >> become very large very fast?) or would a seperate db storing[/color]
        >
        > [snip]
        >
        > If I were doing it, I would keep the body of the message in a separate
        > table, something like:
        >
        > First table:
        > msgid
        > from
        > to
        > date
        > subject
        > status (sent/received/saved)
        >
        > Second table:
        > msgid
        > msgtext
        >
        > The reason for keeping the message itself separate is speed. As you
        > note, the file could become very big. You don't want the messages every
        > time you scan - for instance, when you display the list of messages
        > available to the user. Keeping the text in a separate table will speed
        > up these requests, at least with MySQL. Of course, displaying the
        > actual message will be slightly slower, but you won't be doing that as
        > much as just displaying to/from/subject.
        >[/color]

        Jerry, can you elaborate on the field msgtext slowing down MySQL ?

        Say for inbox retrieval for user "johndoe" we use:

        SELECT msgid,from,to,d ate,subject,sta tus
        FROM messages
        WHERE to='johndoe'


        instead of

        SELECT *
        FROM messages
        WHERE to='johndoe'

        (It will be more appropriate to use user's id for "from" and "to"
        instead of name. Also one would want to use slightly different field
        names because some of the suggested ones are SQL keywords).

        I am assuming indexed "from" and "to" queries, so db engine will most of
        the time work on these indexes instead of crunching through the table.
        Indexes will slow down the INSERT but that will be used less often than
        SELECT.

        In my opinion the field "msgtext" should not load the database if it is
        not required in the query, but I am by no means a database expert. Maybe
        an idea for quick experiment :)

        Roman

        Comment

        • Jerry Stuckle

          #5
          Re: PhP database design question

          Roman Ziak wrote:[color=blue]
          > Jerry Stuckle wrote:
          >[color=green]
          >>chris.withers @gmail.com wrote:
          >>[color=darkred]
          >>>I have built a web portal and was considering adding a 'user mail'
          >>>feature for users to message each other. I'm a bit uncertain on how to
          >>>design the DB, so was wondering if there is a standard to do this. I
          >>>was figuring perhaps a table storing all user messages (but this might
          >>>become very large very fast?) or would a seperate db storing[/color]
          >>
          >>[snip]
          >>
          >>If I were doing it, I would keep the body of the message in a separate
          >>table, something like:
          >>
          >>First table:
          >> msgid
          >> from
          >> to
          >> date
          >> subject
          >> status (sent/received/saved)
          >>
          >>Second table:
          >> msgid
          >> msgtext
          >>
          >>The reason for keeping the message itself separate is speed. As you
          >>note, the file could become very big. You don't want the messages every
          >>time you scan - for instance, when you display the list of messages
          >>available to the user. Keeping the text in a separate table will speed
          >>up these requests, at least with MySQL. Of course, displaying the
          >>actual message will be slightly slower, but you won't be doing that as
          >>much as just displaying to/from/subject.
          >>[/color]
          >
          >
          > Jerry, can you elaborate on the field msgtext slowing down MySQL ?
          >
          > Say for inbox retrieval for user "johndoe" we use:
          >
          > SELECT msgid,from,to,d ate,subject,sta tus
          > FROM messages
          > WHERE to='johndoe'
          >
          >
          > instead of
          >
          > SELECT *
          > FROM messages
          > WHERE to='johndoe'
          >
          > (It will be more appropriate to use user's id for "from" and "to"
          > instead of name. Also one would want to use slightly different field
          > names because some of the suggested ones are SQL keywords).
          >
          > I am assuming indexed "from" and "to" queries, so db engine will most of
          > the time work on these indexes instead of crunching through the table.
          > Indexes will slow down the INSERT but that will be used less often than
          > SELECT.
          >
          > In my opinion the field "msgtext" should not load the database if it is
          > not required in the query, but I am by no means a database expert. Maybe
          > an idea for quick experiment :)
          >
          > Roman[/color]

          Roman,

          But it WILL slow down the database.

          Everything on one table is typically kept in a single file. So, let's say
          you're going to list the received messages for 'johndoe'. Are you going to
          display all the message text with each one? Typically they display a header -
          only to, from, subject and date sent (and maybe a message number). A query
          would look like:

          SELECT from, subject, sentdate
          FROM messages
          WHERE to = 'johndoe';

          Now - it can use an index to locate the records where to = 'johndoe'. No
          problem there. However, it must still read the table to get subject and sent date.

          If the message text is in the same table, it still has to read the message text
          (or at least part of it), even though it wasn't requested. And there is more
          data to buffer, making it less likely that the next message is in the same
          physical block of data.

          OTOH, if the message text is in a different table, it doesn't have to be read
          with the rest of the header information. There is a greater likelihood that the
          next message is in the same buffer. And more message headers can be buffered in
          the same amount of memory.

          Of course, there will be slightly more overhead when you fetch the text of the
          message because you now have to join two tables. But in a typical system that's
          not done nearly as often as displaying a list of messages.

          Just because you're not returning all the data doesn't mean there isn't
          additional overhead.

          --
          =============== ===
          Remove the "x" from my email address
          Jerry Stuckle
          JDS Computer Training Corp.
          jstucklex@attgl obal.net
          =============== ===

          Comment

          • Jerry Stuckle

            #6
            Re: PhP database design question

            Jerry Stuckle wrote:


            That should be:

            SELECT sentfrom, subject, sentdate
            FROM messages
            WHERE sentto = 'johndoe';



            --
            =============== ===
            Remove the "x" from my email address
            Jerry Stuckle
            JDS Computer Training Corp.
            jstucklex@attgl obal.net
            =============== ===

            Comment

            • NC

              #7
              Re: PhP database design question

              J2be wrote:[color=blue]
              > <chris.withers@ gmail.com> wrote in message
              > news:1143627975 .829456.318230@ t31g2000cwb.goo glegroups.com.. .[color=green]
              > >I have built a web portal and was considering adding a 'user mail'
              > > feature for users to message each other. I'm a bit uncertain on how to
              > > design the DB, so was wondering if there is a standard to do this. I
              > > was figuring perhaps a table storing all user messages (but this might
              > > become very large very fast?) or would a seperate db storing
              > > sent/saved/recieved etc messages be better?[/color]
              >
              > Yes, a separate Table for messages sent,saved,rece ived could give
              > better results than a whole table with all the messages,[/color]

              I respectfully disagree. First, this would imply that a message sent
              by user A to user B will have to exist in two instances (in the `sent`
              table for user A and in the `received` table for user B). If a user
              broadcasts a message to 100 other users, you'll have to store 101
              copies of that message... Second, changing message status, say, from
              received to saved would require an INSERT INTO `saved` and a DELETE
              FROM `received` (probably inside a transaction, just to make sure we
              don't end up with twins), as opposed to something much more mundane,
              such as UPDATE `messages` SET status='saved'.

              Personally, I like Jerry Stuckle's suggestion:

              First table:
              msgid
              from
              to
              date
              subject
              status (sent/received/saved)

              Second table:
              msgid
              msgtext

              This is a good architecture, which may require a little tweaking if
              users are allowed to delete their messages and/or send messages to
              multiple users.

              Cheers,
              NC

              Comment

              • Andy Jeffries

                #8
                Re: PhP database design question

                On Wed, 29 Mar 2006 09:53:54 -0800, NC wrote:[color=blue][color=green]
                >> Yes, a separate Table for messages sent,saved,rece ived could give better
                >> results than a whole table with all the messages,[/color]
                >
                > I respectfully disagree. First, this would imply that a message sent by
                > user A to user B will have to exist in two instances (in the `sent` table
                > for user A and in the `received` table for user B). If a user broadcasts
                > a message to 100 other users, you'll have to store 101 copies of that
                > message...[/color]

                OK, if you only have one table, how do you cope with the situation above
                when 50 people delete it from their inbox, but the others want to keep the
                message.

                In a logical sense it absolutely does exist in each person's inbox/sent
                items as each person could apply their own choice of actions to it
                (deleting it is one, it could also be marked that you've replied and when
                or flagged for follow-up or marked as spam) - these aren't global for
                every copy of the message.

                Cheers,


                Andy

                --
                Andy Jeffries MBCS CITP ZCE | gPHPEdit Lead Developer
                http://www.gphpedit.org | PHP editor for Gnome 2
                http://www.andyjeffries.co.uk | Personal site and photos

                Comment

                • NC

                  #9
                  Re: PhP database design question

                  Andy Jeffries wrote:[color=blue]
                  > On Wed, 29 Mar 2006 09:53:54 -0800, NC wrote:[color=green][color=darkred]
                  > >> Yes, a separate Table for messages sent,saved,rece ived could give better
                  > >> results than a whole table with all the messages,[/color]
                  > >
                  > > I respectfully disagree. First, this would imply that a message sent by
                  > > user A to user B will have to exist in two instances (in the `sent` table
                  > > for user A and in the `received` table for user B). If a user broadcasts
                  > > a message to 100 other users, you'll have to store 101 copies of that
                  > > message...[/color]
                  >
                  > OK, if you only have one table, how do you cope with the situation above
                  > when 50 people delete it from their inbox, but the others want to keep the
                  > message.[/color]

                  Simple, really; per Jerry Stuckle's suggestion, let's adopt the
                  following data model with a few tweaks:

                  Table `messages`:
                  msgid (primary key)
                  msgtext

                  Table `headers`:
                  id (primary key)
                  msgid (index, non-unique; links to `messsages`.`ms gid`)
                  owner (index, non-unique; links to user ID)
                  from
                  to
                  date
                  subject
                  status ('draft'/'sent'/'received'/'saved'/'deleted')

                  Now, let's say user A sends a message to B, C and D. Now we have
                  something like this (partial rendering of the `headers` table):

                  id | msgid | owner | from | to | status
                  11 | 8 | A | A |B,C,D| sent
                  12 | 8 | B | A |B,C,D| received
                  13 | 8 | C | A |B,C,D| received
                  14 | 8 | D | A |B,C,D| received

                  Viewing A's outbox would look like this:

                  SELECT [fields] FROM headers
                  WHERE owner='A' AND status='sent';

                  Viewing B's inbox would look like this:

                  SELECT [fields] FROM headers
                  WHERE owner='B' AND status='receive d';

                  Deleting a message by C would look like this:

                  UPDATE headers SET status='deleted '
                  WHERE msgid=8 AND owner='C';

                  Pretty straightforward , I think...

                  Cheers,
                  NC

                  Comment

                  • Andy Jeffries

                    #10
                    Re: PhP database design question

                    On Wed, 29 Mar 2006 12:07:54 -0800, NC wrote:[color=blue][color=green][color=darkred]
                    >> > I respectfully disagree. First, this would imply that a message sent
                    >> > by user A to user B will have to exist in two instances (in the `sent`
                    >> > table for user A and in the `received` table for user B). If a user
                    >> > broadcasts a message to 100 other users, you'll have to store 101
                    >> > copies of that message...[/color]
                    >>
                    >> OK, if you only have one table, how do you cope with the situation above
                    >> when 50 people delete it from their inbox, but the others want to keep
                    >> the message.[/color]
                    >
                    > Simple, really[/color]

                    The worst thing was as soon as you wrote "Simple, really" it twigged in my
                    mind. Ignore me, obviously having a dopey day earlier...

                    (On the upside, I was just going to slink away in embarrassment - at least
                    I've been man enough to post on here and admit I was being a twat!)

                    Cheers,


                    Andy

                    --
                    Andy Jeffries MBCS CITP ZCE | gPHPEdit Lead Developer
                    http://www.gphpedit.org | PHP editor for Gnome 2
                    http://www.andyjeffries.co.uk | Personal site and photos

                    Comment

                    • Richard Levasseur

                      #11
                      Re: PhP database design question

                      Couple things:

                      - Since this is an internal system, delivery is gaurented and status
                      should would be better as read/unread, and perhaps 'deleted' if you
                      want to keep old messages around.
                      - To nit pick a bit: normalized (read: more or less 'proper') database
                      design doesn't allow for multiple values to be within a single field
                      (1st form violation? Can't recall exactly). This also eliminates the
                      need for the "owner" field, allows better indexing of "to", selection
                      with to, aggregate functions for things involving the "to" field.
                      - There doesn't need to be a record with status sent, that is implied
                      by someone receiving the message.
                      - If each row has a primary key id, then it would be better to perform
                      operations on the rows using the primary key, rather than a combination
                      of other keys that is unique (or just call that combination the primary
                      key, i personally feel its easier pass around a single id rather than a
                      combination of multiple fields)

                      So the table headers would become, in the instance A sends a message to
                      B, C, and D:

                      id | msgid | from | to | status
                      12 | 8 | A | B | unread
                      13 | 8 | A | C | read
                      14 | 8 | A | D | read



                      You're now storing 3 rows instead 4 and have eliminated a whole column.

                      Comment

                      Working...