Storing list data in MySQL

Collapse
X
 
  • Time
  • Show
Clear All
new posts
  • cheesecaker
    New Member
    • Feb 2007
    • 66

    Storing list data in MySQL

    How would a set of related values be stored together in a MySQL db?

    For example, if I wanted to store a user's buddy list, then the way I can think of would be something like this:

    user | friend
    John | Jim
    John | Bob
    John | Amy
    Bob | Jim
    Bob | Amy

    So John's friendlist would include Jim, Bob, and Amy, and Bob's friendlist would include Jim and Amy. But it seems really inefficient to create a new row every time a new entry in the list is needed

    I want to be able to store values like this:

    user | buddylist
    John | Jim,Bob,Amy
    Bob | Jim,Amy

    What would be the most efficient way to do this?
  • Atli
    Recognized Expert Expert
    • Nov 2006
    • 5062

    #2
    Hi.

    The second method you mentioned should never be used. No field should ever contain more than a single piece of data.

    A simple method to do something like that would be to create two tables. One for the user info and a second for the relationship between each user.

    It might look something like:
    Code:
    User
    ----------------
    UserID UserName
    1		John
    2		Jim
    3		Amy
    ----------------
    
    UserRelation
    ----------------
    UserID FriendID
    1		2
    1		3
    2		1
    3		2
    ----------------
    Where both the fields of the UserRelation talble reference the UserID field in the User table. You could even join them as the Primary Key to avoid duplicate entries.

    Comment

    • cheesecaker
      New Member
      • Feb 2007
      • 66

      #3
      Originally posted by Atli
      Hi.

      The second method you mentioned should never be used. No field should ever contain more than a single piece of data.

      A simple method to do something like that would be to create two tables. One for the user info and a second for the relationship between each user.

      It might look something like:
      Code:
      User
      ----------------
      UserID UserName
      1		John
      2		Jim
      3		Amy
      ----------------
      
      UserRelation
      ----------------
      UserID FriendID
      1		2
      1		3
      2		1
      3		2
      ----------------
      Where both the fields of the UserRelation talble reference the UserID field in the User table. You could even join them as the Primary Key to avoid duplicate entries.
      This is pretty much the setup I currently have, but I thought it was inefficient. So just for clarification, this is the best way to store a list-type data set? If I wanted to store, say, a list of items that the user owned, it'd be the same thing? A table with itemID/itemName and another with userID/itemOwned?

      One other thing. On that second table, what would the unique primary key be? Would I just have an arbitrary one such as entryID that would serve no purpose? Because tools like MySQL Query Browser require a primary key to be able to manipulate a table.

      Comment

      • Atli
        Recognized Expert Expert
        • Nov 2006
        • 5062

        #4
        Yes. This is pretty much the basis of a N:M relationship.

        This is also far more efficient that what you were suggesting in your first post.

        Firstly, your first example stored everything as text fields. It is much faster to search through integers than text, even if the text fields are indexed. Not to mention the amount of disk space all that text takes, compared to the integers.

        You second example, while using a less complex table structure, stores multiple pieces of data in each field as a string, which has to be processed and split into each individual part every time any one of them needs to be used.
        The overhead of that is far beyond any overhead a simple JOIN would cause, especially using a simple structure, like the one I posted.

        The second table doesn't need the traditional one-column integer primary key we use on most tables.
        You can use the two foreign key columns as the primary key:
        [code=mysql]
        CREATE TABLE UserRelation (
        UserID_FK INT Unsigned Not Null References User(UserID),
        FriendID_FK Int Unsigned Not Null References User(UserID),
        Primary Key (UserID_FK, FriendID_FK)
        );
        [/code]
        This will also protect the table from duplicate entries, like linking one user twice to the same friend.

        Comment

        • cheesecaker
          New Member
          • Feb 2007
          • 66

          #5
          Originally posted by Atli
          Yes. This is pretty much the basis of a N:M relationship.

          This is also far more efficient that what you were suggesting in your first post.

          Firstly, your first example stored everything as text fields. It is much faster to search through integers than text, even if the text fields are indexed. Not to mention the amount of disk space all that text takes, compared to the integers.

          You second example, while using a less complex table structure, stores multiple pieces of data in each field as a string, which has to be processed and split into each individual part every time any one of them needs to be used.
          The overhead of that is far beyond any overhead a simple JOIN would cause, especially using a simple structure, like the one I posted.

          The second table doesn't need the traditional one-column integer primary key we use on most tables.
          You can use the two foreign key columns as the primary key:
          [code=mysql]
          CREATE TABLE UserRelation (
          UserID_FK INT Unsigned Not Null References User(UserID),
          FriendID_FK Int Unsigned Not Null References User(UserID),
          Primary Key (UserID_FK, FriendID_FK)
          );
          [/code]
          This will also protect the table from duplicate entries, like linking one user twice to the same friend.
          I'm not familiar with the syntax for creating a table which references foreign keys. Could you explain the use of two column names with Primary Key()? I've only seen Primary Key (x), not Primary Key (x, x).

          Thanks for the help.

          Comment

          • Atli
            Recognized Expert Expert
            • Nov 2006
            • 5062

            #6
            Sure.

            Simply put, if you specify multiple columns in the Primary Key clause, MySQL will create a joint key, including all of them.

            The actual value of the Primary Key would be something like:
            Code:
            col1-col2-col3-colN
            Where the col1-colN values would be replaced by the value of each column in the given row.

            Because of this, you can not create a row containing the exact combination of values for the PK columns as any previous row. The Primary Key value must be unique, even tho each of the columns that make up the Primary Key can have duplicate entries.

            Which makes this ideal for the exact situation we were discussing.

            Comment

            • cheesecaker
              New Member
              • Feb 2007
              • 66

              #7
              So if the User and UserRelation tables were implemented in an actual application, in order to display John's buddylist, you'd have to do the following:

              Code:
              SELECT UserID FROM `User` WHERE UserName = 'John'
              And we'd get 1. Then we'd do...
              Code:
              SELECT FriendID FROM `UserRelation` WHERE UserID = 1
              And we'd get 2 and 3. Then we'd do...
              Code:
              SELECT UserName FROM `UserRelation` WHERE UserID = 2
              SELECT UserName FROM `UserRelation` WHERE UserID = 3
              And we'd finally get John's buddylist, Jim and Amy. Is all of this really faster than just having:
              Code:
              Friend
              ----------------
              UserName FriendName
              John     Jim
              John     Amy
              Jim      John
              Amy      John
              ----------------
              Where we could simply run SELECT FriendName FROM `Friend` WHERE UserName = 'John'?

              Comment

              • Atli
                Recognized Expert Expert
                • Nov 2006
                • 5062

                #8
                Yes. Matching text is usually a lot slower than matching numbers, although your approach is a very simplistic one.
                Not to mention that the way you are suggesting would waste infinitely more disk-space than the proper way.

                You need to use the tools available in a relational database.
                A SELECT query is not just limited to the SELECT FROM WHERE syntax. You can have it fetch data from more than one table, and have it filter this data based on more then a simple boolean x = y check.

                This is how such a database should be used:
                [code=mysql]
                SELECT `User`.`UserNam e` AS 'Friend Name'
                FROM `User`
                INNER JOIN `UserRelation`
                ON `User`.`UserID` = `UserRelation`. `FriendID_FK`
                AND `UserRelation`. `UserID_FK` = (
                SELECT `UserID` FROM `User`
                WHERE `UserName` = 'John'
                )
                [/code]
                It Joins the two tables and filters the data to show the friends of the user selected by the subquery.
                Note that I assume that the UserName is unique, which is why I use it in the subquery without a LIMIT clause.

                It does exactly what all of your SELECT queries do, but in the proper way.
                This is the true power of a relational database, it's ability to join tables and filter the data based on more than just a single table SELECT FROM WHERE statement.

                If you don't get the syntax, I suggest you read about Join and Subquery syntax.
                If it's all new to you, you may want to look up some tutorials on those concepts.

                Comment

                • NetDynamic
                  New Member
                  • Feb 2011
                  • 27

                  #9
                  This is all fine and good until the friend has more than one friend lol

                  Comment

                  • Atli
                    Recognized Expert Expert
                    • Nov 2006
                    • 5062

                    #10
                    Originally posted by NetDynamic
                    This is all fine and good until the friend has more than one friend lol
                    How do you mean? The example tables I posted in post #2 show John being friends with both Amy and Jim.

                    The whole purpose of the concepts we are talking about here is to link one row to multiple other rows. (Read: to allow one person to be friends with multiple persons.)

                    Comment

                    Working...