Grouped links between tables.

Collapse
X
 
  • Time
  • Show
Clear All
new posts
  • KevHill
    New Member
    • Jun 2007
    • 6

    Grouped links between tables.

    I am trying to create a bibliographic database, working with MySQL and Python, and I have a MySQL based question.

    I understand how to link two tables together on a row by row basis using a scheme like this. However, references have a bit more complex structure. On any one reference you might have, say more than one author, and each author might contribute to more than one reference.

    I could just have a table with rows for each author/work pairing, but that would make a messy table very quicky, and probably be a bear to query. What I would rather is some sleek way to allow for a ref_table that has a column for authors, with multiple authors listed by id, then a author_table with their id, name, and possibly a reverse reference to all works they authored.

    I guess I could do this with some complex string parsing in Python, but it would seem to involve sending the entire author_table over the sql connection every time I wanted to update the ref_table. I was wondering if there were an easier more elegant way to do this within the MySQL server itself. Any ideas? Thanks.
  • Atli
    Recognized Expert Expert
    • Nov 2006
    • 5062

    #2
    Hi KevHill, and welcome to TSDN.

    You should never put more than one piece of data in a single field, and you should never put more than one column with the same data per table.
    These are the first two of three rules of database normalization.
    You can read more about Database Normalisation and Table structures in this article

    Anyways, if I am understanding you correctly, you will need to create a table that links your 'Author' table with your 'Refrence' table.
    This is a standard way to create a N/M (many to many) relationship between tables, and is used in most, if not all, normalized relational databases.

    This is how I would (and have) designed such a database.

    Lets say I have this user table:
    [code=sql]
    CREATE TABLE tblUser
    (
    UserID BIGINT UNSIGNED AUTO_INCREMENT,
    UserName VARCHAR(255) NOT NULL,
    PRIMARY KEY(UserID)
    )
    [/code]

    And this article table:
    [code=sql]
    CREATE TABLE tblArticle
    (
    ArticleID BIGINT UNSIGNED AUTO_INCREMENT,
    ArticleName VARCHAR(255) NOT NULL,
    ArticleText LONGTEXT,
    PRIMARY KEY(ArticleID)
    )
    [/code]

    To be able to link each user to many articles, and each article to many users I will have to create a table that links the users and the articles:
    [code=sql]
    CREATE TABLE tblUserArticle
    (
    UserID BIGINT UNSIGNED REFERENCES tblUser(UserID) ,
    ArticleID BIGINT UNSIGNED REFERENCES tblArticle(Arti cleID),
    PRIMARY KEY(UserID, ArticleID)
    )
    [/code]
    Using both the UserID and the ArticleID as the primary key prevents duplicate rows.

    Now, if you want to see all articles that any user has contributed to, you could do something like this:
    [code=sql]
    SELECT a.ArticleName FROM tblArticle AS a
    INNER JOIN tblUserArticle AS ua
    ON ua.ArticleID = a.ArticleID
    INNER JOIN tblUser AS u
    ON u.UserID = ua.UserID
    WHERE u.UserID = 1;
    [/code]

    Let us know if you have any questions or problems we can help with.

    Comment

    • KevHill
      New Member
      • Jun 2007
      • 6

      #3
      It seems odd to me that that is the most efficient way to code that.

      Imagine if I had 100 references, and each article had an average of 4 authors. This would lead to 400 unique author/article pairings

      A very common job of a reference manager would be to sort all of the articles by authors, which requires finding out all the authors on a given article. Such info should always be available, but that would require searching through 100*400=40,000 tuples every time to refresh the database. If it was 300 articles with an average of 6 authors per article (which is also not unreasonable for scientific articles used for a thesis project), that mushrooms up to 270,000 tuple searches just to refresh a UI.

      Is mysql really that fast? Fast enough that the process would be faster than a python loop which just took out a string something like "id_1, id_2, id_3" and then parsed that string and performed searches on just the pertinent author tuples, which I think would be processing 400 tuples in the first example above.

      Do INNER JOINs somehow get around this inefficiency?

      I can see how it would be easier to search for all articles a single author had written, as you showed. If you just had many-authors-per-article, that would involve sending a relatively large amount of data over the connection compared to the 1NF standard, but that could be solved by a reverse many-articles-per-author column in the author's table. Keeping the two paired columns updated would be a bit of extra effort, but still seems to me more efficient...

      So, is there something I'm missing?

      Comment

      • MMcCarthy
        Recognized Expert MVP
        • Aug 2006
        • 14387

        #4
        Hi Kevin,

        The way it works is this. tblAuthor and tblArticle have a many to many relationship. By adding tblAuthorPerArt icle they each have a one to many relationship with the new table.

        tblAuthor
        AuthorID (Primary key, Autonumber)
        AuthorName

        tblArticle
        ArticleID (Primary key, Autonumber)
        ArticleName

        tblAuthorPerArt icle
        AuthorID (Composite Primary key, Foreign key referencing tblAuthor)
        ArticleID (Composite Primary key, Foreign key referencing tblArticle)

        Query to get all articles by author lets assume we have a form (frmSeach) with a listbox (lstAuthors) based on the tblAuthor table. The AuthorName field displayed and the AuthorID field bound.

        [CODE=sql]
        SELECT ArticleName
        FROM tblAuthorPerArt icle INNER JOIN tblArticle
        ON tblAuthorPerArt icle.ArticleID = tblArticle.Arti cleID
        WHERE tblAuthorPerArt icle.AuthorID = [Forms]![frmSearch]![lstAuthors]
        [/CODE]

        Query to get all authors of an article we use the same form (frmSeach) with a listbox (lstArticles) based on the tblArticle table. The ArticleName field displayed and the ArticleID field bound.

        [CODE=sql]
        SELECT AuthorName
        FROM tblAuthorPerArt icle INNER JOIN tblAuthor
        ON tblAuthorPerArt icle.AuthorID = tblArticle.Auth orID
        WHERE tblAuthorPerArt icle.ArticleID = [Forms]![frmSearch]![lstArticles]
        [/CODE]

        To add/Edit an record you would have a main form (frmArticle) based on tblArticle and a subform based on the following query and bound to the main form based on ArticleID.

        [CODE=sql]
        SELECT tblAuthorPerArt icle.ArticleID, tblAuthor.*
        FROM tblAuthorPerArt icle INNER JOIN tblAuthor
        ON tblAuthorPerArt icle.AuthorID = tblAuthor.Autho rID
        [/CODE]

        This is actually the most efficient way of handling the data. If you create your queries as saved queries and reference them by name then they will be compiled once by the Jet Engine and should work efficiently thereafter.

        Mary

        Comment

        • KevHill
          New Member
          • Jun 2007
          • 6

          #5
          It'll take me awhile to get my newbie head wrapped around that one, but thanks!

          Comment

          • MMcCarthy
            Recognized Expert MVP
            • Aug 2006
            • 14387

            #6
            Originally posted by KevHill
            It'll take me awhile to get my newbie head wrapped around that one, but thanks!
            Not a problem, you're welcome.

            Comment

            • MMcCarthy
              Recognized Expert MVP
              • Aug 2006
              • 14387

              #7
              Sorry, just looked at this again. Ignore my references to listboxes and forms. I didn't realise this question was in the MySQL forum. Too used to posting in the Access forum.

              The structure is sound and the sql syntax is correct just not the reference to form controls.

              Comment

              • Atli
                Recognized Expert Expert
                • Nov 2006
                • 5062

                #8
                Yea, I also just realized that in my example I didn't really need to put a JOIN to the authors table, as I didn't acctually use any data from there.
                It would be usefull, however, if you were to search for a Author name.

                Comment

                Working...