Recursive Query Question - Table Function

  • Tim Pascoe

    Recursive Query Question - Table Function

    I have a database which stores information about organisms collected
    during sediment toxicology research. For each sample, organisms in
    sediment are collected and identified taxonomically (Order, Family,
    Genus, Species).

    Taxonomy lookup information in the database is stored in a recursive
    table in the form:

    TSN (taxa serial number)
    Rank (Order, Family, Genus, Species)
    Name
    Parent_TSN (related Taxa at higher taxonomic level)

    When the number of a particular organism collected is entered into the
    database, the count is stored along with the lowest-level TSN to which
    the organisms were identified.

    Okay - now the problem. Depending on the type of analysis being done,
    a user may want organism counts at the lowest level, or rolled up to a
    higher taxonomic level (usually Family). Can I write a recursive
    function which will cycle through the Taxonomy database, and provide
    the name of the organism at the appropriate taxonomic level? Is this a
    reasonable approach with regard to speed and efficiency?

    Something Like:
    SELECT sample_id, 'Get Name Function(Rank, TSN)', Sum([count]) AS
    NoTaxa FROM dbo.tblbenthic

    Results could then be grouped and summed on the Name, to summarise
    data for each sample/taxa.

    Is this a reasonable approach? Or is there a better one? Did I explain
    the problem well enough?

    Thanks in advance,

    Tim
  • --CELKO--

    #2
    Re: Recursive Query Question - Table Function

    I do not know what a TSN (taxa serial number) looks like, but are there
    organisms that do not have (Order, Family, Genus, Species)? It would
    seem to be a better design to make them into columns, then use NULLs for
    the missing classification data.

    CREATE TABLE LabNotes
    (..
    ord INTEGER, -- ORDER is a reserved word
    family INTEGER,
    genus INTEGER,
    species INTEGER,
    CHECK (...),
    org_counts INTEGER NOT NULL CHECK (org_counts >= 0),
    ..);

    >> Okay - now the problem. Depending on the type of analysis being
    done, a user may want organism counts at the lowest level, or rolled up to a
    higher taxonomic level (usually Family). <<

    Now that is easy to do

    SELECT .. org_counts
    FROM LabNotes
    WHERE <level> IS NOT NULL;

    Did I miss something?

    • --CELKO--

      #3
      Re: Recursive Query Question - Table Function

      Sorry, I forgot to post the constraint and test data:

      CREATE TABLE LabNotes
      (ord INTEGER,
      family INTEGER,
      genus INTEGER,
      species INTEGER,
      CHECK (CASE WHEN (ord + family + genus + species) IS NOT NULL THEN 1
      WHEN COALESCE(ord, family, genus, species) IS NULL THEN 1
      WHEN COALESCE(species, family, genus) IS NULL THEN 1
      WHEN COALESCE(species, genus) IS NULL THEN 1
      WHEN species IS NULL THEN 1
      ELSE 0 END = 1)
      );
      SELECT * FROM LabNotes;

      --good data
      INSERT INTO Labnotes VALUES (1, 2, 3, 4);
      INSERT INTO Labnotes VALUES (1, 2, 3, NULL);
      INSERT INTO Labnotes VALUES (1, 2, NULL, NULL);
      INSERT INTO Labnotes VALUES (1, NULL, NULL, NULL);
      INSERT INTO Labnotes VALUES (NULL, NULL, NULL, NULL);
      -- bad data
      INSERT INTO Labnotes VALUES (1, 2, NULL, 4);
      INSERT INTO Labnotes VALUES (1, NULL, 3, 4);
      INSERT INTO Labnotes VALUES (1, 2, NULL, 3);
      INSERT INTO Labnotes VALUES (NULL, 2, NULL, 4);
      INSERT INTO Labnotes VALUES (NULL, NULL, NULL, 4);

      • Erland Sommarskog

        #4
        Re: Recursive Query Question - Table Function

        Tim Pascoe (tim.pascoe@cci w.ca) writes:
        > Okay - now the problem. Depending on the type of analysis being done,
        > a user may want organism counts at the lowest level, or rolled up to a
        > higher taxonomic level (usually Family). Can I write a recursive
        > function which will cycle through the Taxonomy database, and provide
        > the name of the organism at the appropriate taxonomic level? Is this a
        > reasonable approach with regard to speed and efficiency?
        >
        > Something Like:
        > SELECT sample_id, 'Get Name Function(Rank, TSN)', Sum([count]) AS
        > NoTaxa FROM dbo.tblbenthic
        >
        > Results could then be grouped and summed on the Name, to summarise
        > data for each sample/taxa.
        >
        > Is this a reasonable approach? Or is there a better one? Did I explain
        > the problem well enough?

        I don't think so, I only understand bits of it. :-)

        It may help if you post:

        o CREATE TABLE statement for your table.
        o INSERT statements with sample data.
        o The desired result from that sample data.
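
        As an illustration only, a script of that shape could look like the
        sketch below. Every table definition, data type, TSN value and count
        in it is made up for the example (the real TSNs and schema are not
        shown in this thread); only the column names TSN, Rank, Name,
        Parent_TSN, sample_id and count come from the original post.

        ```sql
        -- Hypothetical schema: an adjacency-list taxonomy plus a count table.
        CREATE TABLE dbo.tblTaxonomy
        (TSN        INT          NOT NULL PRIMARY KEY,
         [Rank]     VARCHAR(20)  NOT NULL,
         Name       VARCHAR(100) NOT NULL,
         Parent_TSN INT          NULL REFERENCES dbo.tblTaxonomy (TSN));

        CREATE TABLE dbo.tblbenthic
        (sample_id INT NOT NULL,
         TSN       INT NOT NULL REFERENCES dbo.tblTaxonomy (TSN),
         [count]   INT NOT NULL);

        -- Sample taxonomy; the TSN values are placeholders, not real serial numbers.
        INSERT INTO dbo.tblTaxonomy VALUES (1, 'Order',  'Ephemeroptera', NULL);
        INSERT INTO dbo.tblTaxonomy VALUES (2, 'Family', 'Baetidae', 1);
        INSERT INTO dbo.tblTaxonomy VALUES (3, 'Genus',  'Baetis', 2);
        INSERT INTO dbo.tblTaxonomy VALUES (4, 'Genus',  'Cloeon', 2);

        -- Sample counts, each stored against the lowest level identified.
        INSERT INTO dbo.tblbenthic VALUES (100, 3, 12);
        INSERT INTO dbo.tblbenthic VALUES (100, 4, 5);
        INSERT INTO dbo.tblbenthic VALUES (100, 2, 3);

        -- Desired result when rolled up to Family:
        -- sample_id  Name      NoTaxa
        -- 100        Baetidae  20
        ```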


        --
        Erland Sommarskog, SQL Server MVP, sommar@algonet. se

        • tim pascoe

          #5
          Re: Recursive Query Question - Table Function


          CELKO,

          The move to a recursive table structure was to get away from the example
          you suggested as an alternative :) Since organisms may only be IDed down
          to one of the possible taxonomic levels, data may be entered to Order,
          Family, Genus, or Species levels. If a row is used to identify this,
          with a NULL for the 'missing' data, you end up needing to store 4
          records for the complete taxonomy of a single organism, and names must
          be repeated hundreds of times (say the family has 10 genera, then that is
          20 repetitions of the Family name, to store Genus and Species)

          E.G. Order, NULL, NULL, NULL
          Order, Family, NULL, NULL
          Order, Family, Genus, NULL etc.

          It becomes a management nightmare when an organism changes from one
          Family to another, or one Genus to another (it happens very often,
          surprisingly).

          4 records are still required for each level in the recursion, but there
          are no NULL values, fewer columns, and the changes in taxonomy can be
          altered with the change of a single relational value (the Parent TSN).
          Also, repetition of higher categories does not occur, due to the
          relational nature of the links.

          Thanks for the reply, however. I will need to look up the COALESCE
          key-word, as I'm sure it will come in handy!

          Tim

          *** Sent via Developersdex http://www.developersdex.com ***
          Don't just participate in USENET...get rewarded for it!

          • --CELKO--

            #6
            Re: Recursive Query Question - Table Function

            >> Since organisms may only be IDed down to one of the possible
            taxonomic levels, data may be entered to Order, Family, Genus, or
            Species levels. If a row is used to identify this, with a NULL for the
            'missing' data, you end up needing to store 4 records [sic] for the
            complete taxonomy of a single organism, and names must be repeated
            hundreds of times (say the family has 10 Genus, then that is 20
            repetitions of the Family name, to store Genus and Species) <<

            If you get more detailed information on an organism, then you update
            the NULLs with the values you just discovered, don't you? One way or
            the other, each organism is going to be modeled once in my table
            design.
            >> It becomes a management nightmare when an organism changes from one
            Family to another, or one Genus to another (it happens very often,
            surprisingly). <<

            So that is just one update on poor old "Omosis Jones" to his new
            taxonomy. If I have to switch him to another family and I don't know
            any more about him yet, I just fill in (genus, species) with NULLs in
            the same single update.

            I think I might see what I am missing in this problem. Off to the
            side of the lab work, you can keep a nested sets model of the taxonomy
            apart from particular organisms. You can Google the basics on that
            model (or I can bore the regulars by posting it again).
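
            For readers who have not seen it, a minimal sketch of the nested
            sets idea applied to this taxonomy follows. The table and column
            names (Taxonomy, taxon_rank, taxon_name, lft, rgt, and a tsn column
            on tblbenthic) are assumptions for illustration, not the poster's
            actual schema.

            ```sql
            -- Each taxon gets a (lft, rgt) pair; a parent's pair encloses
            -- the pairs of everything classified beneath it.
            CREATE TABLE Taxonomy
            (tsn INTEGER NOT NULL PRIMARY KEY,
             taxon_rank VARCHAR(20) NOT NULL,
             taxon_name VARCHAR(100) NOT NULL,
             lft INTEGER NOT NULL UNIQUE,
             rgt INTEGER NOT NULL UNIQUE,
             CHECK (lft < rgt));

            -- Roll counts up to Family: F is the Family whose range
            -- contains the taxon T that the count was recorded against.
            SELECT B.sample_id, F.taxon_name, SUM(B.[count]) AS no_taxa
              FROM tblbenthic AS B
                   INNER JOIN Taxonomy AS T
                   ON T.tsn = B.tsn
                   INNER JOIN Taxonomy AS F
                   ON T.lft BETWEEN F.lft AND F.rgt
             WHERE F.taxon_rank = 'Family'
             GROUP BY B.sample_id, F.taxon_name;
            ```

            Counts recorded only at Order level have no enclosing Family, so
            they drop out of this result; whether that is wanted depends on
            the analysis.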

            Comment

            • tim pascoe

              #7
              Re: Recursive Query Question - Table Function

              CELKO,

              It is indeed a Nested Sets type of a problem. While there are more
              elegant ways of storing the data, a simple hierarchy in this case is
              very functional. The original question was not intended to ask if the
              table structure was effective (I've gone through that exercise already),
              but what the implications were for applying a user-defined function to
              extract recursive data when a user requests summary counts at a level
              higher than what the data was entered at (e.g. data entered at
              Genus/Species, but summary counts requested for Orders).

              I have built a stored procedure which returns the appropriate taxonomic
              name at the requested level. Now I need to figure out a way to turn it
              into a function, and use it in place of a column name in a query for
              the result set.
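
              One way such a function might look is sketched below. The names
              dbo.tblTaxonomy and dbo.fnTaxonName, the data types, and the TSN
              column on dbo.tblbenthic are assumptions for illustration; only
              the TSN, Rank, Name and Parent_TSN columns come from the
              description at the top of the thread. It walks up the Parent_TSN
              chain in a loop instead of calling itself recursively.

              ```sql
              -- Hypothetical scalar function: returns the name of the ancestor
              -- (or the taxon itself) at @TargetRank, or NULL if there is none.
              CREATE FUNCTION dbo.fnTaxonName (@TargetRank VARCHAR(20), @TSN INT)
              RETURNS VARCHAR(100)
              AS
              BEGIN
                  DECLARE @Rank VARCHAR(20), @Name VARCHAR(100),
                          @Parent INT, @Steps INT
                  SET @Steps = 0

                  -- The step cap guards against a cycle in the parent links.
                  WHILE @TSN IS NOT NULL AND @Steps < 10
                  BEGIN
                      -- Reset first, so a TSN with no row ends the walk.
                      SELECT @Rank = NULL, @Name = NULL, @Parent = NULL

                      SELECT @Rank = [Rank], @Name = Name, @Parent = Parent_TSN
                      FROM dbo.tblTaxonomy
                      WHERE TSN = @TSN

                      IF @Rank = @TargetRank
                          RETURN @Name

                      SET @TSN = @Parent
                      SET @Steps = @Steps + 1
                  END

                  RETURN NULL
              END
              GO

              -- Used in place of a column name, grouped on the returned name.
              SELECT b.sample_id,
                     dbo.fnTaxonName('Family', b.TSN) AS TaxonName,
                     SUM(b.[count]) AS NoTaxa
              FROM dbo.tblbenthic AS b
              GROUP BY b.sample_id, dbo.fnTaxonName('Family', b.TSN)
              ```

              A scalar function like this is evaluated for every row of
              tblbenthic, so on a large table it may be noticeably slower than
              a join-based rollup; it is worth timing both.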

              I think I'm on the right track - thanks again for your input.

              Tim

