Nested Coalescing possible in SQL?

Collapse
This topic is closed.
X
X
 
  • Time
  • Show
Clear All
new posts
  • Jeff Lanfield

    Nested Coalescing possible in SQL?

    First of all, I apologize if coalescing is not the right term to
    describe my problem. I have a tree where each node has the same set of
    attributes (is the same entity) but child nodes should inherit
    attribute values from parent node.

    for example, say I have the following table:

    (nodeId int , color varchar, phone varchar) with two rows

    5, "GREEN", "555-1212"
    7, NULL, "777-5555"
    8, NULL, NULL
    9, "BLUE", NULL

    in addition there is a tree structure that specifies that node 5 is
    the parent of node 7, and 7 is the parent of nodes 8 and 9. I know
    there is many ways to make trees in SQL but as a simple example let's
    say the tree is:

    id, parentid
    8, 7
    9, 7
    7, 5

    Thus in this case, node 7 inherits the value "GREEN" from node 5 for
    attribute "color", but provides its own value "777-5555" for attribute
    "phone". Node 8, in turn, inherits "GREEN" for "color" from node 7
    (really from node 5 since 7 did not specify its own) and "777-5555"
    for "phone" from node 7. Node 9 provides its own value for "color" and
    inherits the one for "phone" from Node 7.

    So the runtime values in the application are:

    Node 5: "GREEN", "555-1212"
    Node 7: "GREEN", "777-5555"
    Node 8: "GREEN", "777-5555"
    Node 9: "BLUE", "777-5555"

    Question 1: Is there a single SQL statement that for a given node can
    replace the NULLs with inherited values from the parent node?

    Question 2: Is there a better way to structure such data in SQL as to
    make answer to question 1 possible?

    I would restate the problem as follows:

    In a nested structure child nodes inherit values from parent nodes _by
    reference_ or specify their own. "By reference" is the key word here.
    If it wasn't for that you could just duplicate the necessary values
    from the parent entitity upon creation.

    Thanks!

    - Jeff
  • Mikito Harakiri

    #2
    Re: Nested Coalescing possible in SQL?


    "Jeff Lanfield" <jlanfield2003@ yahoo.com> wrote in message
    news:235c483f.0 406011716.37d00 399@posting.goo gle.com...[color=blue]
    > First of all, I apologize if coalescing is not the right term to
    > describe my problem. I have a tree where each node has the same set of
    > attributes (is the same entity) but child nodes should inherit
    > attribute values from parent node.
    >
    > for example, say I have the following table:
    >
    > (nodeId int , color varchar, phone varchar) with two rows
    >
    > 5, "GREEN", "555-1212"
    > 7, NULL, "777-5555"
    > 8, NULL, NULL
    > 9, "BLUE", NULL
    >
    > in addition there is a tree structure that specifies that node 5 is
    > the parent of node 7, and 7 is the parent of nodes 8 and 9. I know
    > there is many ways to make trees in SQL but as a simple example let's
    > say the tree is:
    >
    > id, parentid
    > 8, 7
    > 9, 7
    > 7, 5
    >
    > Thus in this case, node 7 inherits the value "GREEN" from node 5 for
    > attribute "color", but provides its own value "777-5555" for attribute
    > "phone". Node 8, in turn, inherits "GREEN" for "color" from node 7
    > (really from node 5 since 7 did not specify its own) and "777-5555"
    > for "phone" from node 7. Node 9 provides its own value for "color" and
    > inherits the one for "phone" from Node 7.
    >
    > So the runtime values in the application are:
    >
    > Node 5: "GREEN", "555-1212"
    > Node 7: "GREEN", "777-5555"
    > Node 8: "GREEN", "777-5555"
    > Node 9: "BLUE", "777-5555"
    >
    > Question 1: Is there a single SQL statement that for a given node can
    > replace the NULLs with inherited values from the parent node?
    >
    > Question 2: Is there a better way to structure such data in SQL as to
    > make answer to question 1 possible?
    >
    > I would restate the problem as follows:
    >
    > In a nested structure child nodes inherit values from parent nodes _by
    > reference_ or specify their own. "By reference" is the key word here.
    > If it wasn't for that you could just duplicate the necessary values
    > from the parent entitity upon creation.[/color]

    It looks easy. Find a closest node in the chain of ancestors that has
    property not NULL.


    Comment

    • Lennart Jonsson

      #3
      Re: Nested Coalescing possible in SQL?

      jlanfield2003@y ahoo.com (Jeff Lanfield) wrote in message news:<235c483f. 0406011716.37d0 0399@posting.go ogle.com>...[color=blue]
      > First of all, I apologize if coalescing is not the right term to
      > describe my problem. I have a tree where each node has the same set of
      > attributes (is the same entity) but child nodes should inherit
      > attribute values from parent node.
      >
      > for example, say I have the following table:
      >
      > (nodeId int , color varchar, phone varchar) with two rows
      >
      > 5, "GREEN", "555-1212"
      > 7, NULL, "777-5555"
      > 8, NULL, NULL
      > 9, "BLUE", NULL
      >
      > in addition there is a tree structure that specifies that node 5 is
      > the parent of node 7, and 7 is the parent of nodes 8 and 9. I know
      > there is many ways to make trees in SQL but as a simple example let's
      > say the tree is:ancestor a
      >
      > id, parentid
      > 8, 7
      > 9, 7
      > 7, 5
      >
      > Thus in this case, node 7 inherits the value "GREEN" from node 5 for
      > attribute "color", but provides its own value "777-5555" for attribute
      > "phone". Node 8, in turn, inherits "GREEN" for "color" from node 7
      > (really from node 5 since 7 did not specify its own) and "777-5555"
      > for "phone" from node 7. Node 9 provides its own value for "color" and
      > inherits the one for "phone" from Node 7.
      >
      > So the runtime values in the application are:
      >
      > Node 5: "GREEN", "555-1212"
      > Node 7: "GREEN", "777-5555"
      > Node 8: "GREEN", "777-5555"
      > Node 9: "BLUE", "777-5555"
      >
      > Question 1: Is there a single SQL statement that for a given node can
      > replace the NULLs with inherited values from the parent node?
      >
      > Question 2: Is there a better way to structure such data in SQL as to
      > make answer to question 1 possible?
      >
      > I would restate the problem as follows:
      >
      > In a nested structure child nodes inherit values from parent nodes _by
      > reference_ or specify their own. "By reference" is the key word here.
      > If it wasn't for that you could just duplicate the necessary values
      > from the parent entitity upon creation.
      >
      > Thanks!
      >
      > - Jeff[/color]


      If you would like to do it in a single query, you will have to extend
      your tree, or if your db supports it, you can use recursion. There are
      several ways of extending your tree, nested set, transitive closure,
      etc. If you google comp.database and comp.database.t heory you will
      find several threads regarding this.

      Assuming you can "calculate" the set of ancestors for any given node,
      define the suspects as "ancestors with property p". The property you
      are looking for can be found in the suspect with the largest depth*.

      *depth = number of ancestors


      HTH
      /Lennart

      Comment

      • --CELKO--

        #4
        Re: Nested Coalescing possible in SQL?

        >> Question 2: Is there a better way to structure such data in SQL as
        to make answer to question 1 possible? <<

        Here is the link on Amazon.com for my new book on "Trees & Hierarchies
        in SQL"



        The classic scenario calls for a root class with all the common
        attributes and then specialized sub-classes under it. As an example,
        let's take the class of Vehicles and find an industry standard
        identifier (VIN), and add two mutually exclusive sub-classes, Sport
        utility vehicles and sedans ('SUV', 'SED').

        CREATE TABLE Vehicles
        (vin CHAR(17) NOT NULL PRIMARY KEY,
        vehicle_type CHAR(3) NOT NULL
        CHECK(vehicle_t ype IN ('SUV', 'SED')),
        UNIQUE (vin, vehicle_type),
        ..);

        Notice the overlapping candidate keys. I then use a compound candidate
        key (vin, vehicle_type) and a constraint in each sub-class table to
        assure that the vehicle_type is locked and agrees with the Vehicles
        table. Add some DRI actions and you are done:

        CREATE TABLE SUV
        (vin CHAR(17) NOT NULL PRIMARY KEY,
        vehicle_type CHAR(3) DEFAULT 'SUV' NOT NULL
        CHECK(vehicle_t ype = 'SUV'),
        UNIQUE (vin, vehicle_type),
        FOREIGN KEY (vin, vehicle_type)
        REFERENCES Vehicles(vin, vehicle_type)
        ON UPDATE CASCADE
        ON DELETE CASCADE,
        ..);

        CREATE TABLE Sedans
        (vin CHAR(17) NOT NULL PRIMARY KEY,
        vehicle_type CHAR(3) DEFAULT 'SED' NOT NULL
        CHECK(vehicle_t ype = 'SED'),
        UNIQUE (vin, vehicle_type),
        FOREIGN KEY (vin, vehicle_type)
        REFERENCES Vehicles(vin, vehicle_type)
        ON UPDATE CASCADE
        ON DELETE CASCADE,
        ..);

        I can continue to build a hierarchy like this. For example, if I had
        a Sedans table that broke down into two-door and four-door sedans, I
        could a schema like this:

        CREATE TABLE Sedans
        (vin CHAR(17) NOT NULL PRIMARY KEY,
        vehicle_type CHAR(3) DEFAULT 'SED' NOT NULL
        CHECK(vehicle_t ype IN ('2DR', '4DR', 'SED')),
        UNIQUE (vin, vehicle_type),
        FOREIGN KEY (vin, vehicle_type)
        REFERENCES Vehicles(vin, vehicle_type)
        ON UPDATE CASCADE
        ON DELETE CASCADE,
        ..);

        CREATE TABLE TwoDoor
        (vin CHAR(17) NOT NULL PRIMARY KEY,
        vehicle_type CHAR(3) DEFAULT '2DR' NOT NULL
        CHECK(vehicle_t ype = '2DR'),
        UNIQUE (vin, vehicle_type),
        FOREIGN KEY (vin, vehicle_type)
        REFERENCES Sedans(vin, vehicle_type)
        ON UPDATE CASCADE
        ON DELETE CASCADE,
        ..);

        CREATE TABLE FourDoor
        (vin CHAR(17) NOT NULL PRIMARY KEY,
        vehicle_type CHAR(3) DEFAULT '4DR' NOT NULL
        CHECK(vehicle_t ype = '4DR'),
        UNIQUE (vin, vehicle_type),
        FOREIGN KEY (vin, vehicle_type)
        REFERENCES Sedans (vin, vehicle_type)
        ON UPDATE CASCADE
        ON DELETE CASCADE,
        ..);

        The idea is to build a chain of identifiers and types in a UNIQUE()
        constraint that go up the tree when you use a REFERENCES constraint.
        Obviously, you can do variants of this trick to get different class
        structures.

        If an entity doesn't have to be exclusively one subtype, you play with
        the root of the class hierarchy:

        CREATE TABLE Vehicles
        (vin CHAR(17) NOT NULL,
        vehicle_type CHAR(3) NOT NULL
        CHECK(vehicle_t ype IN ('SUV', 'SED')),
        PRIMARY KEY (vin, vehicle_type),
        ..);

        Now start hiding all this stuff in VIEWs immediately and add an
        INSTEAD OF trigger to those VIEWs.

        Comment

        • Lennart Jonsson

          #5
          Re: Nested Coalescing possible in SQL?

          jlanfield2003@y ahoo.com (Jeff Lanfield) wrote in message news:<235c483f. 0406021038.13a1 b83b@posting.go ogle.com>...

          [...]
          [color=blue]
          >
          > So say the tree is specified like this:
          >
          > (nodeId int, parentId int, color varchar, phone varchar)
          >
          > 5, 0,"GREEN", "555-1212"
          > 7, 5, NULL, "777-5555"
          > 8, 7 NULL, NULL
          > 9, 7 "BLUE", NULL
          >
          >
          > That is: node 7 inherits values from 5. Nodes 8,9 inherit values from
          > 7. Node 5 is the top level node.
          >
          > I want to run a query that would give the following result set:
          >
          > select nodeId, color, phone from ...
          >
          > 5,"GREEN", "555-1212"
          > 7,"GREEN", "777-5555"
          > 8,"GREEN", "777-5555"
          > 9,"BLUE", "777-5555"
          >
          > Can such a query be constructed?
          >[/color]

          If you dont have a maximum depth of your tree (and cannot use
          recursion), then no. The reason is that you cannot express queries
          like "gimmie the ancestors of x". What you need to do is to inform
          your system that 5 is an ancestor of 9. There are several ways of
          doing this. Troels Arvin has a page with links to articles on the
          subject:



          1. Nested set: google for Celko and nested. There are also variants of
          this.
          2. Store the path in each node. In your case something like:

          5,'',"GREEN", "555-1212"
          7,'5',"GREEN", "777-5555"
          8,'5.7',"GREEN" , "777-5555"
          9,'5.7',"BLUE", "777-5555"

          In your case I dont think this is an option

          3. Store a separate ancestor relation. In your case

          create table ancestor (
          nodeid int not null
          references tree,
          ancestorid int not null
          references tree,
          primary key (nodeid, ancestorid)
          );

          insert into ancestor values (7,5), (8,7), (8,5), (9,7), (9,5);

          Main drawback is that you have to maintain this relation as you
          modifies your tree. I have some notes on how that can be achieved:





          Anyhow, once you have a way of asking for ancestors for a given node,
          you can do what you want in a single query, namely:

          find the ancestor at the largest depth with property p

          Assuming the following ddl

          create table tree (
          nodeid int not null primary key,
          parent int not null
          references tree
          on delete restrict
          );

          create table data (
          nodeid int not null primary key
          references tree
          on delete cascade,
          color varchar(10),
          phone varchar(10)
          );

          insert into tree values (5,5), (7,5), (8,7), (9,7);
          insert into data values (5, 'GREEN', '555-1212'), (7, NULL,
          '777-5555'), (8, NUL
          L, NULL), (9, 'BLUE', NULL);

          create table ancestor (
          nodeid int not null
          references tree,
          ancestorid int not null
          references tree,
          primary key (nodeid, ancestorid)
          );

          insert into ancestor values (7,5), (8,7), (8,5), (9,7), (9,5);

          You could use a query like:

          with suspects (nodeid, suspectid, depth) as (
          select
          a.nodeid, a.ancestorid,
          (select count(1) from ancestor where nodeid = a.ancestorid) as
          depth
          from ancestor a, data d
          where a.ancestorid = d.nodeid
          and d.color is not null
          union all
          select
          d.nodeid, d.nodeid,
          (select count(1) from ancestor where nodeid = d.nodeid) as
          depth
          from data d
          where d.color is not null
          and not exists (
          select 1 from ancestor where ancestorid = d.nodeid
          )
          )
          select s.nodeid, d.color from suspects s, data d
          where d.nodeid = s.suspectid
          and depth = (select max(depth) from suspects where nodeid =
          s.nodeid)

          NODEID COLOR
          ----------- ----------
          7 GREEN
          8 GREEN
          9 BLUE

          3 record(s) selected.

          As you can see node 5 is missing, but I'll leave that for you ;-)


          HTH
          /Lennart

          [color=blue]
          > - Jeff[/color]

          Comment

          • Jeff Lanfield

            #6
            Re: Nested Coalescing possible in SQL?

            Thanks --CELKO--, this is a very useful suggestion and I will keep it
            in mind. However, it is designed to handle an hiearchy of types (e.g.
            representing class hierarchy). My case is slightly different: I don't
            have types and subtypes, I have only one table representing one
            entity. The inheritance of values is in the following sense: if an
            entity instance (one row) does not specify a value for one of its
            fields it should inherit the values from its nearest parent that does
            have a value for this field. Note that the immediate parent may have a
            NULL in that field too, so you'd have to go the parent's parent and so
            on till you find a non-NULL value for the field in question. I think I
            should be able to use COALESCE to do it somehow but I can't come up
            with a *single* query.

            So the exact example is this. I have a tree structure of values, there
            is only one type of entity. Per Lennart's suggestion I'm keeping the
            full path pre-computed for the sake of simplicity. There is only this
            one table:

            (nodeId int, varchar parent_path, color varchar, phone varchar)

            5, "0", "GREEN", "555-1212"
            7, "0,5", NULL, "777-5555"
            8, "0,5,7" NULL, NULL
            9, "0,5,7" "BLUE", NULL

            I want to run a query that would give the following result set:

            select nodeId, color, phone from ...

            5,"GREEN", "555-1212"
            7,"GREEN", "777-5555"
            8,"GREEN", "777-5555"
            9,"BLUE", "777-5555"

            Is it possible to have such a query?

            - Jeff




            jcelko212@earth link.net (--CELKO--) wrote in message news:<18c7b3c2. 0406021241.119e 23ec@posting.go ogle.com>...[color=blue][color=green][color=darkred]
            > >> Question 2: Is there a better way to structure such data in SQL as[/color][/color]
            > to make answer to question 1 possible? <<
            >
            > Here is the link on Amazon.com for my new book on "Trees & Hierarchies
            > in SQL"
            >
            > http://www.amazon.com/exec/obidos/tg...roduct-details
            >
            > The classic scenario calls for a root class with all the common
            > attributes and then specialized sub-classes under it. As an example,
            > let's take the class of Vehicles and find an industry standard
            > identifier (VIN), and add two mutually exclusive sub-classes, Sport
            > utility vehicles and sedans ('SUV', 'SED').
            >
            > CREATE TABLE Vehicles
            > (vin CHAR(17) NOT NULL PRIMARY KEY,
            > vehicle_type CHAR(3) NOT NULL
            > CHECK(vehicle_t ype IN ('SUV', 'SED')),
            > UNIQUE (vin, vehicle_type),
            > ..);
            >
            > Notice the overlapping candidate keys. I then use a compound candidate
            > key (vin, vehicle_type) and a constraint in each sub-class table to
            > assure that the vehicle_type is locked and agrees with the Vehicles
            > table. Add some DRI actions and you are done:
            >
            > CREATE TABLE SUV
            > (vin CHAR(17) NOT NULL PRIMARY KEY,
            > vehicle_type CHAR(3) DEFAULT 'SUV' NOT NULL
            > CHECK(vehicle_t ype = 'SUV'),
            > UNIQUE (vin, vehicle_type),
            > FOREIGN KEY (vin, vehicle_type)
            > REFERENCES Vehicles(vin, vehicle_type)
            > ON UPDATE CASCADE
            > ON DELETE CASCADE,
            > ..);
            >
            > CREATE TABLE Sedans
            > (vin CHAR(17) NOT NULL PRIMARY KEY,
            > vehicle_type CHAR(3) DEFAULT 'SED' NOT NULL
            > CHECK(vehicle_t ype = 'SED'),
            > UNIQUE (vin, vehicle_type),
            > FOREIGN KEY (vin, vehicle_type)
            > REFERENCES Vehicles(vin, vehicle_type)
            > ON UPDATE CASCADE
            > ON DELETE CASCADE,
            > ..);
            >
            > I can continue to build a hierarchy like this. For example, if I had
            > a Sedans table that broke down into two-door and four-door sedans, I
            > could a schema like this:
            >
            > CREATE TABLE Sedans
            > (vin CHAR(17) NOT NULL PRIMARY KEY,
            > vehicle_type CHAR(3) DEFAULT 'SED' NOT NULL
            > CHECK(vehicle_t ype IN ('2DR', '4DR', 'SED')),
            > UNIQUE (vin, vehicle_type),
            > FOREIGN KEY (vin, vehicle_type)
            > REFERENCES Vehicles(vin, vehicle_type)
            > ON UPDATE CASCADE
            > ON DELETE CASCADE,
            > ..);
            >
            > CREATE TABLE TwoDoor
            > (vin CHAR(17) NOT NULL PRIMARY KEY,
            > vehicle_type CHAR(3) DEFAULT '2DR' NOT NULL
            > CHECK(vehicle_t ype = '2DR'),
            > UNIQUE (vin, vehicle_type),
            > FOREIGN KEY (vin, vehicle_type)
            > REFERENCES Sedans(vin, vehicle_type)
            > ON UPDATE CASCADE
            > ON DELETE CASCADE,
            > ..);
            >
            > CREATE TABLE FourDoor
            > (vin CHAR(17) NOT NULL PRIMARY KEY,
            > vehicle_type CHAR(3) DEFAULT '4DR' NOT NULL
            > CHECK(vehicle_t ype = '4DR'),
            > UNIQUE (vin, vehicle_type),
            > FOREIGN KEY (vin, vehicle_type)
            > REFERENCES Sedans (vin, vehicle_type)
            > ON UPDATE CASCADE
            > ON DELETE CASCADE,
            > ..);
            >
            > The idea is to build a chain of identifiers and types in a UNIQUE()
            > constraint that go up the tree when you use a REFERENCES constraint.
            > Obviously, you can do variants of this trick to get different class
            > structures.
            >
            > If an entity doesn't have to be exclusively one subtype, you play with
            > the root of the class hierarchy:
            >
            > CREATE TABLE Vehicles
            > (vin CHAR(17) NOT NULL,
            > vehicle_type CHAR(3) NOT NULL
            > CHECK(vehicle_t ype IN ('SUV', 'SED')),
            > PRIMARY KEY (vin, vehicle_type),
            > ..);
            >
            > Now start hiding all this stuff in VIEWs immediately and add an
            > INSTEAD OF trigger to those VIEWs.[/color]

            Comment

            • Jeff Lanfield

              #7
              Re: Nested Coalescing possible in SQL?

              Thanks Lennart. I think your third suggestion might do the trick. If
              you have a chance to answer I'm curious:

              1) Why do you think storing the path (suggestion #2) is not an option?
              Or if it is option, why is it a bad idea compared to storing a
              separate ancestor relation? (aside from the usual normal form
              reasoning)

              2) How would you structure the query if you stored the path
              (suggestion #2).

              Thanks!

              - Jeff



              lennart@kommuni cera.umea.se (Lennart Jonsson) wrote in message news:<6dae7e65. 0406022337.3388 70@posting.goog le.com>...[color=blue]
              > jlanfield2003@y ahoo.com (Jeff Lanfield) wrote in message news:<235c483f. 0406021038.13a1 b83b@posting.go ogle.com>...
              >
              > [...]
              >[color=green]
              > >
              > > So say the tree is specified like this:
              > >
              > > (nodeId int, parentId int, color varchar, phone varchar)
              > >
              > > 5, 0,"GREEN", "555-1212"
              > > 7, 5, NULL, "777-5555"
              > > 8, 7 NULL, NULL
              > > 9, 7 "BLUE", NULL
              > >
              > >
              > > That is: node 7 inherits values from 5. Nodes 8,9 inherit values from
              > > 7. Node 5 is the top level node.
              > >
              > > I want to run a query that would give the following result set:
              > >
              > > select nodeId, color, phone from ...
              > >
              > > 5,"GREEN", "555-1212"
              > > 7,"GREEN", "777-5555"
              > > 8,"GREEN", "777-5555"
              > > 9,"BLUE", "777-5555"
              > >
              > > Can such a query be constructed?
              > >[/color]
              >
              > If you dont have a maximum depth of your tree (and cannot use
              > recursion), then no. The reason is that you cannot express queries
              > like "gimmie the ancestors of x". What you need to do is to inform
              > your system that 5 is an ancestor of 9. There are several ways of
              > doing this. Troels Arvin has a page with links to articles on the
              > subject:
              >
              > http://troels.arvin.dk/db/rdbms/links/#hierarchical
              >
              > 1. Nested set: google for Celko and nested. There are also variants of
              > this.
              > 2. Store the path in each node. In your case something like:
              >
              > 5,'',"GREEN", "555-1212"
              > 7,'5',"GREEN", "777-5555"
              > 8,'5.7',"GREEN" , "777-5555"
              > 9,'5.7',"BLUE", "777-5555"
              >
              > In your case I dont think this is an option
              >
              > 3. Store a separate ancestor relation. In your case
              >
              > create table ancestor (
              > nodeid int not null
              > references tree,
              > ancestorid int not null
              > references tree,
              > primary key (nodeid, ancestorid)
              > );
              >
              > insert into ancestor values (7,5), (8,7), (8,5), (9,7), (9,5);
              >
              > Main drawback is that you have to maintain this relation as you
              > modifies your tree. I have some notes on how that can be achieved:
              >
              > http://fungus.teststation.com/~jon/t...eeHandling.htm
              >
              >
              >
              > Anyhow, once you have a way of asking for ancestors for a given node,
              > you can do what you want in a single query, namely:
              >
              > find the ancestor at the largest depth with property p
              >
              > Assuming the following ddl
              >
              > create table tree (
              > nodeid int not null primary key,
              > parent int not null
              > references tree
              > on delete restrict
              > );
              >
              > create table data (
              > nodeid int not null primary key
              > references tree
              > on delete cascade,
              > color varchar(10),
              > phone varchar(10)
              > );
              >
              > insert into tree values (5,5), (7,5), (8,7), (9,7);
              > insert into data values (5, 'GREEN', '555-1212'), (7, NULL,
              > '777-5555'), (8, NUL
              > L, NULL), (9, 'BLUE', NULL);
              >
              > create table ancestor (
              > nodeid int not null
              > references tree,
              > ancestorid int not null
              > references tree,
              > primary key (nodeid, ancestorid)
              > );
              >
              > insert into ancestor values (7,5), (8,7), (8,5), (9,7), (9,5);
              >
              > You could use a query like:
              >
              > with suspects (nodeid, suspectid, depth) as (
              > select
              > a.nodeid, a.ancestorid,
              > (select count(1) from ancestor where nodeid = a.ancestorid) as
              > depth
              > from ancestor a, data d
              > where a.ancestorid = d.nodeid
              > and d.color is not null
              > union all
              > select
              > d.nodeid, d.nodeid,
              > (select count(1) from ancestor where nodeid = d.nodeid) as
              > depth
              > from data d
              > where d.color is not null
              > and not exists (
              > select 1 from ancestor where ancestorid = d.nodeid
              > )
              > )
              > select s.nodeid, d.color from suspects s, data d
              > where d.nodeid = s.suspectid
              > and depth = (select max(depth) from suspects where nodeid =
              > s.nodeid)
              >
              > NODEID COLOR
              > ----------- ----------
              > 7 GREEN
              > 8 GREEN
              > 9 BLUE
              >
              > 3 record(s) selected.
              >
              > As you can see node 5 is missing, but I'll leave that for you ;-)
              >
              >
              > HTH
              > /Lennart
              >
              >[color=green]
              > > - Jeff[/color][/color]

              Comment

              • Lennart Jonsson

                #8
                Re: Nested Coalescing possible in SQL?

                lennart@kommuni cera.umea.se (Lennart Jonsson) wrote in message news:<6dae7e65. 0406022337.3388 70@posting.goog le.com>...[color=blue]
                > jlanfield2003@y ahoo.com (Jeff Lanfield) wrote in message news:<235c483f. 0406021038.13a1 b83b@posting.go ogle.com>...[/color]

                [...]

                Hmmm, I must have been just a little bit tired there ;-) Should read
                (diff in snd part of union):

                with suspects (nodeid, suspectid, depth) as (
                select
                a.nodeid, a.ancestorid,
                (select count(1) from ancestor where nodeid = a.ancestorid)
                as depth
                from ancestor a, data d
                where a.ancestorid = d.nodeid
                and d.color is not null
                union all
                select
                t.nodeid, t.nodeid,
                (select count(1) from ancestor where nodeid = t.nodeid) as
                depth
                from data d, tree t
                where d.nodeid = t.nodeid
                and d.color is not null
                )
                select s.nodeid, d.color from suspects s, data d
                where d.nodeid = s.suspectid
                and depth = (select max(depth) from suspects where nodeid =
                s.nodeid)
                ;


                NODEID COLOR
                ----------- ----------
                5 GREEN
                7 GREEN
                8 GREEN
                9 BLUE


                /Lennart

                Comment

                • Lennart Jonsson

                  #9
                  Re: Nested Coalescing possible in SQL?

                  jlanfield2003@y ahoo.com (Jeff Lanfield) wrote in message news:<235c483f. 0406031837.62f2 701@posting.goo gle.com>...[color=blue]
                  > Thanks Lennart. I think your third suggestion might do the trick. If
                  > you have a chance to answer I'm curious:
                  >
                  > 1) Why do you think storing the path (suggestion #2) is not an option?
                  > Or if it is option, why is it a bad idea compared to storing a
                  > separate ancestor relation? (aside from the usual normal form
                  > reasoning)
                  >[/color]

                  Since youre mainly concerned in retrieving ancestors, (not
                  investigating subtree
                  ), I think it is a bit clumsy. Assume that you have a node x with path
                  1.2.3.56.89.112 and want to investigate which nodes in the path that
                  have property p. How would you do that? Subtree queries are much
                  easier since you can query: where path like '1.2.4.6.%'

                  [color=blue]
                  > 2) How would you structure the query if you stored the path
                  > (suggestion #2).
                  >[/color]

                  I would figure out a way to retrive ancestors and then use the same
                  way.

                  If I where in your shoes I would encapsulate the ancestor stuff in
                  views or table functions, and then use those in my queries. If you
                  decide to go for another aproach, you rewrite the functions/views and
                  can continue to use the queries. Simple example using table functions
                  in db2:


                  -- ancestor table aproach

                  CREATE FUNCTION SUBTREE (ID INTEGER)
                  RETURNS TABLE (NODEID INTEGER)
                  LANGUAGE SQL
                  READS SQL DATA
                  NO EXTERNAL ACTION
                  DETERMINISTIC
                  RETURN
                  select nodeid from ancestor where ancestorid = ID
                  @

                  CREATE FUNCTION SUBTREE_SELF (ID INTEGER)
                  RETURNS TABLE (NODEID INTEGER)
                  LANGUAGE SQL
                  READS SQL DATA
                  NO EXTERNAL ACTION
                  DETERMINISTIC
                  RETURN
                  select s.nodeid from table(subtree(I D)) as s
                  union all
                  values (ID)
                  @


                  -- changed my mind decided to go for path aproach instead

                  drop ...

                  CREATE FUNCTION SUBTREE (ID INTEGER)
                  RETURNS TABLE (NODEID INTEGER)
                  LANGUAGE SQL
                  READS SQL DATA
                  NO EXTERNAL ACTION
                  DETERMINISTIC
                  RETURN
                  select nodeid from tree where path like (ID || '.%')

                  etc

                  The same goes for depth

                  As a bonus it will simplify your original query. I.e

                  with suspects (nodeid, ancestorid, depth) as (
                  select
                  ps.nodeid, ps.ancestorid, depth_func(ps.n odeid) as depth
                  from data d, tree t, table(path_self (t.nodeid)) ps
                  where ps.ancestorid = d.nodeid and d.color is not null
                  )
                  select ...



                  HTH
                  /Lennart


                  [color=blue]
                  > Thanks!
                  >
                  > - Jeff
                  >
                  >
                  >
                  > lennart@kommuni cera.umea.se (Lennart Jonsson) wrote in message news:<6dae7e65. 0406022337.3388 70@posting.goog le.com>...[color=green]
                  > > jlanfield2003@y ahoo.com (Jeff Lanfield) wrote in message news:<235c483f. 0406021038.13a1 b83b@posting.go ogle.com>...
                  > >
                  > > [...]
                  > >[color=darkred]
                  > > >
                  > > > So say the tree is specified like this:
                  > > >
                  > > > (nodeId int, parentId int, color varchar, phone varchar)
                  > > >
                  > > > 5, 0,"GREEN", "555-1212"
                  > > > 7, 5, NULL, "777-5555"
                  > > > 8, 7 NULL, NULL
                  > > > 9, 7 "BLUE", NULL[/color][/color][/color]
                  [color=blue][color=green][color=darkred]
                  > > >
                  > > >
                  > > > That is: node 7 inherits values from 5. Nodes 8,9 inherit values from
                  > > > 7. Node 5 is the top level node.
                  > > >
                  > > > I want to run a query that would give the following result set:
                  > > >
                  > > > select nodeId, color, phone from ...
                  > > >
                  > > > 5,"GREEN", "555-1212"
                  > > > 7,"GREEN", "777-5555"
                  > > > 8,"GREEN", "777-5555"
                  > > > 9,"BLUE", "777-5555"
                  > > >
                  > > > Can such a query be constructed?
                  > > >[/color]
                  > >
                  > > If you dont have a maximum depth of your tree (and cannot use
                  > > recursion), then no. The reason is that you cannot express queries
                  > > like "gimmie the ancestors of x". What you need to do is to inform
                  > > your system that 5 is an ancestor of 9. There are several ways of
                  > > doing this. Troels Arvin has a page with links to articles on the
                  > > subject:
                  > >
                  > > http://troels.arvin.dk/db/rdbms/links/#hierarchical
                  > >
                  > > 1. Nested set: google for Celko and nested. There are also variants of
                  > > this.
                  > > 2. Store the path in each node. In your case something like:
                  > >
                  > > 5,'',"GREEN", "555-1212"
                  > > 7,'5',"GREEN", "777-5555"
                  > > 8,'5.7',"GREEN" , "777-5555"
                  > > 9,'5.7',"BLUE", "777-5555"
                  > >
                  > > In your case I dont think this is an option
                  > >
                  > > 3. Store a separate ancestor relation. In your case
                  > >
                  > > create table ancestor (
                  > > nodeid int not null
                  > > references tree,
                  > > ancestorid int not null
                  > > references tree,
                  > > primary key (nodeid, ancestorid)
                  > > );
                  > >
                  > > insert into ancestor values (7,5), (8,7), (8,5), (9,7), (9,5);
                  > >
                  > > Main drawback is that you have to maintain this relation as you
                  > > modifies your tree. I have some notes on how that can be achieved:
                  > >
                  > > http://fungus.teststation.com/~jon/t...eeHandling.htm
                  > >
                  > >
                  > >
                  > > Anyhow, once you have a way of asking for ancestors for a given node,
                  > > you can do what you want in a single query, namely:
                  > >
                  > > find the ancestor at the largest depth with property p
                  > >
                  > > Assuming the following ddl
                  > >
                  > > create table tree (
                  > > nodeid int not null primary key,
                  > > parent int not null
                  > > references tree
                  > > on delete restrict
                  > > );
                  > >
                  > > create table data (
                  > > nodeid int not null primary key
                  > > references tree
                  > > on delete cascade,
                  > > color varchar(10),
                  > > phone varchar(10)
                  > > );
                  > >
                  > > insert into tree values (5,5), (7,5), (8,7), (9,7);
                  > > insert into data values (5, 'GREEN', '555-1212'), (7, NULL,
                  > > '777-5555'), (8, NUL
                  > > L, NULL), (9, 'BLUE', NULL);
                  > >
                  > > create table ancestor (
                  > > nodeid int not null
                  > > references tree,
                  > > ancestorid int not null
                  > > references tree,
                  > > primary key (nodeid, ancestorid)
                  > > );
                  > >
                  > > insert into ancestor values (7,5), (8,7), (8,5), (9,7), (9,5);
                  > >
                  > > You could use a query like:
                  > >
                  > > with suspects (nodeid, suspectid, depth) as (
                  > > select
                  > > a.nodeid, a.ancestorid,
                  > > (select count(1) from ancestor where nodeid = a.ancestorid) as
                  > > depth
                  > > from ancestor a, data d
                  > > where a.ancestorid = d.nodeid
                  > > and d.color is not null
                  > > union all
                  > > select
                  > > d.nodeid, d.nodeid,
                  > > (select count(1) from ancestor where nodeid = d.nodeid) as
                  > > depth
                  > > from data d
                  > > where d.color is not null
                  > > and not exists (
                  > > select 1 from ancestor where ancestorid = d.nodeid
                  > > )
                  > > )
                  > > select s.nodeid, d.color from suspects s, data d
                  > > where d.nodeid = s.suspectid
                  > > and depth = (select max(depth) from suspects where nodeid =
                  > > s.nodeid)
                  > >
                  > > NODEID COLOR
                  > > ----------- ----------
                  > > 7 GREEN
                  > > 8 GREEN
                  > > 9 BLUE
                  > >
                  > > 3 record(s) selected.
                  > >
                  > > As you can see node 5 is missing, but I'll leave that for you ;-)
                  > >
                  > >
                  > > HTH
                  > > /Lennart
                  > >
                  > >[color=darkred]
                  > > > - Jeff[/color][/color][/color]

                  Comment

                  • Jan

                    #10
                    Re: Nested Coalescing possible in SQL?

                    e.g. in Oracle 9i,

                    you can:

                    Creating demo data

                    CREATE TABLE T
                    (
                    CHILD_ID NUMBER,
                    PARENT_ID NUMBER,
                    ATTR1 VARCHAR2(10),
                    ATTR2 VARCHAR2(10)
                    )

                    INSERT INTO T VALUES (1 NULL,'A','000') ;
                    INSERT INTO T VALUES (2,1, NULL, '111');
                    INSERT INTO T VALUES (3,1, 'B', NULL);
                    INSERT INTO T VALUES (4,2, NULL, NULL);
                    INSERT INTO T VALUES (5, 2, 'C', '999');
                    INSERT INTO T VALUES (6, 5, NULL, NULL);

                    and the query is (it is too ugly with INSTR/SUBSTR, but maybe faster
                    then an inline query per each attribute with another CONNECT BY):

                    ------------------
                    SELECT child_id,PARENT _ID,attr1,attr2 ,tree_path,
                    RTRIM(SUBSTR(al l_attr1,INSTR(a ll_attr1,'/',-1,2)+1),'/')
                    inherit_attr1,
                    RTRIM(SUBSTR(al l_attr2,INSTR(a ll_attr2,'/',-1,2)+1),'/')
                    inherit_attr2
                    FROM

                    (SELECT child_id,parent _id,attr1,attr2 ,
                    SYS_CONNECT_BY_ PATH(TO_CHAR(ch ild_id), '/') tree_path,
                    CASE
                    WHEN attr1 IS NOT NULL THEN '/'||attr1||'/'
                    ELSE REPLACE('/'||SYS_CONNECT_ BY_PATH(attr1,'/'),'//','/')
                    END all_attr1,

                    CASE
                    WHEN attr2 IS NOT NULL THEN '/'||attr2||'/'
                    ELSE REPLACE('/'||SYS_CONNECT_ BY_PATH(attr2,'/'),'//','/')
                    END all_attr2

                    FROM T
                    START WITH parent_id IS NULL
                    CONNECT BY parent_id=PRIOR child_id) v
                    --------------------------------------------------------



                    jlanfield2003@y ahoo.com (Jeff Lanfield) wrote in message news:<235c483f. 0406011716.37d0 0399@posting.go ogle.com>...[color=blue]
                    > First of all, I apologize if coalescing is not the right term to
                    > describe my problem. I have a tree where each node has the same set of
                    > attributes (is the same entity) but child nodes should inherit
                    > attribute values from parent node.
                    >
                    > for example, say I have the following table:
                    >
                    > (nodeId int , color varchar, phone varchar) with two rows
                    >
                    > 5, "GREEN", "555-1212"
                    > 7, NULL, "777-5555"
                    > 8, NULL, NULL
                    > 9, "BLUE", NULL
                    >
                    > in addition there is a tree structure that specifies that node 5 is
                    > the parent of node 7, and 7 is the parent of nodes 8 and 9. I know
                    > there is many ways to make trees in SQL but as a simple example let's
                    > say the tree is:
                    >
                    > id, parentid
                    > 8, 7
                    > 9, 7
                    > 7, 5
                    >
                    > Thus in this case, node 7 inherits the value "GREEN" from node 5 for
                    > attribute "color", but provides its own value "777-5555" for attribute
                    > "phone". Node 8, in turn, inherits "GREEN" for "color" from node 7
                    > (really from node 5 since 7 did not specify its own) and "777-5555"
                    > for "phone" from node 7. Node 9 provides its own value for "color" and
                    > inherits the one for "phone" from Node 7.
                    >
                    > So the runtime values in the application are:
                    >
                    > Node 5: "GREEN", "555-1212"
                    > Node 7: "GREEN", "777-5555"
                    > Node 8: "GREEN", "777-5555"
                    > Node 9: "BLUE", "777-5555"
                    >
                    > Question 1: Is there a single SQL statement that for a given node can
                    > replace the NULLs with inherited values from the parent node?
                    >
                    > Question 2: Is there a better way to structure such data in SQL as to
                    > make answer to question 1 possible?
                    >
                    > I would restate the problem as follows:
                    >
                    > In a nested structure child nodes inherit values from parent nodes _by
                    > reference_ or specify their own. "By reference" is the key word here.
                    > If it wasn't for that you could just duplicate the necessary values
                    > from the parent entitity upon creation.
                    >
                    > Thanks!
                    >
                    > - Jeff[/color]

                    Comment

                    • Mikito Harakiri

                      #11
                      Re: Nested Coalescing possible in SQL?


                      "Lennart Jonsson" <lennart@kommun icera.umea.se> wrote in message
                      news:6dae7e65.0 406032300.77888 608@posting.goo gle.com...[color=blue]
                      > jlanfield2003@y ahoo.com (Jeff Lanfield) wrote in message[/color]
                      news:<235c483f. 0406031837.62f2 701@posting.goo gle.com>...[color=blue]
                      > Since youre mainly concerned in retrieving ancestors, (not
                      > investigating subtree
                      > ), I think it is a bit clumsy. Assume that you have a node x with path
                      > 1.2.3.56.89.112 and want to investigate which nodes in the path that
                      > have property p. How would you do that? Subtree queries are much
                      > easier since you can query: where path like '1.2.4.6.%'[/color]

                      Small procedural part -- table function -- which returns ancestors
                      materialized path encodings set

                      1.2.3.56.89
                      1.2.3.56
                      1.2.3
                      1.2
                      1

                      for the input encoding 1.2.3.56.89.112 could be plugged in into the query
                      returning all the ancestors. This idea translated into the nested intervals
                      encoding (if you prefer to work with numbers instead of string parsing) is
                      implemented at the end of



                      BTW, http://arxiv.org/abs/cs.DB/0402051 is significantly rewritten.
                      Essentially, it is Euclidean algorithm (the mother of all algorithms?) that
                      is employed in the numerical counterpart of the Materialized Path.






                      Comment

                      • Lennart Jonsson

                        #12
                        Re: Nested Coalescing possible in SQL?

                        lennart@kommuni cera.umea.se (Lennart Jonsson) wrote in message news:<6dae7e65. 0406031939.1ed5 fa49@posting.go ogle.com>...[color=blue]
                        > lennart@kommuni cera.umea.se (Lennart Jonsson) wrote in message news:<6dae7e65. 0406022337.3388 70@posting.goog le.com>...[color=green]
                        > > jlanfield2003@y ahoo.com (Jeff Lanfield) wrote in message news:<235c483f. 0406021038.13a1 b83b@posting.go ogle.com>...[/color]
                        >
                        > [...]
                        >
                        > Hmmm, I must have been just a little bit tired there ;-) Should read
                        > (diff in snd part of union):
                        >[/color]

                        and as you probably discovered already, it can be further simplified as

                        [...]
                        union all
                        select
                        d.nodeid, d.nodeid,
                        (select count(1) from ancestor where nodeid = d.nodeid) as
                        depth
                        from data d
                        where d.color is not null
                        )

                        [...]

                        /L

                        Comment

                        • --CELKO--

                          #13
                          Re: Nested Coalescing possible in SQL?

                          >> if an entity instance (one row) does not specify a value for one of
                          its field [sic]s it should inherit the values from its nearest parent
                          that does have a value for this field [sic]. <<

                          Rows are not records; fields are not columns; tables are not files.
                          It drives me nuts to see people screw up the terms and therefore their
                          mental model of how SQL works. Let's put this into a simplified
                          nested set model which has more NULL-able columns than the payroll
                          system of a major auto company:

                          CREATE TABLE Nodes
                          (node_id INTEGER NOT NULL PRIMARY KEY,
                          col_1 CHAR(5),
                          col_2 CHAR(5),
                          ..);

                          CREATE TABLE Tree
                          (node_id INTEGER NOT NULL UNIQUE
                          REFERENCES Nodes(node_id),
                          lft INTEGER NOT NULL,
                          rgt INTEGER NOT NULL,
                          PRIMARY KEY (lft, rgt),
                          << other tree constraints here >>
                          ..);

                          Now update all the rows where any column has a NULL. You will have to
                          run this update once for every level in the tree at most.

                          UPDATE Nodes
                          SET col_1
                          = COALESCE(Nodes. col_1, -- keep non-nul
                          (SELECT B.col_1 -- find parent
                          FROM Tree AS E
                          LEFT OUTER JOIN
                          Tree AS B
                          ON B.lft
                          = (SELECT MAX(lft)
                          FROM Tree AS S
                          WHERE E.lft > S.lft
                          AND E.lft < S.rgt)),
                          col_2
                          = COALESCE(Nodes. col_2,
                          (SELECT B.col_2
                          FROM ..)),

                          Etc.
                          WHERE (col_1 || col_2 || ..||col_n) IS NULL;

                          This probably scared you. The WHERE clause uses the fact that NULLs
                          propagate; we don't care *which* column is NULL, so why look for
                          useless details? That left outer join scalar subquery is how to get
                          the immediate superiors (B= boss, E= employee) in a hierarchy. The
                          COALESCE() will retain an existing non-NULL value.

                          Comment

                          Working...