Nested Coalescing possible in SQL?

**Mikito Harakiri** · Jul 20 '05, 03:51 AM

Re: Nested Coalescing possible in SQL?

"Jeff Lanfield" <jlanfield2003@ yahoo.com> wrote in message
news:235c483f.0 406011716.37d00 399@posting.goo gle.com...[color=blue]
> First of all, I apologize if coalescing is not the right term to
> describe my problem. I have a tree where each node has the same set of
> attributes (is the same entity) but child nodes should inherit
> attribute values from parent node.
>
> for example, say I have the following table:
>
> (nodeId int , color varchar, phone varchar) with two rows
>
> 5, "GREEN", "555-1212"
> 7, NULL, "777-5555"
> 8, NULL, NULL
> 9, "BLUE", NULL
>
> in addition there is a tree structure that specifies that node 5 is
> the parent of node 7, and 7 is the parent of nodes 8 and 9. I know
> there is many ways to make trees in SQL but as a simple example let's
> say the tree is:
>
> id, parentid
> 8, 7
> 9, 7
> 7, 5
>
> Thus in this case, node 7 inherits the value "GREEN" from node 5 for
> attribute "color", but provides its own value "777-5555" for attribute
> "phone". Node 8, in turn, inherits "GREEN" for "color" from node 7
> (really from node 5 since 7 did not specify its own) and "777-5555"
> for "phone" from node 7. Node 9 provides its own value for "color" and
> inherits the one for "phone" from Node 7.
>
> So the runtime values in the application are:
>
> Node 5: "GREEN", "555-1212"
> Node 7: "GREEN", "777-5555"
> Node 8: "GREEN", "777-5555"
> Node 9: "BLUE", "777-5555"
>
> Question 1: Is there a single SQL statement that for a given node can
> replace the NULLs with inherited values from the parent node?
>
> Question 2: Is there a better way to structure such data in SQL as to
> make answer to question 1 possible?
>
> I would restate the problem as follows:
>
> In a nested structure child nodes inherit values from parent nodes _by
> reference_ or specify their own. "By reference" is the key word here.
> If it wasn't for that you could just duplicate the necessary values
> from the parent entitity upon creation.[/color]

It looks easy. Find a closest node in the chain of ancestors that has
property not NULL.

**Lennart Jonsson** · Jul 20 '05, 03:51 AM

Re: Nested Coalescing possible in SQL?

jlanfield2003@y ahoo.com (Jeff Lanfield) wrote in message news:<235c483f. 0406011716.37d0 0399@posting.go ogle.com>...[color=blue]
> First of all, I apologize if coalescing is not the right term to
> describe my problem. I have a tree where each node has the same set of
> attributes (is the same entity) but child nodes should inherit
> attribute values from parent node.
>
> for example, say I have the following table:
>
> (nodeId int , color varchar, phone varchar) with two rows
>
> 5, "GREEN", "555-1212"
> 7, NULL, "777-5555"
> 8, NULL, NULL
> 9, "BLUE", NULL
>
> in addition there is a tree structure that specifies that node 5 is
> the parent of node 7, and 7 is the parent of nodes 8 and 9. I know
> there is many ways to make trees in SQL but as a simple example let's
> say the tree is:ancestor a
>
> id, parentid
> 8, 7
> 9, 7
> 7, 5
>
> Thus in this case, node 7 inherits the value "GREEN" from node 5 for
> attribute "color", but provides its own value "777-5555" for attribute
> "phone". Node 8, in turn, inherits "GREEN" for "color" from node 7
> (really from node 5 since 7 did not specify its own) and "777-5555"
> for "phone" from node 7. Node 9 provides its own value for "color" and
> inherits the one for "phone" from Node 7.
>
> So the runtime values in the application are:
>
> Node 5: "GREEN", "555-1212"
> Node 7: "GREEN", "777-5555"
> Node 8: "GREEN", "777-5555"
> Node 9: "BLUE", "777-5555"
>
> Question 1: Is there a single SQL statement that for a given node can
> replace the NULLs with inherited values from the parent node?
>
> Question 2: Is there a better way to structure such data in SQL as to
> make answer to question 1 possible?
>
> I would restate the problem as follows:
>
> In a nested structure child nodes inherit values from parent nodes _by
> reference_ or specify their own. "By reference" is the key word here.
> If it wasn't for that you could just duplicate the necessary values
> from the parent entitity upon creation.
>
> Thanks!
>
> - Jeff[/color]

If you would like to do it in a single query, you will have to extend
your tree, or if your db supports it, you can use recursion. There are
several ways of extending your tree, nested set, transitive closure,
etc. If you google comp.database and comp.database.t heory you will
find several threads regarding this.

Assuming you can "calculate" the set of ancestors for any given node,
define the suspects as "ancestors with property p". The property you
are looking for can be found in the suspect with the largest depth*.

*depth = number of ancestors

HTH
/Lennart

**--CELKO--** · Jul 20 '05, 03:55 AM

Re: Nested Coalescing possible in SQL?

>> Question 2: Is there a better way to structure such data in SQL as
to make answer to question 1 possible? <<

Here is the link on Amazon.com for my new book on "Trees & Hierarchies
in SQL"

http://www.amazon.com/exec/obidos/tg/detail/-/1558609202/qid=1080772873/sr=1-1/ref=sr_1_1/102-7683601-6345721?v=glance&s=books#product-details

The classic scenario calls for a root class with all the common
attributes and then specialized sub-classes under it. As an example,
let's take the class of Vehicles and find an industry standard
identifier (VIN), and add two mutually exclusive sub-classes, Sport
utility vehicles and sedans ('SUV', 'SED').

CREATE TABLE Vehicles
(vin CHAR(17) NOT NULL PRIMARY KEY,
vehicle_type CHAR(3) NOT NULL
CHECK(vehicle_t ype IN ('SUV', 'SED')),
UNIQUE (vin, vehicle_type),
..);

Notice the overlapping candidate keys. I then use a compound candidate
key (vin, vehicle_type) and a constraint in each sub-class table to
assure that the vehicle_type is locked and agrees with the Vehicles
table. Add some DRI actions and you are done:

CREATE TABLE SUV
(vin CHAR(17) NOT NULL PRIMARY KEY,
vehicle_type CHAR(3) DEFAULT 'SUV' NOT NULL
CHECK(vehicle_t ype = 'SUV'),
UNIQUE (vin, vehicle_type),
FOREIGN KEY (vin, vehicle_type)
REFERENCES Vehicles(vin, vehicle_type)
ON UPDATE CASCADE
ON DELETE CASCADE,
..);

CREATE TABLE Sedans
(vin CHAR(17) NOT NULL PRIMARY KEY,
vehicle_type CHAR(3) DEFAULT 'SED' NOT NULL
CHECK(vehicle_t ype = 'SED'),
UNIQUE (vin, vehicle_type),
FOREIGN KEY (vin, vehicle_type)
REFERENCES Vehicles(vin, vehicle_type)
ON UPDATE CASCADE
ON DELETE CASCADE,
..);

I can continue to build a hierarchy like this. For example, if I had
a Sedans table that broke down into two-door and four-door sedans, I
could a schema like this:

CREATE TABLE Sedans
(vin CHAR(17) NOT NULL PRIMARY KEY,
vehicle_type CHAR(3) DEFAULT 'SED' NOT NULL
CHECK(vehicle_t ype IN ('2DR', '4DR', 'SED')),
UNIQUE (vin, vehicle_type),
FOREIGN KEY (vin, vehicle_type)
REFERENCES Vehicles(vin, vehicle_type)
ON UPDATE CASCADE
ON DELETE CASCADE,
..);

CREATE TABLE TwoDoor
(vin CHAR(17) NOT NULL PRIMARY KEY,
vehicle_type CHAR(3) DEFAULT '2DR' NOT NULL
CHECK(vehicle_t ype = '2DR'),
UNIQUE (vin, vehicle_type),
FOREIGN KEY (vin, vehicle_type)
REFERENCES Sedans(vin, vehicle_type)
ON UPDATE CASCADE
ON DELETE CASCADE,
..);

CREATE TABLE FourDoor
(vin CHAR(17) NOT NULL PRIMARY KEY,
vehicle_type CHAR(3) DEFAULT '4DR' NOT NULL
CHECK(vehicle_t ype = '4DR'),
UNIQUE (vin, vehicle_type),
FOREIGN KEY (vin, vehicle_type)
REFERENCES Sedans (vin, vehicle_type)
ON UPDATE CASCADE
ON DELETE CASCADE,
..);

The idea is to build a chain of identifiers and types in a UNIQUE()
constraint that go up the tree when you use a REFERENCES constraint.
Obviously, you can do variants of this trick to get different class
structures.

If an entity doesn't have to be exclusively one subtype, you play with
the root of the class hierarchy:

CREATE TABLE Vehicles
(vin CHAR(17) NOT NULL,
vehicle_type CHAR(3) NOT NULL
CHECK(vehicle_t ype IN ('SUV', 'SED')),
PRIMARY KEY (vin, vehicle_type),
..);

Now start hiding all this stuff in VIEWs immediately and add an
INSTEAD OF trigger to those VIEWs.

**Lennart Jonsson** · Jul 20 '05, 03:55 AM

Re: Nested Coalescing possible in SQL?

jlanfield2003@y ahoo.com (Jeff Lanfield) wrote in message news:<235c483f. 0406021038.13a1 b83b@posting.go ogle.com>...

[...]
[color=blue]
>
> So say the tree is specified like this:
>
> (nodeId int, parentId int, color varchar, phone varchar)
>
> 5, 0,"GREEN", "555-1212"
> 7, 5, NULL, "777-5555"
> 8, 7 NULL, NULL
> 9, 7 "BLUE", NULL
>
>
> That is: node 7 inherits values from 5. Nodes 8,9 inherit values from
> 7. Node 5 is the top level node.
>
> I want to run a query that would give the following result set:
>
> select nodeId, color, phone from ...
>
> 5,"GREEN", "555-1212"
> 7,"GREEN", "777-5555"
> 8,"GREEN", "777-5555"
> 9,"BLUE", "777-5555"
>
> Can such a query be constructed?
>[/color]

If you dont have a maximum depth of your tree (and cannot use
recursion), then no. The reason is that you cannot express queries
like "gimmie the ancestors of x". What you need to do is to inform
your system that 5 is an ancestor of 9. There are several ways of
doing this. Troels Arvin has a page with links to articles on the
subject:

Troels' links: Relational database systems

http://troels.arvin.dk/db/rdbms/links/#hierarchical

1. Nested set: google for Celko and nested. There are also variants of
this.
2. Store the path in each node. In your case something like:

5,'',"GREEN", "555-1212"
7,'5',"GREEN", "777-5555"
8,'5.7',"GREEN" , "777-5555"
9,'5.7',"BLUE", "777-5555"

In your case I dont think this is an option

3. Store a separate ancestor relation. In your case

create table ancestor (
nodeid int not null
references tree,
ancestorid int not null
references tree,
primary key (nodeid, ancestorid)
);

insert into ancestor values (7,5), (8,7), (8,5), (9,7), (9,5);

Main drawback is that you have to maintain this relation as you
modifies your tree. I have some notes on how that can be achieved:

http://fungus.teststation.com/~jon/treehandling/TreeHandling.htm

Anyhow, once you have a way of asking for ancestors for a given node,
you can do what you want in a single query, namely:

find the ancestor at the largest depth with property p

Assuming the following ddl

create table tree (
nodeid int not null primary key,
parent int not null
references tree
on delete restrict
);

create table data (
nodeid int not null primary key
references tree
on delete cascade,
color varchar(10),
phone varchar(10)
);

insert into tree values (5,5), (7,5), (8,7), (9,7);
insert into data values (5, 'GREEN', '555-1212'), (7, NULL,
'777-5555'), (8, NUL
L, NULL), (9, 'BLUE', NULL);

create table ancestor (
nodeid int not null
references tree,
ancestorid int not null
references tree,
primary key (nodeid, ancestorid)
);

insert into ancestor values (7,5), (8,7), (8,5), (9,7), (9,5);

You could use a query like:

with suspects (nodeid, suspectid, depth) as (
select
a.nodeid, a.ancestorid,
(select count(1) from ancestor where nodeid = a.ancestorid) as
depth
from ancestor a, data d
where a.ancestorid = d.nodeid
and d.color is not null
union all
select
d.nodeid, d.nodeid,
(select count(1) from ancestor where nodeid = d.nodeid) as
depth
from data d
where d.color is not null
and not exists (
select 1 from ancestor where ancestorid = d.nodeid
)
)
select s.nodeid, d.color from suspects s, data d
where d.nodeid = s.suspectid
and depth = (select max(depth) from suspects where nodeid =
s.nodeid)

NODEID COLOR
----------- ----------
7 GREEN
8 GREEN
9 BLUE

3 record(s) selected.

As you can see node 5 is missing, but I'll leave that for you ;-)

HTH
/Lennart

[color=blue]
> - Jeff[/color]

**Jeff Lanfield** · Jul 20 '05, 03:56 AM

Re: Nested Coalescing possible in SQL?

Thanks --CELKO--, this is a very useful suggestion and I will keep it
in mind. However, it is designed to handle an hiearchy of types (e.g.
representing class hierarchy). My case is slightly different: I don't
have types and subtypes, I have only one table representing one
entity. The inheritance of values is in the following sense: if an
entity instance (one row) does not specify a value for one of its
fields it should inherit the values from its nearest parent that does
have a value for this field. Note that the immediate parent may have a
NULL in that field too, so you'd have to go the parent's parent and so
on till you find a non-NULL value for the field in question. I think I
should be able to use COALESCE to do it somehow but I can't come up
with a *single* query.

So the exact example is this. I have a tree structure of values, there
is only one type of entity. Per Lennart's suggestion I'm keeping the
full path pre-computed for the sake of simplicity. There is only this
one table:

(nodeId int, varchar parent_path, color varchar, phone varchar)

5, "0", "GREEN", "555-1212"
7, "0,5", NULL, "777-5555"
8, "0,5,7" NULL, NULL
9, "0,5,7" "BLUE", NULL

I want to run a query that would give the following result set:

select nodeId, color, phone from ...

5,"GREEN", "555-1212"
7,"GREEN", "777-5555"
8,"GREEN", "777-5555"
9,"BLUE", "777-5555"

Is it possible to have such a query?

- Jeff

jcelko212@earth link.net (--CELKO--) wrote in message news:<18c7b3c2. 0406021241.119e 23ec@posting.go ogle.com>...[color=blue][color=green][color=darkred]
> >> Question 2: Is there a better way to structure such data in SQL as[/color][/color]
> to make answer to question 1 possible? <<
>
> Here is the link on Amazon.com for my new book on "Trees & Hierarchies
> in SQL"
>
> http://www.amazon.com/exec/obidos/tg...roduct-details
>
> The classic scenario calls for a root class with all the common
> attributes and then specialized sub-classes under it. As an example,
> let's take the class of Vehicles and find an industry standard
> identifier (VIN), and add two mutually exclusive sub-classes, Sport
> utility vehicles and sedans ('SUV', 'SED').
>
> CREATE TABLE Vehicles
> (vin CHAR(17) NOT NULL PRIMARY KEY,
> vehicle_type CHAR(3) NOT NULL
> CHECK(vehicle_t ype IN ('SUV', 'SED')),
> UNIQUE (vin, vehicle_type),
> ..);
>
> Notice the overlapping candidate keys. I then use a compound candidate
> key (vin, vehicle_type) and a constraint in each sub-class table to
> assure that the vehicle_type is locked and agrees with the Vehicles
> table. Add some DRI actions and you are done:
>
> CREATE TABLE SUV
> (vin CHAR(17) NOT NULL PRIMARY KEY,
> vehicle_type CHAR(3) DEFAULT 'SUV' NOT NULL
> CHECK(vehicle_t ype = 'SUV'),
> UNIQUE (vin, vehicle_type),
> FOREIGN KEY (vin, vehicle_type)
> REFERENCES Vehicles(vin, vehicle_type)
> ON UPDATE CASCADE
> ON DELETE CASCADE,
> ..);
>
> CREATE TABLE Sedans
> (vin CHAR(17) NOT NULL PRIMARY KEY,
> vehicle_type CHAR(3) DEFAULT 'SED' NOT NULL
> CHECK(vehicle_t ype = 'SED'),
> UNIQUE (vin, vehicle_type),
> FOREIGN KEY (vin, vehicle_type)
> REFERENCES Vehicles(vin, vehicle_type)
> ON UPDATE CASCADE
> ON DELETE CASCADE,
> ..);
>
> I can continue to build a hierarchy like this. For example, if I had
> a Sedans table that broke down into two-door and four-door sedans, I
> could a schema like this:
>
> CREATE TABLE Sedans
> (vin CHAR(17) NOT NULL PRIMARY KEY,
> vehicle_type CHAR(3) DEFAULT 'SED' NOT NULL
> CHECK(vehicle_t ype IN ('2DR', '4DR', 'SED')),
> UNIQUE (vin, vehicle_type),
> FOREIGN KEY (vin, vehicle_type)
> REFERENCES Vehicles(vin, vehicle_type)
> ON UPDATE CASCADE
> ON DELETE CASCADE,
> ..);
>
> CREATE TABLE TwoDoor
> (vin CHAR(17) NOT NULL PRIMARY KEY,
> vehicle_type CHAR(3) DEFAULT '2DR' NOT NULL
> CHECK(vehicle_t ype = '2DR'),
> UNIQUE (vin, vehicle_type),
> FOREIGN KEY (vin, vehicle_type)
> REFERENCES Sedans(vin, vehicle_type)
> ON UPDATE CASCADE
> ON DELETE CASCADE,
> ..);
>
> CREATE TABLE FourDoor
> (vin CHAR(17) NOT NULL PRIMARY KEY,
> vehicle_type CHAR(3) DEFAULT '4DR' NOT NULL
> CHECK(vehicle_t ype = '4DR'),
> UNIQUE (vin, vehicle_type),
> FOREIGN KEY (vin, vehicle_type)
> REFERENCES Sedans (vin, vehicle_type)
> ON UPDATE CASCADE
> ON DELETE CASCADE,
> ..);
>
> The idea is to build a chain of identifiers and types in a UNIQUE()
> constraint that go up the tree when you use a REFERENCES constraint.
> Obviously, you can do variants of this trick to get different class
> structures.
>
> If an entity doesn't have to be exclusively one subtype, you play with
> the root of the class hierarchy:
>
> CREATE TABLE Vehicles
> (vin CHAR(17) NOT NULL,
> vehicle_type CHAR(3) NOT NULL
> CHECK(vehicle_t ype IN ('SUV', 'SED')),
> PRIMARY KEY (vin, vehicle_type),
> ..);
>
> Now start hiding all this stuff in VIEWs immediately and add an
> INSTEAD OF trigger to those VIEWs.[/color]

**Jeff Lanfield** · Jul 20 '05, 03:56 AM

Re: Nested Coalescing possible in SQL?

Thanks Lennart. I think your third suggestion might do the trick. If
you have a chance to answer I'm curious:

1) Why do you think storing the path (suggestion #2) is not an option?
Or if it is option, why is it a bad idea compared to storing a
separate ancestor relation? (aside from the usual normal form
reasoning)

2) How would you structure the query if you stored the path
(suggestion #2).

Thanks!

- Jeff

lennart@kommuni cera.umea.se (Lennart Jonsson) wrote in message news:<6dae7e65. 0406022337.3388 70@posting.goog le.com>...[color=blue]
> jlanfield2003@y ahoo.com (Jeff Lanfield) wrote in message news:<235c483f. 0406021038.13a1 b83b@posting.go ogle.com>...
>
> [...]
>[color=green]
> >
> > So say the tree is specified like this:
> >
> > (nodeId int, parentId int, color varchar, phone varchar)
> >
> > 5, 0,"GREEN", "555-1212"
> > 7, 5, NULL, "777-5555"
> > 8, 7 NULL, NULL
> > 9, 7 "BLUE", NULL
> >
> >
> > That is: node 7 inherits values from 5. Nodes 8,9 inherit values from
> > 7. Node 5 is the top level node.
> >
> > I want to run a query that would give the following result set:
> >
> > select nodeId, color, phone from ...
> >
> > 5,"GREEN", "555-1212"
> > 7,"GREEN", "777-5555"
> > 8,"GREEN", "777-5555"
> > 9,"BLUE", "777-5555"
> >
> > Can such a query be constructed?
> >[/color]
>
> If you dont have a maximum depth of your tree (and cannot use
> recursion), then no. The reason is that you cannot express queries
> like "gimmie the ancestors of x". What you need to do is to inform
> your system that 5 is an ancestor of 9. There are several ways of
> doing this. Troels Arvin has a page with links to articles on the
> subject:
>
> http://troels.arvin.dk/db/rdbms/links/#hierarchical
>
> 1. Nested set: google for Celko and nested. There are also variants of
> this.
> 2. Store the path in each node. In your case something like:
>
> 5,'',"GREEN", "555-1212"
> 7,'5',"GREEN", "777-5555"
> 8,'5.7',"GREEN" , "777-5555"
> 9,'5.7',"BLUE", "777-5555"
>
> In your case I dont think this is an option
>
> 3. Store a separate ancestor relation. In your case
>
> create table ancestor (
> nodeid int not null
> references tree,
> ancestorid int not null
> references tree,
> primary key (nodeid, ancestorid)
> );
>
> insert into ancestor values (7,5), (8,7), (8,5), (9,7), (9,5);
>
> Main drawback is that you have to maintain this relation as you
> modifies your tree. I have some notes on how that can be achieved:
>
> http://fungus.teststation.com/~jon/t...eeHandling.htm
>
>
>
> Anyhow, once you have a way of asking for ancestors for a given node,
> you can do what you want in a single query, namely:
>
> find the ancestor at the largest depth with property p
>
> Assuming the following ddl
>
> create table tree (
> nodeid int not null primary key,
> parent int not null
> references tree
> on delete restrict
> );
>
> create table data (
> nodeid int not null primary key
> references tree
> on delete cascade,
> color varchar(10),
> phone varchar(10)
> );
>
> insert into tree values (5,5), (7,5), (8,7), (9,7);
> insert into data values (5, 'GREEN', '555-1212'), (7, NULL,
> '777-5555'), (8, NUL
> L, NULL), (9, 'BLUE', NULL);
>
> create table ancestor (
> nodeid int not null
> references tree,
> ancestorid int not null
> references tree,
> primary key (nodeid, ancestorid)
> );
>
> insert into ancestor values (7,5), (8,7), (8,5), (9,7), (9,5);
>
> You could use a query like:
>
> with suspects (nodeid, suspectid, depth) as (
> select
> a.nodeid, a.ancestorid,
> (select count(1) from ancestor where nodeid = a.ancestorid) as
> depth
> from ancestor a, data d
> where a.ancestorid = d.nodeid
> and d.color is not null
> union all
> select
> d.nodeid, d.nodeid,
> (select count(1) from ancestor where nodeid = d.nodeid) as
> depth
> from data d
> where d.color is not null
> and not exists (
> select 1 from ancestor where ancestorid = d.nodeid
> )
> )
> select s.nodeid, d.color from suspects s, data d
> where d.nodeid = s.suspectid
> and depth = (select max(depth) from suspects where nodeid =
> s.nodeid)
>
> NODEID COLOR
> ----------- ----------
> 7 GREEN
> 8 GREEN
> 9 BLUE
>
> 3 record(s) selected.
>
> As you can see node 5 is missing, but I'll leave that for you ;-)
>
>
> HTH
> /Lennart
>
>[color=green]
> > - Jeff[/color][/color]

**Lennart Jonsson** · Jul 20 '05, 03:56 AM

Re: Nested Coalescing possible in SQL?

lennart@kommuni cera.umea.se (Lennart Jonsson) wrote in message news:<6dae7e65. 0406022337.3388 70@posting.goog le.com>...[color=blue]
> jlanfield2003@y ahoo.com (Jeff Lanfield) wrote in message news:<235c483f. 0406021038.13a1 b83b@posting.go ogle.com>...[/color]

[...]

Hmmm, I must have been just a little bit tired there ;-) Should read
(diff in snd part of union):

with suspects (nodeid, suspectid, depth) as (
select
a.nodeid, a.ancestorid,
(select count(1) from ancestor where nodeid = a.ancestorid)
as depth
from ancestor a, data d
where a.ancestorid = d.nodeid
and d.color is not null
union all
select
t.nodeid, t.nodeid,
(select count(1) from ancestor where nodeid = t.nodeid) as
depth
from data d, tree t
where d.nodeid = t.nodeid
and d.color is not null
)
select s.nodeid, d.color from suspects s, data d
where d.nodeid = s.suspectid
and depth = (select max(depth) from suspects where nodeid =
s.nodeid)
;

NODEID COLOR
----------- ----------
5 GREEN
7 GREEN
8 GREEN
9 BLUE

/Lennart

**Lennart Jonsson** · Jul 20 '05, 03:56 AM

Re: Nested Coalescing possible in SQL?

jlanfield2003@y ahoo.com (Jeff Lanfield) wrote in message news:<235c483f. 0406031837.62f2 701@posting.goo gle.com>...[color=blue]
> Thanks Lennart. I think your third suggestion might do the trick. If
> you have a chance to answer I'm curious:
>
> 1) Why do you think storing the path (suggestion #2) is not an option?
> Or if it is option, why is it a bad idea compared to storing a
> separate ancestor relation? (aside from the usual normal form
> reasoning)
>[/color]

Since youre mainly concerned in retrieving ancestors, (not
investigating subtree
), I think it is a bit clumsy. Assume that you have a node x with path
1.2.3.56.89.112 and want to investigate which nodes in the path that
have property p. How would you do that? Subtree queries are much
easier since you can query: where path like '1.2.4.6.%'

[color=blue]
> 2) How would you structure the query if you stored the path
> (suggestion #2).
>[/color]

I would figure out a way to retrive ancestors and then use the same
way.

If I where in your shoes I would encapsulate the ancestor stuff in
views or table functions, and then use those in my queries. If you
decide to go for another aproach, you rewrite the functions/views and
can continue to use the queries. Simple example using table functions
in db2:

-- ancestor table aproach

CREATE FUNCTION SUBTREE (ID INTEGER)
RETURNS TABLE (NODEID INTEGER)
LANGUAGE SQL
READS SQL DATA
NO EXTERNAL ACTION
DETERMINISTIC
RETURN
select nodeid from ancestor where ancestorid = ID
@

CREATE FUNCTION SUBTREE_SELF (ID INTEGER)
RETURNS TABLE (NODEID INTEGER)
LANGUAGE SQL
READS SQL DATA
NO EXTERNAL ACTION
DETERMINISTIC
RETURN
select s.nodeid from table(subtree(I D)) as s
union all
values (ID)
@

-- changed my mind decided to go for path aproach instead

drop ...

CREATE FUNCTION SUBTREE (ID INTEGER)
RETURNS TABLE (NODEID INTEGER)
LANGUAGE SQL
READS SQL DATA
NO EXTERNAL ACTION
DETERMINISTIC
RETURN
select nodeid from tree where path like (ID || '.%')

etc

The same goes for depth

As a bonus it will simplify your original query. I.e

with suspects (nodeid, ancestorid, depth) as (
select
ps.nodeid, ps.ancestorid, depth_func(ps.n odeid) as depth
from data d, tree t, table(path_self (t.nodeid)) ps
where ps.ancestorid = d.nodeid and d.color is not null
)
select ...

HTH
/Lennart

[color=blue]
> Thanks!
>
> - Jeff
>
>
>
> lennart@kommuni cera.umea.se (Lennart Jonsson) wrote in message news:<6dae7e65. 0406022337.3388 70@posting.goog le.com>...[color=green]
> > jlanfield2003@y ahoo.com (Jeff Lanfield) wrote in message news:<235c483f. 0406021038.13a1 b83b@posting.go ogle.com>...
> >
> > [...]
> >[color=darkred]
> > >
> > > So say the tree is specified like this:
> > >
> > > (nodeId int, parentId int, color varchar, phone varchar)
> > >
> > > 5, 0,"GREEN", "555-1212"
> > > 7, 5, NULL, "777-5555"
> > > 8, 7 NULL, NULL
> > > 9, 7 "BLUE", NULL[/color][/color][/color]
[color=blue][color=green][color=darkred]
> > >
> > >
> > > That is: node 7 inherits values from 5. Nodes 8,9 inherit values from
> > > 7. Node 5 is the top level node.
> > >
> > > I want to run a query that would give the following result set:
> > >
> > > select nodeId, color, phone from ...
> > >
> > > 5,"GREEN", "555-1212"
> > > 7,"GREEN", "777-5555"
> > > 8,"GREEN", "777-5555"
> > > 9,"BLUE", "777-5555"
> > >
> > > Can such a query be constructed?
> > >[/color]
> >
> > If you dont have a maximum depth of your tree (and cannot use
> > recursion), then no. The reason is that you cannot express queries
> > like "gimmie the ancestors of x". What you need to do is to inform
> > your system that 5 is an ancestor of 9. There are several ways of
> > doing this. Troels Arvin has a page with links to articles on the
> > subject:
> >
> > http://troels.arvin.dk/db/rdbms/links/#hierarchical
> >
> > 1. Nested set: google for Celko and nested. There are also variants of
> > this.
> > 2. Store the path in each node. In your case something like:
> >
> > 5,'',"GREEN", "555-1212"
> > 7,'5',"GREEN", "777-5555"
> > 8,'5.7',"GREEN" , "777-5555"
> > 9,'5.7',"BLUE", "777-5555"
> >
> > In your case I dont think this is an option
> >
> > 3. Store a separate ancestor relation. In your case
> >
> > create table ancestor (
> > nodeid int not null
> > references tree,
> > ancestorid int not null
> > references tree,
> > primary key (nodeid, ancestorid)
> > );
> >
> > insert into ancestor values (7,5), (8,7), (8,5), (9,7), (9,5);
> >
> > Main drawback is that you have to maintain this relation as you
> > modifies your tree. I have some notes on how that can be achieved:
> >
> > http://fungus.teststation.com/~jon/t...eeHandling.htm
> >
> >
> >
> > Anyhow, once you have a way of asking for ancestors for a given node,
> > you can do what you want in a single query, namely:
> >
> > find the ancestor at the largest depth with property p
> >
> > Assuming the following ddl
> >
> > create table tree (
> > nodeid int not null primary key,
> > parent int not null
> > references tree
> > on delete restrict
> > );
> >
> > create table data (
> > nodeid int not null primary key
> > references tree
> > on delete cascade,
> > color varchar(10),
> > phone varchar(10)
> > );
> >
> > insert into tree values (5,5), (7,5), (8,7), (9,7);
> > insert into data values (5, 'GREEN', '555-1212'), (7, NULL,
> > '777-5555'), (8, NUL
> > L, NULL), (9, 'BLUE', NULL);
> >
> > create table ancestor (
> > nodeid int not null
> > references tree,
> > ancestorid int not null
> > references tree,
> > primary key (nodeid, ancestorid)
> > );
> >
> > insert into ancestor values (7,5), (8,7), (8,5), (9,7), (9,5);
> >
> > You could use a query like:
> >
> > with suspects (nodeid, suspectid, depth) as (
> > select
> > a.nodeid, a.ancestorid,
> > (select count(1) from ancestor where nodeid = a.ancestorid) as
> > depth
> > from ancestor a, data d
> > where a.ancestorid = d.nodeid
> > and d.color is not null
> > union all
> > select
> > d.nodeid, d.nodeid,
> > (select count(1) from ancestor where nodeid = d.nodeid) as
> > depth
> > from data d
> > where d.color is not null
> > and not exists (
> > select 1 from ancestor where ancestorid = d.nodeid
> > )
> > )
> > select s.nodeid, d.color from suspects s, data d
> > where d.nodeid = s.suspectid
> > and depth = (select max(depth) from suspects where nodeid =
> > s.nodeid)
> >
> > NODEID COLOR
> > ----------- ----------
> > 7 GREEN
> > 8 GREEN
> > 9 BLUE
> >
> > 3 record(s) selected.
> >
> > As you can see node 5 is missing, but I'll leave that for you ;-)
> >
> >
> > HTH
> > /Lennart
> >
> >[color=darkred]
> > > - Jeff[/color][/color][/color]

**Jan** · Jul 20 '05, 03:56 AM

Re: Nested Coalescing possible in SQL?

e.g. in Oracle 9i,

you can:

Creating demo data

CREATE TABLE T
(
CHILD_ID NUMBER,
PARENT_ID NUMBER,
ATTR1 VARCHAR2(10),
ATTR2 VARCHAR2(10)
)

INSERT INTO T VALUES (1 NULL,'A','000') ;
INSERT INTO T VALUES (2,1, NULL, '111');
INSERT INTO T VALUES (3,1, 'B', NULL);
INSERT INTO T VALUES (4,2, NULL, NULL);
INSERT INTO T VALUES (5, 2, 'C', '999');
INSERT INTO T VALUES (6, 5, NULL, NULL);

and the query is (it is too ugly with INSTR/SUBSTR, but maybe faster
then an inline query per each attribute with another CONNECT BY):

------------------
SELECT child_id,PARENT _ID,attr1,attr2 ,tree_path,
RTRIM(SUBSTR(al l_attr1,INSTR(a ll_attr1,'/',-1,2)+1),'/')
inherit_attr1,
RTRIM(SUBSTR(al l_attr2,INSTR(a ll_attr2,'/',-1,2)+1),'/')
inherit_attr2
FROM

(SELECT child_id,parent _id,attr1,attr2 ,
SYS_CONNECT_BY_ PATH(TO_CHAR(ch ild_id), '/') tree_path,
CASE
WHEN attr1 IS NOT NULL THEN '/'||attr1||'/'
ELSE REPLACE('/'||SYS_CONNECT_ BY_PATH(attr1,'/'),'//','/')
END all_attr1,

CASE
WHEN attr2 IS NOT NULL THEN '/'||attr2||'/'
ELSE REPLACE('/'||SYS_CONNECT_ BY_PATH(attr2,'/'),'//','/')
END all_attr2

FROM T
START WITH parent_id IS NULL
CONNECT BY parent_id=PRIOR child_id) v
--------------------------------------------------------

jlanfield2003@y ahoo.com (Jeff Lanfield) wrote in message news:<235c483f. 0406011716.37d0 0399@posting.go ogle.com>...[color=blue]
> First of all, I apologize if coalescing is not the right term to
> describe my problem. I have a tree where each node has the same set of
> attributes (is the same entity) but child nodes should inherit
> attribute values from parent node.
>
> for example, say I have the following table:
>
> (nodeId int , color varchar, phone varchar) with two rows
>
> 5, "GREEN", "555-1212"
> 7, NULL, "777-5555"
> 8, NULL, NULL
> 9, "BLUE", NULL
>
> in addition there is a tree structure that specifies that node 5 is
> the parent of node 7, and 7 is the parent of nodes 8 and 9. I know
> there is many ways to make trees in SQL but as a simple example let's
> say the tree is:
>
> id, parentid
> 8, 7
> 9, 7
> 7, 5
>
> Thus in this case, node 7 inherits the value "GREEN" from node 5 for
> attribute "color", but provides its own value "777-5555" for attribute
> "phone". Node 8, in turn, inherits "GREEN" for "color" from node 7
> (really from node 5 since 7 did not specify its own) and "777-5555"
> for "phone" from node 7. Node 9 provides its own value for "color" and
> inherits the one for "phone" from Node 7.
>
> So the runtime values in the application are:
>
> Node 5: "GREEN", "555-1212"
> Node 7: "GREEN", "777-5555"
> Node 8: "GREEN", "777-5555"
> Node 9: "BLUE", "777-5555"
>
> Question 1: Is there a single SQL statement that for a given node can
> replace the NULLs with inherited values from the parent node?
>
> Question 2: Is there a better way to structure such data in SQL as to
> make answer to question 1 possible?
>
> I would restate the problem as follows:
>
> In a nested structure child nodes inherit values from parent nodes _by
> reference_ or specify their own. "By reference" is the key word here.
> If it wasn't for that you could just duplicate the necessary values
> from the parent entitity upon creation.
>
> Thanks!
>
> - Jeff[/color]

**Mikito Harakiri** · Jul 20 '05, 03:56 AM

Re: Nested Coalescing possible in SQL?

"Lennart Jonsson" <lennart@kommun icera.umea.se> wrote in message
news:6dae7e65.0 406032300.77888 608@posting.goo gle.com...[color=blue]
> jlanfield2003@y ahoo.com (Jeff Lanfield) wrote in message[/color]
news:<235c483f. 0406031837.62f2 701@posting.goo gle.com>...[color=blue]
> Since youre mainly concerned in retrieving ancestors, (not
> investigating subtree
> ), I think it is a bit clumsy. Assume that you have a node x with path
> 1.2.3.56.89.112 and want to investigate which nodes in the path that
> have property p. How would you do that? Subtree queries are much
> easier since you can query: where path like '1.2.4.6.%'[/color]

Small procedural part -- table function -- which returns ancestors
materialized path encodings set

1.2.3.56.89
1.2.3.56
1.2.3
1.2
1

for the input encoding 1.2.3.56.89.112 could be plugged in into the query
returning all the ancestors. This idea translated into the nested intervals
encoding (if you prefer to work with numbers instead of string parsing) is
implemented at the end of

Nested Intervals with Farey Fractions

http://arxiv.org/html/cs.DB/0401014

BTW, http://arxiv.org/abs/cs.DB/0402051 is significantly rewritten.
Essentially, it is Euclidean algorithm (the mother of all algorithms?) that
is employed in the numerical counterpart of the Materialized Path.

**Lennart Jonsson** · Jul 20 '05, 03:56 AM

Re: Nested Coalescing possible in SQL?

lennart@kommuni cera.umea.se (Lennart Jonsson) wrote in message news:<6dae7e65. 0406031939.1ed5 fa49@posting.go ogle.com>...[color=blue]
> lennart@kommuni cera.umea.se (Lennart Jonsson) wrote in message news:<6dae7e65. 0406022337.3388 70@posting.goog le.com>...[color=green]
> > jlanfield2003@y ahoo.com (Jeff Lanfield) wrote in message news:<235c483f. 0406021038.13a1 b83b@posting.go ogle.com>...[/color]
>
> [...]
>
> Hmmm, I must have been just a little bit tired there ;-) Should read
> (diff in snd part of union):
>[/color]

and as you probably discovered already, it can be further simplified as

[...]
union all
select
d.nodeid, d.nodeid,
(select count(1) from ancestor where nodeid = d.nodeid) as
depth
from data d
where d.color is not null
)

[...]

/L

**--CELKO--** · Jul 20 '05, 03:56 AM

Re: Nested Coalescing possible in SQL?

>> if an entity instance (one row) does not specify a value for one of
its field [sic]s it should inherit the values from its nearest parent
that does have a value for this field [sic]. <<

Rows are not records; fields are not columns; tables are not files.
It drives me nuts to see people screw up the terms and therefore their
mental model of how SQL works. Let's put this into a simplified
nested set model which has more NULL-able columns than the payroll
system of a major auto company:

CREATE TABLE Nodes
(node_id INTEGER NOT NULL PRIMARY KEY,
col_1 CHAR(5),
col_2 CHAR(5),
..);

CREATE TABLE Tree
(node_id INTEGER NOT NULL UNIQUE
REFERENCES Nodes(node_id),
lft INTEGER NOT NULL,
rgt INTEGER NOT NULL,
PRIMARY KEY (lft, rgt),
<< other tree constraints here >>
..);

Now update all the rows where any column has a NULL. You will have to
run this update once for every level in the tree at most.

UPDATE Nodes
SET col_1
= COALESCE(Nodes. col_1, -- keep non-nul
(SELECT B.col_1 -- find parent
FROM Tree AS E
LEFT OUTER JOIN
Tree AS B
ON B.lft
= (SELECT MAX(lft)
FROM Tree AS S
WHERE E.lft > S.lft
AND E.lft < S.rgt)),
col_2
= COALESCE(Nodes. col_2,
(SELECT B.col_2
FROM ..)),

Etc.
WHERE (col_1 || col_2 || ..||col_n) IS NULL;

This probably scared you. The WHERE clause uses the fact that NULLs
propagate; we don't care *which* column is NULL, so why look for
useless details? That left outer join scalar subquery is how to get
the immediate superiors (B= boss, E= employee) in a hierarchy. The
COALESCE() will retain an existing non-NULL value.

Nested Coalescing possible in SQL?

Nested Coalescing possible in SQL?

Comment

Comment

Comment

Comment

Comment

Comment

Comment

Comment

Comment

Comment

Comment

Comment