Using XSLT and XPath for graph data structure processing?

Collapse
This topic is closed.
X
X
 
  • Time
  • Show
Clear All
new posts
  • Ramon M. Felciano

    Using XSLT and XPath for graph data structure processing?

    Helo all --

    I'm trying to gain a deeper understand for what type of
    semi-declarative programming can be done through XML and XPath/XSLT.
    I'm looking at graph processing problems as a testbed for this, and
    came across a problem that I haven't been able to solve elegantly. The
    problem is to find "linker" vertexes that a pair of verteces from a
    pre-defined set. For example, if the graph verteces represent cities
    and edges represent flights between them, then given a list of cities,
    find all intermediate cities that you would stop in via a "connecting
    flight".

    For example, given the following simple graph:

    V1 -> V2 -> V3 -> V4
    \<- V5 ->/

    (V5 points to both V2 and V4), and its XML serialization:

    <graph>
    <vertex id="V1"/>
    <vertex id="V2" type="anchor"/>
    <vertex id="V3"/>
    <vertex id="V4" type="anchor"/>
    <vertex id="V5"/>
    <edge source="V1" target="V2"/>
    <edge source="V2" target="V3"/>
    <edge source="V3" target="V4"/>
    <edge source="V5" target="V2"/>
    <edge source="V5" target="V4"/>
    </graph>

    I would like to transform this into a second graph where all vertexes
    that "link" two anchor distinct vertexes are flagged as link nodes. In
    this case, there are two anchor vertexes V2 and V4, and I can link
    them through V3 (V2 -> V3 -> V4) and V5 (V2 <- V5 -> V4). Note that
    linked verteces must be distinct, so traversing the V2 <- V1 -> V2
    path should not yield V1 as a link node. So I'd like to see something
    like this:

    <graph>
    <vertex id="V1"/>
    <vertex id="V2" type="anchor"/>
    <vertex id="V3" linker="true"/>
    <vertex id="V4" type="anchor"/>
    <vertex id="V5" linker="true"/>
    <edge source="V1" target="V2"/>
    <edge source="V2" target="V3"/>
    <edge source="V3" target="V4"/>
    <edge source="V5" target="V2"/>
    <edge source="V5" target="V4"/>
    </graph>

    It would be ideal to come up with a generalized solution that would
    let you use 1, 2, .. N intermediate linking nodes. I've been able to
    get this working with nested loops, but it isn't particularly
    declarative or speedy, and is certainly more verbose than I'd like, so
    I'm wondering if anyone here has insights into how to do this
    elegantly and in XSLT/XPath style. For example, is it possible to
    write a single XPath expression that will select <vertex> elements
    that obey the above criteria? If not, does anyone have any suggestions
    for how to code this effectively and efficiently with XSLT? Or is
    XPath/XSLT not the right paradigm for this type of information
    processing and transformation?

    Thanks!

    Ramon
  • Andy Dingley

    #2
    Re: Using XSLT and XPath for graph data structure processing?

    On 11 Jan 2004 12:41:08 -0800, felciano@yahoo. com (Ramon M. Felciano)
    wrote:
    [color=blue]
    >I'm trying to gain a deeper understand for what type of
    >semi-declarative programming can be done through XML and XPath/XSLT.
    >I'm looking at graph processing problems as a testbed for this,[/color]

    Don't do graphs in XML, use RDF instead.

    Jave & Jena are the tools to use.
    --
    Do whales have krillfiles ?

    Comment

    • Kenneth Stephen

      #3
      Re: Using XSLT and XPath for graph data structure processing?


      "Andy Dingley" <dingbat@codesm iths.com> wrote in message
      news:81m3005veb o5n0i15db6q2o7a vqir2ntau@4ax.c om...[color=blue]
      > On 11 Jan 2004 12:41:08 -0800, felciano@yahoo. com (Ramon M. Felciano)
      > wrote:
      >[color=green]
      > >I'm trying to gain a deeper understand for what type of
      > >semi-declarative programming can be done through XML and XPath/XSLT.
      > >I'm looking at graph processing problems as a testbed for this,[/color]
      >
      > Don't do graphs in XML, use RDF instead.
      >[/color]
      Andy,

      ????? You are aware of course of RDF over XML?

      Regards,
      Kenneth


      Comment

      • Andy Dingley

        #4
        Re: Using XSLT and XPath for graph data structure processing?

        On Mon, 12 Jan 2004 20:57:59 GMT, "Kenneth Stephen"
        <kstephen@austi n.rr.com> wrote:
        [color=blue]
        > ????? You are aware of course of RDF over XML?[/color]

        Sure, but you don't process it _as_ XML.

        XML does tree structure, not graphs. ID & IDREF just don't do anything
        useful for you and there's a whole issue about the difficulty of
        working with XML which contains a graph that isn't entirely
        represented by that document (ie. external resources)

        If you want to work with "a graph" in something that's transported "in
        XML", then you need to build a layer above XML. Or you can use one,
        like RDF, that already exists.

        --
        Do whales have krillfiles ?

        Comment

        • Patrick TJ McPhee

          #5
          Re: Using XSLT and XPath for graph data structure processing?

          In article <b3cf7422.04011 11241.20293d7a@ posting.google. com>,
          Ramon M. Felciano <felciano@yahoo .com> wrote:

          % I'm trying to gain a deeper understand for what type of
          % semi-declarative programming can be done through XML and XPath/XSLT.

          [...]

          [given]

          % <graph>
          % <vertex id="V1"/>
          % <vertex id="V2" type="anchor"/>
          % <vertex id="V3"/>
          % <vertex id="V4" type="anchor"/>
          % <vertex id="V5"/>
          % <edge source="V1" target="V2"/>
          % <edge source="V2" target="V3"/>
          % <edge source="V3" target="V4"/>
          % <edge source="V5" target="V2"/>
          % <edge source="V5" target="V4"/>
          % </graph>

          % this case, there are two anchor vertexes V2 and V4, and I can link
          % them through V3 (V2 -> V3 -> V4) and V5 (V2 <- V5 -> V4). Note that

          I don't really understand this last one. It seems like this is a directed
          graph, so to have a link, you need to start at one anchor and
          go to the other. V5 would be a linker if there were some way to
          get to it from an anchor.

          [we want to get]

          % <graph>
          % <vertex id="V1"/>
          % <vertex id="V2" type="anchor"/>
          % <vertex id="V3" linker="true"/>
          % <vertex id="V4" type="anchor"/>
          % <vertex id="V5" linker="true"/>
          % <edge source="V1" target="V2"/>
          % <edge source="V2" target="V3"/>
          % <edge source="V3" target="V4"/>
          % <edge source="V5" target="V2"/>
          % <edge source="V5" target="V4"/>
          % </graph>

          I start by defining keys which can be used to find the anchors, the
          sources for each edge, and the targets for each edge:

          <xsl:key name='anchors' use='@id' match='vertex[@type = "anchor"]'/>
          <xsl:key name='sources' use='@source' match='edge'/>
          <xsl:key name='targets' use='@target' match='edge'/>

          If we pay attention to the directions of the edges, we want an
          expression which is true when the current node is the target of an edge
          from one or more anchors and the source of an edge towards one or more
          anchors. We can use the `targets' key and the current node's id attribute
          to find all the edges for which it's a target

          key('targets', @id)

          That's a node set made up of edge elements. We can turn it into
          a node set made up of the corresponding source attributes

          key('targets', @id)/@source

          then use that as an argument to key, to find the set anchor vertices
          which are sources of the current node:

          key('anchors', key('targets', @id)/@source)

          If the current node is the target of an edge from an anchor, the
          resulting node set will be non-empty. Similarly, here's the set
          of anchor nodes which are targets of edges from the current node

          key('anchors', key('sources', @id)/@target)

          we can use those node sets as boolean values, and so create a match
          expression which matches all vertex elements which are targets
          of an edge from an anchor and sources of an edge to an anchor


          match="vertex[key('anchors', key('targets', @id)/@source) and
          key('anchors', key('sources', @id)/@target)]"

          to ignore the direction of the edges, we can create a union of
          those two node sets and count the elements in the union

          match="vertex[count(key('anch ors', key('targets', @id)/@source)|
          key('anchors', key('sources', @id)/@target)) > 1]"


          Note that if we were to make an edge from V2 to V1, the `directed'
          match expression would call V1 a link, but the undirected one would
          not.

          The full style sheet is

          <xsl:styleshe et
          xmlns:xsl = 'http://www.w3.org/1999/XSL/Transform' version='1.0'>

          <xsl:key name='anchors' use='@id' match='vertex[@type = "anchor"]'/>
          <xsl:key name='sources' use='@source' match='edge'/>
          <xsl:key name='targets' use='@target' match='edge'/>

          <xsl:template match='node()|@ *'>
          <xsl:copy>
          <xsl:apply-templates select='node()| @*'/>
          </xsl:copy>
          </xsl:template>

          <xsl:template match='vertex[count(key("anch ors", key("targets", @id)/@source)
          |key("anchors", key("sources", @id)/@target)) > 1]'>
          <xsl:copy>
          <xsl:apply-templates select='node()| @*'/>
          <xsl:attribut e name='linker'>t rue</xsl:attribute>
          </xsl:copy>
          </xsl:template>

          </xsl:stylesheet>

          % It would be ideal to come up with a generalized solution that would
          % let you use 1, 2, .. N intermediate linking nodes.

          I'm not sure off the top of my head how you would generalise this.
          --

          Patrick TJ McPhee
          East York Canada
          ptjm@interlog.c om

          Comment

          • Dimitre Novatchev

            #6
            The General Solution (Was:Re: Using XSLT and XPath for graph data structure processing?)

            felciano@yahoo. com (Ramon M. Felciano) wrote in message news:<b3cf7422. 0401111241.2029 3d7a@posting.go ogle.com>...[color=blue]
            > Helo all --
            >
            > I'm trying to gain a deeper understand for what type of
            > semi-declarative programming can be done through XML and XPath/XSLT.
            > I'm looking at graph processing problems as a testbed for this, and
            > came across a problem that I haven't been able to solve elegantly. The
            > problem is to find "linker" vertexes that a pair of verteces from a
            > pre-defined set. For example, if the graph verteces represent cities
            > and edges represent flights between them, then given a list of cities,
            > find all intermediate cities that you would stop in via a "connecting
            > flight".
            >
            > For example, given the following simple graph:
            >
            > V1 -> V2 -> V3 -> V4
            > \<- V5 ->/
            >
            > (V5 points to both V2 and V4), and its XML serialization:
            >
            > <graph>
            > <vertex id="V1"/>
            > <vertex id="V2" type="anchor"/>
            > <vertex id="V3"/>
            > <vertex id="V4" type="anchor"/>
            > <vertex id="V5"/>
            > <edge source="V1" target="V2"/>
            > <edge source="V2" target="V3"/>
            > <edge source="V3" target="V4"/>
            > <edge source="V5" target="V2"/>
            > <edge source="V5" target="V4"/>
            > </graph>
            >
            > I would like to transform this into a second graph where all vertexes
            > that "link" two anchor distinct vertexes are flagged as link nodes. In
            > this case, there are two anchor vertexes V2 and V4, and I can link
            > them through V3 (V2 -> V3 -> V4) and V5 (V2 <- V5 -> V4). Note that
            > linked verteces must be distinct, so traversing the V2 <- V1 -> V2
            > path should not yield V1 as a link node. So I'd like to see something
            > like this:
            >
            > <graph>
            > <vertex id="V1"/>
            > <vertex id="V2" type="anchor"/>
            > <vertex id="V3" linker="true"/>
            > <vertex id="V4" type="anchor"/>
            > <vertex id="V5" linker="true"/>
            > <edge source="V1" target="V2"/>
            > <edge source="V2" target="V3"/>
            > <edge source="V3" target="V4"/>
            > <edge source="V5" target="V2"/>
            > <edge source="V5" target="V4"/>
            > </graph>
            >
            > It would be ideal to come up with a generalized solution that would
            > let you use 1, 2, .. N intermediate linking nodes. I've been able to
            > get this working with nested loops, but it isn't particularly
            > declarative or speedy, and is certainly more verbose than I'd like, so
            > I'm wondering if anyone here has insights into how to do this
            > elegantly and in XSLT/XPath style. For example, is it possible to
            > write a single XPath expression that will select <vertex> elements
            > that obey the above criteria? If not, does anyone have any suggestions
            > for how to code this effectively and efficiently with XSLT? Or is
            > XPath/XSLT not the right paradigm for this type of information
            > processing and transformation?
            >
            > Thanks!
            >
            > Ramon[/color]


            Ramon,

            Here's the general solution and it is quite simple and
            straightforward . I use a recursive template based on the following
            rule:

            Paths(/.., X, Z) = Union( {X -> Xi} . Paths(X, Xi, Z) )

            Where the function Paths(Eset, X, Z) returns all paths from X to Z,
            that do not include any vertices belonging to the exclusion-set Eset.

            Here {X -> Xi} are all edges from X.

            As you can see the code of the transformation is 71 lines long, but it
            should be noted that for the purpose of readability I write some XPath
            expressions on several lines and also half of these 71 lines are just
            closing tags.

            Here's the code. This transformation:

            <xsl:styleshe et version="1.0"
            xmlns:xsl="http ://www.w3.org/1999/XSL/Transform"
            xmlns:exsl="htt p://exslt.org/common"
            exclude-result-prefixes="exsl"[color=blue]
            >[/color]
            <xsl:output omit-xml-declaration="ye s" indent="yes"/>

            <xsl:key name="kNeighbor s" match="vertex"
            use="../edge[@target = current()/@id]/@source"/>

            <xsl:key name="kNeighbor s" match="vertex"
            use="../edge[@source = current()/@id]/@target"/>

            <xsl:template match="/">
            <xsl:call-template name="getPaths" >
            <xsl:with-param name="pNode1"
            select="/*/vertex[@type='anchor'][1]"/>
            <xsl:with-param name="pNode2"
            select="/*/vertex[@type='anchor'][2]"/>
            </xsl:call-template>
            </xsl:template>

            <xsl:template name="getPaths" >
            <xsl:param name="pNode1" select="/.."/>
            <xsl:param name="pNode2" select="/.."/>
            <xsl:param name="pExcluded " select="/.."/>

            <xsl:for-each select=
            "key('kNeighbor s', $pNode1/@id)
            [not(@id = $pExcluded/@id)]">
            <xsl:choose>
            <xsl:when test="@id = $pNode2/@id">
            <path>
            <xsl:copy-of
            select="/*/edge[$pNode1/@id = @*
            and
            $pNode2/@id = @*
            ]"/>
            </path>
            </xsl:when>
            <xsl:otherwis e>
            <xsl:variable name="vrtfTail" >
            <xsl:call-template name="getPaths" >
            <xsl:with-param name="pNode1"
            select="."/>
            <xsl:with-param name="pNode2"
            select="$pNode2 "/>
            <xsl:with-param name="pExcluded "
            select="$pExclu ded | $pNode1"/>
            </xsl:call-template>
            </xsl:variable>

            <xsl:variable name="vTail"
            select="exsl:no de-set($vrtfTail)/*"/>

            <xsl:if test="$vTail">
            <path>
            <xsl:copy-of
            select="/*/edge[$pNode1/@id = @*
            and
            current()/@id = @*
            ]"/>

            <xsl:copy-of select="$vTail/*"/>
            </path>
            </xsl:if>
            </xsl:otherwise>
            </xsl:choose>
            </xsl:for-each>
            </xsl:template>
            </xsl:stylesheet>

            when applied on this source.xml:

            <graph>
            <vertex id="V1"/>
            <vertex id="V2" type="anchor"/>
            <vertex id="V3"/>
            <vertex id="V4" type="anchor"/>
            <vertex id="V5"/>
            <edge source="V1" target="V2"/>
            <edge source="V1" target="V3"/>
            <edge source="V2" target="V3"/>
            <edge source="V3" target="V4"/>
            <edge source="V5" target="V2"/>
            <edge source="V5" target="V4"/>
            </graph>

            produces thewanted result:

            <path>
            <edge source="V1" target="V2"/>
            <edge source="V1" target="V3"/>
            <edge source="V3" target="V4"/>
            </path>
            <path>
            <edge source="V2" target="V3"/>
            <edge source="V3" target="V4"/>
            </path>
            <path>
            <edge source="V5" target="V2"/>
            <edge source="V5" target="V4"/>
            </path>

            These are all three different paths (with all nodes in the path only
            once) from V2 to V4 in the graph described by the xml above.

            The first solution:

            V2 -> V1 -> V3 ->V4

            is 3 edges long.

            The other two are two edges long each.


            I hope this helped in this specific problem and also to answer
            positively your questions about the expressiveness of XPath/XSLT and
            the appropriateness of XSLT as a tool for solving this type of
            problems.


            Dimitre Novatchev.
            FXSL developer, XML Insider,

            http://fxsl.sourceforge.net/ -- the home of FXSL
            Resume: http://fxsl.sf.net/DNovatchev/Resume/Res.html

            Comment

            • Ed Beroset

              #7
              Re: The General Solution (Was:Re: Using XSLT and XPath for graphdata structure processing?)

              Dimitre Novatchev wrote:
              [color=blue]
              > I hope this helped in this specific problem and also to answer
              > positively your questions about the expressiveness of XPath/XSLT and
              > the appropriateness of XSLT as a tool for solving this type of
              > problems.[/color]

              What, no SVG output? :-) I'm kidding. Seriously, I wasn't the
              original poster on that one, but I enjoyed looking over your solution.
              Nice work!

              Ed

              Comment

              Working...