XPath for non-empty #text nodes (tDOM)

  • Mikhail Teterin

    XPath for non-empty #text nodes (tDOM)

    Hello!

    What would be the syntax for a query, which would allow me to get only the
    elements with non-empty text-nodes?

    For example, from:

    <a><b></b></a>
    <c/>
    <d><e>meow</e></d>

    I only want to get the d/e ... If this matters, I'm using Tcl with the tDOM
    extension. Thanks!

    -mi

  • Richard Tobin

    #2
    Re: XPath for non-empty #text nodes (tDOM)

    In article <44qdncuBAoFht2DanZ2dnUVZ_s-pnZ2d@speakeasy.net>,
    Mikhail Teterin <usenet+mill@aldan.algebra.com> wrote:
    >What would be the syntax for a query, which would allow me to get only the
    >elements with non-empty text-nodes?
    Presumably you mean direct text children (rather than descendants),
    and by "non-empty" you mean "containing something other than whitespace".

    If so, try

    *[normalize-space(text()) != ""]

    (or with // at the start, depending on the context).

    -- Richard
    --
    :wq
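
A rough Python cross-check of what that predicate selects, for anyone without tDOM at hand. This is only a sketch under assumptions: the standard library's xml.etree.ElementTree has no normalize-space(), so the test is emulated by hand; the helper names first_text_node and has_nonblank_text are made up for illustration; and the sample from the question is wrapped in a hypothetical <root> element because the fragment has no single root. In XPath 1.0, passing text() to normalize-space() uses only the first text child, which is what the helper mimics.

```python
import xml.etree.ElementTree as ET

# XPath 1.0 whitespace: space, tab, carriage return, line feed.
XML_WS = " \t\r\n"


def first_text_node(elem):
    """Return the first direct text child of elem, or "" if it has none."""
    if elem.text:
        return elem.text
    for child in elem:
        if child.tail:
            return child.tail
    return ""


def has_nonblank_text(elem):
    """Emulate the predicate [normalize-space(text()) != ""]."""
    return first_text_node(elem).strip(XML_WS) != ""


# The sample from the question, wrapped in a hypothetical <root>.
root = ET.fromstring("<root><a><b></b></a>\n<c/>\n<d><e>meow</e></d></root>")

# root.iter() visits the root and every descendant, like //* would.
matches = [el.tag for el in root.iter() if has_nonblank_text(el)]
print(matches)  # ['e']
```

Note that an element whose first text child is only whitespace fails the test even if a later text child has content, because only the first one is looked at.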

    • Mikhail Teterin

      #3
      Re: XPath for non-empty #text nodes (tDOM)

      Richard Tobin wrote:
      >Presumably you mean direct text children (rather than descendants),
      >and by "non-empty" you mean "containing something other than whitespace".
      >
      >If so, try
      >
      >*[normalize-space(text()) != ""]
      >
      >(or with // at the start, depending on the context).
      Yep, that worked, believe it or not. Thank you very much for the quick
      response!

      Yours,

      -mi

      • Mikhail Teterin

        #4
        more XPath struggles (tDOM)

        Hello!

        I need to locate all nodes named "mp:date". For some reason, the

        $xml selectNodes {//mp:date}

        does not find any, but

        $xml selectNodes {//*[name()="mp:date"]}

        works. What's the reason for the difference? xml_grep (Perl's front-end to
        XPath) finds the right things with just "//mp:date".

        Of these nodes, I actually only need the ones where the attribute
        "xc:value" is not "TODAY". How do I express this?

        Thanks a lot!

        -mi


        • Mikhail Teterin

          #5
          XPath queries: is or/and possible? (tDOM)

          Hello!

          I need a rule that will check a certain attribute (xc:value) of all nodes
          against /several/ different values...

          The following syntax:

          $txml selectNodes {//*[@xc:value!="FOO" and @xc:value!="BAR"}

          does not generate syntax errors, but does not return anything either...

          I tried it with and without the xc:-prefix... What am I doing wrong?

          Thank you!

          -mi

          • Richard Tobin

            #6
            Re: XPath queries: is or/and possible? (tDOM)

            In article <0MadnUIyWvj-5IbVnZ2dnUVZ_gednZ2d@speakeasy.net>,
            Mikhail Teterin <usenet+mill@aldan.algebra.com> wrote:
            >$txml selectNodes {//*[@xc:value!="FOO" and @xc:value!="BAR"}
            You seem to have a missing close-bracket, but I assume that's a typo.

            Otherwise the XPath looks reasonable - I don't know anything about tDOM.

            Bear in mind that @foo != "bar" means "there exists a foo attribute
            not equal to 'bar'". A node with no foo attribute doesn't pass.
            >I tried it with and without the xc:-prefix... What am I doing wrong?
            Presumably you are doing whatever is necessary to bind that prefix.

            -- Richard
            --
            :wq
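
The existential reading of != is easy to trip over, so here is a small Python sketch of it using only the standard library's xml.etree.ElementTree (not tDOM). The sample document and the names KEY, EXCLUDED, exists_and_differs and no_equal_value are made up for illustration; only the xc prefix and the XmlCache namespace name come from the thread. The two list comprehensions spell out in plain Python what the two XPath predicates named in the comments ask for.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample; only the xc prefix and its namespace name
# are taken from the thread.
doc = ET.fromstring(
    '<xc:XmlCache xmlns:xc="XmlCache">'
    '<item xc:value="FOO"/>'
    '<item xc:value="BAR"/>'
    '<item xc:value="BAZ"/>'
    '<item/>'
    '</xc:XmlCache>'
)

# ElementTree stores a namespaced attribute under "{namespace}localname".
KEY = "{XmlCache}value"
EXCLUDED = ("FOO", "BAR")

# Like [@xc:value != "FOO" and @xc:value != "BAR"]:
# the attribute has to exist before it can differ from anything.
exists_and_differs = [
    el.get(KEY) for el in doc.iter()
    if KEY in el.attrib and el.get(KEY) not in EXCLUDED
]

# Like [not(@xc:value = "FOO" or @xc:value = "BAR")]:
# elements without the attribute pass as well (shown as None).
no_equal_value = [
    el.get(KEY) for el in doc.iter()
    if el.get(KEY) not in EXCLUDED
]

print(exists_and_differs)  # ['BAZ']
print(no_equal_value)      # [None, 'BAZ', None]
```

Which of the two forms is wanted depends on whether nodes lacking the attribute should count as "not FOO".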

            • Joseph J. Kesselman

              #7
              Re: more XPath struggles (tDOM)

              Mikhail Teterin wrote:
              >$xml selectNodes {//mp:date}
              >
              >does not find any, but
              >
              >$xml selectNodes {//*[name()="mp:date"]}
              >
              >works. What's the reason for the difference?
              XPath is namespace-aware. To select a namespaced node, you must use a
              prefix in your path (which you did) and tell your XPath evaluator what
              the prefix is bound to (which you didn't). Look at the user's manual for
              your tool.

              (comp.lang.xml doesn't exist on my server, so I can't crosspost there.)
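
To make "tell your XPath evaluator what the prefix is bound to" concrete, here is a minimal sketch in Python's standard xml.etree.ElementTree rather than tDOM, whose own mechanism is in its manual. The sample document, the date value and the name NS are assumptions made for illustration; the namespace names follow the thread. The point is only that the prefix map travels alongside the path expression as a separate argument.

```python
import xml.etree.ElementTree as ET

# Hypothetical document; the namespace names follow the thread's example.
doc = ET.fromstring(
    '<xc:XmlCache xmlns:xc="XmlCache" xmlns:mp="MarketParameters">'
    '<mp:date xc:value="TODAY"/>'
    '<mp:date xc:value="2006-01-31"/>'
    '</xc:XmlCache>'
)

# The caller supplies the prefix-to-namespace bindings explicitly;
# the path expression alone does not say what "mp" means.
NS = {"mp": "MarketParameters", "xc": "XmlCache"}

dates = doc.findall(".//mp:date", NS)
print(len(dates))    # 2
print(dates[0].tag)  # {MarketParameters}date
```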

              • Mikhail Teterin

                #8
                Re: more XPath struggles (tDOM)

                Joseph J. Kesselman wrote:
                >XPath is namespace-aware. To select a namespaced node, you must use a
                >prefix in your path (which you did) and tell your XPath evaluator what
                >the prefix is bound to (which you didn't). Look at the user's manual for
                >your tool.
                Thanks. After I explicitly set:

                $xml selectNodesNamespaces {mp MarketParameters xc XmlCache}

                I got some progress... But the namespaces are defined in the document itself
                -- for example:

                <xc:XmlCache xmlns:xc="XmlCache" xc:action="Update">

                why do I still need to specify them? It certainly works with xml_grep... Is
                there a bug in the package (tDOM), or is the above element not sufficient
                to define a namespace?

                Thanks! Yours,

                -mi

                • Donal K. Fellows

                  #9
                  Re: more XPath struggles (tDOM)

                  Mikhail Teterin wrote:
                  >But the namespaces are defined in the document itself
                  >-- for example:
                  ><xc:XmlCache xmlns:xc="XmlCache" xc:action="Update">
                  >why do I still need to specify them? It certainly works with xml_grep... Is
                  >there a bug in the package (tDOM), or is the above element not sufficient
                  >to define a namespace?
                  Basically, namespaces and XPath don't sit too well together (when the
                  XPath expression is located in an XML document) because the document
                  namespace context at the point in the document where the XPath is
                  located is shrouded from the expression (because it is formally an
                  xsd:string, which is not namespace-aware according to the namespaces
                  spec). This means that if you're embedding an XPath expression in an
                  XML document, you also need to embed a way to tell it which namespace
                  context to evaluate in. (This is stupid, but it is the way it is, and
                  it isn't a Tcl problem at all.)

                  If the XPath expression is not contained in some XML context, then it
                  is even more obvious that the namespace context needs to be given.

                  Donal.

                  • Joseph J. Kesselman

                    #10
                    Re: more XPath struggles (tDOM)

                    Mikhail Teterin wrote:
                    >I got some progress... But the namespaces are defined in the document itself
                    >-- for example:
                    ><xc:XmlCache xmlns:xc="XmlCache" xc:action="Update">
                    >why do I still need to specify them?
                    Because they could be set differently at different places in the
                    document, and/or whatever generated your XPath might have different
                    prefixes bound to those namespaces or vice versa. You need to provide a
                    context so the system knows what you meant.

                    Some processors will let you specify a context node and will pick up the
                    namespaces defined there. Again, check your docs.
                    >It certainly works with xml_grep
                    I don't know xml_grep, so I can't advise. It may be assuming the root
                    node as the context if not told otherwise. Or it may be flat-out broken
                    and not processing namespaces correctly.

                    Say what you mean. The system can't read your mind, and shouldn't try.

                    • Joseph J. Kesselman

                      #11
                      Re: more XPath struggles (tDOM)

                      >Basically, namespaces and XPath don't sit too well together

                      It works fine when you understand how to use it properly.

                      The only real problem is that XPath relies on prefixes retrieved from
                      some unspecified environment (depending on the context/tool in which the
                      XPath is being executed). That's a bit less verbose than using an
                      "expanded qualified name" like {http://my_namespace}foo, or requiring
                      that the namespace bindings be specified via some syntax in the XPath
                      string. But it does mean that an XPath is partly defined by that
                      context. (Then again, XPaths which use variables also need a context, as
                      do those which use some of the functions, so this is just the most
                      obvious -- and most unnecessary -- instance thereof.)

                      It is possible to write a portable namespace-aware XPath that doesn't
                      rely on prefixes (via some ugly predicate hacks)... but it really should
                      be easier to do so. Oh well. 20:20 hindsight; maybe XPath 3.0 will
                      finally reconsider that point.

                      By the way: The namespaces shown in the original example are not
                      considered acceptable by today's standards. Namespace names should be
                      fully-qualified ("absolute") URI References. Yes, the original namespace
                      spec was fuzzy about that, and many tools won't enforce this... but
                      after much painful debate, the W3C agreed that the concept of a
                      "relative namespace" really didn't make any sense no matter how you
                      sliced it. Tim Berners-Lee reserves the right to reintroduce that idea
                      if and when the Semantic Web effort comes up with a way to make those
                      meaningful... but until then, you really should make sure all your
                      namespace names follow the official absolute-URI-reference syntax.
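
As an illustration of the "expanded qualified name" idea: Python's standard xml.etree.ElementTree happens to accept exactly that {namespace}localname form in its path expressions, so no prefix binding is needed. The sketch below uses a made-up sample document built around the {http://my_namespace}foo example above. In plain XPath 1.0, the prefix-free equivalent would presumably be a predicate along the lines of //*[local-name()='foo' and namespace-uri()='http://my_namespace'], which the second list imitates by hand.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: one <foo> in the namespace, one outside it.
doc = ET.fromstring(
    '<root xmlns:p="http://my_namespace">'
    '<p:foo>in the namespace</p:foo>'
    '<foo>no namespace</foo>'
    '</root>'
)

# Expanded name: namespace name in braces, then the local name.
# No prefix appears, so there is nothing to bind.
hits = [el.text for el in doc.findall(".//{http://my_namespace}foo")]
print(hits)  # ['in the namespace']

# The same selection written out by hand: compare namespace name
# and local name, never the prefix.
by_hand = [
    el.text for el in doc.iter()
    if el.tag == "{http://my_namespace}foo"
]
print(by_hand == hits)  # True
```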

                      • Mikhail Teterin

                        #12
                        Re: more XPath struggles (tDOM)

                        Joseph J. Kesselman wrote:
                        >The system can't read your mind, and shouldn't try.
                        I don't want it to read my mind, I want it to read the document. The
                        namespaces are set there with an xmlns-attribute of containing elements. In
                        fact, when I ask for the node's name [$node nodeName], I get the
                        fully-qualified foo.bar.woof.meow.

                        It KNOWS the namespace-mapping, but it wants me to repeat it (f means foo, b
                        means bar, w means woof, etc.). That's gratuitous...

                        -mi

                        • Richard Tobin

                          #13
                          Re: more XPath struggles (tDOM)

                          In article <14831708.fxxoTTJthF@aldan.algebra.com>,
                          Mikhail Teterin <usenet+meow@aldan.algebra.com> wrote:
                          >I don't want it to read my mind, I want it to read the document. The
                          >namespaces are set there with an xmlns-attribute of containing elements. In
                          >fact, when I ask for the node's name [$node nodeName], I get the
                          >fully-qualified foo.bar.woof.meow.
                          >
                          >It KNOWS the namespace-mapping, but it wants me to repeat it (f means foo, b
                          >means bar, w means woof, etc.). That's gratuitous...
                          Suppose you try to use the same XPath expressions with a document
                          that uses different prefixes. How's that going to work? Your XPath
                          expressions will all be wrong.

                          The choice of prefixes is supposed to be arbitrary. You can't rely on
                          f meaning foo. Even within a single document, you can use the same
                          prefix for different namespaces and different prefixes for the
                          same namespace.

                          -- Richard

                          --
                          :wq

                          • Donal K. Fellows

                            #14
                            Re: more XPath struggles (tDOM)

                            Joseph J. Kesselman wrote:
                            >It is possible to write a portable namespace-aware XPath that doesn't
                            >rely on prefixes (via some ugly predicate hacks)... but it really should
                            >be easier to do so. Oh well. 20:20 hindsight; maybe XPath 3.0 will
                            >finally reconsider that point.
                            It'd be OK if there was a type "like xs:string, but understands the
                            current namespace context" but there isn't. (Of course, once you
                            extract the XPath from its context document you then need to remember
                            to explicitly get the NS context from somewhere, which is almost
                            certainly the root of the problem in the message that started this
                            thread.)

                            Donal.

                            • Rolf Ade

                              #15
                              Re: more XPath struggles (tDOM)

                              Mikhail Teterin wrote:
                              >Joseph J. Kesselman wrote:
                              >>XPath is namespace-aware. To select a namespaced node, you must use a
                              >>prefix in your path (which you did) and tell your XPath evaluator what
                              >>the prefix is bound to (which you didn't). Look at the user's manual for
                              >>your tool.
                              >
                              >Thanks. After I explicitly set:
                              >
                              >$xml selectNodesNamespaces {mp MarketParameters xc XmlCache}
                              That's the right way to bind prefixes to namespaces. One way, at least. You can
                              also use the -namespaces option to the selectNodes method, but
                              setting things up with one selectNodesNamespaces call for the rest of
                              the lifetime of the document seems to be more convenient to me.
                              >I got some progress... But the namespaces are defined in the document itself
                              >-- for example:
                              >
                              ><xc:XmlCache xmlns:xc="XmlCache" xc:action="Update">
                              >
                              >why do I still need to specify them? It certainly works with xml_grep... Is
                              >there a bug in the package (tDOM), or is the above element not sufficient
                              >to define a namespace?
                              No, it's not a bug. As long as neither a selectNodesNamespaces setting nor
                              the -namespaces option is given, tDOM even respects the XML namespace
                              declarations of the document. The context node of your XPath
                              expression is the node from which you call your XPath expression. If
                              all the prefixes you're using in your XPath expression are in scope
                              at that node, you don't have to do anything; namespace resolution will
                              work as you expect. Since you had trouble with this, I'd bet that not all
                              of the XML namespace declarations you use are in scope at your context node.

                              But, as others have already pointed out, it is _dangerous_ to bank on
                              the prefixes in the document. Prefixes don't matter; it's the
                              namespaces that matter.

                              From the XML viewpoint,

                              <a:doc xmlns:a="http://foo.bar.com">
                              <a:elem>data</a:elem>
                              </a:doc>

                              and

                              <b:doc xmlns:b="http://foo.bar.com">
                              <b:elem>data</b:elem>
                              </b:doc>

                              are in some sense the 'same' document.

                              You can't just say [$someNode selectNodes a:elem] in your code and
                              expect that to work reliably. If the document provider uses another
                              prefix (bound to the same namespace), your code will fail.

                              The clear way out is to tell the XPath engine which namespace you
                              mean by which prefix, e.g. with selectNodesNamespaces.

                              rolf
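
The two documents above can be checked against each other with a short Python sketch (standard-library xml.etree.ElementTree, not tDOM). The names doc_a, doc_b, NS and the prefix x are chosen for illustration only. One path expression with a caller-supplied binding finds the element in both documents, whatever prefix each document happens to use.

```python
import xml.etree.ElementTree as ET

doc_a = ET.fromstring(
    '<a:doc xmlns:a="http://foo.bar.com"><a:elem>data</a:elem></a:doc>'
)
doc_b = ET.fromstring(
    '<b:doc xmlns:b="http://foo.bar.com"><b:elem>data</b:elem></b:doc>'
)

# Our own prefix, bound by us; it matches neither document's prefix.
NS = {"x": "http://foo.bar.com"}

# The same path works on both documents because only the
# namespace name is compared.
results = [[el.text for el in d.findall("x:elem", NS)] for d in (doc_a, doc_b)]
print(results)  # [['data'], ['data']]

# After parsing, the two root elements carry the same expanded name.
print(doc_a.tag == doc_b.tag)  # True
```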
