java xerces xpath fails with namespace

  • jacksu

    java xerces xpath fails with namespace

    I have a simple program to run xpath with xerces 1_2_7

    XPathFactory factory = XPathFactory.newInstance();
    XPath xPath = factory.newXPath();

    XPathExpression xp = xPath.compile(strXpr);
    System.out.println(xp.evaluate(new InputSource(new FileInputStream("a.xml"))));

    If a.xml is

    <?xml version="1.0"?>
    <root><parent><son>theTextValue</son></parent></root>

    and strXpr is /root/parent/son/text(),
    I get the correct value back: "theTextValue".

    But if the parent is in a namespace, as in a SOAP message, it always fails,
    e.g.:

    <?xml version="1.0"?>
    <soap:Envelope
    xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/"
    xmlns:xsd="http://www.w3.org/2001/XMLSchema"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"><soap:Body><son>theValue</son>
    </soap:Body></soap:Envelope>

    strXpr is /soap:Envelope/soap:Body/son/text()

    An empty string is returned.

    Any suggestions are welcome.

    Thanks.

  • Joe Kesselman

    #2
    Re: java xerces xpath fails with namespace

    XPath is namespace-sensitive. To correctly search a namespaced document,
    you must use prefixes in your XPath and provide bindings from those
    prefixes to the appropriate namespace URIs.

    If you really insist on doing a namespace-insensitive search, it's
    possible by using kluge-arounds such as node()[name()="foo"] ... but
    REALLY not recommended. Namespaces are used because they're a meaningful
    distinction. Don't attempt to ignore or bypass them.
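A minimal sketch of the behaviour described above, assuming the JDK's built-in JAXP XPath implementation. The sample document and the class name NamespaceSensitivityDemo are made up for illustration; the document is modelled on the SOAP message in the question. It shows that unprefixed steps select nothing in a namespaced document, and that the name() workaround only matches because it compares against the prefix exactly as written in the document, which is one reason it is fragile.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

class NamespaceSensitivityDemo {
    // Hypothetical sample document, modelled on the SOAP message in the question.
    static final String XML =
        "<soap:Envelope xmlns:soap=\"http://schemas.xmlsoap.org/soap/envelope/\">"
        + "<soap:Body><son>theValue</son></soap:Body></soap:Envelope>";

    // Evaluates an XPath against the sample document with no NamespaceContext set.
    static String eval(String expr) {
        XPath xpath = XPathFactory.newInstance().newXPath();
        try {
            return xpath.evaluate(expr, new InputSource(new StringReader(XML)));
        } catch (XPathExpressionException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Unprefixed name tests only match elements in no namespace,
        // so this selects nothing and yields an empty string.
        System.out.println("[" + eval("/Envelope/Body/son/text()") + "]");

        // The name() workaround compares against the qualified name as it
        // appears in the document, so it breaks if the document uses a
        // different prefix for the same namespace.
        System.out.println("[" + eval(
            "/node()[name()='soap:Envelope']/node()[name()='soap:Body']/son/text()") + "]");
    }
}
```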


    • jacksu

      #3
      Re: java xerces xpath fails with namespace

      Would you please give more detail? How do I specify a prefix in the XPath?

      E.g. my XML file is:
      <?xml version='1.0' encoding='UTF-8'?>
      <soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/"
      xmlns:xsd="http://www.w3.org/2001/XMLSchema"
      xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"><soap:Body>testtext</soap:Body>
      </soap:Envelope>

      and my xpath is:
      //soap:Envelope/soap:Body/text()
      or
      //Envelope/Body/text()

      Neither returns any result.

      Thanks.


      • Joe Kesselman

        #4
        Re: java xerces xpath fails with namespace

        jacksu wrote:
        > and my xpath is:
        > //soap:Envelope/soap:Body/text()
        > or
        > //Envelope/Body/text()

        The second won't work. The first will, *if* you've told your XPath
        processor that the soap: prefix maps to
        "http://schemas.xmlsoap.org/soap/envelope/"

        How you do that depends on the processor. If you're using the XPath
        within an XSLT stylesheet, you just need to make sure soap: has been
        properly declared as a namespace at a point where it will be inherited
        by the statement which is executing the XPath; the usual practice is to
        define most namespaces all the way up at the top-level xsl:stylesheet
        element to make sure they're available throughout the document.

        If you're using an XPath API of some sort, check its docs to find out
        how to tell it the mapping between prefixes and namespace URIs.


        • Soren Kuula

          #5
          Re: java xerces xpath fails with namespace

          jacksu wrote:

          Hi
          > //soap:Envelope/soap:Body/text()
          > or
          > //Envelope/Body/text()
          >
          > all no result back....

          Hmm... It should fail with a loud bang if you used a prefix in the XPath
          but did not bind it to a namespace.

          Look around in the documentation for that XPath evaluation API (I
          don't know it) for something called a namespace environment, context, or
          the like. Once you have found it, you will need to call something
          like bind("soap", "http://namespaceURI/of/SOAP") on it before evaluating.


          Soren


          • Martin Honnen

            #6
            Re: java xerces xpath fails with namespace



            jacksu wrote:
            > I have a simple program to run xpath with xerces 1_2_7
            >
            > XPathFactory factory = XPathFactory.newInstance();
            > XPath xPath = factory.newXPath();
            >
            > XPathExpression xp = xPath.compile(strXpr);
            > System.out.println(xp.evaluate(new InputSource(new FileInputStream("a.xml"))));

            That looks to me like Java using the JAXP XPath API from Java 1.5.
            Xerces-Java 2.6 or 2.7 might implement that, but probably not 1_2_7.

            If you want to use namespaces then you need to pass in an object
            implementing javax.xml.namespace.NamespaceContext, which provides the
            methods to resolve prefixes to namespace URIs and the other way round.
            Then call
            xPath.setNamespaceContext(yourObjectImplementingNamespaceContext);
            before you compile or evaluate expressions.



            --

            Martin Honnen


            • Joe Kesselman

              #7
              Re: java xerces xpath fails with namespace

              Martin Honnen wrote:
              > If you want to use namespaces then you need to pass in an object
              > implementing javax.xml.namespace.NamespaceContext that implements the
              > methods to resolve prefixes to namespace URIs and the other way round.

              See, for example,


              • Joe Kesselman

                #8
                Re: java xerces xpath fails with namespace

                Working example for the Apache/Xalan code:

                public static void main(String[] args) {
                    String strXpr = "/a:foo/a:bar";
                    XPathFactory factory = XPathFactory.newInstance();
                    XPath xPath = factory.newXPath();
                    try {
                        // Anonymous hardcoded Namespace Context:
                        NamespaceContext MyNSC = new NamespaceContext() {
                            public String getNamespaceURI(String prefix) {
                                if (prefix.equals("a")) return "urn:a";
                                else return XMLConstants.NULL_NS_URI;
                            }
                            public String getPrefix(String namespace) {
                                return null; // Just dummied out; Xalan doesn't need it.
                            }
                            public Iterator getPrefixes(String namespace) {
                                return null; // Just dummied out; Xalan doesn't need it.
                            }
                        };

                        xPath.setNamespaceContext(mynsc);
                        XPathExpression xp = xPath.compile(strXpr);
                        System.out.println(xp.evaluate(
                            new InputSource(new FileInputStream("a.xml"))));
                    } catch (Exception e) {
                        e.printStackTrace();
                    }
                }


                • Joe Kesselman

                  #9
                  Re: java xerces xpath fails with namespace

                  Whups. Typo crept in while recopying this into the newsgroup; obviously,
                  MyNSC and mynsc were supposed to be the same variable. That's what I get
                  for trying to simplify the example on the fly.
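With the two variable names reconciled as described above, a self-contained version might look like the sketch below. It assumes the JDK's built-in JAXP XPath implementation. The in-memory document, the class name PrefixBindingDemo, and the eval helper are made up for illustration, since the contents of a.xml are not shown in the thread. Unlike the dummied-out original, getPrefix and getPrefixes are filled in here, which is optional for this use.

```java
import java.io.StringReader;
import java.util.Collections;
import java.util.Iterator;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathExpression;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

class PrefixBindingDemo {
    // Hypothetical stand-in for a.xml: both elements are in the namespace "urn:a",
    // declared as the default namespace, so the document itself uses no prefix.
    static final String XML = "<foo xmlns=\"urn:a\"><bar>hello</bar></foo>";

    // Hard-coded context binding the prefix "a" to the namespace URI "urn:a".
    static final NamespaceContext NS_CONTEXT = new NamespaceContext() {
        public String getNamespaceURI(String prefix) {
            return "a".equals(prefix) ? "urn:a" : XMLConstants.NULL_NS_URI;
        }
        public String getPrefix(String namespaceURI) {
            return "urn:a".equals(namespaceURI) ? "a" : null;
        }
        public Iterator<String> getPrefixes(String namespaceURI) {
            String prefix = getPrefix(namespaceURI);
            return prefix == null
                ? Collections.<String>emptyIterator()
                : Collections.singleton(prefix).iterator();
        }
    };

    // Compiles and evaluates an XPath with the namespace context set first.
    static String eval(String expr) {
        XPath xPath = XPathFactory.newInstance().newXPath();
        xPath.setNamespaceContext(NS_CONTEXT); // set before compile()
        try {
            XPathExpression xp = xPath.compile(expr);
            return xp.evaluate(new InputSource(new StringReader(XML)));
        } catch (XPathExpressionException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(eval("/a:foo/a:bar"));
    }
}
```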


                  • jacksu

                    #10
                    Re: java xerces xpath fails with namespace

                    Thanks a lot.

                    It works fine when every element has a prefix, but there seems to be a
                    problem when prefixed and unprefixed elements are mixed,
                    such as:
                    <?xml version='1.0' encoding='UTF-8'?><soap:Envelope
                    xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/"
                    xmlns:xsd="http://www.w3.org/2001/XMLSchema"
                    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"><soap:Body><mynode
                    xmlns="http://mynamespace">my text</mynode></soap:Body></soap:Envelope>

                    I tried:
                    //soap:Envelope/soap:Body/mynode/text()

                    If I give mynode a prefix, then everything works fine.

                    Any more suggestions?

                    Thanks.


                    • Martin Honnen

                      #11
                      Re: java xerces xpath fails with namespace



                      jacksu wrote:
                      > <?xml version='1.0' encoding='UTF-8'?><soap:Envelope
                      > xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/"
                      > xmlns:xsd="http://www.w3.org/2001/XMLSchema"
                      > xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"><soap:Body><mynode
                      > xmlns="http://mynamespace">my text</mynode></soap:Body></soap:Envelope>
                      >
                      > I tried:
                      > //soap:Envelope/soap:Body/mynode/text()
                      >
                      > If I gives prefix to mynode, then everything works fine.

                      You need to use a prefix in the XPath expression. There is no need to
                      change the input XML, but the XPath needs a prefix bound to
                      http://mynamespace to select those elements.
                      See
                      <http://www.faqts.com/knowledge_base/view.phtml/aid/34022/fid/616>
                      Simply make sure you use a prefix, e.g.
                      pf1:mynode
                      and that your NamespaceContext returns the URI http://mynamespace for that
                      prefix.
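The advice above can be sketched as follows, assuming the JDK's built-in JAXP XPath implementation. The class name MixedNamespaceDemo and the helpers contextOf and eval are hypothetical; the document is a trimmed copy of the one in the question, and pf1 is the arbitrary prefix suggested above. The point is that pf1 exists only in the XPath and its NamespaceContext, while the document keeps its default-namespace declaration unchanged.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

class MixedNamespaceDemo {
    // Trimmed copy of the document from the question: mynode is in the
    // default namespace http://mynamespace and carries no prefix.
    static final String XML =
        "<soap:Envelope xmlns:soap=\"http://schemas.xmlsoap.org/soap/envelope/\">"
        + "<soap:Body><mynode xmlns=\"http://mynamespace\">my text</mynode>"
        + "</soap:Body></soap:Envelope>";

    // Builds a NamespaceContext backed by a prefix-to-URI map (hypothetical helper).
    static NamespaceContext contextOf(final Map<String, String> bindings) {
        return new NamespaceContext() {
            public String getNamespaceURI(String prefix) {
                String uri = bindings.get(prefix);
                return uri != null ? uri : XMLConstants.NULL_NS_URI;
            }
            public String getPrefix(String namespaceURI) {
                Iterator<String> it = getPrefixes(namespaceURI);
                return it.hasNext() ? it.next() : null;
            }
            public Iterator<String> getPrefixes(String namespaceURI) {
                List<String> prefixes = new ArrayList<String>();
                for (Map.Entry<String, String> e : bindings.entrySet()) {
                    if (e.getValue().equals(namespaceURI)) prefixes.add(e.getKey());
                }
                return prefixes.iterator();
            }
        };
    }

    // Evaluates an XPath with both the soap and pf1 prefixes bound.
    static String eval(String expr) {
        Map<String, String> bindings = new HashMap<String, String>();
        bindings.put("soap", "http://schemas.xmlsoap.org/soap/envelope/");
        bindings.put("pf1", "http://mynamespace"); // prefix used only in the XPath
        XPath xPath = XPathFactory.newInstance().newXPath();
        xPath.setNamespaceContext(contextOf(bindings));
        try {
            return xPath.evaluate(expr, new InputSource(new StringReader(XML)));
        } catch (XPathExpressionException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(eval("//soap:Envelope/soap:Body/pf1:mynode/text()"));
    }
}
```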

                      --

                      Martin Honnen


                      • jacksu

                        #12
                        Re: java xerces xpath fails with namespace

                        Excellent!! That works!!

                        Thanks a lot.


                        • Greg

                          #13
                          Re: java xerces xpath fails with namespace

                          I believe that I have prefixed my xpath properly, but I get an
                          XPathExpressionException when I evaluate it.

                          The xpath looks like this (note the "xhtml" prefix):


                          /xhtml:html/xhtml:body//xhtml:div[@class='reviewlist']


                          The source XML document is a garden-variety XHTML web page whose root
                          html element declares the document's default namespace as being
                          http://www.w3.org/1999/xhtml.


                          <?xml version="1.0" encoding="ISO-8859-1"?>
                          <!DOCTYPE html
                          PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
                          "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
                          <html xml:lang="en" xmlns="http://www.w3.org/1999/xhtml">


                          As I understand it, writing the html element like this declares that it
                          belongs to the http://www.w3.org/1999/xhtml namespace. So, to evaluate
                          an xpath for this document, my javax.xml.xpath.XPath must have a
                          namespace context set. Here is my implementation of the
                          javax.xml.namespace.NamespaceContext interface (note this
                          implementation accommodates multiple namespaces by use of a
                          java.util.HashMap - a tip from
                          http://www.onjava.com/pub/a/onjava/2.../12/xpath.html - lest a part
                          of the web page be in another language and its element has, say,
                          xml:lang="fr").


                          import java.util.HashMap;
                          import java.util.Iterator;
                          import java.util.Map;

                          import javax.xml.namespace.NamespaceContext;

                          public class NamespaceContextImpl implements NamespaceContext {

                              // Maps each prefix to its namespace URI.
                              private final Map<String, String> map;

                              /**
                               * A constructor that instantiates a new java.util.HashMap in which
                               * prefixes will be mapped to namespace URIs.
                               */
                              public NamespaceContextImpl() {
                                  map = new HashMap<String, String>();
                              }

                              /**
                               * Adds a prefix and namespace URI pair to this
                               * NamespaceContextImpl's HashMap.
                               *
                               * This method is not inherited from the implemented
                               * NamespaceContext interface.
                               */
                              public void setNamespaceURI(String prefix, String namespaceURI) {
                                  map.put(prefix, namespaceURI);
                              }

                              /**
                               * Gets the namespace URI mapped to the given
                               * prefix from this NamespaceContextImpl's HashMap.
                               *
                               * This method is inherited from the implemented NamespaceContext
                               * interface.
                               */
                              public String getNamespaceURI(String prefix) {
                                  return map.get(prefix);
                              }

                              /**
                               * Gets the prefix to which the given namespace
                               * URI is mapped in this NamespaceContextImpl's
                               * HashMap.
                               *
                               * This method is inherited from the implemented
                               * NamespaceContext interface.
                               */
                              public String getPrefix(String namespaceURI) {
                                  // Loop through the bindings until one is found
                                  // whose namespace URI matches the namespace URI
                                  // passed to this method. Return that prefix.
                                  for (Map.Entry<String, String> entry : map.entrySet()) {
                                      if (entry.getValue().equals(namespaceURI)) return entry.getKey();
                                  }

                                  // If no prefix is found with a namespace URI matching the
                                  // namespace URI passed to this method, return null.
                                  return null;
                              }

                              /**
                               * This method is inherited from the implemented
                               * NamespaceContext interface. It is not implemented here.
                               */
                              public Iterator<String> getPrefixes(String namespaceURI) {
                                  return null;
                              }

                          }


                          I then use this NamespaceContext like this:


                          javax.xml.xpath.XPathFactory factory = XPathFactory.newInstance();
                          javax.xml.xpath.XPath xpath = factory.newXPath();

                          NamespaceContextImpl nsctx = new NamespaceContextImpl();
                          nsctx.setNamespaceURI("xml", "http://www.w3.org/XML/1998/namespace");
                          nsctx.setNamespaceURI("xhtml", "http://www.w3.org/1999/xhtml");
                          xpath.setNamespaceContext(nsctx);


                          For the evaluate method, I need an org.xml.sax.InputSource, which I get
                          like this:


                          java.net.URL url = new java.net.URL("http://www.mywebpage.com/index.html");
                          java.net.HttpURLConnection huc =
                              (java.net.HttpURLConnection) url.openConnection();
                          org.xml.sax.InputSource ins =
                              new org.xml.sax.InputSource(huc.getInputStream());


                          Now I can evaluate the xpath and, I hope, get an org.w3c.dom.NodeList:


                          org.w3c.dom.NodeList nl = (org.w3c.dom.NodeList) xpath.evaluate(
                              "/xhtml:html/xhtml:body//xhtml:div[@class='reviewlist']",
                              ins,
                              javax.xml.xpath.XPathConstants.NODESET
                          );



                          The result I get from calling this method, though, is a
                          javax.xml.xpath.XPathExpressionException, I think. At least, when I
                          catch the exception and call its toString() method, that's what it says
                          it is. When I call its getCause() method, though, I get:

                          java.net.ConnectException: Connection timed out.


                          Its stack trace looks like this:


                          com.sun.org.apache.xpath.internal.jaxp.XPathImpl.evaluate(XPathImpl.java:475)
                          my.package.WebsiteHandler.startElement(Unknown Source)
                          org.apache.xerces.parsers.AbstractSAXParser.startElement(Unknown Source)
                          org.apache.xerces.parsers.AbstractXMLDocumentParser.emptyElement(Unknown Source)
                          org.apache.xerces.impl.XMLNSDocumentScannerImpl.scanStartElement(Unknown Source)
                          org.apache.xerces.impl.XMLDocumentFragmentScannerImpl$FragmentContentDispatcher.dispatch(Unknown Source)
                          org.apache.xerces.impl.XMLDocumentFragmentScannerImpl.scanDocument(Unknown Source)
                          org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source)
                          org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source)
                          org.apache.xerces.parsers.XMLParser.parse(Unknown Source)
                          org.apache.xerces.parsers.AbstractSAXParser.parse(Unknown Source)
                          org.cochrane.sitebuilder.servlet.WebsiteBuilder.parseWebSiteLayoutXML(Unknown Source)
                          org.cochrane.sitebuilder.servlet.WebsiteBuilder.service(Unknown Source)
                          javax.servlet.http.HttpServlet.service(HttpServlet.java:810)
                          org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:252)
                          org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:173)
                          org.jboss.web.tomcat.filters.ReplyHeaderFilter.doFilter(ReplyHeaderFilter.java:81)
                          org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:202)
                          org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:173)
                          org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:213)
                          org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:178)
                          org.jboss.web.tomcat.security.CustomPrincipalValve.invoke(CustomPrincipalValve.java:39)
                          org.jboss.web.tomcat.security.SecurityAssociationValve.invoke(SecurityAssociationValve.java:153)
                          org.jboss.web.tomcat.security.JaccContextValve.invoke(JaccContextValve.java:59)
                          org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:126)
                          org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:105)
                          org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:107)
                          org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:148)
                          org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:856)
                          org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.processConnection(Http11Protocol.java:744)
                          org.apache.tomcat.util.net.PoolTcpEndpoint.processSocket(PoolTcpEndpoint.java:527)
                          org.apache.tomcat.util.net.MasterSlaveWorkerThread.run(MasterSlaveWorkerThread.java:112)
                          java.lang.Thread.run(Thread.java:595)




                          Is it an XPathExpressionException? Is it a java.net.ConnectException?
                          What's going on?


                          • Joe Kesselman

                            #14
                            Re: java xerces xpath fails with namespace

                            Greg wrote:
                            > java.net.ConnectException: Connection timed out.

                            That's not a namespace problem, or shouldn't be. Namespaces are just
                            strings in URI format; the system never attempts to retrieve anything
                            from that URI. (At least, not unless you're getting involved in the
                            Semantic Web world, which is a different set of issues.) Hence,
                            namespaces don't have connections and don't time out.

                            It looks like the problem is during your attempt to retrieve your source
                            document, since it's reporting that the problem is in the parser.
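On the "which exception is it?" question: an XPathExpressionException can carry the underlying failure as its cause, so both answers can be true at once. The sketch below illustrates that wrapping, assuming the JDK's built-in JAXP XPath implementation. The class name RootCauseDemo, the helper names, and the deliberately broken stream are made up for illustration; the stream stands in for a source document that cannot be fetched, and no real network access is involved.

```java
import java.io.IOException;
import java.io.InputStream;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

class RootCauseDemo {
    // Walks the getCause() chain down to the innermost exception.
    static Throwable rootCause(Throwable t) {
        Throwable current = t;
        while (current.getCause() != null && current.getCause() != current) {
            current = current.getCause();
        }
        return current;
    }

    // Evaluates an XPath against a stream that fails on the first read and
    // returns whatever exception the XPath API reports (null if none).
    static Throwable failingEvaluate() {
        InputStream broken = new InputStream() {
            @Override
            public int read() throws IOException {
                // Simulated I/O failure; not a real connection attempt.
                throw new IOException("simulated: Connection timed out");
            }
        };
        XPath xpath = XPathFactory.newInstance().newXPath();
        try {
            xpath.evaluate("/*", new InputSource(broken), XPathConstants.NODESET);
            return null;
        } catch (XPathExpressionException e) {
            return e;
        }
    }

    public static void main(String[] args) {
        Throwable thrown = failingEvaluate();
        // The outer exception comes from the XPath API; the root cause is
        // the I/O failure hit while reading the source document.
        System.out.println("thrown:     " + thrown);
        System.out.println("root cause: " + rootCause(thrown));
    }
}
```

If the root cause is an I/O or connection error, the XPath expression and its namespace bindings were never reached, so the place to look is the step that opens and reads the source document.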


                            --
                            () ASCII Ribbon Campaign | Joe Kesselman
                            /\ Stamp out HTML e-mail! | System architexture and kinetic poetry


                            • Greg

                              #15
                              Re: java xerces xpath fails with namespace

                              > It looks like the problem is during your attempt to retrieve your source
                              > document, since it's reporting that the problem is in the parser.
                              So there's something wrong with this source:



                              ?

                              I certainly have no trouble retrieving that page in a browser (without
                              any noticeable connection delay). And I'm told by
                              http://validator.w3.org that it's "valid" XHTML, so I'm under the
                              impression that it's a well-formed XML document.
