How to request data from a lazily-created tree structure ?

  • méchoui

    How to request data from a lazily-created tree structure ?

    Problem:

    - You have a tree structure (XML-like) that you don't want to create
    100% in memory, because it just takes too long (for instance, you need
    an HTTP request to fetch the information from a slow remote site).
    - But you want to be able to query it, such as "give me
    all nodes that are under a "//foo/bar" tree and have a child with a
    "baz" attribute of value "zzz"".

    Question :

    Do you have any other ideas for querying a lazily-created tree
    structure?

    And does it make sense to create a DOM-like structure and use a
    generic XPath engine to query the tree? (And does such a generic
    XPath engine exist?)

    The idea is to have the tree structure created on the fly (we are in
    Python), only when the XPath engine requests the data. Hopefully the
    XPath engine will not request all the data from the tree (if the
    query is smart enough and does not contain **, for instance).
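
To make the idea concrete, here is a minimal sketch of a node whose children are only fetched on first access. LazyNode, select, make and the fetched list are hypothetical names, the loader stands in for the slow HTTP request, and the path is simplified to plain child steps from the root rather than a real "//foo/bar".

```python
class LazyNode:
    """Tree node whose children are fetched only on first access."""

    def __init__(self, tag, attrib=None, loader=None):
        self.tag = tag
        self.attrib = dict(attrib or {})
        self._loader = loader      # callable returning a list of LazyNode
        self._children = None      # None means "not fetched yet"

    def children(self):
        if self._children is None:
            self._children = list(self._loader()) if self._loader else []
        return self._children


def select(node, steps):
    """Yield the nodes reached from `node` by following child tag names."""
    if not steps:
        yield node
        return
    for child in node.children():
        if child.tag == steps[0]:
            yield from select(child, steps[1:])


fetched = []  # records which nodes had their children fetched


def make(tag, attrib=None, kids=()):
    # Hypothetical stand-in for a slow remote request.
    def loader():
        fetched.append(tag)
        return [make(*k) for k in kids]
    return LazyNode(tag, attrib, loader)


root = make("root", None, [
    ("foo", None, [
        ("bar", None, [("item", {"baz": "zzz"}, ())]),
        ("bar", None, [("item", {"baz": "other"}, ())]),
    ]),
    ("unrelated", None, [("big", None, ())]),
])

# Nodes under foo/bar that have a child with baz == "zzz".
matches = [
    node for node in select(root, ["foo", "bar"])
    if any(c.attrib.get("baz") == "zzz" for c in node.children())
]
print(len(matches), fetched)
```

In this sketch the children of "unrelated" are never requested, which is the whole point of building the tree lazily.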

    Thanks
  • Diez B. Roggisch

    #2
    Re: How to request data from a lazily-created tree structure ?

    méchoui wrote:
    > And does it make sense to create a DOM-like structure and to use a
    > generic XPath engine to request the tree ? (and does this generic
    > XPath engine exist ?)
    >
    > The idea is to have the tree structure created on the fly (we are in
    > python), only when the XPath engine requests the data. Hopefully the
    > XPath engine will not request all the data from the tree (if the
    > request is smart enough and does not contain **, for instance).

    Generic XPath works only with a DOM-like structure. How else would you
    evaluate an expression like foo[last()], for example?


    So if you really need lazy evaluation, you will need to analyze the
    specific query of interest and see whether it can be coded in a way that
    allows you to forget as much of the tree as possible or, even better,
    not query it at all.
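
One way to read "forget as much of the tree as possible" is a streaming parse. The sketch below uses xml.etree.ElementTree.iterparse and clears each subtree once it has been examined. It assumes the data arrives as a single XML stream, which may not match the original setup; bars_with_baz, the id attribute and the sample document are made up for illustration.

```python
import io
import xml.etree.ElementTree as ET


def bars_with_baz(stream, value):
    """Yield the id of each <bar> that has a child whose baz equals `value`."""
    for event, elem in ET.iterparse(stream, events=("end",)):
        if elem.tag == "bar":
            if any(child.get("baz") == value for child in elem):
                yield elem.get("id")
            elem.clear()  # forget this subtree once it has been examined


# Made-up sample document standing in for the remote data.
data = (
    b'<foo>'
    b'<bar id="1"><item baz="zzz"/></bar>'
    b'<bar id="2"><item baz="no"/></bar>'
    b'</foo>'
)
result = list(bars_with_baz(io.BytesIO(data), "zzz"))
print(result)
```

The query is hard-coded here, which is the trade-off described above: the less generic the query code, the less of the tree has to be kept around.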

    Diez


    • méchoui

      #3
      Re: How to request data from a lazily-created tree structure ?

      On Jun 16, 11:16 pm, "Diez B. Roggisch" <de...@nospam.web.de> wrote:
      > Generic XPath works only with a DOM(like) structure. How else would you
      > e.g. evaluate an expression like foo[last()]?
      >
      > So if you really need lazy evaluation, you will need to specifically
      > analyze the query of interest and see if it can be coded in a way that
      > allows to forget as much of the tree as possible, or even better not
      > query it.

      Yes, I need to make sure my queries are written so that the
      generic XPath engine does not need the whole structure in memory.

      There are quite a few cases where you really don't need to load
      everything at all. /a/b/*/c/d is an example. But even with an example
      like /x/z[last()]/t, you don't need to load everything under
      every /x/z node. You just need to look at the last one and check
      whether there is a t node under it.
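
A rough sketch of that /x/z[last()]/t case, hand-coded rather than run through a generic engine. Node, last_z_t and the expanded list are hypothetical names. Listing the z siblings still requires fetching the direct children of x, but only the last z is ever expanded.

```python
class Node:
    """Node whose children are fetched the first time it is iterated."""

    def __init__(self, tag, fetch_children=None):
        self.tag = tag
        self._fetch = fetch_children or (lambda: [])
        self._kids = None

    def __iter__(self):
        if self._kids is None:
            self._kids = list(self._fetch())
        return iter(self._kids)


def last_z_t(root):
    """Evaluate x/z[last()]/t relative to `root`, expanding only the last z."""
    results = []
    for x in (c for c in root if c.tag == "x"):
        zs = [c for c in x if c.tag == "z"]   # lists the z nodes, not their content
        if zs:
            results.extend(c for c in zs[-1] if c.tag == "t")
    return results


expanded = []  # names of the z nodes whose children were fetched


def z(name, kids):
    def fetch():
        expanded.append(name)
        return [Node(k) for k in kids]
    return Node("z", fetch)


x = Node("x", lambda: [z("z1", ["t", "u"]), z("z2", ["u"]), z("z3", ["t"])])
root = Node("root", lambda: [x])

found = last_z_t(root)
print([n.tag for n in found], expanded)
```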

      Anyway, if I need to make requests that need all the data... that
      means that the need for lazy instantiation of nodes disappears,
      right ?


      • Diez B. Roggisch

        #4
        Re: How to request data from a lazily-created tree structure ?

        > Anyway, if I need to make requests that need all the data... that
        > means that the need for lazy instantiation of nodes disappears,
        > right ?

        Yes. And unless you have memory constraints, I have to admit I
        strongly suspect that the network latency far exceeds the parsing
        overhead.

        Diez


        • méchoui

          #5
          Re: How to request data from a lazily-created tree structure ?

          On Jun 17, 9:08 am, "Diez B. Roggisch" <de...@nospam.web.de> wrote:
          > Yes. And unless you have memory-constraints I have to admit that I
          > really doubt that the parsing overhead isn't by far exceeded by the
          > network latency.

          Do you know if there is an XPath engine that can be applied to a
          DOM-like structure?

          One way would be to take the XPath engine from an existing XML library
          (ElementTree, or any other), see what APIs it calls, and see whether
          we can create a DOM-like structure that exposes the same API. Duck
          typing, really...
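
A sketch of that duck-typing idea against the path engine shipped with today's standard library. xml.etree.ElementPath is the module behind Element.findall, but it is not a documented public API, so relying on what it calls is an assumption. LazyElement, lazy and expanded are hypothetical names. For simple child steps and attribute predicates the engine appears to need only iteration over children, .tag and .get().

```python
from xml.etree import ElementPath


class LazyElement:
    """Just enough of the Element interface for simple child-step paths."""

    def __init__(self, tag, attrib=None, fetch=None):
        self.tag = tag
        self.attrib = dict(attrib or {})
        self._fetch = fetch or (lambda: [])
        self._kids = None  # children are fetched on first iteration

    def __iter__(self):
        if self._kids is None:
            self._kids = list(self._fetch())
        return iter(self._kids)

    def get(self, key, default=None):
        return self.attrib.get(key, default)


expanded = []  # tags of the nodes whose children were fetched


def lazy(tag, attrib=None, kids=()):
    # Hypothetical stand-in for a slow remote request.
    def fetch():
        expanded.append(tag)
        return [lazy(*k) for k in kids]
    return LazyElement(tag, attrib, fetch)


root = lazy("root", None, [
    ("foo", None, [("bar", {"baz": "zzz"}, ()), ("bar", {"baz": "no"}, ())]),
    ("other", None, [("bar", {"baz": "zzz"}, ())]),
])

hits = ElementPath.findall(root, "foo/bar[@baz='zzz']")
print(len(hits), expanded)
```

As far as I can tell, positional predicates such as [last()] make this engine build a parent map by walking the whole tree, so they would defeat the laziness and would also need more of the Element interface than this class provides.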


          • Diez B. Roggisch

            #6
            Re: How to request data from a lazily-created tree structure ?

            > Do you know if there is such XPath engine that can be applied to a DOM-
            > like structure ?

            No. But I toyed with the idea of writing one :)

            > One way would be to take an XPath engine from an existing XML engine
            > (ElementTree, or any other), and see what APIs it calls... and see if
            > we cannot create a DOM-like structure that has the same API. Duck
            > typing, really...

            Why can't you create a *real* DOM?

            Diez


            • méchoui

              #7
              Re: How to request data from a lazily-created tree structure ?

              On Jun 17, 10:54 pm, "Diez B. Roggisch" <de...@nospam.web.de> wrote:
              > Why can't you create a *real* DOM?

              I don't know what "real" means, in fact. In Python, being a "real"
              something is all about having the same interface, right? Maybe I did
              not understand what you meant.

              I cannot load all the data in memory before I query it, because it
              would take too long. If using XPath-like tools requires that I load
              the data in memory, I'd better write my own algorithm instead. It
              will be much faster.

              What I mean is: if I have an XPath engine that works well on a specific
              DOM-like structure, maybe I can create my own DOM-like structure to
              fool the XPath engine, so that I can use it on my own structure.


              • méchoui

                #8
                Re: How to request data from a lazily-created tree structure ?

                On Jun 17, 11:54 pm, "Diez B. Roggisch" <de...@nospam.web.de> wrote:
                > No. But I toyed with the idea to write one :)

                I may have found something: http://sourceforge.net/projects/pdis-xpath/

                An XPath 1.0 implementation in pure Python, on top of ElementTree. I'll have a look.


                • méchoui

                  #9
                  Re: How to request data from a lazily-created tree structure ?

                  On June 17, 13:53, méchoui <laurent.pl...@gmail.com> wrote:
                  > One way would be to take an XPath engine from an existing XML engine
                  > (ElementTree, or any other), and see what APIs it calls... and see if
                  > we cannot create a DOM-like structure that has the same API. Duck
                  > typing, really...

                  I have something that works. http://lauploix.blogspot.com/2008/07...-my-trees.html
                  It has the pros and cons of the ElementTree 1.3 XPath engine, but it
                  works quite nicely.

                  Laurent Ploix
