Scripting XML?

  • Vijai Kalyan

    Scripting XML?

    Hello All,

    I have a few questions which might seem irrelevant and/or foolish to you.
    I am asking anyway so I can find out.

    1. Is XSL as powerful as a programming language such as Java in its
    abilities to transform XML? The W3C site has the following definition
    on XSLT for example:

    "XSLT is designed for use as part of XSL, which is a stylesheet
    language for XML. In addition to XSLT, XSL includes an XML vocabulary
    for specifying formatting. XSL specifies the styling of an XML
    document by using XSLT to describe how the document is transformed
    into another XML document that uses the formatting vocabulary.

    XSLT is also designed to be used independently of XSL. However, XSLT
    is not intended as a completely general-purpose XML transformation
    language. Rather it is designed primarily for the kinds of
    transformations that are needed when XSLT is used as part of XSL."

    2. Does the above mean that when an XML document is transported over a
    network, its content is totally static?

    This last question is relevant because we faced this problem when
    using XML to describe a sequence of actions. Most of the data for the
    actions was available in the document, but part of it was dynamic,
    that is, provided by the environment.

    To solve this we looked at various scripting languages (some of them
    geared towards XML but the majority not) including Groovy, Jython,
    Simkin, Xscript, XML Script etc.

    If you do find these questions relevant, can I impose upon the group
    to read this writeup and give your comments (including comments such
    as "this whole shebang is not correct but it is not even wrong")

    My news reader for some reason or other wouldn't allow me to post the
    document so I will have to ask you to read it from here:

    http://www.sourceforge.net/projects/xmlms/


    thank you,

    -vijai.
  • Martin Honnen

    #2
    Re: Scripting XML?



    Vijai Kalyan wrote:
    > 1. Is XSL as powerful as a programming language such as Java in its
    > abilities to transform XML? The W3C site has the following definition
    > on XSLT for example:
    >
    > "XSLT is designed for use as part of XSL, which is a stylesheet
    > language for XML. In addition to XSLT, XSL includes an XML vocabulary
    > for specifying formatting. XSL specifies the styling of an XML
    > document by using XSLT to describe how the document is transformed
    > into another XML document that uses the formatting vocabulary.
    >
    > XSLT is also designed to be used independently of XSL. However, XSLT
    > is not intended as a completely general-purpose XML transformation
    > language. Rather it is designed primarily for the kinds of
    > transformations that are needed when XSLT is used as part of XSL."

    XSLT 1.0 (http://www.w3.org/TR/xslt) is a declarative/functional
    programming language that I think has been proven to be Turing complete
    so in theory you can solve any task with it that you can solve with
    other programming languages. So yes, for transforming XML the language
    XSLT 1.0 is certainly theoretically as powerful as Java and practically
    even more suited to do such transformations as it is specifically
    constructed to transform XML to XML or to HTML or to text.

    --

    Martin Honnen


    • Andy Dingley

      #3
      Re: Scripting XML?

      On 7 Nov 2004 00:41:35 -0800, vijai.kalyan@gmail.com (Vijai Kalyan)
      wrote:

      >1. Is XSL as powerful as a programming language such as Java in its
      >abilities to transform XML?

      Without an agreement as to what "powerful" means, then there's no way
      to answer this question.

      XSLT is extremely specialised. It takes XML documents as input and
      generates an XML document as output. If you ask it nicely, it can
      serialise this output document as HTML or even text, by breaking a few
      of the XML well-formedness rules, but it's still broadly a "flattened"
      serialisation of an XML structure.

      XSL is by and large equivalent to XSLT. The difference is that XSL
      also includes XSL-FO, which is useful for generating non-XML outputs
      - frequently PDF, Quark output, or similar integration to DTP systems.
      Strictly, XSLT is a subset of XSL, but "XSL" is a common loose term
      for either.

      XSLT is based on XPath (learning XSL is easy, it's learning XPath
      that takes the effort). With these tools, you can very easily perform
      tasks that would be difficult in Java. However, XSLT is also a
      functional language, not a procedural one, and so most programmers have a
      great deal of trouble in writing it well. It's more "write-only" than
      badly structured Perl.

      XSLT 2 and EXSLT are also worth looking at. XSLT is based on XML,
      where the contents of nodes (text nodes or attributes) are largely
      opaque. This is a major limitation in practice, and so these are
      efforts to improve matters.

      On the whole, I'd assert that XSL was "more useful" than Java "in its
      abilities to transform XML", but I wouldn't claim this was "more
      powerful".

      >2. Does the above mean that when an XML document is transported over a
      >network, its content is totally static?

      I have no idea how you draw that conclusion from the statement listed,
      so I don't really know what you're getting at.

      XML has a number of issues (re encoding and whitespace) which are seen
      as freely interchangeable. As such, a transport protocol can change
      this without consequence. Whether this is regarded as "static" depends
      on the context.

      There are also many cases (e.g. serving XML content to HTML-only
      browser) where XML may be transformed on being served. This isn't
      really a mere "transport" though.

      So XML documents transported over networks are static. But I'm sure
      this wasn't what you meant, because your whole project seems to be
      based on breaking this.

      >This last question is relevant because we faced this problem when
      >using XML to describe a sequence of actions.

      >http://www.sourceforge.net/projects/xmlms/

      I had a brief read of your document. To be frank, I found it a very
      hard read - it needs an introduction to it, which it painfully doesn't
      have at present. There's no distinction drawn between your monitoring
      project, and your XML-MS concept.

      If your project boils down to "Add in-transit processing of scriptlets
      to XML", then that's a worthy idea (although it's already out there).
      What I want to know about it is how the processing model works (who
      processes it, and how does it decide what to process), what's
      available as a coding platform to use, and what's the presented
      interface for the document that's being processed. I don't care about
      in-line code fragments, because quite honestly if I have to learn
      anything new to fulfill that particular role, this isn't a good
      solution for me.

      There's also a strong sense that you're ignorant of Schematron, JSP,
      Cocoon, even XSLT, and many of the other "pre-invented wheels" that
      are already out there.


      --
      Smert' spamionam


      • Shmuel (Seymour J.) Metz

        #4
        Re: Scripting XML?

        In <18b36e50.0411070041.79224814@posting.google.com>, on 11/07/2004
        at 12:41 AM, vijai.kalyan@gmail.com (Vijai Kalyan) said:

        >1. Is XSL as powerful as a programming language such as Java in its
        >abilities to transform XML?

        It is for the sorts of things that it is designed to do. If you need
        to do something beyond that, use a programming language and an XML
        parser.

        >2. Does the above mean that when an XML document is transported over
        >a network, its content is totally static?

        No. It has nothing to do with transporting XML over a network. The T
        stands for transform, not transport.

        --
        Shmuel (Seymour J.) Metz, SysProg and JOAT <http://patriot.net/~shmuel>

        Unsolicited bulk E-mail subject to legal action. I reserve the
        right to publicly post or ridicule any abusive E-mail. Reply to
        domain Patriot dot net user shmuel+news to contact me. Do not
        reply to spamtrap@library.lspace.org


        • Vijayaraghavan Kalyanapasupathy

          #5
          Re: Scripting XML?

          > In article <doeso0lafp2edcbid1auppq1rtoaik0qvt@4ax.com>, dingbat@codesmiths.com says...
          > Subject: Re: Scripting XML?
          > From: Andy Dingley <dingbat@codesmiths.com>
          > Newsgroups: comp.text.xml

          > Without an agreement as to what "powerful" means, then there's no way
          > to answer this question.

          > XSLT 2 and EXSLT are also worth looking at. XSLT is based on XML,
          > where the contents of nodes (text nodes or attributes) are largely
          > opaque. This is a major limitation in practice, and so these are
          > efforts to improve matters.

          > On the whole, I'd assert that XSL was "more useful" than Java "in its
          > abilities to transform XML", but I wouldn't claim this was "more
          > powerful".

          Yes, you are right. My mistake in not clarifying that.

          Well, simply put,

          Can I take an XML parser, an XML document, a programming language such as
          Java and combine them in the following manner for example:

          XML Document => parsed by parser --> in-memory representation of
          document => manipulated by application --> output in different form

          So, if there were commands in some "embedded language" (or variable
          value substitutions, variable declarations, function definitions,
          function declarations etc.) in the document (in attribute values, element
          data etc.), the application can walk through the in-memory representation
          of the document and act as a recognizer for the embedded language. Upon
          completion the final document is free of all the "embedded language"
          constructs.

          Note that in both cases the document itself remains valid, not
          merely well-formed, because the embedded language constructs are just
          character data that has no meaning to a pure XML parser.

          I can do this with the above combination certainly!
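
          A minimal sketch of that pipeline in Python, using only the standard
          library. The ${name} marker syntax, the function names and the sample
          document are all assumptions made for illustration; none of them is
          taken from XML-MS.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical marker syntax: ${name} marks a value supplied by the environment.
MARKER = re.compile(r"\$\{(\w+)\}")


def expand(value, env):
    """Replace each ${name} marker in a string with env[name]."""
    if value is None:
        return None
    # An unknown name raises KeyError; a real tool would report it more kindly.
    return MARKER.sub(lambda m: env[m.group(1)], value)


def process(xml_text, env):
    """Parse, walk the in-memory tree, expand markers, and serialize."""
    root = ET.fromstring(xml_text)              # document -> in-memory tree
    for elem in root.iter():                    # visit every element
        elem.text = expand(elem.text, env)      # element data
        elem.tail = expand(elem.tail, env)      # text following the element
        for name, value in list(elem.attrib.items()):
            elem.set(name, expand(value, env))  # attribute values
    return ET.tostring(root, encoding="unicode")  # tree -> output document


source = '<actions><copy from="${home}/in.txt">run as ${user}</copy></actions>'
result = process(source, {"home": "/home/vijai", "user": "vijai"})
print(result)
```

          The output document contains no markers, so a downstream consumer
          never needs to know the embedded language existed.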

          Question is, can I do the same thing with only XSLT which boils down to
          asking, "can I represent the same actions in XSLT" or can I contrive an
          equivalence between say Java and XSLT in their ability to do things with
          an XML document?

          >> 2. Does the above mean that when an XML document is transported over a
          >> network, its content is totally static?

          > I have no idea how you draw that conclusion from the statement listed,
          > so I don't really know what you're getting at.

          > XML has a number of issues (re encoding and whitespace) which are seen
          > as freely interchangeable. As such, a transport protocol can change
          > this without consequence. Whether this is regarded as "static" depends
          > on the context.

          > There are also many cases (e.g. serving XML content to HTML-only
          > browser) where XML may be transformed on being served. This isn't
          > really a mere "transport" though.

          > So XML documents transported over networks are static. But I'm sure
          > this wasn't what you meant, because your whole project seems to be
          > based on breaking this.

          Well, what I meant is, if we consider the network as a black box and the
          server and client on each side, then what is input into the box at the
          server side is the same when it comes out at the client side of the box.
          In between yes, the transport protocol may change things.

          Specifically, an example of the client is "the parser on the browser"
          not the rendering engine or the end-user.

          Transformations such as applying a stylesheet happen after the
          client has received the document, so the window I am really asking
          about is "after processing at the server side is complete" and "before
          transformations on the client side are applied".

          Naturally, when you consider stand alone documents used by an
          application, the server really does nothing but the client (the parser)
          does a lot of stuff before the document is passed on to the final
          application (the end-user?).

          >This last question is relevant because we faced this problem when
          >using XML to describe a sequence of actions.

          >http://www.sourceforge.net/projects/xmlms/

          > I had a brief read of your document. To be frank, I found it a very
          > hard read - it needs an introduction to it, which it painfully doesn't
          > have at present. There's no distinction drawn between your monitoring
          > project, and your XML-MS concept.

          I see. It could be that I was explaining it from our perspective and
          problems in the project so things got intermixed. I will try to separate
          things out and clarify them.

          > If your project boils down to "Add in-transit processing of scriptlets
          > to XML", then that's a worthy idea (although it's already out there).

          Well, a general purpose scripting language can be used for anything,
          including in transit processing (I presume you mean things like in-
          network processing?) primarily because the language itself doesn't
          impose any restrictions on what can be done with the document and its
          content.

          Yes, the primary point here is as you said above, the "non-opaqueness"
          about the content (such as attribute values or element data). Simply
          put, it is the same as the C preprocessor walking through C modules and
          substituting macros, except that the "preprocessor" in our problem is
          more powerful than mere macro expansion.

          > What I want to know about it is how the processing model works (who
          > processes it, and how does it decide what to process), what's
          > available as a coding platform to use, and what's the presented
          > interface for the document that's being processed. I don't care about
          > in-line code fragments, because quite honestly if I have to learn
          > anything new to fulfill that particular role, this isn't a good
          > solution for me.

          Well here is how it could work:

          - XML-Document A (in Schema S) is served by a server (which may be a
            webserver, or just a file reader library)

          - XML-Document A is parsed by XML Parser P into an in-memory
            representation M

          - Depending on the type of application (stand-alone, web browser etc.)

            + P invokes I (an interpreter for the embedded scripting language L)
              on the value of each attribute and the data of each element.

              or

            + P invokes I on only certain elements/attributes on demand by the
              stand-alone application

              # A stand-alone application would be something that I would code for
                some specific project.

              # A general application would be something like a web browser which
                does not expose the presence of L constructs in a particular
                document to an end-user, but does allow the end-user to tell it at
                a very general level what to do when such constructs are found.
                This is similar to you or me having control over whether IE or
                Firefox executes Java applets or JavaScript functions.

            + I recognizes certain constructs in the data that is passed to it.
              So, in effect, I plays two roles:

              * A pattern recognizer
              * A language interpreter

                # Note that the language is as powerful as any conventional
                  programming language.
                # The patterns are relatively simple and serve only to mark out
                  regions of the data that should be considered as constructs in L

              * On recognition of the pattern, I performs the tasks denoted by
                the pattern:

                @ This task may be as simple as requesting macro substitution
                @ or it may be as complex as a nested function call

            + When I finishes, the result is an XML document that is free of L in
              all respects and conforms to a schema S'.
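
          The two roles of I listed above can be sketched roughly as follows.
          This is an illustration under stated assumptions, not the XML-MS
          design: the {{...}} region marker, the three built-in functions and
          the rule that a bare word is a literal string are all invented here.

```python
import re
import xml.etree.ElementTree as ET

# Role 1, pattern recognizer: {{ ... }} marks a region written in L.
REGION = re.compile(r"\{\{(.*?)\}\}")
# Tokens of the toy language: identifiers, parentheses and commas.
TOKEN = re.compile(r"\s*([A-Za-z_]\w*|[(),])")

# Functions that L may call; every one of them is hypothetical.
FUNCTIONS = {
    "env": lambda ctx, name: ctx[name],
    "upper": lambda ctx, text: text.upper(),
    "concat": lambda ctx, *parts: "".join(parts),
}


def evaluate(expr, ctx):
    """Role 2, language interpreter: evaluate a possibly nested call."""
    tokens = TOKEN.findall(expr)
    pos = 0

    def parse():
        nonlocal pos
        word = tokens[pos]
        pos += 1
        if pos < len(tokens) and tokens[pos] == "(":
            pos += 1                      # consume "("
            args = []
            while tokens[pos] != ")":
                args.append(parse())      # arguments may be calls themselves
                if tokens[pos] == ",":
                    pos += 1
            pos += 1                      # consume ")"
            return FUNCTIONS[word](ctx, *args)
        return word                       # a bare word is a literal string

    return parse()


def interpret(xml_text, ctx):
    """Run I over every attribute value and element text of the document."""
    def expand(value):
        if not value:
            return value
        return REGION.sub(lambda m: evaluate(m.group(1), ctx), value)

    root = ET.fromstring(xml_text)
    for elem in root.iter():
        elem.text = expand(elem.text)
        for name, value in list(elem.attrib.items()):
            elem.set(name, expand(value))
    return ET.tostring(root, encoding="unicode")


doc = '<task owner="{{env(user)}}"><target>{{upper(env(host))}}</target></task>'
result = interpret(doc, {"user": "vijai", "host": "monitor1"})
print(result)
```

          Keeping the recognizer (one regular expression) apart from the
          interpreter means the marker syntax can change without touching the
          language, and the reverse.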

          This model does actually allow one to do in-transit processing
          because all the necessary information to do such processing is available
          in the document itself and the "patterns" can be extended to signal
          which patterns are to be invoked and where. (That is a very good point,
          thanks for pointing it out; I will look this up).

          > There's also a strong sense that you're ignorant of Schematron, JSP,
          > Cocoon, even XSLT, and many of the other "pre-invented wheels" that
          > are already out there.

          Definitely true. I had no idea about Schematron and Cocoon (and no
          great experience with XSLT) before you pointed them out. JSP I have
          heard of. So, yes, I will go and investigate them as well!

          Thanks again for your comments, but I hope I can impose upon you to
          comment some more?

          regards,

          -vijai.


          • Patrick TJ McPhee

            #6
            Re: Scripting XML?

            In article <18b36e50.0411070041.79224814@posting.google.com>,
            Vijai Kalyan <vijai.kalyan@gmail.com> wrote:

            % 1. Is XSL as powerful as a programming language such as Java in its
            % abilities to transform XML?

            It can be easier to write transformations involving document structure
            using XSLT than to do it in some general-purpose language using
            a tree representation of the same document. XSLT is not as strong
            at text manipulation as some languages. I think Java is a poor language
            for both problems.

            One thing you can do with many XSLT processors is to define extension
            functions written in other languages. This can allow you to take advantage
            of its strengths and ignore its weaknesses.

            [...]

            % 2. Does the above mean that when an XML document is transported over a
            % network, its content is totally static?

            The above, which I've omitted, didn't seem to have anything to do with
            network transport. You can pass environmental data to an XSLT processor
            using parameters.

            Your problem description is too vague for me to give specific advice,
            but if you're not prepared to blow some time delving into it a bit and
            seeing what it's all about, you're not going to get XSLT working
            effectively in your project.
            --

            Patrick TJ McPhee
            North York Canada
            ptjm@interlog.com


            • Vijayaraghavan Kalyanapasupathy

              #7
              Re: Scripting XML?

              In article <doeso0lafp2edcbid1auppq1rtoaik0qvt@4ax.com>,
              dingbat@codesmiths.com says...
              > If your project boils down to "Add in-transit processing of scriptlets
              > to XML", then that's a worthy idea (although it's already out there).

              You have given me food for thought. Can you give me examples of these?

              regards,

              -vijai.


              • Andy Dingley

                #8
                Re: Scripting XML?

                On Sun, 7 Nov 2004 16:11:27 -0600, Vijayaraghavan Kalyanapasupathy
                <vijai.lists@gmail.com> wrote:

                >Can I take an XML parser, an XML document, a programming language such as
                >Java and combine them in the following manner for example:
                >
                >XML Document => parsed by parser --> in-memory representation of
                >document => manipulated by application --> output in different form

                Yes, of course you can. But that's not a particularly helpful
                statement - it's still far too vague.

                I think that at the most generic you're talking about "Applying
                scriptlets to XML documents as they pass through a transform process"?
                Even at this level, the problem splits in two.

                One is like Cocoon. You have a fairly rigid "processor" engine, and
                you tag your documents with a very lightweight reference to a styling
                or scripting task. This will typically use an XML PI (processing
                instruction) because it's emphatically a _link_ to a shared process /
                styling / script. Many documents will pass through this same engine,
                and have similar sets of transforms applied to them. It's not even
                significant who transforms them - that task could be widely
                distributed, so long as there's some consistency about the
                interpretation of the stylesheet.
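
                A rough sketch of this first style in Python. The PI target
                apply-transform, its name pseudo-attribute and the registry of
                named transforms are all hypothetical, and Cocoon itself works
                differently in detail; the point is only that the document
                carries a link to shared processing, not the processing itself.

```python
import re
from xml.dom import minidom


def uppercase_titles(doc):
    """One shared transform: upper-case the text of every <title>."""
    for title in doc.getElementsByTagName("title"):
        for node in title.childNodes:
            if node.nodeType == node.TEXT_NODE:
                node.data = node.data.upper()


# The engine's fixed registry of shared transforms (names are hypothetical).
TRANSFORMS = {"uppercase-titles": uppercase_titles}

# PI content is free text; by convention it often looks like attributes.
PSEUDO_ATTR = re.compile(r'(\w+)="([^"]*)"')


def run_engine(xml_text):
    """Apply every transform that the document's PIs refer to by name."""
    doc = minidom.parseString(xml_text)
    for node in doc.childNodes:
        if (node.nodeType == node.PROCESSING_INSTRUCTION_NODE
                and node.target == "apply-transform"):
            attrs = dict(PSEUDO_ATTR.findall(node.data))
            TRANSFORMS[attrs["name"]](doc)   # look up the shared process
    return doc.documentElement.toxml()


source = ('<?apply-transform name="uppercase-titles"?>'
          '<report><title>weekly status</title><body>unchanged</body></report>')
result = run_engine(source)
print(result)
```

                Any number of documents can point at the same registry entry,
                which is what keeps the per-document tagging so lightweight.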

                The other (which I think is rather more like your project) uses
                embedded script fragments. At some point, not entirely unlike the XML
                DOM model, an "event" is triggered which causes appropriate script
                elements to be invoked.

                http://www.w3.org/TR/2003/REC-xml-events-20031014/

                This TR already defines much of what you're looking at. It (obviously
                enough) sees a separation between "observer" and "target", whilst your
                more narrow context assumes they're bound together. This would require
                your model to state this binding for each document, but it also means
                that you don't need to repeat the script inside each document.
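
                A small sketch of that observer/target separation, under loud
                assumptions: the un-namespaced listener element with event,
                observer and handler attributes is only loosely modelled on
                XML Events, and the handler registry and the dispatch function
                are invented for illustration.

```python
import xml.etree.ElementTree as ET


# Handlers live outside the document, in a language the author already knows.
def log_start(target):
    return f"starting {target.get('id')}"


def log_any(target):
    return f"observed {target.tag}"


HANDLERS = {"log_start": log_start, "log_any": log_any}

DOCUMENT = """
<job id="nightly">
  <listener event="start" observer="nightly" handler="log_any"/>
  <listener event="start" observer="step1" handler="log_start"/>
  <step id="step1"/>
  <step id="step2"/>
</job>
""".strip()


def dispatch(root, event, target_id):
    """Fire `event` at the target and let it bubble up through its ancestors."""
    parents = {child: parent for parent in root.iter() for child in parent}
    by_id = {e.get("id"): e for e in root.iter() if e.get("id")}
    listeners = root.findall(".//listener")
    target = by_id[target_id]
    results = []
    node = target
    while node is not None:
        for listener in listeners:
            if (listener.get("event") == event
                    and listener.get("observer") == node.get("id")):
                results.append(HANDLERS[listener.get("handler")](target))
        node = parents.get(node)          # move up to the next observer
    return results


root = ET.fromstring(DOCUMENT)
on_step1 = dispatch(root, "start", "step1")
on_step2 = dispatch(root, "start", "step2")
print(on_step1, on_step2)
```

                The document states only the binding; the handler code is
                written once and shared by every document that refers to it.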



                I don't see any concepts like "pattern recognizer" or "language
                interpreter" as being at all helpful to this project. My life already
                has enough language interpreters in it - don't give me another one,
                give me a simple binding to the one(s) I already know. Look at DOM
                and event models, not parsing embedded scriptlets from scratch.

                --
                Smert' spamionam


                • John Fereira

                  #9
                  Re: Scripting XML?

                  Andy Dingley <dingbat@codesmiths.com> wrote in
                  news:71mvo016mvcjm7cdknt0p0438b0b02k7tt@4ax.com:

                  > On Sun, 7 Nov 2004 16:11:27 -0600, Vijayaraghavan Kalyanapasupathy
                  > <vijai.lists@gmail.com> wrote:
                  >
                  >>Can I take an XML parser, an XML document, a programming language such as
                  >>Java and combine them in the following manner for example:
                  >>
                  >>XML Document => parsed by parser --> in-memory representation of
                  >>document => manipulated by application --> output in different form
                  >
                  > Yes, of course you can. But that's not a particularly helpful
                  > statement - it's still far too vague.
                  >
                  > I think that at the most generic you're talking about "Applying
                  > scriptlets to XML documents as they pass through a transform process"?
                  > Even at this level, the problem splits in two.
                  >
                  > One is like Cocoon.

                  The Jakarta Velocity project may also be a good example. I just recently
                  converted all of the documentation for an open source project to Velocity
                  Anakia. All of the static HTML pages were converted to XHTML, some
                  Anakia-specific tags added, and a transformation performed such that a
                  format for viewing on a web page and a format for printing are available.
                  I could have just as easily (well, almost as easily) added an XSL-FO
                  transformation so that the documentation pages are available as PDF
                  files as well.


                  • Vijayaraghavan Kalyanapasupathy

                    #10
                    Re: Scripting XML?

                    In article <71mvo016mvcjm7cdknt0p0438b0b02k7tt@4ax.com>,
                    dingbat@codesmiths.com says...
                    > On Sun, 7 Nov 2004 16:11:27 -0600, Vijayaraghavan Kalyanapasupathy
                    > <vijai.lists@gmail.com> wrote:

                    > I think that at the most generic you're talking about "Applying
                    > scriptlets to XML documents as they pass through a transform process"?

                    Well, yes. I guess another way to think about this is to consider an
                    XML-document as a template for something by itself (self-contained?).
                    It's processed repeatedly (repetition of the transform process) till
                    every scriptlet has been processed. The result is an XML-document.
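
                    A minimal sketch of that repeated processing, assuming a
                    hypothetical ${name} marker and a plain table of macros
                    (neither comes from XML-MS). For brevity it works on the
                    serialized text rather than on a parsed tree, so macro
                    values containing markup characters would need escaping in
                    a real tool.

```python
import re

# Hypothetical marker syntax: ${name} refers to an entry in the macro table.
MARKER = re.compile(r"\$\{(\w+)\}")


def expand_until_stable(text, macros, max_passes=10):
    """Re-run the expansion pass until a pass changes nothing."""
    for _ in range(max_passes):
        expanded = MARKER.sub(lambda m: macros[m.group(1)], text)
        if expanded == text:
            return expanded               # no markers were left to process
        text = expanded
    # A macro that refers to itself would otherwise loop forever.
    raise RuntimeError("macros did not settle; possible recursive definition")


macros = {
    "greeting": "Hello, ${name}",
    "name": "${first} ${last}",
    "first": "Vijai",
    "last": "Kalyan",
}
template = "<note>${greeting}</note>"
result = expand_until_stable(template, macros)
print(result)
```

                    The pass limit is the design choice that matters here:
                    without it a self-referencing scriptlet never terminates.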

                    > The other (which I think is rather more like your project) uses
                    > embedded script fragments. At some point, not entirely unlike the XML

                    Yes that is correct.

                    > DOM model, an "event" is triggered which causes appropriate script
                    > elements to be invoked.
                    > http://www.w3.org/TR/2003/REC-xml-events-20031014/

                    I will look into this. Thanx for the info!

                    > enough) sees a separation between "observer" and "target", whilst your
                    > more narrow context assumes they're bound together. This would require
                    > your model to state this binding for each document, but it also means
                    > that you don't need to repeat the script inside each document.

                    Before I comment on this, I probably need to read up on XML_Events.

                    > I don't see any concepts like "pattern recognizer" or "language
                    > interpreter" as being at all helpful to this project. My life already
                    > has enough language interpreters in it - don't give me another one,
                    > give me a simple binding to the one(s) I already know. Look at DOM
                    > and event models, not parsing embedded scriptlets from scratch.

                    Yep, precisely why I am asking these questions. I don't want to
                    implement a parser and interpreter for a full-fledged language from
                    scratch either (painful :)!

                    thanx and regards,

                    -vijai.
