XML Document with BASE64 Encoded Sections

  • Chris Fink

    XML Document with BASE64 Encoded Sections

    I have an XML document that contains some elements encoded as Base64. How do
    I dynamically scan the XML document and pull out the sections that are
    Base64?

    My overall goal is to display the XML document in a browser with all the
    Base64 sections converted to ASCII (UTF-8).
  • Martin Honnen

    #2
    Re: XML Document with BASE64 Encoded Sections



    Chris Fink wrote:

    > I have an xml document that contains some elements encoded as Base64.

    What does that mean? Are there elements whose contents are Base64
    encoded?
    Or what are "elements encoded as Base64"?

    > How do
    > I dynamically scan the XML Document and pull out the sections that are
    > Base64...

    Is there some indication in the document, or in a schema for the document,
    that an element has Base64-encoded contents, e.g. an attribute indicating
    the type, perhaps
    <data xsi:type="xs:base64Binary">...</data>

    Or do you know the tag names of the elements containing Base64-encoded data?

    You could access the InnerText of such an element and use the method
    Convert.FromBase64String
    <http://msdn.microsoft.com/library/default.asp?url=/library/en-us/cpref/html/frlrfSystemConvertClassFromBase64StringTopic.asp>
    to convert the text content to a byte array.
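
    A minimal sketch of that approach, assuming the tag name is known in
    advance and the decoded bytes are UTF-8 text. The class name
    Base64ElementDecoder, the method name DecodeElements, and the sample
    Payload element in the usage note are illustrative, not taken from the
    original document:

    ```csharp
    using System;
    using System.Collections.Generic;
    using System.Text;
    using System.Xml;

    public static class Base64ElementDecoder
    {
        // Decodes the Base64 text of every element with the given tag name.
        // Assumption: the decoded bytes are UTF-8 text.
        public static List<string> DecodeElements(string xml, string tagName)
        {
            XmlDocument doc = new XmlDocument();
            doc.LoadXml(xml);

            List<string> results = new List<string>();
            foreach (XmlNode node in doc.GetElementsByTagName(tagName))
            {
                // InnerText is the element's text content;
                // FromBase64String throws FormatException on invalid input.
                byte[] bytes = Convert.FromBase64String(node.InnerText);
                results.Add(Encoding.UTF8.GetString(bytes));
            }
            return results;
        }
    }
    ```

    For example, DecodeElements("<doc><Payload>SGVsbG8sIHdvcmxkIQ==</Payload></doc>", "Payload")
    would return a list holding the single string "Hello, world!".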

    --

    Martin Honnen --- MVP XML

    • Chris Fink

      #3
      Re: XML Document with BASE64 Encoded Sections

      Yes, the XML document does indicate which elements are Base64 encoded, like so:
      <xsd:element name="Payload" type="xsd:base64Binary"/>

      I do know the tag names now, but I would like to make it more flexible, since
      additional tag names may be added in the future that would break the design.

      My main challenge is to dynamically find the elements that are Base64 and
      decode them to Ascii (UTF-8). I just need help on finding these sections.

      • Martin Honnen

        #4
        Re: XML Document with BASE64 Encoded Sections



        Chris Fink wrote:

        > Yes, the XML document does indicate which elements are Base64 encoded as such:
        > <xsd:element name="Payload" type="xsd:base64Binary"/>

        That looks like a schema definition. A schema is usually external to the
        XML instance document, so your XML instance document is more likely to
        contain something like
        <Payload>base 64 encoded data sits here</Payload>

        > I just need help on finding these sections.

        There are various APIs in the .NET Framework for reading information
        out of an XML document. XmlTextReader is a fast, forward-only pull
        parser that does not consume much memory but requires you to write
        your own code to extract the data and store it.
        XPathDocument loads the complete document into memory, in a data
        structure optimized for read-only XPath navigation and data access.
        XmlDocument also loads the complete document into memory, but in a
        structure that allows manipulation. XmlDocument also allows XPath
        navigation.
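
        To illustrate the forward-only option, here is a hedged sketch that
        streams through a document with XmlTextReader and decodes elements
        whose names appear in a caller-supplied list. The class name
        ForwardOnlyBase64Scan, the method name Decode, and the assumption
        that the decoded bytes are UTF-8 text are illustrative only:

        ```csharp
        using System;
        using System.Collections.Generic;
        using System.IO;
        using System.Text;
        using System.Xml;

        public static class ForwardOnlyBase64Scan
        {
            // Streams through the XML once and decodes the content of every
            // element whose local name is listed in base64Names.
            public static List<string> Decode(TextReader input, ICollection<string> base64Names)
            {
                List<string> decoded = new List<string>();
                using (XmlTextReader reader = new XmlTextReader(input))
                {
                    reader.Read();
                    while (!reader.EOF)
                    {
                        if (reader.NodeType == XmlNodeType.Element
                            && base64Names.Contains(reader.LocalName))
                        {
                            // Reads the text content and leaves the reader on the
                            // node after the end tag, so no extra Read() here.
                            string text = reader.ReadElementContentAsString();
                            byte[] bytes = Convert.FromBase64String(text);
                            decoded.Add(Encoding.UTF8.GetString(bytes));
                            continue;
                        }
                        reader.Read();
                    }
                }
                return decoded;
            }
        }
        ```

        The loop skips the Read() call after decoding an element because
        ReadElementContentAsString has already advanced the reader; calling
        Read() again would skip an element that directly follows the end tag.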


        --

        Martin Honnen --- MVP XML

        • Chris Fink

          #5
          Re: XML Document with BASE64 Encoded Sections

          Martin,

          So without the XML fragment containing an attribute that describes its
          type, I cannot dynamically determine the sections that are Base64? I was
          thinking that the definition of the type in the schema would be sufficient.
          The XML doc contains a reference to the schema, so I would think the solution
          would be to scan the XSD for base64 types and then pull out these sections
          from the xml fragment and convert to ascii. Your suggestions?

          • Martin Honnen

            #6
            Re: XML Document with BASE64 Encoded Sections



            Chris Fink wrote:

            > So without the xml fragment containing an attribute that describes it's
            > type, I cannot dynamically determine the sections that are base64? I was
            > thinking that the definition of the type in the schema would be sufficient?
            > The XML doc contains a reference to the schema, so I would think the solution
            > would be to scan the xsd for base64 types and then pull out these sections
            > from the xml fragment and convert to ascii.

            If the schema and the instance are available, then you can use an
            XmlValidatingReader and access type information from the schema while
            parsing the XML instance document. The following snippet checks for
            elements with base64Binary typed contents and reads the content into
            a byte array:

            // Requires: using System; using System.Xml; using System.Xml.Schema;
            // ValidationHandler is your own ValidationEventHandler method (not shown).
            XmlValidatingReader validator = new XmlValidatingReader(
                new XmlTextReader(@"test2005102001.xml"));
            validator.ValidationType = ValidationType.Schema;
            validator.ValidationEventHandler +=
                new ValidationEventHandler(ValidationHandler);

            while (validator.Read()) {
                if (validator.NodeType == XmlNodeType.Element) {
                    // Simple built-in types are reported as XmlSchemaDatatype
                    if (validator.SchemaType is XmlSchemaDatatype) {
                        XmlSchemaDatatype currentType =
                            (XmlSchemaDatatype)validator.SchemaType;
                        if (currentType.ToString() ==
                                "System.Xml.Schema.Datatype_base64Binary") {
                            Console.WriteLine("Element {0} has base64Binary type.",
                                validator.Name);
                            object currentValue = validator.ReadTypedValue();
                            Console.WriteLine("Typed value read as {0}.",
                                currentValue.GetType().Name);
                            // could use currentValue as byte[] here
                        }
                    }
                }
            }

            validator.Close();
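
            Later .NET Framework versions mark XmlValidatingReader as obsolete
            in favor of XmlReader.Create with XmlReaderSettings. A rough
            equivalent of the snippet above under that API might look like
            the following sketch. The class name SchemaAwareBase64Scanner,
            the method name Decode, passing schema and instance as strings,
            and treating the decoded bytes as UTF-8 text are all assumptions
            made for illustration:

            ```csharp
            using System;
            using System.Collections.Generic;
            using System.IO;
            using System.Text;
            using System.Xml;
            using System.Xml.Schema;

            public static class SchemaAwareBase64Scanner
            {
                // Returns (element name, decoded text) for each element that the
                // schema types as base64Binary.
                public static List<KeyValuePair<string, string>> Decode(string xsd, string xml)
                {
                    XmlReaderSettings settings = new XmlReaderSettings();
                    settings.ValidationType = ValidationType.Schema;
                    using (XmlReader schemaReader = XmlReader.Create(new StringReader(xsd)))
                    {
                        // null means: use the schema's own target namespace
                        settings.Schemas.Add(null, schemaReader);
                    }
                    settings.ValidationEventHandler += (sender, e) =>
                    {
                        if (e.Severity == XmlSeverityType.Error) throw e.Exception;
                    };

                    var found = new List<KeyValuePair<string, string>>();
                    using (XmlReader reader = XmlReader.Create(new StringReader(xml), settings))
                    {
                        reader.Read();
                        while (!reader.EOF)
                        {
                            if (reader.NodeType == XmlNodeType.Element && IsBase64(reader))
                            {
                                string name = reader.LocalName;
                                // Leaves the reader on the node after the end tag.
                                string text = reader.ReadElementContentAsString();
                                byte[] bytes = Convert.FromBase64String(text);
                                found.Add(new KeyValuePair<string, string>(
                                    name, Encoding.UTF8.GetString(bytes)));
                                continue;
                            }
                            reader.Read();
                        }
                    }
                    return found;
                }

                // True when validation assigned the current element a base64Binary type.
                private static bool IsBase64(XmlReader reader)
                {
                    IXmlSchemaInfo info = reader.SchemaInfo;
                    return info != null
                        && info.SchemaType != null
                        && info.SchemaType.TypeCode == XmlTypeCode.Base64Binary;
                }
            }
            ```

            Checking XmlTypeCode.Base64Binary avoids comparing against the
            internal type name string. If the instance document carries a
            schema location hint, setting
            XmlSchemaValidationFlags.ProcessSchemaLocation on the settings
            may remove the need to add the schema by hand.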


            --

            Martin Honnen --- MVP XML

            • Chris Fink

              #7
              Re: XML Document with BASE64 Encoded Sections

              Exactly what I was looking for.

              Thank you very much!