convert .txt file to .doc binary format

  • gokul
    New Member
    • Sep 2006
    • 6

    convert .txt file to .doc binary format

    Hi,

    Earlier I converted .doc binary format files to .txt files and added some content to the .txt files. Now I want to convert them back to .doc binary format.

    Please help me: how do I convert .txt files to .doc binary format files on Linux?

    Regards,
    Gokul.N
  • macklin01
    New Member
    • Aug 2005
    • 145

    #2
    Well, this won't use C++, but I'd recommend just opening the files in OpenOffice and saving in .doc format. -- Paul


    • gokul
      New Member
      • Sep 2006
      • 6

      #3
      Hi,

      I want everything to be done programmatically, without user interaction.

      So please tell me: is there any command to convert from .txt to .doc binary format on Linux?


      • macklin01
        New Member
        • Aug 2005
        • 145

        #4
        Hmm, this doesn't get you completely there, but this project converts ASCII to RTF, which is a start. You may also find this page useful. I have a feeling that you will need to string several tools together. (Or search Freshmeat/SourceForge for a better tool.) Or you'll need to open the emptiest Word document you can and try to figure out the bare minimum needed to write a file.

        If you do go with the last route, I'd be very interested in seeing what you come up with. -- Paul


        • macklin01
          New Member
          • Aug 2005
          • 145

          #5
          This also looks interesting. -- Paul


          • gokul
            New Member
            • Sep 2006
            • 6

            #6
            Hi,

            Please tell me how to write some data to .doc binary format files on Linux.

            Regards,
            Gokul.N


            • macklin01
              New Member
              • Aug 2005
              • 145

              #7
              I'm afraid that I'm not versed in the specifics of the MS Word file format. You might consider digging through some OpenOffice code to determine how to write it. The problem you'll encounter is that the format isn't particularly well-published, and as far as I recall, it's almost more of a binary memory dump than a file format. I just spent a good 45 minutes trying to dig in and understand it, and I didn't make any substantial progress. (Whenever I tried to modify the text portions in a text editor, I only corrupted the files.)

              You might also consider looking into the "OpenXML" format that MS will be using in future versions of Office.

              Otherwise, the last program I linked to is a way to automate OpenOffice to do the conversions you seek.

              Does it really have to be in .doc format, or just in a format that Word can readily read but with reasonable formatting? I still think you should seriously consider the RTF format. At least that's well-documented. (And I believe I linked to some C++ code for text->RTF conversions.) -- Paul


              • macklin01
                New Member
                • Aug 2005
                • 145

                #8
                If you are willing to just write a very simple rich text format file (RTF), it's as easy as this:
                Code:
                {\rtf1\ansi\deff0
                {\fonttbl{\f0\froman\fcharset0\fprq2{\*\panose 02020603050405020304}Times New Roman;}}
                {\comment{Paul Macklin's very empty RTF template}}
                {\colortbl;\red0\green0\blue0;}
                {\ltrch\fcs0 Hi,\par
                \par
                Before i convert .doc binary format files to .txt files and i added 
                some content to .txt files. Now i again convert back to .doc binary format.\par
                \par
                Pls Help Me How to Convert .txt files to .doc binary format files in Linux.\par
                \par
                Regards,\par
                Gokul.N}}
                I came up with that after browsing through the RTF 1.5 page and another sample RTF file. You can simply copy and paste that text and replace the sample message with your own. In essence, here are the steps:

                1) Output my text, character for character, to a text file up until the beginning of the message text (the "Hi," line).

                2) Output your text. Replace any end-of-line characters with the \par control word. I'm assuming that the tab character must be replaced with something similar, but I'll leave that to you.

                3) Close with two close braces "}}".

                That's it. You've just written a bona fide RTF file that Word can read without a problem. Feel free to add more formatting.
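
The three steps above can be sketched in Python (not a language used in this thread; it is chosen here only for brevity). The names `RTF_HEADER`, `escape_rtf` and `text_to_rtf` are hypothetical, and the header is the template above without the `\comment` group. Escaping literal `\`, `{` and `}`, mapping tabs to `\tab`, and writing non-ASCII characters as `\uN?` all go beyond what the post describes, so treat those parts as assumptions to check against the RTF specification.

```python
# Header copied from the template above (without the \comment group).
RTF_HEADER = (
    r"{\rtf1\ansi\deff0" "\n"
    r"{\fonttbl{\f0\froman\fcharset0\fprq2{\*\panose 02020603050405020304}Times New Roman;}}" "\n"
    r"{\colortbl;\red0\green0\blue0;}" "\n"
    r"{\ltrch\fcs0 "
)


def escape_rtf(text: str) -> str:
    """Escape plain text so it can sit inside an RTF group."""
    out = []
    for ch in text.replace("\r\n", "\n"):
        if ch in "\\{}":
            out.append("\\" + ch)      # RTF's own special characters
        elif ch == "\n":
            out.append("\\par\n")      # step 2: end of line -> \par
        elif ch == "\t":
            out.append("\\tab ")       # assumed replacement for a tab
        elif ord(ch) < 128:
            out.append(ch)
        else:
            # Non-ASCII: \uN? where N is a signed 16-bit UTF-16 code unit.
            units = ch.encode("utf-16-le")
            for i in range(0, len(units), 2):
                n = int.from_bytes(units[i:i + 2], "little", signed=True)
                out.append(f"\\u{n}?")
    return "".join(out)


def text_to_rtf(text: str) -> str:
    """Step 1 (header), step 2 (escaped text), step 3 (closing braces)."""
    return RTF_HEADER + escape_rtf(text) + "}}"
```

To use it, read the .txt file into a string, pass it to `text_to_rtf`, and write the result to a file with an .rtf extension.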


                • macklin01
                  New Member
                  • Aug 2005
                  • 145

                  #9
                  I just noticed that the {\comment{blah} } stuff doesn't work well in OpenOffice. For maximum compatibility, remove that. -- Paul


                  • gokul
                    New Member
                    • Sep 2006
                    • 6

                    #10
                    Hi,

                    Thanks for your reply. When we convert .doc files to RTF, the original document formatting gets completely changed in the RTF files.
                    I want to add some text to the end of the document without changing any formatting.
                    My documents contain text, images, headers and footers.

                    My project uses a client-server setup (PHP on the client side and C code on the server). The PHP page requests files by file name, and the C code and Unix shell scripts do the processing. The files are stored on the Linux machine in .doc binary format. Depending on which file is requested, I need to add some text to the end of that file.

                    Please tell me: is there any command to add text to binary-format .doc files?

                    Regards,
                    Gokul.N


                    • macklin01
                      New Member
                      • Aug 2005
                      • 145

                      #11
                      Hmmm, that's a very good question. (Sounds like an interesting setup!) The problem (or one of them) is that you cannot simply insert extra text into the "text portion" of the binary file; changing the length of text fields requires other changes in the binary data, likely including field sizes, etc. A good analog is the BMP bitmap format: if you add a row of pixels, information in the file header also needs to be changed for consistency.
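
To make the BMP analogy concrete, here is a minimal Python sketch. The helper names `make_bmp` and `append_row` are hypothetical. The code assumes an uncompressed 24-bit BMP with a 54-byte header (a 14-byte file header followed by a 40-byte BITMAPINFOHEADER), a positive height, and pixel data that runs to the end of the file. Appending one row of pixels means three header fields have to change along with the data; a .doc file presumably needs comparable bookkeeping, as the post suggests.

```python
import struct


def make_bmp(width: int, height: int, pixel: bytes = b"\x00\x00\xff") -> bytes:
    """Build a tiny single-colour 24-bit BMP (pixel is B, G, R)."""
    row_size = (width * 3 + 3) & ~3                  # rows are padded to 4 bytes
    data = (pixel * width).ljust(row_size, b"\x00") * height
    file_header = struct.pack("<2sIHHI", b"BM", 54 + len(data), 0, 0, 54)
    info_header = struct.pack("<IiiHHIIiiII", 40, width, height, 1, 24,
                              0, len(data), 2835, 2835, 0, 0)
    return file_header + info_header + data


def append_row(bmp: bytes, pixel: bytes = b"\xff\xff\xff") -> bytes:
    """Append one row of pixels and fix the header fields that depend on it."""
    (file_size,) = struct.unpack_from("<I", bmp, 2)      # total file size
    width, height = struct.unpack_from("<ii", bmp, 18)   # image dimensions
    (image_size,) = struct.unpack_from("<I", bmp, 34)    # pixel data size
    row_size = (width * 3 + 3) & ~3
    row = (pixel * width).ljust(row_size, b"\x00")

    out = bytearray(bmp + row)
    struct.pack_into("<I", out, 2, file_size + row_size)
    struct.pack_into("<i", out, 22, height + 1)
    struct.pack_into("<I", out, 34, image_size + row_size)
    return bytes(out)
```

Writing only `bmp + row` without the three `pack_into` calls would leave a file whose header disagrees with its contents, which is the kind of inconsistency the post warns about.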

                      I'm afraid I can't be of much further help at the moment. If I were to approach this project, however, I'd look at some of the projects I linked to you, and try to set up a server-side scripting of OpenOffice using those tools. Something like this:

                      1) Create a text file with the text you want to append / merge to the existing documents.

                      2) Script OpenOffice to convert the text file to .doc format.

                      3) Script OpenOffice to merge the documents.

                      Of course, (2)-(3) are the hard parts, but again, I believe that some of the projects I linked to you are capable of manipulating OpenOffice from the command line. I'd dig further, but I'm afraid that I'm out of time to spend on this, as I need to get back to my dissertation writing, etc.
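
Step (2) might be scripted along the following lines. This sketch rests on an assumption not made in the thread: that a LibreOffice (a successor of OpenOffice) `soffice` binary supporting the `--headless`, `--convert-to` and `--outdir` options is installed and on the PATH. The function names are hypothetical, and step (3), merging the documents, is not covered.

```python
import subprocess
from pathlib import Path


def build_convert_command(txt_path: str, out_dir: str) -> list:
    """Command line asking a headless office process to write a .doc file."""
    return ["soffice", "--headless", "--convert-to", "doc",
            "--outdir", out_dir, txt_path]


def expected_output(txt_path: str, out_dir: str) -> Path:
    """Where the converted file is expected: same base name, .doc suffix."""
    return Path(out_dir) / (Path(txt_path).stem + ".doc")


def convert_txt_to_doc(txt_path: str, out_dir: str) -> Path:
    """Run the conversion; raises CalledProcessError if soffice fails."""
    subprocess.run(build_convert_command(txt_path, out_dir), check=True)
    return expected_output(txt_path, out_dir)
```

Because the call needs no user interaction, it could be invoked from a shell script or from the server-side code each time a file is requested.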

                      If anybody else has some help to provide, I'm sure it would be appreciated!! Otherwise, good luck!!! -- Paul

                      PS: If you did manage to convert the original documents to RTF, then I believe you could append the extra text via

                      Code:
                      \ltrch\fcs0
                      \par
                      \par 
                      Append text here.
                      }}
                      at the very end of the file, just prior to the last "}}" (the "}}" shown in the snippet is the file's existing closing pair, not something to add again). -- Paul
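
A minimal Python sketch of that append step follows. The names `escape_rtf` and `append_to_rtf` are hypothetical. It assumes the RTF file ends with "}}", as the simple template earlier in this thread does, and that the appended text is plain ASCII; RTF written by a word processor may be structured differently, so check a real file first.

```python
def escape_rtf(text: str) -> str:
    """Escape RTF special characters and turn newlines into \\par."""
    out = []
    for ch in text:
        if ch in "\\{}":
            out.append("\\" + ch)
        elif ch == "\n":
            out.append("\\par\n")
        else:
            out.append(ch)
    return "".join(out)


def append_to_rtf(rtf: str, extra: str) -> str:
    """Insert extra text just before the final '}}' of an RTF document."""
    body = rtf.rstrip()
    if not body.endswith("}}"):
        raise ValueError("expected the RTF to end with '}}'")
    # Same control words as the snippet above, then the new text.
    addition = "\\ltrch\\fcs0\n\\par\n\\par\n" + escape_rtf(extra)
    return body[:-2] + addition + "}}"
```

The function refuses to guess when the file does not end the way it expects, since inserting text at the wrong brace depth could corrupt the document.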

