How to read a Word file using Java

  • sang
    New Member
    • Sep 2006
    • 83

    How can I read a .doc file into a printable format (i.e. without the ASCII codes)?

    It's urgent, please can anyone help me?


    Thanks
    Sang
  • evilchia
    New Member
    • Sep 2006
    • 7

    #2
    .doc files are part binary, part text. The easiest way to "read" a Microsoft Word document is to use the OpenOffice APIs (http://www.openoffice.org) if you want a formattable/printable document; if you just want the raw bytes, use a ByteArrayInputStream. The OpenOffice APIs are _not_ the easiest things to use, but they are a whole lot better than trying to parse through all the crap yourself.
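
    To make the "part binary" point concrete, here is a minimal sketch that checks whether a file starts with the 8-byte OLE2 compound-file signature, the container format used by binary Word 97-2003 .doc files. The class name DocSignature, the method looksLikeOle2 and the default path "file.doc" are made up for illustration; this only identifies the container and extracts no text.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Arrays;

// Hypothetical helper, not part of any library.
class DocSignature {

    // First 8 bytes of an OLE2 compound file (the binary .doc container).
    private static final byte[] OLE2_MAGIC = {
        (byte) 0xD0, (byte) 0xCF, 0x11, (byte) 0xE0,
        (byte) 0xA1, (byte) 0xB1, 0x1A, (byte) 0xE1
    };

    // Returns true if the stream begins with the OLE2 signature.
    static boolean looksLikeOle2(InputStream in) throws IOException {
        byte[] header = new byte[OLE2_MAGIC.length];
        int read = 0;
        // read() may return fewer bytes than requested, so loop until full
        while (read < header.length) {
            int n = in.read(header, read, header.length - read);
            if (n == -1) {
                return false; // file is shorter than the signature
            }
            read += n;
        }
        return Arrays.equals(header, OLE2_MAGIC);
    }

    public static void main(String[] args) throws IOException {
        // "file.doc" is an assumed default path
        String path = args.length > 0 ? args[0] : "file.doc";
        try (InputStream in = Files.newInputStream(Paths.get(path))) {
            System.out.println(path + " looks like OLE2: " + looksLikeOle2(in));
        }
    }
}
```

    If the check passes, the text is buried inside a structured binary container, which is why printing the bytes directly produces mostly unreadable output.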

    • sang
      New Member
      • Sep 2006
      • 83

      #3
      Thank you for your reply.

      I was not able to understand your answer, so please give the Java code.
      Once again, thanks for your reply.
      I am waiting for your answer.

      Thanks,
      Sang.

      • r035198x
        MVP
        • Sep 2006
        • 13225

        #4
        What he/she meant is that it's easier to use a third-party package to read .doc files. If you do not have one that reads .doc files, then visit the site he/she gave or google POI.

        • sang
          New Member
          • Sep 2006
          • 83

          #5
          The code below reads the document file, but the output contains special characters and ASCII values. I want only the readable characters (i.e. a-z and 0-9). How is this possible? Please give a solution.

          Example code:

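          import java.io.*;

          class fileinput
          {
              public static void main(String[] args) throws IOException
              {
                  // try-with-resources closes the stream when reading is done
                  try (FileInputStream fin = new FileInputStream("file.doc"))
                  {
                      int j;
                      // read one byte at a time until end of file (-1)
                      while ((j = fin.read()) != -1)
                          System.out.print((char) j);
                  }
              }
          }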

          Thanks
          Sang.
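
          For the request above (keeping only the readable characters), a minimal sketch of a byte filter could look like the following. The class name ReadableFilter and the method keepReadable are made up for illustration, and keeping spaces and newlines in addition to a-z, A-Z and 0-9 is an assumption. This is a crude approach: binary bytes that happen to look like letters or digits survive, so the output will still contain noise, and a .doc parser remains the more reliable route.

```java
import java.io.BufferedInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Paths;

// Hypothetical helper, not part of any library.
class ReadableFilter {

    // Keeps ASCII letters, digits, spaces and newlines; drops every other byte.
    static String keepReadable(InputStream in) throws IOException {
        StringBuilder out = new StringBuilder();
        int b;
        while ((b = in.read()) != -1) {
            char c = (char) b;
            boolean alnum = (c >= 'a' && c <= 'z')
                    || (c >= 'A' && c <= 'Z')
                    || (c >= '0' && c <= '9');
            if (alnum || c == ' ' || c == '\n') {
                out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) throws IOException {
        // "file.doc" matches the path used in the post; pass another path as the first argument
        String path = args.length > 0 ? args[0] : "file.doc";
        try (InputStream in = new BufferedInputStream(Files.newInputStream(Paths.get(path)))) {
            System.out.println(keepReadable(in));
        }
    }
}
```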

          • dipakpatil26
            New Member
            • Mar 2008
            • 1

            #6
            Hi friend,
            I used the Jakarta POI library (now Apache POI) to read the doc file.
            This program simply reads the doc file and prints each paragraph to the console. I think it will help you read the doc file.

            --------------------------------------------------------------------------------
            import java.io.FileInputStream;

            import org.apache.poi.hwpf.HWPFDocument;
            import org.apache.poi.hwpf.extractor.WordExtractor;

            public class DocReader {

                public void readDocFile() {
                    // try-with-resources closes the stream even if parsing fails
                    try (FileInputStream fis = new FileInputStream("c:\\Resume.doc")) {
                        // HWPFDocument parses the binary .doc data from the stream
                        HWPFDocument doc = new HWPFDocument(fis);
                        WordExtractor docExtractor = new WordExtractor(doc);

                        // One array entry per paragraph of the document
                        String[] docArray = docExtractor.getParagraphText();

                        // Printing stays inside the try block so a failed read
                        // cannot lead to a NullPointerException here
                        for (int i = 0; i < docArray.length; i++) {
                            if (docArray[i] != null) {
                                System.out.println("Line " + i + " : " + docArray[i]);
                            }
                        }
                    } catch (Exception exep) {
                        System.out.println(exep.getMessage());
                    }
                }

                public static void main(String[] args) {
                    DocReader reader = new DocReader();
                    reader.readDocFile();
                }
            }
            -------------------------------------------------------------------------------------
