.doc files are part binary, part text. The easiest way to "read" a Microsoft Word document is to use the OpenOffice APIs (http://www.openoffice.org) if you want a formattable/printable document; if you just want the raw bytes, use a ByteArrayInputStream. The OpenOffice APIs are _not_ the easiest things to use, but they are a whole lot better than trying to parse through all the crap yourself.
Originally posted by sang
How do I read the .doc file into a printable format (i.e. without the ASCII codes)?
But I am not able to understand your answer. Please give the Java code.
Once again, thanks for your reply.
I am waiting for your answer.
Thanks,
Sang.
What he/she meant is that it's easier to use a third-party package to read .doc files. If you do not have one that reads .doc files, visit the site he has given or google POI.
The code below reads the document file, but the output contains special characters and ASCII values. I want only the readable characters (i.e. a-z and 0-9). How is that possible? Please give a solution.
Example code:
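import java.io.FileInputStream;
import java.io.IOException;

class FileInput
{
    public static void main(String[] args) throws IOException
    {
        // Reads file.doc one byte at a time and prints each byte as a char,
        // so the binary parts of the file show up as special characters.
        try (FileInputStream fin = new FileInputStream("file.doc"))
        {
            int j;
            while ((j = fin.read()) != -1)
                System.out.print((char) j);
        }
    }
}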
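If all you need is the a-z and 0-9 characters from the raw bytes, a rough sketch of that filter is below. The class ReadableText and the method keepReadable are hypothetical names made up for this example, and keeping spaces and line breaks is an assumption on my part so that words stay separated. It does not understand the .doc format at all, so expect leftover fragments from the binary parts, and text that Word stored in a non-ASCII encoding will not come out cleanly.

```java
import java.io.BufferedInputStream;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical helper, not part of any library: keeps only a-z, A-Z, 0-9,
// spaces and line breaks from a byte stream and drops every other byte.
class ReadableText {

    static String keepReadable(InputStream in) throws IOException {
        StringBuilder out = new StringBuilder();
        int b;
        while ((b = in.read()) != -1) {
            // Assumption: treat a carriage return as a line break.
            if (b == '\r') {
                b = '\n';
            }
            boolean letter = (b >= 'a' && b <= 'z') || (b >= 'A' && b <= 'Z');
            boolean digit = b >= '0' && b <= '9';
            if (letter || digit || b == ' ' || b == '\n') {
                out.append((char) b);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) throws IOException {
        // Same file name as in the question unless a path is passed in.
        String path = args.length > 0 ? args[0] : "file.doc";
        try (InputStream in = new BufferedInputStream(new FileInputStream(path))) {
            System.out.println(keepReadable(in));
        }
    }
}
```

This only hides the unreadable bytes. For the actual document text, a library that parses the format (such as the ones suggested in this thread) is still the better route.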
Hi friend,
I have used the Jakarta POI library to read the doc file.
This program simply reads the doc file and prints each line on the console. I think it will help you read the doc file.