I'm able to read the txt format, but I don't understand how to read doc & pdf files. What is the format of each, and how do I implement a read procedure for these file formats?
How to read a doc & pdf file in C++
Originally posted by Bswapnil: I'm able to read the txt format, but I don't understand how to read doc & pdf files. What is the format of each, and how do I implement a read procedure for these file formats?
Have you tried googling?
Savage -
PDF files have a very specific format, which is documented by Adobe in rather lengthy specs. To read an arbitrary PDF file, you'll need to implement a lot of features of the standard.
DOC files are worse: they are (to some extent) a binary dump of MS Word's memory, +/- some formatting. MS does not document the format in any kind of public way, and even their "Open"XML format standard has things like "do this feature like Word 97" without further clarification. It has only been through extensive reverse-engineering that open source projects have figured out how to read/write them.
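To give a feel for how different the two formats are at the byte level, here is a minimal sketch that only identifies a file by its leading signature bytes. It does not parse anything. It assumes the usual signatures: PDF files begin with the ASCII text "%PDF-", and legacy binary .doc files are stored in the OLE compound file container, which begins with the bytes D0 CF 11 E0 A1 B1 1A E1. That container is also used by other legacy Office formats such as .xls and .ppt, so a match means "possibly a .doc", not "definitely a .doc". The names FileKind, sniff_bytes and sniff_file are made up for this illustration.

```cpp
#include <cstddef>
#include <cstring>
#include <fstream>
#include <string>

// Hypothetical classification used only for this sketch.
enum class FileKind { Pdf, OleCompound, Unknown };

// Classify a buffer by its leading signature ("magic") bytes.
FileKind sniff_bytes(const unsigned char* data, std::size_t size) {
    // ASCII "%PDF-" at the start of a PDF file.
    static const unsigned char kPdf[] = {'%', 'P', 'D', 'F', '-'};
    // Signature of the OLE compound file container used by legacy .doc
    // (and also by legacy .xls/.ppt, so this is not proof of a Word file).
    static const unsigned char kOle[] = {0xD0, 0xCF, 0x11, 0xE0,
                                         0xA1, 0xB1, 0x1A, 0xE1};

    if (size >= sizeof kPdf && std::memcmp(data, kPdf, sizeof kPdf) == 0)
        return FileKind::Pdf;
    if (size >= sizeof kOle && std::memcmp(data, kOle, sizeof kOle) == 0)
        return FileKind::OleCompound;
    return FileKind::Unknown;
}

// Read up to the first 8 bytes of a file and classify them.
// Returns Unknown if the file cannot be opened.
FileKind sniff_file(const std::string& path) {
    std::ifstream in(path, std::ios::binary);  // binary mode: no newline translation
    if (!in) return FileKind::Unknown;

    unsigned char head[8] = {};
    in.read(reinterpret_cast<char*>(head), sizeof head);
    // gcount() is the number of bytes actually read (may be fewer than 8).
    return sniff_bytes(head, static_cast<std::size_t>(in.gcount()));
}
```

Calling sniff_file("report.pdf") on a typical PDF should return FileKind::Pdf. Everything past the signature (objects, cross-reference tables, compressed streams in PDF; sector chains and the Word document stream in .doc) is where the real work is.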
In general, to be able to read any kind of file with a complex format, you'll need to understand and implement its standard. Doing that yourself means reinventing the wheel; instead, you should look at other projects that have already done this, i.e., look to the open source community.
MS may have some specific Win32 functions to do some of this, if you're willing to restrict yourself to Windows platforms only. -- Paul