Re: check if file is MS Word or PDF file

  • Michael Crute

    On Sat, Sep 27, 2008 at 7:01 PM, Chris Rebert <clp@rebertia.com> wrote:
    > Looking at the docs for the mimetypes module, it just guesses based on
    > the filename (and extension), not the actual contents of the file, so
    > it doesn't really help the OP, who wants to make sure their program
    > isn't misled by an inaccurate extension.

    One other way to detect a PDF is to just read the first 5 bytes from
    the file. Valid PDF files start with "%PDF-". Something similar can be
    done with Word docs, but I don't know what the magic bytes are. This
    approach is pretty similar to what the file command does, but is
    probably a better approach if you have to support multiple platforms.
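
A rough sketch of that check in Python, in case it helps. The function name `sniff_file_type` and the return labels are made up for illustration. The two Word-related signatures are an assumption not taken from this thread: legacy .doc files are OLE2 compound documents, which begin with the bytes D0 CF 11 E0 A1 B1 1A E1, and .docx files are ZIP archives, which begin with "PK\x03\x04". Neither signature is unique to Word (Excel and PowerPoint share the OLE2 container, and any ZIP archive matches the second), so treat a match as "possibly Word" rather than proof.

```python
PDF_MAGIC = b"%PDF-"
# Assumption (not from the thread): OLE2 compound file signature, used by
# legacy .doc files but also by .xls, .ppt and other OLE2 documents.
OLE2_MAGIC = bytes.fromhex("D0CF11E0A1B11AE1")
# Assumption (not from the thread): ZIP local file header, used by .docx
# but also by every other ZIP-based format.
ZIP_MAGIC = b"PK\x03\x04"


def sniff_file_type(path):
    """Guess a file's type from its leading bytes, ignoring the extension.

    Returns "pdf", "ole2", "zip", or None if nothing matches.
    """
    # Open in binary mode so the bytes are compared exactly as stored.
    with open(path, "rb") as f:
        header = f.read(8)  # the longest signature above is 8 bytes
    if header.startswith(PDF_MAGIC):
        return "pdf"
    if header.startswith(OLE2_MAGIC):
        return "ole2"
    if header.startswith(ZIP_MAGIC):
        return "zip"
    return None
```

Calling `sniff_file_type("report.doc")` on a file that is really a PDF with the wrong extension would return "pdf", which is the case the OP is worried about.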

    -mike

    --
    ________________________________
    Michael E. Crute


    God put me on this earth to accomplish a certain number of things.
    Right now I am so far behind that I will never die. --Bill Watterson