On Sat, Sep 27, 2008 at 7:01 PM, Chris Rebert <clp@rebertia.com> wrote:
One other way to detect a PDF is to just read the first 5 bytes from
the file. Valid PDF files start with "%PDF-". Something similar can be
done with Word docs but I don't know what the magic bytes are. This
approach is pretty similar to what the file command does but is
probably a better approach if you have to support multiple platforms.
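
A minimal sketch of that check in Python, assuming the signature sits at the very start of the file as described above. The names PDF_MAGIC, has_pdf_signature and looks_like_pdf are made up for illustration and do not come from any library:

```python
PDF_MAGIC = b"%PDF-"  # the 5-byte signature mentioned above


def has_pdf_signature(header):
    """Return True if the given bytes begin with the PDF signature."""
    return header.startswith(PDF_MAGIC)


def looks_like_pdf(path):
    """Read only the first few bytes of the file and test them."""
    # Binary mode matters: no newline translation or text decoding.
    with open(path, "rb") as f:
        return has_pdf_signature(f.read(len(PDF_MAGIC)))
```

Since it only uses open() and a bytes comparison, it should behave the same on any platform. A matching signature only says the file claims to be a PDF, not that the rest of it is well-formed.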
-mike
--
________________________________
Michael E. Crute
God put me on this earth to accomplish a certain number of things.
Right now I am so far behind that I will never die. --Bill Watterson
Looking at the docs for the mimetypes module, it just guesses based on
the filename (and extension), not the actual contents of the file, so
it doesn't really help the OP, who wants to make sure their program
isn't misled by an inaccurate extension.
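
A quick way to see the difference, as a rough sketch: write plain text to a file with a .pdf name, then compare what mimetypes guesses with what the first bytes say. The file name not_really.pdf and the variable names are just placeholders:

```python
import mimetypes
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    # A text file wearing a misleading extension.
    path = os.path.join(tmp, "not_really.pdf")
    with open(path, "wb") as f:
        f.write(b"just some plain text\n")

    # guess_type() looks only at the name, never at the contents.
    mime_guess = mimetypes.guess_type(path)[0]

    # Checking the leading bytes looks at the actual data.
    with open(path, "rb") as f:
        starts_like_pdf = f.read(5) == b"%PDF-"

print(mime_guess, starts_like_pdf)
```

With the default type map the guess should come back as a PDF type even though the contents are not a PDF, while the byte check says no.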