Re: check if file is MS Word or PDF file

  • Chris Rebert

    Re: check if file is MS Word or PDF file

    On Sat, Sep 27, 2008 at 3:42 PM, Michael Crute <mcrute@gmail.com> wrote:
    > On Sat, Sep 27, 2008 at 5:43 PM, A. Joseph <joefazee@gmail.com> wrote:
    >> What should I look for in a file to determine whether or not it is a
    >> MS Word file or an Excel file or a PDF file, etc., etc.? including Zip
    >> files
    >>
    >> I don't want to check for file extension.
    >> os.path.splitext('Filename.jpg') will produce a tuple of filename and
    >> extension, but some file don't even have extension and can still be read by
    >> MS Word or NotePad. i want to be 100% sure of the file.
    >
    > You could use the mimetypes module...
    >
    > >>> import mimetypes
    > >>> mimetypes.guess_type("LegalNotices.pdf")
    > ('application/pdf', None)

    Looking at the docs for the mimetypes module, it just guesses based on
    the filename (and extension), not the actual contents of the file, so
    it doesn't really help the OP, who wants to make sure their program
    isn't misled by an inaccurate extension.
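
To illustrate the point above, here is a minimal sketch (Python 3 syntax; the helper name `guess_from_name` and the file names are made up for illustration). It shows that `mimetypes.guess_type` never opens the file, so the name passed in does not even have to exist on disk:

```python
import mimetypes


def guess_from_name(filename):
    """Return the MIME type that mimetypes infers from the name alone."""
    # guess_type returns a (type, encoding) pair; only the type is used here.
    mime_type, _encoding = mimetypes.guess_type(filename)
    return mime_type


# Neither of these files needs to exist: only the extension is inspected.
print(guess_from_name("no_such_file_on_disk.pdf"))
print(guess_from_name("file_without_extension"))
```

A text file renamed to end in `.pdf` would therefore still be reported as a PDF, which is exactly the case the OP wants to rule out.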

    Regards,
    Chris
    --
    Follow the path of the Iguana...

    > -mike
    >
    > --
    > ________________________________
    > Michael E. Crute
    >
    > God put me on this earth to accomplish a certain number of things.
    > Right now I am so far behind that I will never die. --Bill Watterson

  • Sean DiZazzo

    #2
    Re: check if file is MS Word or PDF file

    On Sep 27, 4:01 pm, "Chris Rebert" <c...@rebertia.com> wrote:
    > On Sat, Sep 27, 2008 at 3:42 PM, Michael Crute <mcr...@gmail.com> wrote:
    >> On Sat, Sep 27, 2008 at 5:43 PM, A. Joseph <joefa...@gmail.com> wrote:
    >>> What should I look for in a file to determine whether or not it is a
    >>> MS Word file or an Excel file or a PDF file, etc., etc.? including Zip
    >>> files
    >>>
    >>> I don't want to check for file extension.
    >>> os.path.splitext('Filename.jpg') will produce a tuple of filename and
    >>> extension, but some file don't even have extension and can still be read by
    >>> MS Word or NotePad. i want to be 100% sure of the file.
    >>
    >> You could use the mimetypes module...
    >>
    >> >>> import mimetypes
    >> >>> mimetypes.guess_type("LegalNotices.pdf")
    >> ('application/pdf', None)
    >
    > Looking at the docs for the mimetypes module, it just guesses based on
    > the filename (and extension), not the actual contents of the file, so
    > it doesn't really help the OP, who wants to make sure their program
    > isn't misled by an inaccurate extension.
    >
    > Regards,
    > Chris
    > --
    > Follow the path of the Iguana...
    > http://rebertia.com
    >
    >> -mike
    >>
    >> --
    >> ________________________________
    >> Michael E. Crute
    >> http://mike.crute.org
    >>
    >> God put me on this earth to accomplish a certain number of things.
    >> Right now I am so far behind that I will never die. --Bill Watterson
    >> --
    >> http://mail.python.org/mailman/listinfo/python-list

    Check http://sourceforge.net/project/showf...group_id=23617

    for the 'file' command for Windows.
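
For anyone who would rather stay inside Python, the same idea the `file` command relies on (looking at the first bytes of the content instead of the name) can be sketched roughly as below. This is only an illustration under assumptions, not how `file` itself is implemented: the names `SIGNATURES`, `sniff_bytes` and `sniff_file` are made up, the table covers just three well-known signatures, and it assumes the signature sits at offset 0.

```python
# Well-known leading byte signatures ("magic numbers").
# Hypothetical, deliberately tiny table; real tools know thousands of these.
SIGNATURES = [
    (b"%PDF-", "pdf"),                              # PDF header
    (b"PK\x03\x04", "zip"),                         # ZIP local file header
    (b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1", "ole2"),  # legacy MS Office container
]


def sniff_bytes(header):
    """Return a label for the first matching signature, or None."""
    for magic, label in SIGNATURES:
        if header.startswith(magic):
            return label
    return None


def sniff_file(path):
    """Read the first few bytes of the file at path and classify them."""
    with open(path, "rb") as handle:
        return sniff_bytes(handle.read(8))


print(sniff_bytes(b"%PDF-1.4\n"))        # content that starts like a PDF
print(sniff_bytes(b"just some text"))    # no known signature
```

Two caveats: the `ole2` signature only says the file is an old-style Office compound document, so telling Word from Excel needs a look inside the container, and `.docx`/`.xlsx` files are ZIP archives, so they show up as `zip` here. A header check also cannot give the 100% certainty asked for, since a file can start with the right bytes and still be corrupt.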

    ~Sean
