Opening MS Word files via Python

  • Fazer

    Opening MS Word files via Python

    Here comes another small question from me :-)

    I am curious as to how I should approach this issue. For now I just
    want to parse simple text, and perhaps tables in the future.
    Would I have to save the Word file and open it in a text editor? That
    would kind of... suck. Has anyone else tackled this issue?

    Thanks,
  • Rob Nikander

    #2
    Re: Opening MS Word files via Python

    Fazer wrote:
    > I am curious as to how I should approach this issue. For now I just
    > want to parse simple text, and perhaps tables in the future.
    > Would I have to save the Word file and open it in a text editor? That
    > would kind of... suck. Has anyone else tackled this issue?

    The win32 extensions for python allow you to get at the COM objects for
    applications like Word, and that would let you get the text and tables.
    google: win32 python.

    import win32com.client
    word = win32com.client.Dispatch('Word.Application')
    word.Documents.Open('C:\\myfile.doc')

    But I don't know the best way to find out the methods and properties of
    the "word" object.

    Rob
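A minimal sketch of how the two lines in the post above might be extended to return the document text and shut Word down afterwards. The helper name `read_word_text` and its optional `app` parameter are illustrative and not part of pywin32; the real-Word path assumes Windows with Word and the pywin32 package installed.

```python
def read_word_text(path, app=None):
    """Return the plain text of a Word document opened through COM.

    `app` is any object shaped like Word's Application object. When it is
    omitted, a real Word instance is started (Windows + pywin32 only).
    """
    owns_app = app is None
    if owns_app:
        import win32com.client  # imported lazily: only available on Windows
        app = win32com.client.Dispatch("Word.Application")
        app.Visible = False
    try:
        doc = app.Documents.Open(path)
        try:
            # Content is a Range object; its Text property is the string.
            return doc.Content.Text
        finally:
            doc.Close(False)  # False: do not save changes
    finally:
        if owns_app:
            app.Quit()  # otherwise a hidden Word process keeps running
```

Passing `app` in makes it possible to reuse one Word instance for many files, since starting Word for every document is slow.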

    • Simon Brunning

      #3
      Re: Opening MS Word files via Python

      faizan@jaredweb.com (Fazer) wrote in message
      > I am curious as to how I should approach this issue. For now I just
      > want to parse simple text, and perhaps tables in the future.
      > Would I have to save the Word file and open it in a text editor? That
      > would kind of... suck. Has anyone else tackled this issue?

      See http://aspn.activestate.com/ASPN/Coo.../Recipe/279003

      Cheers,
      Simon B.

      • jmdeschamps

        #4
        Re: Opening MS Word files via Python

        Rob Nikander <rnikaREMOVEnder@adelphia.net> wrote in message news:<i7-dnZNwpJ8TfhjdRVn-jg@adelphia.com>...
        > The win32 extensions for python allow you to get at the COM objects for
        > applications like Word, and that would let you get the text and tables.
        <snip>
        > But I don't know the best way to find out the methods and properties of
        > the "word" object.

        You can use the VBA documentation for Word and, with dot notation and
        the normal Pythonesque way of calling functions, play with its various
        objects, methods and attributes...
        Here's some pretty straightforward code along these lines:
        #************************
        import win32com.client
        from tkinter import filedialog  # this module was tkFileDialog in Python 2

        # Launch Word
        MSWord = win32com.client.Dispatch("Word.Application")
        MSWord.Visible = 0
        # Open a specific file and keep the Document object that Open() returns
        myWordDoc = filedialog.askopenfilename()
        doc = MSWord.Documents.Open(myWordDoc)
        # Get the textual content (Content is a Range; Text is its string)
        docText = doc.Content.Text
        # Get the collection of tables
        listTables = doc.Tables
        #************************

        Happy parsing,

        Jean-Marc
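As a rough sketch of what could be done with the `Tables` collection from the post above, the helper below turns one table into a list of rows. The names `clean_cell_text` and `table_to_rows` are made up for illustration. It relies on two details of Word's object model: cell coordinates are 1-based, and the text of every cell ends with a carriage return plus an end-of-cell marker (`"\r\x07"`). It also assumes a plain rectangular table; with merged cells, `Cell(row, column)` can fail for positions that no longer exist.

```python
CELL_END = "\r\x07"  # Word terminates each cell's text with CR + BEL


def clean_cell_text(text):
    """Strip Word's end-of-cell marker from a cell's Range.Text."""
    if text.endswith(CELL_END):
        return text[:-len(CELL_END)]
    return text


def table_to_rows(table):
    """Convert a Word Table COM object into a list of rows of strings."""
    rows = []
    # Word's Cell() takes 1-based (row, column) coordinates.
    for r in range(1, table.Rows.Count + 1):
        row = []
        for c in range(1, table.Columns.Count + 1):
            row.append(clean_cell_text(table.Cell(r, c).Range.Text))
        rows.append(row)
    return rows
```

With a real document this would be called as `[table_to_rows(t) for t in doc.Tables]`, where `doc` is the object returned by `Documents.Open`.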

        • Fazer

          #5
          Re: Opening MS Word files via Python

          jmdeschamps@cvm.qc.ca (jmdeschamps) wrote in message news:<3d06fae9.0404210536.3f277a37@posting.google.com>...
          > You can use the VBA documentation for Word and, with dot notation and
          > the normal Pythonesque way of calling functions, play with its various
          > objects, methods and attributes...
          > Here's some pretty straightforward code along these lines:
          <snip>


          That is Awesome! Thanks!

          How would I save something in Word format? I am guessing
          MSWord.Documents.Save(myWordDoc) or something along those lines?
          Where can I find more documentation? Thanks.
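The guess above is close, but in Word's object model a file name is passed to the document object rather than to the `Documents` collection: `doc.Save()` writes back to the file the document came from, and `doc.SaveAs(FileName, FileFormat)` writes it under a new name. A small sketch, assuming `doc` is the object returned by `Documents.Open`; the helper name `save_as_word` is illustrative, and the two constants are values from Word's `WdSaveFormat` enumeration.

```python
import os

WD_FORMAT_DOCUMENT = 0           # wdFormatDocument: classic .doc
WD_FORMAT_DOCUMENT_DEFAULT = 16  # wdFormatDocumentDefault: .docx in Word 2007+


def save_as_word(doc, path, file_format=WD_FORMAT_DOCUMENT):
    """Save a Word Document COM object under a new name; return the path used."""
    # Word resolves relative paths against its own default folder, not
    # Python's working directory, so pass an absolute path.
    full_path = os.path.abspath(path)
    doc.SaveAs(full_path, file_format)
    return full_path
```

Recent Word versions also offer `SaveAs2` with the same leading arguments; plain `SaveAs` is used here because it matches the Word versions of this thread's era.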

          • anon

            #6
            Re: Opening MS Word files via Python

            Fazer wrote...

            > jmdeschamps@cvm.qc.ca (jmdeschamps) wrote in message news:<3d06fae9.0404210536.3f277a37@posting.google.com>...
            >
            >> Rob Nikander <rnikaREMOVEnder@adelphia.net> wrote in message news:<i7-dnZNwpJ8TfhjdRVn-jg@adelphia.com>...
            <snip>
            >>> But I don't know the best way to find out the methods and properties of
            >>> the "word" object.
            <snip>
            > How would I save something in Word format? I am guessing
            > MSWord.Documents.Save(myWordDoc) or something along those lines?
            > Where can I find more documentation? Thanks.



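            Open MS Word and press Alt+F11 to open the Visual Basic editor, then F2
            to open the Object Browser, which lists Word's objects with their
            methods and properties.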
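Something similar can be attempted from the Python side. The sketch below collects the public names visible on a COM wrapper object. The helper name `com_members` is made up for illustration, and the lookup of `_prop_map_get_` and `_prop_map_put_` is an assumption about how pywin32's generated (early-bound) wrapper classes store their properties, so treat the result as a hint rather than a complete listing.

```python
def com_members(obj):
    """Return a sorted list of public member names found on a COM wrapper."""
    names = {name for name in dir(obj) if not name.startswith("_")}
    # Assumption: makepy-generated classes keep their COM properties in
    # these dictionaries rather than as ordinary Python attributes.
    for map_name in ("_prop_map_get_", "_prop_map_put_"):
        names.update(getattr(obj, map_name, {}))
    return sorted(names)
```

On a machine with Word, the wrapper classes can be generated with `win32com.client.gencache.EnsureDispatch("Word.Application")`, and `com_members(word.Documents)` should then list names such as `Open`. Objects created with plain `Dispatch` are late-bound and may show little or nothing this way.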




