using python to edit a word file?

  • John Salerno

    using python to edit a word file?

    I figured my first step is to install the win32 extension, which I did,
    but I can't seem to find any documentation for it. A couple of the links
    on Mark Hammond's site don't seem to work.

    Anyway, all I need to do is search in the Word document for certain
    strings and either delete them or replace them. Easy enough, if only I
    knew which function, etc. to use.

    Hope someone can push me in the right direction.

    Thanks.
  • Rob Wolfe

    #2
    Re: using python to edit a word file?

    John Salerno <johnjsal@NOSPAMgmail.com> writes:
    > I figured my first step is to install the win32 extension, which I
    > did, but I can't seem to find any documentation for it. A couple of
    > the links on Mark Hammond's site don't seem to work.
    >
    > Anyway, all I need to do is search in the Word document for certain
    > strings and either delete them or replace them. Easy enough, if only I
    > knew which function, etc. to use.
    >
    > Hope someone can push me in the right direction.

    Maybe this will be helpful:



    --
    Regards,
    Rob

    • John Salerno

      #3
      Re: using python to edit a word file?

      Rob Wolfe wrote:
      > John Salerno <johnjsal@NOSPAMgmail.com> writes:
      >
      >> I figured my first step is to install the win32 extension, which I
      >> did, but I can't seem to find any documentation for it. A couple of
      >> the links on Mark Hammond's site don't seem to work.
      >>
      >> Anyway, all I need to do is search in the Word document for certain
      >> strings and either delete them or replace them. Easy enough, if only I
      >> knew which function, etc. to use.
      >>
      >> Hope someone can push me in the right direction.
      >
      > Maybe this will be helpful:

      But if I save the file to text, won't it lose its formatting?

      More specifically, here's what I have: a four-page calendar, each page
      with three months on it. The months are in tables, which is why I don't
      think making a text file will help me here, because I'll lose all that.
      What I need to do is renumber all the dates, basically replacing a
      number with itself minus 1. So it's not a simple find/replace task, and
      there doesn't seem to be a way to do this in Word's find/replace feature
      (but if there is, please let me know!)

      • Gerhard Fiedler

        #4
        Re: using python to edit a word file?

        On 2006-08-10 15:15:34, John Salerno wrote:
        > I figured my first step is to install the win32 extension, which I did,
        > but I can't seem to find any documentation for it. A couple of the links
        > on Mark Hammond's site don't seem to work.
        >
        > Anyway, all I need to do is search in the Word document for certain
        > strings and either delete them or replace them. Easy enough, if only I
        > knew which function, etc. to use.
        >
        > Hope someone can push me in the right direction.

        When Word is installed, you have a few COM interfaces to Word. I'm not sure
        how to access these with Python (but documentation about using COM with
        Python should help you here), and I'm not sure whether what you want is
        available (but the Word COM documentation should help you with that).

        Gerhard

        • John Salerno

          #5
          Re: using python to edit a word file?

          John Salerno wrote:
          > But if I save the file to text, won't it lose its formatting?

          It looks like I can save it as an XML file and it will retain all the
          formatting. Now I just need to decipher where the dates are in all that
          mess and replace them, just using a normal text file! :)
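
One way to see where the dates sit in the saved XML is to list every text run that contains only digits. This is a minimal sketch, assuming the dates appear as <w:t> elements holding nothing but a number; the SAMPLE fragment and the find_numeric_runs helper are made up for illustration, and a real export from Word would be far larger and messier.

```python
import re

# Hypothetical fragment standing in for part of a calendar saved as Word XML.
SAMPLE = (
    '<w:tc><w:p><w:r><w:t>August</w:t></w:r></w:p></w:tc>'
    '<w:tc><w:p><w:r><w:t>10</w:t></w:r></w:p></w:tc>'
    '<w:tc><w:p><w:r><w:t>11</w:t></w:r></w:p></w:tc>'
)

# A text run whose entire content is one or more digits.
NUMERIC_RUN = re.compile(r'<w:t>(\d+)</w:t>')


def find_numeric_runs(xml_text):
    """Return (offset, number) pairs for every all-digit text run."""
    return [(m.start(1), int(m.group(1))) for m in NUMERIC_RUN.finditer(xml_text)]


runs = find_numeric_runs(SAMPLE)
for offset, number in runs:
    print(offset, number)
```

The month name is skipped because it is not all digits, so only the two day numbers are reported. Any other purely numeric text in the document (a year, for instance) would be reported too, so the list is worth checking before replacing anything.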

          • John Henry

            #6
            Re: using python to edit a word file?

            John Salerno wrote:
            > I figured my first step is to install the win32 extension, which I did,
            > but I can't seem to find any documentation for it. A couple of the links
            > on Mark Hammond's site don't seem to work.
            >
            > Anyway, all I need to do is search in the Word document for certain
            > strings and either delete them or replace them. Easy enough, if only I
            > knew which function, etc. to use.
            >
            > Hope someone can push me in the right direction.
            >
            > Thanks.

            The easiest way for me to do things like this is to do it in Word and
            record a VB Macro. For instance you will see something like this:

            Selection.Find.ClearFormatting
            Selection.Find.Replacement.ClearFormatting
            With Selection.Find
                .Text = "save it"
                .Replacement.Text = "dont save it"
                .Forward = True
                .Wrap = wdFindContinue
                .Format = False
                .MatchCase = False
                .MatchWholeWord = False
                .MatchByte = False
                .CorrectHangulEndings = False
                .MatchAllWordForms = False
                .MatchSoundsLike = False
                .MatchWildcards = False
                .MatchFuzzy = False
            End With
            Selection.Find.Execute Replace:=wdReplaceAll

            and then hand translate it to Win32 Python, like:

            from win32com.client import Dispatch, constants

            wordApp = Dispatch("Word.Application")
            wordDoc = wordApp.Documents.Add(...some word file name...)
            # Put the cursor at the start of the document so the search covers all of it.
            wordDoc.Range(0, 0).Select()
            sel = wordApp.Selection
            # Clear any formatting criteria left over from an earlier search.
            sel.Find.ClearFormatting()
            sel.Find.Replacement.ClearFormatting()
            sel.Find.Text = "save it"
            sel.Find.Replacement.Text = "dont save it"
            sel.Find.Forward = True
            sel.Find.Wrap = constants.wdFindContinue
            sel.Find.Format = False
            sel.Find.MatchCase = False
            sel.Find.MatchWholeWord = False
            sel.Find.MatchByte = False
            sel.Find.CorrectHangulEndings = False
            sel.Find.MatchAllWordForms = False
            sel.Find.MatchSoundsLike = False
            sel.Find.MatchWildcards = False
            sel.Find.MatchFuzzy = False
            # Execute is a method of Find itself; replace every match in one pass.
            sel.Find.Execute(Replace=constants.wdReplaceAll)
            wordDoc.SaveAs(...some word file name...)

            I can't say that this works as typed because I haven't tried it myself,
            but it should give you a good start.

            Make sure you run the makepy.py program in the
            \python23\lib\site-packages\win32com\client directory and install the
            "MS Word 11.0 Object Library (8.3)" (or something equivalent). On my
            computers, this is not installed automatically and I have to remember
            to do it myself or else things won't work.

            Good Luck.
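
For readers who want a more compact variant of the translation above, here is a hedged sketch that searches the document's Content range instead of the Selection and uses gencache.EnsureDispatch, which builds the makepy support on demand. It has not been run against a real Word installation here. The function names replace_all and edit_word_file are hypothetical, the constants are written out as their numeric values (wdFindContinue is 1, wdReplaceAll is 2), and it assumes Windows with Word and pywin32 installed.

```python
# Numeric values of the Word constants wdFindContinue and wdReplaceAll.
WD_FIND_CONTINUE = 1
WD_REPLACE_ALL = 2


def replace_all(doc, find_text, replace_text):
    """Replace every occurrence of find_text in a Word document object."""
    find = doc.Content.Find  # Content is a Range covering the main text
    find.ClearFormatting()
    find.Replacement.ClearFormatting()
    return find.Execute(
        FindText=find_text,
        ReplaceWith=replace_text,
        Forward=True,
        Wrap=WD_FIND_CONTINUE,
        Replace=WD_REPLACE_ALL,
    )


def edit_word_file(path, find_text, replace_text):
    """Open a document in Word, run the replacement, save, and quit Word."""
    # Imported here so the module still loads where pywin32 is not installed.
    from win32com.client import gencache

    word = gencache.EnsureDispatch("Word.Application")
    try:
        doc = word.Documents.Open(path)
        replace_all(doc, find_text, replace_text)
        doc.Save()
        doc.Close()
    finally:
        word.Quit()
```

Keeping replace_all separate from the code that starts Word means the search logic can be exercised with a stand-in document object. Word generally expects a full path in Documents.Open, so pass an absolute file name to edit_word_file.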

            • Anthra Norell

              #7
              Re: using python to edit a word file?

              John,

              I have a notion about translating stuff in a mess and could help you with the translation. But it may be that the conversion
              from DOC to formatted text is a bigger problem. Loading the files into Word and saving them in a different format may not be a
              practical option if you have many files to do. Googling for batch converters from DOC to RTF, I couldn't find anything.
              If you can solve the conversion problem, pass me a sample file. I'll solve the translation problem for you.

              Frederic


              ----- Original Message -----
              From: "John Salerno" <johnjsal@NOSPAMgmail.com>
              Newsgroups: comp.lang.python
              To: <python-list@python.org>
              Sent: Thursday, August 10, 2006 9:08 PM
              Subject: Re: using python to edit a word file?

              John Salerno wrote:
              > But if I save the file to text, won't it lose its formatting?

              It looks like I can save it as an XML file and it will retain all the
              formatting. Now I just need to decipher where the dates are in all that
              mess and replace them, just using a normal text file! :)
              --
              http://mail.python.org/mailman/listinfo/python-list

              • John Salerno

                #8
                Re: using python to edit a word file?

                Anthra Norell wrote:
                > John,
                >
                > I have a notion about translating stuff in a mess and could help you with the translation. But it may be that the conversion
                > from DOC to formatted text is a bigger problem. Loading the files into Word and saving them in a different format may not be a
                > practical option if you have many files to do. Googling for batch converters from DOC to RTF, I couldn't find anything.
                > If you can solve the conversion problem, pass me a sample file. I'll solve the translation problem for you.
                >
                > Frederic

                What I ended up doing was just saving the Word file as an XML file, and
                then writing a little script to process the text file. Then when it
                opens back in Word, all the formatting remains. The script isn't ideal,
                but it did the bulk of changing the numbers, and then I did a few things
                by hand. I love having Python for these chores! :)



                import re

                xml_file = open('calendar.xml')
                xml_data = xml_file.read()
                xml_file.close()

                pattern = re.compile(r'<w:t>(\d+)</w:t>')

                def subtract(match_obj):
                    date = int(match_obj.group(1)) - 1
                    return '<w:t>%s</w:t>' % date

                new_data = re.sub(pattern, subtract, xml_data)

                new_file = open('calendar2007.xml', 'w')
                new_file.write(new_data)
                new_file.close()
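
To see what the substitution does without touching any files, the same idea can be tried on a small in-memory string. This is only a sketch: the sample text and the names subtract_one and renumber are made up for illustration, and it assumes each date sits alone in its own <w:t> element.

```python
import re

# Same pattern as in the script: a text run containing only digits.
PATTERN = re.compile(r'<w:t>(\d+)</w:t>')


def subtract_one(match):
    """Rebuild the matched text run with its number reduced by one."""
    return '<w:t>%d</w:t>' % (int(match.group(1)) - 1)


def renumber(xml_text):
    """Decrement every all-digit <w:t> run; other runs are left alone."""
    return PATTERN.sub(subtract_one, xml_text)


sample = '<w:t>July</w:t><w:t>1</w:t><w:t>2</w:t><w:t>31</w:t>'
result = renumber(sample)
print(result)
# prints: <w:t>July</w:t><w:t>0</w:t><w:t>1</w:t><w:t>30</w:t>
```

Note that a 1 turns into 0, and any other purely numeric run would be decremented as well, so cases like the first day of a month still need attention by hand.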

                • Anthra Norell

                  #9
                  Re: using python to edit a word file?

                  No one could do it any better. Good for you! - Frederic

                  ----- Original Message -----
                  From: "John Salerno" <johnjsal@NOSPAMgmail.com>
                  Newsgroups: comp.lang.python
                  To: <python-list@python.org>
                  Sent: Friday, August 11, 2006 4:08 PM
                  Subject: Re: using python to edit a word file?

                  Anthra Norell wrote:
                  > John,
                  >
                  > I have a notion about translating stuff in a mess and could help you with the translation. But it may be that the conversion
                  > from DOC to formatted text is a bigger problem. Loading the files into Word and saving them in a different format may not be a
                  > practical option if you have many files to do. Googling for batch converters from DOC to RTF, I couldn't find anything.
                  > If you can solve the conversion problem, pass me a sample file. I'll solve the translation problem for you.
                  >
                  > Frederic

                  What I ended up doing was just saving the Word file as an XML file, and
                  then writing a little script to process the text file. Then when it
                  opens back in Word, all the formatting remains. The script isn't ideal,
                  but it did the bulk of changing the numbers, and then I did a few things
                  by hand. I love having Python for these chores! :)

                  import re

                  xml_file = open('calendar.xml')
                  xml_data = xml_file.read()
                  xml_file.close()

                  pattern = re.compile(r'<w:t>(\d+)</w:t>')

                  def subtract(match_obj):
                      date = int(match_obj.group(1)) - 1
                      return '<w:t>%s</w:t>' % date

                  new_data = re.sub(pattern, subtract, xml_data)

                  new_file = open('calendar2007.xml', 'w')
                  new_file.write(new_data)
                  new_file.close()
