newbie - merging xls files using xldt and xlwt

Collapse
This topic is closed.
X
X
 
  • Time
  • Show
Clear All
new posts
  • Albert-jan Roskam

    newbie - merging xls files using xldt and xlwt

    Hi,

    I wrote the program below to merge all xls files in a given directory into one multisheet xls file. It uses xlwt and xlrd. The xls files I use for input are generated by Spss. When I open and re-save the files in Excel, the program works, but when I use the xls files as they were created by Spss, the program returns an error message (see below). Some details: Python 2.4, xlwt version 0.7.0, xlrd version 0.5.2, Win NT.

    Any ideas?

    Thanks in advance!

    Albert-Jan


    """
    Merge all xls files in a given directory into one multisheet xls file.
    The sheets get the orginal file name, without the extension.
    File names should not exceed 31 characters, as this is the maximum
    sheet name length
    """

    import xlrd, xlwt
    import glob, os.path

    def merge_xls (in_dir, out_file="d:/merged_output.x ls"):

    xls_files = glob.glob(in_di r + "*.xls")
    sheet_names = []
    merged_book = xlwt.Workbook()

    [sheet_names.app end(os.path.bas ename(v)[:-4]) for k, v in enumerate(xls_f iles)]
    for k, xls_file in enumerate(xls_f iles):
    if len (sheet_names[k]) <= 31:
    book = xlrd.open_workb ook(xls_file)
    ws = merged_book.add _sheet(sheet_na mes[k])
    for sheetx in range(book.nshe ets):
    sheet = book.sheet_by_i ndex(sheetx)
    for rx in range(sheet.nro ws):
    for cx in range(sheet.nco ls):
    ws.write(rx, cx, sheet.cell_valu e(rx, cx))
    else:
    print "File name too long: <%s.xls(maxim um is 31 chars) " % (sheet_names[k])
    print "File <%s.xlsis *not* included in the merged xls file." % (sheet_names[k])
    merged_book.sav e(out_file)

    print "---Merged xls file written to %s using the following source files: " % (out_file)
    for k, v in enumerate(sheet _names):
    if len(v) <= 31: print "\t", str(k+1).zfill( 3), "%s.xls" % (v)

    merge_xls(in_di r="d:/temp/")

    *** WARNING: OLE2 inconsistency: SSCS size is 0 but SSAT size is non-zero ***
    put_cell 0 1

    Traceback (most recent call last):
    File "G:\generic_syn taxes\merge_xls .py", line 37, in -toplevel-
    merge_xls(in_di r="d:/temp/")
    File "G:\generic_syn taxes\merge_xls .py", line 21, in merge_xls
    book = xlrd.open_workb ook(xls_file)
    File "C:\Python24\li b\site-packages\xlrd\_ _init__.py", line 139, in open_workbook
    bk.get_sheets()
    File "C:\Python24\li b\site-packages\xlrd\_ _init__.py", line 389, in get_sheets
    sht = self.get_sheet( sheetno)
    File "C:\Python24\li b\site-packages\xlrd\_ _init__.py", line 379, in get_sheet
    sh.read(self)
    File "C:\Python24\li b\site-packages\xlrd\s heet.py", line 285, in read
    self.put_cell(r owx, colx, XL_CELL_TEXT, bk._sharedstrin gs[index])
    File "C:\Python24\li b\site-packages\xlrd\s heet.py", line 214, in put_cell
    self._cell_type s[rowx][colx] = ctype
    IndexError: list assignment index out of range



  • John Machin

    #2
    Re: newbie - merging xls files using xldt and xlwt

    On Oct 15, 9:16 pm, Albert-jan Roskam <fo...@yahoo.co mwrote:
    Hi,
    >
    I wrote the program below to merge all xls files in a given directory into one multisheet xls file. It uses xlwt and xlrd. The xls files I use for input are generated by Spss. When I open and re-save the files in Excel, theprogram works, but when I use the xls files as they were created by Spss, the program returns an error message (see below). Some details: Python 2.4,xlwt version 0.7.0, xlrd version 0.5.2, Win NT.
    >
    Any ideas?
    The version of xlrd that you are using is an antique. Go to
    http://pypi.python.org/pypi/xlrd and get the latest version. If that
    doesn't work, send me a copy of one of the files that is displaying
    this problem.
    >
    Thanks in advance!
    >
    Albert-Jan
    >
    """
    Merge all xls files in a given directory into one multisheet xls file.
    The sheets get the orginal file name, without the extension.
    File names should not exceed 31 characters, as this is the maximum
    sheet name length
    """
    >
    import xlrd, xlwt
    import glob, os.path
    >
    def merge_xls (in_dir, out_file="d:/merged_output.x ls"):
    >
        xls_files   = glob.glob(in_di r + "*.xls")
        sheet_names = []
        merged_book = xlwt.Workbook()
    >
        [sheet_names.app end(os.path.bas ename(v)[:-4]) for k, v in enumerate(xls_f iles)]
    Wah! Try this:
    sheet_names = [os.path.basenam e(v)[:-4]) for v in xls_files]

        for k, xls_file in enumerate(xls_f iles):
            if len (sheet_names[k]) <= 31:
                book = xlrd.open_workb ook(xls_file)
                ws = merged_book.add _sheet(sheet_na mes[k])
                for sheetx in range(book.nshe ets):
                    sheet = book.sheet_by_i ndex(sheetx)
                    for rx in range(sheet.nro ws):
                        for cx in range(sheet.nco ls):
                             ws.write(rx, cx, sheet..cell_val ue(rx, cx))
    I presume that you're not too worried about any date data.

    If an input file has more than 1 sheet, you are creating only one
    sheet in the output file, and overwriting cells.
            else:
    >[snip]
    >
    merge_xls(in_di r="d:/temp/")
    >
    *** WARNING: OLE2 inconsistency: SSCS size is 0 but SSAT size is non-zero***
    put_cell 0 1
    >
    Traceback (most recent call last):
    [snip]
        self._cell_type s[rowx][colx] = ctype
    IndexError: list assignment index out of range
    Yeah, symptom of an xls file with a DIMENSIONS records that lies ...
    xlrd has like Excel become more resilient over time :-)

    BTW, consider joining the python-excel group at http://groups.google.com/group/python-excel

    Cheers,
    John

    Comment

    • Support Desk

      #3
      account balance checker

      Hello all,

      I was wondering if it would be possible to make a script to grab my
      balance account balance a few times a day without having to login every
      time. I know I can use the urlib2 library, but not sure how to go about
      filling in the forms and submitting them. BOA has a mobile site that is
      pretty simple. Anyone else use Bank of America and would be interested in
      this. This is not for anything illegal, just for me to prevent overdrafting
      my account





      y =
      urllib.urlopen( 'https://sitekey.bankofa merica.com/sas/signonScreen.do ?isMobl
      eDevice=true',u rllib.urlencode ({'onlineID':'M YONLLINEID'})). readlines()

      Comment

      • Chris Rebert

        #4
        Re: account balance checker

        You'd probably have to use something like mechanize
        (http://wwwsearch.sourceforge.net/mechanize/) to fill out the forms,
        but if BofA's website uses Javascript at all, you're probably out of
        luck.

        Cheers,
        Chris
        --
        Follow the path of the Iguana...


        On Wed, Oct 15, 2008 at 8:09 AM, Support Desk
        <support.desk.i pg@gmail.comwro te:
        Hello all,
        >
        I was wondering if it would be possible to make a script to grab my
        balance account balance a few times a day without having to login every
        time. I know I can use the urlib2 library, but not sure how to go about
        filling in the forms and submitting them. BOA has a mobile site that is
        pretty simple. Anyone else use Bank of America and would be interested in
        this. This is not for anything illegal, just for me to prevent overdrafting
        my account
        >

        >
        >
        >
        y =
        urllib.urlopen( 'https://sitekey.bankofa merica.com/sas/signonScreen.do ?isMobl
        eDevice=true',u rllib.urlencode ({'onlineID':'M YONLLINEID'})). readlines()
        >
        --

        >

        Comment

        • Support Desk

          #5
          RE: account balance checker

          I was also looking at the ClientForm Library


          which can get me past the first username form, but I noticed it then goes
          to a challenge question form and im not sure how to take the resulting for
          and resubmit it with new information and then resubmit the resulting form
          with the password

          -----Original Message-----
          From: cvrebert@gmail. com [mailto:cvrebert @gmail.com] On Behalf Of Chris
          Rebert
          Sent: Thursday, October 16, 2008 10:15 AM
          To: Support Desk
          Cc: python-list@python.org
          Subject: Re: account balance checker

          You'd probably have to use something like mechanize
          (http://wwwsearch.sourceforge.net/mechanize/) to fill out the forms,
          but if BofA's website uses Javascript at all, you're probably out of
          luck.

          Cheers,
          Chris
          --
          Follow the path of the Iguana...


          On Wed, Oct 15, 2008 at 8:09 AM, Support Desk
          <support.desk.i pg@gmail.comwro te:
          Hello all,
          >
          I was wondering if it would be possible to make a script to grab my
          balance account balance a few times a day without having to login every
          time. I know I can use the urlib2 library, but not sure how to go about
          filling in the forms and submitting them. BOA has a mobile site that is
          pretty simple. Anyone else use Bank of America and would be interested in
          this. This is not for anything illegal, just for me to prevent
          overdraftingurllib.urlopen( 'https://sitekey.bankofa merica.com/sas/signonScreen.do ?isMobl
          eDevice=true',u rllib.urlencode ({'onlineID':'M YONLLINEID'})). readlines()
          >
          --

          >

          Comment

          • Jules Stevenson

            #6
            paster 64bit windows compile

            Hello all,

            I've been tearing my hair out trying to get pylon installed most of the day,
            and it seems that both setup tools and paster.exe have some serious issues
            with 64bit on windows.

            Unfortunately I'm stuck with 2.6 64bit.

            I think I've got it nearly all up and running, the biggest problem is that
            paster fails to find the python executable when run, as it is 32bit and the
            python executable is 64 [afaik]. Does anyone either have a 64bit compile, or
            know where the source code for paster.exe lives so I can attempt to compile
            it myself. It doesn't appear to be in the source for PasterScript package.
            Somehow it 'appears' when setup .py install is run, but I cannot find any c
            code anywhere.

            Many thanks in advance

            Jules

            Comment

            Working...