Python new user question - file writeline error

  • James

    Python new user question - file writeline error

    Hello,

    I'm a newbie to Python and am wondering if someone can help me with this...

    I have this code:
    --------------------------
    #! /usr/bin/python

    import sys

    month ={'JAN':1,'FEB':2,'MAR':3,'APR':4,'MAY':5,'JUN':6,'JUL':7,'AUG':8,
            'SEP':9,'OCT':10,'NOV':11,'DEC':12}
    infile=file('TVA-0316','r')
    outfile=file('tmp.out','w')

    for line in infile:
        item = line.split(',')
        dob = item[6].split('/')
        dob = dob[2]+'-'+str(month[dob[1]])+'-'+dob[0]
        lbdt = item[8].split('/')
        lbdt = lbdt[2]+'-'+str(month[lbdt[1]])+'-'+lbdt[0]
        lbrc = item[10].split('/')
        lbrc = lbrc[2]+'-'+str(month[lbrc[1]])+'-'+lbrc[0]
        lbrp = item[14].split('/')
        lbrp = lbrp[2]+'-'+str(month[lbrp[1]])+'-'+lbrp[0]
        item[6] = dob
        item[8] = lbdt
        item[10]=lbrc
        item[14]=lbrp
        list = ','.join(item)
        outfile.writelines(list)
    infile.close
    outfile.close
    -----------------------------

    And the data file(TVA-0316) looks like this:
    -----------------------------
    06-0588,03,701,03701,0000046613,JJB,05/MAR/1950,M,20/NOV/2006,08:50,21/NOV/2006,V1,,,21/NOV/2006,AST,19,U/L,5,40,,
    06-0588,03,701,03701,0000046613,JJB,05/MAR/1950,M,20/NOV/2006,08:50,21/NOV/2006,V1,,,21/NOV/2006,GGT,34,U/L,11,32,h,
    06-0588,03,701,03701,0000046613,JJB,05/MAR/1950,M,20/NOV/2006,08:50,21/NOV/2006,V1,,,21/NOV/2006,ALT,31,U/L,5,29,h,
    06-0588,03,701,03701,0000046613,JJB,05/MAR/1950,M,20/NOV/2006,08:50,21/NOV/2006,V1,,,21/NOV/2006,ALKP,61,U/L,40,135,,
    -----------------------------

    Basically I'm reading in each line and converting all date fields
    (05/MAR/1950) to a different format (1950-03-05) in order to load them
    into a MySQL table.

    I have three questions:
    1. The output file is incomplete, yet there is no error message. When I
    check the last line in the Python interpreter, it has read and processed
    the last line, but the output file stops before that.
    2. Is this the best way to do this in Python?
    3. (Out of scope) Is there a way to load this CSV file directly into a
    MySQL data field without converting the format?

    Thank you.

    James

  • Bruno Desthuilliers

    #2
    Re: Python new user question - file writeline error

    James wrote:
    > Hello,
    >
    > I'm a newbie to Python & wondering someone can help me with this...
    >
    > I have this code:
    > --------------------------
    > #! /usr/bin/python
    >
    > import sys
    >
    > month ={'JAN':1,'FEB':2,'MAR':3,'APR':4,'MAY':5,'JUN':6,'JUL':7,'AUG':8,
    >         'SEP':9,'OCT':10,'NOV':11,'DEC':12}
    > infile=file('TVA-0316','r')
    > outfile=file('tmp.out','w')
    >
    > for line in infile:
    >     item = line.split(',')

    CSV format ?
    Source code: Lib/csv.py The so-called CSV (Comma Separated Values) format is the most common import and export format for spreadsheets and databases. CSV format was used for many years prior to att...

    >     dob = item[6].split('/')
    >     dob = dob[2]+'-'+str(month[dob[1]])+'-'+dob[0]

    Why did you use integers as values in the month dict if it's for using
    them as strings ?

    >     lbdt = item[8].split('/')
    >     lbdt = lbdt[2]+'-'+str(month[lbdt[1]])+'-'+lbdt[0]
    >     lbrc = item[10].split('/')
    >     lbrc = lbrc[2]+'-'+str(month[lbrc[1]])+'-'+lbrc[0]
    >     lbrp = item[14].split('/')
    >     lbrp = lbrp[2]+'-'+str(month[lbrp[1]])+'-'+lbrp[0]

    This may help too:
    Source code: Lib/datetime.py The datetime module supplies classes for manipulating dates and times. While date and time arithmetic is supported, the focus of the implementation is on efficient attr...

    >     item[6] = dob
    >     item[8] = lbdt
    >     item[10]=lbrc
    >     item[14]=lbrp
    >     list = ','.join(item)

    Better to avoid using builtin types names as identifiers. And FWIW, this
    is *not* a list...

    >     outfile.writelines(list)

    You want file.write() here (file objects have no writeline() method), and
    you have to add the newline yourself when the string does not already end
    with one.

    > infile.close

    You're not actually *calling* infile.close - just getting a reference on
    the file.close method. The parens are not optional in Python, they are
    the call operator.

    > outfile.close

    Same here.

    > -----------------------------
    >
    > Basically I'm reading in each line and converting all date fields
    > (05/MAR/1950) to different format (1950-03-05) in order to load into MySQL
    > table.
    >
    > I have two issues:
    > 1. the outfile doesn't complete with no error message. when I check
    > the last line in the python interpreter, it has read and processed the
    > last line, but the output file stopped before.

    Use the csv module and cleanly close your files, then come back if you
    still have problems.

    > 2. Is this the best way to do this in Python?

    Err... What to say... Obviously, no.
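
On the month dict point raised above: if the table maps each abbreviation straight to a zero-padded string, the str() call goes away and March comes out as 03 rather than 3, which is what the 1950-03-05 target format needs. A minimal sketch in current Python 3; the names MONTHS and to_iso are made up for illustration and do not appear in the original script:

```python
# Hypothetical helper: month abbreviations mapped directly to two-digit strings.
MONTHS = {
    'JAN': '01', 'FEB': '02', 'MAR': '03', 'APR': '04',
    'MAY': '05', 'JUN': '06', 'JUL': '07', 'AUG': '08',
    'SEP': '09', 'OCT': '10', 'NOV': '11', 'DEC': '12',
}

def to_iso(date_text):
    """Turn 'DD/MON/YYYY' into 'YYYY-MM-DD'."""
    day, mon, year = date_text.split('/')
    return '-'.join((year, MONTHS[mon], day))

print(to_iso('05/MAR/1950'))  # prints 1950-03-05
```

An unknown abbreviation raises KeyError here, which is probably preferable to silently writing a bad date into the load file.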

    • Shawn Milo

      #3
      Re: Python new user question - file writeline error

      On 7 Feb 2007 11:31:32 -0800, James <cityhunter007@gmail.com> wrote:
      (snip)

      Your script worked for me. I'm not sure what the next step is in
      troubleshooting it. Is it possible that your whitespace isn't quite
      right? I had to reformat it, but I assume it was because of the way
      cut & paste worked from Gmail.

      I usually use Perl for data stuff like this, but I don't see why
      Python wouldn't be a great solution. However, I would re-write it
      using regexes, to seek and replace sections that are formatted like a
      date, rather than breaking it into a variable for each field, changing
      each date individually, then putting them back together.
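
The regex idea described above could look roughly like this in current Python 3. It is only a sketch: MONTHS, DATE_FIELD, iso_date and convert_line are illustrative names, and the pattern assumes every date is a complete field with a comma on both sides, as in the sample data.

```python
import re

MONTHS = {'JAN': 1, 'FEB': 2, 'MAR': 3, 'APR': 4, 'MAY': 5, 'JUN': 6,
          'JUL': 7, 'AUG': 8, 'SEP': 9, 'OCT': 10, 'NOV': 11, 'DEC': 12}

# The lookbehind/lookahead check for the commas without consuming them,
# so two date fields sitting next to each other are both matched.
# A date in the very first field has no comma before it and is not matched.
DATE_FIELD = re.compile(r'(?<=,)(\d{2})/([A-Z]{3})/(\d{4})(?=,)')

def iso_date(match):
    """Build 'YYYY-MM-DD' from one matched 'DD/MON/YYYY' field."""
    day, mon, year = match.groups()
    return '%s-%02d-%s' % (year, MONTHS[mon], day)

def convert_line(line):
    # re.sub calls iso_date once per match and splices in its return value.
    return DATE_FIELD.sub(iso_date, line)

print(convert_line('JJB,05/MAR/1950,M,20/NOV/2006,08:50,'))
```

Passing a function to sub() does the search and the replacement in one pass, so there is no separate findall/replace step.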

      As for how MySQL likes having dates formatted in CSV input: I can't
      help there, but I'm sure someone else can.

      I'm pretty new to Python myself, but if you'd like help with a
      Perl/regex solution, I'm up for it. For that matter, whipping up a
      Python/regex solution would probably be good for me. Let me know.

      Shawn

      • James

        #4
        Re: Python new user question - file writeline error

        On Feb 7, 4:59 pm, "Shawn Milo" <S...@Milochik.com> wrote:
        (snip)
        > I'm pretty new to Python myself, but if you'd like help with a
        > Perl/regex solution, I'm up for it. For that matter, whipping up a
        > Python/regex solution would probably be good for me. Let me know.
        >
        > Shawn

        Thank you very much for your kind offer.
        I'm also coming from Perl myself - heard many good things about Python
        so I'm trying it out - but it seems harder than I thought :(

        James

        • Bruno Desthuilliers

          #5
          Re: Python new user question - file writeline error

          James wrote:
          > On Feb 7, 4:59 pm, "Shawn Milo" <S...@Milochik.com> wrote:
          >
          > (snip)
          >> I'm pretty new to Python myself, but if you'd like help with a
          >> Perl/regex solution, I'm up for it. For that matter, whipping up a
          >> Python/regex solution would probably be good for me. Let me know.
          >>
          >> Shawn
          >
          > Thank you very much for your kind offer.
          > I'm also coming from Perl myself - heard many good things about Python
          > so I'm trying it out - but it seems harder than I thought :(

          If I may comment, Python is not Perl, and trying to solve things the
          Perl way, while still possible, may not be the best idea (I don't mean
          Perl is a bad idea in itself - just that it's another language with
          another way to do things).

          Here, doing the parsing oneself - either manually as James did or with
          regexps - is certainly not as easy as with Perl, and IMHO not the
          simplest way to go, when the csv module can take care of parsing and
          formatting CSV files and the datetime module of parsing and formatting
          dates.
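
For what it's worth, the csv-plus-datetime approach described above might be sketched like this in current Python 3. The column positions come from the original script; DATE_COLUMNS and convert_rows are invented names, and the in-memory StringIO objects only stand in for the real input and output files.

```python
import csv
import io
from datetime import datetime

DATE_COLUMNS = (6, 8, 10, 14)  # field positions used in the original script

def convert_rows(infile, outfile):
    """Copy CSV rows from infile to outfile, rewriting the date columns."""
    reader = csv.reader(infile)
    writer = csv.writer(outfile, lineterminator='\n')
    for row in reader:
        for i in DATE_COLUMNS:
            # %b matches the abbreviated month name of the current locale.
            parsed = datetime.strptime(row[i], '%d/%b/%Y')
            row[i] = parsed.strftime('%Y-%m-%d')
        writer.writerow(row)

source = io.StringIO(
    '06-0588,03,701,03701,0000046613,JJB,05/MAR/1950,M,20/NOV/2006,08:50,'
    '21/NOV/2006,V1,,,21/NOV/2006,AST,19,U/L,5,40,,\n')
result = io.StringIO()
convert_rows(source, result)
print(result.getvalue(), end='')
```

With real files, the csv documentation recommends opening them with newline='' so the module controls line endings itself.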

          Just my 2 cents...

          • Jerry Hill

            #6
            Re: Python new user question - file writeline error

            On 7 Feb 2007 11:31:32 -0800, James <cityhunter007@gmail.com> wrote:
            > I have this code:
            > ....
            > infile.close
            > outfile.close
            > ....
            > 1. the outfile doesn't complete with no error message. when I check
            > the last line in the python interpreter, it has read and processed the
            > last line, but the output file stopped before.

            You need to call the close methods on your file objects like this:
            outfile.close()

            If you leave off the parentheses, you get the method object, but don't
            do anything with it.
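
A small demonstration of the difference, written for current Python 3 (where open() replaces the old file() builtin). The temporary path and the variable names are only for illustration:

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), 'tmp.out')

outfile = open(path, 'w')
outfile.write('first line\n')

outfile.close                    # only looks up the bound method; nothing is called
still_open = not outfile.closed

outfile.close()                  # the parentheses perform the call
now_closed = outfile.closed

# A with block closes (and therefore flushes) the file when the block ends.
with open(path, 'a') as outfile:
    outfile.write('second line\n')
closed_by_with = outfile.closed

print(still_open, now_closed, closed_by_with)
```

Until the file is closed or flushed, the tail end of the output can still be sitting in a buffer, which would match the truncated output file described in the question.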

            > 2. Is this the best way to do this in Python?

            I would parse your dates using the python time module, like this:

            Python 2.5 (r25:51908, Sep 19 2006, 09:52:17) [MSC v.1310 32 bit (Intel)] on win32
            IDLE 1.2
            >>> import time
            >>> line = r'06-0588,03,701,03701,0000046613,JJB,05/MAR/1950,M,20/NOV/2006,08:50,21/NOV/2006,V1,,,21/NOV/2006,AST,19,U/L,5,40,,'
            >>> item = line.split(',')
            >>> dob = item[6]
            >>> dob_time = time.strptime(dob, '%d/%b/%Y')
            >>> dob_time
            (1950, 3, 5, 0, 0, 0, 6, 64, -1)
            >>> time.strftime('%a, %d %b %Y', dob_time)
            'Sun, 05 Mar 1950'
            >>> time.strftime('%Y-%m-%d', dob_time)
            '1950-03-05'

            See the docs for the time module here:
            This module provides various time-related functions. For related functionality, see also the datetime and calendar modules. Although this module is always available, not all functions are available...

            Using that will probably result in code that's quite a bit easier to
            read if you ever have to come back to it.
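
Wrapped up as a function, the same two time calls might look like the sketch below. to_mysql_date is an invented name, and catching ValueError to pass non-date fields through unchanged is one possible design rather than anything the original script did:

```python
import time

def to_mysql_date(text):
    """Convert 'DD/Mon/YYYY' to 'YYYY-MM-DD'; raises ValueError for other input."""
    return time.strftime('%Y-%m-%d', time.strptime(text, '%d/%b/%Y'))

fields = '05/MAR/1950,M,20/NOV/2006,08:50'.split(',')
converted = []
for field in fields:
    try:
        converted.append(to_mysql_date(field))
    except ValueError:
        converted.append(field)   # not a date: keep the field as it is
print(converted)
```

The %b directive depends on the locale, so month abbreviations like MAR and NOV are only recognised when the locale's month names are the English ones.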

            You also might want to investigate the csv module
            (http://docs.python.org/lib/module-csv.html) for a bunch of tools
            specifically tailored to working with files full of comma separated
            values like your input files.

            --
            Jerry

            • Shawn Milo

              #7
              Fwd: Python new user question - file writeline error

              To the list:

              I have come up with something that's working fine. However, I'm fairly
              new to Python, so I'd really appreciate any suggestions on how this
              can be made more Pythonic.

              Thanks,
              Shawn






              Okay, here's what I have come up with:


              #! /usr/bin/python

              import sys
              import re

              month ={'JAN':1,'FEB':2,'MAR':3,'APR':4,'MAY':5,'JUN':6,'JUL':7,'AUG':8,'SEP':9,'OCT':10,'NOV':11,'DEC':12}
              infile=file('TVA-0316','r')
              outfile=file('tmp.out','w')

              def formatDatePart(x):
                  "take a number and transform it into a two-character string, zero padded"
                  x = str(x)
                  while len(x) < 2:
                      x = "0" + x
                  return x

              regex = re.compile(r",\d{2}/[A-Z]{3}/\d{4},")

              for line in infile:
                  matches = regex.findall(line)
                  for someDate in matches:

                      dayNum = formatDatePart(someDate[1:3])
                      monthNum = formatDatePart(month[someDate[4:7]])
                      yearNum = formatDatePart(someDate[8:12])

                      newDate = ",%s-%s-%s," % (yearNum,monthNum,dayNum)
                      line = line.replace(someDate, newDate)

                  outfile.writelines(line)

              infile.close
              outfile.close

              • Gabriel Genellina

                #8
                Re: Fwd: Python new user question - file writeline error

                On Feb 8, 12:41, "Shawn Milo" <S...@Milochik.com> wrote:
                > I have come up with something that's working fine. However, I'm fairly
                > new to Python, so I'd really appreciate any suggestions on how this
                > can be made more Pythonic.

                A few comments:

                You don't need the formatDatePart function; delete it, and replace
                    newDate = ",%s-%s-%s," % (yearNum,monthNum,dayNum)
                with
                    newDate = ",%04.4d-%02.2d-%02.2d," % (yearNum,monthNum,dayNum)

                and before:
                    day, mon, year = someDate[1:-1].split('/')
                    dayNum, monthNum, yearNum = int(day), month[mon], int(year)

                And this: outfile.writelines(line)
                should be: outfile.write(line)
                (writelines works almost by accident here).
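
To see why it works by accident: writelines() takes any iterable of strings, and a single string is an iterable of one-character strings, so the line gets written character by character. A quick check in current Python 3, with io.StringIO standing in for the real output file:

```python
import io

line = 'a,b,c\n'

via_writelines = io.StringIO()
via_writelines.writelines(line)   # iterates over the string: one piece per character

via_write = io.StringIO()
via_write.write(line)             # writes the string in a single call

via_list = io.StringIO()
via_list.writelines(['row 1\n', 'row 2\n'])   # the intended use: a sequence of lines

print(via_writelines.getvalue() == via_write.getvalue())
```

The end result is the same text either way; write() just states the intent directly.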

                You forgot again to use () to call the close methods:
                infile.close()
                outfile.close()

                I don't like the final replace, but for a script like this I think
                it's OK.

                --
                Gabriel Genellina

                • Shawn Milo

                  #9
                  Re: Fwd: Python new user question - file writeline error

                  On 8 Feb 2007 09:05:51 -0800, Gabriel Genellina <gagsl-py@yahoo.com.ar> wrote:
                  (snip)

                  Gabriel,

                  Thanks for the comments! The new version is below. I thought it made a
                  little more sense to format the newDate = ... line the way I have it
                  below, although I did incorporate your suggestions. Also, the
                  formatting options you provided seemed to specify not only string
                  padding, but also decimal places, so I changed it. Please let me know
                  if there is some other meaning behind the way you did it.
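
On the formatting question: as far as I know, a precision on an integer conversion (%d) sets a minimum number of digits rather than decimal places, so for small non-negative integers like these the two spellings give the same text. Decimal places only come into play for float conversions such as %f. A short comparison (variable names are illustrative):

```python
padded_with_flag = '%04d' % 7         # zero flag plus width 4
padded_with_precision = '%.4d' % 7    # precision 4: at least four digits
padded_with_both = '%04.4d' % 7       # the combined form from the suggestion
month_text = '%02.2d' % 3
float_text = '%07.2f' % 3.14159       # for floats, .2 really is decimal places

print(padded_with_flag, padded_with_precision, padded_with_both,
      month_text, float_text)
```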

                  As for not liking the replace line, what would you suggest instead?

                  Shawn

                  #! /usr/bin/python

                  import sys
                  import re

                  month ={'JAN':1,'FEB':2,'MAR':3,'APR':4,'MAY':5,'JUN':6,'JUL':7,'AUG':8,'SEP':9,'OCT':10,'NOV':11,'DEC':12}
                  infile=file('TVA-0316','r')
                  outfile=file('tmp.out','w')

                  regex = re.compile(r",\d{2}/[A-Z]{3}/\d{4},")

                  for line in infile:
                      matches = regex.findall(line)
                      for someDate in matches:

                          dayNum = someDate[1:3]
                          monthNum = month[someDate[4:7]]
                          yearNum = someDate[8:12]

                          newDate = ",%04d-%02d-%02d," % (int(yearNum),int(monthNum),int(dayNum))
                          line = line.replace(someDate, newDate)

                      outfile.write(line)

                  infile.close()
                  outfile.close()

                  • Jussi Salmela

                    #10
                    Re: Fwd: Python new user question - file writeline error

                    Shawn Milo wrote:
                    > To the list:
                    >
                    > I have come up with something that's working fine. However, I'm fairly
                    > new to Python, so I'd really appreciate any suggestions on how this
                    > can be made more Pythonic.
                    >
                    > Thanks,
                    > Shawn
                    >
                    > Okay, here's what I have come up with:

                    What follows may feel harsh but you asked for it ;)

                    > #! /usr/bin/python
                    >
                    > import sys
                    > import re
                    >
                    > month ={'JAN':1,'FEB':2,'MAR':3,'APR':4,'MAY':5,'JUN':6,'JUL':7,'AUG':8,'SEP':9,'OCT':10,'NOV':11,'DEC':12}
                    >
                    > infile=file('TVA-0316','r')
                    > outfile=file('tmp.out','w')
                    >
                    > def formatDatePart(x):
                    >     "take a number and transform it into a two-character string, zero padded"

                    If a comment or doc string is misleading one would be better off without
                    it entirely:
                        "take a number": the function can in fact take (at least)
                            any base type
                        "transform it": the function doesn't transform x to anything
                            although the name of the variable x is the same
                            as the argument x
                        "two-character string": to a string of at least 2 chars
                        "zero padded": where left/right???

                    >     x = str(x)
                    >     while len(x) < 2:
                    >         x = "0" + x

                    You don't need loops for these kind of things. One possibility is to
                    replace the whole body with:
                    return str(x).zfill(2)
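
Spelled out as a complete function, the zfill suggestion might look like this (format_date_part is just a renamed stand-in for formatDatePart):

```python
def format_date_part(x):
    """Return str(x), left-padded with zeros to at least two characters."""
    return str(x).zfill(2)

print(format_date_part(3), format_date_part(11), format_date_part(1950))
# prints: 03 11 1950
```

zfill only pads on the left and never truncates, so a four-digit year passes through unchanged.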

                    >     return x
                    >
                    > regex = re.compile(r",\d{2}/[A-Z]{3}/\d{4},")
                    >
                    > for line in infile:
                    >     matches = regex.findall(line)
                    >     for someDate in matches:
                    >

                    Empty lines are supposed to make code more readable. The above empty
                    line does the contrary by separating the block controlled by the for
                    and the for statement

                    >         dayNum = formatDatePart(someDate[1:3])
                    >         monthNum = formatDatePart(month[someDate[4:7]])
                    >         yearNum = formatDatePart(someDate[8:12])

                    You don't need the formatDatePart function at all:
                        newDate = ",%4s-%02d-%2s," % \
                            (someDate[8:12],month[someDate[4:7]],someDate[1:3])

                    >
                    >         newDate = ",%s-%s-%s," % (yearNum,monthNum,dayNum)
                    >         line = line.replace(someDate, newDate)
                    >
                    >     outfile.writelines(line)
                    >
                    > infile.close
                    > outfile.close

                    You have not read the answers given to the OP, have you? Because if you
                    had, your code would be:
                    infile.close()
                    outfile.close()
                    The reason your version seems to be working, is that you probably
                    execute your code from the command-line and exiting from Python to
                    command-line closes the files, even if you don't.

                    Cheers,
                    Jussi

                    • Shawn Milo

                      #11
                      Re: Fwd: Python new user question - file writeline error

                      On 2/8/07, Jussi Salmela <tiedon_jano@hotmail.com> wrote:
                      (snip)

                      Jussi,

                      Thanks for the feedback. I received similar comments on a couple of
                      those items, and posted a newer version an hour or two ago. I think
                      the only thing missing there is a friendly blank line after my "for
                      line in infile:" statement.

                      Please let me know if there is anything else.

                      Shawn

                      • Bruno Desthuilliers

                        #12
                        Re: Fwd: Python new user question - file writeline error

                        Shawn Milo a écrit :
                        To the list:
                        >
                        I have come up with something that's working fine. However, I'm fairly
                        new to Python, so I'd really appreciate any suggestions on how this
                        can be made more Pythonic.
                        >
                        Thanks,
                        Shawn
                        >
                        >
                        >
                        >
                        >
                        >
                        Okay, here's what I have come up with:
                        >
                        >
                        #! /usr/bin/python
                        >
                        import sys
                        import re
                        >
                        month = {'JAN':1,'FEB':2,'MAR':3,'APR':4,'MAY':5,'JUN':6,'JUL':7,'AUG':8,'SEP':9,'OCT':10,'NOV':11,'DEC':12}
                        >
                        infile=file('TVA-0316','r')
                        outfile=file('tmp.out','w')
                        >
                        def formatDatePart(x):
                            "take a number and transform it into a two-character string, zero padded"
                            x = str(x)
                            while len(x) < 2:
                                x = "0" + x
                            return x
                        x = "%02d" % x

                        regex = re.compile(r",\d{2}/[A-Z]{3}/\d{4},")
                        regexps are not really pythonic - we tend to use them only when we have
                        no better option. When it comes to parsing CSV files and/or dates, we do
                        have a better solution: the csv module and the datetime module.
                        for line in infile:
                            matches = regex.findall(line)
                            for someDate in matches:
                        >
                                dayNum = formatDatePart(someDate[1:3])
                                monthNum = formatDatePart(month[someDate[4:7]])
                                yearNum = formatDatePart(someDate[8:12])
                        >
                                newDate = ",%s-%s-%s," % (yearNum,monthNum,dayNum)
                                line = line.replace(someDate, newDate)

                            outfile.writelines(line)
                        >
                        infile.close
                        outfile.close
                        I wonder why some of us took time to answer your first question. You
                        obviously forgot to read these answers.
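
The suggestion above to use the csv and datetime modules could look roughly like the following sketch. It assumes Python 3, an English/C locale for the month abbreviations, and the column positions (6, 8, 10, 14) from the original poster's script. The function name convert_dates and the in-memory sample are hypothetical, with the sample row adapted from the data shown at the top of the thread.

```python
import csv
import io
from datetime import datetime

# Column positions of the date fields, taken from the original poster's script.
DATE_COLUMNS = (6, 8, 10, 14)

def convert_dates(infile, outfile, date_columns=DATE_COLUMNS):
    """Copy CSV rows from infile to outfile, rewriting DD/MON/YYYY as YYYY-MM-DD."""
    writer = csv.writer(outfile, lineterminator="\n")
    for row in csv.reader(infile):
        for i in date_columns:
            # Skip short rows and empty fields instead of failing on them.
            if i < len(row) and row[i]:
                # %b parses the abbreviated month name (English locale assumed).
                parsed = datetime.strptime(row[i], "%d/%b/%Y")
                row[i] = parsed.strftime("%Y-%m-%d")
        writer.writerow(row)

# In-memory stand-ins for the real input and output files.
sample = io.StringIO(
    "06-0588,03,701,03701,0000046613,JJB,05/MAR/1950,M,20/NOV/2006,08:50,"
    "21/NOV/2006,V1,,,21/NOV/2006,AST,19,U/L,5,40,,\n"
)
result = io.StringIO()
convert_dates(sample, result)
```

With real files, one would pass objects opened with open(..., newline="") in place of the StringIO buffers. A malformed date raises ValueError from strptime rather than passing through silently, which may or may not be what is wanted before a database load.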


                        • James

                          #13
                          Re: Fwd: Python new user question - file writeline error

                          On Feb 8, 3:26 pm, Bruno Desthuilliers
                          <bdesth.quelquech...@free.quelquepart.fr> wrote:
                          Shawn Milo wrote:
                          >
                          >
                          >
                          To the list:
                          >
                          I have come up with something that's working fine. However, I'm fairly
                          new to Python, so I'd really appreciate any suggestions on how this
                          can be made more Pythonic.
                          >
                          Thanks,
                          Shawn
                          >
                          Okay, here's what I have come up with:
                          >
                          #! /usr/bin/python
                          >
                          import sys
                          import re
                          >
                          month = {'JAN':1,'FEB':2,'MAR':3,'APR':4,'MAY':5,'JUN':6,'JUL':7,'AUG':8,'SEP':9,'OCT':10,'NOV':11,'DEC':12}
                          >
                          infile=file('TVA-0316','r')
                          outfile=file('tmp.out','w')
                          >
                          def formatDatePart(x):
                              "take a number and transform it into a two-character string, zero padded"
                              x = str(x)
                              while len(x) < 2:
                                  x = "0" + x
                              return x
                          >
                          x = "%02d" % x
                          >
                          regex = re.compile(r",\d{2}/[A-Z]{3}/\d{4},")
                          >
                          regexps are not really pythonic - we tend to use them only when we have
                          no better option. When it comes to parsing CSV files and/or dates, we do
                          have a better solution: the csv module and the datetime module.
                          >
                          for line in infile:
                              matches = regex.findall(line)
                              for someDate in matches:
                          >
                                  dayNum = formatDatePart(someDate[1:3])
                                  monthNum = formatDatePart(month[someDate[4:7]])
                                  yearNum = formatDatePart(someDate[8:12])
                          >
                                  newDate = ",%s-%s-%s," % (yearNum,monthNum,dayNum)
                                  line = line.replace(someDate, newDate)
                              outfile.writelines(line)
                          >
                          infile.close
                          outfile.close
                          >
                          I wonder why some of us took time to answer your first question. You
                          obviously forgot to read these answers.
                          No offense, but doesn't the fact that the 're' module is available
                          mean we can use it? (Pythonic or not - I'm not sure what is really
                          Pythonic at this stage of learning...)
                          As with Perl, I'm sure there is more than one way to solve problems in
                          Python.

                          I appreciate everyone's feedback. I definitely got more than I
                          expected, and it is comforting that people do care about writing
                          better code! :)


                          • Gabriel Genellina

                            #14
                            Re: Fwd: Python new user question - file writeline error

                            On Thu, 08 Feb 2007 14:20:57 -0300, Shawn Milo <Shawn@Milochik.com>
                            wrote:
                            On 8 Feb 2007 09:05:51 -0800, Gabriel Genellina <gagsl-py@yahoo.com.ar>
                            wrote:
                            >On 8 Feb, 12:41, "Shawn Milo" <S...@Milochik.com> wrote:
                            >>
                            I have come up with something that's working fine. However, I'm fairly
                            new to Python, so I'd really appreciate any suggestions on how this
                            can be made more Pythonic.
                            >>
                            >A few comments:
                            >>
                            >You don't need the formatDatePart function; delete it, and replace
                            >newDate = ",%s-%s-%s," % (yearNum,monthNum,dayNum)
                            >with
                            >newDate = ",%04.4d-%02.2d-%02.2d," % (yearNum,monthNum,dayNum)
                            >>
                            >and before:
                            >dayNum, monthNum, yearNum = [int(num) for num in someDate[1:-1].split('/')]
                            >>
                            >And this: outfile.writelines(line)
                            >should be: outfile.write(line)
                            >(writelines works almost by accident here).
                            >>
                            >You forget again to use () to call the close methods:
                            >infile.close()
                            >outfile.close()
                            >>
                            >I don't like the final replace, but for a script like this I think
                            >it's OK.
                            >>
                            >--
                            >Gabriel Genellina
                            >>
                            >--
                            >http://mail.python.org/mailman/listinfo/python-list
                            >>
                            >
                            >
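
On the remark above that writelines "works almost by accident": a string is itself an iterable of one-character strings, so writelines(line) ends up writing the line one character at a time. A small sketch, assuming Python 3 and in-memory io.StringIO buffers standing in for the output file (the buffer names are illustrative):

```python
import io

line = "a,b\n"

buf_write = io.StringIO()
buf_write.write(line)                    # writes the whole string in one call

buf_writelines = io.StringIO()
buf_writelines.writelines(line)          # iterates the string: one write per character

buf_list = io.StringIO()
buf_list.writelines(["a,b\n", "c,d\n"])  # the intended use: an iterable of lines
```

The first two buffers end up with identical contents, which is why the original script appeared to work; write() states the intent directly.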
                            Gabriel,
                            >
                            Thanks for the comments! The new version is below. I thought it made a
                            little more sense to format the newDate = ... line the way I have it
                            below, although I did incorporate your suggestions.
                            Looks pretty good to me!
                            Just one little thing I would change, the variables monthNum, dayNum etc.;
                            the suffix might indicate that they're numbers, but they're strings
                            instead. So I would move the int(...) a few lines above, where the
                            variables are defined.
                            But that's just a cosmetic thing and just a matter of taste.
                            Also, the
                            formatting options you provided seemed to specify not only string
                            padding, but also decimal places, so I changed it. Please let me know
                            if there is some other meaning behind the way you did it.
                            No, it has no meaning, at least for this range of values.
                            As for not liking the replace line, what would you suggest instead?
                            You have already scanned the line to find the matching fragment; the match
                            object knows exactly where it begins and ends, so one could replace it
                            with the reformatted value without searching the line a second time (the
                            second search takes some more time, at least in principle).
                            But this makes the code a bit more complex, and it would only make sense
                            if you were to process millions of lines, and even then, the execution
                            might be I/O-bound so you would gain nothing at the end.
                            That's why I think it's OK as it is now.

                            --
                            Gabriel Genellina
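
One way to replace each match in place, without the second search that line.replace performs, is to pass a function to re.sub. This is only a sketch under some assumptions: it uses Python 3, the names MONTHS, DATE_RE, reformat and convert_line are hypothetical, and the pattern differs from the one in the thread by using a lookahead for the trailing comma so that two adjacent date fields are both matched.

```python
import re

MONTHS = {'JAN': 1, 'FEB': 2, 'MAR': 3, 'APR': 4, 'MAY': 5, 'JUN': 6,
          'JUL': 7, 'AUG': 8, 'SEP': 9, 'OCT': 10, 'NOV': 11, 'DEC': 12}

# Leading comma is consumed; the trailing comma is only checked, not consumed.
DATE_RE = re.compile(r",(\d{2})/([A-Z]{3})/(\d{4})(?=,)")

def reformat(match):
    """Build the replacement text for one matched ,DD/MON/YYYY field."""
    day, mon, year = match.groups()
    # An unknown month abbreviation raises KeyError here.
    return ",%04d-%02d-%02d" % (int(year), MONTHS[mon], int(day))

def convert_line(line):
    """Return line with every ,DD/MON/YYYY, field rewritten as ,YYYY-MM-DD,."""
    return DATE_RE.sub(reformat, line)

converted = convert_line("JJB,05/MAR/1950,M,20/NOV/2006,08:50\n")
```

As noted above, the gain over line.replace only matters for very large inputs, and the job may be I/O-bound anyway; the main benefit here is that the conversion logic lives in one function.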
