Need help converting text to csv format

Collapse
This topic is closed.
X
X
 
  • Time
  • Show
Clear All
new posts
  • Chuck Connors

    Need help converting text to csv format

    Hey guys. I'm working on a little program to help my wife catalog her/
    our coupons. I found a good resource but need help formatting the
    text data so that I can import it into a mysql database. Here's the
    data format:

    409220000003 Life Fitness Products $1 (12-13-08) (CVS)
    546500181141 Oust Air Sanitizer, any B1G1F up to $3.49 (1-17-09) .35
    each
    518000159258 Pillsbury Crescent Dinner Rolls, any .25 (2-14-09)
    518000550406 Pillsbury Frozen Grands Biscuits, Cinnamon Rolls, Mini
    Cinnamon Rolls, etc. .40 (2-14-09)

    The first value (large number) is the UPC, the next element is the
    coupon description, followed by a date in parenthesis. Those are the
    only three elements I am concerned with. Can someone help me in
    reformatting this:

    409220000003 Life Fitness Products $1 (12-13-08) (CVS)
    546500181141 Oust Air Sanitizer, any B1G1F up to $3.49 (1-17-09) .35
    each
    518000159258 Pillsbury Crescent Dinner Rolls, any .25 (2-14-09)
    518000550406 Pillsbury Frozen Grands Biscuits, Cinnamon Rolls, Mini
    Cinnamon Rolls, etc. .40 (2-14-09)

    into something like this:

    "409220000003", "Life Fitness Products $1","12-13-08"
    "546500181141", "Oust Air Sanitizer, any B1G1F up to $3.49","1-17-09"
    "518000159258", "Pillsbury Crescent Dinner Rolls, any .25","2-14-09"
    "518000550406", "Pillsbury Frozen Grands Biscuits, Cinnamon Rolls, Mini
    Cinnamon Rolls, etc. .40","2-14-09"

    Any help, pseudo code, or whatever push in the right direction would
    be most appreciated. I am a novice Python programmer but I do have a
    good bit of PHP programming experience.

    Thanks for your time....
  • George Sakkis

    #2
    Re: Need help converting text to csv format

    On Nov 21, 10:18 am, Chuck Connors <don...@gmail.c omwrote:
    Any help, pseudo code, or whatever push in the right direction would
    be most appreciated.  I am a novice Python programmer but I do have a
    good bit of PHP programming experience.
    I'm wondering if PHP experience precludes the ability to use a search
    engine before asking for help...

    Comment

    • Chuck Connors

      #3
      Re: Need help converting text to csv format

      I'm wondering if PHP experience precludes the ability to use a search
      engine before asking for help...
      Thanks for the push in the right direction, friend.

      Comment

      • Tim Chase

        #4
        Re: Need help converting text to csv format

        409220000003 Life Fitness Products $1 (12-13-08) (CVS)
        546500181141 Oust Air Sanitizer, any B1G1F up to $3.49 (1-17-09) .35
        each
        518000159258 Pillsbury Crescent Dinner Rolls, any .25 (2-14-09)
        518000550406 Pillsbury Frozen Grands Biscuits, Cinnamon Rolls, Mini
        Cinnamon Rolls, etc. .40 (2-14-09)
        >
        into something like this:
        >
        "409220000003", "Life Fitness Products $1","12-13-08"
        "546500181141", "Oust Air Sanitizer, any B1G1F up to $3.49","1-17-09"
        "518000159258", "Pillsbury Crescent Dinner Rolls, any .25","2-14-09"
        "518000550406", "Pillsbury Frozen Grands Biscuits, Cinnamon Rolls, Mini
        Cinnamon Rolls, etc. .40","2-14-09"
        >
        Any help, pseudo code, or whatever push in the right direction would
        be most appreciated. I am a novice Python programmer but I do have a
        good bit of PHP programming experience.
        A regexp should be able to split this fairly neatly:

        import re
        r = re.compile(r"^( \d+)\s+(.*)\((\ d{1,2}-\d{1,2}-\d{2,4})\).*")
        out = file('out.csv', 'w')
        for i, line in enumerate(file( 'in.txt')):
        m = r.match(line)
        if not m:
        print "Line %i is malformed" % (i+1)
        continue
        out.write(','.j oin(
        '"%s"' % item.strip().re place('"', '""')
        for item in m.groups()
        ))
        out.write('\n')
        out.close()

        -tkc



        Comment

        • Joe Strout

          #5
          Re: Need help converting text to csv format

          On Nov 21, 2008, at 8:18 AM, Chuck Connors wrote:
          The first value (large number) is the UPC, the next element is the
          coupon description, followed by a date in parenthesis. Those are the
          only three elements I am concerned with. Can someone help me in
          reformatting this:
          >
          409220000003 Life Fitness Products $1 (12-13-08) (CVS)
          Well, you could use regex to parse out those three parts... on the
          other hand, str.partition might be a little easier if you're not
          already used to regex. Try this:
          >>line = "4092200000 03 Life Fitness Products $1 (12-13-08) (CVS)"
          >>num,sep,res t = line.partition( ' ')
          >>num
          '409220000003'
          >>rest
          'Life Fitness Products $1 (12-13-08) (CVS)'
          >>desc,sep,re st = rest.partition( '(')
          >>desc
          'Life Fitness Products $1 '
          >>rest
          '12-13-08) (CVS)'
          >>date,sep,re st = rest.partition( ')')
          >>date
          '12-13-08'

          So three lines of code gets you your three fields.
          into something like this:
          >
          "409220000003", "Life Fitness Products $1","12-13-08"
          There are lots of ways to do that. Here's one:
          >>qfields = ['"' + fld.strip() + '"' for fld in (num,desc,date)]
          >>out = qfields.join(', ')
          >>out
          '"409220000003" ,"Life Fitness Products $1 ","12-13-08"'

          Cheers,
          - Joe

          P.S. Pay no attention to the grumpy graybeards who want to drive new
          users away from Python. There are quite a few friendly and helpful
          people here, too. :)

          Comment

          • Chuck Connors

            #6
            Re: Need help converting text to csv format

            Wow! What a change in direction from the previous post. Thank you
            both for the help and the explanations. This will work great!

            Comment

            • Steve Holden

              #7
              Re: Need help converting text to csv format

              George Sakkis wrote:
              On Nov 21, 10:18 am, Chuck Connors <don...@gmail.c omwrote:
              >
              >Any help, pseudo code, or whatever push in the right direction would
              >be most appreciated. I am a novice Python programmer but I do have a
              >good bit of PHP programming experience.
              >
              I'm wondering if PHP experience precludes the ability to use a search
              engine before asking for help...
              I'm wondering why you bothered to write that. Next time, save yourself
              the ten seconds and just skip to the next message. The world will be a
              better place.

              though-we-all-get-crabby-sometimes-ly y'rs - steve
              --
              Steve Holden +1 571 484 6266 +1 800 494 3119
              Holden Web LLC http://www.holdenweb.com/

              Comment

              • Tim Chase

                #8
                Re: Need help converting text to csv format

                >>qfields = ['"' + fld.strip() + '"' for fld in (num,desc,date)]
                >>out = qfields.join(', ')
                Just a quick note here to prevent the confusion of the OP...this
                should be

                ','.join(qfield s)

                -tkc




                Comment

                • Tim Golden

                  #9
                  Re: Need help converting text to csv format

                  Tim Chase wrote:
                  > >>qfields = ['"' + fld.strip() + '"' for fld in (num,desc,date)]
                  > >>out = qfields.join(', ')
                  >
                  Just a quick note here to prevent the confusion of the OP...this should be
                  >
                  ','.join(qfield s)
                  To be honest, it's so easy to use the stdlib csv module
                  that I'd always recommend that, especially as it covers
                  all those annoying corner cases with embedded commas and
                  quotes and stuff. And it's tested to hell and back.


                  In general, for a list of field tuples:

                  <code>
                  import csv

                  fields = [
                  ("Tim Golden", 40, "inf,o1"),
                  ("Fred Smith", 25, "info2"),
                  ("Joe O'Reilly", 55, 'inf,"blah",o3' ), ## lots of messiness
                  ]

                  ofile = open ("data.csv", "wb")
                  try:
                  writer = csv.writer (ofile)
                  writer.writerow s ([[str (i) for i in f] for f in fields])
                  finally:
                  ofile.close ()

                  </code>

                  Comment

                  • Joe Strout

                    #10
                    Re: Need help converting text to csv format

                    On Nov 21, 2008, at 9:22 AM, Tim Golden wrote:
                    Tim Chase wrote:
                    >>>qfields = ['"' + fld.strip() + '"' for fld in (num,desc,date)]
                    >>>out = qfields.join(', ')
                    >Just a quick note here to prevent the confusion of the OP...this
                    >should be
                    > ','.join(qfield s)
                    Thanks Tim #1, for pointing out my error (my standard procedure with
                    join is to [1] do it backwards, [2] curse myself, and [3] do it
                    correctly, but it looks like I only copied step 1 into my email).
                    To be honest, it's so easy to use the stdlib csv module
                    that I'd always recommend that, especially as it covers
                    all those annoying corner cases with embedded commas and
                    quotes and stuff. And it's tested to hell and back.
                    And thanks Tim #2, for pointing that out. I had frankly forgotten
                    about it, but that's good advice.
                    finally:
                    ofile.close ()
                    A follow-up question here... is it really necessary to close things
                    like files in Python? I've been slumming it in the REALbasic
                    community for the last decade, where you generally don't worry about
                    such things, as any object that represents something "open" will
                    automatically "close" itself when it dies (and since a closed object
                    in those cases is useless, I'd rather not have it around after it's
                    closed anyway). Is the same true in Python, or do we need to
                    explicitly close things?

                    Thanks,
                    - Joe

                    Comment

                    • Tim Golden

                      #11
                      Re: Need help converting text to csv format

                      Joe Strout wrote:
                      A follow-up question here... is it really necessary to close things like
                      files in Python? I've been slumming it in the REALbasic community for
                      the last decade, where you generally don't worry about such things, as
                      any object that represents something "open" will automatically "close"
                      itself when it dies (and since a closed object in those cases is
                      useless, I'd rather not have it around after it's closed anyway). Is the
                      same true in Python, or do we need to explicitly close things?

                      Implementation dependent. (Because it depends on what kind
                      of garbage collection or object finalisation happens). Like
                      most people, I imagine, in ad-hoc code I'll just do things
                      like:

                      <code
                      import csv

                      writer = csv.writer (open ("data.csv", "wb"))
                      writer.writerow (['blah', blah'])

                      </code>

                      If I exit the interpreter here, I'm pretty much safe.
                      But if I add os.startfile ("data.csv") , I'll likely
                      get nothing or a file lock.

                      I believe that in other implementations -- Jython,
                      for example -- you cannot rely on the file closing
                      itself.

                      In general in posting public code, especially to
                      newcomers, I make the effort to use a try-finally
                      or a with statement.

                      TJG

                      Comment

                      • Tim Chase

                        #12
                        Re: Need help converting text to csv format

                        Tim Golden wrote:
                        Tim Chase wrote:
                        >> >>qfields = ['"' + fld.strip() + '"' for fld in (num,desc,date)]
                        >> >>out = qfields.join(', ')
                        >Just a quick note here to prevent the confusion of the OP...this should be
                        >>
                        > ','.join(qfield s)
                        >
                        To be honest, it's so easy to use the stdlib csv module
                        that I'd always recommend that, especially as it covers
                        all those annoying corner cases with embedded commas and
                        quotes and stuff. And it's tested to hell and back.
                        [cedes point to TimG]

                        yes, the CSV module has some wonderful stuff in it, and I
                        regularly use it for *reading* CSV files. But for writing them,
                        it's often just as fast (for my purposes) to simply code my 1st
                        post's quickie as it is to scrounge in the docs/docstrings to
                        remember how to let the CSV do the creation. But TimG is correct
                        that using csv is the _right_ way to do it...gotta leave SOME
                        work for the OP :)

                        too-many-tims-on-c.l.p'ly yers,

                        -tkc


                        Comment

                        • Tim Golden

                          #13
                          Re: Need help converting text to csv format

                          Tim Chase wrote:
                          yes, the CSV module has some wonderful stuff in it, and I regularly use
                          it for *reading* CSV files. But for writing them, it's often just as
                          fast (for my purposes) to simply code my 1st post's quickie as it is to
                          scrounge in the docs/docstrings to remember how to let the CSV do the
                          creation. But TimG is correct that using csv is the _right_ way to do
                          it...gotta leave SOME work for the OP :)
                          Well I was going to point out that it's not that hard to remember
                          the functions, when I realised that I needn't have str ()-ed all
                          those values since the module does that for you. As it happens I
                          usually find it easy enough to remember.

                          But I think it's one of those
                          modules where you tend to think: why bother? it's only a
                          ",".join (str (i) for i in f) away. Until you hit embedded quotes
                          or irregular delimiters. And you realise you've spawned a
                          monster genexp. And *then* you reach for "import csv" :)

                          TJG

                          Comment

                          • MRAB

                            #14
                            Re: Need help converting text to csv format

                            Tim Golden wrote:
                            Joe Strout wrote:
                            >A follow-up question here... is it really necessary to close things
                            >like files in Python? I've been slumming it in the REALbasic
                            >community for the last decade, where you generally don't worry about
                            >such things, as any object that represents something "open" will
                            >automaticall y "close" itself when it dies (and since a closed object
                            >in those cases is useless, I'd rather not have it around after it's
                            >closed anyway). Is the same true in Python, or do we need to
                            >explicitly close things?
                            >
                            >
                            Implementation dependent. (Because it depends on what kind
                            of garbage collection or object finalisation happens). Like
                            most people, I imagine, in ad-hoc code I'll just do things like:
                            >
                            <code
                            import csv
                            >
                            writer = csv.writer (open ("data.csv", "wb"))
                            writer.writerow (['blah', blah'])
                            >
                            </code>
                            >
                            If I exit the interpreter here, I'm pretty much safe.
                            But if I add os.startfile ("data.csv") , I'll likely
                            get nothing or a file lock.
                            >
                            I believe that in other implementations -- Jython,
                            for example -- you cannot rely on the file closing
                            itself.
                            >
                            The file will be closed automatically when the file object is
                            garbage-collected.

                            CPython uses reference-counting, so the file object is garbage-collected
                            as soon as there are no references to it.

                            Jython (and IronPython?) are garbage-collected in the background, so the
                            file object is garbage-collected at some point (and you don't know when
                            that will be!) when there are no longer any references to it.
                            In general in posting public code, especially to
                            newcomers, I make the effort to use a try-finally
                            or a with statement.
                            >

                            Comment

                            • M.-A. Lemburg

                              #15
                              Re: Need help converting text to csv format

                              On 2008-11-21 16:18, Chuck Connors wrote:
                              Hey guys. I'm working on a little program to help my wife catalog her/
                              our coupons. I found a good resource but need help formatting the
                              text data so that I can import it into a mysql database. Here's the
                              data format:
                              >
                              409220000003 Life Fitness Products $1 (12-13-08) (CVS)
                              546500181141 Oust Air Sanitizer, any B1G1F up to $3.49 (1-17-09) .35
                              each
                              518000159258 Pillsbury Crescent Dinner Rolls, any .25 (2-14-09)
                              518000550406 Pillsbury Frozen Grands Biscuits, Cinnamon Rolls, Mini
                              Cinnamon Rolls, etc. .40 (2-14-09)
                              >
                              The first value (large number) is the UPC, the next element is the
                              coupon description, followed by a date in parenthesis. Those are the
                              only three elements I am concerned with. Can someone help me in
                              reformatting this:
                              >
                              409220000003 Life Fitness Products $1 (12-13-08) (CVS)
                              546500181141 Oust Air Sanitizer, any B1G1F up to $3.49 (1-17-09) .35
                              each
                              518000159258 Pillsbury Crescent Dinner Rolls, any .25 (2-14-09)
                              518000550406 Pillsbury Frozen Grands Biscuits, Cinnamon Rolls, Mini
                              Cinnamon Rolls, etc. .40 (2-14-09)
                              >
                              into something like this:
                              >
                              "409220000003", "Life Fitness Products $1","12-13-08"
                              "546500181141", "Oust Air Sanitizer, any B1G1F up to $3.49","1-17-09"
                              "518000159258", "Pillsbury Crescent Dinner Rolls, any .25","2-14-09"
                              "518000550406", "Pillsbury Frozen Grands Biscuits, Cinnamon Rolls, Mini
                              Cinnamon Rolls, etc. .40","2-14-09"
                              >
                              Any help, pseudo code, or whatever push in the right direction would
                              be most appreciated. I am a novice Python programmer but I do have a
                              good bit of PHP programming experience.
                              Have a look at the csv module:



                              --
                              Marc-Andre Lemburg
                              eGenix.com

                              Professional Python Services directly from the Source (#1, Nov 21 2008)
                              >>Python/Zope Consulting and Support ... http://www.egenix.com/
                              >>mxODBC.Zope.D atabase.Adapter ... http://zope.egenix.com/
                              >>mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/
                              _______________ _______________ _______________ _______________ ____________
                              2008-11-12: Released mxODBC.Connect 0.9.3 http://python.egenix.com/

                              :::: Try mxODBC.Zope.DA for Windows,Linux,S olaris,MacOSX for free ! ::::


                              eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48
                              D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg
                              Registered at Amtsgericht Duesseldorf: HRB 46611

                              Comment

                              Working...