finding out the number of rows in a CSV file

  • SimonPalmer

    finding out the number of rows in a CSV file

    anyone know how I would find out how many rows are in a csv file?

    I can't find a method which does this on csv.reader.

    Thanks in advance
  • Jon Clements

    #2
    Re: finding out the number of rows in a CSV file

    On Aug 27, 12:16 pm, SimonPalmer <simon.pal...@gmail.com> wrote:
    > anyone know how I would find out how many rows are in a csv file?
    >
    > I can't find a method which does this on csv.reader.
    >
    > Thanks in advance

    You have to iterate each row and count them -- there's no other way
    without supporting information (since each row length is naturally
    variable, you can't even use the file size as an indicator).

    Something like:

    row_count = sum(1 for row in csv.reader(open('filename.csv')))

    hth
    Jon.
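
For readers on Python 3, here is a minimal sketch of the same counting idea. The helper name `count_csv_rows` and the sample data are made up for illustration. The file is opened with `newline=''`, as the csv module documentation recommends, and closed by the `with` block.

```python
import csv
import os
import tempfile


def count_csv_rows(path):
    """Count rows by iterating the reader once; no list is built."""
    # newline='' lets the csv module handle line endings itself.
    with open(path, newline='') as f:
        return sum(1 for _ in csv.reader(f))


# Write a small sample file (hypothetical data), then count its rows.
with tempfile.NamedTemporaryFile('w', suffix='.csv', newline='',
                                 delete=False) as tmp:
    csv.writer(tmp).writerows([['blah', 'blah'], ['waffle'], list('qwerty')])

row_count = count_csv_rows(tmp.name)
os.remove(tmp.name)
print(row_count)
```

Only one row is held in memory at a time, so this works the same way for large files.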


    • Simon Brunning

      #3
      Re: finding out the number of rows in a CSV file

      2008/8/27 SimonPalmer <simon.palmer@gmail.com>:
      > anyone know how I would find out how many rows are in a csv file?
      >
      > I can't find a method which does this on csv.reader.

      len(list(csv.reader(open('my.csv'))))

      --
      Cheers,
      Simon B.
      simon@brunningonline.net


      • Jon Clements

        #4
        Re: finding out the number of rows in a CSV file

        On Aug 27, 12:29 pm, "Simon Brunning" <si...@brunningonline.net> wrote:
        > len(list(csv.reader(open('my.csv'))))

        Not the best of ideas if the row size or number of rows is large!
        Manufacture a list, then discard to get its length -- ouch!


        • Simon Brunning

          #5
          Re: finding out the number of rows in a CSV file

          2008/8/27 Jon Clements <joncle@googlemail.com>:
          > > len(list(csv.reader(open('my.csv'))))
          >
          > Not the best of ideas if the row size or number of rows is large!
          > Manufacture a list, then discard to get its length -- ouch!

          I do try to avoid premature optimization. ;-)

          --
          Cheers,
          Simon B.


          • SimonPalmer

            #6
            Re: finding out the number of rows in a CSV file [Resolved]

            On Aug 27, 12:41 pm, Jon Clements <jon...@googlemail.com> wrote:
            > Not the best of ideas if the row size or number of rows is large!
            > Manufacture a list, then discard to get its length -- ouch!

            Thanks to everyone for their suggestions.

            In my case the number of rows is never going to be that large (<200)
            so it is a practical if slightly inelegant solution.


            • SimonPalmer

              #7
              Re: finding out the number of rows in a CSV file [Resolved]

              On Aug 27, 12:50 pm, SimonPalmer <simon.pal...@gmail.com> wrote:
              > In my case the number of rows is never going to be that large (<200)
              > so it is a practical if slightly inelegant solution

              actually not resolved...

              after reading the file through the csv.reader for the length I cannot
              iterate over the rows. How do I reset the row iterator?


              • Jon Clements

                #8
                Re: finding out the number of rows in a CSV file

                On Aug 27, 12:48 pm, "Simon Brunning" <si...@brunningonline.net> wrote:
                > I do try to avoid premature optimization. ;-)

                :)


                • Jon Clements

                  #9
                  Re: finding out the number of rows in a CSV file [Resolved]

                  On Aug 27, 12:54 pm, SimonPalmer <simon.pal...@gmail.com> wrote:
                  > after reading the file through the csv.reader for the length I cannot
                  > iterate over the rows. How do I reset the row iterator?

                  If you're sure that the number of rows is always less than 200.

                  Slightly modify Simon Brunning's example and do:

                  rows = list(csv.reader(open('filename.csv')))
                  row_count = len(rows)
                  for row in rows:
                      pass  # do something with each row





                  • John Machin

                    #10
                    Re: finding out the number of rows in a CSV file [Resolved]

                    On Aug 27, 9:54 pm, SimonPalmer <simon.pal...@gmail.com> wrote:
                    > after reading the file through the csv.reader for the length I cannot
                    > iterate over the rows.

                    OK, I'll bite: Why do you think you need to know the number of rows in
                    advance?

                    > How do I reset the row iterator?

                    You don't. You throw it away and get another one. You need to seek to
                    the beginning of the file first. E.g.:

                    C:\junk>type foo.csv
                    blah,blah
                    waffle
                    q,w,e,r,t,y

                    C:\junk>type csv2iters.py
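                    # Python 2 syntax (print statement, binary mode for csv)
                    import csv
                    f = open('foo.csv', 'rb')
                    rdr = csv.reader(f)
                    n = 0
                    for row in rdr:
                        n += 1
                    print n, f.tell()
                    f.seek(0)            # rewind the file before building a new reader
                    rdr = csv.reader(f)
                    for row in rdr:
                        print row

                    C:\junk>csv2iters.py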
                    3 32
                    ['blah', 'blah']
                    ['waffle']
                    ['q', 'w', 'e', 'r', 't', 'y']

                    HTH,
                    John
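
A rough Python 3 equivalent of the script above, as a sketch only. The temporary file stands in for `foo.csv`, and the names `n` and `second_pass` are illustrative. The idea is unchanged: the reader cannot be reset, so rewind the underlying file with `seek(0)` and create a fresh `csv.reader`.

```python
import csv
import os
import tempfile

# Create a stand-in for foo.csv with the same three rows.
with tempfile.NamedTemporaryFile('w', suffix='.csv', newline='',
                                 delete=False) as tmp:
    tmp.write("blah,blah\nwaffle\nq,w,e,r,t,y\n")

with open(tmp.name, newline='') as f:
    n = sum(1 for _ in csv.reader(f))   # first pass: count the rows
    f.seek(0)                           # rewind the underlying file
    second_pass = []
    for row in csv.reader(f):           # fresh reader, second pass
        second_pass.append(row)

os.remove(tmp.name)
print(n)
print(second_pass)
```

The file is read twice, which is the price of knowing the count before processing starts.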


                    • SimonPalmer

                      #11
                      Re: finding out the number of rows in a CSV file [Resolved]

                      On Aug 27, 1:15 pm, John Machin <sjmac...@lexicon.net> wrote:
                      > OK, I'll bite: Why do you think you need to know the number of rows in
                      > advance?

                      This is all good, and thanks for your time. I need the number of rows
                      because of the nature of the data and what I do with it on reading. I
                      need to initialise some data structures and that is *much* more
                      efficient if I know in advance the number of rows of data. The cost
                      of reading the file is probably less than that of incrementally
                      extending my internal structures because of their complexity.

                      To be honest these are all good solutions and I think I have a view
                      of csv reading that comes from different technologies plus lack of
                      experience with Python, which just means that I don't know where to
                      look for answers.

                      Very happy that I can now proceed.


                      • TYR

                        #12
                        Re: finding out the number of rows in a CSV file [Resolved]

                        Use csv.DictReader to get a list of dicts (you get one for each row,
                        with the values as the vals and the column headings as the keys) and
                        then do a len(list)?
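
A small sketch of the DictReader suggestion, assuming the file has a header row. The column names and values are invented, and `io.StringIO` stands in for an open file. Note that the header line supplies the keys and is not counted as a row.

```python
import csv
import io

# Stand-in for an open CSV file with a header line (made-up data).
sample = io.StringIO("name,qty\napple,3\npear,5\n")

records = list(csv.DictReader(sample))  # one dict per data row
record_count = len(records)             # header row is not included
first_name = records[0]['name']
print(record_count, first_name)
```

Like `len(list(csv.reader(...)))`, this holds every row in memory at once.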


                        • Peter Otten

                          #13
                          Re: finding out the number of rows in a CSV file [Resolved]

                          Jon Clements wrote:
                          > On Aug 27, 12:54 pm, SimonPalmer <simon.pal...@gmail.com> wrote:
                          > > after reading the file through the csv.reader for the length I cannot
                          > > iterate over the rows. How do I reset the row iterator?
                          >
                          > If you're sure that the number of rows is always less than 200.

                          Or 2000. Or 20000...

                          Actually any number that doesn't make your machine fall into a coma will do.

                          > Slightly modify Simon Brunning's example and do:
                          >
                          > rows = list(csv.reader(open('filename.csv')))
                          > row_count = len(rows)
                          > for row in rows:
                          >     # do something

                          Peter
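
A minimal sketch of the read-once approach, combined with the pre-sizing use case mentioned earlier in the thread. The `widths` list is a made-up stand-in for the "internal structures", and `io.StringIO` replaces the real file.

```python
import csv
import io

# Stand-in for an open CSV file (made-up data).
sample = io.StringIO("blah,blah\nwaffle\nq,w,e,r,t,y\n")

rows = list(csv.reader(sample))  # read every row exactly once
row_count = len(rows)            # count is known before processing

widths = [None] * row_count      # pre-size a structure from the count
for i, row in enumerate(rows):
    widths[i] = len(row)         # "do something" with each row

print(row_count, widths)
```

The list can be iterated as many times as needed, so no reset of the reader is required.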


                          • John S

                            #14
                            Re: finding out the number of rows in a CSV file [Resolved]

                            [OP] Jon Clements wrote:
                            > On Aug 27, 12:54 pm, SimonPalmer <simon.pal...@gmail.com> wrote:
                            > > after reading the file through the csv.reader for the length I cannot
                            > > iterate over the rows. How do I reset the row iterator?

                            A CSV file is just a text file. Don't use csv.reader for counting rows
                            -- it's overkill. You can just read the file normally, counting lines
                            (lines == rows).

                            This is similar to what Jon Clements said, but you don't need the csv
                            module.

                            num_rows = sum(1 for line in open("myfile.csv"))

                            As other posters have said, there is no free lunch. When you use
                            csv.reader, it reads the lines, so once it's finished you're at the
                            end of the file.


                            • Peter Otten

                              #15
                              Re: finding out the number of rows in a CSV file [Resolved]

                              John S wrote:
                              > A CSV file is just a text file. Don't use csv.reader for counting rows
                              > -- it's overkill. You can just read the file normally, counting lines
                              > (lines == rows).

                              Wrong. A field may have embedded newlines:

                              >>> import csv
                              >>> csv.writer(open("tmp.csv", "w")).writerow(["a" + "\n"*10 + "b"])
                              >>> sum(1 for row in csv.reader(open("tmp.csv")))
                              1
                              >>> sum(1 for line in open("tmp.csv"))
                              11

                              Peter
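
The same demonstration as a self-contained Python 3 sketch. It uses in-memory `io.StringIO` buffers instead of `tmp.csv`, and the variable names are illustrative. A single quoted field containing ten newlines is one CSV row but eleven physical lines.

```python
import csv
import io

# Write one row whose only field contains ten embedded newlines.
buf = io.StringIO(newline='')
csv.writer(buf).writerow(["a" + "\n" * 10 + "b"])
text = buf.getvalue()

# Count CSV rows (quoting is respected) and raw lines (it is not).
row_total = sum(1 for _ in csv.reader(io.StringIO(text, newline='')))
line_total = sum(1 for _ in io.StringIO(text, newline=''))

print(row_total, line_total)
```

Counting raw lines is therefore only safe when no field can contain a line break.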
