Reading Data from a CSV file

  • emaghero
    New Member
    • Oct 2006
    • 85

    Reading Data from a CSV file

    Greetings all,

    I have a small amount of data that I want to read from a CSV file.

    The data is to be stored in a zero-based array of size rows*cols.

    I open the file using an ifstream object

    Code:
    ifstream data;
    data.open(file,ios_base::in);
    I create a buffer to store the data

    Code:
    int buf_size=1024*1024*10;
    char *BUF=new char[buf_size]; // BUF keeps the start of the block for delete[]
    char *buf=BUF;
    I read the data from the file into the buffer

    Code:
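    data.read(buf,buf_size-1); // leave room for the terminator
    buf[data.gcount()]='\0'; // atof and strcspn need a null-terminated string
    I now want to read the data from the buffer and store it in an array of size rows*cols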

    Code:
    int pos;
    
    arr=matrix(rows,cols);//This forms a zero-based array
    
    for(int i=0;i<rows;i++){
    	for(int j=0;j<cols;j++){
    		arr[i][j]=(atof(buf));
    		pos=(int)(strcspn(buf,","));
    		buf=&buf[pos+1];
    	}
    }
    The problem is that this code needs a comma at the end of each line in order for the buffer to advance to the next value and read the data in the correct order. For example:

    633 , 1.47154 , 0.00021,
    832 , 1.46705 , 0.00022,
    1306 , 1.460342 , 0.00034,
    1544 , 1.457424 , 0.00028 ,

    However, my data is stored without the commas at the end of each line.

    How can I change the code in the last snippet so that I can read the data from the buffer and store it in the correct order?

    As it stands now when I run the code it reads the data as follows

    633 , 1.47154 , 0.00021
    1.46705 , 0.00022 ,1.460342
    0.00034 , 1.457424 , 0.00028
    0, 0 , 0

    The command
    Code:
     pos=(int)(strcspn(buf,","));
    means that the first data point on the new line is skipped and the data is not stored correctly.

    Any suggestions on how to rectify this, apart from going into each individual file and adding commas, would be much appreciated.

    Thanks.
  • JosAH
    Recognized Expert MVP
    • Mar 2007
    • 11453

    #2
    Also do a search for "\n"; if that position is nearer/smaller than the position for a comma, use that one instead to advance the buffer to the next entry.

    kind regards,

    Jos


    • emaghero
      New Member
      • Oct 2006
      • 85

      #3
      Thanks very much.

      That did the trick.

      I replaced the line

      Code:
      pos=(int)(strcspn(buf,","));
      with

      Code:
      posa=(int)(strcspn(buf,","));
      posb=(int)(strcspn(buf,"\n"));
      pos=Min(posa,posb); // Min is a template function that returns the minimum of posa and posb
      and that took care of it.
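
A note on the fix above: strcspn treats its second argument as a set of characters to stop at, so a single call with ",\n" should give the same position as taking the minimum of the two separate searches. Here is a minimal sketch of the whole loop written that way. It assumes the buffer is null-terminated and that lines do not end in a trailing comma; parse_grid and the std::vector return type are hypothetical stand-ins for the matrix(rows,cols) helper used in the thread.

```cpp
#include <cstdlib>
#include <cstring>
#include <vector>

// Parse rows*cols numbers from a null-terminated buffer in which values are
// separated by either a comma or a newline.
// parse_grid is a hypothetical name, not code from the thread.
std::vector<std::vector<double>> parse_grid(const char *buf, int rows, int cols)
{
    std::vector<std::vector<double>> arr(rows, std::vector<double>(cols, 0.0));
    for (int i = 0; i < rows; i++) {
        for (int j = 0; j < cols; j++) {
            arr[i][j] = std::atof(buf);
            // Length of the leading run containing neither ',' nor '\n'.
            std::size_t pos = std::strcspn(buf, ",\n");
            if (buf[pos] == '\0') return arr; // ran out of input early
            buf += pos + 1;                   // step past the separator
        }
    }
    return arr;
}
```

Called as parse_grid(BUF, rows, cols) on the buffer from the first post, this leaves any cells it could not fill at 0.0 instead of reading past the end of the data.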


      • donbock
        Recognized Expert Top Contributor
        • Mar 2008
        • 2427

        #4
        How general purpose does this program need to be? Both comma and newline can be escaped in a CSV file -- do you need to properly handle those cases?
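
To illustrate the cases in question: in the common CSV dialect, a field that contains a comma or a newline is wrapped in double quotes, and a literal quote inside such a field is written twice. A rough sketch of splitting one record under those assumptions follows. split_csv_record is a hypothetical helper, not part of the code posted in this thread, and it does not validate malformed input.

```cpp
#include <string>
#include <vector>

// Split one CSV record into fields, honouring double-quoted fields.
// split_csv_record is a hypothetical helper for illustration only.
std::vector<std::string> split_csv_record(const std::string &rec)
{
    std::vector<std::string> fields(1);
    bool quoted = false;
    for (std::size_t i = 0; i < rec.size(); i++) {
        char c = rec[i];
        if (quoted) {
            if (c == '"' && i + 1 < rec.size() && rec[i + 1] == '"') {
                fields.back() += '"'; // doubled quote becomes one literal quote
                i++;
            } else if (c == '"') {
                quoted = false;       // closing quote
            } else {
                fields.back() += c;   // commas and newlines are kept verbatim
            }
        } else if (c == '"') {
            quoted = true;            // opening quote
        } else if (c == ',') {
            fields.emplace_back();    // unquoted comma starts a new field
        } else {
            fields.back() += c;
        }
    }
    return fields;
}
```

For purely numeric data like the sample in the first post, none of this is needed, which is why the simple separator search is enough there.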


        • emaghero
          New Member
          • Oct 2006
          • 85

          #5
          I only need to be able to read in different sets of experimental data that I already have, so I won't be using it as part of something else that will have to accommodate multiple file formats.
