Hello,
I'm facing a logic problem while writing a parser in C with VC++.
I have to parse files in chunks of bytes, in round-robin fashion.
That is, when I select a file, the parser reads the first 512 KB (IBUFFSIZE) of data, then moves to the next file and parses it the same way. This lets me parse a number of files spread over different directories uniformly.
I keep metadata in a file that tracks which files have been parsed and how many bytes of each have been parsed so far.
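To make the bookkeeping concrete, here is a minimal sketch of the round-robin offsets described above. The struct and function names (FileProgress, next_chunk_size, round_robin_pass) are hypothetical and only illustrate the idea; the actual seek-and-parse step is left as a comment.

```c
#include <assert.h>
#include <stddef.h>

#define IBUFFSIZE (512L * 1024L)  /* chunk size from the post: 512 KB */

/* Hypothetical per-file record; the post only says the parsed size is tracked. */
struct FileProgress {
    long total_size;      /* corresponds to TotalFileSize */
    long already_parsed;  /* corresponds to AlreadyParsedFileSize */
};

/* Bytes to parse on the next visit: at most IBUFFSIZE, never past the end. */
long next_chunk_size(const struct FileProgress *p)
{
    assert(p != NULL && p->already_parsed <= p->total_size);
    long remaining = p->total_size - p->already_parsed;
    return remaining > IBUFFSIZE ? IBUFFSIZE : remaining;
}

/* One round-robin pass: visit every file once and advance its offset by
   one chunk. Returns how many files still have unparsed data afterwards. */
size_t round_robin_pass(struct FileProgress *files, size_t count)
{
    size_t unfinished = 0;
    for (size_t i = 0; i < count; i++) {
        long chunk = next_chunk_size(&files[i]);
        /* ...seek to files[i].already_parsed and parse `chunk` bytes here... */
        files[i].already_parsed += chunk;
        if (files[i].already_parsed < files[i].total_size)
            unfinished++;
    }
    return unfinished;
}
```

Calling round_robin_pass() repeatedly until it returns 0 would visit every file chunk by chunk, with each file's offset only ever moving forward.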
To do this, I'm using the fseek() function to move the file pointer.
[CODE=cpp]
// if the file is being parsed for the first time
if (TotalFileSize > IBUFFSIZE) {
    fseek(in_file_pointer, IBUFFSIZE, SEEK_SET);
    FileSizeToParse = ftell(in_file);
}
// if the file is being parsed for the second time
FileSize = TotalFileSize - AlreadyParsedFileSize;
if (FileSize > IBUFFSIZE) {
    fseek(in_file_pointer, (AlreadyParsedFileSize + IBUFFSIZE), SEEK_SET);
    FileSizeToParse = ftell(in_file) - AlreadyParsedFileSize;
}
/* set the file position to where parsing should start [0 on the first pass; on the second pass, the number of bytes already parsed] */
fseek(in_file, AlreadyParsedFileSize, SEEK_SET);
// loop that reads data into the buffer and parses it in memory
while ((EOFFLAG = fgets(ibuff, FileSizeToParse, in_file_pointer)) != NULL) {
    /// parsing logic comes here
}
[/CODE]
The problem with this logic: the first time a chunk of data is parsed, it works fine.
But when the file pointer is moved to DataAlreadyParsed and data is then fetched from the file with fgets(), it retrieves the entire chunk of data from the beginning of the file up to the specified location. That is, instead of starting from AlreadyParsedFileSize and reading IBUFFSIZE bytes, it reads from the beginning of the file to AlreadyParsedFileSize+IBUFFSIZE.
Is there any way to specify the from-byte and to-byte positions in the fgets() function? Because of this bug, the parser re-parses data that has already been parsed. I'm getting duplicate data, and the number of duplicates equals the number of times the file has been read.
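For reference, fgets() takes no from/to arguments: it reads from the current file position and stops after a newline or after n-1 characters, so each call returns at most one line. One way to read an explicit byte range is to combine fseek() with fread(). The following is only a sketch under that assumption; read_range is a hypothetical helper, not something from the code above.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: read up to `count` bytes starting at byte `offset`
   into `buf`. Returns the number of bytes actually read, which is smaller
   than `count` near the end of the file, or 0 if the seek fails. */
size_t read_range(FILE *fp, long offset, size_t count, char *buf)
{
    assert(fp != NULL && buf != NULL && offset >= 0);
    if (fseek(fp, offset, SEEK_SET) != 0)
        return 0;
    /* fread does not stop at newlines and does not add a terminating '\0'. */
    return fread(buf, 1, count, fp);
}
```

With this shape, a call such as read_range(in_file, AlreadyParsedFileSize, IBUFFSIZE, ibuff) would fetch exactly one chunk, and the returned count could be added to AlreadyParsedFileSize. A chunk boundary may fall in the middle of a line, so a line-oriented parser would still need to handle a partial last line.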
Can anyone suggest how to get this done? As I'm on Windows (VC++), I can't use many of the built-in C functions for file operations.
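On the Windows point, the standard C stream functions (fopen, fseek, ftell, fread) are part of the Microsoft C runtime, so they should be usable from VC++. What matters more on Windows is the open mode: in text mode, "\r\n" pairs are translated to "\n", so byte offsets no longer match what is on disk. The sketch below assumes the stream was opened with fopen(path, "rb"); file_size is a hypothetical helper for obtaining TotalFileSize.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: size in bytes of a stream opened in binary mode
   ("rb"). Restores the original file position. Returns -1 on error. */
long file_size(FILE *fp)
{
    assert(fp != NULL);
    long saved = ftell(fp);              /* remember where we were */
    if (saved < 0 || fseek(fp, 0L, SEEK_END) != 0)
        return -1L;
    long size = ftell(fp);               /* offset of the end = size */
    if (fseek(fp, saved, SEEK_SET) != 0) /* go back to the saved position */
        return -1L;
    return size;
}
```

Note that long is 32 bits with VC++, so this only covers files up to 2 GB; for larger files the Microsoft runtime provides _fseeki64() and _ftelli64().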
I have found a lot of solutions on this site that helped me build this parser, so I hope I'll get a solution to this nagging bug this time as well.
Thanks
SouravM