Reading from a binary file

  • PatriDa
    New Member
    • Jan 2010
    • 3

    Reading from a binary file

    I need to read from a binary file that stores data as an array of bytes, but without interpreting them (I mean: I cannot read them as int or double). I need them just as bytes. Later on I will be able to interpret them accordingly, because I will have information about the data type, so I can say "read them as int" or "read them as double".

    Can anyone help me and explain the best way to read them? I have thought of using char * to store them, but I have also read something about uint8_t (in that case I think the bytes are treated as unsigned int, but I'm not sure) and about void *...
  • Banfa
    Recognized Expert Expert
    • Feb 2006
    • 9067

    #2
    Commonly uint8_t (a C99 type) is defined as unsigned char. Either uint8_t or unsigned char would be the data type to use to store file data. char * and void * are not ways to store any data; they are pointers, which could point to the location where you stored the file data.

    However, you should not use char or char * for raw data. When dealing with raw data it is best to use unsigned types to avoid sign-extension complexities. Whether plain char is signed or unsigned is platform specific, so you should always qualify it with the signed or unsigned keyword.

    The one instance where you should use unadorned char is when it is part of a C-style string, which hopefully won't be very often for you as you are using C++.
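To illustrate the sign-extension point, here is a minimal sketch. The function names are hypothetical, and it assumes a typical two's-complement platform where the byte 0xFF held in a signed char has the value -1.

```cpp
// Widening a byte held in a signed char: 0xFF (stored as -1) stays -1 as an int.
int widenSigned(signed char byte) { return byte; }

// Widening the same byte held in an unsigned char: 0xFF becomes 255.
int widenUnsigned(unsigned char byte) { return byte; }

// Combining two signed bytes: if `low` is negative, its sign-extension bits
// overwrite the high byte in the OR. (A negative `high` would also make the
// shift itself problematic before C++20.)
int combineSigned(signed char high, signed char low) {
    return (high << 8) | low;
}

// Combining two unsigned bytes: both promote to non-negative ints,
// so each byte lands exactly where intended.
int combineUnsigned(unsigned char high, unsigned char low) {
    return (high << 8) | low;
}
```

With high byte 0x12 and low byte 0xFF, the unsigned version gives 0x12FF, while the signed version gives -1 because the low byte was widened to all ones before the OR.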

    • PatriDa
      New Member
      • Jan 2010
      • 3

      #3
      Thanks, Banfa, for your quick reply. If I understood correctly, I should use an unsigned char array to store what I read from the file. But I still have doubts...

      Let me show you an example:
      I'm supposed to read 16388 bytes from the data file and then interpret them as 8192 short values, 2 bytes per value (the 4 bytes at the end of the data block are not valid data). What I'm doing right now is:

      Code:
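      unsigned char * dataBlock;
      int bytesToRead = 16388;
      // allocate the same element type as the pointer
      dataBlock = new unsigned char[bytesToRead];
      // istream::read expects a char *, so cast the unsigned buffer
      datafile.read(reinterpret_cast<char *>(dataBlock), bytesToRead);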
      Is this right? If so, my next problem is: knowing that this data block must contain 8192 values, what should I do to interpret the bytes in dataBlock as short values, skipping of course the 4 bytes at the end, which are not data?

      Sorry if my questions are so basic, but I'm stuck on this and I'm not able to see the solution. Thanks again!

      • Banfa
        Recognized Expert Expert
        • Feb 2006
        • 9067

        #4
        That code looks more or less correct. You might want to put a try/catch block around the new statement in case the memory allocation fails and throws an exception.

        Once you have the data in an array it is fairly easy to convert any 2 bytes to a short value using the shift (<<) and or (|) or addition (+) operators. However, you will need to know the byte ordering of the file.

        Assume a short is 16 bits, numbered 0 - 15 from least significant bit to most significant bit. This is split into 2 bytes, one containing bits 0 - 7 and the other containing bits 8 - 15. The computer can store these 2 bytes in memory in any order it wants as long as they are contiguous (first one, then the other). The same is true for the data in the file: the bytes making up the shorts can be stored in either order. You need to know that order before you can calculate the actual short value. Then you can do a calculation of the form

        shortValue = highByteValue shifted left 8 bits + lowByteValue

        There are 2 common byte orderings, called little endian and big endian. In little-endian ordering the least significant byte is placed first in memory; in big-endian ordering the most significant byte is placed first in memory.
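The calculation above can be sketched as follows. The ByteOrder enum and decodeShorts function are hypothetical names, and the sketch assumes the file stores 16-bit two's-complement values; which byte order applies has to come from the file format's documentation.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

enum class ByteOrder { Little, Big };

// Decode `count` 16-bit signed values from the start of `bytes`.
// Any bytes after the first 2 * count are ignored.
// Returns an empty vector if there are not enough bytes.
std::vector<std::int16_t> decodeShorts(const std::vector<unsigned char>& bytes,
                                       std::size_t count, ByteOrder order) {
    std::vector<std::int16_t> values;
    if (bytes.size() / 2 < count) {
        return values;
    }
    values.reserve(count);
    for (std::size_t i = 0; i < count; ++i) {
        const unsigned first = bytes[2 * i];
        const unsigned second = bytes[2 * i + 1];
        // Little endian: low byte comes first. Big endian: high byte comes first.
        const unsigned high = (order == ByteOrder::Big) ? first : second;
        const unsigned low = (order == ByteOrder::Big) ? second : first;
        const std::uint16_t raw = static_cast<std::uint16_t>((high << 8) | low);
        // Reinterpret as signed; values above 32767 wrap to negative on
        // two's-complement platforms (guaranteed from C++20 onwards).
        values.push_back(static_cast<std::int16_t>(raw));
    }
    return values;
}
```

For the 16388-byte block in this thread, passing count = 8192 would decode the first 16384 bytes and leave the 4 trailing bytes untouched.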

        • donbock
          Recognized Expert Top Contributor
          • Mar 2008
          • 2427

          #5
          Originally posted by PatriDa
          I need to read from a binary file that stores data as an array of bytes, but without interpreting them (I mean: I cannot read them as int or double). I need them just as bytes. Later on I will be able to interpret them accordingly, because I will have information about the data type, so I can say "read them as int" or "read them as double".
          Are you absolutely certain the binary file will always be created by a program running in the same environment (compiler, operating system, processor) as the one you're using to read the file? If not, it will be difficult for you to reliably decode the information in the binary file. You can't assume that two environments will have the same encoding rules for integral or floating-point numbers.

          • PatriDa
            New Member
            • Jan 2010
            • 3

            #6
            Thanks for your answers, Banfa and donbock!!!

            As I had a deadline, I couldn't spend much time solving this problem, so in the end I handled it another way. Just for the record, I deferred reading from the file until I actually knew the type of data I was supposed to read. I just stored the positions in the file where I had to read, and how many values.

            I encountered many problems when trying to read float or double values the way you told me, Banfa, using "formulas" to interpret the bytes correctly. There were differences between the value I got and the real value (about 0.0009, but for the purpose of my program this was too much).

            Nevertheless, thank you so much for your answers, and I'm really sorry if I wasted your time!

            PatriDA
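
For float values, one alternative to rebuilding the number with arithmetic is to copy the raw bytes straight into a float object, which reproduces the stored value bit for bit. The sketch below is only that, a sketch: floatFromBytes is a hypothetical helper, and it assumes the file holds floats in the same representation and byte order as the machine reading them, which is exactly the condition donbock warns about.

```cpp
#include <cstddef>
#include <cstring>
#include <stdexcept>
#include <vector>

// Hypothetical helper: reinterpret sizeof(float) bytes at `offset` as a float.
// ASSUMPTION: the bytes were written in this machine's own float format
// and byte order. No conversion of any kind is performed.
float floatFromBytes(const std::vector<unsigned char>& bytes, std::size_t offset) {
    if (offset > bytes.size() || bytes.size() - offset < sizeof(float)) {
        throw std::out_of_range("not enough bytes for a float at this offset");
    }
    float value;
    // memcpy is the well-defined way to copy an object representation;
    // casting the byte pointer to float* would risk alignment and aliasing problems.
    std::memcpy(&value, bytes.data() + offset, sizeof value);
    return value;
}
```

If the file uses the opposite byte order, the bytes of each value would need to be reversed before the copy. The same approach works for double by changing the type.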
