Endianism

  • jorba101
    New Member
    • Jun 2008
    • 86

    Endianism

    I'm aware of the importance of the concept of endianism, but I don't really know its formal definition.

    In particular, what does "big-endian" mean when applied to network programming?

    i.e., which byte "comes first" from the bus, in a big-endian tx scheme?

    Regarding compilers, for a 16-bit type, will the result of the operation:

    (0xff & ( t_16_data >> 8 ) )

    depend on compiler / platform endianness?
  • weaknessforcats
    Recognized Expert Expert
    • Mar 2007
    • 9214

    #2
    Memory locations are numbered in sequence, like 1,2,3 and 4.

    If a number has its most significant byte in byte 4 above, then it is little endian. That is, the little end is in byte 1.

    Otherwise, if byte 1 is the most significant byte, then it is big endian. That is, the big end is in byte 1.

    How the number resides in memory determines how to process it. Unix/Linux are big endian whereas Windows is little endian.
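
    The layout described above can be checked at run time. A minimal sketch (the function name is mine, not part of any standard API):

    ```c
    #include <stdio.h>
    #include <string.h>

    /* Return 1 on a little endian machine, 0 on a big endian one, by
       inspecting which byte of a multi-byte value sits at the lowest
       address (byte 1 in the numbering above). */
    int is_little_endian(void)
    {
        unsigned int n = 1;
        unsigned char first_byte;
        memcpy(&first_byte, &n, 1);   /* read the byte at the lowest address */
        return first_byte == 1;       /* LSB first => little endian */
    }

    int main(void)
    {
        printf("this machine is %s endian\n",
               is_little_endian() ? "little" : "big");
        return 0;
    }
    ```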

    • jorba101
      New Member
      • Jun 2008
      • 86

      #3
      Ok.

      So it happens to be the opposite of what I had in mind.

      Anyway, going further:

      In general, if I declare:

      Code:
      char c[1];
      Then, is c[0] the lowest or the highest position? That's what's confusing me.

      Thanks,

      • JosAH
        Recognized Expert MVP
        • Mar 2007
        • 11453

        #4
        Originally posted by weaknessforcats
        How the number resides in memory determines how to process it. Unix/Linux are big endian whereas Windows is little endian.
        Sorry, that is not true; the endianness is normally determined by the processor.
        Linux normally runs on an x86 so it uses little endian numbers. Some processors
        can change their endianness to complicate matters even more.

        The de facto standard over the net is big endian so all those X86 machines keep
        on swapping all those bytes back and forth all the time.
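
        That swapping is exactly what the standard htons()/ntohs() conversions do; on a big endian machine they are no-ops. A minimal sketch (assumes a POSIX system for <arpa/inet.h>):

        ```c
        #include <assert.h>
        #include <arpa/inet.h>   /* htons(), ntohs() */

        int main(void)
        {
            unsigned short host = 0x1234;
            unsigned short net  = htons(host);   /* host order -> big endian */

            /* Round-tripping restores the original value regardless of
               the native byte order. */
            assert(ntohs(net) == host);
            return 0;
        }
        ```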

        kind regards,

        Jos

        • weaknessforcats
          Recognized Expert Expert
          • Mar 2007
          • 9214

          #5
          Originally posted by JosAH
          Sorry, that is not true; the endianness is normally determined by the processor.
          Sorry, that is not true. The processor determines the bit order within the register, aka bit-endianness. The order of the bytes as stored in memory is determined by the address space model. It is how the bytes are stored in memory that determines big or little endian, not the processor.

          You are correct, however, on the network. It is big endian.

          • JosAH
            Recognized Expert MVP
            • Mar 2007
            • 11453

            #6
            Originally posted by weaknessforcats
            Sorry, that is not true. The processor determines the bit order within the register, aka bit-endianness. The order of the bytes as stored in memory is determined by the address space model. It is how the bytes are stored in memory that determines big or little endian, not the processor.
            It does though; the bit order is determined by how you wire the data bus to the
            processor. The byte order is determined by the processor and how you wire your
            address bus. The processor either expects the lowest to highest bytes from the
            lowest address or vice versa. That's what big/little endian is all about. I don't
            know of a processor that changes its endianness according to its memory
            address space model.

            kind regards,

            Jos

            • donbock
              Recognized Expert Top Contributor
              • Mar 2008
              • 2427

              #7
              Originally posted by jorba101
              Regarding compilers, for a 16-bit type, will the result of the operation:

              (0xff & ( t_16_data >> 8 ) )

              depend on compiler / platform endianness?
              To the best of my recollection, you can blithely ignore endianness unless you ...
              1. You use different types to access the same location (for instance, sometimes you access a 'long' variable as a 'long' and sometimes as an array of chars). There are very few exceptions, but the rule should be: don't do this.
              2. You are doing memory-mapped I/O and you need the data bits to line up properly with the I/O device.
              3. You are transferring information from one computer to another.
              4. You are writing numeric data to a binary file (or other persistent storage device). This could be viewed as the previous case, a computer transferring data to itself. Strictly speaking, you can do this without worrying about endianness, but that leaves you with a brittle program.
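
              Case 4 can be made robust by always writing a fixed byte order, so the file is readable on any machine. A sketch of one way to do it (write_be32 is a hypothetical helper, not a library function):

              ```c
              #include <stdio.h>

              /* Write a 32-bit value to f in big endian order, independent of the
                 native endianness; returns 0 on success, -1 on a short write. */
              int write_be32(FILE *f, unsigned long v)
              {
                  unsigned char b[4];
                  b[0] = (unsigned char)((v >> 24) & 0xff);
                  b[1] = (unsigned char)((v >> 16) & 0xff);
                  b[2] = (unsigned char)((v >>  8) & 0xff);
                  b[3] = (unsigned char)( v        & 0xff);
                  return fwrite(b, 1, 4, f) == 4 ? 0 : -1;
              }

              int main(void)
              {
                  FILE *f = tmpfile();
                  if (f == NULL)
                      return 1;
                  write_be32(f, 0x11223344UL);   /* stored as 0x11 0x22 0x33 0x44 */
                  fclose(f);
                  return 0;
              }
              ```

              A reader on any platform can then reassemble the value with shifts, as shown earlier in the thread, without caring about its own endianness.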

              Cheers,
              donbock

              • newb16
                Contributor
                • Jul 2008
                • 687

                #8
                Originally posted by jorba101
                Regarding compilers, for a 16-bit type, will the result of the operation:

                (0xff & ( t_16_data >> 8 ) )
                depend on compiler / platform endianness?
                No. Where endianness does make a difference is in code like this:
                Code:
                int t=0x1234;
                char* p = (char*)&t;
                char  x = *p;
                On little endian x is 0x34; on big endian it is 0x12.
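
                In contrast, the expression from the original question operates on the value of t_16_data, not on its representation in memory, so it gives the same result on every platform:

                ```c
                #include <assert.h>

                int main(void)
                {
                    unsigned short t_16_data = 0x1234;

                    /* Shifts and masks work on values, not on memory layout. */
                    unsigned char hi = 0xff & (t_16_data >> 8);   /* 0x12 everywhere */
                    unsigned char lo = 0xff &  t_16_data;         /* 0x34 everywhere */

                    assert(hi == 0x12);
                    assert(lo == 0x34);
                    return 0;
                }
                ```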

                • jorba101
                  New Member
                  • Jun 2008
                  • 86

                  #9
                  Thanks for the explanations.

                  Anyway, back to simpler things.

                  Originally posted by newb16
                  Code:
                  int t=0x1234;
                  char* p = (char*)&t;
                  char  x = *p;
                  On little endian x is 0x34; on big endian it is 0x12.
                  According to weaknessforcats, big-endian means "the lowest address holds the MSB" (the big end in byte 1), and little-endian means "the lowest address holds the LSB" (the little end in byte 1).

                  Let's say our universe is little endian:

                  Addr0 contains 0x34 (little end), Addr0+1 contains 0x12 (the MSB)

                  You take a char pointer to Addr0, and OK, you've got 0x34.

                  Right. I understand.

                  What about network programming? How is endianness defined for network frames? (i.e., should I assume that the byte I receive first from the network sits at the lowest address?)

                  • JosAH
                    Recognized Expert MVP
                    • Mar 2007
                    • 11453

                    #10
                    Originally posted by jorba101
                    What about network programming? How is endianness defined for network frames? (i.e., should I assume that the byte I receive first from the network sits at the lowest address?)
                    Network byte order is big endian, so if you want to send the two-byte
                    number 0x1234 you first send 0x12 and next you send 0x34. If the receiving
                    machine happens to be a little endian machine you have to store the first
                    byte at a higher address or use the ntohs() and ntohl() macros.
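
                    With the macros that looks something like this (read_net16 is my name for the helper; assumes a POSIX system for <arpa/inet.h>):

                    ```c
                    #include <string.h>
                    #include <arpa/inet.h>   /* ntohs() */

                    /* Extract a 16-bit value from a buffer holding bytes in
                       network (big endian) order. */
                    unsigned short read_net16(const unsigned char *buf)
                    {
                        unsigned short net;
                        memcpy(&net, buf, sizeof net);   /* bytes still in network order */
                        return ntohs(net);               /* convert to host order */
                    }

                    int main(void)
                    {
                        const unsigned char frame[2] = { 0x12, 0x34 };
                        return read_net16(frame) == 0x1234 ? 0 : 1;
                    }
                    ```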

                    kind regards,

                    Jos

                    • Banfa
                      Recognized Expert Expert
                      • Feb 2006
                      • 9067

                      #11
                      Originally posted by JosAH
                      Network byte order is big endian order
                      Strictly speaking, shouldn't the answer be "check the protocol you are using"?

                      I know that by convention/standard all the protocols use big-endian byte order, but there is nothing actually stopping a program from opening a socket and sending data in a proprietary protocol that uses little endian byte order, is there?

                      • JosAH
                        Recognized Expert MVP
                        • Mar 2007
                        • 11453

                        #12
                        Originally posted by Banfa
                        Strictly speaking shouldn't the answer be check the protocol you are using?

                        I know that by convention/standard all the protocols use big-endian byte order, but there is nothing actually stopping a program from opening a socket and sending data in a proprietary protocol that uses little endian byte order, is there?
                        Of course, there's no physical law that forbids us to send bits over the wire any
                        way we want. It just isn't a wise thing to do.

                        kind regards,

                        Jos

                        • donbock
                          Recognized Expert Top Contributor
                          • Mar 2008
                          • 2427

                          #13
                          Originally posted by JosAH
                          Network byte order is big endian order so if you want to send the two byte number 0x1234 you first send 0x12 and next you send 0x34. If the receiving machine happens to be a little endian machine you have to store the first byte at a higher address or use the ntohs() and ntohl() macros.
                          Actually, I suggest you assemble the received bytes in a variable without regard for your native endianness. The following snippet extracts a 4-byte unsigned value from a big-endian data stream:
                          Code:
                          const unsigned char *p;      /* Pointer to big-endian data stream */
                          unsigned long TheValue;   /* Variable used to accumulate a 4-byte value */
                          TheValue = *p++;
                          TheValue = (TheValue << 8) | *p++;
                          TheValue = (TheValue << 8) | *p++;
                          TheValue = (TheValue << 8) | *p++;
                          A similar process can be used to convert native data to a big-endian data stream without requiring you to know your native endianness.
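
                          That converse process is a mirror image of the extraction above; a minimal sketch:

                          ```c
                          #include <assert.h>

                          int main(void)
                          {
                              unsigned long TheValue = 0x01020304UL;   /* native value to emit */
                              unsigned char stream[4];                 /* big-endian output buffer */
                              unsigned char *p = stream;

                              *p++ = (unsigned char)((TheValue >> 24) & 0xff);
                              *p++ = (unsigned char)((TheValue >> 16) & 0xff);
                              *p++ = (unsigned char)((TheValue >>  8) & 0xff);
                              *p++ = (unsigned char)( TheValue        & 0xff);

                              /* stream now holds 0x01 0x02 0x03 0x04, lowest index
                                 first, whatever the native byte order is. */
                              assert(stream[0] == 0x01 && stream[3] == 0x04);
                              return 0;
                          }
                          ```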

                          I find that I have to look up big-endian or little-endian to remind myself which means high-order bytes go first. That's why I avoid those terms and use more explicit language when specifying a protocol or data format.

                          Notice that my example above referred to an unsigned value. That's because if endianness is relevant then you shouldn't take for granted the encoding scheme for negative numbers. Two's-complement is ubiquitous, but the C Standard does not require all implementations to use it.

                          My preference is to sidestep endianness and numeric-encoding issues by using text files wherever possible. I have so far chosen to ignore the possibility that I might eventually use a machine/compiler that doesn't support ASCII character encoding. (ASCII encoding is like two's-complement: it is so common that we forget that it isn't mandated by the C Standard.)

                          Cheers,
                          donbock
