endianness and sscanf/sprintf

  • pramod

    endianness and sscanf/sprintf

    Two different platforms communicate over protocols which consist of
    functions and arguments in ascii form. System might be little
    endian/big endian.

    It is possible to format string using sprintf and retrieve it using
    sscanf.
    Each parameter has a delimiter, data type size is ported to the
    platform, and expected argument order is known.

    Is this approach portable w.r.t. endianness?


    regards,
    Pramod
  • John Carson

    #2
    Re: endianness and sscanf/sprintf

    "pramod" <spramod_in@yahoo.com> wrote in message
    news:c65f94e1.0312310120.3b1bce30@posting.google.com
    > Two different platforms communicate over protocols which consist of
    > functions and arguments in ascii form. System might be little
    > endian/big endian.
    >
    > It is possible to format string using sprintf and retrieve it using
    > sscanf.
    > Each parameter has a delimiter, data type size is ported to the
    > platform, and expected argument order is known.
    >
    > Is this approach portable w.r.t. endianness?
    >
    >
    > regards,
    > Pramod


    Endianness only affects the way that integers are stored (and perhaps
    floating point numbers; I am not sure). It does not affect the storage of
    characters, so it is not an issue if you are only sending text.
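
To make the distinction concrete, here is a minimal C sketch (an editorial illustration, not code from the thread; the names host_is_little_endian and format_u32 are hypothetical). It contrasts the raw in-memory bytes of an integer, whose order depends on the host, with the decimal text produced by snprintf, which is the same sequence of characters on any host.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Returns 1 if the lowest-addressed byte of a 32-bit integer holds its
   least significant bits (little-endian host), 0 otherwise. */
int host_is_little_endian(void)
{
    uint32_t probe = 1;
    unsigned char first;
    memcpy(&first, &probe, 1);   /* inspect the first byte in memory */
    return first == 1;
}

/* Writes value as decimal text into buf. Returns the number of characters
   written (excluding the terminator), or -1 if the text did not fit. */
int format_u32(char *buf, size_t size, uint32_t value)
{
    assert(buf != NULL && size > 0);
    int n = snprintf(buf, size, "%lu", (unsigned long)value);
    return (n < 0 || (size_t)n >= size) ? -1 : n;
}
```

Given 0x12345678, format_u32 writes the characters 305419896 regardless of the host; only host_is_little_endian gives a host-dependent answer.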


    --
    John Carson
    1. To reply to email address, remove donald
    2. Don't reply to email address (post here instead)

    • EventHelix.com

      #3
      Re: endianness and sscanf/sprintf

      You will be fine as everything is being converted to characters.
      As long as characters are represented as 8 bytes, the numbers
      will be interpreted correctly. Java bytecodes use the same approach.

      The following article discusses the endianness in detail:



      Sandeep
      --
      Sequence diagram based systems engineering and architecture design tool. Built in support for alternative scenarios and multi-tier architectures.

      EventStudio 2.0 - Go Beyond UML Use Case and Sequence Diagrams

      • Richard Heathfield

        #4
        Re: endianness and sscanf/sprintf

        EventHelix.com wrote:
        > You will be fine as everything is being converted to characters.
        > As long as characters are represented as 8 bytes, the numbers
        > will be interpreted correctly.

        In C (and, as far as I am aware, C++ too), characters are always represented
        in a single byte. Character /constants/ are represented (in C, but not C++)
        by the int type, which might conceivably be eight bytes. Is that what you
        meant?
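
The difference described here can be observed with sizeof. A minimal sketch, assuming the file is compiled as C rather than C++ (the function names are hypothetical and not from the thread):

```c
#include <assert.h>
#include <stddef.h>

/* Size in bytes of a char object; 1 by definition in both C and C++. */
size_t size_of_char_object(void)
{
    char c = 'a';
    assert(sizeof c == sizeof(char));
    return sizeof c;
}

/* Size in bytes of the character constant 'a'.
   In C the constant has type int, so this equals sizeof(int).
   Compiled as C++ the same expression would yield 1. */
size_t size_of_char_constant(void)
{
    return sizeof 'a';
}
```

Neither result says anything about how many bits a byte has; it only shows that a character constant in C is an int.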

        (Followups set to comp.lang.c)

        --
        Richard Heathfield : binary@eton.powernet.co.uk
        "Usenet is a strange place." - Dennis M Ritchie, 29 July 1999.
        C FAQ: http://www.eskimo.com/~scs/C-faq/top.html
        K&R answers, C books, etc: http://users.powernet.co.uk/eton

        • Martijn Lievaart

          #5
          Re: endianness and sscanf/sprintf

          On Wed, 31 Dec 2003 01:20:44 -0800, pramod wrote:
          > Two different platforms communicate over protocols which consist of
          > functions and arguments in ascii form. System might be little
          > endian/big endian.
          >
          > It is possible to format string using sprintf and retrieve it using
          > sscanf.
          > Each parameter has a delimiter, data type size is ported to the
          > platform, and expected argument order is known.
          >
          > Is this approach portable w.r.t. endianness?

          Yes, and it is a very good way to do it, but only if you are really using ASCII;
          otherwise you may end up mixing codesets. Consider using UTF-8 if you use
          characters >=128 (i.e. not ASCII).
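
As an illustration of the approach under discussion, here is a minimal round-trip sketch in C. The message layout ("function|int|long" with '|' as the delimiter) and the names encode_call and decode_call are assumptions made up for this example; the original poster's actual protocol is not shown in the thread.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical message layout: "<function>|<int argument>|<long argument>".
   Returns 0 on success, -1 if the message did not fit in buf. */
int encode_call(char *buf, size_t size, const char *func, int a, long b)
{
    assert(buf != NULL && func != NULL);
    int n = snprintf(buf, size, "%s|%d|%ld", func, a, b);
    return (n >= 0 && (size_t)n < size) ? 0 : -1;
}

/* Parses a message produced by encode_call.
   func must have room for at least 32 characters.
   Returns 0 if all three fields were read, -1 otherwise. */
int decode_call(const char *msg, char *func, int *a, long *b)
{
    assert(msg != NULL && func != NULL && a != NULL && b != NULL);
    /* %31[^|] reads at most 31 characters that are not the delimiter. */
    return sscanf(msg, "%31[^|]|%d|%ld", func, a, b) == 3 ? 0 : -1;
}
```

Byte order never enters the picture because only characters cross the wire. The value ranges still matter, though: a number sent by one side has to fit the type the other side reads it into.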

          HTH,
          M4

          • Jeff Schwab

            #6
            Re: endianness and sscanf/sprintf

            EventHelix.com wrote:
            > You will be fine as everything is being converted to characters.
            > As long as characters are represented as 8 bytes,

            bits?
            > the numbers
            > will be interpreted correctly. Java bytecodes use the same approach.
            >
            > The following article discusses the endianness in detail:
            >
            > http://www.eventhelix.com/RealtimeMa...ndOrdering.htm
            >
            > Sandeep
            > --
            > http://www.EventHelix.com/EventStudio
            > EventStudio 2.0 - Go Beyond UML Use Case and Sequence Diagrams

            • Peter Pichler

              #7
              Re: endianness and sscanf/sprintf

              "Jeff Schwab" <jeffplus@comcast.net> wrote...
              > EventHelix.com wrote:
              > > You will be fine as everything is being converted to characters.
              > > As long as characters are represented as 8 bytes,
              >
              > bits?

              Not that it matters. The second sentence almost invalidates the otherwise
              perfectly correct first ;-)

              Peter


              • EventHelix.com

                #8
                Re: endianness and sscanf/sprintf

                Richard Heathfield <invalid@address.co.uk.invalid> wrote in message news:<3ff42d01@news2.power.net.uk>...
                > EventHelix.com wrote:
                >
                > > You will be fine as everything is being converted to characters.
                > > As long as characters are represented as 8 bytes, the numbers
                > > will be interpreted correctly.
                >
                > In C (and, as far as I am aware, C++ too), characters are always represented
                > in a single byte. Character /constants/ are represented (in C, but not C++)
                > by the int type, which might conceivably be eight bytes. Is that what you
                > meant?
                >
                > (Followups set to comp.lang.c)

                Typo: it should have been "8 bits" (i.e. byte).

                Sandeep

                • Richard Heathfield

                  #9
                  Re: endianness and sscanf/sprintf

                  EventHelix.com wrote:
                  > Richard Heathfield <invalid@address.co.uk.invalid> wrote in message
                  > news:<3ff42d01@news2.power.net.uk>...
                  >> EventHelix.com wrote:
                  >>
                  >> > You will be fine as everything is being converted to characters.
                  >> > As long as characters are represented as 8 bytes, the numbers
                  >> > will be interpreted correctly.
                  >>
                  >> In C (and, as far as I am aware, C++ too), characters are always
                  >> represented in a single byte. Character /constants/ are represented (in
                  >> C, but not C++) by the int type, which might conceivably be eight bytes.
                  >> Is that what you meant?
                  >>
                  > Typo: it should have been "8 bits" (i.e. byte).

                  But there is no requirement in either C or C++ for a byte to be exactly 8
                  bits; only that it must be /at least/ 8 bits.
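
The number of bits in a C byte is available as CHAR_BIT from <limits.h>. A minimal sketch (the helper names bits_per_byte and storage_bits_in_int are hypothetical, added here only for illustration):

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Number of bits in one C byte on this implementation.
   The standard guarantees at least 8, not exactly 8. */
int bits_per_byte(void)
{
    assert(CHAR_BIT >= 8);
    return CHAR_BIT;
}

/* Storage occupied by an int, in bits (this count includes any padding bits). */
size_t storage_bits_in_int(void)
{
    return sizeof(int) * CHAR_BIT;
}
```

On most current desktop and server platforms bits_per_byte returns 8, but code that has to be portable to, say, some DSPs should not hard-code that value.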

                  --
                  Richard Heathfield : binary@eton.powernet.co.uk
                  "Usenet is a strange place." - Dennis M Ritchie, 29 July 1999.
                  C FAQ: http://www.eskimo.com/~scs/C-faq/top.html
                  K&R answers, C books, etc: http://users.powernet.co.uk/eton

                  • Martijn Lievaart

                    #10
                    Re: endianness and sscanf/sprintf

                    On Fri, 02 Jan 2004 05:53:31 +0000, Richard Heathfield wrote:
                    > EventHelix.com wrote:
                    >
                    >> Richard Heathfield <invalid@address.co.uk.invalid> wrote in message
                    >> news:<3ff42d01@news2.power.net.uk>...
                    >>> EventHelix.com wrote:
                    >>>
                    >>> > You will be fine as everything is being converted to characters.
                    >>> > As long as characters are represented as 8 bytes, the numbers
                    >>> > will be interpreted correctly.

                    Even assuming you meant 8 bits, this is not true. If one system uses ASCII
                    and the other uses EBCDIC, you're screwed. Even the subtle distinctions
                    between ISO 8859-1 (Latin-1) and ISO 8859-15 (Latin-9), two almost compatible and often used
                    character sets, might bite you. All of these use 8 bits (well OK, ASCII
                    uses 7).
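
A program can at least check what its own character set looks like before it emits protocol text. The sketch below is an editorial illustration with hypothetical names (looks_like_ascii, digit_value). It only inspects the local execution character set and says nothing about the peer, so the protocol itself still has to state which encoding is used on the wire.

```c
#include <assert.h>

/* Returns 1 if a few sample characters have their ASCII codes in the
   execution character set, 0 otherwise (for example on an EBCDIC system). */
int looks_like_ascii(void)
{
    return 'A' == 0x41 && 'a' == 0x61 && '0' == 0x30 &&
           ' ' == 0x20 && '|' == 0x7C;
}

/* Converts a decimal digit character to its value. This works in any
   character set because C guarantees that '0'..'9' are contiguous. */
int digit_value(char c)
{
    assert(c >= '0' && c <= '9');
    return c - '0';
}
```

If looks_like_ascii returns 0, the text produced by sprintf would need to be translated to the wire encoding before sending.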
                    >>>
                    >>> In C (and, as far as I am aware, C++ too), characters are always
                    >>> represented in a single byte. Character /constants/ are represented (in
                    >>> C, but not C++) by the int type, which might conceivably be eight bytes.
                    >>> Is that what you meant?
                    >>>
                    >> Typo: it should have been "8 bits" (i.e. byte).
                    >
                    > But there is no requirement in either C or C++ for a byte to be exactly 8
                    > bits; only that it must be /at least/ 8 bits.

                    But note the unfortunate discrepancy between the meaning of the word byte
                    in C/C++ and its meaning when measuring storage. However, C/C++ is not alone here:
                    Internet standards talk about octets when they mean 8 bits.

                    The same goes for the unit "word", which means different things to different people.
                    The way I learned it at uni, a very long time ago, was that a word was the
                    basic unit of storage, the same as the definition of byte in C/C++. Along came
                    Microsoft and institutionalised the word size of the 8086 as a WORD, so to
                    others a word is now 16 bits. I've even seen different uses of the word
                    'word'; anyone got an example?

                    Why am I saying this? Because in the context of C/C++ a byte has a defined
                    meaning. However, in the context of disks and memory, a byte has a
                    different meaning. When the context is not clear it is very easy to get
                    confused. Ah, I hear you say, but this is a C/C++ group, so the meaning is
                    clear. That may be true, but:
                    - The problem described a certain context, one where many people
                    (incorrectly) use the word byte to mean 8 bits.
                    - It is very confusing to people anyhow. Youngsters are raised with the
                    notion that a byte is 8 bits.

                    In the end, we can only conclude that this difference in meaning is very
                    unfortunate. Technically, an octet is the correct term for 8 bits. But
                    we're never going to change the common use of byte anymore. In the
                    meantime we'll have to live with it.

                    I just wish the C/C++ standards had used a different term than byte.
                    Even word would have been better.

                    M4

                    • Keith Thompson

                      #11
                      Re: endianness and sscanf/sprintf

                      Martijn Lievaart <m@remove.this.part.rtij.nl> writes:
                      [...]
                      > But note the unfortunate discrepancy between the meaning of the word byte
                      > in C/C++ and that of measuring storage. However, C/C++ is not alone here,
                      > Internet standards talk about octets when they mean 8 bits.
                      >
                      > Same with the unit words. That means different things to different people.
                      > The way I learned it at uni, very long time ago, was that a word was the
                      > basic unit of storage. Same as the definition of byte in C/C++. Along came
                      > Microsoft and institutionalised the word-size of the 8086 as a WORD, so to
                      > others a word now is 16 bits. I've seen even different uses of the word
                      > 'word', anyone got an example?
                      [...]
                      > I just wished the C/C++ standards had used a different term than byte.
                      > Even word would have been better.

                      I agree that it would have avoided a lot of confusion if the C and C++
                      standards had used a term other than "byte" (perhaps "storage unit").
                      While I'm wishing for things that didn't happen, it would also have
                      been nice if the concept hadn't been tied to the size of a character.

                      I think (but I'm not sure, and it doesn't really matter) that the use
                      of the word "word" predates the 8086 (and it probably would have been
                      Intel, not Microsoft, that introduced the word "word" in descriptions
                      of CPU instruction operand sizes). Most or all CPUs I've seen use the
                      words "byte" and "word" to refer to operand sizes. The meaning of a
                      "word" varies across architectures far more than the meaning of
                      "byte".

                      --
                      Keith Thompson (The_Other_Keith) kst-u@mib.org <http://www.ghoti.net/~kst>
                      San Diego Supercomputer Center <*> <http://www.sdsc.edu/~kst>
                      Schroedinger does Shakespeare: "To be *and* not to be"
                      (Note new e-mail address)

                      • Martijn Lievaart

                        #12
                        Re: endianness and sscanf/sprintf

                        On Fri, 02 Jan 2004 20:45:45 +0000, Keith Thompson wrote:
                        > I think (but I'm not sure, and it doesn't really matter) that the use
                        > of the word "word" predates the 8086 (and it probably would have been
                        > Intel, not Microsoft, that introduced the word "word" in descriptions
                        > of CPU instruction operand sizes). Most or all CPUs I've seen use the
                        > words "byte" and "word" to refer to operand sizes. The meaning of a
                        > "word" varies across architectures far more than the meaning of
                        > "byte".

                        Exactly what I was trying to say. For instance, the CDC used 60-bit words. (No
                        wonder that design is extinct :-).

                        M4

                        • Lew Pitcher

                          #13
                          Re: endianness and sscanf/sprintf

                          Martijn Lievaart wrote:
                          [snip]
                          > Same with the unit words. That means different things to different people.
                          > The way I learned it at uni, very long time ago, was that a word was the
                          > basic unit of storage. Same as the definition of byte in C/C++. Along came
                          > Microsoft and institutionalised the word-size of the 8086 as a WORD, so to
                          > others a word now is 16 bits. I've seen even different uses of the word
                          > 'word', anyone got an example?

                          In the IBM mainframe world, a "word" (or "fullword") has been 32bits for the
                          last 40+ years. A 16bit quantity is a "halfword".

                          [snip]


                          --
                          Lew Pitcher

                          Master Codewright and JOAT-in-training
                          Registered Linux User #112576 (http://counter.li.org/)
                          Slackware - Because I know what I'm doing.

                          • pete

                            #14
                            Re: endianness and sscanf/sprintf

                            Lew Pitcher wrote:
                            >
                            > Martijn Lievaart wrote:
                            > [snip]
                            > > Same with the unit words.
                            > > That means different things to different people.
                            > > The way I learned it at uni, very long time ago,
                            > > was that a word was the basic unit of storage.
                            > > Same as the definition of byte in C/C++. Along came
                            > > Microsoft and institutionalised the word-size of
                            > > the 8086 as a WORD, so to others a word now is 16 bits.
                            > > I've seen even different uses of the word
                            > > 'word', anyone got an example?
                            >
                            > In the IBM mainframe world, a "word" (or "fullword")
                            > has been 32bits for the
                            > last 40+ years. A 16bit quantity is a "halfword".

                            I'm familiar with "word" having a similar meaning as
                            the traditional meaning of "int", having the
                            "natural size suggested by the architecture
                            of the execution environment"

                            --
                            pete

                            • Ron Natalie

                              #15
                              Re: endianness and sscanf/sprintf


                              "Lew Pitcher" <lpitcher@sympatico.ca> wrote in message news:fq77tb.9uq.ln@merlin.l6s4x6-4.ca...
                              >
                              > In the IBM mainframe world, a "word" (or "fullword") has been 32bits for the
                              > last 40+ years. A 16bit quantity is a "halfword".

                              Back when I was heavily into PDP-11's (16 bits), my mainframe friends referred
                              to my computers as halfword machines.

                              Just about every 32 bit processor (with the exception of the x86 stuff) calls a
                              WORD 32 bits. Even on the 386+ the word size really is 32 bits, but since
                              the thing is upward compatible with the old 16 bit 8086... they call words DWORDS.

                              On the 7094 and its follow-ons (including the UNIVAC and the DEC-10/20) the
                              word size is 36 bits. Anything smaller is a "partial word" (for which there are no fixed
                              divisions, leading to amusing things such as the same hardware supporting byte sizes
                              from 5 to 9 bits).

                              I've worked on 64 bit word machines. The CRAY is word addressed...there really
                              is NO such hardware datatype other than 64 bit integrals and 64 bit reals. Chars
                              are an unholy kludge in software (they didn't even try anything else; sizeof any non-composite
                              type is either 8 or 64).

                              Never say die, the 64 bit word machines are coming back (AMD, IA64, etc...)!
