Integer demotion requirements in ANSI-C

  • tsuyois
    New Member
    • Jan 2009
    • 6

    Hi, I just joined this excellent network.
    I hope to get answers to the many questions
    I have about writing C compilers.

    My first question is:

    Is "integer demotion" required in ANSI-C?

    Assumption:
    - CPU: 32-bit RISC (int = long = 4 bytes, short = 2 bytes, char = 1 byte)
    - All integral operations done on 4 bytes (signed or unsigned int)
    - Each 32-bit register holds one integral type data that is
    "already" integer promoted, so that arithmetic conversion is
    unnecessary when used as operands in integral operations.
    - Integer demotion on mem-store and integer promotion on mem-load is
    handled by the memory interface hardware. So if you save a short integer to
    memory, and then read it back, it will have correct "already integer promoted"
    bit pattern in the register.

    The question here is: do I need to perform integer demotion where the destination is another register?

    Example 1:
    char i;
    for(i = 0; i < n; i = i + 1){...}

    Here, "char i" would most likely be allocated to internal register.
    Then integer demotion for (i = i + 1) would require the below code sequence:
    (tmp: signed int register)
    tmp = i + 1;
    tmp = tmp << 24;
    tmp = tmp >> 24;
    i = tmp;
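
A minimal sketch of the same demotion in portable C, assuming an 8-bit char and a 32-bit register modeled as int32_t. The helper name demote_s8 is hypothetical. The XOR/subtract form stands in for the shift pair, because left-shifting a negative signed value is not well defined in standard C:

```c
#include <stdint.h>

/* Hypothetical helper: keep the low 8 bits of a 32-bit register value
   and sign-extend them, as the shift-left/shift-right pair would. */
static int32_t demote_s8(int32_t reg)
{
    uint32_t low = (uint32_t)reg & 0xffu;   /* discard bits 8..31 */
    return (int32_t)(low ^ 0x80u) - 0x80;   /* map 0x80..0xff to -128..-1 */
}
```

With this helper, i = i + 1 corresponds to i = demote_s8(i + 1), so a register holding 127 wraps to -128 after the increment instead of becoming 128.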

    Such integer demotion seems redundant since:
    - if(n < 128): i never overflows
    - if(n >= 128): (i < n) will always be true (infinite loop: must be a bug??)

    Unless the latter case is the actual intention of the programmer,
    I don't want to produce such integer demotion code here.

    Example 2:
    char c = val;
    if(c < val)

    Here, the intention can be:
    - simply checking the sign of val (programmer assumes -128 <= val < 128)
    - checking the bit-7 of val

    If the former case is guaranteed, then integer demotion is unnecessary.

    Example 3:
    unsigned char c = val;
    <c used in other integral operations>

    Here, the intention COULD be the mask operation: c = val & 0xff; in which case
    the programmer expects the integer demotion code here.
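
A small sketch of why the mask is expected here, assuming an 8-bit unsigned char (the function names are illustrative only): converting a value to unsigned char and masking it with 0xff should give the same result.

```c
#include <stdint.h>

/* What the C conversion produces when val is assigned to an unsigned char. */
static uint32_t via_conversion(uint32_t val)
{
    unsigned char c = (unsigned char)val;
    return c;
}

/* What a compiler can emit for a register-held value: a plain mask. */
static uint32_t via_mask(uint32_t val)
{
    return val & 0xffu;
}
```

For val = 0x1234 both functions return 0x34, so dropping the mask for a register-held c would change what later operations on c see.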

    In Microsoft's C reference, it says:
    "When the value with integral type is demoted to a signed integer with
    smaller size, or an unsigned integer is converted to its corresponding
    signed integer, the value is unchanged if it can be represented in the
    new type. ... If it cannot be represented, the result is implementation-defined."

    This seems to suggest that for my assumed RISC (where register holds integer
    promoted values), integer demotion is unnecessary. But I have checked that
    Microsoft's Visual-C does actually perform integer demotion in my first
    for-loop example. My guess is that this is true for most other compilers.

    So, to summarize my question:
    - Is "integer demotion" required in ANSI-C?
    - If not, as a compiler writer, is it "reasonable" to say to the programmers
    "it's your responsibility to make sure the values assigned to short integers
    can fit into that short type, because with my compiler, integer demotion
    may or may not be performed (depending on the destination: register or memory)"?

    I appreciate any inputs on this issue.

    Regards, tsuyois
  • tsuyois
    New Member
    • Jan 2009
    • 6

    #2
    Sorry. There was one typo:

    Example 2:
    char c = val;
    if(c < 0) /// if(c < val) <-- typo....


    • JosAH
      Recognized Expert MVP
      • Mar 2007
      • 11453

      #3
      The ANSI-C Standard text doesn't even mention "integer demotion" but integer promotion is always necessary, especially when temporary results are held in a 32 bits wide register (e.g. an ARM processor); e.g.

      Code:
      unsigned char c= 255;
      if (++c) ...
      If the value of c is held in a register the value should be integer promoted when the result of ++c is needed for the comparison against zero. Either store the register value back at the memory location c, resulting in truncation (which is correct) or mask out all but the low byte of the register.

      kind regards,

      Jos


      • tsuyois
        New Member
        • Jan 2009
        • 6

        #4
        Thanks for the input, Jos.

        But I'm now confused with your example:

        -----------
        unsigned char c= 255;
        if (++c) ...
        -----------
        (++c) will evaluate to 0 since it's prefix increment. So the if-condition is FALSE.

        The above statement is equivalent to:

        if(c = c + 1) ...

        and the computation sequence would be:

        int tmp;
        unsigned char c = 255;
        tmp = (int) c; /// integer promotion : tmp = 255
        tmp = tmp + 1; /// tmp = 256
        c = (unsigned char) tmp; /// integer demotion : c = 0
        tmp = (int) c;
        if(tmp) ....

        If c is held in the 32-bit register (say reg_c), then what I could do is:

        reg_c = 0xff; /// unsigned char c = 255;
        reg_c = reg_c + 1; /// reg_c = 0x100
        reg_c = (unsigned char) reg_c; /// reg_c = 0x00
        if(reg_c) ...

        The above code generation strategy (which I think is common) is to always keep the 32-bit register in integer-promoted form, so that integer promotion comes for free (you don't need to do (int) reg_c before you do reg_c + 1).

        So my point exactly is: whether the last code
        reg_c = (unsigned char) reg_c
        is required by ANSI-C spec. If you say ANSI-C doesn't even mention integer demotion, can I generate the below code sequence and say "this is ANSI-C compliant"?

        reg_c = 0xff;
        reg_c = reg_c + 1;
        if(reg_c) ...
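
To make the difference between the two sequences concrete, here is a rough sketch that models the 32-bit register as a uint32_t and assumes an 8-bit unsigned char (the function names are made up for illustration):

```c
#include <stdint.h>

/* Sequence with the demotion step: reg_c = (unsigned char) reg_c. */
static uint32_t incr_with_demotion(uint32_t reg_c)
{
    reg_c = reg_c + 1;
    reg_c = reg_c & 0xffu;   /* truncate to 8 bits */
    return reg_c;
}

/* Sequence without the demotion step. */
static uint32_t incr_without_demotion(uint32_t reg_c)
{
    return reg_c + 1;
}
```

Starting from 0xff, the first returns 0 and the second returns 0x100, so if(reg_c) takes different branches in the two cases.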

        tsuyois


        • JosAH
          Recognized Expert MVP
          • Mar 2007
          • 11453

          #5
          Originally posted by tsuyois
          So my point exactly is: whether the last code
          reg_c = (unsigned char) reg_c
          is required by ANSI-C spec. If you say ANSI-C doesn't even mention integer demotion, can I generate the below code sequence and say "this is ANSI-C compliant"?

          reg_c = 0xff;
          reg_c = reg_c + 1;
          if(reg_c) ...
          The test is supposed to fail so you should make reg_c equal to zero one way or another. The Standard also speaks about an "abstract execution" environment that should behave according to the semantics imposed by the Standard itself.

          There can be an escape if you define a char to be 32 bits wide and the sizeof() operator returns 1 for all integer types but I guess that is not what you want.

          Demotion is just truncation to the byte size of the target int and the semantics should behave as if you had used such a sized int. The test above should always fail so reg_c (or a copy thereof) should always be equal to zero.

          kind regards,

          Jos


          • tsuyois
            New Member
            • Jan 2009
            • 6

            #6
            Originally posted by JosAH
            The test is supposed to fail so you should make reg_c equal to zero one way or another. The Standard also speaks about an "abstract execution" environment that should behave according to the semantics imposed by the Standard itself.

            Jos
            OK. I understand that the generated code should behave according to the semantics imposed by the Standard itself. What I still want to chew on is "what is the semantics imposed by the (ANSI-C) standard itself"?

            As I have referred in my original question, Microsoft's C Reference says:
            "When the value with integral type is demoted to a signed integer with
            smaller size, or an unsigned integer is converted to its corresponding
            signed integer, the value is unchanged if it can be represented in the
            new type. ... If it cannot be represented, the result is implementation-defined."

            Now, I am assuming that the above statement refers to the ANSI-C standard.
            Coming back to your example,
            unsigned char c = 255;
            c = c + 1;
            the value c + 1 is 256 before being demoted to unsigned char. And 256 cannot be represented in unsigned char. So doesn't this mean "the result is implementation-defined"?? (that ANSI-C does not define what c = c + 1 should be, if c + 1 overflows?)

            The reason I'm stuck on this integer demotion problem is that some legacy code does use short integers as loop counters (one of the loop counters in Dhrystone's main function is char Ch_Index) that, at first glance, seem to behave correctly without demoting the loop counters every time they are incremented. (But you never know, maybe the person who wrote Dhrystone was expecting that the compilers would insert integer demotion code, and just maybe measuring the integer demotion overhead was part of the intention of this benchmark program...)

            Generating integer demotion code is straightforward (unsigned char: c &= 0xff, signed char: c = (c << 24) >> 24), and I already have it in place in my compiler. What's not so straightforward is determining whether integer demotion code is necessary or not, especially on a 32-bit machine that has load/store instructions that do integer promotion and demotion in hardware. A good example is the textbook DLX processor, which has
            - Load Byte (signed, unsigned), Store Byte
            - Load Half-Word (signed, unsigned), Store Half-Word
            - Load Word, Store Word

            Here, you don't want to insert integer demotion codes if the value is to be stored in memory and never referenced directly from the register later. In these situations, a smart compiler would want to avoid inserting redundant integer demotion codes.
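
As a rough illustration of that load/store model, here is a sketch that uses a small byte array as memory. The names mem, store_byte, load_byte_signed and load_byte_unsigned are hypothetical stand-ins, not actual DLX mnemonics, and an 8-bit byte is assumed:

```c
#include <stdint.h>

static uint8_t mem[16];   /* toy memory, one entry per byte address */

/* Store Byte: only the low 8 bits of the register reach memory. */
static void store_byte(unsigned addr, uint32_t reg)
{
    mem[addr] = (uint8_t)(reg & 0xffu);
}

/* Load Byte (signed): the byte comes back sign-extended to 32 bits. */
static int32_t load_byte_signed(unsigned addr)
{
    return (int32_t)(mem[addr] ^ 0x80) - 0x80;
}

/* Load Byte (unsigned): the byte comes back zero-extended. */
static uint32_t load_byte_unsigned(unsigned addr)
{
    return mem[addr];
}
```

Storing the register value 0x180 and loading it back gives -128 for the signed load and 0x80 for the unsigned load, with no explicit shift or mask in between. That is the case where separate demotion code before the store would be redundant.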

            And now, I am trying to decide how much effort I should put into making my compiler make "smart" decisions about integer demotion code. Then I came across the above C reference on integer demotion, and started thinking "can I just simply skip all integer demotions", since most of the time it's not needed, and IF ANSI-C DOES NOT IMPOSE INTEGER DEMOTION, this is still ANSI-C compliant.

            Tsuyois


            • tsuyois
              New Member
              • Jan 2009
              • 6

              #7
              Just to update my latest findings on this issue:

              Below is the description from MSDN C Language Reference
              ---------------------------------------------------------------------------------
              Implementation-Defined Behavior
              ANSI X3.159-1989, American National Standard for Information Systems – Programming Language – C, contains a section called "Portability Issues." The ANSI section lists areas of the C language that ANSI leaves open to each particular implementation.
              ---------------------------------------------------------------------------------
              Demotion of Integers
              ANSI 3.2.1.2 The result of converting an integer to a shorter signed integer, or the result of converting an unsigned integer to a signed integer of equal length, if the value cannot be represented
              ---------------------------------------------------------------------------------
              So my current conclusion is that integer demotion whose value cannot be represented in the new integral type is indeed "implementation defined".

              tsuyois


              • JosAH
                Recognized Expert MVP
                • Mar 2007
                • 11453

                #8
                True, but a few lines above in that same Standard it reads:

                Originally posted by ANSI C
                When an integer is demoted to an unsigned integer with smaller
                size, the result is the nonnegative remainder on division by the
                number one greater than the largest unsigned number that can be
                represented in the type with smaller size.
                so the following test is supposed to fail:

                Code:
                int i= 0xff; // assume i is stored in a 32 bit wide register
                unsigned char c; // assume chars are 8 bits wide
                
                if (c= ++i) /* never reached */
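
The quoted rule can be sketched as plain arithmetic. This is only an illustration: the function name demote_to_uchar is hypothetical, and the modulus is taken from UCHAR_MAX in limits.h:

```c
#include <limits.h>

/* Nonnegative remainder of v on division by UCHAR_MAX + 1,
   following the wording quoted from the Standard. */
static unsigned demote_to_uchar(long v)
{
    long modulus = (long)UCHAR_MAX + 1;   /* 256 when char is 8 bits */
    long r = v % modulus;                 /* may be negative for negative v */
    if (r < 0)
        r += modulus;
    return (unsigned)r;
}
```

For the value 0x100 produced by ++i above, the remainder is 0, which is why the test fails.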
                kind regards,

                Jos


                • tsuyois
                  New Member
                  • Jan 2009
                  • 6

                  #9
                  Thanks Jos for all the advice.

                  So I think my understanding is now clear:
                  ANSI-C Integer Demotion:
                  0. If the value can be represented in the new integral type, the value is preserved.
                  Otherwise: (if the value cannot be represented in the new integral type)
                  1. If the new type is unsigned, the value is truncated to the new type (reduced modulo one more than the largest value the new type can represent)
                  2. If the new type is signed, the value is implementation defined
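
The three rules above can be sketched as a classification of an int value being converted to a char type. The enum and function names are hypothetical, and the range limits come from limits.h:

```c
#include <limits.h>

enum demotion_result {
    VALUE_PRESERVED,         /* rule 0 */
    REDUCED_MODULO,          /* rule 1 */
    IMPLEMENTATION_DEFINED   /* rule 2 */
};

/* Which rule applies when v is converted to unsigned char. */
static enum demotion_result classify_to_uchar(int v)
{
    return (v >= 0 && v <= UCHAR_MAX) ? VALUE_PRESERVED : REDUCED_MODULO;
}

/* Which rule applies when v is converted to signed char. */
static enum demotion_result classify_to_schar(int v)
{
    return (v >= SCHAR_MIN && v <= SCHAR_MAX) ? VALUE_PRESERVED
                                              : IMPLEMENTATION_DEFINED;
}
```

With 8-bit chars, 256 converted to unsigned char falls under rule 1, while 128 converted to signed char falls under rule 2.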

                  It was my mistake to rely too much on the MSDN C Reference, which does not quite give the complete description of the ANSI-C standard.

                  Is there a good description of the ANSI-C (C89) standard on the web? What I found was below linked from Wiki(C Programming):


                  • JosAH
                    Recognized Expert MVP
                    • Mar 2007
                    • 11453

                    #10
                    You can order the Standard text at ISO. The standard number is 9899 and it doesn't cost much. As far as I know there is no free copy available on the net. There are some draft texts available (as you already know).

                    kind regards,

                    Jos
