[BUG] IMO, but no opinions? Uncle Tim? was: int(float(sys.maxint)) buglet ?

  • Bengt Richter

    [BUG] IMO, but no opinions? Uncle Tim? was: int(float(sys.maxint)) buglet ?

    Peculiar boundary cases:
    >>> 2.0**31-1.0
    2147483647.0
    >>> int(2147483647.0)
    2147483647L
    >>> int(2147483647L)
    2147483647
    >>>
    >>> -2.0**31
    -2147483648.0
    >>> int(-2147483648.0)
    -2147483648L
    >>> int(-2147483648L)
    -2147483648

    some kind of one-off error? I.e., just inside extremes works:
    >>> [int(x) for x in (-2.0**31, -2.0**31+1.0, 2.0**31-2.0, 2.0**31-1.0)]
    [-2147483648L, -2147483647, 2147483646, 2147483647L]

    But those longs at the extremes can be converted successfully, so int(int(x)) works ;-/
    >>> [int(int(x)) for x in (-2.0**31, -2.0**31+1.0, 2.0**31-2.0, 2.0**31-1.0)]
    [-2147483648, -2147483647, 2147483646, 2147483647]

    ISTM this is a buglet, or at least a wartlet for a 32-bit system ;-)

    Almost forgot:
    Python 2.4b1 (#56, Nov 3 2004, 01:47:27)
    [GCC 3.2.3 (mingw special 20030504-1)] on win32

    but same thing on 2.3.2:

    Python 2.3.2 (#49, Oct 2 2003, 20:02:00) [MSC v.1200 32 bit (Intel)] on win32
    Type "help", "copyright", "credits" or "license" for more information.
    >>> [int(x) for x in (-2.0**31, -2.0**31+1.0, 2.0**31-2.0, 2.0**31-1.0)]
    [-2147483648L, -2147483647, 2147483646, 2147483647L]
    >>> [int(int(x)) for x in (-2.0**31, -2.0**31+1.0, 2.0**31-2.0, 2.0**31-1.0)]
    [-2147483648, -2147483647, 2147483646, 2147483647]

    Hm, ... except for the thought that CPUs with 64-bit integers might truncate maxint
    when converting to float, I might say maybe these reprs should be tested
    for equality in the system tests?
    >>> import sys
    >>> repr(int(float(sys.maxint))), repr(sys.maxint)
    ('2147483647L', '2147483647')
    >>> repr(int(float(-sys.maxint-1))), repr(-sys.maxint-1)
    ('-2147483648L', '-2147483648')

    or maybe at least check for equality of these?:
    >>> type(int(float(sys.maxint))), type(sys.maxint)
    (<type 'long'>, <type 'int'>)
    >>> type(int(float(-sys.maxint-1))), type(-sys.maxint-1)
    (<type 'long'>, <type 'int'>)
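
The worry above about 64-bit integers losing bits on the way to float can be checked by value alone. The following is only a sketch in modern Python, where int and long are already one type, so it says nothing about the int/long type question; the names MAXINT_32, MAXINT_64 and survives_float_round_trip are made up for illustration and stand in for sys.maxint on 32-bit and 64-bit builds.

```python
# Hypothetical stand-ins for sys.maxint on 32-bit and 64-bit C longs.
MAXINT_32 = 2**31 - 1
MAXINT_64 = 2**63 - 1


def survives_float_round_trip(n):
    """Return True if n keeps its value when converted to a double and back."""
    return int(float(n)) == n


# 31 significant bits fit easily in a double's 53-bit mantissa.
print(survives_float_round_trip(MAXINT_32))        # True
print(survives_float_round_trip(-MAXINT_32 - 1))   # True
# 63 significant bits do not fit: float(2**63 - 1) rounds up to 2.0**63.
print(survives_float_round_trip(MAXINT_64))        # False
# -2**63 is a power of two, so it is represented exactly.
print(survives_float_round_trip(-MAXINT_64 - 1))   # True
```

So an equality test on the values would hold at both extremes for a 32-bit maxint, but would fail at the positive extreme wherever maxint needs more than 53 bits.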

    Regards,
    Bengt Richter
  • jepler@unpythonic.net

    #2
    Re: [BUG] IMO,

    Python 2.2.2:
    >>> int(float(sys.maxint))
    2147483647

    Python 2.3.2:
    >>> int(float(sys.maxint))
    2147483647L

    If you're looking for a suspicious change, this should narrow it down.

    Jeff


    • Tim Peters

      #3
      Re: [BUG] IMO,

      [Bengt Richter]
      > Peculiar boundary cases:
      >
      > >>> 2.0**31-1.0
      > 2147483647.0
      > >>> int(2147483647.0)
      > 2147483647L
      > >>> int(2147483647L)
      > 2147483647
      > >>>
      > >>> -2.0**31
      > -2147483648.0
      > >>> int(-2147483648.0)
      > -2147483648L
      > >>> int(-2147483648L)
      > -2147483648
      >
      > some kind of one-off error?

      It would help if you were explicit about what you think "the error"
      is. I see a correct result in all cases there.

      Is it just that sometimes

      int(a_float)

      returns a Python long when

      int(a_long_with_the_same_value_as_that_float)

      returns a Python int? If so, that's not a bug -- there's no promise
      anywhere, e.g., that Python will return an int whenever it's
      physically possible to do so.

      Python used to return a (short) int in all cases above, but that led
      to problems on some oddball systems. See the comments for float_int()
      in floatobject.c for more detail. Slowing float_int() to avoid those
      problems while returning a short int whenever physically possible is a
      tradeoff I would oppose.


      • Bengt Richter

        #4
        Re: [BUG] IMO, but no opinions? Uncle Tim? was: int(float(sys.maxint)) buglet ?

        On Mon, 6 Dec 2004 10:30:06 -0500, Tim Peters <tim.peters@gmail.com> wrote:

        >[Bengt Richter]
        >> Peculiar boundary cases:
        >>
        >> >>> 2.0**31-1.0
        >> 2147483647.0
        >> >>> int(2147483647.0)
        >> 2147483647L
        >> >>> int(2147483647L)
        >> 2147483647
        >> >>>
        >> >>> -2.0**31
        >> -2147483648.0
        >> >>> int(-2147483648.0)
        >> -2147483648L
        >> >>> int(-2147483648L)
        >> -2147483648
        >>
        >> some kind of one-off error?
        >
        >It would help if you were explicit about what you think "the error"
        >is. I see a correct result in all cases there.
        >
        >Is it just that sometimes
        >
        > int(a_float)
        >
        >returns a Python long when
        >
        > int(a_long_with_the_same_value_as_that_float)
        >
        >returns a Python int? If so, that's not a bug -- there's no promise
        >anywhere, e.g., that Python will return an int whenever it's
        >physically possible to do so.
        Ok, I understand the expediency of that policy, but what is now the meaning
        of int, in that case? Is it now just a vestigial artifact on the way to
        transparent unification of int and long to a single integer type?

        Promises or not, ISTM that if int->float succeeds in preserving all significant bits,
        then a following float->int should also succeed without converting to long.

        >
        >Python used to return a (short) int in all cases above, but that led
        >to problems on some oddball systems. See the comments for float_int()
        >in floatobject.c for more detail. Slowing float_int() to avoid those
        >problems while returning a short int whenever physically possible is a
        >tradeoff I would oppose.

        The 2.3.2 source snippet in floatobject.c:
        --------------
        static PyObject *
        float_int(PyObject *v)
        {
            double x = PyFloat_AsDouble(v);
            double wholepart; /* integral portion of x, rounded toward 0 */

            (void)modf(x, &wholepart);
            /* Try to get out cheap if this fits in a Python int. The attempt
             * to cast to long must be protected, as C doesn't define what
             * happens if the double is too big to fit in a long. Some rare
             * systems raise an exception then (RISCOS was mentioned as one,
             * and someone using a non-default option on Sun also bumped into
             * that). Note that checking for >= and <= LONG_{MIN,MAX} would
             * still be vulnerable: if a long has more bits of precision than
             * a double, casting MIN/MAX to double may yield an approximation,
             * and if that's rounded up, then, e.g., wholepart=LONG_MAX+1 would
             * yield true from the C expression wholepart<=LONG_MAX, despite
             * that wholepart is actually greater than LONG_MAX.
             */
            if (LONG_MIN < wholepart && wholepart < LONG_MAX) {
                const long aslong = (long)wholepart;
                return PyInt_FromLong(aslong);
            }
            return PyLong_FromDouble(wholepart);
        }
        --------------

        But this is apparently accessed through a table of pointers, so would you oppose
        an auto-configuration that one time tested whether
        int(float(sys.maxint))==sys.maxint and int(float(-sys.maxint-1))==-sys.maxint-1
        (assuming that's sufficient, of which I'm not 100% sure ;-) and if so switched
        the pointer to a version that tested if(LONG_MIN <= wholepart && wholepart<=LONG_MAX)
        instead of the safe-for-some-obscure-system version?
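
To make the two comparisons concrete, here is a rough model in modern Python of the choice float_int() makes. It is a sketch under assumptions, not the CPython code: float_int_kind is a made-up helper that only reports which Python 2 type would come back, and it converts the bounds to float first to imitate the C comparison, where the long is promoted to double.

```python
import math


def float_int_kind(x, long_min, long_max, inclusive=False):
    """Model which Python 2 type float_int() would pick for the double x.

    Hypothetical helper: the bounds are converted to float before the
    comparison, mimicking C's promotion of a long to double.
    """
    wholepart = math.modf(x)[1]  # integral part of x, rounded toward 0
    lo, hi = float(long_min), float(long_max)
    if inclusive:
        fits = lo <= wholepart <= hi   # the proposed <= test
    else:
        fits = lo < wholepart < hi     # the strict test in the 2.3.2 source
    return "int" if fits else "long"


MIN32, MAX32 = -2**31, 2**31 - 1   # assumed 32-bit C long
MIN64, MAX64 = -2**63, 2**63 - 1   # assumed 64-bit C long
edges = (-2.0**31, -2.0**31 + 1.0, 2.0**31 - 2.0, 2.0**31 - 1.0)

strict_32 = [float_int_kind(x, MIN32, MAX32) for x in edges]
inclusive_32 = [float_int_kind(x, MIN32, MAX32, inclusive=True) for x in edges]
# 2.0**63 is one past LONG_MAX for a 64-bit long, yet float(MAX64) == 2.0**63,
# so the inclusive test wrongly accepts it.
hazard_64 = float_int_kind(2.0**63, MIN64, MAX64, inclusive=True)

print(strict_32)      # ['long', 'int', 'int', 'long']
print(inclusive_32)   # ['int', 'int', 'int', 'int']
print(hazard_64)      # int
```

With a 32-bit long the inclusive test gives an int at both extremes, while the 64-bit case shows the rounding hazard that the source comment describes, which is why any switch to the inclusive test would need to be conditional on the platform.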

        Of course, if int isn't all that meaningful any more, I guess the problem can be moved to the
        ctypes module, if that gets included amongst the batteries ;-)

        Regards,
        Bengt Richter


        • Tim Peters

          #5
          Re: [BUG] IMO,

          [Tim Peters]
          >> ... there's no promise anywhere, e.g., that Python will return an int
          >> whenever it's physically possible to do so.

          [Bengt Richter]
          > Ok, I understand the expediency of that policy, but what is now the meaning
          > of int, in that case? Is it now just a vestigial artifact on the way to
          > transparent unification of int and long to a single integer type?

          I don't really know what you mean by "int". Python isn't C, and the
          distinction between Python's historical short integers and unbounded
          integers is indeed going away. "int" is the name of a specific Python
          type, and the constructor for that type (which old-timers will think
          of as the builtin function named "int()") is happy to return unbounded
          integers in modern Pythons too. Python-level distinctions here have
          become increasingly meaningless over time; I expect that "int" and
          "long" will eventually become synonyms for the same type at the Python
          level.

          The distinction remains very visible at the Python C API level, for
          obvious reasons, but even C code has to be prepared to deal with that
          a PyIntObject or a PyLongObject may be given in contexts where "an
          integer" is required.

          > Promises or not, ISTM that if int->float succeeds in preserving all significant bits,
          > then a following float->int should also succeed without converting to long.

          Yes, that was obvious <wink>. But you haven't explained why you
          *care*, or, more importantly, why someone else should care. It just
          as obviously doesn't bother me, and I'm bold enough to claim that it
          "shouldn't" bother anyone. This seems as peripheral to me as arguing
          that "there's something wrong" about returning "a long" in either of
          these cases:

          >>> import os
          >>> os.path.getsize("a.py")
          165L
          >>> f = open("a.py")
          >>> f.tell()
          0L

          The implementations of getsize() and .tell() certainly could have
          endured complications to ensure that "an int", and not "a long", was
          returned whenever physically possible to do so -- but why bother?

          ....
          > The 2.3.2 source snippet in floatobject.c :
          > --------------
          > static PyObject *
          > float_int(PyObject *v)
          > {
          ....

          > But this is apparently accessed through a table of pointers, so would you oppose
          > an auto-configuration that one time tested whether
          > int(float(sys.maxint))==sys.maxint and int(float(-sys.maxint-1))==-sys.maxint-1
          > (assuming that's sufficient, of which I'm not 100% sure ;-) and if so switched
          > the pointer to a version that tested if(LONG_MIN <= wholepart && wholepart<=LONG_MAX)
          > instead of the safe-for-some-obscure-system version?

          In the absence of identifying an actual problem this would solve, I
          would oppose adding *gratuitous* complication. Abusing your sense of
          aesthetics isn't "an actual problem" in this sense to me, although it
          may be to you. Of course you're welcome to make any code changes you
          like in your own copy of Python <wnk>.

          > Of course, if int isn't all that meaningful any more, I guess the problem can be
          > moved to the ctypes module, if that gets included amongst the batteries ;-)

          What problem? If there's an actual bug here, please open a bug report.


          • Nick Coghlan

            #6
            Re: [BUG] IMO,

            > In the absence of identifying an actual problem this would solve, I
            > would oppose adding *gratuitous* complication. Abusing your sense of
            > aesthetics isn't "an actual problem" in this sense to me, although it
            > may be to you. Of course you're welcome to make any code changes you
            > like in your own copy of Python <wnk>.

            ..>>> _int = int
            ..>>> def int(*args): return _int(_int(*args))
            .....
            ..>>> from sys import maxint
            ..>>> int(maxint)
            ..2147483647
            ..>>> int(-maxint-1)
            ..-2147483648

            Pretty! };>

            Cheers,
            Nick.

            --
            Nick Coghlan | ncoghlan@email.com | Brisbane, Australia
            ---------------------------------------------------------------


            • Bengt Richter

              #7
              Re: [BUG] IMO, but no opinions? Uncle Tim? was: int(float(sys.maxint)) buglet ?

              On Tue, 7 Dec 2004 16:44:56 -0500, Tim Peters <tim.peters@gmail.com> wrote:

              >[Tim Peters]
              >>> ... there's no promise anywhere, e.g., that Python will return an int
              >>> whenever it's physically possible to do so.
              >
              >[Bengt Richter]
              >> Ok, I understand the expediency of that policy, but what is now the meaning
              >> of int, in that case? Is it now just a vestigial artifact on the way to
              >> transparent unification of int and long to a single integer type?
              >
              >I don't really know what you mean by "int". Python isn't C, and the
              Me neither, now that I'd been nudged into thinking about it -- that's why I was *asking* ;-)
              >distinction between Python's historical short integers and unbounded
              >integers is indeed going away. "int" is the name of a specific Python
              >type, and the constructor for that type (which old-timers will think
              >of as the builtin function named "int()") is happy to return unbounded
              >integers in modern Pythons too. Python-level distinctions here have
              >become increasingly meaningless over time; I expect that "int" and
              >"long" will eventually become synonyms for the same type at the Python
              >level.
              I guess the above is a long spelling of "yes" as an answer to my question ;-)

              I assumed that Python's int was not indifferent to the underlying platform's
              native C integer representations, and that its presence was a compromise
              (now being deprecated) to permit arm's-length internal representation control,
              for whatever reason (most likely to ease interfacing with something using a fixed
              representation deriving from a C library).

              >
              >The distinction remains very visible at the Python C API level, for
              >obvious reasons, but even C code has to be prepared to deal with that
              >a PyIntObject or a PyLongObject may be given in contexts where "an
              >integer" is required.
              >
              >> Promises or not, ISTM that if int->float succeeds in preserving all significant bits,
              >> then a following float->int should also succeed without converting to long.
              >
              >Yes, that was obvious <wink>. But you haven't explained why you
              >*care*, or, more importantly, why someone else should care. It just
              IME a corner-case discrepancy in a 1:1 correspondence is a bug waiting to appear
              and bite. I admit to an aesthetic component to my unease with it though ;-)

              >as obviously doesn't bother me, and I'm bold enough to claim that it
              >"shouldn't" bother anyone. This seems as peripheral to me as arguing
              >that "there's something wrong" about returning "a long" in either of
              >these cases:
              >
              >>>> import os
              >>>> os.path.getsize("a.py")
              >165L
              >>>> f = open("a.py")
              >>>> f.tell()
              >0L
              >
              >The implementations of getsize() and .tell() certainly could have
              >endured complications to ensure that "an int", and not "a long", was
              >returned whenever physically possible to do so -- but why bother?
              >
              Those calls have obvious reasons for handling large-file sizes, but
              they don't explicitly call for an int representation (whatever it means).

              """
              Help on class int in module __builtin__:

              class int(object)
              | int(x[, base]) -> integer
              |
              | Convert a string or number to an integer, if possible. A floating point
              | argument will be truncated towards zero (this does not include a string
              | representation of a floating point number!) When converting a string, use
              | the optional base. It is an error to supply a base when converting a
              | non-string. If the argument is outside the integer range a long object
              | will be returned instead.
              """
              Maybe the above should be amended to say that the "integer range"
              is [-sys.maxint, sys.maxint-1] when the argument is a float ;-)

              Or explain what "an integer" means ;-)

              >...
              >> The 2.3.2 source snippet in floatobject.c :
              >> --------------
              >> static PyObject *
              >> float_int(PyObject *v)
              >> {
              >...
              >
              >> But this is apparently accessed through a table of pointers, so would you oppose
              >> an auto-configuration that one time tested whether
              >> int(float(sys.maxint))==sys.maxint and int(float(-sys.maxint-1))==-sys.maxint-1
              >> (assuming that's sufficient, of which I'm not 100% sure ;-) and if so switched
              >> the pointer to a version that tested if(LONG_MIN <= wholepart && wholepart<=LONG_MAX)
              >> instead of the safe-for-some-obscure-system version?
              >
              >In the absence of identifying an actual problem this would solve, I
              >would oppose adding *gratuitous* complication. Abusing your sense of
              >aesthetics isn't "an actual problem" in this sense to me, although it
              >may be to you. Of course you're welcome to make any code changes you
              >like in your own copy of Python <wnk>.
              >
              >> Of course, if int isn't all that meaningful any more, I guess the problem can be
              >> moved to the ctypes module, if that gets included amongst the batteries ;-)
              >
              >What problem? If there's an actual bug here, please open a bug report.
              I guess there won't be a bug until it bites, so we have to imagine an app
              where it would matter not to be able to get sys.maxint from a float
              (except via int(int(floatarg)) ;-)

              However unlikely, that seemed to me most likely to happen if there was
              interfacing to some external function requiring a C-derived integer
              representation, which is why I thought ctypes might be where a corner case
              long that should have been an int might be detected, and the "problem"
              might be solved there with a corner-case test and conversion, so as not
              to fail on a legitimate value whose specific representation is something
              the Python language per se is disassociating itself from ;-)

              Not saying it's a bug wrt Python's unified integer future, just noting
              the slow death of legacy expectations, and I guess tending to view it
              as a bug so long as int implies anything at all about representation,
              and sys.maxint purports to mean something specific about that ;-)

              Regards,
              Bengt Richter
