Comparing float and decimal

This topic is closed.
  • D'Arcy J.M. Cain

    Comparing float and decimal

    I'm not sure I follow this logic. Can someone explain why float and
    integer can be compared with each other and decimal can be compared to
    integer but decimal can't be compared to float?
    >>> from decimal import Decimal
    >>> i = 10
    >>> f = 10.0
    >>> d = Decimal("10.00")
    >>> i == f
    True
    >>> i == d
    True
    >>> f == d
    False

    This seems to break the rule that if A is equal to B and B is equal to
    C then A is equal to C.

    --
    D'Arcy J.M. Cain <darcy@druid.net> | Democracy is three wolves
    http://www.druid.net/darcy/ | and a sheep voting on
    +1 416 425 1212 (DoD#0082) (eNTP) | what's for dinner.
  • Robert Lehmann

    #2
    Re: Comparing float and decimal

    On Tue, 23 Sep 2008 07:20:12 -0400, D'Arcy J.M. Cain wrote:
    > I'm not sure I follow this logic. Can someone explain why float and
    > integer can be compared with each other and decimal can be compared to
    > integer but decimal can't be compared to float?

    In comparisons, `Decimal` tries to convert the other type to a `Decimal`.
    If this fails -- and it does for floats -- the equality comparison
    evaluates to False. For ordering comparisons, e.g. ``D("10") < 10.0``, it
    fails more verbosely::

        TypeError: unorderable types: Decimal() < float()

    The `decimal tutorial`_ states:

    "To create a Decimal from a float, first convert it to a string. This
    serves as an explicit reminder of the details of the conversion
    (including representation error)."

    See the `decimal FAQ`_ for a way to convert floats to Decimals.
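The two conversion routes mentioned above (through a string, or exactly) can be sketched roughly as follows. This assumes Python 2.7/3.1 or later, where `Decimal.from_float` is available; it is not the recipe from the Python 2.5 FAQ that the post links to.

```python
from decimal import Decimal

f = 0.1

# Route 1: go through a string, as the tutorial suggests. repr() yields a
# short string for the float, so the Decimal holds the "intended" value.
via_string = Decimal(repr(f))

# Route 2: convert the binary value actually stored in f, digit for digit.
# Decimal.from_float is assumed to exist (Python 2.7 / 3.1 and later).
exact = Decimal.from_float(f)

print(via_string)
print(exact)
print(via_string == exact)
```

The two results differ, which is the representation error the tutorial quote refers to.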

    > >>> from decimal import Decimal
    > >>> i = 10
    > >>> f = 10.0
    > >>> d = Decimal("10.00")
    > >>> i == f
    > True
    > >>> i == d
    > True
    > >>> f == d
    > False
    >
    > This seems to break the rule that if A is equal to B and B is equal to C
    > then A is equal to C.

    I don't see why transitivity should apply to Python objects in general.

    HTH,

    .. _decimal tutorial: http://docs.python.org/lib/decimal-tutorial.html
    .. _decimal FAQ: http://docs.python.org/lib/decimal-faq.html

    --
    Robert "Stargaming" Lehmann


    • Michael Palmer

      #3
      Re: Comparing float and decimal

      > > This seems to break the rule that if A is equal to B and B is equal to C
      > > then A is equal to C.
      >
      > I don't see why transitivity should apply to Python objects in general.

      Well, for numbers it surely would be a nice touch, wouldn't it.
      Maybe the reason for Decimal to accept float arguments is that
      irrational numbers or very long rational numbers cannot be converted
      to a Decimal without rounding error, and Decimal doesn't want any part
      of it. Seems pointless to me, though.


      • Michael Palmer

        #4
        Re: Comparing float and decimal

        On Sep 23, 10:08 am, Michael Palmer <m_palme...@yahoo.ca> wrote:
        > Maybe the reason for Decimal to accept float arguments is that

        NOT to accept float arguments.


        • Marc 'BlackJack' Rintsch

          #5
          Re: Comparing float and decimal

          On Tue, 23 Sep 2008 07:08:07 -0700, Michael Palmer wrote:
          > > > This seems to break the rule that if A is equal to B and B is equal
          > > > to C then A is equal to C.
          > >
          > > I don't see why transitivity should apply to Python objects in general.
          >
          > Well, for numbers it surely would be a nice touch, wouldn't it. Maybe
          > the reason for Decimal to accept float arguments is that irrational
          > numbers or very long rational numbers cannot be converted to a Decimal
          > without rounding error, and Decimal doesn't want any part of it. Seems
          > pointless to me, though.

          Is 0.1 a very long number? Would you expect ``0.1 == Decimal('0.1')`` to
          be `True` or `False` given that 0.1 actually is

          In [98]: '%.50f' % 0.1
          Out[98]: '0.10000000000000000555111512312578270211815834045410'

          ?

          Ciao,
          Marc 'BlackJack' Rintsch


          • Tim Roberts

            #6
            Re: Comparing float and decimal

            Marc 'BlackJack' Rintsch <bj_666@gmx.net> wrote:
            > On Tue, 23 Sep 2008 07:08:07 -0700, Michael Palmer wrote:
            > > > > This seems to break the rule that if A is equal to B and B is equal
            > > > > to C then A is equal to C.
            > > >
            > > > I don't see why transitivity should apply to Python objects in general.
            > >
            > > Well, for numbers it surely would be a nice touch, wouldn't it. Maybe
            > > the reason for Decimal to accept float arguments is that irrational
            > > numbers or very long rational numbers cannot be converted to a Decimal
            > > without rounding error, and Decimal doesn't want any part of it. Seems
            > > pointless to me, though.
            >
            > Is 0.1 a very long number? Would you expect ``0.1 == Decimal('0.1')`` to
            > be `True` or `False` given that 0.1 actually is
            >
            > In [98]: '%.50f' % 0.1
            > Out[98]: '0.10000000000000000555111512312578270211815834045410'
            > ?

            Actually, it's not. Your C run-time library is generating random digits
            after it runs out of useful information (which is the first 16 or 17
            digits). 0.1 in an IEEE 754 double is this:

            0.100000000000000088817841970012523233890533447265625
            --
            Tim Roberts, timr@probo.com
            Providenza & Boekelheide, Inc.


            • Mark Dickinson

              #7
              Re: Comparing float and decimal

              On Sep 25, 8:55 am, Tim Roberts <t...@probo.com> wrote:
              > Marc 'BlackJack' Rintsch <bj_...@gmx.net> wrote:
              > > 0.1 actually is
              > >
              > > In [98]: '%.50f' % 0.1
              > > Out[98]: '0.10000000000000000555111512312578270211815834045410'
              > > ?
              >
              > Actually, it's not.  Your C run-time library is generating random digits
              > after it runs out of useful information (which is the first 16 or 17
              > digits).  0.1 in an IEEE 754 double is this:
              >
              >      0.100000000000000088817841970012523233890533447265625

              I get (using Python 2.6):

              >>> n, d = 0.1.as_integer_ratio()
              >>> from decimal import Decimal, getcontext
              >>> getcontext().prec = 100
              >>> Decimal(n)/Decimal(d)
              Decimal('0.1000000000000000055511151231257827021181583404541015625')

              which is a lot closer to Marc's answer. Looks like your float
              approximation to 0.1 is 6 ulps out. :-)
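The "6 ulps" figure can be checked with exact rational arithmetic. The sketch below is only an illustration: it assumes the spacing between adjacent doubles near 0.1 is 2**-56 (doubles in [1/16, 1/8) have a 2**-4 scale and 52 fraction bits), and the names `exact`, `claimed` and `ulp` are made up for this example.

```python
from fractions import Fraction

# Exact rational value of the double nearest to 0.1.
exact = Fraction(*(0.1).as_integer_ratio())

# The digits Tim Roberts posted, read as an exact decimal fraction.
claimed = Fraction("0.100000000000000088817841970012523233890533447265625")

# Assumed spacing of adjacent doubles in [1/16, 1/8): 2**-4 * 2**-52.
ulp = Fraction(1, 2 ** 56)

print((claimed - exact) / ulp)  # 6
```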

              Mark


              • Nick Craig-Wood

                #8
                Re: Comparing float and decimal

                Tim Roberts <timr@probo.com> wrote:
                > Marc 'BlackJack' Rintsch <bj_666@gmx.net> wrote:
                > > On Tue, 23 Sep 2008 07:08:07 -0700, Michael Palmer wrote:
                > > > > > This seems to break the rule that if A is equal to B and B is equal
                > > > > > to C then A is equal to C.
                > > > >
                > > > > I don't see why transitivity should apply to Python objects in general.
                > > >
                > > > Well, for numbers it surely would be a nice touch, wouldn't it. Maybe
                > > > the reason for Decimal to accept float arguments is that irrational
                > > > numbers or very long rational numbers cannot be converted to a Decimal
                > > > without rounding error, and Decimal doesn't want any part of it. Seems
                > > > pointless to me, though.
                > >
                > > Is 0.1 a very long number? Would you expect ``0.1 == Decimal('0.1')`` to
                > > be `True` or `False` given that 0.1 actually is
                > >
                > > In [98]: '%.50f' % 0.1
                > > Out[98]: '0.10000000000000000555111512312578270211815834045410'
                > > ?
                >
                > Actually, it's not. Your C run-time library is generating random digits
                > after it runs out of useful information (which is the first 16 or 17
                > digits). 0.1 in an IEEE 754 double is this:
                >
                > 0.100000000000000088817841970012523233890533447265625

                Not according to the decimal FAQ



                ------------------------------------------------------------
                import math
                from decimal import *

                def floatToDecimal(f):
                    "Convert a floating point number to a Decimal with no loss of information"
                    # Transform (exactly) a float to a mantissa (0.5 <= abs(m) < 1.0) and an
                    # exponent. Double the mantissa until it is an integer. Use the integer
                    # mantissa and exponent to compute an equivalent Decimal. If this cannot
                    # be done exactly, then retry with more precision.

                    mantissa, exponent = math.frexp(f)
                    while mantissa != int(mantissa):
                        mantissa *= 2.0
                        exponent -= 1
                    mantissa = int(mantissa)

                    oldcontext = getcontext()
                    setcontext(Context(traps=[Inexact]))
                    try:
                        while True:
                            try:
                                return mantissa * Decimal(2) ** exponent
                            except Inexact:
                                getcontext().prec += 1
                    finally:
                        setcontext(oldcontext)

                print "float(0.1) is", floatToDecimal(0.1)
                ------------------------------------------------------------

                Prints this

                float(0.1) is 0.1000000000000000055511151231257827021181583404541015625

                On my platform

                Python 2.5.2 (r252:60911, Aug 8 2008, 09:22:44),
                [GCC 4.3.1] on linux2
                Linux 2.6.26-1-686
                Intel(R) Core(TM)2 CPU T7200

                --
                Nick Craig-Wood <nick@craig-wood.com> -- http://www.craig-wood.com/nick


                • Mark Dickinson

                  #9
                  Re: Comparing float and decimal

                  On Sep 23, 1:58 pm, Robert Lehmann <stargam...@gmail.com> wrote:
                  > I don't see why transitivity should apply to Python objects in general.

                  Hmmm. Lack of transitivity does produce some, um, interesting
                  results when playing with sets and dicts. Here are sets s and
                  t such that the unions s | t and t | s have different sizes:

                  >>> from decimal import Decimal
                  >>> s = set([Decimal(2), 2.0])
                  >>> t = set([2])
                  >>> len(s | t)
                  2
                  >>> len(t | s)
                  1

                  This opens up some wonderful possibilities for hard-to-find bugs...
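The asymmetry comes from the equality relation rather than from `Decimal` itself, so it can be reproduced with any type that is equal to ints but not to floats. The sketch below uses a hypothetical stand-in class, `OldDecimal`, which mimics only the behaviour discussed in this thread (it is not the real `Decimal`, which as far as I know compares equal to floats of the same value from Python 2.7/3.2 onward). The result also assumes the union is built the way CPython does it: copy the left operand, then add the elements of the right one.

```python
class OldDecimal:
    """Hypothetical stand-in: equal to ints, never equal to floats."""

    def __init__(self, n):
        self.n = n

    def __eq__(self, other):
        if isinstance(other, OldDecimal):
            return self.n == other.n
        if isinstance(other, float):
            return False           # floats are never considered equal
        if isinstance(other, int):
            return self.n == other
        return NotImplemented

    def __hash__(self):
        return hash(self.n)        # hashes like the int it equals

    def __repr__(self):
        return "OldDecimal(%r)" % self.n


s = {OldDecimal(2), 2.0}   # two members, unequal to each other
t = {2}                    # 2 is equal to both members of s

print(len(s | t), len(t | s))
```

Starting from `s`, the int 2 is already "present" and nothing is added; starting from `t`, both members of `s` are treated as duplicates of 2.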

                  Mark


                  • Tim Roberts

                    #10
                    Re: Comparing float and decimal

                    Mark Dickinson <dickinsm@gmail.com> wrote:
                    > On Sep 25, 8:55 am, Tim Roberts <t...@probo.com> wrote:
                    > > Marc 'BlackJack' Rintsch <bj_...@gmx.net> wrote:
                    > > > 0.1 actually is
                    > > >
                    > > > In [98]: '%.50f' % 0.1
                    > > > Out[98]: '0.10000000000000000555111512312578270211815834045410'
                    > > > ?
                    > >
                    > > Actually, it's not.  Your C run-time library is generating random digits
                    > > after it runs out of useful information (which is the first 16 or 17
                    > > digits).  0.1 in an IEEE 754 double is this:
                    > >
                    > >      0.100000000000000088817841970012523233890533447265625
                    >
                    > I get (using Python 2.6):
                    >
                    > >>> n, d = 0.1.as_integer_ratio()
                    > >>> from decimal import Decimal, getcontext
                    > >>> getcontext().prec = 100
                    > >>> Decimal(n)/Decimal(d)
                    > Decimal('0.1000000000000000055511151231257827021181583404541015625')
                    >
                    > which is a lot closer to Marc's answer. Looks like your float
                    > approximation to 0.1 is 6 ulps out. :-)

                    Hmmph, that makes the vote 3 to 1 against me. I need to go re-examine my
                    "extreme float converter".
                    --
                    Tim Roberts, timr@probo.com
                    Providenza & Boekelheide, Inc.


                    • Tim Roberts

                      #11
                      Re: Comparing float and decimal

                      Mark Dickinson <dickinsm@gmail.com> wrote:
                      > On Sep 25, 8:55 am, Tim Roberts <t...@probo.com> wrote:
                      > > Marc 'BlackJack' Rintsch <bj_...@gmx.net> wrote:
                      > > > 0.1 actually is
                      > > >
                      > > > In [98]: '%.50f' % 0.1
                      > > > Out[98]: '0.10000000000000000555111512312578270211815834045410'
                      > > > ?
                      > >
                      > > ....  0.1 in an IEEE 754 double is this:
                      > >
                      > >      0.100000000000000088817841970012523233890533447265625
                      >
                      > I get (using Python 2.6):
                      >
                      > >>> n, d = 0.1.as_integer_ratio()
                      > >>> from decimal import Decimal, getcontext
                      > >>> getcontext().prec = 100
                      > >>> Decimal(n)/Decimal(d)
                      > Decimal('0.1000000000000000055511151231257827021181583404541015625')
                      >
                      > which is a lot closer to Marc's answer. Looks like your float
                      > approximation to 0.1 is 6 ulps out. :-)

                      Yes, foolishness on my part. The hex is 3FB99999_9999999A,
                      so we're looking at 199999_9999999A / 2^56 or

                      7205759403792794
                      -----------------
                      72057594037927936

                      which is the number that Marc, Nick, and you all describe. Apologies all
                      around. I actually dropped one 9 the first time around.
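The hex pattern and the fraction can be recomputed from the raw bits. This is a minimal sketch using only `struct` and `fractions`; the variable names are illustrative, and restoring the implicit leading 1 bit is valid only for normal numbers such as 0.1.

```python
import struct
from fractions import Fraction

# Raw 64 bits of the double 0.1, read as a big-endian unsigned integer.
bits = struct.unpack(">Q", struct.pack(">d", 0.1))[0]
print("%016X" % bits)                 # 3FB999999999999A

# Split off the 11-bit biased exponent and the 52-bit fraction field,
# then put back the implicit leading 1 bit of a normal number.
biased_exponent = (bits >> 52) & 0x7FF
significand = (bits & ((1 << 52) - 1)) | (1 << 52)

# value == significand * 2**power, with the bias (1023) and the 52
# fraction bits folded into the exponent.
power = biased_exponent - 1023 - 52
print(significand, power)             # 7205759403792794 -56

value = Fraction(significand, 2 ** -power)
print(value == Fraction(*(0.1).as_integer_ratio()))   # True
```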

                      Adding one more weird data point, here's what I get trying Marc's original
                      sample on my Windows box:

                      C:\tmp>python
                      Python 2.4.4 (#71, Oct 18 2006, 08:34:43) [MSC v.1310 32 bit (Intel)] on
                      win32
                      Type "help", "copyright", "credits" or "license" for more information.
                      >>> '%.50f' % 0.1
                      '0.10000000000000001000000000000000000000000000000000'
                      >>>
                      I assume this is the Microsoft C run-time library at work.
                      --
                      Tim Roberts, timr@probo.com
                      Providenza & Boekelheide, Inc.


                      • Gabriel Genellina

                        #12
                        Re: Comparing float and decimal

                        On Thu, 25 Sep 2008 08:02:49 -0300, Mark Dickinson <dickinsm@gmail.com>
                        wrote:
                        > On Sep 23, 1:58 pm, Robert Lehmann <stargam...@gmail.com> wrote:
                        > > I don't see why transitivity should apply to Python objects in general.
                        >
                        > Hmmm. Lack of transitivity does produce some, um, interesting
                        > results when playing with sets and dicts. Here are sets s and
                        > t such that the unions s | t and t | s have different sizes:
                        >
                        > >>> from decimal import Decimal
                        > >>> s = set([Decimal(2), 2.0])
                        > >>> t = set([2])
                        > >>> len(s | t)
                        > 2
                        > >>> len(t | s)
                        > 1

                        Ouch!

                        > This opens up some wonderful possibilities for hard-to-find bugs...

                        And I was thinking all this thread was just a theoretical question without
                        practical consequences...

                        --
                        Gabriel Genellina


                        • Terry Reedy

                          #13
                          Re: Comparing float and decimal

                          Gabriel Genellina wrote:
                          > On Thu, 25 Sep 2008 08:02:49 -0300, Mark Dickinson <dickinsm@gmail.com>
                          > wrote:
                          > > On Sep 23, 1:58 pm, Robert Lehmann <stargam...@gmail.com> wrote:
                          > > > I don't see why transitivity should apply to Python objects in general.
                          > >
                          > > Hmmm. Lack of transitivity does produce some, um, interesting
                          > > results when playing with sets and dicts. Here are sets s and
                          > > t such that the unions s | t and t | s have different sizes:
                          > >
                          > > >>> from decimal import Decimal
                          > > >>> s = set([Decimal(2), 2.0])
                          > > >>> t = set([2])
                          > > >>> len(s | t)
                          > > 2
                          > > >>> len(t | s)
                          > > 1
                          >
                          > Ouch!
                          >
                          > > This opens up some wonderful possibilities for hard-to-find bugs...
                          >
                          > And I was thinking all this thread was just a theoretical question
                          > without practical consequences...

                          To explain this anomaly more clearly, here is a recursive definition of
                          set union.

                          if b: a|b = a.add(x)|(b-x) where x is an arbitrary member of b
                          else: a|b = a

                          Since Python only defines set-set and not set-ob, we would have to
                          subtract {x} to directly implement the above. But b.pop() removes an
                          arbitrary member and returns it so we can add it. So here is a Python
                          implementation of the definition.

                          def union(a,b):
                              a = set(a) # copy to preserve original
                              b = set(b) # ditto
                              while b:
                                  a.add(b.pop())
                              return a

                          from decimal import Decimal
                          d1 = Decimal(1)
                          fd = set((1.0, d1))
                          i = set((1,))
                          print(union(fd, i))
                          print(union(i,fd))

                          # prints

                          {1.0, Decimal('1')}
                          {1}

                          This is a bug in relation to the manual:
                          "union(other, ...)
                          set | other | ...
                          Return a new set with elements from both sets."


                          Transitivity is basic to logical deduction:
                          equations: a == b == c ... == z implies a == z
                          implications: (a implies b) and (b implies c) implies (a implies c)
                          The latter covers syllogism and other deduction rules.

                          The induction part of an induction proof of set union commutativity is a
                          typical equality chain:

                          if b:
                              a | b
                              = a.add(x)| b-x for x in b # definition for non-empty b
                              = b-x | a.add(x) # induction hypothesis
                              = (b-x).add(x) | a.add(x)-x # definition for non-empty a
                              = b | a.add(x)-x # definitions of - and .add
                              if x not in a:
                                  = b | a # .add and -
                              if x in a:
                                  = b | a-x # .add and -
                                  = b.add(x) | a-x # definition of .add for x in b
                                  = b | a # definition for non-empty a
                              = b | a # in either case, by case analysis

                          By transitivity of =, a | b = b | a !

                          So where does this go wrong for our example? This shows the problem.

                          >>> fd - i
                          set()

                          This pretty much says that 2-1=0, or that 2=1. Not good.

                          The fundamental idea of a set is that it only contains something once.
                          This definition assumes that equality is defined sanely, with the usual
                          properties. So, while fd having two members implies d1 != 1.0, the fact
                          that d1 == 1 and 1.0 == 1 implies that they are really the same thing,
                          so that d1 == 1.0, a contradiction.

                          To put this another way: The rule of substitution is that if E, F, and G
                          are expressions and E == F and E is a subexpression of G and we
                          substitute F for E in G to get H, then G == H. Again, this rule, which
                          is a premise of all formal expressional systems I know of, assumes the
                          normal definition of =. When we apply this,

                          fd == {d1, 1.0} == {1,1.0} == {1} == i

                          But Python says

                          >>> fd == i
                          False

                          Conclusion: fd is not a mathematical set.

                          Yet another anomaly:
                          >>> f = set((1.0,))
                          >>> i == f
                          True
                          >>> i.add(d1)
                          >>> f.add(d1)
                          >>> i == f
                          False

                          So much for "adding the same thing to equals yields equals", which is a
                          special case of "doing the same thing to equals, where the thing done
                          only depends on the properties that make the things equal, yields equals."


                          And another
                          >>> d1 in i
                          True
                          >>> 1.0 in i
                          True
                          >>> fd <= i
                          False

                          Manual: "set <= other
                          Test whether every element in the set is in other"

                          I bet Python first tests the sizes because the implementer *assumed*
                          that every member of a larger set could not be in a smaller set. I
                          presume the same assumption is used for equality testing.

                          Or

                          Manual: "symmetric_difference(other)
                          set ^ other
                          Return a new set with elements in either the set or other but not both."

                          >>> d1 in fd
                          True
                          >>> d1 in i
                          True
                          >>> d1
                          Decimal('1')
                          >>> fd ^ i
                          {Decimal('1')}

                          If no one beats me to it, I will probably file a bug report or two, but
                          I am still thinking about what to say and to suggest.

                          Terry Jan Reedy





                          • Mark Dickinson

                            #14
                            Re: Comparing float and decimal

                            On Sep 30, 9:21 am, Terry Reedy <tjre...@udel.edu> wrote:
                            > If no one beats me to it, I will probably file a bug report or two, but
                            > I am still thinking about what to say and to suggest.

                            I can't see many good options here. Some possibilities:

                            (0) Do nothing besides documenting the problem
                            somewhere (perhaps in a manual section entitled
                            'Infrequently Asked Questions', or
                            'Uncommon Python Pitfalls'). I guess the rule is
                            simply that Decimals don't mix well with other
                            numeric types besides integers: if you put both
                            floats and Decimals into a set, or compare a
                            Decimal with a Fraction, you're asking for
                            trouble. I suppose the obvious place for such
                            a note would be in the decimal documentation,
                            since non-users of decimal are unlikely to encounter
                            these problems.

                            (1) 'Fix' the Decimal type to do numerical comparisons
                            with other numeric types correctly, and fix up the
                            Decimal hash appropriately.

                            (2) I wonder whether there's a way to make Decimals
                            and floats incomparable, so that an (in)equality check
                            between them always raises an exception, and any
                            attempt to have both Decimals and floats in the same
                            set (or as keys in the same dict) also gives an error.
                            (Decimals and integers should still be allowed to
                            mix happily, of course.) But I can't see how this could
                            be done without adversely affecting set performance.

                            Option (1) is certainly technically feasible, but I
                            don't like it much: it means adding a whole load
                            of code to the Decimal module that benefits few users
                            but slows down hash computations for everyone.
                            And then any new numeric type that wants to fit in
                            with Python's rules had better worry about hashing
                            equal to ints, floats, Fractions, complexes, *and*
                            Decimals...

                            Option (2) appeals to me, but I can't see how to
                            implement it.

                            So I guess that just leaves updating the docs.
                            Other thoughts?

                            Mark


                            • Terry Reedy

                              #15
                              Re: Comparing float and decimal

                              Mark Dickinson wrote:
                              > On Sep 30, 9:21 am, Terry Reedy <tjre...@udel.edu> wrote:
                              > > If no one beats me to it, I will probably file a bug report or two, but
                              > > I am still thinking about what to say and to suggest.
                              >
                              > I can't see many good options here. Some possibilities:

                              Thanks for responding. Agreeing on a fix would make it more likely to
                              happen sooner ;-)

                              > (0) Do nothing besides documenting the problem
                              > somewhere (perhaps in a manual section entitled
                              > 'Infrequently Asked Questions', or
                              > 'Uncommon Python Pitfalls'). I guess the rule is
                              > simply that Decimals don't mix well with other
                              > numeric types besides integers: if you put both
                              > floats and Decimals into a set, or compare a
                              > Decimal with a Fraction, you're asking for
                              > trouble. I suppose the obvious place for such
                              > a note would be in the decimal documentation,
                              > since non-users of decimal are unlikely to encounter
                              > these problems.

                              Documenting the problem properly would mean changing the set
                              documentation to change at least the definitions of union (|), issubset
                              (<=), issuperset (>=), and symmetric_difference (^) from their current
                              math set based definitions to implementation based definitions that
                              describe what they actually do instead of what they intend to do. I do
                              not like this option.

                              > (1) 'Fix' the Decimal type to do numerical comparisons
                              > with other numeric types correctly, and fix up the
                              > Decimal hash appropriately.

                              (1A) All that is needed to fix the equality transitivity corruption and the
                              consequent set/dictview problems is to correctly compare integral
                              values. For this, Decimal hash seems fine already. For the int i I
                              tried, hash(i) == hash(float(i)) == hash(Decimal(i)) ==
                              hash(Fraction(i)) == i.

                              It is fine for transitivity that all fractional decimals are unequal to
                              all fractional floats (and all fractions) since there is no integer (or
                              fraction) that either is equal to, let alone both.

                              This is what I would choose unless there is some 'hidden' problem. But
                              it seems to me this should work: when a float and decimal are both
                              integral (easy to determine) convert either to an int and use the
                              current int-whichever comparison.
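Proposal (1A) could be sketched roughly as below. `eq_1a` is a hypothetical helper written for this illustration, not anything in the `decimal` module, and it reflects only my reading of the rule: values compare equal only when both are integral, so every fractional Decimal is unequal to every float.

```python
from decimal import Decimal
from fractions import Fraction


def eq_1a(d, f):
    """Hypothetical Decimal/float equality following proposal (1A)."""
    # Infinities, NaNs and fractional floats can never be equal here.
    if not (d.is_finite() and f.is_integer()):
        return False
    di = int(d)                      # truncates toward zero
    # d must itself be integral, and the two integers must match.
    return d == di and di == int(f)


print(eq_1a(Decimal("10.00"), 10.0))   # True
print(eq_1a(Decimal("0.5"), 0.5))      # False under (1A)

# The hash observation from the post, for a few small non-negative ints.
for i in (0, 1, 10, 12345):
    assert hash(i) == hash(float(i)) == hash(Decimal(i)) == hash(Fraction(i)) == i
```

With this rule i == f, i == d and f == d all agree for the original example, and no int can be equal to both members of an unequal float/Decimal pair.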
                              > (2) I wonder whether there's a way to make Decimals
                              > and floats incomparable, so that an (in)equality check
                              > between them always raises an exception, and any
                              > attempt to have both Decimals and floats in the same
                              > set (or as keys in the same dict) also gives an error.
                              > (Decimals and integers should still be allowed to
                              > mix happily, of course.) But I can't see how this could
                              > be done without adversely affecting set performance.

                              I pretty strongly believe that equality checks should always work (at
                              least in Python as delivered) just as boolean checks should (and do).

                              > Option (1) is certainly technically feasible, but I
                              > don't like it much: it means adding a whole load
                              > of code to the Decimal module that benefits few users
                              > but slows down hash computations for everyone.
                              > And then any new numeric type that wants to fit in
                              > with Python's rules had better worry about hashing
                              > equal to ints, floats, Fractions, complexes, *and*
                              > Decimals...

                              I believe (1A) would be much easier both to implement and for new
                              numeric types.

                              > Option (2) appeals to me, but I can't see how to
                              > implement it.
                              >
                              > So I guess that just leaves updating the docs.
                              > Other thoughts?

                              (3) Further isolate decimals by making decimals also unequal to all
                              ints. Like (1A), this would easily fix transitivity breakage, but I
                              would consider the result less desirable.

                              My ranking: 1A > 3 > 0 > 2. I might put 1 between 1A and 3, but I am
                              not sure.

                              > Mark

                              Terry Jan Reedy
