Rounding double

Collapse
This topic is closed.
X
X
 
  • Time
  • Show
Clear All
new posts
  • Juha Nieminen

    Re: Rounding double

    Richard Heathfield wrote:
    *value = (int)(*value * p + 0.5) / (double)p;
    Using "int(value+ .5)" is the wrong way to round because it works
    incorrectly with negative values.
    The correct way is "std::floor(val ue+.5)".

    Comment

    • Joe Wright

      Re: Rounding double

      Gordon Burditt wrote:
      >pgp@medusa-s2:~/tmp$ gcc -std=c99 -pedantic -W -Wall -lm jn.c -ojn
      >jn.c:16: warning: unused parameter 'argc'
      >pgp@medusa-s2:~/tmp$ ./jn 0.33
      >0.330000000000 0000155: 0 decimals 0.0000000000000 000
      >0.330000000000 0000155: 1 decimals 0.3000000000000 000
      >0.330000000000 0000155: 2 decimals 0.3300000000000 000
      >0.330000000000 0000155: 3 decimals 0.3300000000000 000
      >0.330000000000 0000155: 4 decimals 0.3300000000000 000
      >0.330000000000 0000155: 5 decimals 0.3300000000000 000
      >0.330000000000 0000155: 6 decimals 0.3300000000000 000
      >0.330000000000 0000155: 7 decimals 0.3300000000000 000
      >0.330000000000 0000155: 8 decimals 0.3300000000000 000
      >0.330000000000 0000155: 9 decimals 0.3300000000000 000
      >0.330000000000 0000155: 10 decimals 0.3300000000000 000
      >0.330000000000 0000155: 11 decimals 0.3300000000000 000
      >0.330000000000 0000155: 12 decimals 0.3300000000000 000
      >0.330000000000 0000155: 13 decimals 0.3300000000000 000
      >0.330000000000 0000155: 14 decimals 0.3300000000000 000
      >>
      >I'm disappointed with the misleading results here which suggest that the
      >LHS has trailing cruft which the RHS doesn't, when both have such cruft.
      >
      None of the floating-point numbers printed here can be represented
      exactly in binary floating point, except zero. For debugging
      purposes, I recommend a better output format, say %300.200f ,
      enough to ensure that you can have enough digits to exactly represent
      the number you get as a result (this is quite a bit more than
      FLT_DIG, DBL_DIG, or LDBL_DIG, as appropriate to the type being
      used). Or perhaps someone is using base-10 floating point.
      >
      I think you're asking too much here. Note..

      00111111 11010101 00011110 10111000 01010001 11101011 10000101 00011111
      Exp = 1021 (-1)
      111 11111111
      Man = .10101 00011110 10111000 01010001 11101011 10000101 00011111
      3.3000000000000 002e-01

      The first line is the 64-bit double, then I split it into exponent and
      mantissa. The last line is the *printf format "%.16e" which will print
      in decimal all the precision that a 64-bit double has.

      Representing this value (or any double) wider than 17 decimal digits can
      only yield nonsense.

      --
      Joe Wright
      "Everything should be made as simple as possible, but not simpler."
      --- Albert Einstein ---

      Comment

      • James Kuyper

        Re: Rounding double

        Richard Heathfield wrote:
        James Kuyper said:
        ....
        >The convention I'm familiar with accepts that most floating point
        >operations are inexact,
        >
        The convention I'm familiar with is that we believe what people say unless
        or until we have reason to believe they are lying or mistaken. I believed
        what the OP said. If you choose to believe that he was lying or mistaken,
        that's entirely up to you.
        I believe that he was using English as it is normally used, with
        infinitely many implicit assumptions. Failing to state assumptions that
        are conventionally left unstated would make him neither mistaken nor a liar.

        Comment

        • Richard Heathfield

          Re: Rounding double

          Juha Nieminen said:
          Richard Heathfield wrote:
          > *value = (int)(*value * p + 0.5) / (double)p;
          >
          Using "int(value+ .5)" is the wrong way to round because it works
          incorrectly with negative values.
          Please bear in mind that the above code was not mine. Earlier in that
          article, I wrote:

          "Elsethread , you were given this suggestion (suitably modified so that it
          will actually compile, and with a driver added):"

          and further on in that same article, I wrote:

          "As you can see, it doesn't really round at all."
          The correct way is "std::floor(val ue+.5)".
          No, that doesn't work in C. So we adjust it to:

          floor(value+.5)

          which has the merit of compiling in C, and the disadvantage of being poor
          style in C++.

          But it still fails to round the value of a double to a specified number of
          decimal places. On my system, floor(0.33 * 10.0 + 0.5) / 10.0 yields
          0.2999999999999 99988898 - which is wrong in the first decimal place.

          --
          Richard Heathfield <http://www.cpax.org.uk >
          Email: -http://www. +rjh@
          Google users: <http://www.cpax.org.uk/prg/writings/googly.php>
          "Usenet is a strange place" - dmr 29 July 1999

          Comment

          • Richard Heathfield

            Re: Rounding double

            James Kuyper said:

            <snip>
            You've made it clear that you're aware that the code depends upon C99
            features, but as far as I can tell you've not yet attempted to compile
            it with a compiler in a mode where that compiler supports those features
            of C99.
            Right. I don't have such a compiler. So I did the best I could with the
            features provided by gcc extensions. It is, of course, now clear from the
            results I got that those gcc extensions were not compatible with the C99
            features used by the code.

            <snip>

            --
            Richard Heathfield <http://www.cpax.org.uk >
            Email: -http://www. +rjh@
            Google users: <http://www.cpax.org.uk/prg/writings/googly.php>
            "Usenet is a strange place" - dmr 29 July 1999

            Comment

            • Bart

              Re: Rounding double

              On Nov 22, 11:10 pm, Flash Gordon <s...@flash-gordon.me.ukwro te:
              Which is precisely the problem. If it is wanted for display then there
              are better ways, if it is wanted for further calculations then it is
              very important that the OP understand why it is not possible in general.
              I see the argument is still going on.

              If taken to it's logical conclusion then:

              double x = 0.1;

              is not possible to do in general. In that case we can all give up.

              The rounding problem has a solution which works within the limitations
              of binary floating point, like many things.

              Bart

              #include <stdio.h>
              #include <math.h>

              int main(void)
              {
              double x=0.1;
              double y;

              y=(x*10.0-1.0);

              printf("0.1*10 - 1.0 = %e\n",y);
              }

              Comment

              • Richard Heathfield

                Re: Rounding double

                Bart said:
                On Nov 22, 11:10 pm, Flash Gordon <s...@flash-gordon.me.ukwro te:
                >
                >Which is precisely the problem. If it is wanted for display then there
                >are better ways, if it is wanted for further calculations then it is
                >very important that the OP understand why it is not possible in general.
                >
                I see the argument is still going on.
                >
                If taken to it's logical conclusion then:
                >
                double x = 0.1;
                >
                is not possible to do in general.
                It's possible and legal to initialise x in this way. What we *can't* do
                (and this is the trap that many programmers fall into) is now assume that
                x stores a value that is exactly one-tenth.
                In that case we can all give up.
                No, there's no need to give up - we just have to be aware that we can't
                always do with floating point representation the things we might like to
                do, things that we *can* do with a textual representation. That doesn't
                mean that floating point representation is useless. It just means that we
                shouldn't expect it to do things that, by its very nature, it can't do.

                Mathematicians have a similar problem, in that no sufficiently powerful
                formal system can be both complete and consistent (both highly desirable
                qualities) - but that doesn't stop mathematicians from using mathematics.
                It just means they have to be careful *how* they use it. Similarly,
                computer programmers need to be careful how they use floating point
                representation.

                <snip>

                --
                Richard Heathfield <http://www.cpax.org.uk >
                Email: -http://www. +rjh@
                Google users: <http://www.cpax.org.uk/prg/writings/googly.php>
                "Usenet is a strange place" - dmr 29 July 1999

                Comment

                • Gordon Burditt

                  Re: Rounding double

                  >>pgp@medusa-s2:~/tmp$ gcc -std=c99 -pedantic -W -Wall -lm jn.c -ojn
                  >>jn.c:16: warning: unused parameter 'argc'
                  >>pgp@medusa-s2:~/tmp$ ./jn 0.33
                  >>0.33000000000 00000155: 0 decimals 0.0000000000000 000
                  >>0.33000000000 00000155: 1 decimals 0.3000000000000 000
                  >>0.33000000000 00000155: 2 decimals 0.3300000000000 000
                  >>0.33000000000 00000155: 3 decimals 0.3300000000000 000
                  >>0.33000000000 00000155: 4 decimals 0.3300000000000 000
                  >>0.33000000000 00000155: 5 decimals 0.3300000000000 000
                  >>0.33000000000 00000155: 6 decimals 0.3300000000000 000
                  >>0.33000000000 00000155: 7 decimals 0.3300000000000 000
                  >>0.33000000000 00000155: 8 decimals 0.3300000000000 000
                  >>0.33000000000 00000155: 9 decimals 0.3300000000000 000
                  >>0.33000000000 00000155: 10 decimals 0.3300000000000 000
                  >>0.33000000000 00000155: 11 decimals 0.3300000000000 000
                  >>0.33000000000 00000155: 12 decimals 0.3300000000000 000
                  >>0.33000000000 00000155: 13 decimals 0.3300000000000 000
                  >>0.33000000000 00000155: 14 decimals 0.3300000000000 000
                  >>>
                  >>I'm disappointed with the misleading results here which suggest that the
                  >>LHS has trailing cruft which the RHS doesn't, when both have such cruft.
                  >>
                  >None of the floating-point numbers printed here can be represented
                  >exactly in binary floating point, except zero. For debugging
                  >purposes, I recommend a better output format, say %300.200f ,
                  >enough to ensure that you can have enough digits to exactly represent
                  >the number you get as a result (this is quite a bit more than
                  >FLT_DIG, DBL_DIG, or LDBL_DIG, as appropriate to the type being
                  >used). Or perhaps someone is using base-10 floating point.
                  >>
                  >I think you're asking too much here. Note..
                  No, I'm not. I want you to print out enough digits to get the
                  *EXACT* value of the result you actually got. This doesn't make
                  sense when you are interested in the value you are calculating, but
                  it does make sense when you are debugging floating-point rounding
                  problems. (You will never get an infinite repeating decimal taking
                  binary floating point values with finite mantissa bits and converting
                  them to decimal).

                  >00111111 11010101 00011110 10111000 01010001 11101011 10000101 00011111
                  >Exp = 1021 (-1)
                  111 11111111
                  >Man = .10101 00011110 10111000 01010001 11101011 10000101 00011111
                  >3.300000000000 0002e-01
                  >
                  >The first line is the 64-bit double, then I split it into exponent and
                  >mantissa. The last line is the *printf format "%.16e" which will print
                  >in decimal all the precision that a 64-bit double has.
                  But it's not enough to print the exact value you are getting. When
                  you are debugging rounding problems, why introduce *more* rounding
                  error that may obscure the problem you are trying to debug?
                  >Representing this value (or any double) wider than 17 decimal digits can
                  >only yield nonsense.
                  No, it's not nonsense. The value you *actually got* can be
                  represented exactly if you use enough digits. The value you should
                  have gotten in infinite-precision math, and taking into account the
                  accuracy of the inputs cannot be, and you have a point outside the
                  context of debugging rounding issues.

                  Comment

                  • RoS

                    Re: Rounding double

                    In data Sat, 24 Nov 2007 20:34:46 -0000, Gordon Burditt scrisse:
                    >>>pgp@medusa-s2:~/tmp$ gcc -std=c99 -pedantic -W -Wall -lm jn.c -ojn
                    >>>jn.c:16: warning: unused parameter 'argc'
                    >>>pgp@medusa-s2:~/tmp$ ./jn 0.33
                    >>>0.3300000000 000000155: 0 decimals 0.0000000000000 000
                    >>>0.3300000000 000000155: 1 decimals 0.3000000000000 000
                    >>>0.3300000000 000000155: 2 decimals 0.3300000000000 000
                    >>>0.3300000000 000000155: 3 decimals 0.3300000000000 000
                    >>>0.3300000000 000000155: 4 decimals 0.3300000000000 000
                    >>>0.3300000000 000000155: 5 decimals 0.3300000000000 000
                    >>>0.3300000000 000000155: 6 decimals 0.3300000000000 000
                    >>>0.3300000000 000000155: 7 decimals 0.3300000000000 000
                    >>>0.3300000000 000000155: 8 decimals 0.3300000000000 000
                    >>>0.3300000000 000000155: 9 decimals 0.3300000000000 000
                    >>>0.3300000000 000000155: 10 decimals 0.3300000000000 000
                    >>>0.3300000000 000000155: 11 decimals 0.3300000000000 000
                    >>>0.3300000000 000000155: 12 decimals 0.3300000000000 000
                    >>>0.3300000000 000000155: 13 decimals 0.3300000000000 000
                    >>>0.3300000000 000000155: 14 decimals 0.3300000000000 000
                    >>>>
                    >>>I'm disappointed with the misleading results here which suggest that the
                    >>>LHS has trailing cruft which the RHS doesn't, when both have such cruft.
                    >>>
                    >>None of the floating-point numbers printed here can be represented
                    >>exactly in binary floating point, except zero. For debugging
                    >>purposes, I recommend a better output format, say %300.200f ,
                    >>enough to ensure that you can have enough digits to exactly represent
                    >>the number you get as a result (this is quite a bit more than
                    >>FLT_DIG, DBL_DIG, or LDBL_DIG, as appropriate to the type being
                    >>used). Or perhaps someone is using base-10 floating point.
                    >>>
                    >>I think you're asking too much here. Note..
                    >
                    >No, I'm not. I want you to print out enough digits to get the
                    >*EXACT* value of the result you actually got. This doesn't make
                    >sense when you are interested in the value you are calculating, but
                    >it does make sense when you are debugging floating-point rounding
                    >problems. (You will never get an infinite repeating decimal taking
                    >binary floating point values with finite mantissa bits and converting
                    >them to decimal).
                    >
                    >
                    >>00111111 11010101 00011110 10111000 01010001 11101011 10000101 00011111
                    >>Exp = 1021 (-1)
                    > 111 11111111
                    >>Man = .10101 00011110 10111000 01010001 11101011 10000101 00011111
                    >>3.30000000000 00002e-01
                    >>
                    >>The first line is the 64-bit double, then I split it into exponent and
                    >>mantissa. The last line is the *printf format "%.16e" which will print
                    >>in decimal all the precision that a 64-bit double has.
                    >
                    >But it's not enough to print the exact value you are getting. When
                    >you are debugging rounding problems, why introduce *more* rounding
                    >error that may obscure the problem you are trying to debug?
                    >
                    >>Representin g this value (or any double) wider than 17 decimal digits can
                    >>only yield nonsense.
                    >
                    >No, it's not nonsense. The value you *actually got* can be
                    >represented exactly if you use enough digits. The value you should
                    >have gotten in infinite-precision math, and taking into account the
                    >accuracy of the inputs cannot be, and you have a point outside the
                    >context of debugging rounding issues.
                    i agree with above
                    the only place where there should be some approximation is
                    in the traslation "float to string" (or "string to float")
                    e.g.
                    1234567
                    fun(7, 12.999999955555 6)="13.0000000 "
                    or when print in stdout or in a file

                    and this could[should?] be done in the "string" not in the float
                    so i could say: "no approximation for float"!

                    Comment

                    • James Kanze

                      Re: Rounding double

                      On Nov 22, 10:12 am, Richard Heathfield <r...@see.sig.i nvalidwrote:
                      Jim Langston said:
                      "Richard Heathfield" <r...@see.sig.i nvalidwrote in message
                      news:v9qdnXSpzf rRsNjanZ2dnUVZ8 sWhnZ2d@bt.com. ..
                      Well, you have a syntax error right there: double &value isn't legal.
                      You presumably meant double *value.
                      Actually, double& value is legal in C++ but not C.
                      Right - but of course a cross-posted article should "work" in all the
                      groups into which it's posted.
                      Certainly. But here, the original question is actually relevant
                      to both groups: for the most part, the C++ standard handles
                      floating point by saying: see the C standard (or alternatively,
                      by duplicating the wording of the C standard). I'd guess that
                      whoever posted the answer didn't notice that he was responding
                      to a cross-posted article.
                      Here, the post worked in clc++ but not in clc, so either it
                      should not have been posted to clc or the pointer syntax
                      should have been used rather than the reference syntax (at
                      which point, of course, there would have been howls of protest
                      from the clc++ crowd, and perhaps rightly so).
                      Shit happens. Since the pointer syntax is a legal in C++ as
                      well, you shouldn't hear to many complaints from that side; it's
                      what I use when I'm aware of a cross-posting. (In this case, I
                      didn't happen to notice the thread until it was fairly long.
                      And just seeing the names of the posters signaled that it was a
                      cross-posting---I've never seen you respond to a posting in
                      clc++ which wasn't cross-posted, and you're not the only one in
                      this case.)
                      This was cross posted to two newsgroups (c.l.c++ and c.l.c)
                      which is usually a bad idea just for this problem.
                      Indeed.
                      The languages aren't unrelated, and in this case, the original
                      question was quite appropriate for a cross-posting. IMHO, it
                      doesn't seem to be asking too much to be somewhat tolerant with
                      regards to the actual syntax used in the answers, on the
                      grounds that whoever is responding may not be aware of the
                      cross-posting. (Not to criticize your original response.
                      Nothing wrong with mentionning that the syntax isn't legal in
                      the language of the group in which you're posting, as long as
                      you go on to address the real issues, as you did. Let's just
                      not get hung up with it.)

                      --
                      James Kanze (GABI Software) email:james.kan ze@gmail.com
                      Conseils en informatique orientée objet/
                      Beratung in objektorientier ter Datenverarbeitu ng
                      9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34

                      Comment

                      • James Kanze

                        Re: Rounding double

                        On Nov 23, 12:13 pm, Kai-Uwe Bux <jkherci...@gmx .netwrote:
                        Richard Heathfield wrote:
                        jacob navia said:
                        <snip>
                        2: As you can see, my solution works in your machine.
                        Either RH is not telling the truth (unlikely) or he is
                        using some minor error in the code to trip about.
                        If your code has an error, however minor, then it is broken,
                        in which case I suggest you post a fixed version. The
                        version you posted *does not work* on my system. Quite apart
                        from the fact that you are trying to solve an impossible
                        problem (the problem, remember, is that of rounding the
                        value of a double to a specified number of decimal places,
                        which simply can't be done), you are trying to solve it in a
                        way that produces bizarrely incorrect results on at least
                        one system.
                        Hm. I wonder if this might be a matter of interpreting the
                        problem.
                        The C standard says about sqrt() that it computed the
                        nonnegative square root. C++ inherits this requirement. If
                        your interpretation of the rounding problem is correct and if
                        we transfer it to the other arithmetic operations, then there
                        can be no conforming implementations . In fact, even elementary
                        school arithmetic (+,-,*,/) cannot be done correctly under
                        that interpretation. However, that interpretation of the specs
                        is not the only possible.
                        A different interpretation of floating point computations is
                        that an operation (say multiplication, addition, sqrt, or
                        rounding to a given number of decimal places) should yield a
                        double (or float or whatever types are topical in the
                        corresponding newsgroup) that is closest to the exact
                        mathematical result. If I recall correctly, this is by and
                        large the position taken by IEEE754.
                        The problem is that neither the C nor the C++ standard require
                        such. And that both standards also allow greater precision in
                        the intermediate results. A liberty that is, in fact, used by
                        most compilers for at least one very common architecture today.
                        And which makes "rounding" using just floating point arithmetic
                        very, very difficult. (I remember seeing an implementation of
                        modf, a very long time ago, which used so trick involving
                        multiplication and division in a way to end up with the integral
                        part as a result of loss of precision. The person who was
                        porting the library to the 8086 was astonished to find that the
                        value written through the iptr argument wasn't an integer.)
                        When this (reasonable) interpretation is adopted,
                        I'm not too sure about the "reasonable " part. On an Intel
                        machine, a * b, where a and b are both double, does NOT give the
                        exact result, rounded to the nearest representable double. And
                        although Intel processors are pretty scarce in the milieu where
                        I work (the last Intel I actively programmed on was an 80386),
                        I've heard that they are still in use. (And other processors,
                        such as the AMD 64 bit processor on my home PC, also behave this
                        way.)
                        the problem of rounding to a fixed number of decimals is
                        solvable (and it is indeed not different from any other
                        computational problem). And if you don't adopt an
                        interpretation like that, floating point arithmetic in general
                        is "impossible ".
                        I'd still leave it up to the implementation to handle the tricky
                        parts. Multiply by a power of 10, floor(), ciel() or round(),
                        and then divide by the same power, and you should get something
                        fairly close to a correct result (and if you're talking about
                        "rounding in base 10", close is all you're going to get).

                        --
                        James Kanze (GABI Software) email:james.kan ze@gmail.com
                        Conseils en informatique orientée objet/
                        Beratung in objektorientier ter Datenverarbeitu ng
                        9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34

                        Comment

                        • jacob navia

                          Re: Rounding double

                          James Kanze wrote:
                          I'd still leave it up to the implementation to handle the tricky
                          parts. Multiply by a power of 10, floor(), ciel() or round(),
                          and then divide by the same power, and you should get something
                          fairly close to a correct result (and if you're talking about
                          "rounding in base 10", close is all you're going to get).
                          >
                          This is exactly what my function does. It is a close implementation of
                          the rounding, and never claimed to be the *exact* since that is impossible!
                          --
                          James Kanze (GABI Software) email:james.kan ze@gmail.com
                          Conseils en informatique orientée objet/
                          Beratung in objektorientier ter Datenverarbeitu ng
                          9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34

                          --
                          jacob navia
                          jacob at jacob point remcomp point fr
                          logiciels/informatique

                          Comment

                          • Mark McIntyre

                            Re: Rounding double

                            On Sun, 25 Nov 2007 18:29:02 +0100, in comp.lang.c , jacob navia
                            <jacob@nospam.c omwrote:
                            >Richard wrote:
                            (trollish stuff)
                            >Of course I agree with your short description of these people.
                            Two trolls agreeing with each other is hardly interesting.
                            --
                            Mark McIntyre

                            "Debugging is twice as hard as writing the code in the first place.
                            Therefore, if you write the code as cleverly as possible, you are,
                            by definition, not smart enough to debug it."
                            --Brian Kernighan

                            Comment

                            • Richard

                              Re: Rounding double

                              Mark McIntyre <markmcintyre@s pamcop.netwrite s:
                              On Sun, 25 Nov 2007 18:29:02 +0100, in comp.lang.c , jacob navia
                              <jacob@nospam.c omwrote:
                              >
                              >>Richard wrote:
                              >
                              (trollish stuff)
                              Facts of life with regard to C.L.C you mean.
                              >
                              >>Of course I agree with your short description of these people.
                              >
                              Two trolls agreeing with each other is hardly interesting.
                              Interesting enough for you to reply to I notice. Of course being in the
                              "small clique" which causes so much repugnance, you are indeed honour
                              bound to reply. In fairness at least you don't tell people to RTFM as
                              frequently as your namesake Blumel.

                              Comment

                              • James Kuyper

                                Re: Rounding double

                                jacob navia wrote:
                                James Kanze wrote:
                                >
                                >I'd still leave it up to the implementation to handle the tricky
                                >parts. Multiply by a power of 10, floor(), ciel() or round(),
                                >and then divide by the same power, and you should get something
                                >fairly close to a correct result (and if you're talking about
                                >"rounding in base 10", close is all you're going to get).
                                >>
                                >
                                This is exactly what my function does. It is a close implementation of
                                the rounding, and never claimed to be the *exact* since that is impossible!
                                No, your function does not use floor(), ceil(), or round() to calculate
                                floating point representations of the intermediate integral value. It
                                uses conversion to long long for that purpose. As a result it relies
                                upon non-portable assumptions about the relationship between the
                                precision of long long and the precision of long double. On
                                implementations where that assumption is invalid, your algorithm
                                unnecessarily produces results for some argument values that are less
                                accurate than they could be with a more appropriate algorithm. Note:
                                such an algorithm should actually use floorl(), ceill() or roundl(),
                                rather than floor(), ceil() and round(), for precisely the same reason
                                that your algorithm uses powl() rather than pow().

                                Comment

                                Working...