Rounding double

**Juha Nieminen** · Nov 24 '07, 02:55 PM

Re: Rounding double

Richard Heathfield wrote:

*value = (int)(*value * p + 0.5) / (double)p;

Using "int(value+ .5)" is the wrong way to round because it works
incorrectly with negative values.
The correct way is "std::floor(val ue+.5)".

**Joe Wright** · Nov 24 '07, 03:05 PM

Re: Rounding double

Gordon Burditt wrote:

>pgp@medusa-s2:~/tmp$ gcc -std=c99 -pedantic -W -Wall -lm jn.c -ojn
>jn.c:16: warning: unused parameter 'argc'
>pgp@medusa-s2:~/tmp$ ./jn 0.33
>0.330000000000 0000155: 0 decimals 0.0000000000000 000
>0.330000000000 0000155: 1 decimals 0.3000000000000 000
>0.330000000000 0000155: 2 decimals 0.3300000000000 000
>0.330000000000 0000155: 3 decimals 0.3300000000000 000
>0.330000000000 0000155: 4 decimals 0.3300000000000 000
>0.330000000000 0000155: 5 decimals 0.3300000000000 000
>0.330000000000 0000155: 6 decimals 0.3300000000000 000
>0.330000000000 0000155: 7 decimals 0.3300000000000 000
>0.330000000000 0000155: 8 decimals 0.3300000000000 000
>0.330000000000 0000155: 9 decimals 0.3300000000000 000
>0.330000000000 0000155: 10 decimals 0.3300000000000 000
>0.330000000000 0000155: 11 decimals 0.3300000000000 000
>0.330000000000 0000155: 12 decimals 0.3300000000000 000
>0.330000000000 0000155: 13 decimals 0.3300000000000 000
>0.330000000000 0000155: 14 decimals 0.3300000000000 000
>>
>I'm disappointed with the misleading results here which suggest that the
>LHS has trailing cruft which the RHS doesn't, when both have such cruft.

>
None of the floating-point numbers printed here can be represented
exactly in binary floating point, except zero. For debugging
purposes, I recommend a better output format, say %300.200f ,
enough to ensure that you can have enough digits to exactly represent
the number you get as a result (this is quite a bit more than
FLT_DIG, DBL_DIG, or LDBL_DIG, as appropriate to the type being
used). Or perhaps someone is using base-10 floating point.
>

I think you're asking too much here. Note..

00111111 11010101 00011110 10111000 01010001 11101011 10000101 00011111
Exp = 1021 (-1)
111 11111111
Man = .10101 00011110 10111000 01010001 11101011 10000101 00011111
3.3000000000000 002e-01

The first line is the 64-bit double, then I split it into exponent and
mantissa. The last line is the *printf format "%.16e" which will print
in decimal all the precision that a 64-bit double has.

Representing this value (or any double) wider than 17 decimal digits can
only yield nonsense.

--
Joe Wright
"Everything should be made as simple as possible, but not simpler."
--- Albert Einstein ---

**James Kuyper** · Nov 24 '07, 03:05 PM

Re: Rounding double

Richard Heathfield wrote:

James Kuyper said:

....

>The convention I'm familiar with accepts that most floating point
>operations are inexact,

>
The convention I'm familiar with is that we believe what people say unless
or until we have reason to believe they are lying or mistaken. I believed
what the OP said. If you choose to believe that he was lying or mistaken,
that's entirely up to you.

I believe that he was using English as it is normally used, with
infinitely many implicit assumptions. Failing to state assumptions that
are conventionally left unstated would make him neither mistaken nor a liar.

**Richard Heathfield** · Nov 24 '07, 03:05 PM

Re: Rounding double

Juha Nieminen said:

Richard Heathfield wrote:

> *value = (int)(*value * p + 0.5) / (double)p;

>
Using "int(value+ .5)" is the wrong way to round because it works
incorrectly with negative values.

Please bear in mind that the above code was not mine. Earlier in that
article, I wrote:

"Elsethread , you were given this suggestion (suitably modified so that it
will actually compile, and with a driver added):"

and further on in that same article, I wrote:

"As you can see, it doesn't really round at all."

The correct way is "std::floor(val ue+.5)".

No, that doesn't work in C. So we adjust it to:

floor(value+.5)

which has the merit of compiling in C, and the disadvantage of being poor
style in C++.

But it still fails to round the value of a double to a specified number of
decimal places. On my system, floor(0.33 * 10.0 + 0.5) / 10.0 yields
0.2999999999999 99988898 - which is wrong in the first decimal place.

--
Richard Heathfield <http://www.cpax.org.uk >
Email: -http://www. +rjh@
Google users: <http://www.cpax.org.uk/prg/writings/googly.php>
"Usenet is a strange place" - dmr 29 July 1999

**Richard Heathfield** · Nov 24 '07, 03:15 PM

Re: Rounding double

James Kuyper said:

<snip>

You've made it clear that you're aware that the code depends upon C99
features, but as far as I can tell you've not yet attempted to compile
it with a compiler in a mode where that compiler supports those features
of C99.

Right. I don't have such a compiler. So I did the best I could with the
features provided by gcc extensions. It is, of course, now clear from the
results I got that those gcc extensions were not compatible with the C99
features used by the code.

<snip>

--
Richard Heathfield <http://www.cpax.org.uk >
Email: -http://www. +rjh@
Google users: <http://www.cpax.org.uk/prg/writings/googly.php>
"Usenet is a strange place" - dmr 29 July 1999

**Bart** · Nov 24 '07, 03:15 PM

Re: Rounding double

On Nov 22, 11:10 pm, Flash Gordon <s...@flash-gordon.me.ukwro te:

Which is precisely the problem. If it is wanted for display then there
are better ways, if it is wanted for further calculations then it is
very important that the OP understand why it is not possible in general.

I see the argument is still going on.

If taken to it's logical conclusion then:

double x = 0.1;

is not possible to do in general. In that case we can all give up.

The rounding problem has a solution which works within the limitations
of binary floating point, like many things.

Bart

#include <stdio.h>
#include <math.h>

int main(void)
{
double x=0.1;
double y;

y=(x*10.0-1.0);

printf("0.1*10 - 1.0 = %e\n",y);
}

**Richard Heathfield** · Nov 24 '07, 03:45 PM

Re: Rounding double

Bart said:

On Nov 22, 11:10 pm, Flash Gordon <s...@flash-gordon.me.ukwro te:
>

>Which is precisely the problem. If it is wanted for display then there
>are better ways, if it is wanted for further calculations then it is
>very important that the OP understand why it is not possible in general.

>
I see the argument is still going on.
>
If taken to it's logical conclusion then:
>
double x = 0.1;
>
is not possible to do in general.

It's possible and legal to initialise x in this way. What we *can't* do
(and this is the trap that many programmers fall into) is now assume that
x stores a value that is exactly one-tenth.

In that case we can all give up.

No, there's no need to give up - we just have to be aware that we can't
always do with floating point representation the things we might like to
do, things that we *can* do with a textual representation. That doesn't
mean that floating point representation is useless. It just means that we
shouldn't expect it to do things that, by its very nature, it can't do.

Mathematicians have a similar problem, in that no sufficiently powerful
formal system can be both complete and consistent (both highly desirable
qualities) - but that doesn't stop mathematicians from using mathematics.
It just means they have to be careful *how* they use it. Similarly,
computer programmers need to be careful how they use floating point
representation.

<snip>

--
Richard Heathfield <http://www.cpax.org.uk >
Email: -http://www. +rjh@
Google users: <http://www.cpax.org.uk/prg/writings/googly.php>
"Usenet is a strange place" - dmr 29 July 1999

**Gordon Burditt** · Nov 24 '07, 08:35 PM

Re: Rounding double

>>pgp@medusa-s2:~/tmp$ gcc -std=c99 -pedantic -W -Wall -lm jn.c -ojn

>>jn.c:16: warning: unused parameter 'argc'
>>pgp@medusa-s2:~/tmp$ ./jn 0.33
>>0.33000000000 00000155: 0 decimals 0.0000000000000 000
>>0.33000000000 00000155: 1 decimals 0.3000000000000 000
>>0.33000000000 00000155: 2 decimals 0.3300000000000 000
>>0.33000000000 00000155: 3 decimals 0.3300000000000 000
>>0.33000000000 00000155: 4 decimals 0.3300000000000 000
>>0.33000000000 00000155: 5 decimals 0.3300000000000 000
>>0.33000000000 00000155: 6 decimals 0.3300000000000 000
>>0.33000000000 00000155: 7 decimals 0.3300000000000 000
>>0.33000000000 00000155: 8 decimals 0.3300000000000 000
>>0.33000000000 00000155: 9 decimals 0.3300000000000 000
>>0.33000000000 00000155: 10 decimals 0.3300000000000 000
>>0.33000000000 00000155: 11 decimals 0.3300000000000 000
>>0.33000000000 00000155: 12 decimals 0.3300000000000 000
>>0.33000000000 00000155: 13 decimals 0.3300000000000 000
>>0.33000000000 00000155: 14 decimals 0.3300000000000 000
>>>
>>I'm disappointed with the misleading results here which suggest that the
>>LHS has trailing cruft which the RHS doesn't, when both have such cruft.

>>
>None of the floating-point numbers printed here can be represented
>exactly in binary floating point, except zero. For debugging
>purposes, I recommend a better output format, say %300.200f ,
>enough to ensure that you can have enough digits to exactly represent
>the number you get as a result (this is quite a bit more than
>FLT_DIG, DBL_DIG, or LDBL_DIG, as appropriate to the type being
>used). Or perhaps someone is using base-10 floating point.
>>

>I think you're asking too much here. Note..

No, I'm not. I want you to print out enough digits to get the
*EXACT* value of the result you actually got. This doesn't make
sense when you are interested in the value you are calculating, but
it does make sense when you are debugging floating-point rounding
problems. (You will never get an infinite repeating decimal taking
binary floating point values with finite mantissa bits and converting
them to decimal).

>00111111 11010101 00011110 10111000 01010001 11101011 10000101 00011111
>Exp = 1021 (-1)
111 11111111
>Man = .10101 00011110 10111000 01010001 11101011 10000101 00011111
>3.300000000000 0002e-01
>
>The first line is the 64-bit double, then I split it into exponent and
>mantissa. The last line is the *printf format "%.16e" which will print
>in decimal all the precision that a 64-bit double has.

But it's not enough to print the exact value you are getting. When
you are debugging rounding problems, why introduce *more* rounding
error that may obscure the problem you are trying to debug?

>Representing this value (or any double) wider than 17 decimal digits can
>only yield nonsense.

No, it's not nonsense. The value you *actually got* can be
represented exactly if you use enough digits. The value you should
have gotten in infinite-precision math, and taking into account the
accuracy of the inputs cannot be, and you have a point outside the
context of debugging rounding issues.

**RoS** · Nov 25 '07, 08:55 AM

Re: Rounding double

In data Sat, 24 Nov 2007 20:34:46 -0000, Gordon Burditt scrisse:

>>>pgp@medusa-s2:~/tmp$ gcc -std=c99 -pedantic -W -Wall -lm jn.c -ojn
>>>jn.c:16: warning: unused parameter 'argc'
>>>pgp@medusa-s2:~/tmp$ ./jn 0.33
>>>0.3300000000 000000155: 0 decimals 0.0000000000000 000
>>>0.3300000000 000000155: 1 decimals 0.3000000000000 000
>>>0.3300000000 000000155: 2 decimals 0.3300000000000 000
>>>0.3300000000 000000155: 3 decimals 0.3300000000000 000
>>>0.3300000000 000000155: 4 decimals 0.3300000000000 000
>>>0.3300000000 000000155: 5 decimals 0.3300000000000 000
>>>0.3300000000 000000155: 6 decimals 0.3300000000000 000
>>>0.3300000000 000000155: 7 decimals 0.3300000000000 000
>>>0.3300000000 000000155: 8 decimals 0.3300000000000 000
>>>0.3300000000 000000155: 9 decimals 0.3300000000000 000
>>>0.3300000000 000000155: 10 decimals 0.3300000000000 000
>>>0.3300000000 000000155: 11 decimals 0.3300000000000 000
>>>0.3300000000 000000155: 12 decimals 0.3300000000000 000
>>>0.3300000000 000000155: 13 decimals 0.3300000000000 000
>>>0.3300000000 000000155: 14 decimals 0.3300000000000 000
>>>>
>>>I'm disappointed with the misleading results here which suggest that the
>>>LHS has trailing cruft which the RHS doesn't, when both have such cruft.
>>>
>>None of the floating-point numbers printed here can be represented
>>exactly in binary floating point, except zero. For debugging
>>purposes, I recommend a better output format, say %300.200f ,
>>enough to ensure that you can have enough digits to exactly represent
>>the number you get as a result (this is quite a bit more than
>>FLT_DIG, DBL_DIG, or LDBL_DIG, as appropriate to the type being
>>used). Or perhaps someone is using base-10 floating point.
>>>

>>I think you're asking too much here. Note..

>
>No, I'm not. I want you to print out enough digits to get the
>*EXACT* value of the result you actually got. This doesn't make
>sense when you are interested in the value you are calculating, but
>it does make sense when you are debugging floating-point rounding
>problems. (You will never get an infinite repeating decimal taking
>binary floating point values with finite mantissa bits and converting
>them to decimal).
>
>

>>00111111 11010101 00011110 10111000 01010001 11101011 10000101 00011111
>>Exp = 1021 (-1)
> 111 11111111
>>Man = .10101 00011110 10111000 01010001 11101011 10000101 00011111
>>3.30000000000 00002e-01
>>
>>The first line is the 64-bit double, then I split it into exponent and
>>mantissa. The last line is the *printf format "%.16e" which will print
>>in decimal all the precision that a 64-bit double has.

>
>But it's not enough to print the exact value you are getting. When
>you are debugging rounding problems, why introduce *more* rounding
>error that may obscure the problem you are trying to debug?
>

>>Representin g this value (or any double) wider than 17 decimal digits can
>>only yield nonsense.

>
>No, it's not nonsense. The value you *actually got* can be
>represented exactly if you use enough digits. The value you should
>have gotten in infinite-precision math, and taking into account the
>accuracy of the inputs cannot be, and you have a point outside the
>context of debugging rounding issues.

i agree with above
the only place where there should be some approximation is
in the traslation "float to string" (or "string to float")
e.g.
1234567
fun(7, 12.999999955555 6)="13.0000000 "
or when print in stdout or in a file

and this could[should?] be done in the "string" not in the float
so i could say: "no approximation for float"!

**James Kanze** · Nov 25 '07, 04:45 PM

Re: Rounding double

On Nov 22, 10:12 am, Richard Heathfield <r...@see.sig.i nvalidwrote:

Jim Langston said:

"Richard Heathfield" <r...@see.sig.i nvalidwrote in message
news:v9qdnXSpzf rRsNjanZ2dnUVZ8 sWhnZ2d@bt.com. ..

Well, you have a syntax error right there: double &value isn't legal.
You presumably meant double *value.

Actually, double& value is legal in C++ but not C.

Right - but of course a cross-posted article should "work" in all the
groups into which it's posted.

Certainly. But here, the original question is actually relevant
to both groups: for the most part, the C++ standard handles
floating point by saying: see the C standard (or alternatively,
by duplicating the wording of the C standard). I'd guess that
whoever posted the answer didn't notice that he was responding
to a cross-posted article.

Here, the post worked in clc++ but not in clc, so either it
should not have been posted to clc or the pointer syntax
should have been used rather than the reference syntax (at
which point, of course, there would have been howls of protest
from the clc++ crowd, and perhaps rightly so).

Shit happens. Since the pointer syntax is a legal in C++ as
well, you shouldn't hear to many complaints from that side; it's
what I use when I'm aware of a cross-posting. (In this case, I
didn't happen to notice the thread until it was fairly long.
And just seeing the names of the posters signaled that it was a
cross-posting---I've never seen you respond to a posting in
clc++ which wasn't cross-posted, and you're not the only one in
this case.)

This was cross posted to two newsgroups (c.l.c++ and c.l.c)
which is usually a bad idea just for this problem.

Indeed.

The languages aren't unrelated, and in this case, the original
question was quite appropriate for a cross-posting. IMHO, it
doesn't seem to be asking too much to be somewhat tolerant with
regards to the actual syntax used in the answers, on the
grounds that whoever is responding may not be aware of the
cross-posting. (Not to criticize your original response.
Nothing wrong with mentionning that the syntax isn't legal in
the language of the group in which you're posting, as long as
you go on to address the real issues, as you did. Let's just
not get hung up with it.)

--
James Kanze (GABI Software) email:james.kan ze@gmail.com
Conseils en informatique orientée objet/
Beratung in objektorientier ter Datenverarbeitu ng
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34

**James Kanze** · Nov 25 '07, 05:05 PM

Re: Rounding double

On Nov 23, 12:13 pm, Kai-Uwe Bux <jkherci...@gmx .netwrote:

Richard Heathfield wrote:

jacob navia said:

<snip>

2: As you can see, my solution works in your machine.
Either RH is not telling the truth (unlikely) or he is
using some minor error in the code to trip about.

If your code has an error, however minor, then it is broken,
in which case I suggest you post a fixed version. The
version you posted *does not work* on my system. Quite apart
from the fact that you are trying to solve an impossible
problem (the problem, remember, is that of rounding the
value of a double to a specified number of decimal places,
which simply can't be done), you are trying to solve it in a
way that produces bizarrely incorrect results on at least
one system.

Hm. I wonder if this might be a matter of interpreting the
problem.

The C standard says about sqrt() that it computed the
nonnegative square root. C++ inherits this requirement. If
your interpretation of the rounding problem is correct and if
we transfer it to the other arithmetic operations, then there
can be no conforming implementations . In fact, even elementary
school arithmetic (+,-,*,/) cannot be done correctly under
that interpretation. However, that interpretation of the specs
is not the only possible.

A different interpretation of floating point computations is
that an operation (say multiplication, addition, sqrt, or
rounding to a given number of decimal places) should yield a
double (or float or whatever types are topical in the
corresponding newsgroup) that is closest to the exact
mathematical result. If I recall correctly, this is by and
large the position taken by IEEE754.

The problem is that neither the C nor the C++ standard require
such. And that both standards also allow greater precision in
the intermediate results. A liberty that is, in fact, used by
most compilers for at least one very common architecture today.
And which makes "rounding" using just floating point arithmetic
very, very difficult. (I remember seeing an implementation of
modf, a very long time ago, which used so trick involving
multiplication and division in a way to end up with the integral
part as a result of loss of precision. The person who was
porting the library to the 8086 was astonished to find that the
value written through the iptr argument wasn't an integer.)

When this (reasonable) interpretation is adopted,

I'm not too sure about the "reasonable " part. On an Intel
machine, a * b, where a and b are both double, does NOT give the
exact result, rounded to the nearest representable double. And
although Intel processors are pretty scarce in the milieu where
I work (the last Intel I actively programmed on was an 80386),
I've heard that they are still in use. (And other processors,
such as the AMD 64 bit processor on my home PC, also behave this
way.)

the problem of rounding to a fixed number of decimals is
solvable (and it is indeed not different from any other
computational problem). And if you don't adopt an
interpretation like that, floating point arithmetic in general
is "impossible ".

I'd still leave it up to the implementation to handle the tricky
parts. Multiply by a power of 10, floor(), ciel() or round(),
and then divide by the same power, and you should get something
fairly close to a correct result (and if you're talking about
"rounding in base 10", close is all you're going to get).

--
James Kanze (GABI Software) email:james.kan ze@gmail.com
Conseils en informatique orientée objet/
Beratung in objektorientier ter Datenverarbeitu ng
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34

**jacob navia** · Nov 25 '07, 05:35 PM

Re: Rounding double

James Kanze wrote:

I'd still leave it up to the implementation to handle the tricky
parts. Multiply by a power of 10, floor(), ciel() or round(),
and then divide by the same power, and you should get something
fairly close to a correct result (and if you're talking about
"rounding in base 10", close is all you're going to get).
>

This is exactly what my function does. It is a close implementation of
the rounding, and never claimed to be the *exact* since that is impossible!

--
James Kanze (GABI Software) email:james.kan ze@gmail.com
Conseils en informatique orientée objet/
Beratung in objektorientier ter Datenverarbeitu ng
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34

--
jacob navia
jacob at jacob point remcomp point fr
logiciels/informatique

403 Forbidden

http://www.cs.virginia.edu/~lcc-win32

**Mark McIntyre** · Nov 25 '07, 11:15 PM

Re: Rounding double

On Sun, 25 Nov 2007 18:29:02 +0100, in comp.lang.c , jacob navia
<jacob@nospam.c omwrote:

>Richard wrote:

(trollish stuff)

>Of course I agree with your short description of these people.

Two trolls agreeing with each other is hardly interesting.
--
Mark McIntyre

"Debugging is twice as hard as writing the code in the first place.
Therefore, if you write the code as cleverly as possible, you are,
by definition, not smart enough to debug it."
--Brian Kernighan

**Richard** · Nov 25 '07, 11:35 PM

Re: Rounding double

Mark McIntyre <markmcintyre@s pamcop.netwrite s:

On Sun, 25 Nov 2007 18:29:02 +0100, in comp.lang.c , jacob navia
<jacob@nospam.c omwrote:
>

>>Richard wrote:

>
(trollish stuff)

Facts of life with regard to C.L.C you mean.

>

>>Of course I agree with your short description of these people.

>
Two trolls agreeing with each other is hardly interesting.

Interesting enough for you to reply to I notice. Of course being in the
"small clique" which causes so much repugnance, you are indeed honour
bound to reply. In fairness at least you don't tell people to RTFM as
frequently as your namesake Blumel.

**James Kuyper** · Nov 26 '07, 02:15 AM

Re: Rounding double

jacob navia wrote:

James Kanze wrote:
>

>I'd still leave it up to the implementation to handle the tricky
>parts. Multiply by a power of 10, floor(), ciel() or round(),
>and then divide by the same power, and you should get something
>fairly close to a correct result (and if you're talking about
>"rounding in base 10", close is all you're going to get).
>>

>
This is exactly what my function does. It is a close implementation of
the rounding, and never claimed to be the *exact* since that is impossible!

No, your function does not use floor(), ceil(), or round() to calculate
floating point representations of the intermediate integral value. It
uses conversion to long long for that purpose. As a result it relies
upon non-portable assumptions about the relationship between the
precision of long long and the precision of long double. On
implementations where that assumption is invalid, your algorithm
unnecessarily produces results for some argument values that are less
accurate than they could be with a more appropriate algorithm. Note:
such an algorithm should actually use floorl(), ceill() or roundl(),
rather than floor(), ceil() and round(), for precisely the same reason
that your algorithm uses powl() rather than pow().

Rounding double

Comment

Comment

Comment

Comment

Comment

Comment

Comment

Comment

Comment

Comment

Comment

Comment

Comment

Comment

Comment