Question
For this problem, you can write one program and display the results at the same time. You must use the Debian version of the UNIX operating system. As discussed in class, round-off errors can cause problems for some classes of calculations. This problem will illustrate round-off errors for a problem that involves a large number of calculations.
(a) Write a C++ program to add the number 0.00001 one million times using single-precision arithmetic. Recall that in C++, floating-point literals are double precision by default; to declare a single-precision literal you need to add the suffix "f" or "F".
(b) In the same program, repeat (a) using double precision.
(c) Can you explain the difference in results between (a) and (b)?
Explanation / Answer
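Part (a):
#include <iostream>
using namespace std;
int main() {
    // Accumulate in single precision; the "f" suffix makes the literal a float.
    float round_off = 0.0f;
    // Add 0.00001 exactly one million times.
    for (int i = 0; i < 1000000; i++) {
        round_off = round_off + 0.00001f;
    }
    cout << round_off << endl;
    return 0;
}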
Output: 9.91735
Expected Output: 10
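Part (b):
#include <iostream>
using namespace std;
int main() {
    // Same loop in double precision; an unsuffixed literal is already a double.
    double round_off = 0.0;
    // Add 0.00001 exactly one million times.
    for (int i = 0; i < 1000000; i++) {
        round_off = round_off + 0.00001;
    }
    cout << round_off << endl;
    return 0;
}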
Output: 10
Expected Output: 10
Part (c):
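The single-precision result differs from the exact value 10 by about 0.08265 (10 - 9.91735), while the double-precision result prints as 10.
The error comes from the 32-bit binary format used for single precision: 1 sign bit, 8 exponent bits and 23 mantissa bits, which gives only about 7 significant decimal digits.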
0.00001 cannot be represented exactly in binary. The nearest single-precision value is stored as 0 01101110 01001111100010110101100 (sign, exponent and mantissa), so the number being added is already only an approximation of 0.00001.
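Every addition must also round its result to fit the same 24 significant bits. As the running sum grows, the gap between neighboring representable floats grows with it. Once the sum reaches 8, the gap is 2^-20, about 0.00000095, so 0.00001 is about 10.5 gaps wide and each addition is rounded down to exactly 10 gaps, about 0.0000095. From that point on every step adds slightly less than 0.00001, and over the remaining additions these small losses accumulate into the observed shortfall.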
As a simplified illustration of how much one lost bit can matter: (100001)2 + (1)2 = (100010)2 = (34)10, whereas with one bit dropped from the first operand, (10001)2 + (1)2 = (10010)2 = (18)10, a difference of 16. In our case the bits that are lost are the least significant bits of the mantissa, so each addition is off by only a tiny amount, and the accumulated error is 0.08265 rather than 16.
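Double precision gives the correct-looking result because its 53-bit significand (about 16 significant decimal digits) makes the rounding error of each addition far smaller. The additions are still rounded, but the accumulated error is well below the 6 significant digits that cout prints by default.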