a. Suppose A, B, and C are normalized 32-bit IEEE 754 floating point variables w
Question
a. Suppose A, B, and C are normalized 32-bit IEEE 754 floating point variables, with A having a real value of 1.1001 x 2^60 and B having a real value of 1.01 x 2^20. After the high level language assignment statement "C=A+B", why is C's value equal to A's value and not the mathematically correct sum?
b. For the same values of A and B in part a, would the high level language assignment statement "C=A+B" assign C the mathematically correct sum if A, B, and C used the 64-bit IEEE 754 floating point format? Explain.
Explanation / Answer
a.
Given A = 1.1001 x 2^60 and B = 1.01 x 2^20 in binary scientific notation.
To add two floating point numbers, the exponents of both operands must be the same. To align them, the CPU right-shifts B's significand by 40 bit positions, which is the difference between the two exponents (60 - 20), and then performs the addition.
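After the addition, the hardware rounds the sum to fit the format. The 32-bit format keeps only 24 significand bits (23 stored bits plus the implicit leading 1). The exponent difference of 40 is larger than that, so every bit of the shifted B falls below the last bit that is kept and is rounded away. The result is therefore exactly equal to A.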
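The effect in part a can be checked with a short Python sketch. It assumes the significands 1.1001 and 1.01 are binary (1.5625 and 1.25 in decimal). Python has no built-in 32-bit float type, so the helper to_float32 below is a hypothetical name that rounds a value to single precision through the struct module.

```python
import struct

def to_float32(x: float) -> float:
    """Round a Python float (binary64) to the nearest binary32 value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

# Assumption: the significands in the question are binary.
A = to_float32(0b11001 / 16 * 2.0**60)  # 1.1001 (binary) x 2^60
B = to_float32(0b101 / 4 * 2.0**20)     # 1.01 (binary) x 2^20

# The binary64 sum A + B is exact for these operands, so rounding it
# to binary32 matches what a single-precision adder would produce.
C = to_float32(A + B)

print(C == A)  # True: B was rounded away entirely
```

At 2^60 the spacing between adjacent 32-bit values is 2^37, and B is smaller than half of that, so the rounded sum lands back on A.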
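b. Yes. The same alignment step applies, but the 64-bit format keeps 53 significand bits (52 stored bits plus the implicit leading 1). Reading the significands as binary, the exact sum occupies bit positions 2^60 down to 2^18, which is only 43 bits. After B is shifted right by 40, all of its bits still fit in the significand, no rounding is needed, and C is assigned the mathematically correct sum rather than A's value.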
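Part b can be checked directly, since Python's float is the 64-bit IEEE 754 format. This is a minimal sketch that again assumes binary significands; in 64-bit arithmetic the 40-bit shift fits inside the 53-bit significand, so the sum is exact. The comparison against exact integer arithmetic relies on A and B both being whole numbers.

```python
# Assumption: the significands in the question are binary.
A = 1.5625 * 2.0**60  # 1.1001 (binary) x 2^60
B = 1.25 * 2.0**20    # 1.01 (binary) x 2^20

C = A + B  # binary64 addition

# Python integers are arbitrary precision, so this is the exact sum.
exact = int(A) + int(B)

print(C == A)           # False: B is not lost
print(int(C) == exact)  # True: C is the mathematically correct sum
```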