Controlling the rounding mode






Before using the tools presented here, please read this page carefully.


Most floating point operations generate a round-off error: the mathematical result is not a floating point number, i.e. it cannot be represented exactly in memory. A floating point number therefore has to be chosen to approximate the exact result. The deterministic rule that replaces the mathematical value by a floating point number is called the rounding mode.

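IEEE 754 binary floating point arithmetic includes four rounding modes: to nearest (the default with all compilers), toward zero, toward plus infinity and toward minus infinity. Unfortunately, the C, Fortran and Ada languages long offered no practical way of choosing or changing the rounding mode at run time. Later standards added such facilities (fesetround in the C99 header <fenv.h>, IEEE_SET_ROUNDING_MODE in the Fortran 2003 module IEEE_ARITHMETIC), but support varies between compilers.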

Four small subroutines -
rd_near (for rounding to the nearest),
rd_zero (for rounding to zero),
rd_minf (for rounding to minus infinity) and
rd_pinf (for rounding to plus infinity) -
have been written in assembler to switch the rounding mode at run time on some computers running some Unix systems. The following C code shows how to use these functions: the same computation is performed under each of the four rounding modes, and it should give four different results.

#include <stdio.h>

/* Rounding-mode routines provided by the assembler source file. */
void rd_near(void);
void rd_zero(void);
void rd_minf(void);
void rd_pinf(void);

int main(void)
{
    float x, y, z1, z2;

    x = 1.0f;
    y = 1.0e-20f;

    /* The same computation is repeated under each rounding mode. */
    rd_near();
    z1 = x - y; z2 = y - x; z1 = z1 - x; z2 = z2 + x;
    printf("near, z1 = %17.10e, z2 = %17.10e \n", z1, z2);
    rd_minf();
    z1 = x - y; z2 = y - x; z1 = z1 - x; z2 = z2 + x;
    printf("minf, z1 = %17.10e, z2 = %17.10e \n", z1, z2);
    rd_pinf();
    z1 = x - y; z2 = y - x; z1 = z1 - x; z2 = z2 + x;
    printf("pinf, z1 = %17.10e, z2 = %17.10e \n", z1, z2);
    rd_zero();
    z1 = x - y; z2 = y - x; z1 = z1 - x; z2 = z2 + x;
    printf("zero, z1 = %17.10e, z2 = %17.10e \n", z1, z2);
    return 0;
}

If the four functions work and the program is compiled without optimization, the output should be:

near, z1 = 0.0000000000e+00, z2 = 0.0000000000e+00
minf, z1 = -5.9604644775e-08, z2 = -0.0000000000e+00
pinf, z1 = 0.0000000000e+00, z2 = 5.9604644775e-08
zero, z1 = -5.9604644775e-08, z2 = 5.9604644775e-08
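The non-zero values in this output can be checked by hand. The short derivation below is an added illustration, assuming IEEE single precision (24-bit significand) with every operation rounded to single precision; fl denotes the rounded result of x - y or y - x.

```latex
% x = 1, y = 10^{-20}; the exact difference lies between two adjacent floats:
%   1 - 2^{-24} < 1 - 10^{-20} < 1
\begin{align*}
\text{near, pinf:} \quad & \mathrm{fl}(x - y) = 1
    && z_1 = 1 - 1 = 0 \\
\text{minf, zero:} \quad & \mathrm{fl}(x - y) = 1 - 2^{-24}
    && z_1 = (1 - 2^{-24}) - 1 = -2^{-24} \approx -5.9604644775 \times 10^{-8} \\
\text{near, minf:} \quad & \mathrm{fl}(y - x) = -1
    && z_2 = -1 + 1 = 0 \\
\text{pinf, zero:} \quad & \mathrm{fl}(y - x) = -(1 - 2^{-24})
    && z_2 = -(1 - 2^{-24}) + 1 = 2^{-24} \approx 5.9604644775 \times 10^{-8}
\end{align*}
```

The sign of the zero printed for z2 under minf follows from the IEEE rule that an exact zero sum of two operands with opposite signs is -0 when rounding toward minus infinity.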

These functions are provided in the assembler source files listed below, which can be assembled on any Unix system with, for instance:

as rounding_pc.s -o rounding_pc.o

If rounding_test.c contains the C source code above, the executable rounding_test is built with the following command:

gcc rounding_test.c rounding_pc.o -o rounding_test

THE FOLLOWING CODE IS PROVIDED FOR UNIX SYSTEMS ONLY AND WITHOUT ANY GUARANTEE.

In all the following assembler source files, the function labels have been written for the GCC compiler, which does not add any underscore to the names of C functions.

Of course, on Unix systems, these routines can be used with other compilers (Fortran or Ada). However, many compilers alter the original names of subroutines and functions by adding underscores before or after them. For instance, the NagWare F95 compiler adds one underscore at the end of each name. To keep the following assembler source files usable, just edit the source and add the corresponding underscore to each label before assembling it.

To choose the rounding mode on DEC Alpha computers.
To choose the rounding mode on HP computers.
To choose the rounding mode on IBM computers.
To choose the rounding mode on PC computers with Intel or compatible processors.
To choose the rounding mode on SGI computers.
To choose the rounding mode on SUN computers.
To get the C source code for testing the rounding functions.


- More information can be requested from the Cadna Team.