Gecko/20101207 Thunderbird/3.1.7   
   comp.lang.pascal.misc:164   
   From: Robert AH Prins    
      
   On 2011-01-16 00:37, Robert AH Prins wrote:   
   > On 2011-01-15 16:54, Wolfgang Ehrhardt wrote:   
   >> Virtual Pascal and Borland Pascal with Robert Prins' bpl70v20 have an   
   >> enhanced exp routine based on a code snippet from Norbert Juffa. This   
   >> is more accurate than Borland's standard Pascal/Delphi exp or older VP   
   >> routines.   
   >>   
   >> The key idea is to calculate exp(x) = 2^(log2(e)*x - 1) + 1 with extra   
   >> precision for frac(log2(e)*x) and then use the x87 instruction f2xm1   
   >> to compute 2^frac(log2(e)*x)-1. f2xm1 requires that the abs value of   
   >> its argument must be<= 1.   
   >>   
   >> Unfortunately the Juffa's code miserably fails for some x values if   
   >> the "round up toward +infinity" is used. Example:   
   >>   
   >> program t_expbug;   
   >> {$N+}   
   >> const   
   >> rmUp: word = $1B72;   
   >> var   
   >> x,y: extended;   
   >> begin   
   >> asm   
   >> fldcw [rmUp]   
   >> end;   
   >> x := 20.0 + 0.7944154167983592825;   
   >> y := exp(x);   
   >> writeln(x:18:14, y:27, 1073741824.0:27);   
   >> x := 2*x;   
   >> y := exp(x);   
   >> writeln(x:18:14, y:27, 1152921504606846976.0:27);   
   >> end.   
   >>   
   >> The output is for both Virtual Pascal 2.1.279 and BP7/bpl70v20:   
   >>   
   >> 20.79441541679836 -2.32830643653869629E-10 1.07374182400000000E+09   
   >> 41.58883083359672 -5.00000000000000000E-01 1.15292150460684698E+18   
   >>   
   >> A fix is to test frac() against 1 and correct. I guess this is   
   >> faster than temporarily change the rounding control to "round to   
   >> nearest".   
   >>   
   >> Here is a corrected version (suppressing most comments in order to   
   >> display properly in newsreaders)   
   >>   
   >> const   
   >> ln2_hi: array[0..7] of byte = ($00,$00,$E0,$FE,$42,$2E,$E6,$3F);   
   >> ln2_lo: array[0..7] of byte = ($76,$3C,$79,$35,$EF,$39,$EA,$3D);   
   >>   
   >> function exp(x: extended): extended;assembler;{&Frame-}{&Uses none}   
   >> {-Accurate and debugged exp}   
   >> asm   
   >> {Based on Norbert Juffa's exp}   
   >> fld [x]   
   >> fldl2e   
   >> fmul st,st(1)   
   >> frndint   
   >> fld qword ptr [ln2_hi]   
   >> fmul st,st(1)   
   >> fsubp st(2),st   
   >> fld qword ptr [ln2_lo]   
   >> fmul st, st(1)   
   >> fsubp st(2),st   
   >> fxch st(1)   
   >> fldl2e   
   >> fmulp st(1),st   
   >>   
   >> {It may happen (especially for rounding modes other than   
   >> "round to nearest") that |frac(z)|> 1. In this case the   
   >> result of f2xm1 is undefined. The next lines will test   
   >> frac(z) and truncate it to +1 or -1 if necessary.}   
   >>   
   >> fld st   
   >> fabs   
   >> fld1   
   >> fcompp   
   >> fstsw ax   
   >> sahf   
   >> jae @@1   
   >> fldz   
   >> fcompp   
   >> sahf   
   >> fld1   
   >> jae @@1   
   >> fchs   
   >> @@1:   
   >>   
   >> f2xm1   
   >> fld1   
   >> faddp st(1),st   
   >> fscale   
   >> fstp st(1)   
   >> fwait   
   >> end;   
   >>   
   >>   
   >> Any comments or improvements?   
   >   
   > I've forwarded your code to the grandmaster of everything FPU, let's   
   > wait and see.   
      
   His initial reply:   
      
   Robert:   
      
   Math library functions targetting x87 were typically designed with the    
   assumption that the dynamic rounding control of the x87 FPU at entry is   
   set to round-to-nearest-or-even. That is certainly the specification to    
   which I designed this code, so the results observed are a consequence of   
   operating my code outside its design specifications.   
      
   Does documentation for Borland Pascal (or Virtual Pascal) state anywhere   
   that common math library routines are guaranteed to work correctly if a   
   user changes the x87 rounding mode away from the default setting, prior    
   to invoking such a function? If so, what is the stated definition of the   
   correct behavior in such a case?   
   For example, the specification may state that the result returned must    
   be the same for all rounding modes, which may require each function to    
   use a wrapper that saves the current rounding mode, sets it to round to   
   nearest, then restores it at the end. Obviously this would be slow.   
      
   I will take a quick look to see if there is a small modification to the   
   code that gets results closer to what the users expects, but no promises.   
      
   I'll get back to you by Sunday night.   
      
   -- Norbert   
      
      
   --    
   Robert AH Prins   
   spamtrap(a)prino(d)org   
      
   --- Internet Rex 2.31   
    * Origin: The gateway at Omicron Theta (1:261/20.999)   
|