From: already5chosen@yahoo.com
On Fri, 28 Nov 2025 12:45:58 +0100
David Brown wrote:
>
> I can believe that. If you have to implement floating point routines
> in general integer hardware (and I expect that is the case for most
> of your implementation here) then I would think it is better to start
> and end with the data in GPR's. On some targets, moving data into
> and out of floating point or vector registers is efficient enough
> that those registers can effectively be used as caches, but it sounds
> like that is not the case here.
>
On Windows the only problem is the cost of moving data between the
different register classes.
On SysV things are worse: there are also no callee-saved FP/SIMD
registers, so every live value in those registers has to be spilled
around each call. In theory, the problem could have been solved by
defining a specialized ABI for the support routines (__addtf3,
__subtf3, __multf3, etc.), but that was not done either.
I think it all comes from the old mental model of soft floating-point
routines being very slow; so slow that ABI impedance mismatches are
lost in the noise. But in the specific case of binary128 on modern CPUs
that is simply not true: the arithmetic itself is quite fast, so the ABI
mismatches are significant.
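
To make the point concrete, here is a minimal sketch of the kind of code
I mean. It assumes GCC or Clang on x86-64, where __float128 is binary128
and the arithmetic is lowered to libgcc calls; the function name dot_q is
made up for illustration. Compile with -O2 -S and look at the traffic
around the calls.

```c
#include <assert.h>
#include <stddef.h>

/* Dot product accumulated in binary128 (hypothetical example function).
 * Assumption: x86-64 with GCC or Clang, so each iteration becomes calls to
 * __extenddftf2 (double -> binary128, twice), __multf3 and __addtf3.
 * The binary128 values are passed and returned in XMM registers, while the
 * routines themselves mostly work in integer registers. Under the SysV
 * x86-64 ABI every XMM register is call-clobbered, so acc and the
 * intermediate values cannot stay in XMM registers across the calls and
 * have to be kept in memory instead. */
__float128 dot_q(const double *a, const double *b, size_t n)
{
    assert(n == 0 || (a != NULL && b != NULL));

    __float128 acc = 0;
    for (size_t i = 0; i < n; i++) {
        __float128 x = a[i];   /* __extenddftf2 */
        __float128 y = b[i];   /* __extenddftf2 */
        acc += x * y;          /* __multf3, then __addtf3 */
    }
    return acc;
}
```

The arithmetic inside the support routines is short enough that the
spills and register-class moves around each call become a visible part
of the loop body, which is the mismatch described above.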