Saagar Jha (@saagar@saagarjha.com)

Iwasawa 🌟 (one hikari of too many)

07/26/2024

does anyone actually use sNaNs

Saagar Jha

07/26/2024 (replying to Iwasawa 🌟 (one hikari of too many))

@hikari We’ve used it at work to track down arithmetic errors

systemd-jaded.timer

07/26/2024 (replying to Iwasawa 🌟 (one hikari of too many))

@hikari they're used for nan boxing being safe

Iwasawa 🌟 (one hikari of too many)

07/26/2024 (replying to systemd-jaded.timer)

@leftpaddotpy oh is that why they only use qNaNs in JavaScript
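
For readers unfamiliar with NaN boxing, here is a minimal C sketch of the idea, assuming IEEE 754 binary64 doubles. A non-double value is stored in the mantissa of a quiet NaN, so ordinary doubles and boxed values can share one 64-bit slot. Because the quiet bit is set, passing the value around should not raise an invalid-operation exception. The tag layout and the names `box_int32`, `is_boxed_int32` and `unbox_int32` are hypothetical, chosen for illustration and not taken from any particular JavaScript engine.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Exponent all ones plus the quiet bit (bit 51): the usual quiet NaN pattern. */
#define QNAN_BITS  UINT64_C(0x7FF8000000000000)
/* Hypothetical type tag in bit 48; real engines choose their own layouts. */
#define TAG_INT32  UINT64_C(0x0001000000000000)
/* Sign, exponent, quiet bit and tag bits: the top 16 bits of the word. */
#define TAG_MASK   UINT64_C(0xFFFF000000000000)

/* Reinterpret a double as its raw 64-bit pattern (no numeric conversion). */
static uint64_t bits_of(double d) {
    uint64_t bits;
    assert(sizeof bits == sizeof d);
    memcpy(&bits, &d, sizeof bits);
    return bits;
}

/* Reinterpret a raw 64-bit pattern as a double. */
static double double_of(uint64_t bits) {
    double d;
    assert(sizeof bits == sizeof d);
    memcpy(&d, &bits, sizeof d);
    return d;
}

/* Store a 32-bit integer in the low mantissa bits of a tagged quiet NaN. */
double box_int32(int32_t v) {
    uint32_t low;
    memcpy(&low, &v, sizeof low);
    return double_of(QNAN_BITS | TAG_INT32 | (uint64_t)low);
}

/* True only for values produced by box_int32; plain NaNs carry no tag. */
int is_boxed_int32(double d) {
    return (bits_of(d) & TAG_MASK) == (QNAN_BITS | TAG_INT32);
}

/* Recover the integer from the low 32 bits of the payload. */
int32_t unbox_int32(double d) {
    uint32_t low = (uint32_t)bits_of(d);
    int32_t v;
    memcpy(&v, &low, sizeof v);
    return v;
}
```

A boxed value still compares unequal to itself, like any other NaN, so code that receives one has to check the tag before treating it as a number.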

mcc

07/26/2024 (replying to Iwasawa 🌟 (one hikari of too many))

@hikari i'm not even sure which, if any, of the systems i've ever used/developed for actually support the signaling features.

I Can't Believe It's Not Zero!

07/26/2024 (replying to mcc)

@mcc @hikari At the HW level, x86 and Apple's more recent ARM cores do. Actual language/SW/tooling support is spottier.

mcc

07/26/2024 (replying to I Can't Believe It's Not Zero!)

@steve @hikari Does the specific phrasing "Apple's more recent ARM cores" suggest that other ARM cores do not?

I Can't Believe It's Not Zero!

07/26/2024 (replying to mcc)

@mcc @hikari Support for trapping on sNaN is optional, and many implementations do not do it, yes. They will get the right result, but you can't stop the world on an invalid operation.

Tom Forsyth

07/26/2024 (replying to I Can't Believe It's Not Zero!)

@steve @mcc @hikari On x86, you can disable throwing actual exceptions on sNaNs. The MXCSR register has "sticky" bits that you can clear before a sequence, then check afterwards. If there were any sNaNs read or produced during the sequence, the "Invalid Operation" sticky bit will be set. I believe qNaNs do not set the sticky bits.

This is rather klunky to use, and relies on the language/library having support for reading/setting the MXCSR, but it can be done.
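
The clear-then-check pattern described above can be sketched in C. This is an illustration under assumptions, not anyone's production code: on x86-64 it uses the `_mm_getcsr` and `_mm_setcsr` intrinsics from `<xmmintrin.h>`, and elsewhere it falls back to the standard `<fenv.h>` calls (which may need the math library at link time). The helper names `clear_invalid_flag`, `invalid_flag_set`, `double_from_bits` and `add_sets_invalid` are made up for the example, and the `volatile` variables are only there to keep the compiler from folding the addition away.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#if defined(__x86_64__)
#include <xmmintrin.h>  /* _mm_getcsr, _mm_setcsr, _MM_EXCEPT_INVALID */

/* Clear the sticky Invalid Operation flag (bit 0 of MXCSR). */
static void clear_invalid_flag(void) {
    _mm_setcsr(_mm_getcsr() & ~(unsigned)_MM_EXCEPT_INVALID);
}

/* Report whether the sticky Invalid Operation flag is currently set. */
static int invalid_flag_set(void) {
    return (_mm_getcsr() & _MM_EXCEPT_INVALID) != 0;
}
#else
#include <fenv.h>

/* Portable fallback: the same two steps through the C99 interface. */
static void clear_invalid_flag(void) { feclearexcept(FE_INVALID); }
static int invalid_flag_set(void) { return fetestexcept(FE_INVALID) != 0; }
#endif

/* Build a double from a raw bit pattern, e.g. a signaling NaN. */
double double_from_bits(uint64_t bits) {
    double d;
    assert(sizeof bits == sizeof d);
    memcpy(&d, &bits, sizeof d);
    return d;
}

/* Clear the flag, do one addition, then report whether the flag got set. */
int add_sets_invalid(double operand) {
    volatile double in = operand;   /* force a real load at run time */
    clear_invalid_flag();
    volatile double out = in + 1.0; /* the operation under test */
    (void)out;
    return invalid_flag_set();
}
```

With the default masked exceptions, `add_sets_invalid(double_from_bits(UINT64_C(0x7FF0000000000001)))` passes a signaling NaN (quiet bit clear, nonzero payload) and is expected to report the flag, while the quiet NaN `0x7FF8000000000000` is not. The program keeps running in both cases.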

I Can't Believe It's Not Zero!

07/26/2024 (replying to Tom Forsyth)

@TomF @mcc @hikari That's the default behavior; you have to _enable_ trapping on sNaN.

qNaN does not set sticky bits except for operations that have no other means to preserve the NaN: comparisons and conversions to integer.

(MXCSR defaults to 0000_1F80h, i.e. all exceptions masked)
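
To make the default value concrete, here is a small C sketch that decodes an MXCSR value as a plain integer, with no hardware access. It assumes the documented x86 layout: sticky exception flags in bits 0-5 (Invalid Operation is bit 0) and the matching mask bits in bits 7-12 (the Invalid Operation mask is bit 7). The macro and function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define MXCSR_FLAG_BITS  0x003Fu  /* bits 0-5: sticky exception flags */
#define MXCSR_MASK_BITS  0x1F80u  /* bits 7-12: exception mask bits */
#define MXCSR_IE         0x0001u  /* Invalid Operation sticky flag */
#define MXCSR_IM         0x0080u  /* Invalid Operation mask bit */
#define MXCSR_DEFAULT    0x1F80u  /* power-on default quoted in the thread */

/* True when every exception is masked, i.e. nothing can trap. */
int mxcsr_all_masked(uint32_t mxcsr) {
    return (mxcsr & MXCSR_MASK_BITS) == MXCSR_MASK_BITS;
}

/* Extract the sticky flags that have been raised so far. */
uint32_t mxcsr_sticky_flags(uint32_t mxcsr) {
    return mxcsr & MXCSR_FLAG_BITS;
}

/* Compute the value that would enable trapping on invalid operations. */
uint32_t mxcsr_unmask_invalid(uint32_t mxcsr) {
    assert((mxcsr >> 16) == 0);  /* bits 16-31 are reserved and must be zero */
    return mxcsr & ~MXCSR_IM;
}
```

Here `0x1F80` has all six mask bits set and no flags raised, and clearing bit 7 gives `0x1F00`, the value with only the invalid-operation exception unmasked.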

Josh Simmons

07/26/2024 (replying to I Can't Believe It's Not Zero!)

@steve @mcc @hikari Is the question "does anyone use sNaNs directly", or does using the FP control registers to make all NaNs signal count? Because we do the latter (and I believe it's reasonably common). (Granted, we then combine this with fast-math, so it's all kinds of nightmare, but anyway.)

I Can't Believe It's Not Zero!

07/26/2024 (replying to Josh Simmons)

@dotstdy @mcc @hikari "We unmask invalid and run under fast-math" is some sort of shortest compiler-writer horror story entry.

I Can't Believe It's Not Zero!

07/26/2024 (replying to I Can't Believe It's Not Zero!)

@dotstdy @mcc @hikari Yes, I would like several hundred ud2 instructions sprinkled randomly throughout my program. I will definitely not regret this.

Josh Simmons

07/26/2024 (replying to I Can't Believe It's Not Zero!)

@steve @mcc @hikari my favorite part is when people work hard to write explicit nan checks and handling, which then immediately get thrown away by the compiler. (this is all only in internal builds, at least, but it's a bit funny)

I Can't Believe It's Not Zero!

07/26/2024 (replying to Josh Simmons)

@dotstdy @mcc @hikari we went to some length to avoid this in Apple's math.h, so you get a fast inline test when __FINITE_MATH_ONLY__ == 0 and a mandatory function call if you turn on fast-math. This still can't catch everything.
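
A rough C sketch of the idea of switching NaN tests on `__FINITE_MATH_ONLY__` follows. This is not Apple's math.h. It swaps in a different technique for the fast-math branch: an integer test on the bit pattern, which in practice is not subject to the compiler's "no NaNs" assumption the way `x != x` is. The names `isnan_by_bits`, `checked_isnan` and `double_from_bits` are hypothetical, and IEEE 754 binary64 doubles are assumed.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Build a double from a raw bit pattern (used to make NaNs and infinities). */
double double_from_bits(uint64_t bits) {
    double d;
    assert(sizeof bits == sizeof d);
    memcpy(&d, &bits, sizeof d);
    return d;
}

/* Integer-only test: exponent all ones and a nonzero mantissa means NaN. */
int isnan_by_bits(double x) {
    uint64_t bits;
    memcpy(&bits, &x, sizeof bits);
    /* Drop the sign bit, then compare against the bit pattern of +infinity. */
    return (bits & UINT64_C(0x7FFFFFFFFFFFFFFF)) > UINT64_C(0x7FF0000000000000);
}

/* Pick the test based on whether the compiler may assume finite math. */
int checked_isnan(double x) {
#if defined(__FINITE_MATH_ONLY__) && __FINITE_MATH_ONLY__
    /* Under fast-math, x != x may be folded to 0, so test the bits instead. */
    return isnan_by_bits(x);
#else
    /* Without fast-math, the self-comparison is a cheap inline test. */
    return x != x;
#endif
}
```

As the post says, a check like this cannot catch everything: under fast-math the compiler may already have transformed the arithmetic that produced `x`.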

Josh Simmons

07/26/2024 (replying to I Can't Believe It's Not Zero!)

@steve @mcc @hikari Another fun case is when you generate nans with fast math, but don't without. Feels very Congrats, You Played Yourself.

Daniel Gibson

07/26/2024 (replying to Josh Simmons)

@dotstdy @steve @mcc @hikari people seriously use fast math?!

Josh Simmons

07/26/2024 (replying to Daniel Gibson)

@Doomed_Daniel @steve @mcc @hikari well probably associative-math would get most of the buck without so much of the bang, but where's the fun in that.

I Can't Believe It's Not Zero!

07/26/2024 (replying to Josh Simmons)

@dotstdy @Doomed_Daniel @mcc @hikari associative-math and no-signed-zeros generally get you most of the benefits
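
For context on what those two flags relax, here is a small C sketch, assuming IEEE 754 doubles, round-to-nearest, and a build without fast-math. It shows that floating-point addition is not associative and that `x + 0.0` is not always `x`, which is why a compiler needs permission before reassociating sums or dropping additions of zero. The function names are hypothetical, and the `volatile` temporaries only keep the compiler from evaluating everything at compile time.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* (a + b) + c, with the intermediate sum rounded to double. */
double sum_left(double a, double b, double c) {
    volatile double t = a + b;
    return t + c;
}

/* a + (b + c), with the intermediate sum rounded to double. */
double sum_right(double a, double b, double c) {
    volatile double t = b + c;
    return a + t;
}

/* x + 0.0, which a compiler may rewrite to x only if signed zeros are ignored. */
double add_zero(double x) {
    volatile double zero = 0.0;
    return x + zero;
}

/* True when x is exactly negative zero (sign bit set, all other bits clear). */
int is_negative_zero(double x) {
    uint64_t bits;
    assert(sizeof bits == sizeof x);
    memcpy(&bits, &x, sizeof bits);
    return bits == UINT64_C(0x8000000000000000);
}
```

With `a = 1e20`, `b = -1e20` and `c = 1.0`, `sum_left` gives `1.0` while `sum_right` gives `0.0`, because `-1e20 + 1.0` rounds back to `-1e20`. Likewise `add_zero(-0.0)` returns positive zero, so folding `x + 0.0` to `x` would change the sign of the result.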

Paul Khuong

07/26/2024 (replying to I Can't Believe It's Not Zero!)

@steve @dotstdy @Doomed_Daniel @mcc @hikari What I really want is a diff of the source-to-source transforms with and without the flags :/