Saagar Jha

I have noticed a worrying trend in computer science where topics that have surprising, non-intuitive properties are now taught as “oh, treat this as a magic black box, you will never understand it”. I hate this as people then “teach” by discouraging exploration in the space

it's Kanbaru again 🌟

(replying to Saagar Jha)

@saagar knew this would be about floats before i read the follow-up post and yeah, i find this intensely frustrating

it's Kanbaru again 🌟

(replying to it's Kanbaru again 🌟)

@saagar we put so much effort into providing predictable behaviour and then users throw up their hands and say there are no rules and things are useless


William D. Jones

(replying to Saagar Jha)

@saagar I've been playing with integer/fixed point FPGA multipliers and dividers lately.

I do have half a mind to start delving into floating point/reading Sterbenz' book and the IEEE 754 spec so I can make the world's slowest IEEE 754 FPU on FPGA :D.


🌲

(replying to Saagar Jha)
@saagar There's a CS prof at my uni who does this to such an absurd degree I'm thinking about trying to get him fired

Saagar Jha

(replying to 🌲)
@scathach Good luck I guess

AfeIrasY9gxkZA6dxA.jeff@mk.magicka.org

(replying to Saagar Jha)

@saagar@federated.saagarjha.com i treat that mentality as a challenge


Zhuowei Zhang

(replying to Saagar Jha)
@saagar > oh, treat this as a magic black box, you will never understand it
I see you're trying to add HEVC encoding to your Vision Pro remote desktop app again

Saagar Jha

(replying to Zhuowei Zhang)
@zhuowei Actually my weekend plans involve a bigger black box known as the form 1040

Zhuowei Zhang

(replying to Saagar Jha)

@saagar > non-intuitive properties
Well, now I see what you did there

Saagar Jha

(replying to Zhuowei Zhang)
@zhuowei Yes one of the non-intuitive properties is how Elon spending $44 billion in 2022 somehow means I’m the one spending a weekend filling out a return by hand in 2024

Félix

(replying to Saagar Jha)

@saagar @zhuowei the biggest, blackest box of all


Eragon

(replying to Saagar Jha)

@saagar@federated.saagarjha.com My problem is that I see that trend in every field of science that I had in school.

Saagar Jha

(replying to Eragon)
@eragon So for the sciences I think there’s a problem that for a lot of the subjects nobody actually knows how they work or there is an endless set of asterisks on everything

Rev. Bussy Body of The Church of America MLM and Shooting Club

(replying to Saagar Jha)
@saagar
🎵 trust and obey
for there's no other way 🎵
🎵 to be happy in CS
but to trust and obey~ 🎵

Wolf480pl

(replying to Saagar Jha)

@saagar hmm do you have an example of that? Cause the only one I can think of that I was told to treat as a black box is nvidia driver...

Saagar Jha

(replying to Wolf480pl)
@wolf480pl See the replies

9579cfc2-3e7d-35c7-af3b-cddda26f71bd

(replying to Saagar Jha)

@saagar
yeah abstractions should have a simpler reliable mental model than the thing they abstract over otherwise they're kinda worthless.
@root

Saagar Jha

(replying to Saagar Jha)
For example, it is now commonly explained to engineers that floating point math does not behave like real-valued arithmetic. This is surprising for most new programmers and this is a good topic to introduce. But it’s not done well! We say “here be dragons” but never explain *where*
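
As one hedged illustration of *where* the dragons are, here is a minimal Python sketch. It assumes CPython on a platform whose `float` is an IEEE 754 binary64 double; the particular values are chosen for illustration and are not from the thread.

```python
# Surprising results that are nonetheless fully determined by the rules.

# 0.1, 0.2 and 0.3 are not exactly representable in binary, so each literal
# is rounded once and the sum is rounded again. The outcome never varies.
a = 0.1 + 0.2
print(a == 0.3)  # False
print(a)         # 0.30000000000000004

# Addition is not associative, because every intermediate result is rounded.
left = (0.1 + 0.2) + 0.3
right = 0.1 + (0.2 + 0.3)
print(left == right)  # False

# Values that are exactly representable behave like real arithmetic.
print(0.5 + 0.25 == 0.75)  # True

# Every integer up to 2**53 is exact; past that, neighbouring doubles are
# more than 1 apart, so adding 1 can be rounded away.
big = 2.0 ** 53
print(big + 1.0 == big)  # True
```

Each surprise has a specific cause (representation, or rounding of an intermediate), which is the part that usually goes unexplained.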

Alex Rosenberg

(replying to Saagar Jha)

@saagar How hard is it to point folks at Goldberg’s paper and educate them?

Saagar Jha

(replying to Alex Rosenberg)
@alexr Which one is that

Rui Paulo

(replying to Saagar Jha)

@saagar this reminds me of my teacher in college who presented FP and said “never use float, always double” and ended the class there


catgirl-techsupport:verifiedtrans:

(replying to Saagar Jha)

@saagar@federated.saagarjha.com anecdotal evidence, one of my classmates at the second introduction to programming class was explaining signed overflows to the *TA.* This TA was supposedly at least a sophomore in computer science and had to have completed almost 4 semesters of classes to get there. Yet he didn't know why overflows happen the way they do.

Saagar Jha

(replying to catgirl-techsupport:verifiedtrans:)
@saragon I’ve had multiple (tenured!) professors who wouldn’t be able to answer this well

Saagar Jha

(replying to Saagar Jha)
The result is that I’m reading a Stack Overflow question right now (“why does sin produce different results on different computers”) and a lot of well-meaning people have responses that boil down to “you can’t really expect anything from floating point numbers”.

it's Kanbaru again 🌟

(replying to Saagar Jha)

@saagar i recall that someone made a math library that actually gives consistent results for transcendentals on different computers for at least fp32, and does so efficiently, but don't recall the name/author

it's Kanbaru again 🌟

(replying to it's Kanbaru again 🌟)

@saagar oh it's probably this thing, i should've checked the other replies https://discuss.systems/@steve/112056419368964724


Jeroen Postma 🇪🇺

(replying to Saagar Jha)

@saagar To be fair, people suggest to disable cert validation or chmod 777 stuff on that site as well.

I'm not sure if it is a trend, but I do understand your sentiment.

Saagar Jha

(replying to Saagar Jha)
The answer has nothing to do with “limits of double precision” or “rounding” or “you can’t actually expect these kinds of precise results from floating point”. Just because the rules are hard to understand doesn’t mean that nothing is specified or that there aren’t guarantees

Félix

(replying to Saagar Jha)

@saagar this is kind of an unfortunate example because IEEE-754 guarantees sin(±0), sin(±inf), sin(NaN) and pretty much nothing else


Saagar Jha

(replying to Félix)
@fay59 No see this exactly describes what I am talking about. The guarantees are lax here which is why the results are allowed to differ in a specified way. The answer is not because “oh floating point math is wishy-washy you can only really rely on the first 3-4 decimal places”

Saagar Jha

(replying to Saagar Jha)
@fay59 Actual quotes here

> If you tested maybe 4 or 5 digits of precision, ok. But all 15 / 17 digits? That is bound to fail, if not guaranteed to fail.

> IEEE-745 double precision binary floating point provides no more than 15 decimal significant digits of precision.
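
For contrast with those quotes, a small Python sketch showing that a double is one exact value rather than "15 digits and then noise". It assumes IEEE 754 binary64 floats, as in CPython on common platforms; the variable names are illustrative.

```python
from decimal import Decimal

x = 0.1

# The double nearest to 0.1 is an exact binary fraction. Decimal(float)
# converts it without rounding, so this prints every digit it really has.
exact = Decimal(x)
print(exact)  # 0.1000000000000000055511151231257827021181583404541015625

# 17 significant decimal digits are enough to round-trip any finite double.
text = format(x, ".17g")
print(text)              # 0.10000000000000001
print(float(text) == x)  # True

# float.hex shows the stored value directly, with no decimal conversion.
print(x.hex())           # 0x1.999999999999ap-4
```

So comparing all 17 digits is a well-defined test; whether two `sin` implementations are required to agree on them is a separate question.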

I Can't Believe It's Not Zero!

(replying to Félix)

@fay59 @saagar Right, but the actual answer here is "because different math libraries make different implementation choices," not "because floating-point is black fucking magic"

(Also IEEE 754 guarantees _nothing_ about sin, it only recommends)
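
To make that distinction concrete: IEEE 754 requires the basic operations to be correctly rounded, so they can be checked against exact arithmetic, while `sin` is only recommended to be. A hedged Python sketch, assuming binary64 floats; `rounded_once` is a hypothetical helper written for this example.

```python
from fractions import Fraction
import math

def rounded_once(exact: Fraction) -> float:
    """Round an exact rational to the nearest double in a single step."""
    return float(exact)

x, y = 0.1, 0.2

# Each basic operation must equal the exact real result, rounded once.
# Fraction(float) is exact, so the right-hand sides are the reference.
checks = {
    "add": x + y == rounded_once(Fraction(x) + Fraction(y)),
    "mul": x * y == rounded_once(Fraction(x) * Fraction(y)),
    "div": x / y == rounded_once(Fraction(x) / Fraction(y)),
}
print(checks)

# There is no such requirement for sin, so this value comes from whatever
# math library is in use and may differ in the last bit on another system.
print(math.sin(0.5).hex())
```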


Itai Ferber

(replying to I Can't Believe It's Not Zero!)

@steve @fay59 @saagar Yeah, for guarantees about sin you're gonna have to turn to religion

I Can't Believe It's Not Zero!

(replying to I Can't Believe It's Not Zero!)

@fay59 @saagar also, for anyone who does want to get the same answer everywhere: core-math.gitlabpages.inria.fr

Philip Trettner

(replying to I Can't Believe It's Not Zero!)

@steve @fay59 @saagar "these implementations were tested on x86_64-linux, with and without the use of fma (fused multiply add)." Ok I'm impressed.

I know this thread is about facing the unknown but I was bitten by fma one time too many. Last personal highlight was non-determinism because the same function was inlined in two different places leading to two different fma substitutions and thus different results.
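
The effect of an fma substitution can be sketched without hardware support by modelling it with exact rationals. This is a minimal Python illustration, assuming binary64 floats; `fused_multiply_add` is a stand-in written for this example (it rounds the exact result once), not a call into any real math library, and the inputs are picked to expose the difference.

```python
from fractions import Fraction

def fused_multiply_add(a: float, b: float, c: float) -> float:
    """Model of fma: compute a*b + c exactly, then round once."""
    return float(Fraction(a) * Fraction(b) + Fraction(c))

a = 1.0 + 2.0 ** -30     # exactly representable
c = -(1.0 + 2.0 ** -29)  # exactly representable

# Separate multiply and add: a*a is exactly 1 + 2**-29 + 2**-60, and the
# 2**-60 term is lost when the product is rounded, so the sum is zero.
unfused = a * a + c

# Fused: the product is never rounded, so the 2**-60 term survives.
fused = fused_multiply_add(a, a, c)

print(unfused)              # 0.0
print(fused == 2.0 ** -60)  # True
```

Both answers follow from the rules; which one you get depends on whether the compiler chose to fuse at that call site.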


Siguza

(replying to Saagar Jha)

@saagar ok but these guarantees are routinely broken by shitty implementations and then either the spec has to be retroactively loosened, or everyone just looks away and pretends those non-compliant implementations don't exist (but then they are things like Windows or glibc).


Saagar Jha

(replying to Siguza)
@siguza Yes but it is important that you know who to blame in those scenarios

Siguza

(replying to Siguza)

@saagar if the guarantees are too complicated for implementers to get right, then there truly are no guarantees.

Saagar Jha

(replying to Siguza)
@siguza Call them whatever you want there is still value to having these

Sven Slootweg (soft-deprecated)

(replying to Saagar Jha)

@saagar Seen a very similar thing with `this` in JS. It's not actually that hard to reason about (and takes like 5 lines to explain!) but because it is counterintuitive, it gets treated as some unpredictable magic and people will confidently pass around 'advice' telling people that it is unreliable and should not be used...

Saagar Jha

(replying to Saagar Jha)
Programmers learn about strings being character arrays and then they learn about Unicode and it traumatizes them so much that they just assume that like you can no longer do any useful operations on strings. And they will confidently go up to people doing this and say as such
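
Plenty of string operations stay well defined once you say which unit you mean. A minimal Python sketch using the standard `unicodedata` module; the sample strings are made up for illustration.

```python
import unicodedata

composed = "\u00e9"     # 'é' as a single code point
decomposed = "e\u0301"  # 'e' followed by a combining acute accent

# They render the same but are different code point sequences.
print(composed == decomposed)          # False
print(len(composed), len(decomposed))  # 1 2

# Normalizing to one form (NFC here) makes the comparison meaningful.
nfc = unicodedata.normalize("NFC", decomposed)
print(nfc == composed)                 # True

# Length depends on the unit being counted, and each count is well defined.
print(len(composed.encode("utf-8")))    # 2
print(len(decomposed.encode("utf-8")))  # 3
```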

Philip Trettner

(replying to Saagar Jha)

@saagar Are you telling me there is a way to understand time and timezones?

Saagar Jha

(replying to Philip Trettner)
@artificialmind I mean you don’t have to learn it if you don’t want to. Just don’t go around telling people it’s impossible to understand them

Saagar Jha

(replying to Saagar Jha)
In many cases people will use it to avoid diagnosing real problems! If your code works with dates that doesn’t mean you can point at that part and say “oh this is probably broken because date math is hard” when something behaves in a way you didn’t expect

Saagar Jha

(replying to Saagar Jha)
When working in security I cannot just point at some code and say “this has undefined behavior therefore it is exploitable”. Even though compiler authors will generally tell you otherwise (mostly out of frustration) you can build useful security properties on top of UB

Wolf480pl

(replying to Saagar Jha)

@saagar
> you can build useful security properties on top of UB

until a compiler update comes

Saagar Jha

(replying to Wolf480pl)
@wolf480pl Not necessarily, see the replies to https://twitter.com/_saagarjha/status/1765843545112649985 for elaboration

Wolf480pl

(replying to Saagar Jha)

@saagar I can't see those :( I think twitter only shows them to logged-in users

Saagar Jha

(replying to Wolf480pl)
@wolf480pl tl;dr think trapping on (signed) overflow

Wolf480pl

(replying to Saagar Jha)

@saagar are we talking about C? is there a compiler flag that makes that happen?

Saagar Jha

(replying to Wolf480pl)
@wolf480pl -ftrapv in your favorite compiler

Wolf480pl

(replying to Saagar Jha)

@saagar well then the compiler is defining a behaviour that was previously undefined.

You're not guaranteed by C standard that overflows will trap, but you're guaranteed by the compiler that overflows will trap.
It's not UB on that compiler (and hopefully all future versions of that compiler, and maybe other compilers that have the same nonstandard feature). It's just non-portable code.

1/
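
The two compiler-chosen behaviours for signed overflow can be modelled outside C. Below is a hedged Python sketch of 32-bit signed addition under two's-complement wrapping (what `-fwrapv` selects in GCC and Clang) and under trapping (what `-ftrapv` asks for). The function names are hypothetical, and this models the arithmetic only, not what a compiler emits.

```python
INT32_MIN = -2 ** 31
INT32_MAX = 2 ** 31 - 1

def add_wrapping(a: int, b: int) -> int:
    """32-bit signed add with two's-complement wraparound."""
    total = (a + b) & 0xFFFFFFFF  # keep only the low 32 bits
    return total - 2 ** 32 if total > INT32_MAX else total

def add_trapping(a: int, b: int) -> int:
    """32-bit signed add that raises instead of overflowing."""
    total = a + b
    if not INT32_MIN <= total <= INT32_MAX:
        raise OverflowError("signed 32-bit overflow")
    return total

print(add_wrapping(INT32_MAX, 1) == INT32_MIN)  # True
try:
    add_trapping(INT32_MAX, 1)
except OverflowError as exc:
    print("trapped:", exc)
```

Either choice is predictable once it is written down; the C standard itself simply does not pick one.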


Saagar Jha

(replying to Wolf480pl)
@wolf480pl I don’t really want to interrupt you but the replies to that tweet basically mirror this conversation

Wolf480pl

(replying to Saagar Jha)

@saagar I wish nitter was still alive

Saagar Jha

(replying to Wolf480pl)
@wolf480pl Well the other part is half the conversation is with a private account

Saagar Jha

(replying to Saagar Jha)
@wolf480pl I don’t fundamentally disagree, but my take there was “just because your compiler does something on UB does not make it any less UB” and “UB is precisely what gives the compiler leeway to trap”. But you are free to look at it however you like

Wolf480pl

(replying to Wolf480pl)

@saagar

That being said I agree that teaching "UB is black box" is wrong. It's best to show an example of how a seemingly innocent UB can cause an optimizing compiler to make assumptions and delete half of your code.
2/2

Saagar Jha

(replying to Wolf480pl)
@wolf480pl I bring this up mostly because people who have been trained that UB is only bad without truly understanding it make poor choices about what they think ought to be in the standard

Wolf480pl

(replying to Saagar Jha)

@saagar IIRC there's also Implementation-Defined Behaviour which is less evil than UB because it requires the compiler to pick a behaviour upfront and be consistent with it.

Saagar Jha

(replying to Wolf480pl)
@wolf480pl Yeah, from past discussions with people, I think the viewpoint I have that aligns with this one is that trapping should be a permitted implementation-defined behavior, and for clearly broken constructs the only permissible behavior

Wolf480pl

(replying to Saagar Jha)

@saagar I don't know enough / haven't thought about it enough to have a definitive opinion on what should be done with UB.

But I kinda like zig's Debug / ReleaseSafe / ReleaseFast build modes.
Debug and ReleaseSafe trap on UB, ReleaseFast assumes no UB and optimizes as much as it can.

Saagar Jha

(replying to Saagar Jha)
@wolf480pl For what it’s worth I am in favor of abolishing undefined behavior by default but surprisingly few people agree with me on how to do that in a security-focused way

Wolf480pl

(replying to Saagar Jha)

@saagar well -O3 isn't a default :P

Saagar Jha

(replying to Wolf480pl)
@wolf480pl Despite my insistence that people who want portable assembler should use -O0, compilers can and do generate code assuming a lack of UB even at that optimization level

Wolf480pl

(replying to Saagar Jha)

@saagar oof


lonjil

(replying to Saagar Jha)

@saagar @wolf480pl it's sort of impossible not to. Any transformation of code, no matter how simple, has to rely on the idea that the behavior of the code is the same before and after the transformation. And if there is no known behavior (because UB) you're outta luck.

Wolf480pl

(replying to lonjil)

@lonjil @saagar you (a compiler author) can still define the behaviour yourself, document it, then stick to it

lonjil

(replying to Wolf480pl)

@wolf480pl @saagar there's some low hanging fruit in C that's easy enough to define, but a lot of UB in C is very hard to define, at least on normal hardware.

(and you're probably not defining things differently at O0 and O3, I think?)

Wolf480pl

(replying to lonjil)

@lonjil @saagar no I actually mean define them only at O0 :P
Define them to do something dumb and potentially dangerous but predictable.
Bonus points if it matches most people's idea of a "a high-level assembly"

Wolf480pl

(replying to Wolf480pl)

@lonjil @saagar for example, have -O0 imply -fwrapv

Saagar Jha

(replying to Wolf480pl)
@wolf480pl @lonjil This is awful

Wolf480pl

(replying to Saagar Jha)

@saagar @lonjil but that's what you'd expect from a "high level assembly", right?

I'm not saying anyone should do that, just that there is a way to make O0 behave more like assembly

Saagar Jha

(replying to Wolf480pl)
@wolf480pl @lonjil I would generally want to divorce “do what I mean” from default debug builds tbh

Wyatt (🏳️‍⚧️♀?)

(replying to Saagar Jha)

@saagar OpenSSL did this iirc and debian broke it by trying to fix it. Maybe that was just uninitialized memory but I think that's a kind of undefined behavior.

Saagar Jha

(replying to Wyatt (🏳️‍⚧️♀?))
@wyatt8740 Trying to use undefined behavior in your code for intended functionality is generally a bad idea

Wyatt (🏳️‍⚧️♀?)

(replying to Saagar Jha)

@saagar It was essential to the design of OpenSSL because it increases the entropy of the seeds it generates

Saagar Jha

(replying to Wyatt (🏳️‍⚧️♀?))
@wyatt8740 Yes and this happened to be a bad idea :P

Wyatt (🏳️‍⚧️♀?)

(replying to Saagar Jha)

@saagar How was it a bad idea? It was a good idea; downstream idiots who didn't understand it purposefully broke the code thinking they were "fixing" it because they didn't understand the code and saw valgrind warn about it. That code is still in openssl.

Saagar Jha

(replying to Wyatt (🏳️‍⚧️♀?))
@wyatt8740 Using uninitialized memory as entropy is generally a bad idea. It doesn’t really help and it breaks tools that you would generally want to use to find real bugs. The “fix” was incorrect, of course, but not idiotic. FWIW, the code is gone now: https://github.com/openssl/openssl/commit/75e2c877650444fb829547bdb58d46eb1297bc1a#diff-7540ce8fd73afa23b44db37b090c9aa47f5c361f8f2bb5508be45555e9a1f6bbL191

Wyatt (🏳️‍⚧️♀?)

(replying to Saagar Jha)

@saagar how does drbg do it? Also if you lack the ability to think critically about the warnings a program is giving you when programming in general I think you're sort of cruising for a bruising. Deep understanding is always better than "waving a chicken at it until the warnings go away," or trusting software to be infallible.

Saagar Jha

(replying to Wyatt (🏳️‍⚧️♀?))
@wyatt8740 Not an expert but I think it just uses the ones you’d expect like system entropy and jitter

Saagar Jha

(replying to Saagar Jha)
Anyways the point I am making is that if “I don’t understand this” is in any part of your reasoning chain the final result needs to also be “I don’t understand this” not “things don’t work this way”

Wowfunhappy

(replying to Saagar Jha)

@saagar Playing devil's advocate a bit: it may be the case that you can increment a variable by using butterfly wings to deflect incoming cosmic rays, but I don't understand cosmic rays, and it's easier to write `i++`. Learning something you don't understand is often harder than just implementing something you do, so if you're trying to just get stuff done, saying "please use this solution which I know will work" isn't necessarily bad advice.

Wowfunhappy

(replying to Wowfunhappy)

@saagar Mind, I'm perfectly aware of all the ways this leads to madness. "Just use this library" leading to endless layers of abstraction, for example.

Saagar Jha

(replying to Wowfunhappy)
@Wowfunhappy To be clear I am not against abstraction or selecting what you want to learn here. I just caution against going “oh this thing is hard to understand therefore I will tell other people that they cannot work with it and must fear it”