83
submitted 2 weeks ago* (last edited 2 weeks ago) by WhyEssEff@hexbear.net to c/chat@hexbear.net

puzzling out the proofs for concepts so utterly fundamental to math by myself that it’s like if Genesis 1:3 was And God said, 'Let there be integer,' and there was integer

you are viewing a single comment's thread
view the rest of the comments
[-] context@hexbear.net 4 points 2 weeks ago* (last edited 2 weeks ago)

or, to have the multiplication table as a persistent hash map in memory, pre-computing all possible products.

huh, yeah, i guess that's kinda how humans do long division?

See Low Level Learning’s 5 minute video

he's hopping right past the most interesting part, though. sure, doing math with byte logic is tricky, so you have to make approximations. so in order to divide by 9 the processor does some fixed point math and says to divide by 9 we're actually going to multiply by 2^33 / 9 which equals that long number. but how does the processor know what 2^33/9 equals? how's it doing that? sounds like it's very good at division, because it would take me a while to work out that value even if i started with trying to find 8589934592/9, y'know?

more to the point, how does it do 0b1000000000000000000000000000000000 / 0b1001 = 0b111000111000111000111000111001 so quickly?

[-] DaPorkchop_@lemmy.ml 3 points 2 weeks ago* (last edited 2 weeks ago)

You should understand that "computers very bad at division" is a relative term, on modern desktop x86 chips a single 64-bit division takes somewhere on the order of 20-40 clock cycles (which seems pretty fast when you think about how they're running at like 4 billion cycles per second, but these same chips can do integer multiplication in 2-3 cycles so division is painfully slow by comparison).

The reason multiplying by 2^33 / 9 is faster is because you don't have to actually compute what 2^33 is and then divide by 9 every single time - the compiler can compute that value the "slow" way a single time and bake that into the machine code, and then when the program is running it only has to deal with multiplication.

While we don't know exactly what techniques are being used in a modern Intel/AMD chip, as far as I know the current best approach is basically just long division.

EDIT: The reason long division is slow is that all the partial results depend on the results of previous steps. This means that your division circuit ends up being a really long line of comparator circuits chained end-to-end, and long circuits are bad because it takes a long time for the signal to actually reach the end. Multiplication is fast by comparison because you can compute all the partial products in parallel, and then add them together in a kind of tree shape. The end result is a circuit which is super wide, but the "propagation delay" (the time it takes until the last input signal reaches the end) is pretty low since there's no path which passes through more than a few adders.

this post was submitted on 13 Sep 2024
83 points (100.0% liked)

chat

8147 readers
341 users here now

Chat is a text only community for casual conversation, please keep shitposting to the absolute minimum. This is intended to be a separate space from c/chapotraphouse or the daily megathread. Chat does this by being a long-form community where topics will remain from day to day unlike the megathread, and it is distinct from c/chapotraphouse in that we ask you to engage in this community in a genuine way. Please keep shitposting, bits, and irony to a minimum.

As with all communities posts need to abide by the code of conduct, additionally moderators will remove any posts or comments deemed to be inappropriate.

Thank you and happy chatting!

founded 3 years ago
MODERATORS