Science

Jim__

(14,845 posts)

5. It sounds like they may be obverse problems.

Mon Oct 14, 2024, 07:59 AM


This post is based on quick reads of the AI paper and the papers you referenced. So, it's sort of food for thought.

In the AI problem, the number of multiplications is O(n²), where n is roughly the number of elements in a tensor. They are working with neural networks here, so my current understanding (I've read through the paper, but I need to read it more thoroughly) is that this is based on the number of nodes (synapses) in the neural net. From the paper (my best attempt at copying this):

Multiplication operations are generally more complicated than additions, and FP operations are more costly than integers (Horowitz, 2014). Table 1 shows that multiplying two fp32 numbers consumes 37 times higher energy than adding two 32-bit integers. While the complexity of integer addition is O(n) where n is the number of bits used for representing the number, FP multiplication requires O(e) exponent addition, O(m²) mantissa multiplication, and rounding. Here e and m stand for the number of bits used for exponent and mantissa parts of the FP numbers.

Modern LLM training and inference involves a large number of FP calculations in tensor computation. Consider calculating the element-wise and dot products of two 2-D tensors:

Y₁ = A ◦ X, Y₂ = A · X^T; A, X ∈ R^(N,k)

Calculating Y₁ involves N² FP multiplications (Mul). If A and X are both fp32 tensors, A ◦ X consumes 37 times higher energy than adding two int32 matrices of the save [sic; should be "same"?] size. Similarly, calculating Y₂ involves (m × n × k) FP Mul and the same number of FP additions (Add). When A and X are fp32 tensors, each Mul-Add operation for two numbers consumes 0.9 + 3.7 = 4.6 (pJ) energy. If we replace the fp32 Mul with int32 Add, the energy cost becomes 0.1 + 0.9 = 1.0 (pJ), only 21.7% of the original cost. Similarly, if the inference is conducted in fp16, replacing fp16 Mul with int16 Add results in a 1 - (0.05 + 0.4) / (1.1 + 0.4) = 70% energy saving.
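
As a quick check on the arithmetic in that passage, here is a small Python sketch. The per-operation energy figures are the ones that appear in the quote; the dictionary and function names are my own, made up purely for illustration.

```python
# Per-operation energy in picojoules, taken from the figures in the quoted
# passage (which attributes them to Horowitz, 2014).
ENERGY_PJ = {
    "fp32_mul": 3.7, "fp32_add": 0.9, "int32_add": 0.1,
    "fp16_mul": 1.1, "fp16_add": 0.4, "int16_add": 0.05,
}


def mul_add_cost(mul_op, add_op):
    """Energy of one multiply-accumulate step: one 'mul' plus one add."""
    return ENERGY_PJ[mul_op] + ENERGY_PJ[add_op]


# fp32: ordinary Mul-Add versus Mul replaced by an int32 Add.
fp32_baseline = mul_add_cost("fp32_mul", "fp32_add")
fp32_replaced = mul_add_cost("int32_add", "fp32_add")
fp32_ratio = fp32_replaced / fp32_baseline

# fp16: same comparison, expressed as a saving.
fp16_baseline = mul_add_cost("fp16_mul", "fp16_add")
fp16_replaced = mul_add_cost("int16_add", "fp16_add")
fp16_saving = 1 - fp16_replaced / fp16_baseline

print(f"fp32 cost after replacement: {fp32_ratio:.1%} of original")
print(f"fp16 energy saving: {fp16_saving:.0%}")
```

This only reproduces the per-operation ratios in the quote. It says nothing about the total energy of a real model, which also depends on memory traffic and everything else the hardware does.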

...

We propose L-Mul, a FP multiplication algorithm with O(n) complexity, where n is the bit size of its FP operands. Consider two FP numbers x, y, whose exponents and fractions are x_e, y_e and x_m, y_m respectively, the vanilla FP Mul result is

Mul(x, y) = (1 + x_m) · 2^(x_e) · (1 + y_m) · 2^(y_e)
          = (1 + x_m + y_m + x_m · y_m) · 2^(x_e + y_e)

plus an xor operation ( ⊕ ) to decide the sign of the result. Assume x_m and y_m are mantissas of m bits. The O(m²) mantissa multiplication operation is the complexity bottleneck of this calculation. We remove this operation and introduce a new multiplication algorithm that processes mantissas with a computational complexity of O(m):

L-Mul(x, y) = (1 + x_m + y_m + 2^(-l(m))) · 2^(x_e + y_e),

l(m) = m if m ≤ 3,
       3 if m = 4,
       4 if m > 4.
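
To make the formula a bit more concrete, here is a rough Python sketch of it. This is my own illustration, not the paper's implementation: it evaluates the L-Mul expression on ordinary Python floats using `math.frexp` and `math.ldexp`, it does not truncate the fractions to m bits, and it does not do the integer-adder bit manipulation that the energy argument relies on. The function names (`l_of_m`, `decompose`, `l_mul`) are hypothetical.

```python
import math


def l_of_m(m_bits):
    """Offset exponent l(m) from the quoted definition."""
    if m_bits <= 3:
        return m_bits
    if m_bits == 4:
        return 3
    return 4


def decompose(x):
    """Return (fraction, exponent) with |x| = (1 + fraction) * 2**exponent."""
    mant, exp = math.frexp(abs(x))  # |x| = mant * 2**exp, with 0.5 <= mant < 1
    return 2.0 * mant - 1.0, exp - 1


def l_mul(x, y, m_bits=23):
    """Approximate x * y by replacing x_m * y_m with the constant 2**-l(m)."""
    if x == 0.0 or y == 0.0:
        return 0.0
    x_m, x_e = decompose(x)
    y_m, y_e = decompose(y)
    # The sign is decided separately (the xor step in the quoted text).
    sign = -1.0 if (x < 0) != (y < 0) else 1.0
    mantissa = 1.0 + x_m + y_m + 2.0 ** -l_of_m(m_bits)
    return sign * math.ldexp(mantissa, x_e + y_e)


print(l_mul(1.5, 1.5), "vs exact", 1.5 * 1.5)
print(l_mul(2.0, 4.0), "vs exact", 2.0 * 4.0)
```

The point is that the result is an approximation: the x_m · y_m term is swapped for a fixed constant, so only additions are left in the mantissa computation.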

In encryption and large-prime searches, on the other hand, the O(n²) comes from the number of digits in the very large numbers being multiplied. From the article in The Conversation that you referenced (https://theconversation.com/weve-found-a-quicker-way-to-multiply-really-big-numbers-114923):

We’ve found a quicker way to multiply really big numbers

In 1960, Anatoly Karatsuba, a 23-year-old mathematics student in Russia, discovered a sneaky algebraic trick that reduces the number of multiplications needed.

For example, to multiply four-digit numbers, instead of needing 4² = 16 multiplications, Karatsuba’s method gets away with only nine. When using his method, twice as many digits means only three times as much work.

This stacks up to an impressive advantage as the numbers get bigger. For numbers with a thousand digits, Karatsuba’s method needs about 17 times fewer multiplications than long multiplication.
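
For anyone who wants to see the trick itself, here is a minimal Python sketch of Karatsuba's method, plus an idealized count of single-digit multiplications. The counting assumes the digit count is simply halved (rounded up) at each level and ignores carries in the intermediate sums, so it is a rough model rather than an exact operation count. The function names are my own.

```python
def karatsuba(x, y):
    """Multiply two non-negative integers using three recursive products."""
    if x < 10 or y < 10:
        return x * y
    half = max(len(str(x)), len(str(y))) // 2
    base = 10 ** half
    x_hi, x_lo = divmod(x, base)
    y_hi, y_lo = divmod(y, base)
    hi = karatsuba(x_hi, y_hi)
    lo = karatsuba(x_lo, y_lo)
    # One extra product recovers both cross terms: x_hi*y_lo + x_lo*y_hi.
    cross = karatsuba(x_hi + x_lo, y_hi + y_lo) - hi - lo
    return hi * base * base + cross * base + lo


def schoolbook_mults(digits):
    """Single-digit multiplications used by long multiplication."""
    return digits * digits


def karatsuba_mults(digits):
    """Idealized count: each level makes 3 products of half-size numbers."""
    if digits <= 1:
        return 1
    return 3 * karatsuba_mults((digits + 1) // 2)


print(karatsuba(1234, 5678) == 1234 * 5678)
print(schoolbook_mults(4), karatsuba_mults(4))
print(schoolbook_mults(1000) / karatsuba_mults(1000))
```

Under those assumptions the counts line up with the article: 16 versus 9 for four digits, and a ratio of roughly 17 for a thousand digits.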

Can the solution to the AI problem have some impact on the problem of multiplying large numbers? In the long term there may be some connection between the two solutions, but I don't think there is a currently recognized connection between them.

7 replies

Integer addition algorithm could reduce energy needs of AI by 95% [View all] Jim__ Oct 2024 OP

Thanks. littlemissmartypants Oct 2024 #1

Screw AI -- how does this affect computational number theory and encryption ? eppur_se_muova Oct 2024 #2

I'm not sure why it would affect computational number theory and encryption. Jim__ Oct 2024 #3

Computational number theory does a great many giant-integer multiplies. The fastest way to do these ... eppur_se_muova Oct 2024 #4

It sounds like they may be obverse problems. Jim__ Oct 2024 #5

Encryption surely requires exact integer arithmetic, rather than approximate floating-point arithmetic muriel_volestrangler Oct 2024 #6

It reminds me of tricks that were used to get 3D gaming effects on slow 8 and 16 bit computers. hunter Oct 2024 #7