Welcome to DU! The truly grassroots left-of-center political community where regular people, not algorithms, drive the discussions and set the standards. Join the community: Create a free account Support DU (and get rid of ads!): Become a Star Member Latest Breaking News Editorials & Other Articles General Discussion The DU Lounge All Forums Issue Forums Culture Forums Alliance Forums Region Forums Support Forums Help & Search

Jim__

(14,464 posts)
Sun Oct 13, 2024, 03:55 PM Oct 13

Integer addition algorithm could reduce energy needs of AI by 95%

From TechXplore


16-bit, 8-bit floating point numbers defined in IEEE 754 and on various hardware for tensor computations, and the 16-bit integer. MSB stands for most significant bit and LSB stands for least significant bit. Credit: arXiv (2024). DOI: 10.48550/arxiv.2410.00907
___________________________________________________________________________

A team of engineers at AI inference technology company BitEnergy AI reports a method to reduce the energy needs of AI applications by 95%. The group has published a paper describing their new technique on the arXiv preprint server.

...

The new technique is basic—instead of using complex floating-point multiplication (FPM), the method uses integer addition. Apps use FPM to handle extremely large or small numbers, allowing applications to carry out calculations using them with extreme precision. It is also the most energy-intensive part of AI number crunching.

The researchers call their new method Linear-Complexity Multiplication—it works by approximating FPMs using integer addition. They claim that testing, thus far, has shown that the new approach reduces electricity demand by 95%.

The one drawback it has is that it requires different hardware than that currently in use. But the research team also notes that the new type of hardware has already been designed, built and tested.

a little bit more ...


7 replies = new reply since forum marked as read
Highlight: NoneDon't highlight anything 5 newestHighlight 5 most recent replies

eppur_se_muova

(37,500 posts)
2. Screw AI -- how does this affect computational number theory and encryption ?
Sun Oct 13, 2024, 04:47 PM
Oct 13

Suppose large factorizations are suddenly 20x easier -- how does that affect Web security ?

Jim__

(14,464 posts)
3. I'm not sure why it would affect computational number theory and encryption.
Sun Oct 13, 2024, 05:44 PM
Oct 13

My reading is not that they are changing any underlying algorithms, they are changing the implementation of the algorithms. Apparently AI algorithms perform an extremely high number of floating point multiplications. They are converting those floating point multiplications to integer addition, and that uses significantly less electricity.

The computations are less energy intensive. I don't believe they are any faster or any more powerful than the computations they are replacing.

Again that's based on my reading, you may be seeing something different.

eppur_se_muova

(37,500 posts)
4. Computational number theory does a great many giant-integer multiplies. The fastest way to do these ...
Sun Oct 13, 2024, 08:48 PM
Oct 13

... (AFAIK) is something called the irrational base discrete weighted {fast Fourier} transform, which replaces the integer multiply with multiple floating-point calculations. Weirdly, this sounds like the reverse of the technique in the OP. Either one of these represents a really fundamental change in the algorithm -- it's not anything like fine-tuning. This link gives a nice short intro to what's involved, as well as a further development which does not seem to be practical just yet: https://theconversation.com/weve-found-a-quicker-way-to-multiply-really-big-numbers-114923

Detailed technical discussion is found in the Wikipedia articles https://en.wikipedia.org/wiki/Fast_Fourier_transform and https://en.wikipedia.org/wiki/Multiplication_algorithm. Also briefly discussed in Knuth (vol. 2) and in more detail in Crandall and Pomerantz .

Public-key cryptographic security is often based on the assumption that factoring very large integers is so resource-intensive that it will almost never be worth the effort involved to crack a protected file. As the algorithms to find factors of large integers become faster and more powerful, larger and larger keys are required.


Many cryptographic protocols are based on the difficulty of factoring large composite integers or a related problem—for example, the RSA problem. An algorithm that efficiently factors an arbitrary integer would render RSA-based public-key cryptography insecure.

https://en.wikipedia.org/wiki/Integer_factorization

Jim__

(14,464 posts)
5. It sounds like they may be obverse problems.
Mon Oct 14, 2024, 06:59 AM
Oct 14

This post is based on quick reads of the AI paper and the papers you referenced. So, it's sort of food for thought.

In the AI problem, the number of multiplications is O(n2) where n is roughly based on the number of elements in a tensor. They are working with neural networks here, so my current understanding (I've read through the paper, but I need to read it more thoroughly to get a better understanding) is that this is based on the number of nodes (synapses) in the neural net. From the
paper - my best attempt at copying this:

Multiplication operations are generally more complicated than additions, and FP operation are more costly than integers (Horowitz, 2014). Table 1 shows that multiplying two fp32 numbers consumes 37 times higher energy than adding two 32-bit integers. While the complexity of integer addition is O(n) where n is the number of bits used for representing the number, FP multiplication requires O(e) exponent addition, O(m2) mantissa multiplication, and rounding. Here e and m stand for the number of bits used for exponent and mantissa parts of the FP numbers.

Modern LLM training and inference involves a large number of FP calculations in tensor computa-
tion. Consider calculating the element-size and dot products of two 2-D tensors:

Y1 = A ◦ X, Y2 = A · XT ; A, X (elements of) R(N,k)


Calculating Y1 involves N2 FP multiplications (Mul). If A and X are both fp32 tensors, A ◦ X consumes 37 times higher energy than adding two int32 matrices of the save (SIC - s/b same?) size. Similarly, Calculating Y2 involves (m × n × k) FP Mul and the same number of FP additions (Add). When A and X are fp32 tensors, each Mul-Add operation for two numbers consumes 0.9 + 3.7 = 4.6 (pJ)energy. If we replace the fp32 Mul with int32 Add, the energy cost becomes
0.1 + 0.9 = 1.0 (pJ),only 21.7% of the original cost. Similarly, if the inference is conducted in fp16, replacing fp16 Mul with int16 Add result in a 1 - (0.05 + 0.4) / (1.1 + 0.4) = 70% energy saving.

...


We propose L-Mul, a FP multiplication algorithm with O(n) complexity, where n is the bit size
of its FP operands. Consider two FP numbers x, y, whose exponents and fractions are xe, ye and xm, ym respectively, the vanilla FP Mul result is

Mul(x, y) = (1 + xm) · 2xe · (1 + ym) · 2ye
= (1 + xm + ym + xm · ym) · 2xe+ye

plus an xor operation ( ⊕ ) to decide the sign of the result. Assume xm and ym are mantissas of m bits. The O(m2) mantissa multiplication operation is the complexity bottleneck of this calculation. We remove this operation and introduce a new multiplication algorithm that processes mantissas with a computational complexity of O(m):


L-Mul(x, y) = (1 + xm + ym + 2-l(m)) · 2xee+ym,
l(m) = | m if m =< 3,
.........| 3 if m = 4,
.........| 4 if m > 4.



And in the encryption and large prime searches, the O(n2) is based on the large number of digits in the very large numbers involved. From the link: https://theconversation.com/weve-found-a-quicker-way-to-multiply-really-big-numbers-114923 you referenced from The Conversation:

We’ve found a quicker way to multiply really big numbers


In 1960, Anatoly Karatsuba, a 23-year-old mathematics student in Russia, discovered a sneaky algebraic trick that reduces the number of multiplications needed.

For example, to multiply four-digit numbers, instead of needing 42 = 16 multiplications, Karatsuba’s method gets away with only nine. When using his method, twice as many digits means only three times as much work.

This stacks up to an impressive advantage as the numbers get bigger. For numbers with a thousand digits, Karatsuba’s method needs about 17 times fewer multiplications than long multiplication.


Can the solution to the AI problem have some impact on the solution to the problems of multiplying large numbers? In the long-term there may be some connections between the solutions. But, I don't think there is a currently recognized connection between them.

muriel_volestrangler

(102,558 posts)
6. Encryption surely requires exact integer arithmetic, rather than approximate floating-point arithmetic
Mon Oct 14, 2024, 04:18 PM
Oct 14

The paper says their "L-Mul algorithm" achieves higher precision than 8 bit FP multiplication (though 8 bit is not very precise). But encryption is for integers, and I wouldn't think factorization can be "close enough".

hunter

(38,974 posts)
7. It reminds me of tricks that were used to get 3D gaming effects on slow 8 and 16 bit computers.
Sat Oct 19, 2024, 04:33 PM
Oct 19

As with a lot of AI, these games didn't have to precisely reflect any physical model of reality, they just had to "look" right.

Of course "Natural Intelligence" may take such shortcuts too, which might partly explain why even highly intelligent people sometimes do stupid things.

It would be interesting to compare this energy efficient AI to an otherwise identical floating point or large integer AI.

Latest Discussions»Culture Forums»Science»Integer addition algorith...