When I read the LLVM assembly language definition a while back, I thought "cool, it's got support for
integer types with any specified number of bits." (Well, up to about 8 million bits but that's plenty.)
With the three-day weekend, I thought I would relax by getting more familiar with LLVM, and maybe build a toy language idea I had kicking around that would infer integer widths. (i32 * i32 != i32, dammit!)
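The width-inference complaint is easy to see with plain arithmetic: the product of two 32-bit values can need up to 64 bits, so truncating it back to i32 silently drops the high half. A quick sketch in Python:

```python
MASK32 = 0xFFFFFFFF

a = 0xFFFFFFFF                    # largest unsigned 32-bit value
true_product = a * a              # needs 64 bits: 0xFFFFFFFE00000001
wrapped = true_product & MASK32   # what an i32 * i32 -> i32 multiply keeps

print(hex(true_product))          # 0xfffffffe00000001
print(hex(wrapped))               # 0x1
```

Hence the desire for a type system where multiplying an n-bit value by an m-bit value yields an (n+m)-bit result.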
Unfortunately, while LLVM version 3.2 seems to generate valid code to add large integers, it does not seem to multiply them correctly. (And, for sufficiently large integers, the x86 assembler asserts.)
Here's a function that adds two 132-bit unsigned integers:
define i133 @add132( i132 %a, i132 %b ) nounwind uwtable {
  %wideA = zext i132 %a to i133
  %wideB = zext i132 %b to i133
  %result = add i133 %wideA, %wideB
  ret i133 %result
}
LLVM compiles this to code that masks the arguments down to 132 bits and then adds them correctly, carrying across the three 64-bit limbs:
add132:
        andq    $15, %rdx
        andq    $15, %r9
        addq    %rcx, %rdi
        adcq    %r8, %rsi
        adcq    %rdx, %r9
        andq    $31, %r9
        movq    %rdi, %rax
        movq    %rsi, %rdx
        movq    %r9, %rcx
        ret
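The semantics that assembly implements can be checked against Python's arbitrary-precision integers. A sketch of what @add132 is supposed to compute, with the widest possible operands:

```python
def add132(a, b):
    """Emulate @add132: mask the inputs to 132 bits (the effect of
    zext-ing a well-formed i132 argument), then add exactly."""
    MASK132 = (1 << 132) - 1
    result = (a & MASK132) + (b & MASK132)
    assert result < (1 << 133)   # a 133-bit sum of two 132-bit values never overflows
    return result

m = (1 << 132) - 1               # largest 132-bit value
print(add132(m, m) == (1 << 133) - 2)   # True: worst case still fits in i133
```

This is why zext-ing one bit wider before the add is enough: the sum of two 132-bit values is at most 2^133 - 2.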
However, multiplying two 35-bit numbers:
define i70 @mul35( i35 %a, i35 %b ) nounwind uwtable {
  %wideA = zext i35 %a to i70
  %wideB = zext i35 %b to i70
  %result = mul i70 %wideA, %wideB
  ret i70 %result
}
ends up using a bare mulq instruction, whose operands are only 64 bits:
mul35:
        movabsq $34359738367, %rax      # imm = 0x7FFFFFFFF
        andq    %rax, %rsi
        andq    %rax, %rdi
        movq    %rsi, %rax
        mulq    %rdi
        ret
But log2( (2^35-1)(2^35-1) ) is just a shade south of 70, so the full product needs all 70 bits and this code will result in wraparound.