A 64-bit number does not fit in a register int in x86_64 mode

I have an Intel processor in my PC here, running in 64-bit mode (x86_64), where the registers are 64 bits wide. If I use the register keyword, or enable some optimization flag, the variable tends to be placed in a register. But if I assign a value above 32 bits, the compiler complains, i.e. my int is not 64 bits. Why does this happen? Shouldn't it be 64 bits, since the variable is placed in a register and I never even take its memory address? That is, it isn't even placed on the stack.

#include <stdio.h>

int main(void) {

    register int x = 4294967296;

    return 0;
}

Compile:

gcc example.c -o example -Wall -Wextra

Output:

warning: overflow in implicit constant conversion [-Woverflow]
    register int x = 4294967296;

Answer

The C register keyword does nothing to override the width of the C type you chose. int is a 32-bit type in all 32- and 64-bit x86 ABIs, including x86-64 System V and Windows x64. (long is also 32-bit on Windows x64, but 64-bit on Linux / Mac / everything else x86-64; see footnote 1.)

A register int is still an int, subject to all the limits of INT_MAX, INT_MIN, and signed overflow being undefined behaviour. It doesn’t turn your C source into a portable assembly language.
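
If you actually need a 64-bit value, pick a 64-bit type explicitly. A minimal sketch of the fix, assuming you just want the constant to fit (long long is guaranteed to be at least 64 bits; int64_t is the optional exact-width spelling):

#include <stdio.h>
#include <stdint.h>
#include <inttypes.h>

int main(void) {
    /* long long is at least 64 bits in ISO C, so 4294967296 (2^32) fits. */
    register long long x = 4294967296LL;

    /* int64_t is the exact-width alternative (optional in ISO C, but
       present on every mainstream x86-64 implementation). */
    int64_t y = INT64_C(4294967296);

    printf("%lld %" PRId64 "\n", x, y);
    return 0;
}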

Using register mainly just forbids you from taking the variable's address, so even in debug mode (with minimal optimization) a very naive compiler can keep the variable in (the low half of) a register without running into any surprises later in the function. (Of course modern compilers don't normally need this help, but for some compilers register actually does have an effect in debug mode.)
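
As a small illustration of that rule (the exact diagnostic wording varies by compiler), taking the address of a register variable is a constraint violation, which is what lets a naive compiler keep it out of memory:

#include <stdio.h>

int main(void) {
    register int x = 42;

    /* int *p = &x;   <-- error: address of register variable 'x' requested.
       Because &x is forbidden, the compiler never has to spill x to the
       stack just so its address can be handed out. */

    printf("%d\n", x);
    return 0;
}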


If a register is 64 bits wide, why is the convention for int to be 32 bits?

int is 32-bit because nobody wants an array of int to become 64-bit on a 64-bit machine, doubling its cache footprint.
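
A quick way to see that footprint difference (the sizes below assume an LP64 ABI such as x86-64 System V; the array length is just an illustrative choice):

#include <stdio.h>

int main(void) {
    enum { N = 1000000 };   /* one million elements */
    printf("int  array: %zu bytes\n", N * sizeof(int));   /* 4000000 on x86-64 */
    printf("long array: %zu bytes\n", N * sizeof(long));  /* 8000000 on LP64 */
    return 0;
}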

A very few C implementations have int = int64_t (for example on some Cray machines, I think I’ve read), but it’s extremely rare even outside of x86-64 where 32-bit is the “natural” and most efficient operand-size for machine code. Even DEC Alpha (which was aggressively 64-bit and designed from scratch for 64-bit) I think still used 32-bit int.

Making int 32-bit when growing from 16 to 32-bit machines made sense, because 16-bit is “too small” sometimes. (But remember that ISO C only guarantees that int is at least 16 bits. If you need more than that, in a truly portable program you’d better use long or int_least32_t.)
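
For example, a portable counter that needs more than 16 bits can be spelled with int_least32_t, which C99 requires even where the exact-width int32_t is optional (just a sketch of the idea):

#include <stdint.h>
#include <inttypes.h>
#include <stdio.h>

int main(void) {
    /* Guaranteed to hold at least 32 bits on every C99 implementation,
       even one whose plain int is only 16 bits. */
    int_least32_t total = 100000;
    printf("%" PRIdLEAST32 "\n", total);
    return 0;
}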

But 32 bits is “big enough” for most programs, and 64-bit machines always have fast 32-bit integers, so int stayed 32-bit when moving from 32 to 64-bit machines.

On some machines, 16-bit integers aren’t very well supported. e.g. implementing the wrapping to 16 bits with uint16_t on MIPS would require extra AND-immediate instructions. So making int a 16-bit type would have been a poor choice there.

On x86 you could just use 16-bit operand size, and use movzx instead of mov when copying, but it’s “normal” for int to be 32-bit on 32-bit machines so x86 32-bit ABIs all chose that.

When ISAs were extended from 32 to 64-bit, there was zero performance reason to make int wider, unlike the 16->32 case. (Also in that case, short stayed 16-bit so there was a typename for both 16 and 32-bit integers, even before C99 stdint.h existed).

On x86-64, the default operand-size is still 32-bit; mov rax, rcx takes an extra prefix byte (REX.W) vs. mov eax, ecx, so 32-bit is slightly more efficient. Also, 64-bit multiply was slightly slower on some CPUs, and 64-bit division is significantly slower than 32-bit even on current Intel CPUs. (See also: The advantages of using 32bit registers/instructions in x86-64.)
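
A rough illustration of the encoding cost (the asm in the comments is typical gcc -O2 output for x86-64; exact instruction choice varies):

/* 32-bit version: no REX.W prefix needed.
 *     mov  eax, edi        ; 2 bytes
 *     imul eax, esi        ; 3 bytes
 * 64-bit version: each instruction grows by a REX.W prefix byte.
 *     mov  rax, rdi        ; 3 bytes
 *     imul rax, rsi        ; 4 bytes
 */
int       mul32(int a, int b)             { return a * b; }
long long mul64(long long a, long long b) { return a * b; }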


Also, compilers need a primitive type for int32_t, if they want to provide the optional int32_t at all. (The fixed-width 2's complement types are optional, unlike int_least32_t and so on, which aren't guaranteed to be 2's complement or free of padding.)

Compilers with 16-bit short and 64-bit int could have an implementation-specific type name like __int32 that they use as the typedef for int32_t / uint32_t, so this argument isn’t a total showstopper. But it would be weird.

When growing from 16 to 32, it made sense to change int to be wider than the ISO C minimum, because you still have short as a name for 16-bit. (This argument isn’t super great because you do have long as a name for 32-bit integers on 32-bit systems.)

But when growing to 64-bit, you want some type to be a 32-bit integer type. (And long can’t be narrower than int). char / short / int / long (or long long) covers all 4 possible operand-sizes. int32_t isn’t guaranteed to be available on all systems, so expecting everyone to use that if they want 32-bit signed integers is not a viable option for portable code.
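
On a mainstream x86-64 Linux compiler the four standard names do line up with the four operand-sizes, as these asserts show (they hold for LP64 ABIs like x86-64 System V, not as ISO C guarantees):

#include <assert.h>

/* True on x86-64 System V (LP64); NOT promised by ISO C. */
static_assert(sizeof(char)  == 1, "8-bit byte");
static_assert(sizeof(short) == 2, "16-bit");
static_assert(sizeof(int)   == 4, "32-bit");
static_assert(sizeof(long)  == 8, "64-bit on LP64; Windows x64 would need long long");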


Footnote 1:

You could argue either way whether it’s better for long to be a 32 or 64-bit type. Microsoft’s choice of keeping it 32-bit meant that struct layouts using long might not change between 32 and 64-bit code (but they would if they included pointers).

ISO C requires long to be at least a 32-bit type (actually they define it in terms of the min and max value that can be represented, but elsewhere they do require that integer types are binary integers with optional padding).

Anyway, some code uses long because it needs a 32-bit type, but it doesn’t need 64; in many cases more bits isn’t better, they’re just not needed.

Within a single ABI like x86-64 System V, it’s semi-convenient to always have long be the same width as a pointer, but since portable code always needs to use unsigned long long or uint64_t or uint_least64_t or uintptr_t depending on the use-case, it might be a mistake for x86-64 System V to have chosen 64-bit long.
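
For instance, code that needs a pointer-width integer should say so with uintptr_t rather than assuming long is pointer-sized; this sketch works on both LP64 Linux and LLP64 Windows (uintptr_t is technically optional, but both provide it):

#include <stdint.h>
#include <stdio.h>

int main(void) {
    int v = 0;
    /* uintptr_t is wide enough to round-trip a pointer wherever it exists;
       long is not (it stays 32-bit on Windows x64). */
    uintptr_t bits = (uintptr_t)&v;
    printf("%ju\n", (uintmax_t)bits);
    return 0;
}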

OTOH, wider types for locals can sometimes save instructions by avoiding sign-extending when indexing a pointer, but the fact that signed overflow is undefined behaviour often lets compilers widen int in the asm when convenient.
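
For example, in the loop below a compiler may keep i in a full 64-bit register, because signed overflow is undefined behaviour so it can pretend i never wraps; that avoids re-sign-extending i every time it forms the address a + i (a sketch of the usual codegen, not a guarantee):

long sum_int_index(const long *a, int n) {
    long sum = 0;
    /* The compiler is allowed to widen i to 64 bits in the asm, since a
       conforming program can never make this int counter overflow. */
    for (int i = 0; i < n; i++)
        sum += a[i];
    return sum;
}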
