Encoding x86-64 instructions: some worked examples
To perform its compile-time evaluation duties, CIR includes an x86-64 JIT. Unfortunately, when I looked around I couldn't find a ready-to-use C assembler library (there's one for C++ though), so I had to cobble together my own. While many websites explain how to encode a few simple instructions, few teach you how to encode any x86-64 instruction. With this post I hope to bridge the gap by showing, through some worked examples, my experience of encoding x86-64 instructions using information from an x86 instruction encoding reference such as http://ref.x86asm.net/index.html.
Before starting: verifying our encoding with objdump
When encoding x86-64 instructions by hand, it's very useful to verify that your encoded instructions are correct (to save yourself some painful debugging). This can be easily done with objdump.
Save the following as disasm.sh:
After making it executable, we can execute it as follows:
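The same check can also be driven from Python. The sketch below assumes GNU objdump is on the PATH; the flags (disassemble everything in a raw binary file as x86-64 code, printed in Intel syntax) are an assumption about what such a check needs, and the helper names objdump_command and disassemble are hypothetical.

```python
import subprocess
import tempfile


def objdump_command(path):
    """Build an objdump invocation for a raw binary file (assumed flags)."""
    return [
        "objdump",
        "-D",                 # disassemble all sections
        "-b", "binary",       # the input is a raw blob, not an ELF file
        "-m", "i386:x86-64",  # decode the bytes as x86-64 code
        "-M", "intel",        # print Intel syntax, e.g. "add eax,0x4351ff23"
        path,
    ]


def disassemble(code):
    """Write the bytes to a temporary file and return objdump's output."""
    with tempfile.NamedTemporaryFile(suffix=".bin") as f:
        f.write(code)
        f.flush()  # make sure objdump sees the bytes on disk
        result = subprocess.run(
            objdump_command(f.name), capture_output=True, text=True, check=True
        )
    return result.stdout
```

Calling disassemble(bytes([0x01, 0xF1])) should then return a listing for those two bytes, provided objdump is installed.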
Starting off with some x86 instruction examples
Since the x86-64 instruction set simply extends the x86 instruction set, it's a good idea to first take a look at some x86 encoding examples so that we can better understand how the x86-64 encoding is derived from the x86 encoding later.
x86 has 8 general-purpose registers, each 32 bits wide: eax, ecx, edx, ebx, esp, ebp, esi, edi.
They are internally numbered 0 to 7, in that order.
Now, let's take a look at some examples:
A simple encoding: ADD eax, 0x4351ff23
Let's start with a simple example of adding a 32-bit immediate value to the eax register (0x4351ff23 is just a random value I picked; you can use your own).
Looking at http://ref.x86asm.net/geek32-abc.html, we can see many entries listed under ADD:
(I've cut out most of the columns of the table since the table was going off the screen)
This is because while the assembly might call all of these instructions "ADD", they all have different opcodes internally. Which opcode you should use (and overall, the encoding of the instruction) depends on the type of your operands. This information is listed under the op1 and op2 columns of the table.
For our instruction, we are looking to operate on eAX as operand1 and Iv (meaning Immediate 32-bit value) as operand2.
Looking at the table, we can see that Row 6 (highlighted yellow) matches what we want.
The opcode of the instruction is listed under the po column, and is 0x05.
So, the first byte of our instruction is 0x05:
+------+
| 0x05 |
+------+
   ^ First byte is the opcode.
Next comes our immediate value, which is 32 bits (4 bytes) long. Note that immediate values are encoded in little-endian byte order, that is, least significant byte first.
+------+------+------+------+------+
| 0x05 | 0x23   0xff   0x51   0x43 |
+------+------+------+------+------+
          ^ The immediate value (least significant byte first)
And, we are done! Dump these bytes into a file (foo.bin) and then disassemble it with our disassembly script:
$ echo -n -e \\x05\\x23\\xff\\x51\\x43 >foo.bin
$ ./disasm.sh foo.bin

foo.bin:     file format binary

Disassembly of section .data:

0000000000000000 <.data>:
   0:   05 23 ff 51 43          add    eax,0x4351ff23
So, everything seems correct!
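The two steps above (opcode byte, then the immediate in little-endian order) can be sketched in a few lines of Python. This is only an illustration; the function name encode_add_eax_imm32 is hypothetical and not part of any library.

```python
import struct

ADD_EAX_IMM32 = 0x05  # opcode for ADD eAX, Iv


def encode_add_eax_imm32(imm):
    """Encode ADD eax, imm32 as the opcode byte followed by the immediate."""
    # "<I" packs an unsigned 32-bit integer, least significant byte first.
    return bytes([ADD_EAX_IMM32]) + struct.pack("<I", imm & 0xFFFFFFFF)


print(encode_add_eax_imm32(0x4351FF23).hex(" "))  # 05 23 ff 51 43
```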
Now, you might notice that while we are telling the processor which operation to perform (via the opcode) and the immediate value, we aren't specifically telling the processor to use the eax register.
This is because opcode 0x05 can only be used with the EAX register (which is why the EAX register is called the accumulator register).
These kinds of opcodes with specially-blessed registers are quite a frequent occurrence in the x86 instruction set: they trade off some flexibility for a shorter instruction encoding.
Introducing the ModR/M byte with ADD ecx, esi
Let's consider the instruction ADD ecx, esi, which does a 32-bit add of the contents of register esi to register ecx, storing the result in ecx.
(I randomly picked ecx and esi; you can use your own if you want.)
For this instruction, the opcode for ADD that we should use is one that accepts Ev as operand1 and Gv as operand2.
(E and G mean that the operands are specified in the ModR/M byte; v means that the operands are 32-bit in size.)
Looking at the table, the opcode that we should use is 0x01, and I colored it yellow in the table below:
I also colored opcode 0x03 gray. I'll cover that later.
Now, same as usual, we start off with the opcode byte of 0x01:
+------+
| 0x01 |
+------+
   ^ First byte is the opcode.
When the processor sees the opcode 0x01, it then expects the ModR/M byte to come next.
The ModR/M byte allows us to specify the register operands of instructions.
https://wiki.osdev.org/X86-64_Instruction_Encoding has this to say about the format of the ModR/M byte:
  7                           0
+---+---+---+---+---+---+---+---+
|  mod  |    reg    |     rm    |
+---+---+---+---+---+---+---+---+
|MODRM.mod||2 bits||When this field is b11, then register-direct addressing mode is used; otherwise register-indirect addressing mode is used.|
|MODRM.reg||3 bits||This field can have one of two values: a 3-bit opcode extension, or a 3-bit register reference. (This specifies the G operand.)|
|MODRM.rm||3 bits||Specifies a direct or indirect register operand, optionally with a displacement. (This specifies the E operand.)|
As such, for the instruction ADD ecx, esi, our values for the fields of the ModR/M byte are:
- MODRM.mod = b11 (since we are using register-direct addressing mode)
- MODRM.reg = b110 (esi, which is register 6)
- MODRM.rm = b001 (ecx, which is register 1)
You might notice that the description in the table above also mentions register-indirect addressing mode.
The ModR/M byte, along with the SIB byte, can also be used to encode instructions where an operand is loaded from memory (e.g. ADD ecx, [esp + 0x4]), but we won't cover that here.
So, putting the bits together we get the byte b11110001 = 0xf1.
Let's add it after our opcode:
+------+------+
| 0x01 | 0xf1 |
+------+------+
          ^ ModR/M byte
And run it through our disassembler:
$ echo -n -e \\x01\\xf1 >foo.bin
$ ./disasm.sh foo.bin

foo.bin:     file format binary

Disassembly of section .data:

0000000000000000 <.data>:
   0:   01 f1                   add    ecx,esi
And everything looks about right!
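Packing the three ModR/M fields is plain bit shifting, which the following Python sketch illustrates. The names REG32, modrm and encode_add_r32_r32 are hypothetical helpers made up for this example; only register-direct mode (mod = b11) is handled.

```python
# x86 register numbers, as used in the ModR/M byte.
REG32 = {"eax": 0, "ecx": 1, "edx": 2, "ebx": 3,
         "esp": 4, "ebp": 5, "esi": 6, "edi": 7}


def modrm(mod, reg, rm):
    """Pack the mod (2 bits), reg (3 bits) and rm (3 bits) fields into one byte."""
    assert 0 <= mod <= 3 and 0 <= reg <= 7 and 0 <= rm <= 7
    return (mod << 6) | (reg << 3) | rm


def encode_add_r32_r32(dst, src):
    """Encode ADD dst, src with opcode 0x01 (ADD Ev, Gv)."""
    # With opcode 0x01, the destination (Ev) goes in rm and the source (Gv) in reg.
    return bytes([0x01, modrm(0b11, REG32[src], REG32[dst])])


print(encode_add_r32_r32("ecx", "esi").hex(" "))  # 01 f1
```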
While in this example we used opcode 0x01, you might notice that opcode 0x03 (which I highlighted gray in the table above) also takes in Ev and Gv as operands, just with their positions swapped.
With opcode 0x01, Ev is operand 1 and Gv is operand 2, but with opcode 0x03, Ev is operand 2 and Gv is operand 1.
So, can you guess what happens if we used opcode 0x03 instead of opcode 0x01? With the same ModR/M byte, the two registers swap roles: 0x03 0xf1 means add esi, ecx.
In fact, we can actually use both opcodes 0x01 and 0x03 to implement the same instruction (add ecx, esi)! With opcode 0x03 we just need to swap the reg and rm fields, which gives 0x03 0xce.
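To make the two equivalent encodings concrete, here is a small Python sketch that builds both. The helper modrm and the constants ECX and ESI are hypothetical names used only for this illustration.

```python
def modrm(mod, reg, rm):
    """Pack the mod, reg and rm fields into a ModR/M byte."""
    return (mod << 6) | (reg << 3) | rm


ECX, ESI = 1, 6  # register numbers

# Opcode 0x01 is ADD Ev, Gv: the destination (ecx) goes in rm.
via_01 = bytes([0x01, modrm(0b11, ESI, ECX)])
# Opcode 0x03 is ADD Gv, Ev: the destination (ecx) goes in reg.
via_03 = bytes([0x03, modrm(0b11, ECX, ESI)])

print(via_01.hex(" "))  # 01 f1
print(via_03.hex(" "))  # 03 ce
```

Both byte sequences should disassemble to add ecx,esi; an assembler is free to pick either one.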
Moving on to 64-bit
Now, let's move on to see how x86-64 instructions can be encoded.
x86-64 added 8 new registers, for a total of 16 registers.
All 16 registers are 64 bits wide.
The full list of registers is: rax, rcx, rdx, rbx, rsp, rbp, rsi, rdi, r8, r9, r10, r11, r12, r13, r14, r15.
These are internally numbered 0 to 15, in that order.
All eight 32-bit x86 registers still exist in x86-64, but they now map to the lower 32 bits of their respective 64-bit registers, i.e. eax maps to the lower 32 bits of rax, ecx maps to the lower 32 bits of rcx, and so on.
Also, in x86-64, the default operand size is the same as in x86, which is 32 bits. This means that the exact same opcodes and instruction encodings that we covered above can be used in x86-64 to mean the same thing (32-bit addition).
So, for example, 0x01 0xf1, which means add ecx, esi (which we covered above), will take the lower 32 bits of rsi, add it to the lower 32 bits of rcx, and write the result to the lower 32 bits of rcx. The upper 32 bits of rcx will be zeroed.
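The effect on the full 64-bit register can be modelled with a little integer arithmetic. The Python sketch below is only a model of the behaviour described above, not processor code; the function name add_ecx_esi is hypothetical.

```python
MASK32 = 0xFFFFFFFF


def add_ecx_esi(rcx, rsi):
    """Return the new 64-bit value of rcx after `add ecx, esi` in 64-bit mode."""
    # Only the lower 32 bits take part in the addition, the result wraps
    # around at 32 bits, and the upper 32 bits of rcx end up as zero.
    return ((rcx & MASK32) + (rsi & MASK32)) & MASK32


print(hex(add_ecx_esi(0x1111111100000001, 0x2222222200000002)))  # 0x3
```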
That's nice (it means that the 32-bit encodings we covered carry over unchanged to 64-bit code), but what happens if we want to do 64-bit addition that operates on 64-bit registers instead?
ADD rcx, rsi with the REX prefix byte
To turn the encoding of the ADD ecx, esi instruction into the encoding of ADD rcx, rsi, we need to tell the processor to run the ADD instruction with a 64-bit operand size. This can be done with a REX prefix byte added in front of the instruction.
The REX prefix byte has the following format (Source):
  7                           0
+---+---+---+---+---+---+---+---+
| 0   1   0   0 | W | R | X | B |
+---+---+---+---+---+---+---+---+
|0100||4 bits||Fixed bit pattern|
|W||1 bit||When 1, a 64-bit operand size is used. Otherwise, when 0, the default operand size is used (which is 32-bit for most but not all instructions)|
|R||1 bit||This 1-bit value is an extension to the MODRM.reg field.|
|X||1 bit||This 1-bit value is an extension to the SIB.index field.|
|B||1 bit||This 1-bit value is an extension to the MODRM.rm field or the SIB.base field.|
As we can see in the table above, the W field is what we want.
So, let's set it to 1 and the rest of the fields to 0.
The resulting bit pattern is b01001000, which is 0x48.
So, we add the byte 0x48 in front of our previous encoding of ADD ecx, esi (0x01 0xf1):
+------+ +------+------+
| 0x48 | | 0x01 | 0xf1 |
+------+ +------+------+
   ^         ^
   REX byte  Encoding of ADD ecx,esi
And run it through our disassembler:
$ echo -n -e \\x48\\x01\\xf1 >foo.bin
$ ./disasm.sh foo.bin

foo.bin:     file format binary

Disassembly of section .data:

0000000000000000 <.data>:
   0:   48 01 f1                add    rcx,rsi
And we get add rcx, rsi now! Great!
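Building the REX byte follows the same bit-packing pattern as the ModR/M byte. In the Python sketch below, rex is a hypothetical helper name; the fixed b0100 pattern and the field positions come from the format shown above.

```python
def rex(w=0, r=0, x=0, b=0):
    """Build a REX prefix byte: the fixed pattern b0100, then W, R, X and B."""
    return 0b01000000 | (w << 3) | (r << 2) | (x << 1) | b


add_ecx_esi = bytes([0x01, 0xF1])
# Setting REX.W selects a 64-bit operand size for the same opcode and ModR/M byte.
add_rcx_rsi = bytes([rex(w=1)]) + add_ecx_esi

print(add_rcx_rsi.hex(" "))  # 48 01 f1
```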
But wait, doesn't x86-64 have 16 registers which go from 0 to 15? Yet, the ModR/M byte only allocates 3 bits each for us to specify register operands. So, how can we specify registers 8 to 15?
As mentioned in the REX byte table above, the R, X and B fields can be used.
ADD rcx, r9
r9 has a register number of 9, which is b1001 in binary.
However, MODRM.reg is only 3 bits long.
As such, we encode the lower 3 bits in the MODRM.reg field, and the upper bit in the REX.R field. Likewise, rcx (register 1, or b0001) is split between MODRM.rm and REX.B.
As such, our final REX byte configuration is:
- REX.W = b1 (since we still want to do a 64-bit ADD)
- REX.R = b1 (upper 1-bit of b1001 (r9))
- REX.X = b0
- REX.B = b0 (upper 1-bit of b0001 (rcx))
And our final MODRM byte configuration is:
- MODRM.mod = b11 (since we are still using register-direct addressing mode)
- MODRM.reg = b001 (lower 3-bits of b1001 (r9))
- MODRM.rm = b001 (lower 3-bits of b0001 (rcx))
As such, our final REX byte is 0x4c and our final MODRM byte is 0xc9.
We are still using opcode 0x01 as usual.
As such, our encoding is:
+------+------+------+
| 0x4c | 0x01 | 0xc9 |
+------+------+------+
Running it through the disassembler:
$ echo -n -e \\x4c\\x01\\xc9 >foo.bin
$ ./disasm.sh foo.bin

foo.bin:     file format binary

Disassembly of section .data:

0000000000000000 <.data>:
   0:   4c 01 c9                add    rcx,r9
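Putting the pieces together, a register-to-register 64-bit ADD for any pair of the 16 registers can be sketched as below. This is a minimal illustration under the assumptions of this post (opcode 0x01, register-direct mode only); REG64 and encode_add_r64_r64 are hypothetical names.

```python
# Register names in internal numbering order (rax = 0 ... r15 = 15).
REG64 = ["rax", "rcx", "rdx", "rbx", "rsp", "rbp", "rsi", "rdi",
         "r8", "r9", "r10", "r11", "r12", "r13", "r14", "r15"]


def encode_add_r64_r64(dst, src):
    """Encode ADD dst, src (64-bit, register-direct) using opcode 0x01."""
    rm = REG64.index(dst)   # Ev operand: destination
    reg = REG64.index(src)  # Gv operand: source
    # REX = b0100, W = 1, R = top bit of reg, X = 0, B = top bit of rm.
    rex = 0x48 | ((reg >> 3) << 2) | (rm >> 3)
    # ModR/M = mod b11, then the lower 3 bits of reg and of rm.
    modrm = (0b11 << 6) | ((reg & 7) << 3) | (rm & 7)
    return bytes([rex, 0x01, modrm])


print(encode_add_r64_r64("rcx", "r9").hex(" "))   # 4c 01 c9
print(encode_add_r64_r64("rcx", "rsi").hex(" "))  # 48 01 f1
```

Swapping the operands to ADD r9, rcx moves the extension bit from REX.R to REX.B, so the prefix becomes 0x49 while the ModR/M byte stays 0xc9.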
This post is still kind of under construction. In the future I would like to add some more info on encoding the SIB byte. However, hopefully these worked examples will be enough for readers to extrapolate and encode their own instructions using information from http://ref.x86asm.net/index.html.