Encoding x86-64 instructions: some worked examples
To perform its compile-time evaluation duties, CIR includes an x86-64 JIT. Unfortunately, when I looked around I couldn't find a ready-to-use C assembler library (there's one for C++ though), so I had to cobble together my own. While many websites explain how to encode some simple instructions, few teach you how to encode any x86-64 instruction. With this post I hope to bridge the gap by showing, through some worked examples, my experience of encoding x86-64 instructions using information from an x86 instruction encoding reference such as http://ref.x86asm.net/index.html.
Before starting: verifying our encoding with objdump
When encoding x86-64 instructions by hand, it's very useful to verify that your encoded instructions are correct (to save yourself some painful debugging). This can be easily done with objdump.
Save the following as disasm.sh:
After making it executable, we can execute it as follows:
$ ./disasm.sh foo.bin
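For readers who would rather drive the same check from Python, here is a minimal sketch. It assumes GNU objdump is installed and on your PATH. The flag choice (raw binary input, x86-64, Intel syntax) is my assumption, picked to match the disassembly output shown in the examples in this post, and objdump_command and disassemble are hypothetical helper names.

```python
import subprocess


def objdump_command(path):
    """Build a GNU objdump command line that disassembles a raw binary file."""
    return [
        "objdump",
        "-D",                 # disassemble the contents of all sections
        "-b", "binary",       # the input is a raw blob, not an ELF/PE file
        "-m", "i386:x86-64",  # decode the bytes as x86-64 code
        "-M", "intel",        # print Intel syntax (destination operand first)
        path,
    ]


def disassemble(path):
    """Run objdump on `path` and return its textual output (assumes objdump exists)."""
    result = subprocess.run(
        objdump_command(path), capture_output=True, text=True, check=True
    )
    return result.stdout
```

Calling disassemble("foo.bin") should return the same kind of listing that the shell script prints.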
Starting off with some x86 instruction examples
Since the x86-64 instruction set simply extends the x86 instruction set, it's a good idea to first take a look at some x86 encoding examples so that we can better understand how the x86-64 encoding is derived from the x86 encoding later.
x86 has 8 registers, each 32 bits wide: eax, ecx, edx, ebx, esp, ebp, esi, edi.
They are internally numbered 0 to 7 respectively.
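Since the register numbers are just positions in that list, a lookup table is enough to get them. A minimal sketch (REGS32 and reg_number are hypothetical names I made up for illustration):

```python
# The eight 32-bit general-purpose registers, in encoding order (0 to 7).
REGS32 = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]


def reg_number(name):
    """Return the internal register number (0-7) for a 32-bit register name."""
    return REGS32.index(name.lower())


print(reg_number("ecx"))  # 1
print(reg_number("esi"))  # 6
```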
Now, let's take a look at some examples:
A simple encoding: ADD eax, 0x4351ff23
Let's start with a simple example of adding a 32-bit immediate value to the eax register. (The value 0x4351ff23 is just a random value I picked; you can use your own.)
Looking at http://ref.x86asm.net/geek32-abc.html, we can see many entries listed under ADD:
mnemonic | op1 | op2 | po |
---|---|---|---|
ADD | Eb | Gb | 00 |
ADD | Ev | Gv | 01 |
ADD | Gb | Eb | 02 |
ADD | Gv | Ev | 03 |
ADD | AL | Ib | 04 |
ADD | eAX | Iv | 05 |
ADD | Eb | Ib | 80 |
ADD | Ev | Iv | 81 |
(I've cut out most of the columns of the table since the table was going off the screen)
This is because while the assembly might call all of these instructions "ADD", they all have different opcodes internally. Which opcode you should use (and overall, the encoding of the instruction) depends on the type of your operands. This information is listed under the op1 and op2 columns of the table.
For our instruction, we are looking to operate on eAX as operand 1 and Iv (meaning a 32-bit immediate value) as operand 2.
Looking at the table, we can see that the sixth row (ADD | eAX | Iv | 05) matches what we want.
The opcode of the instruction is listed under the po column, and is 0x05.
So, the first byte of our instruction is 0x05:
+------+
| 0x05 |
+------+
^ First byte is the opcode.
Next comes our immediate value, which is 32 bits (4 bytes) long. Note that immediate values are encoded in little-endian byte order, that is, least significant byte first.
+------+------+------+------+------+
| 0x05 | 0x23 0xff 0x51 0x43 |
+------+------+------+------+------+
^ The immediate value (least significant byte first)
And, we are done! Dump these bytes into a file (foo.bin) and then disassemble it with our disassembly script:
$ echo -n -e \\x05\\x23\\xff\\x51\\x43 >foo.bin
$ ./disasm.sh foo.bin
foo.bin: file format binary
Disassembly of section .data:
0000000000000000 <.data>:
0: 05 23 ff 51 43 add eax,0x4351ff23
So, everything seems correct!
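The steps above (opcode byte, then the immediate in little-endian order) can be sketched in a few lines of Python. This is just an illustration, and encode_add_eax_imm32 is a hypothetical helper name, not part of any library:

```python
import struct


def encode_add_eax_imm32(imm):
    """Encode ADD eax, imm32: opcode 0x05 followed by the 4-byte immediate."""
    opcode = bytes([0x05])
    # "<I" packs an unsigned 32-bit integer, least significant byte first.
    immediate = struct.pack("<I", imm & 0xFFFFFFFF)
    return opcode + immediate


print(encode_add_eax_imm32(0x4351FF23).hex())  # 0523ff5143
```

Writing the returned bytes to foo.bin gives the same file as the echo command above.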
Now, you might notice that while we are telling the processor which operation to perform via the opcode and the immediate value, we aren't specifically telling the processor to use the EAX register.
This is because opcode 0x05 can only be used with the EAX register (which is why the EAX register is called the accumulator register).
These kinds of opcodes with specially-blessed registers are quite a frequent occurrence in the x86 instruction set: they trade off some flexibility to have a shorter instruction encoding.
Introducing the ModR/M byte with ADD ecx, esi
Let's consider the instruction ADD ecx, esi, which adds the 32-bit contents of register esi to ecx.
I randomly picked ecx and esi; you can use your own if you want.
For this instruction, the opcode for ADD that we should use is one that accepts Ev as operand 1 and Gv as operand 2.
(E and G mean that the operands are specified in the ModR/M byte, and v means that the operands are 32 bits in size.)
Looking at the table, the opcode that we should use is 0x01, which is the second row of the table below:
mnemonic | op1 | op2 | po |
---|---|---|---|
ADD | Eb | Gb | 00 |
ADD | Ev | Gv | 01 |
ADD | Gb | Eb | 02 |
ADD | Gv | Ev | 03 |
ADD | AL | Ib | 04 |
ADD | eAX | Iv | 05 |
ADD | Eb | Ib | 80 |
ADD | Ev | Iv | 81 |
Take note of opcode 0x03 (the Gv, Ev row) as well. I'll cover that later.
Now, same as usual, we start off with the opcode byte of 0x01:
+------+
| 0x01 |
+------+
^ First byte is the opcode.
When the processor sees the opcode 0x01, it then expects the ModR/M byte to come next.
The ModR/M byte allows us to specify the register operands of instructions that take E and G operands.
https://wiki.osdev.org/X86-64_Instruction_Encoding has this to say about the format of the ModR/M byte:
7 0
+---+---+---+---+---+---+---+---+
| mod | reg | rm |
+---+---+---+---+---+---+---+---+
Field | Length | Description |
---|---|---|
MODRM.mod | 2 bits | When this field is b11, then register-direct addressing mode is used; otherwise register-indirect addressing mode is used. |
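MODRM.reg | 3 bits | This field can have one of two values: a 3-bit opcode extension, used by some instructions to distinguish them from other instructions that share the same opcode byte; or a 3-bit register reference, used as the source or destination of the instruction. (This specifies the G operand) |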
MODRM.rm | 3 bits | Specifies a direct or indirect register operand, optionally with a displacement. (This specifies the E operand) |
As such, for the instruction ADD ecx, esi, our values for the fields of the ModR/M byte are:
- MODRM.mod = b11 (since we are using register-direct addressing mode)
- MODRM.reg = b110 (esi, which is register 6)
- MODRM.rm = b001 (ecx, which is register 1)
You might notice that the description in the table above also mentions register-indirect addressing mode.
The ModR/M byte, along with the SIB byte, can be used to encode an ADD instruction where an operand is loaded from memory (e.g. ADD ecx, [esp + 0x4]), but we won't cover it here.
So, putting the bits together we get the byte b11110001 = 0xf1.
Let's add it after our opcode:
+------+------+
| 0x01 | 0xf1 |
+------+------+
^ ModR/M byte
And run it through our disassembler:
$ echo -n -e \\x01\\xf1 >foo.bin
$ ./disasm.sh foo.bin
foo.bin: file format binary
Disassembly of section .data:
0000000000000000 <.data>:
0: 01 f1 add ecx,esi
And everything looks about right!
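Packing the three ModR/M fields into one byte is just a couple of shifts. Here's a minimal sketch of that, using hypothetical helper names (modrm, encode_add_r32_r32) and plain register numbers rather than names:

```python
def modrm(mod, reg, rm):
    """Pack the ModR/M fields: mod in bits 7-6, reg in bits 5-3, rm in bits 2-0."""
    assert 0 <= mod <= 0b11 and 0 <= reg <= 0b111 and 0 <= rm <= 0b111
    return (mod << 6) | (reg << 3) | rm


def encode_add_r32_r32(dst, src):
    """Encode ADD dst, src with opcode 0x01 (Ev, Gv), register-direct mode.

    dst and src are register numbers in the range 0-7.
    Ev (operand 1, the destination) goes in rm; Gv (operand 2) goes in reg.
    """
    return bytes([0x01, modrm(0b11, src, dst)])


ECX, ESI = 1, 6
print(encode_add_r32_r32(ECX, ESI).hex())  # 01f1
```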
Opcode 0x01 vs 0x03
While in this example we used opcode 0x01, you might notice that opcode 0x03 (the Gv, Ev row in the table above) also takes Ev and Gv as operands, just with their positions swapped. With opcode 0x01, Ev is operand 1 and Gv is operand 2, but with opcode 0x03, Ev is operand 2 and Gv is operand 1.
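So, can you guess what happens if we used opcode 0x03 instead of opcode 0x01?
If we keep the same ModR/M byte (0x03 0xf1), the operands swap and we get add esi, ecx instead.
In fact, we can actually use both opcodes 0x01 and 0x03 to implement the same instruction (add ecx, esi)! With opcode 0x03, MODRM.reg holds the destination ecx (b001) and MODRM.rm holds the source esi (b110), giving the ModR/M byte b11001110 = 0xce.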
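To make the two forms concrete, here's a small sketch that produces both encodings of add ecx, esi. The helper names (modrm, add_via_01, add_via_03) are hypothetical, and the arguments are register numbers:

```python
def modrm(mod, reg, rm):
    """Pack the ModR/M fields: mod in bits 7-6, reg in bits 5-3, rm in bits 2-0."""
    return (mod << 6) | (reg << 3) | rm


def add_via_01(dst, src):
    """Opcode 0x01 is ADD Ev, Gv: destination in rm, source in reg."""
    return bytes([0x01, modrm(0b11, src, dst)])


def add_via_03(dst, src):
    """Opcode 0x03 is ADD Gv, Ev: destination in reg, source in rm."""
    return bytes([0x03, modrm(0b11, dst, src)])


ECX, ESI = 1, 6
print(add_via_01(ECX, ESI).hex())  # 01f1
print(add_via_03(ECX, ESI).hex())  # 03ce
```

Both byte sequences should disassemble to add ecx,esi. An assembler only has to pick one of them.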
Moving on to 64-bit ADD
Now, let's move on to see how x86-64 instructions can be encoded.
x86-64 added 8 new registers, for a total of 16 registers.
All 16 registers were also extended to be 64-bits wide.
The full list of registers is: rax, rcx, rdx, rbx, rsp, rbp, rsi, rdi, r8, r9, r10, r11, r12, r13, r14, r15.
These are internally numbered 0 to 15 respectively.
All eight 32-bit x86 registers still exist in x86-64, but they now map to the lower 32 bits of their respective 64-bit registers, i.e. eax maps to the lower 32 bits of rax, ecx maps to the lower 32 bits of rcx, etc.
Also, in x86-64, the default operand size is the same as in x86, which is 32 bits. This means that the exact same opcodes and instruction encodings that we covered above can be used in x86-64 to mean the same thing (32-bit addition).
So, for example, 0x01 0xf1, which means add ecx, esi (which we covered above), will take the lower 32 bits of rsi, add it to the lower 32 bits of rcx, and write the result to the lower 32 bits of rcx. The upper 32 bits of rcx will be zeroed.
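The effect on the 64-bit register can be modelled with a bit of masking. This is only a sketch of the register arithmetic (it ignores the flags), and add32_in_64bit_mode is a hypothetical name:

```python
MASK32 = 0xFFFFFFFF


def add32_in_64bit_mode(rcx, rsi):
    """Model add ecx, esi in 64-bit mode and return the new value of rcx.

    Only the lower 32 bits of each register take part in the addition, and
    the 32-bit result is zero-extended into the full 64-bit register.
    """
    return ((rcx & MASK32) + (rsi & MASK32)) & MASK32


# The upper half of rcx is discarded, not preserved.
print(hex(add32_in_64bit_mode(0xAAAAAAAA00000001, 0x2)))  # 0x3
# A carry out of bit 31 does not spill into the upper half either.
print(hex(add32_in_64bit_mode(0xFFFFFFFF, 0x1)))  # 0x0
```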
That's nice (most 32-bit encodings carry over unchanged, although 32-bit executables themselves run in a separate compatibility mode), but what happens if we want to do 64-bit addition that operates on 64-bit registers instead?
Encoding ADD rcx, rsi with the REX prefix byte
To turn the encoding of the ADD ecx, esi instruction into the encoding of ADD rcx, rsi, we need to tell the processor to run the ADD instruction with a 64-bit operand size. This can be done with a REX prefix byte added in front of the instruction.
The REX prefix byte has the following format (Source):
7 0
+---+---+---+---+---+---+---+---+
| 0 1 0 0 | W | R | X | B |
+---+---+---+---+---+---+---+---+
Field | Length | Description |
---|---|---|
0100 | 4 bits | Fixed bit pattern |
W | 1 bit | When 1, a 64-bit operand size is used. Otherwise, when 0, the default operand size is used (which is 32-bit for most but not all instructions) |
R | 1 bit | This 1-bit value is an extension to the MODRM.reg field. |
X | 1 bit | This 1-bit value is an extension to the SIB.index field. |
B | 1 bit | This 1-bit value is an extension to the MODRM.rm field or the SIB.base field. |
As we can see in the table above, the W field is what we want.
So, let's set it to 1 and the rest of the fields to 0.
The resulting bit pattern is b01001000, which is 0x48.
So, we add the byte 0x48 in front of our previous encoding of ADD ecx, esi (0x01 0xf1):
+------+ +------+------+
| 0x48 | | 0x01 | 0xf1 |
+------+ +------+------+
^ REX byte ^ Encoding of ADD ecx,esi
And run it through our disassembler:
$ echo -n -e \\x48\\x01\\xf1 >foo.bin
$ ./disasm.sh foo.bin
foo.bin: file format binary
Disassembly of section .data:
0000000000000000 <.data>:
0: 48 01 f1 add rcx,rsi
And we get add rcx, rsi now! Great!
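Building the REX byte follows the same shift-and-or pattern as the ModR/M byte. A minimal sketch, with rex as a hypothetical helper name:

```python
def rex(w, r=0, x=0, b=0):
    """Pack a REX prefix: fixed 0100 in bits 7-4, then the W, R, X, B bits."""
    assert all(bit in (0, 1) for bit in (w, r, x, b))
    return 0x40 | (w << 3) | (r << 2) | (x << 1) | b


# REX.W = 1 in front of the existing encoding of add ecx, esi (0x01 0xf1).
encoding = bytes([rex(w=1), 0x01, 0xF1])
print(encoding.hex())  # 4801f1
```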
But wait, doesn't x86-64 have 16 registers which go from 0 to 15? Yet, the ModR/M byte only allocates 3 bits each for us to specify register operands. So, how can we specify registers 8 to 15?
As mentioned in the REX byte table above, the R and B fields can be used.
Encoding ADD rcx, r9
r9 has a register number of 9, which is b1001 in binary.
However, MODRM.reg is only 3 bits long.
As such, we encode the lower 3 bits in the MODRM.reg field, and the upper bit in the R field.
As such, our final REX byte configuration is:
- REX.W = b1 (since we still want to do a 64-bit ADD)
- REX.R = b1 (upper bit of b1001 (r9))
- REX.X = b0
- REX.B = b0 (upper bit of b0001 (rcx))
And our final MODRM byte configuration is:
- MODRM.mod = b11 (since we are still using register-direct addressing mode)
- MODRM.reg = b001 (lower 3 bits of b1001 (r9))
- MODRM.rm = b001 (lower 3 bits of b0001 (rcx))
As such, our final REX byte is b01001100 = 0x4c and our final MODRM byte is b11001001 = 0xc9.
We are still using opcode 0x01 as usual.
As such, our encoding is:
+------+------+------+
| 0x4c | 0x01 | 0xc9 |
+------+------+------+
Running it through the disassembler:
$ echo -n -e \\x4c\\x01\\xc9 >foo.bin
$ ./disasm.sh foo.bin
foo.bin: file format binary
Disassembly of section .data:
0000000000000000 <.data>:
0: 4c 01 c9 add rcx,r9
Success!
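Putting the pieces together, here's a sketch of an encoder for any 64-bit register-to-register ADD. It is only an illustration of the scheme described in this post (opcode 0x01, register-direct mode), and REGS64 and encode_add_r64_r64 are hypothetical names:

```python
# The sixteen 64-bit general-purpose registers, in encoding order (0 to 15).
REGS64 = [
    "rax", "rcx", "rdx", "rbx", "rsp", "rbp", "rsi", "rdi",
    "r8", "r9", "r10", "r11", "r12", "r13", "r14", "r15",
]


def encode_add_r64_r64(dst, src):
    """Encode ADD dst, src for two 64-bit registers using opcode 0x01 (Ev, Gv)."""
    d = REGS64.index(dst)  # E operand: low 3 bits in MODRM.rm, high bit in REX.B
    s = REGS64.index(src)  # G operand: low 3 bits in MODRM.reg, high bit in REX.R
    rex = 0x40 | (1 << 3) | ((s >> 3) << 2) | (d >> 3)  # W=1, R, X=0, B
    modrm = (0b11 << 6) | ((s & 0b111) << 3) | (d & 0b111)
    return bytes([rex, 0x01, modrm])


print(encode_add_r64_r64("rcx", "rsi").hex())  # 4801f1
print(encode_add_r64_r64("rcx", "r9").hex())   # 4c01c9
```

Swapping the operands to add r9, rcx moves the high bit from REX.R to REX.B, so only the REX byte changes.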
Closing Words
This post is still kind of under construction; in the future I would like to add some more info on encoding the SIB byte. However, hopefully these worked examples will be enough for readers to extrapolate and encode their own instructions using information from http://ref.x86asm.net/index.html.