20th September 2019

Encoding x86-64 instructions: some worked examples

To perform its compile-time evaluation duties CIR includes an x86-64 JIT. Unfortunately, when I looked around I couldn't find a ready-to-use C assembler library (there's one for C++ though), so I had to cobble together my own. While many websites explain how to do the encoding for some simple instructions, few teach you how to encode any x86-64 instruction. With this post I hope to bridge the gap by showing, through some worked examples, my experience of encoding x86-64 instructions using information from an x86 instruction encoding reference such as http://ref.x86asm.net/index.html.

Before starting: verifying our encoding with objdump

When encoding x86-64 instructions by hand, it's very useful to verify that your encoded instructions are correct (to save yourself some painful debugging). This can be easily done with objdump. Save the following as disasm.sh:

disasm.sh
#!/bin/sh
objdump -D -b binary -mi386:x86-64 -M intel "$@"

After making it executable, we can execute it as follows:

$ ./disasm.sh foo.bin

Starting off with some x86 instruction examples

Since the x86-64 instruction set simply extends the x86 instruction set, it's a good idea to first take a look at some x86 encoding examples so that we can better understand how the x86-64 encoding is derived from the x86 encoding later.

x86 has 8 general-purpose registers, each 32 bits wide: eax, ecx, edx, ebx, esp, ebp, esi, edi. They are internally numbered 0 to 7 respectively.

Now, let's take a look at some examples:

A simple encoding: ADD eax, 0x4351ff23

Let's start with a simple example of adding a 32-bit immediate value to the eax register. (The value 0x4351ff23 is just a random value I picked; you can use your own.)

Looking at http://ref.x86asm.net/geek32-abc.html, we can see many entries listed under ADD:

mnemonic  op1  op2  po
ADD       Eb   Gb   00
ADD       Ev   Gv   01
ADD       Gb   Eb   02
ADD       Gv   Ev   03
ADD       AL   Ib   04
ADD       eAX  Iv   05
ADD       Eb   Ib   80
ADD       Ev   Iv   81

(I've cut out most of the columns of the table since the table was going off the screen)

This is because while the assembly might call all of these instructions "ADD", they all have different opcodes internally. Which opcode you should use (and overall, the encoding of the instruction) depends on the type of your operands. This information is listed under the op1 and op2 columns of the table.

For our instruction, we are looking to operate on eAX as operand 1 and Iv (meaning a 32-bit immediate value) as operand 2. Looking at the table, we can see that the sixth row (ADD eAX, Iv) matches what we want. The opcode of the instruction is listed under the po column, and is 0x05. So, the first byte of our instruction is 0x05:

+------+
| 0x05 |
+------+
  ^ First byte is the opcode.

Next comes our immediate value, which is 32 bits (4 bytes) long. Note that immediate values are encoded in little-endian byte order, that is, least significant byte first.

+------+------+------+------+------+
| 0x05 | 0x23   0xff   0x51   0x43 |
+------+------+------+------+------+
            ^ The immediate value (least significant byte first)
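
The same construction can be sketched in a few lines of Python. This is only an illustration, not the code used in CIR, and the helper name encode_add_eax_imm32 is made up for this example. The "<I" format of struct.pack produces an unsigned 32-bit integer in little-endian byte order.

```python
import struct

def encode_add_eax_imm32(imm):
    """Encode ADD eax, imm32: opcode 0x05 followed by the immediate."""
    # "<I" = little-endian unsigned 32-bit. The mask lets negative
    # values wrap around to their two's-complement bit pattern.
    return bytes([0x05]) + struct.pack("<I", imm & 0xFFFFFFFF)

print(encode_add_eax_imm32(0x4351FF23).hex())  # 0523ff5143
```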

And, we are done! Dump these bytes into a file (foo.bin) and then disassemble it with our disassembly script:

$ echo -n -e \\x05\\x23\\xff\\x51\\x43 >foo.bin
$ ./disasm.sh foo.bin

foo.bin:     file format binary


Disassembly of section .data:

0000000000000000 <.data>:
   0:	05 23 ff 51 43       	add    eax,0x4351ff23

So, everything seems correct!

Now, you might notice that while we are telling the processor which operation to perform via the opcode and the immediate value, we aren't specifically telling the processor to use the EAX register. This is because opcode 0x05 can only be used with the EAX register (which is why the EAX register is called the accumulator register). These kinds of opcodes with specially-blessed registers are quite a frequent occurrence in the x86 instruction set: they trade off some flexibility for a shorter instruction encoding.

Introducing the ModR/M byte with ADD ecx, esi

Let's consider the instruction ADD ecx, esi, which adds the 32-bit contents of register esi to ecx. I randomly picked ecx and esi; you can use your own if you want.

For this instruction, the opcode for ADD that we should use is one that accepts Ev as operand 1 and Gv as operand 2. (E and G mean that the operands are specified in the ModR/M byte, and v means that the operands are 32 bits in size.) Looking at the table, the opcode that we should use is 0x01, which is the second row (ADD Ev, Gv) of the table below:

mnemonic  op1  op2  po
ADD       Eb   Gb   00
ADD       Ev   Gv   01
ADD       Gb   Eb   02
ADD       Gv   Ev   03
ADD       AL   Ib   04
ADD       eAX  Iv   05
ADD       Eb   Ib   80
ADD       Ev   Iv   81

Take note of opcode 0x03 (ADD Gv, Ev) in the table as well. I'll cover that later.

Now, same as usual, we start off with the opcode byte of 0x01:

+------+
| 0x01 |
+------+
  ^ First byte is the opcode.

When the processor sees the opcode 0x01, it would then expect the ModR/M byte to come next. The ModR/M byte allows us to specify the register operands of instructions that take E and G operands. https://wiki.osdev.org/X86-64_Instruction_Encoding has this to say about the format of the ModR/M byte:

7                               0
+---+---+---+---+---+---+---+---+
|  mod  |    reg    |     rm    |
+---+---+---+---+---+---+---+---+
Field      Length  Description
MODRM.mod  2 bits  When this field is b11, then register-direct addressing mode is used; otherwise register-indirect addressing mode is used.
MODRM.reg  3 bits  This field can have one of two values:
                     • A 3-bit opcode extension, which is used by some instructions but has no further meaning other than distinguishing the instruction from other instructions. (We won't be using this)
                     • A 3-bit register reference, which can be used as the source or the destination of an instruction (depending on the instruction). (This specifies the G operand)
MODRM.rm   3 bits  Specifies a direct or indirect register operand, optionally with a displacement. (This specifies the E operand)

As such, for the instruction ADD ecx, esi, our values for the fields of the ModR/M byte are:

  • MODRM.mod = b11 (since we are using register-direct addressing mode)
  • MODRM.reg = b110 (esi, which is register 6)
  • MODRM.rm = b001 (ecx, which is register 1)

You might notice that the description in the table above also mentions register-indirect addressing mode. The ModR/M byte, along with the SIB byte, can be used to encode an ADD instruction where an operand is loaded from memory (e.g. ADD ecx, [esp + 0x4]), but we won't cover it here.

So, putting the bits together we get the byte b11110001 = 0xf1. Let's add it after our opcode:

+------+------+
| 0x01 | 0xf1 |
+------+------+
         ^ ModR/M byte
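
As a rough sketch, the bit packing above comes down to two shifts and two ORs. The function name modrm and the constants ECX and ESI are just names chosen for this example.

```python
def modrm(mod, reg, rm):
    """Pack the ModR/M fields into one byte: mod (2 bits), reg (3), rm (3)."""
    assert 0 <= mod <= 3 and 0 <= reg <= 7 and 0 <= rm <= 7
    return (mod << 6) | (reg << 3) | rm

ECX, ESI = 1, 6  # x86 register numbers

# ADD ecx, esi with opcode 0x01 (ADD Ev, Gv): G = esi goes in reg, E = ecx in rm.
encoding = bytes([0x01, modrm(0b11, ESI, ECX)])
print(encoding.hex())  # 01f1
```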

And run it through our disassembler:

$ echo -n -e \\x01\\xf1 >foo.bin
$ ./disasm.sh foo.bin

foo.bin:     file format binary


Disassembly of section .data:

0000000000000000 <.data>:
   0:	01 f1                	add    ecx,esi

And everything looks about right!

Opcode 0x01 vs 0x03

While in this example we used opcode 0x01, you might notice that opcode 0x03 (the ADD Gv, Ev row in the table above) also takes Ev and Gv as operands, just with their positions swapped. With opcode 0x01, Ev is operand 1 and Gv is operand 2, but with opcode 0x03, Ev is operand 2 and Gv is operand 1.

So, can you guess what happens if we used opcode 0x03 instead of opcode 0x01?

In fact, we can actually use both opcodes 0x01 and 0x03 to implement the same instruction (add ecx, esi)!
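
To make this concrete, here is a small sketch (the modrm helper is a made-up name for this example) that builds both encodings. With opcode 0x03 the destination ecx moves into the reg field and the source esi into the rm field, so the ModR/M byte changes from 0xf1 to 0xce. Feeding 03 ce to disasm.sh is an easy way to check that it also decodes as add ecx, esi.

```python
def modrm(mod, reg, rm):
    """Pack the ModR/M fields into one byte: mod (2 bits), reg (3), rm (3)."""
    return (mod << 6) | (reg << 3) | rm

ECX, ESI = 1, 6  # x86 register numbers

# Opcode 0x01 = ADD Ev, Gv: destination ecx is E (rm), source esi is G (reg).
form_01 = bytes([0x01, modrm(0b11, ESI, ECX)])

# Opcode 0x03 = ADD Gv, Ev: destination ecx is G (reg), source esi is E (rm).
form_03 = bytes([0x03, modrm(0b11, ECX, ESI)])

print(form_01.hex(), form_03.hex())  # 01f1 03ce
```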

Moving on to 64-bit ADD

Now, let's move on to see how x86-64 instructions can be encoded.

x86-64 added 8 new registers, for a total of 16 registers. All 16 registers were also extended to be 64 bits wide. The full list of registers is: rax, rcx, rdx, rbx, rsp, rbp, rsi, rdi, r8, r9, r10, r11, r12, r13, r14, r15. These are internally numbered 0 to 15 respectively.

All eight 32-bit x86 registers still exist in x86-64, but they now map to the lower 32 bits of their respective 64-bit registers, i.e. eax maps to the lower 32 bits of rax, ecx maps to the lower 32 bits of rcx, etc.

Also, in x86-64, the default operand size is the same as in x86, which is 32 bits. This means that the exact same opcodes and instruction encodings that we covered above can be used in x86-64 to mean the same thing (32-bit addition).

So, for example, 0x01 0xf1, which means add ecx, esi (which we covered above), will take the lower 32 bits of rsi, add it to the lower 32 bits of rcx, and write the result to the lower 32 bits of rcx. The upper 32 bits of rcx will be zeroed.
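
The effect on the register values can be modelled with plain integer arithmetic. The sketch below is only a model of the register result, not of the flags, and the function name add32_in_64bit_mode is made up for this example.

```python
MASK32 = 0xFFFFFFFF

def add32_in_64bit_mode(rcx, rsi):
    """Model `add ecx, esi` on 64-bit register values; returns the new rcx."""
    # Only the low 32 bits take part in the addition. The 32-bit result
    # wraps around on overflow and is zero-extended into the full register.
    return ((rcx & MASK32) + (rsi & MASK32)) & MASK32

rcx = 0xDEADBEEF_00000001  # the upper half is discarded by the 32-bit add
rsi = 0x00000000_00000002
print(hex(add32_in_64bit_mode(rcx, rsi)))  # 0x3
```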

That's nice (it means the 32-bit encodings we covered above carry over unchanged), but what happens if we want to do 64-bit addition that operates on 64-bit registers instead?

Encoding ADD rcx, rsi with the REX prefix byte

To turn the encoding of ADD ecx, esi into the encoding of ADD rcx, rsi, we need to tell the processor to run the ADD instruction with a 64-bit operand size. This can be done with a REX prefix byte added in front of the instruction.

The REX prefix byte has the following format (Source):

7                               0
+---+---+---+---+---+---+---+---+
| 0   1   0   0 | W | R | X | B |
+---+---+---+---+---+---+---+---+
Field  Length  Description
0100   4 bits  Fixed bit pattern
W      1 bit   When 1, a 64-bit operand size is used. Otherwise, when 0, the default operand size is used (which is 32-bit for most but not all instructions)
R      1 bit   This 1-bit value is an extension to the MODRM.reg field.
X      1 bit   This 1-bit value is an extension to the SIB.index field.
B      1 bit   This 1-bit value is an extension to the MODRM.rm field or the SIB.base field.

As we can see in the table above, the W field is what we want. So, let's set it to 1 and the rest of the fields to 0. The resulting bit pattern is b01001000, which is 0x48. So, we add the byte 0x48 in front of our previous encoding of ADD ecx, esi (0x01 0xf1):

+------+ +------+------+
| 0x48 | | 0x01 | 0xf1 |
+------+ +------+------+
  ^ REX byte  ^ Encoding of ADD ecx,esi
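
Building the REX byte is again just bit packing. A minimal sketch, with rex being a hypothetical helper name for this example:

```python
def rex(w=0, r=0, x=0, b=0):
    """Build a REX prefix: fixed high nibble 0100, then the W, R, X, B bits."""
    assert all(bit in (0, 1) for bit in (w, r, x, b))
    return 0x40 | (w << 3) | (r << 2) | (x << 1) | b

print(hex(rex(w=1)))  # 0x48

# Prepend it to the 32-bit encoding of ADD ecx, esi to get ADD rcx, rsi.
print(bytes([rex(w=1), 0x01, 0xF1]).hex())  # 4801f1
```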

And run it through our disassembler:

$ echo -n -e \\x48\\x01\\xf1 >foo.bin
$ ./disasm.sh foo.bin

foo.bin:     file format binary


Disassembly of section .data:

0000000000000000 <.data>:
   0:	48 01 f1             	add    rcx,rsi

And we get add rcx, rsi now! Great!

But wait, doesn't x86-64 have 16 registers which go from 0 to 15? Yet, the ModR/M byte only allocates 3 bits each for us to specify register operands. So, how can we specify registers 8 to 15?

As mentioned in the REX byte table above, the R and B fields can be used.

Encoding ADD rcx, r9

r9 has a register number of 9 which is b1001 in binary. However, MODRM.reg is only 3 bits long. As such, we encode the lower 3-bits in the MODRM.reg field, and the upper 1-bit in the R field.

As such, our final REX byte configuration is:

  • REX.W = b1 (since we still want to do a 64-bit ADD)
  • REX.R = b1 (upper 1-bit of b1001 (r9))
  • REX.X = b0
  • REX.B = b0 (upper 1-bit of b0001 (rcx))

And our final MODRM byte configuration is:

  • MODRM.mod = b11 (since we are still using register-direct addressing mode)
  • MODRM.reg = b001 (lower 3-bits of b1001 (r9))
  • MODRM.rm = b001 (lower 3-bits of b0001 (rcx))

As such, our final REX byte is b01001100 = 0x4c and our final MODRM byte is b11001001 = 0xc9. We are still using opcode 0x01 as usual. As such, our encoding is:

+------+------+------+
| 0x4c | 0x01 | 0xc9 |
+------+------+------+
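
Putting the pieces together, a register-to-register 64-bit ADD for any pair of the 16 registers can be sketched as below. This is an illustrative sketch, not the encoder used in CIR; the names REGS64 and encode_add_r64_r64 are made up, and it only covers the register-direct form with opcode 0x01.

```python
# Register names in the order of their internal numbers 0 to 15.
REGS64 = ["rax", "rcx", "rdx", "rbx", "rsp", "rbp", "rsi", "rdi",
          "r8", "r9", "r10", "r11", "r12", "r13", "r14", "r15"]

def encode_add_r64_r64(dst, src):
    """Encode ADD dst, src (64-bit, register-direct) using opcode 0x01."""
    d = REGS64.index(dst)  # E operand: MODRM.rm, extended by REX.B
    s = REGS64.index(src)  # G operand: MODRM.reg, extended by REX.R
    # REX = 0100 W R X B, with W=1 for 64-bit operand size and X=0.
    rex = 0x48 | ((s >> 3) << 2) | (d >> 3)
    # mod = b11 (register-direct), then the low 3 bits of each register number.
    modrm = (0b11 << 6) | ((s & 7) << 3) | (d & 7)
    return bytes([rex, 0x01, modrm])

print(encode_add_r64_r64("rcx", "r9").hex())   # 4c01c9
print(encode_add_r64_r64("rcx", "rsi").hex())  # 4801f1
```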

Running it through the disassembler:

$ echo -n -e \\x4c\\x01\\xc9 >foo.bin
$ ./disasm.sh foo.bin

foo.bin:     file format binary


Disassembly of section .data:

0000000000000000 <.data>:
   0:	4c 01 c9             	add    rcx,r9

Success!

Closing Words

This post is still kind of under construction. In the future I would like to add some more info on encoding the SIB byte. However, hopefully these worked examples are enough for readers to extrapolate and encode their own instructions using information from http://ref.x86asm.net/index.html.