Calling external functions in C, and calling C functions from other languages, is a common issue in OS programming, especially where the other language is assembly. This page will concentrate primarily on the latter case, but some consideration is made for other languages as well.
Some of what is described here is imposed by the x86 architecture, and some is specific to the GCC toolchain. Some of it is configurable, and you could build your own GCC target to support a different calling convention. Currently, this page makes no effort to differentiate which is which.
Basics
As a general rule, a function which follows the C calling conventions, and is appropriately declared (see below) in the C headers, can be called as a normal C function. Most of the burden for following the calling rules falls upon the assembly program.
Cheat Sheets
Here is a quick overview of common calling conventions. Note that the calling conventions are usually more complex than represented here (for instance, how is a large struct returned? How about a struct that fits in two registers? How about va_lists?). Look up the specifications if you want to be certain. It may be useful to write a test function and compile it with gcc -S to see what code the compiler generates, which may give a hint of how the calling convention specification should be interpreted.
Platform | Return Value | Parameter Registers | Additional Parameters | Stack Alignment | Scratch Registers | Preserved Registers | Call List |
---|---|---|---|---|---|---|---|
System V i386 | eax, edx | none | stack (right to left) (Note 1) | | eax, ecx, edx | ebx, esi, edi, ebp, esp | ebp |
System V x86_64 (Note 2) | rax, rdx | rdi, rsi, rdx, rcx, r8, r9 | stack (right to left) (Note 1) | 16-byte at call (Note 3) | rax, rdi, rsi, rdx, rcx, r8, r9, r10, r11 | rbx, rsp, rbp, r12, r13, r14, r15 | rbp |
Microsoft x64 | rax | rcx, rdx, r8, r9 | stack (right to left) (Note 1) | 16-byte at call (Note 3) | rax, rcx, rdx, r8, r9, r10, r11 | rbx, rdi, rsi, rsp, rbp, r12, r13, r14, r15 | rbp |
ARM | r0, r1 | r0, r1, r2, r3 | stack | 8-byte (Note 4) | r0, r1, r2, r3, r12 | r4, r5, r6, r7, r8, r9, r10, r11, r13, r14 | |
Note 1: The called function is allowed to modify the arguments on the stack and the caller must not assume the stack parameters are preserved. The caller should clean up the stack.
Note 2: There is a 128-byte area below the stack pointer called the 'red zone', which may be used by leaf functions without adjusting %rsp. This requires the kernel to skip an additional 128 bytes below %rsp when delivering signals in user-space. This is not done by the CPU: if interrupts use the current stack (as with kernel code), and the red zone is enabled (the default), then interrupts will silently corrupt the stack. Always pass -mno-red-zone when compiling kernel code (including support libraries such as a libc embedded in the kernel) if interrupts don't respect the red zone.
Note 3: Stack is 16 byte aligned at time of call. The call pushes %rip, so the stack is 16-byte aligned again if the callee pushes %rbp.
Note 4: Stack is 8 byte aligned at all times outside of prologue/epilogue of function.
System V ABI
- Main article: System V ABI
The System V ABI is one of the major ABIs in use today and is virtually universal among Unix systems. It is the calling convention used by toolchains such as i686-elf-gcc and x86_64-elf-gcc.
External References
In order to call a foreign function from C, it must have a correct C prototype. Thus, if the function fee() takes the arguments fie, foe, and fum, in C calling order, and returns an integer value, then the corresponding header file must declare a prototype for fee() with exactly those parameters and an int return type.
Similarly, any global variables defined in the assembly code must be declared extern in the C code.
C functions used from assembly or other languages must be declared as appropriate for that language. For example, in NASM, a C function foo() would be declared with the extern directive (extern foo).
Also, in most assembly languages, a function or variable that is to be exported must be declared global (in NASM, with the global directive).
Name Mangling
In some object formats (a.out), the name of a C function is automagically mangled by prepending it with an underscore ('_'). Thus, to call a C function foo() in assembly with such a format, you must define it as extern _foo instead of extern foo. This requirement does not apply to most modern formats such as COFF, PE, and ELF.
C++ name mangling is much more severe, as the C++ compiler encodes the type information from the parameter list into the symbol. (This is what enables function overloading in C++ in the first place.) The Binutils package contains the tool c++filt that can be used to determine the correct mangled name.
Registers
The general registers EBX, ESI, EDI, and EBP, and the segment registers DS, ES, and SS, must be preserved by the called function. If you use them, you must save them first and restore them afterwards. Conversely, EAX and EDX are used for return values, and thus should not be preserved. The other registers do not need to be saved by the called function, but if they are in use by the calling function, then the calling function should save them before the call is made and restore them afterwards.
Passing Function Arguments
GCC/x86 passes function arguments on the stack. These arguments are pushed in reverse order from their order in the argument list. Furthermore, since the x86 protected-mode stack operations operate on 32-bit values, every argument occupies a whole number of 32-bit slots, even if the actual value is smaller than 32 bits. Thus, for a function foo() taking the arguments bar, baz, and quux in that order, the value of quux (a 64-bit FP value) is pushed first as two 32-bit values, high 32 bits first so that the low 32 bits end up at the lower address; the value of baz is pushed as the low byte of a 32-bit value; and finally bar is pushed as a 32-bit value.
To pass arguments to a C function, the calling function must push the argument values as described above. Thus, to call foo() from a NASM assembly program, you would push quux, baz, and bar in that order, execute CALL foo, and then remove the 16 bytes of arguments from the stack again, since cleaning up the stack is the caller's job (see Note 1).
Accessing Function Arguments
In the GCC/x86 C calling convention, the first thing any function that accepts formal arguments should do is push the value of EBP (the frame base pointer of the calling function), then copy the value of ESP to EBP. This sets the function's own frame pointer, which is used to track both the arguments and (in C, or in any properly reentrant assembly code) the local variables.
To access arguments passed by a C function, address them relative to EBP with an offset of 4 * (n + 2), where n is the zero-indexed position of the parameter in the argument list (not the order in which it was pushed). The + 2 accounts for the calling function's saved frame pointer and the return pointer (pushed automatically by CALL, and popped by RET). The formula assumes that every preceding argument occupies a single 32-bit slot.
Thus, in function fee, fie is read from [EBP + 8], foe from the byte at [EBP + 12], and fum from the two 32-bit slots at [EBP + 16] (low half) and [EBP + 20] (high half); in NASM, each is loaded into a register with a MOV from the corresponding address.
As stated earlier, return values in GCC are passed using EAX and EDX. If a value exceeds 64 bits, it must be passed as a pointer.
The cheat sheet is intended for 32-bit Windows programming with FASM. One A4 page contains almost all general-purpose x86 instructions (except FPU, MMX and SSE instructions).
What is included
You will find various kinds of moves (MOV, CMOV, XCHG), arithmetical (ADD, SUB, MUL, DIV) and logical (AND, OR, XOR, NOT) instructions here. Several charts illustrate shifts (SHL/SHR, ROL/ROR, RCL/RCR) and stack frames. Code samples for typical high-level language constructs (if conditions, while and for loops, switches, function calls) are shown. Also included are quick references for RDTSC and CPUID instructions, description of string operations such as REP MOVSB, some code patterns for branchless conditions, a list of registers that should be saved in functions, and a lot of other useful stuff.
The idea is to put all reference information about x86 assembly language on one page. Some rarely used instructions such as LDS, BOUND, or AAA are skipped.
Notation
The cheat sheet uses common notation for operands: reg means register, [mem] means memory location, and imm is an immediate operand. Also, x, y, and z denote the first, the second, and the third operand. Instruction mnemonics are written in capital letters to make them easier to find when you are skimming through the cheat sheet.
Example
For example, let's look at the multiplication and division section. There are instructions for signed (IMUL) and unsigned (MUL) multiplication. Both instructions take one operand, which may be a register (reg) or a memory location ([mem]). There are three possible cases:
- If operand size is one byte, MUL or IMUL multiplies it by al and stores the result in ax.
- If operand size is a word, MUL or IMUL multiplies it by ax and stores the high-order word of the result in dx and the low-order word in ax.
- If operand size is a double word, MUL or IMUL multiplies it by eax and stores the high-order dword in edx and the low-order dword in eax.
There are also two-operand and three-operand forms of IMUL, which are shown on the cheat sheet.
Other features of assembly language are described in a similar way.
Download
The cheat sheet is designed for A4 page size; if you print it on US Letter paper, you will get large margins. You can print the cheat sheet and keep it on your desk to look up instructions when you forget them.
Serbo-Croatian translation of this article by WHG Team.