4. Writing Inline Assembly
In embedded programming, there are occasions when you need to get right down to the machine level. Inline assembly is often used for this purpose. This section provides a primer on writing inline assembly code and integrating it with your C programs.
1. Basic Assembly Instructions
Before diving into inline assembly, it's useful to know some basic assembly instructions. The table below summarizes some of the most commonly used ARM Cortex-M4 assembly instructions:
Instruction | Description | Example | Plain English Description |
---|---|---|---|
`MOV` | Moves value between registers | `MOV R0, R1` | Move the value in R1 to R0 |
`ADD` | Adds values | `ADD R0, R0, #1` | Add 1 to the value in R0 |
`SUB` | Subtracts values | `SUB R0, R0, #1` | Subtract 1 from the value in R0 |
`MUL` | Multiplies values | `MUL R0, R1, R2` | Multiply R1 and R2, store result in R0 |
`BL` | Branch with Link (calls a function) | `BL my_function` | Call the function my_function |
`BX` | Branch and exchange (return from function) | `BX LR` | Return from the function |
`LDR` | Load from memory | `LDR R0, [R1]` | Load value at address in R1 into R0 |
`STR` | Store to memory | `STR R0, [R1]` | Store the value in R0 at address in R1 |
`CMP` | Compares values and sets the condition flags | `CMP R0, #0` | Compare value in R0 to 0 |
`BNE` | Branch if Not Equal | `BNE loop_start` | If the preceding comparison was not equal (here, R0 is not 0), jump to loop_start |
`BEQ` | Branch if Equal | `BEQ exit_loop` | If the preceding comparison was equal (here, R0 is 0), jump to exit_loop |
`B` | Unconditional Branch | `B loop_end` | Jump to loop_end unconditionally |
`PUSH` | Push register onto stack | `PUSH {R0}` | Push R0 onto the stack |
`POP` | Pop value from stack into register | `POP {R0}` | Pop top of stack into R0 |
`NOP` | No Operation | `NOP` | Do nothing; execution continues with the next instruction |
`AND` | Bitwise AND | `AND R0, R0, R1` | Perform AND on R0 and R1, store in R0 |
`ORR` | Bitwise OR | `ORR R0, R0, R1` | Perform OR on R0 and R1, store in R0 |
`EOR` | Bitwise Exclusive OR | `EOR R0, R0, R1` | Perform XOR on R0 and R1, store in R0 |
`LSL` | Logical Shift Left | `LSL R0, R0, #2` | Shift R0 left by 2 bits |
`LSR` | Logical Shift Right | `LSR R0, R0, #2` | Shift R0 right by 2 bits |
Please refer to Chapter 3 ("The Cortex-M4 Instruction Set") of the ARM Cortex-M4 Generic User Guide for a complete list of instructions.
Square Brackets
Note that the square brackets `[ ]` are used to denote memory access, specifically dereferencing the address stored in a register.
- `R1`: When you see just the register (e.g., `R1`), it refers to the value stored in that register.
- `[R1]`: When you see a register inside square brackets (e.g., `[R1]`), it means you're working with the value stored in memory at the address that is in `R1`.

For example:
- `LDR R0, [R1]`: This instruction will load into `R0` the value stored in memory at the address contained in `R1`.
- `STR R0, [R1]`: This instruction will store the value in `R0` into the memory location whose address is stored in `R1`.
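For readers coming from C, a rough analogy (not an exact equivalence) is that a register holding an address behaves like a pointer, and the square brackets behave like the `*` operator. The helper names below are hypothetical and exist only to illustrate the idea.

```c
#include <stdint.h>

/* Roughly what LDR R0, [R1] does: r1 holds an address, and the result is
 * the 32-bit word stored at that address. */
static uint32_t load_word(const volatile uint32_t *r1)
{
    return *r1;
}

/* Roughly what STR R0, [R1] does: write the value r0 to the address in r1. */
static void store_word(volatile uint32_t *r1, uint32_t r0)
{
    *r1 = r0;
}
```

Passing the address of an ordinary variable to `load_word` returns that variable's value, and `store_word` overwrites it.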
Curly Braces
Curly braces `{}` in ARM assembly are typically used for register lists, especially in instructions that work with multiple registers at once. These braces can be used to specify a range or a list of registers.

For example:
- `PUSH {R0, R1, R2}`: This will push the contents of registers R0, R1, and R2 onto the stack.
- `POP {R0, R1, R2}`: This will pop the top values from the stack into registers R0, R1, and R2.

You can also specify ranges:
- `PUSH {R0-R3}`: This will push R0, R1, R2, and R3 onto the stack.
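To make the effect of a register list concrete, here is a toy model of a descending stack in C. It is only a sketch: `ToyStack`, `toy_push`, and `toy_pop` are made-up names, the "stack pointer" is an array index rather than a real address, and the layout assumes the usual Cortex-M behaviour where the lowest-numbered register ends up at the lowest address.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define STACK_WORDS 16u

typedef struct {
    uint32_t mem[STACK_WORDS];
    size_t sp;                  /* index of the current top of stack */
} ToyStack;

static void toy_stack_init(ToyStack *s)
{
    s->sp = STACK_WORDS;        /* empty: SP starts just past the end */
}

/* PUSH {list}: move SP down by n words, then store the registers so that
 * the lowest-numbered register lands at the lowest address. */
static void toy_push(ToyStack *s, const uint32_t *regs, size_t n)
{
    assert(n <= s->sp);
    s->sp -= n;
    for (size_t i = 0; i < n; i++) {
        s->mem[s->sp + i] = regs[i];
    }
}

/* POP {list}: load the registers back in the same order, then move SP up. */
static void toy_pop(ToyStack *s, uint32_t *regs, size_t n)
{
    assert(s->sp + n <= STACK_WORDS);
    for (size_t i = 0; i < n; i++) {
        regs[i] = s->mem[s->sp + i];
    }
    s->sp += n;
}
```

Pushing three values and popping three values with the same list returns them to the same registers and leaves the stack pointer where it started.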
In ARM assembly, the `#` symbol is used to indicate an immediate value, which is a constant that is encoded directly in the instruction.

`#`:
- Example: `MOV R0, #4`
- In this example, the immediate value `4` is loaded directly into the register `R0`.

`=`:
- Writing `=` before a constant in an `LDR` instruction selects the `LDR Rd, =constant` pseudo-instruction. The `=` tells the assembler to generate whatever is needed to load the constant, even if the value cannot be encoded as an immediate. The documented form places the constant directly after the `=`, with no `#`; the toolchain used for the listings in this section also accepts `=#`, as the later examples show.
- Example: `LDR R1, =0x12345678`
- In this case, `0x12345678` cannot be encoded as an immediate operand, so the assembler stores the constant in a literal pool and generates a PC-relative `LDR` instruction to load it into `R1`.
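Whether a constant "fits" depends on the instruction encoding rather than simply on its size. The sketch below checks a value against the Thumb-2 "modified immediate" patterns as I understand them from the ARMv7-M architecture documentation (an 8-bit value shifted left, or one of three byte-replication patterns); treat it as an illustration and confirm against the manual. The function name is hypothetical, and a real assembler may also choose other encodings such as `MOVW` or `MVN` before falling back to a literal pool.

```c
#include <stdbool.h>
#include <stdint.h>

/* Returns true if v matches one of the Thumb-2 "modified immediate" forms. */
static bool is_thumb2_modified_imm(uint32_t v)
{
    uint32_t lo = v & 0xFFu;         /* byte used by 0x000000XY, 0x00XY00XY, 0xXYXYXYXY */
    uint32_t hi = (v >> 8) & 0xFFu;  /* byte used by 0xXY00XY00 */

    if (v == lo) return true;                        /* 0x000000XY */
    if (v == (lo | (lo << 16))) return true;         /* 0x00XY00XY */
    if (v == ((hi << 8) | (hi << 24))) return true;  /* 0xXY00XY00 */
    if (v == lo * 0x01010101u) return true;          /* 0xXYXYXYXY */

    /* An 8-bit value shifted left by 1..24 bits. */
    for (unsigned shift = 1; shift <= 24; shift++) {
        uint32_t base = v >> shift;
        if (base <= 0xFFu && (base << shift) == v) {
            return true;
        }
    }
    return false;
}
```

Under this check, `4` is encodable while `0x12345678` and `0x20001000` are not, which is consistent with the assembler placing such constants in a literal pool.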
2. Inline Assembly: Basic Syntax
To include assembly code in a C program, you can use the `__asm volatile` construct. The `volatile` keyword tells the compiler not to optimize out the assembly instructions, which is crucial when we're doing low-level hardware manipulations.
Basic Example
You can write each instruction in its own `__asm` statement:
__asm volatile("LDR R1,=#0x20001000");
__asm volatile("LDR R2,=#0x20001004");
__asm volatile("LDR R0,[R1]");
__asm volatile("LDR R1,[R2]");
__asm volatile("ADD R0,R0,R1");
__asm volatile("STR R0,[R2]");
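The purpose of this code is to read two values from specific memory locations, add them together, and then store the result back into one of those memory locations. Note that these statements name R0, R1, and R2 directly and declare no operands or clobbers, so the compiler does not know that the registers are modified. That is tolerable in a tiny demonstration like this one, but it is unsafe in code where the compiler is also using those registers.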
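For comparison, the same data flow can be written in plain C. This is only a sketch to show what the six instructions compute; the function name is hypothetical, and the comments map each line back to the assembly.

```c
#include <stdint.h>

/* Same data flow as the inline assembly: *b = *a + *b. */
static void add_into_second(const volatile uint32_t *a, volatile uint32_t *b)
{
    uint32_t r0 = *a;   /* LDR R0,[R1] */
    uint32_t r1 = *b;   /* LDR R1,[R2] */
    *b = r0 + r1;       /* ADD R0,R0,R1 then STR R0,[R2] */
}
```

On the target, the call would be `add_into_second((volatile uint32_t *)0x20001000u, (volatile uint32_t *)0x20001004u);`. Off the target it can be exercised with the addresses of two ordinary variables.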
Another way to write the inline assembly is to put all the instructions in a single `__asm` statement, with each instruction terminated by `\n\t`:
__asm volatile("LDR R1,=#0x20001000\n\t"
"LDR R2,=#0x20001004\n\t"
"LDR R0,[R1]\n\t"
"LDR R1,[R2]\n\t"
"ADD R0,R0,R1\n\t"
"STR R0,[R2]\n\t");
Note the use of `\n\t` to separate the assembly instructions. The compiler concatenates the adjacent string literals into one string and passes it to the assembler as written, so the `\n` puts each instruction on its own line and the `\t` indents it in the generated assembly. If you omit the `\n`, the assembler sees all the instructions run together on a single line and will typically reject it with a syntax error.
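The role of `\n\t` follows from how C treats adjacent string literals: they are joined into one string before the assembler ever sees them. The small sketch below (with hypothetical names) shows the difference by counting line breaks in the joined text.

```c
#include <stddef.h>

/* Joined by the compiler into: "LDR R0,[R1]\n\tADD R0,R0,R1\n\t" */
static const char with_separators[] = "LDR R0,[R1]\n\t"
                                      "ADD R0,R0,R1\n\t";

/* Joined by the compiler into: "LDR R0,[R1]ADD R0,R0,R1" */
static const char without_separators[] = "LDR R0,[R1]"
                                         "ADD R0,R0,R1";

/* Counts the line breaks the assembler would see in the template text. */
static size_t count_newlines(const char *text)
{
    size_t n = 0;
    for (; *text != '\0'; text++) {
        if (*text == '\n') {
            n++;
        }
    }
    return n;
}
```

The first string contains two line breaks, one per instruction, while the second contains none, so both instructions would land on a single line.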
Build the code and check the disassembly (located at `./Debug/002InlineAssembly.list`) to see how the compiler has translated your inline assembly code into machine code:
08000204 <main>:
*/
#include <stdint.h>
int main(void)
{
8000204: b480 push {r7}
8000206: af00 add r7, sp, #0
__asm volatile("LDR R1,=#0x20001000");
8000208: 4903 ldr r1, [pc, #12] ; (8000218 <main+0x14>)
__asm volatile("LDR R2,=#0x20001004");
800020a: 4a04 ldr r2, [pc, #16] ; (800021c <main+0x18>)
__asm volatile("LDR R0,[R1]");
800020c: 6808 ldr r0, [r1, #0]
__asm volatile("LDR R1,[R2]");
800020e: 6811 ldr r1, [r2, #0]
__asm volatile("ADD R0,R0,R1");
8000210: 4408 add r0, r1
__asm volatile("STR R0,[R2]");
8000212: 6010 str r0, [r2, #0]
for(;;);
8000214: e7fe b.n 8000214 <main+0x10>
8000216: 0000 .short 0x0000
8000218: 20001000 .word 0x20001000
800021c: 20001004 .word 0x20001004
As you can see, your assembly code has been translated into machine code, and the two address constants have been placed in a literal pool at `0x8000218` and `0x800021c`, right after the function. To see the code in action, you can use the debugger to step through the assembly instructions one by one. First open the memory window and set the words at addresses `0x20001000` and `0x20001004` to some values.
Now, let's step through the assembly instructions while monitoring the registers window. After the final `STR`, the word at `0x20001004` should hold the sum of the two values you entered.
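In the listing above, each `ldr rN, [pc, #imm]` reads its constant from the literal pool. In Thumb state the PC value used for this calculation is the instruction's address plus 4, rounded down to a multiple of 4. The helper below (a hypothetical name, written only to check the arithmetic) reproduces the addresses shown in the disassembly comments.

```c
#include <stdint.h>

/* Address read by a Thumb "LDR Rt, [PC, #imm]" located at instr_addr. */
static uint32_t pc_relative_literal_address(uint32_t instr_addr, uint32_t imm)
{
    uint32_t pc = instr_addr + 4u;      /* PC reads as instruction address + 4 */
    return (pc & ~UINT32_C(3)) + imm;   /* align down to 4, then add the offset */
}
```

For the instruction at `0x8000208` with offset 12 this gives `0x8000218`, and for the one at `0x800020a` with offset 16 it gives `0x800021c`, matching the two `.word` entries.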
3. Inline Assembly: Input and Output
Instead of hard-coding values into the assembly instructions, you can pass them in as input parameters. You can also return values from the assembly code as output parameters.
Syntax
The basic syntax for `__asm volatile` with input and output parameters looks like this:
__asm volatile (
"assembly code"
: "constraint_for_output" (output_var)
: "constraint_for_input" (input_var)
);
- assembly code: This is where you place the assembly instructions.
- constraint_for_output: Specifies the type or constraint of the output operand(s). This is a string that tells the compiler how the operand should be used.
- output_var: The C variable that will hold the output value.
- constraint_for_input: Similar to the output constraint, it specifies the type or constraint of the input operand(s).
- input_var: The C variable that will serve as the input value.
Constraints
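You need to define constraints that specify how the variables will be used within the assembly code. A constraint tells the compiler where an operand may live (a register, memory, or an immediate) and whether the assembly reads it, writes it, or both. Some common constraints are: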
Input Constraints
Constraint | Description |
---|---|
`"r"` | Operand should be stored in a general-purpose register |
`"m"` | Operand is a memory operand |
`"i"` | Immediate integer operand with a known value |
`"g"` | Operand can be a register, memory location, or immediate integer |
Output Constraints
Constraint | Description |
---|---|
`"=r"` | Write-only operand stored in a general-purpose register |
`"=m"` | Write-only operand as a memory operand |
`"+r"` | Read-write operand in a general-purpose register |
`"+m"` | Read-write operand in a memory location |
These tables are not exhaustive but cover some of the most commonly used constraints for inline assembly in GCC. Constraints can also be more specific, depending on what you're trying to do and which assembly language you're working with.
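As a small illustration of a read-write constraint, the sketch below uses `"+r"` so that one C variable serves as both the input and the output of an `ADD`. The function name is hypothetical, and the `#if` guard with a plain-C fallback is an assumption added so the example also builds when it is not compiled for a 32-bit ARM target.

```c
#include <stdint.h>

static inline uint32_t increment(uint32_t value)
{
#if defined(__arm__)
    /* "+r": value is read by the instruction and then overwritten by it. */
    __asm volatile ("ADD %0, %0, #1" : "+r" (value));
#else
    value += 1u;    /* portable fallback for non-ARM builds */
#endif
    return value;
}
```

Because the operand is declared read-write, no separate input entry is needed; listing the variable once under the outputs is enough.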
Example
Here's a simple example:
int foo = 10, result;
__asm volatile (
"ADD %0, %1, %2"
: "=r" (result) // Output
: "r" (foo), "i" (20) // Inputs
);
In this example, the assembly `ADD` instruction adds the contents of `foo` and `20` and stores the result in `result`. `%0`, `%1`, and `%2` are placeholders for the operands, numbered in the order they are listed, with outputs first: `%0` refers to `result`, `%1` refers to `foo`, and `%2` refers to the constant `20`.
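Positional placeholders become harder to follow as the operand list grows. GCC also accepts symbolic operand names written as `%[name]`, which the sketch below uses for the same addition. The function name `add_twenty` and the operand names are hypothetical, and the plain-C fallback is an assumption added so the example builds on non-ARM hosts.

```c
#include <stdint.h>

static inline int32_t add_twenty(int32_t foo)
{
    int32_t result;
#if defined(__arm__)
    __asm volatile (
        "ADD %[sum], %[value], %[constant]"
        : [sum] "=r" (result)                       /* Output */
        : [value] "r" (foo), [constant] "i" (20)    /* Inputs */
    );
#else
    result = foo + 20;  /* portable fallback for non-ARM builds */
#endif
    return result;
}
```

With names in place, operands can be reordered or added without renumbering the template.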
We can again build and check the disassembly to see how the compiler has translated our inline assembly code into machine code:
int foo = 10, result;
8000216: 230a movs r3, #10
8000218: 607b str r3, [r7, #4]
__asm volatile (
800021a: 687b ldr r3, [r7, #4]
800021c: f103 0314 add.w r3, r3, #20
8000220: 603b str r3, [r7, #0]
"ADD %0, %1, %2"
: "=r" (result) // Output
: "r" (foo), "i" (20) // Inputs
);
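Here we can see that `foo` and `result` live on the stack at `[r7, #4]` and `[r7, #0]`, respectively. The compiler loads `foo` into `R3` and chose `R3` for both `%0` and `%1`, so the `ADD` instruction is translated into `f103 0314`, which is the machine code for `ADD R3, R3, #20`. The `#20` is the immediate value `20`. The final `str` writes `R3` back to `result`.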
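To connect the two halfwords `f103 0314` with the registers and the immediate, the sketch below pulls the fields out of a 32-bit Thumb-2 `ADD Rd, Rn, #imm` instruction. It reflects my reading of that encoding and deliberately handles only the simplest case, where the immediate is a plain 8-bit value; the type and function names are hypothetical, and special cases such as SP-relative forms are not considered.

```c
#include <stdbool.h>
#include <stdint.h>

typedef struct {
    unsigned rd;    /* destination register number */
    unsigned rn;    /* first source register number */
    uint32_t imm;   /* immediate value */
} AddImm;

/* Field layout assumed here:
 *   hw1: 1 1 1 1 0 i 0 1 0 0 0 S Rn(4)
 *   hw2: 0 imm3(3) Rd(4) imm8(8)
 * Returns false for anything other than the plain 8-bit immediate case. */
static bool decode_add_imm(uint16_t hw1, uint16_t hw2, AddImm *out)
{
    if ((hw1 & 0xFBE0u) != 0xF100u || (hw2 & 0x8000u) != 0u) {
        return false;               /* not this ADD encoding */
    }
    unsigned i = (hw1 >> 10) & 1u;
    unsigned imm3 = (hw2 >> 12) & 7u;
    if (i != 0u || imm3 != 0u) {
        return false;               /* shifted or replicated immediates not handled */
    }
    out->rn = hw1 & 0xFu;
    out->rd = (hw2 >> 8) & 0xFu;
    out->imm = hw2 & 0xFFu;
    return true;
}
```

Feeding it `0xF103` and `0x0314` yields destination register 3, source register 3, and immediate 20, which agrees with the `add.w r3, r3, #20` shown in the listing.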
Clobbers: Notifying the Compiler about Side Effects
When you write inline assembly, the compiler isn't aware of what your assembly code is doing. In some cases, your assembly code may alter the state of certain registers or memory locations that the compiler might otherwise assume are unchanged. This is known as "clobbering."
To handle this, you need to inform the compiler explicitly about any such side effects. This is done using the clobber list, a part of the inline assembly syntax.
Here's the layout again for context:
__asm volatile (
"assembly code"
: "output" (output_var)
: "input" (input_var)
: "clobbered_reg_1", "clobbered_reg_2"
);
Commonly Used Clobber Identifiers
- `"cc"`: Stands for "condition code." If your assembly code changes the flags in the processor's status register, you need to indicate this so the compiler can manage the condition flags appropriately.
- `"memory"`: Tells the compiler that the assembly code performs read/write operations on memory, and that this could affect other variables stored there. This ensures that the compiler will not keep values cached in registers across the assembly code and will reload them after it has executed.
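A common use of the `"memory"` clobber on its own is a compiler barrier: an assembly statement with an empty template that emits no instruction but stops the compiler from moving memory accesses across it. The sketch below uses hypothetical names and works with any GCC-compatible compiler. Note that it constrains only the compiler, not the processor, so it is not a substitute for a hardware barrier instruction.

```c
#include <stdint.h>

/* Writes a value and then raises a flag, keeping the two stores in order
 * as far as the compiler is concerned. */
static void publish(uint32_t *data, uint32_t *ready_flag, uint32_t value)
{
    *data = value;

    /* Empty template plus "memory" clobber: no instruction is generated,
     * but the compiler must finish the store above before this point and
     * may not move the store below ahead of it. */
    __asm volatile ("" : : : "memory");

    *ready_flag = 1u;
}
```

The two empty colons are the empty output and input lists; the clobber list is always the third colon-separated section after the template.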
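Example:
__asm volatile (
"LDR %0, [%1]"
: "=r" (my_output)
: "r" (my_input)
: "cc", "memory"
);
In this example, the template uses `%0` and `%1` so that the load goes through the registers the compiler assigned to `my_output` and `my_input`, rather than hard-coded `R0` and `R1` that the compiler knows nothing about. The clobber list `"cc", "memory"` informs the compiler that the condition codes and memory could be modified by the assembly instructions, even though these changes are not visible through `my_output` and `my_input`. A plain `LDR` does not change the flags, so `"cc"` is only a conservative choice here, while `"memory"` matters because the load reads memory that is not listed as an operand. This ensures that the compiler generates correct, safe code around your inline assembly.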
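The clobber list can also name registers. If the assembly needs a scratch register, listing it tells the compiler not to place any operand or live value there. The sketch below is one possible illustration: the function name is hypothetical, `r3` is an arbitrary choice of scratch register, and the plain-C fallback is an assumption added so the example builds on non-ARM hosts.

```c
#include <stdint.h>

static inline uint32_t load_and_add(const uint32_t *source, uint32_t addend)
{
    uint32_t result;
#if defined(__arm__)
    __asm volatile (
        "LDR r3, [%1]\n\t"      /* r3 = *source (scratch register) */
        "ADD %0, r3, %2\n\t"    /* result = r3 + addend */
        : "=r" (result)
        : "r" (source), "r" (addend)
        : "r3", "memory"        /* r3 is overwritten; *source is read */
    );
#else
    result = *source + addend;  /* portable fallback for non-ARM builds */
#endif
    return result;
}
```

The output is written only by the last instruction, after every input has been read, so the compiler is free to reuse an input register for `%0`.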