In this section, you will learn how the allocator was modified compared to the one used in the Write a Dynamic Memory Allocator learning path.
All the code snippets shown here are from the sources shown in the previous section.
Memory with tag storage is not allocated by the kernel by default. Therefore, the application (the heap in this case) must ask for it specifically.
This heap does that in simple_heap_init.
int got = prctl(
PR_SET_TAGGED_ADDR_CTRL,
PR_TAGGED_ADDR_ENABLE |
PR_MTE_TCF_SYNC |
(0xfffe << PR_MTE_TAG_SHIFT), 0, 0, 0);
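Before requesting tagged memory, however, the heap must configure how memory tagging will behave, which is the purpose of the prctl call above. PR_SET_TAGGED_ADDR_CTRL is the prctl option that configures the “Tagged Address ABI”. With this ABI enabled, the kernel allows us to use tagged addresses when interacting with it.
PR_TAGGED_ADDR_ENABLE enables the ABI; tag checking itself is switched on by the PR_MTE_TCF_SYNC flag described next.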
PR_MTE_TCF_SYNC enables synchronous exceptions from tag mismatches. This means that an exception is reported as it happens, rather than at some later point (for example, the next syscall).
0xfffe << PR_MTE_TAG_SHIFT is a 16-bit bitmask that tells the kernel which memory tag values MTE’s random tag generation instructions, such as irg, are allowed to generate. The value used here means “generate any value apart from 0”.
The reason for this will be explained later.
You do not need to understand assembly for this learning path. Specific instructions are only mentioned for context.
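As a rough illustration of how these flags combine, the sketch below builds the same prctl argument and checks the result. The helper names tag_include_mask_excluding and enable_sync_mte are hypothetical and are not part of the heap sources. The fallback definitions are an assumption added for older headers; their values mirror the Linux linux/prctl.h header.

```c
#include <stdio.h>
#include <sys/prctl.h>

/* Fallbacks for older headers; values mirror <linux/prctl.h>. */
#ifndef PR_SET_TAGGED_ADDR_CTRL
#define PR_SET_TAGGED_ADDR_CTRL 55
#endif
#ifndef PR_TAGGED_ADDR_ENABLE
#define PR_TAGGED_ADDR_ENABLE (1UL << 0)
#endif
#ifndef PR_MTE_TCF_SYNC
#define PR_MTE_TCF_SYNC (1UL << 1)
#endif
#ifndef PR_MTE_TAG_SHIFT
#define PR_MTE_TAG_SHIFT 3
#endif

/* Hypothetical helper: allow all 16 tag values except `excluded`. */
static unsigned long tag_include_mask_excluding(unsigned excluded) {
  return 0xffffUL & ~(1UL << (excluded & 0xfu));
}

/* Hypothetical helper: returns 0 on success, or -1 if the kernel rejects
   the request (for example, when MTE is not available on this system). */
static int enable_sync_mte(void) {
  unsigned long flags = PR_TAGGED_ADDR_ENABLE | PR_MTE_TCF_SYNC |
                        (tag_include_mask_excluding(0) << PR_MTE_TAG_SHIFT);
  if (prctl(PR_SET_TAGGED_ADDR_CTRL, flags, 0UL, 0UL, 0UL) != 0) {
    perror("prctl(PR_SET_TAGGED_ADDR_CTRL)");
    return -1;
  }
  return 0;
}
```

Calling tag_include_mask_excluding(0) yields 0xfffe, the mask used by the heap. Checking the return value matters because the call fails on systems without MTE support.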
Now the kernel is ready for us to ask for tagged memory. We do this using the standard mmap syscall with an extra option, PROT_MTE, as shown below:
storage = mmap(0, STORAGE_SIZE,
PROT_READ | PROT_WRITE |
// Memory should have memory tagging enabled.
PROT_MTE,
MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
Using PROT_MTE means that the memory allocated by the kernel will have allocation tags associated with it.
The kernel promises us that this memory’s allocation tags will always be 0 when first allocated. We will trust this and also use 0 for marking free memory in the allocator. This is why we previously excluded the value 0 from random tag generation.
To allocate memory, we must walk the heap’s memory ranges until we find a free range that is large enough for the request.
Memory tagging changes the details of this process in a few ways.
MTE protects memory in “granules” of 16 bytes. This means that allocations smaller than 16 bytes need to be rounded up to that size (other strategies are possible but are not implemented in this allocator).
Remember that a range header comes before the usable part of the allocation and that the header is 8 bytes in size (see the Header type). So if we called simple_malloc(4), the memory allocated would look like this:
| Header | Allocation | Padding |
| 8 bytes | 4 bytes | 4 bytes |
That makes a total of 16 bytes. The same logic applies no matter the size of the allocation: the total size must always be rounded up to a multiple of 16 bytes.
Those 16-byte chunks must also be aligned to 16 bytes. Two things ensure this: the memory returned by mmap is already suitably aligned (it is page aligned), and we never use the memory in anything smaller than 16-byte chunks.
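The size and alignment rules above can be sketched as follows. This is a minimal illustration, not the code from heap.c: the function names are hypothetical, and HEADER_SIZE is assumed to match the 8-byte Header type described earlier. Overflow for extremely large requests is not handled.

```c
#include <stddef.h>
#include <stdint.h>

#define GRANULE_SIZE 16u /* MTE granule size in bytes. */
#define HEADER_SIZE 8u   /* Assumed to match the size of the Header type. */

/* Round n up to the next multiple of the granule size. */
static size_t round_up_to_granule(size_t n) {
  return (n + GRANULE_SIZE - 1) & ~(size_t)(GRANULE_SIZE - 1);
}

/* Total bytes a range occupies for a request of `size` usable bytes:
   header + allocation, padded to a whole number of granules. */
static size_t range_size_for(size_t size) {
  return round_up_to_granule(HEADER_SIZE + size);
}

/* Non-zero if the address lies on a granule boundary. */
static int is_granule_aligned(const void *p) {
  return ((uintptr_t)p & (GRANULE_SIZE - 1)) == 0;
}
```

For example, range_size_for(4) gives 16 (8 bytes of header, 4 of allocation, 4 of padding), while range_size_for(9) gives 32 because 17 bytes no longer fit in one granule.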
Padding in this way allows the allocation tags to look like this:
| 4 byte allocation | Free memory... |
| non-zero tag | tag 0, tag 0 (repeats) |
A granule is never shared between two allocations. This allows us to assign different tags to each allocation (and assign tag 0 to free space).
The ranges have headers that are stored in tagged memory. This means we cannot simply take a pointer to one range, advance it, and expect to be able to read the header of a subsequent range.
For example, if we have done one allocation, the ranges might be:
[0x0100400000802000 -> 0x0100400000802010) : [memory tag: 0x1] [allocated, size = 16 bytes]
[0x0000400000802010 -> 0x0000400000803000) : [memory tag: 0x0] [ free, size = 4080 bytes]
The pointer to the first range will have a logical tag of 1. We then skip forward by the size of that range to reach the second range. However, the pointer’s logical tag is still 1, and it must be 0 for us to access the second header without causing an exception.
This is why the function read_header calls get_memory_tag to get the real allocation tag of the header that we’re about to read from. That function calls a function from the Arm C Language Extensions (ACLE) called __arm_mte_get_tag.
__arm_mte_get_tag takes a pointer, which can have any logical tag. It reads the allocation tag of the location the pointer refers to and returns the pointer with its logical tag replaced by that allocation tag.
This corrects the pointer so that it always has the correct logical tag for the header we are about to access.
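To make the stale-tag problem concrete, the sketch below models only the pointer arithmetic, treating addresses as plain 64-bit integers. The logical tag occupies bits 56 to 59 of the pointer. The function names are hypothetical. On real hardware the allocation tag is read from memory by __arm_mte_get_tag, whereas here it is simply passed in as an argument.

```c
#include <stdint.h>

#define TAG_SHIFT 56
#define TAG_MASK ((uint64_t)0xf << TAG_SHIFT)

/* Extract the 4-bit logical tag from a 64-bit pointer value. */
static unsigned logical_tag(uint64_t ptr) {
  return (unsigned)((ptr & TAG_MASK) >> TAG_SHIFT);
}

/* Return ptr with its logical tag replaced by `tag`.
   This models the effect of correcting a pointer once the real
   allocation tag of its target is known. */
static uint64_t with_logical_tag(uint64_t ptr, unsigned tag) {
  return (ptr & ~TAG_MASK) | ((uint64_t)(tag & 0xfu) << TAG_SHIFT);
}
```

Using the ranges above, advancing 0x0100400000802000 by 16 bytes gives 0x0100400000802010, which still carries tag 1. Replacing the tag with 0 gives 0x0000400000802010, the pointer that can read the header of the free range.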
Once we have found a free range to allocate the space in, we need to know what its allocation tag should be. It cannot be 0 because we’ve reserved that for free space. It must be one of the other 15 values.
Remember that we passed (0xfffe << PR_MTE_TAG_SHIFT) to prctl earlier. This mask means “generate any tag value apart from 0”.
In a production scenario, this random generation is preferable as it prevents attacks where someone with knowledge of the program execution predicts what the tags will be.
However, it does mean that the program’s output differs from run to run and, because MTE is probabilistic, some issues may not be caught on every run.
For demo purposes we have added a randomise_memory_tags option in heap.c.
If you set this to false, tag values are taken from a repeating sequence of the values 1-15.
If it is true, random tags are generated in the range 1-15.
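One simple way to produce a repeating sequence of the values 1-15 is shown below. This is an assumed implementation for illustration only; the function name is hypothetical and heap.c may generate its sequence differently.

```c
/* Hypothetical helper: next tag in the repeating sequence
   1, 2, ..., 15, 1, 2, ... Tag 0 is never produced because it is
   reserved for free memory. */
static unsigned next_sequential_tag(unsigned previous) {
  return (previous % 15u) + 1u;
}
```

Starting from a previous tag of 0 (nothing allocated yet), the first tag is 1. After tag 15 the sequence wraps back to 1 rather than 0.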
With randomise_memory_tags set to false, subsequent allocations will always have different tags.
[0x0100400000802000 -> 0x0100400000802010) : [memory tag: 0x1] [allocated, size = 16 bytes]
[0x0200400000802010 -> 0x0200400000802020) : [memory tag: 0x2] [allocated, size = 16 bytes]
This is good for testing. However, an attacker who knows this could predict that the next allocation would have tag 3 and forge pointers to it.
Randomizing the tags mitigates that but also means that subsequent (and possibly neighbouring) allocations may have the same tag, thereby reducing the protection MTE can give.
The memory allocator can mitigate this too, by excluding neighbouring tags when generating the new tag. The allocator shown here does not do that but the idea is discussed further at the end of this learning path.
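The idea of excluding neighbouring tags could look something like the sketch below. It is not part of this allocator. The function name is hypothetical, and random_value is a stand-in for whatever random source is used (on real hardware, ACLE provides __arm_mte_exclude_tag and __arm_mte_create_random_tag for this purpose).

```c
#include <stdint.h>

/* Hypothetical helper: pick a tag in 1-15 that differs from the tags of
   both neighbouring ranges. `random_value` is any random number. */
static unsigned pick_tag_excluding_neighbours(unsigned left_tag,
                                              unsigned right_tag,
                                              unsigned random_value) {
  /* Bit n set means tag n must not be used. Tag 0 is always excluded
     because it marks free memory. */
  uint16_t excluded = (uint16_t)(1u | (1u << (left_tag & 0xfu)) |
                                 (1u << (right_tag & 0xfu)));
  unsigned allowed_count = 0;
  for (unsigned t = 0; t < 16; t++) {
    if (!(excluded & (1u << t))) allowed_count++;
  }
  /* At most 3 tags are excluded, so at least 13 remain. */
  unsigned choice = random_value % allowed_count;
  for (unsigned t = 0; t < 16; t++) {
    if (excluded & (1u << t)) continue;
    if (choice == 0) return t;
    choice--;
  }
  return 1; /* Not reached: allowed_count is never zero. */
}
```

The trade-off is that each exclusion shrinks the pool of possible tags, which makes a tag slightly easier to guess in exchange for guaranteed detection of overflows into an adjacent allocation.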
Once we’ve chosen a range to use and what tag it should have, we have to set the allocation tag of all granules (16 byte chunks) in the range.
This is done by tag_range. This function assumes that range points to the start of the range and contains a logical tag equal to the allocation tag we want to set. size is the size of the range.
We assume that range is 16 byte aligned and that size is a multiple of 16, which should be true given the steps we took earlier.
The actual tag setting is done by another ACLE function, __arm_mte_set_tag. This sets the allocation tag of a location to the logical tag of the pointer to that location. We do this in a loop until the entire range has a new tag value.
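The shape of that loop can be sketched with a software model that runs on any machine. Here an ordinary array stands in for the hardware tag storage, with one entry per granule, and the assignment inside the loop takes the place of the call to __arm_mte_set_tag. The names model_tags and model_tag_range are hypothetical, and the bounds checks are an addition for the model rather than something the real tag_range is known to do.

```c
#include <stddef.h>
#include <stdint.h>

#define GRANULE_SIZE 16u
#define MODEL_GRANULES 64u /* Models a 1024-byte heap. */

/* One modelled allocation tag per granule. Zero-initialised, so the
   whole modelled heap starts out free. */
static uint8_t model_tags[MODEL_GRANULES];

/* Set the tag of every granule in [offset, offset + size).
   Returns 0 on success, or -1 if the range is misaligned or out of
   bounds. */
static int model_tag_range(size_t offset, size_t size, uint8_t tag) {
  const size_t heap_bytes = (size_t)MODEL_GRANULES * GRANULE_SIZE;
  if (offset % GRANULE_SIZE != 0 || size % GRANULE_SIZE != 0) return -1;
  if (offset > heap_bytes || size > heap_bytes - offset) return -1;
  /* Step through the range one granule at a time. */
  for (size_t done = 0; done < size; done += GRANULE_SIZE) {
    model_tags[(offset + done) / GRANULE_SIZE] = (uint8_t)(tag & 0xfu);
  }
  return 0;
}
```

Tagging a 32-byte range at offset 0 with tag 1 sets the first two entries of model_tags to 1 and leaves the rest at 0, matching the picture of a tagged allocation followed by free memory.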
One of the great features of MTE is that it uses part of the top byte of the pointer. This builds on a feature called “Top Byte Ignore”, which is enabled by many operating systems, including Linux.
Top Byte Ignore means that the CPU does not care what is stored in the top byte of a pointer. User software and kernel interfaces may need further adjustments but, for simple use cases, a tagged pointer can be used anywhere.
For our allocator, it means that all it has to do is return the pointer to the application with a logical tag set. The application does not have to be aware that memory tagging is being used.
In this demo, the application does have a memory tagging aware signal handler, which is explained later.
To free an allocation, we find the header of the range the pointer refers to and mark that range as free memory.
To do this, we write the range header using a pointer with the same logical tag as the one given to simple_free by the program. This write could fault if the program is trying to free the same memory more than once or passes the wrong pointer. This situation will be explained later.
Once this is done, the range is given an allocation tag of 0, which we are using for free memory. This is sometimes called “untagging” since the default allocation tag is also 0.
This “untagging” means that if the application tries to use a pointer to this allocation after it has been freed, the access will cause an exception in the majority of cases. The exception is when the memory has since been reallocated with the same tag, in which case the access goes undetected.
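The behaviour described above can be illustrated with a small software model. It does not use MTE: the tag comparison that the hardware performs on every access is written out as an explicit check, and a return value of -1 marks the places where real hardware would raise an exception. The type and function names are hypothetical.

```c
#include <stdint.h>

/* Software model of one range: only its allocation tag is tracked.
   A tag of 0 means the range is free in this allocator. */
typedef struct {
  uint8_t allocation_tag;
} ModelRange;

/* Models the hardware tag check: an access succeeds only if the
   pointer's logical tag matches the allocation tag. */
static int model_access_ok(const ModelRange *r, uint8_t logical_tag) {
  return r->allocation_tag == logical_tag;
}

/* Models the free path: writing the header fails on a tag mismatch
   (double free or wrong pointer); otherwise the range is untagged.
   Returns 0 on success, or -1 where hardware would raise an exception. */
static int model_free(ModelRange *r, uint8_t logical_tag) {
  if (!model_access_ok(r, logical_tag)) return -1;
  r->allocation_tag = 0;
  return 0;
}
```

With a range tagged 3, the first model_free through a tag 3 pointer succeeds and a second one fails, as does any later access through that pointer. If the range is later given tag 3 again, the old pointer passes the check once more, which is the undetected case mentioned above.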