1 | // This module exists only to provide a separate page for the implementation |
2 | // documentation. |
3 | |
4 | //! Notes on `sharded-slab`'s implementation and design. |
5 | //! |
6 | //! # Design |
7 | //! |
8 | //! The sharded slab's design is strongly inspired by the ideas presented by |
9 | //! Leijen, Zorn, and de Moura in [Mimalloc: Free List Sharding in |
10 | //! Action][mimalloc]. In this report, the authors present a novel design for a |
11 | //! memory allocator based on a concept of _free list sharding_. |
12 | //! |
13 | //! Memory allocators must keep track of what memory regions are not currently |
14 | //! allocated ("free") in order to provide them to future allocation requests. |
15 | //! The term [_free list_][freelist] refers to a technique for performing this |
16 | //! bookkeeping, where each free block stores a pointer to the next free block, |
17 | //! forming a linked list. The memory allocator keeps a pointer to the most |
18 | //! recently freed block, the _head_ of the free list. To allocate more memory, |
19 | //! the allocator pops from the free list by setting the head pointer to the |
20 | //! next free block of the current head block, and returning the previous head. |
21 | //! To deallocate a block, the block is pushed to the free list by setting its |
22 | //! first word to the current head pointer, and the head pointer is set to point |
23 | //! to the deallocated block. Most implementations of slab allocators backed by |
24 | //! arrays or vectors use a similar technique, where pointers are replaced by |
25 | //! indices into the backing array. |
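//!
//! For illustration, here is a minimal sketch of such an index-based free list
//! over a `Vec`. It is intentionally simplified (no bounds or double-free
//! checking) and is not the slab's actual slot representation:
//!
//! ```rust
//! struct Slot<T> {
//!     value: Option<T>,
//!     // When the slot is free, this holds the index of the next free slot.
//!     next_free: Option<usize>,
//! }
//!
//! struct Slab<T> {
//!     slots: Vec<Slot<T>>,
//!     // Index of the most recently freed slot: the head of the free list.
//!     free_head: Option<usize>,
//! }
//!
//! impl<T> Slab<T> {
//!     fn insert(&mut self, value: T) -> usize {
//!         match self.free_head {
//!             // Pop the head of the free list and reuse its slot.
//!             Some(idx) => {
//!                 self.free_head = self.slots[idx].next_free.take();
//!                 self.slots[idx].value = Some(value);
//!                 idx
//!             }
//!             // The free list is empty; grow the backing vector instead.
//!             None => {
//!                 self.slots.push(Slot { value: Some(value), next_free: None });
//!                 self.slots.len() - 1
//!             }
//!         }
//!     }
//!
//!     fn remove(&mut self, idx: usize) -> Option<T> {
//!         let value = self.slots[idx].value.take();
//!         // Push the slot onto the free list by making it the new head.
//!         self.slots[idx].next_free = self.free_head;
//!         self.free_head = Some(idx);
//!         value
//!     }
//! }
//!
//! let mut slab = Slab { slots: Vec::new(), free_head: None };
//! let a = slab.insert("a");
//! assert_eq!(slab.remove(a), Some("a"));
//! // The freed slot is reused by the next insertion.
//! assert_eq!(slab.insert("b"), a);
//! ```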
26 | //! |
27 | //! When allocations and deallocations can occur concurrently across threads, |
//! they must synchronize accesses to the free list, either by putting the
29 | //! entire allocator state inside of a lock, or by using atomic operations to |
30 | //! treat the free list as a lock-free structure (such as a [Treiber stack]). In |
31 | //! both cases, there is a significant performance cost — even when the free |
32 | //! list is lock-free, it is likely that a noticeable amount of time will be |
//! spent in compare-and-swap loops. Ideally, the global synchronization point
34 | //! created by the single global free list could be avoided as much as possible. |
35 | //! |
36 | //! The approach presented by Leijen, Zorn, and de Moura is to introduce |
37 | //! sharding and thus increase the granularity of synchronization significantly. |
38 | //! In mimalloc, the heap is _sharded_ so that each thread has its own |
39 | //! thread-local heap. Objects are always allocated from the local heap of the |
40 | //! thread where the allocation is performed. Because allocations are always |
41 | //! done from a thread's local heap, they need not be synchronized. |
42 | //! |
43 | //! However, since objects can move between threads before being deallocated, |
44 | //! _deallocations_ may still occur concurrently. Therefore, Leijen et al. |
45 | //! introduce a concept of _local_ and _global_ free lists. When an object is |
46 | //! deallocated on the same thread it was originally allocated on, it is placed |
47 | //! on the local free list; if it is deallocated on another thread, it goes on |
48 | //! the global free list for the heap of the thread from which it originated. To |
49 | //! allocate, the local free list is used first; if it is empty, the entire |
50 | //! global free list is popped onto the local free list. Since the local free |
51 | //! list is only ever accessed by the thread it belongs to, it does not require |
52 | //! synchronization at all, and because the global free list is popped from |
53 | //! infrequently, the cost of synchronization has a reduced impact. A majority |
//! of allocations can occur without any synchronization at all, and
55 | //! deallocations only require synchronization when an object has left its |
56 | //! parent thread (a relatively uncommon case). |
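//!
//! As a rough sketch of this allocation strategy, the per-thread state can be
//! pictured as two lists of free slot indices. This is not the crate's actual
//! code: the real shard uses interior mutability and a lock-free list rather
//! than a `Mutex`, so that it can actually be shared across threads.
//!
//! ```rust
//! use std::sync::Mutex;
//!
//! struct Shard {
//!     // Touched only by the shard's owning thread, so it needs no
//!     // synchronization at all.
//!     local_free: Vec<usize>,
//!     // Slots freed by *other* threads land here.
//!     remote_free: Mutex<Vec<usize>>,
//! }
//!
//! impl Shard {
//!     // Allocation always happens on the owning thread.
//!     fn alloc(&mut self) -> Option<usize> {
//!         if let Some(idx) = self.local_free.pop() {
//!             // Fast path: no synchronization.
//!             return Some(idx);
//!         }
//!         // Slow path: drain the entire remote free list onto the local
//!         // one, paying the synchronization cost once for many slots.
//!         self.local_free.append(&mut self.remote_free.lock().unwrap());
//!         self.local_free.pop()
//!     }
//!
//!     // Freeing a slot on its owning thread stays unsynchronized...
//!     fn free_local(&mut self, idx: usize) {
//!         self.local_free.push(idx);
//!     }
//!
//!     // ...while freeing it from another thread pushes to the remote list.
//!     fn free_remote(&self, idx: usize) {
//!         self.remote_free.lock().unwrap().push(idx);
//!     }
//! }
//! ```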
57 | //! |
58 | //! [mimalloc]: https://www.microsoft.com/en-us/research/uploads/prod/2019/06/mimalloc-tr-v1.pdf |
59 | //! [freelist]: https://en.wikipedia.org/wiki/Free_list |
60 | //! [Treiber stack]: https://en.wikipedia.org/wiki/Treiber_stack |
61 | //! |
62 | //! # Implementation |
63 | //! |
64 | //! A slab is represented as an array of [`MAX_THREADS`] _shards_. A shard |
65 | //! consists of a vector of one or more _pages_ plus associated metadata. |
//! Finally, a page consists of an array of _slots_, plus the head indices of
//! its local and remote free lists.
68 | //! |
69 | //! ```text |
70 | //! ┌─────────────┐ |
71 | //! │ shard 1 │ |
72 | //! │ │ ┌─────────────┐ ┌────────┐ |
73 | //! │ pages───────┼───▶│ page 1 │ │ │ |
74 | //! ├─────────────┤ ├─────────────┤ ┌────▶│ next──┼─┐ |
75 | //! │ shard 2 │ │ page 2 │ │ ├────────┤ │ |
76 | //! ├─────────────┤ │ │ │ │XXXXXXXX│ │ |
77 | //! │ shard 3 │ │ local_head──┼──┘ ├────────┤ │ |
78 | //! └─────────────┘ │ remote_head─┼──┐ │ │◀┘ |
79 | //! ... ├─────────────┤ │ │ next──┼─┐ |
80 | //! ┌─────────────┐ │ page 3 │ │ ├────────┤ │ |
81 | //! │ shard n │ └─────────────┘ │ │XXXXXXXX│ │ |
82 | //! └─────────────┘ ... │ ├────────┤ │ |
83 | //! ┌─────────────┐ │ │XXXXXXXX│ │ |
84 | //! │ page n │ │ ├────────┤ │ |
85 | //! └─────────────┘ │ │ │◀┘ |
86 | //! └────▶│ next──┼───▶ ... |
87 | //! ├────────┤ |
88 | //! │XXXXXXXX│ |
89 | //! └────────┘ |
90 | //! ``` |
91 | //! |
92 | //! |
93 | //! The size of the first page in a shard is always a power of two, and every |
94 | //! subsequent page added after the first is twice as large as the page that |
//! precedes it.
96 | //! |
97 | //! ```text |
98 | //! |
99 | //! pg. |
100 | //! ┌───┐ ┌─┬─┐ |
101 | //! │ 0 │───▶ │ │ |
102 | //! ├───┤ ├─┼─┼─┬─┐ |
103 | //! │ 1 │───▶ │ │ │ │ |
104 | //! ├───┤ ├─┼─┼─┼─┼─┬─┬─┬─┐ |
105 | //! │ 2 │───▶ │ │ │ │ │ │ │ │ |
106 | //! ├───┤ ├─┼─┼─┼─┼─┼─┼─┼─┼─┬─┬─┬─┬─┬─┬─┬─┐ |
107 | //! │ 3 │───▶ │ │ │ │ │ │ │ │ │ │ │ │ │ │ │ │ |
108 | //! └───┘ └─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴─┘ |
109 | //! ``` |
110 | //! |
//! When searching for a free slot, the smallest page is searched first, and if
//! it is full, the search proceeds to the next page, until either a free slot
//! is found or all existing pages have been searched. If every page is full
//! and the maximum number of pages has not yet been reached, a new page is
//! then allocated.
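//!
//! A sketch of that search might look like the following, where `Page` is a
//! deliberately simplified stand-in (a fixed-capacity counter) rather than the
//! crate's real page type, and the returned offset is page-local (the real
//! slab also folds the page index into the returned key, which this sketch
//! omits):
//!
//! ```rust
//! struct Page {
//!     used: usize,
//!     capacity: usize,
//! }
//!
//! impl Page {
//!     // Returns the offset of a newly claimed slot, or `None` if full.
//!     fn alloc(&mut self) -> Option<usize> {
//!         if self.used < self.capacity {
//!             self.used += 1;
//!             Some(self.used - 1)
//!         } else {
//!             None
//!         }
//!     }
//! }
//!
//! fn alloc(
//!     pages: &mut Vec<Page>,
//!     initial_page_size: usize,
//!     max_pages: usize,
//! ) -> Option<usize> {
//!     // Search the smallest page first, then each larger page in turn.
//!     for page in pages.iter_mut() {
//!         if let Some(slot) = page.alloc() {
//!             return Some(slot);
//!         }
//!     }
//!     // Every existing page is full: allocate a new page twice the size of
//!     // the previous one, unless the maximum number of pages is reached.
//!     if pages.len() < max_pages {
//!         pages.push(Page {
//!             used: 0,
//!             capacity: initial_page_size << pages.len(),
//!         });
//!         return pages.last_mut().unwrap().alloc();
//!     }
//!     None
//! }
//! ```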
116 | //! |
//! Since every page is twice as large as the previous page, and all page sizes
//! are powers of two, we can determine the page index that contains a given
//! address by shifting the address down by the smallest page size and looking
//! at how many twos places are necessary to represent that number, which tells
//! us what power-of-two page size it fits inside. We can determine the number
//! of twos places by counting the number of leading zeros (unused twos places)
//! in the number's binary representation, and subtracting that count from the
//! total number of bits in a word.
125 | //! |
126 | //! The formula for determining the page number that contains an offset is thus: |
127 | //! |
128 | //! ```rust,ignore |
129 | //! WIDTH - ((offset + INITIAL_PAGE_SIZE) >> INDEX_SHIFT).leading_zeros() |
130 | //! ``` |
131 | //! |
132 | //! where `WIDTH` is the number of bits in a `usize`, and `INDEX_SHIFT` is |
133 | //! |
134 | //! ```rust,ignore |
135 | //! INITIAL_PAGE_SIZE.trailing_zeros() + 1; |
136 | //! ``` |
137 | //! |
138 | //! [`MAX_THREADS`]: https://docs.rs/sharded-slab/latest/sharded_slab/trait.Config.html#associatedconstant.MAX_THREADS |
139 | |