This is a simple post, inspired by many of the questions I have seen while on the Rust Programming Language Community Discord Server.
Is a
&str
a pointer to the stack or the heap?
Are all
Sized
types stored on the stack and unsized types are on the heap?
Are references automatically stored on the stack or the heap depending on type?
These all seem like reasonable questions to ask. However, they are based on an important misconception
In a typical application process, your application will request memory, and the operating system will provide it.
The stack is a linear region of memory that the OS will start a process with. As the name suggests, all memory is allocated in stack order. This means that the most recently allocated value is the first value to be deallocated (commonly known as LIFO).
It's not unreasonable to want some values to live longer. To support that we must allocate them elsewhere. The "heap" is for these allocations and they can be deallocated with no required order.
This is the tricky part. Languages like Java have 'stack' and 'heap' values.
In Java, the primitive type int
is stored on the stack
and the object type Integer
is stored on the heap.
In Rust, this might be represented by i32
and Box<i32>
.
The heap is 'slow'...
The heap isn't slow. Pointer indirection and heap allocations are slow. Let's break it down.
All modern CPUs improve performance by using cache. Accessing your main memory can take a few hundred CPU cycles, which is far too slow. Instead, the CPU will cache values. Because most memory is accessed linearly, it also caches data around in 'cache lines'.
When you read data from a pointer, that requires loading new memory. A random pointer read will more often lead to a cache miss and require loading from memory. This is a performance loss known as pointer indirection.
Sometimes this doesn't matter. Take a &str
for instance.
Since text strings can be arbitrarily long, we can't determine exactly at compile time how much stack space to reserve when passing a string by value.
Therefore Rust decides that we pass it as a (pointer, length)
pair.
Now, since it's behind a pointer anyway, it doesn't matter whether that points to the stack or the heap.
We are forced to do pointer lookups and potentially risk cache misses regardless.
The heap allocator is a complicated data structure with linked lists, tables, and syscalls. When you perform an allocation it requires finding a free slot in an available page. Deallocation requires releasing the slot on that page and making sure any free lists are updated. These can all be quite slow. My measurements on my M2 Pro suggest roughly 25ns for an allocation and another 25ns for deallocation. While it's not that slow, it's at least 50x slower than a floating point operation which I measured average speeds of half a nanosecond.
So if the allocations are slower, surely that means the heap is slower? Well no. If you need many short-lived allocations, it can add up, but a one-off allocation that will live for a while will not slow down your application.
If you want to write high-performance code, keep pointers and allocations out of a hot loop and optimise the memory layout for the cache. That doesn't mean that allocating and using pointers in your code will make your program slow. Writing in Rust will likely be significantly faster by default compared to Java, Python, or TypeScript.
Is a
&str
a pointer to the stack or the heap?
It is a pointer to wherever it was allocated.
// Allocate a String on the heap
let heap: String = String::from("foo");
// Create a stack value containing 3 bytes worth of string data
let stack: [u8; 3] = *b"bar";
// Create a static string stored in the application data segment
static data: &'static str = "baz";
// All 3 can be valid pointees for &str.
let heap_str: &str = &*heap;
let stack_str: &str = std::str::from_utf8(&stack).unwrap();
let data_str: &str = data;
Are all
Sized
types stored on the stack and unsized types are on the heap?
No. You can allocate Sized
values on the heap,
and you can create unsized references to values stored on the stack
let sized_heap = Box::new(i32);
let stack: [u8; 3] = *b"bar";
let unsized_stack: &str = std::str::from_utf8(&stack).unwrap();
Are references automatically stored on the stack or the heap depending on type?
See above
&'static str
is stored in the data segment as static memory
'static
does not mean a reference to a static
.
It means that it lives as long as a static
.
In other words, this means that it has no other owner who is waiting to deallocate the value.
If we use String::leak
we can guarantee that the heap-allocated memory will not be deallocated.
let mut s = String::new();
s.push_str("foo");
s.push_str("bar");
let static_str: &'static str = String::leak(s);
let
allocates on the stack
async
functions are 'stackless coroutines' - they don't have a stack.
All memory that an async function will use needs to be pre-allocated.
This is often allocated on the heap if you were to spawn this as a tokio
task.
async fn foo() -> i32 {
let bar = 1;
tokio::task::yield_now().await;
bar
}
// `let bar` will be initialised in this heap allocation
let boxed_future = Box::new(foo());