Incredibuild Team
C++ has a reputation for having a memory model that’s difficult to handle, especially for programmers coming from managed-memory languages. It is notorious for out-of-bounds reference errors and memory leaks.
Still, modern C++ is considerably safer than it used to be. Used well, it can rival managed-memory languages for safety while remaining more performant.
In Part 1 of this series, we explored memory models both in managed-memory languages, and in C and old-style C++. We saw the value that smart pointers can deliver and explored one of the standard smart pointers: unique_ptr.
This post, Part 2, will introduce another standard smart pointer, the shared pointer (shared_ptr), and explore some of its uses.
The concept of shared ownership, and why you’d want it
You will recall from Part 1 that the primary function of smart pointers is to release objects that have been allocated on the heap (and other resources) when they are no longer needed. unique_ptr introduced the idea of ownership (like ownership in Rust). It bound the lifetime of the referent to the lifetime of the owner: the unique_ptr itself. Since the lifetime of the owner was predictable, so too would be the lifetime of the referent.
But what happens if the lifetime of the referent is not so predictable? You would find this in multi-threaded or asynchronous programs, where one task might set up an object, another might populate it, and a third might forward it to an external endpoint. Only when every task has finished with the object can it be deleted.
This is what a shared_ptr is for. Superficially, it’s like a unique_ptr, in that the lifetime of the referent is bound to the lifetimes of its owners. But the big difference is that a shared pointer can have multiple owners, and they all must give up their ownership before the referent gets deleted.
This sounds a lot like the memory model that’s used in managed-memory languages, where objects remain in memory as long as they are accessible to other objects in the program. But, just as with unique pointers (and unlike in managed-memory languages), there’s no garbage collector involved, so the referent is deleted at the earliest possible moment.
Counting references
A shared_ptr works using a technique called reference counting, which was invented to manage memory in functional languages. Alongside every referent, there is a record that counts how many other objects are interested in it. Every time a shared pointer is created that links to the referent, behind the scenes it increments the counter. If the shared pointer is deleted or disconnected from the referent, the shared pointer decrements the counter again. Eventually, when the counter reaches zero, the departing pointer deletes the referent and recycles its memory.
To summarize, this technique involves three distinct entities:
- The smart pointer, which manages the counter data structure and dereferences the referent
- The counter object, which is the ultimate owner of the referent
- The referent itself
Figure 1: Reference counting strategy
This process is obviously a little more complex than the unique_ptr, so there’s a very slight overhead involved in its use.
Firstly, there’s a space overhead involved in managing the counter. If the allocated resources are large and complex, the counter’s overhead is minimal, but if you’re allocating a large number of tiny objects, the overhead may become significant.
Depending on how the counter is allocated, there may also be a time overhead, as the counter is allocated and deallocated alongside the referent. As the heap gets more fragmented, that too may become significant. We’ll explore some alternative strategies for dealing with this later on.
Beware circular references
Although the reference-counted pointer looks a lot like references in managed-memory languages, there’s one big difference that you need to be aware of. Consider an object owned by a shared_ptr, which itself contains shared_ptrs. When the object gets destroyed, the contained shared_ptrs also get destroyed, which destroys their referents. The whole tree gets deleted all at once.
Figure 2: Cascading deletes
But suppose the shared_ptrs refer to each other in a loop. In that case, every object is kept alive by the preceding shared pointer in the chain, even if there are no references to the whole structure. That’s a memory leak!
Figure 3: Object 1 is still owned by Object 3, so it doesn’t get deleted.
There are several observations to be made here:
- Immutability is safe. If the shared_ptr only ever refers to const objects (or at least, objects that implement an immutable interface), there is no possibility of creating these loops. Immutable objects are immune to this problem.
- Pinning. This is a very good technique for allowing an object to pin itself in memory. Suppose you have an object that is participating in a long-running, asynchronous process. It can contain a shared pointer referring to itself, and when the process is finished, it resets the pointer. If nobody else cares, the object will be deleted.
- Weak references. Finally, if self-referential pointers are essential, you can create a weak_ptr. We’ll explore this in a moment.
Basic usage
Similarly to the unique_ptr, the simplest way to create a shared_ptr is with make_shared:
#include <memory>
using namespace std;
// p1 and p2 stand for whatever arguments MyClass's constructor takes
shared_ptr<MyClass> c = make_shared<MyClass>(p1, p2);
When you create the pointer and the referent this way, make_shared typically allocates only one memory block on the heap, containing both the referent and the counter.
Figure 4: The memory layout after make_shared
Unlike with the unique_ptr, we can assign one shared_ptr to another:
shared_ptr<MyClass> d = c;
This produces a memory layout like this:
Figure 5: Memory layout after a shared_ptr assignment.
We can prove this works using an instrumentation class as follows:
#include <iostream>
#include <memory>
using namespace std;

// Instrumentation class: reports its own construction and destruction
class Test {
public:
    Test() {
        cout << "Test ctor" << endl;
    }
    ~Test() {
        cout << "Test dtor" << endl;
    }
};

int main() {
    shared_ptr<Test> s;
    {
        cout << "Creating t" << endl;
        shared_ptr<Test> t = make_shared<Test>();
        s = t; // The reference count is now 2
    }
    // When t goes out of scope, the reference count falls to 1
    cout << "t has been destroyed" << endl;
    // When main returns and s gets destroyed, the count falls to zero
}
Which produces the output:
Creating t
Test ctor
t has been destroyed
Test dtor
We saw earlier that cycles of shared_ptrs pin their memory in place, and we mentioned in passing that a weak_ptr can help with that. We can now explore what a weak_ptr actually does.
In effect, a weak_ptr is like a shared_ptr that the reference counter doesn’t count. More significantly, you can’t dereference a weak_ptr directly: every shared_ptr that owned the referent may already have been destroyed, in which case the weak_ptr still exists but the referent has gone.
To use a weak_ptr, you first have to try to convert it to a shared_ptr.
// Continues the previous example: this goes inside main(), using the Test class
weak_ptr<Test> w;
{
    cout << "Creating t" << endl;
    shared_ptr<Test> t = make_shared<Test>();
    w = t; // The reference count stays at 1: weak_ptrs are not counted
    shared_ptr<Test> s = w.lock(); // try to get the shared_ptr
    if (!s) cout << "t has expired" << endl;
    // otherwise use s
}
cout << "t has been deleted" << endl;
shared_ptr<Test> s = w.lock(); // try to get the shared_ptr
if (!s) cout << "t has expired" << endl;
This produces the output:
Creating t
Test ctor
Test dtor
t has been deleted
t has expired
Advanced usage
Just like a unique pointer, a shared pointer can be initialized with a pre-existing referent:
MyClass *c = new MyClass();
shared_ptr<MyClass> p(c); // p now owns the object: never delete c yourself
When you construct the shared_ptr like this, the reference counter and the referent need to be in two different allocated blocks.
Figure 6: Memory layout when constructing the shared_ptr using new MyClass
This different layout has some implications. First, every shared_ptr takes two heap allocations: one for the referent and one for the counter. That’s twice the allocations compared to make_shared. It’s also twice the heap activity when destroying them again.
Finally, there is a significant difference in the handling of a weak_ptr. For a weak pointer to work, it needs access to the reference counter. The referent can be deleted as soon as the last shared_ptr has been destroyed, but the counter will get deleted only when there are no more shared pointers or weak pointers using it:
Figure 7: Memory layout when a weak_ptr is holding a reference counter
But this works slightly differently when the shared_ptr is created using make_shared. You have seen that the counter and the referent are created in the same memory block. So, when the last shared_ptr was destroyed in the weak_ptr demonstration above, the referent’s destructor was called, but its memory wasn’t recycled until the weak_ptr also went out of scope. The referent’s memory could hang around for a lot longer than you’d expect!
Conclusion
At first sight, shared_ptr looks like a close analog of managed-memory pointers. But there are important differences.
Do you need shared_ptrs at all? Shared ownership, by its nature, means that you have non-local effects: When every owner has equal rights, the object being shared can be changed by any owner at any time. This makes it difficult to predict the behavior of any particular function that relies on a shared object. Sharing mutable state is a recognized anti-pattern.
Assuming shared pointers are right for your solution:
- Share only immutable state. To alleviate the problems above, you should, wherever possible, either explicitly create shared_ptr<const C> objects or share only objects whose interfaces are const.
- Choose between make_shared and shared_ptr<C>(new C) judiciously. As with the unique_ptr, you should avoid mixing manual new/delete with a shared_ptr: once a shared_ptr owns an object, never delete that object yourself.
Note: While there is a distinct advantage to always using make_unique to create unique pointers, with shared_ptr, it’s not so clear-cut. Where the referents are large or use limited system resources, and where you expect to rely on weak_ptrs to manage lifetimes, shared_ptr<C>(new C) has a distinct advantage over make_shared because it is able to recycle the referent’s memory even when weak_ptrs are still referring to it.
- Pass shared_ptrs by value. In contrast to unique_ptr, where you should pass only references into functions, it is generally preferred to pass shared_ptrs by value.
If a function receives a shared_ptr<>& and the pointer it refers to goes out of scope while the function still has asynchronous work pending, both the pointer and (if it was the last owner) its referent will be gone by the time that work runs. Copying the shared_ptr costs little, typically just an atomic update of the counter, and it keeps the referent in memory until the asynchronous operation has finished with it.