
Incredibuild Team
C++ has a reputation for having a memory model that’s difficult to handle, especially for programmers coming from managed-memory languages. It is notorious for out-of-bounds reference errors and memory leaks.
Still, modern C++ is considerably safer than it used to be and, used well, can be as safe as managed-memory languages while often being more performant.
In Part 1 of this two-part series, we’ll explain the principles of memory management in both managed-memory languages and traditional C and C++, explain the problems with each of those approaches, and then suggest how smart pointers can help. Finally, we’ll explore in depth an important built-in smart pointer, the unique pointer (unique_ptr).
The second part of this article will introduce another useful smart pointer: the shared pointer, along with some of its friends, and compare it with the unique pointer.
Take a managed-memory language, where you instantiate a variable—on the stack, say, or inside a new object—like this:
MyClass p = new MyClass(p1, p2, …);
p.field = 0;
Behind the scenes there are two steps: the MyClass object is allocated on the heap, and the variable p is set to point to it.
Figure 1: A pointer on the stack referencing an object on the heap (managed memory version)
If you don’t initialize the variable, then the pointer is still created, but doesn’t indicate anything—this is a null. Either way, it looks like the variable you’ve created is of class MyClass, but in fact, it’s only a pointer to the real MyClass object.
In C and C++, there is a clear distinction between the MyClass object and the pointer, indicated by a *. The equivalent of the above code in C or C++ looks like this:
MyClass* c = new MyClass(p1, p2, …);
(*c).field = 0; // or, equivalently, c->field = 0;
Figure 2: A pointer on the stack referencing an object on the heap (C++ version)
The reason for this distinction is that, unlike in a managed-memory language, it is possible to create objects in places other than the heap: on the stack, inside another object, or even inside some protected piece of memory.
Here’s how you’d create the object on the stack:
MyClass s(p1, p2, …);
Figure 3: Creating the object on the stack instead of the heap
There’s another difference between managed memory and C++’s memory: cleanup.
In a managed-memory language, the memory used by objects is reclaimed automatically by a garbage collector. There are many different strategies for this, but most of them work by searching through the objects allocated by a program, looking for any that are no longer accessible. The collector then deletes those orphaned objects and returns their memory to the heap.
The process is not perfect. It can cause unexpected slowdowns because many garbage collectors need to pause the program, at least briefly, while they work. You also have no idea when the objects get reclaimed, meaning other, expensive resources might be left lying around unnecessarily. But it is automatic.
In contrast, C++’s objects get destroyed the moment they go out of scope. Take the MyClass object s that we created on the stack: when the enclosing block or function ends, its destructor releases its resources and its memory is reclaimed.
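The scope rule can be sketched with a small instrumentation type. Tracked and countInsideScope are hypothetical names introduced only for this illustration; the sketch simply counts how many instances are alive at a given moment.

```cpp
// Hypothetical instrumentation type: counts how many instances are alive.
struct Tracked {
    static int alive;
    Tracked()  { ++alive; }
    ~Tracked() { --alive; }
};
int Tracked::alive = 0;

// Returns the number of live Tracked objects observed inside the inner scope.
int countInsideScope() {
    int inside = 0;
    {
        Tracked s;                 // constructed on the stack
        inside = Tracked::alive;   // s is alive at this point
    }                              // s is destroyed here, at the closing brace
    return inside;
}
```

After countInsideScope() returns, Tracked::alive is back to zero without any explicit cleanup code.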
But what about the pointer c? The pointer gets deleted, but the referent is still there, and unless the programmer takes pains to delete it when it’s no longer accessible, nobody else will ever be able to delete it because nobody else has a reference to it.
This is what we call a memory leak. Sometimes (during an exception, for instance) it’s not even possible for the creator to safely delete the referent.
Figure 4: A deleted pointer. The MyClass object can never be freed, so its memory has “leaked.”
In C and C++, a pointer implicitly exposes an interface containing operator * and operator ->. Both these operators dereference the pointer: that is, they return the pointer’s referent.
In C++ we can create classes that implement these operators. The idea is that they look and work like those simple pointers we just saw, but they can be somewhat customized. They’re ordinary classes, so they can apply special processing while they are being constructed, while they’re being destroyed, while they’re being dereferenced, and even based on signals from elsewhere in the system. These are called “smart pointers.”
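To make the idea concrete, here is a minimal sketch of a hand-written smart pointer. SimplePtr and demoLength are hypothetical names invented for this example; the class is far simpler than the standard library's smart pointers and is only meant to show the two dereference operators plus cleanup in the destructor.

```cpp
#include <cstddef>
#include <string>

// Hypothetical minimal smart pointer: owns one heap object and
// deletes it when the wrapper itself is destroyed.
template <typename T>
class SimplePtr {
public:
    explicit SimplePtr(T* raw) : raw_(raw) {}
    ~SimplePtr() { delete raw_; }

    // Copying is disabled so two wrappers never delete the same object.
    SimplePtr(const SimplePtr&) = delete;
    SimplePtr& operator=(const SimplePtr&) = delete;

    T& operator*() const { return *raw_; }   // dereference
    T* operator->() const { return raw_; }   // member access

private:
    T* raw_;
};

// Uses the wrapper exactly as if it were a raw pointer.
std::size_t demoLength() {
    SimplePtr<std::string> p(new std::string("hello"));
    *p += ", world";     // operator* gives access to the referent
    return p->size();    // operator-> forwards the member call
}                        // p's destructor deletes the string here
```

The caller never writes delete: the destructor of the wrapper does it when p goes out of scope.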
This series explores two kinds of built-in smart pointers: unique_ptr and shared_ptr. However, here in Part 1, we will only focus on unique_ptr, which is generally the more useful one.
The unique pointer was introduced in C++11. It’s a smart pointer (so it provides operator * and operator ->), but it adds the concept of ownership, rather like the idea of ownership in Rust.
Just like the conventional pointer, a unique pointer indicates a chunk of memory that has been allocated on the heap. But unlike the conventional pointer, when the unique_ptr is destroyed, it also destructs and frees the referent.
Figure 5: Deleting the unique_ptr automatically deletes the referent
Of course, this works only because each referent has exactly one unique_ptr managing it: just one owner. The trick is that unique_ptr itself enforces that, as you will see.
Compared to managed-memory garbage collection, there are tremendous benefits:
Compared to C++’s conventional new and delete, there are significant advantages to unique_ptr. Memory recycling is completely automatic. There’s no need for the programmer to delete the memory manually, and it’s not even possible without using advanced, unsafe features.
It’s also inherently exception-safe. As you may know, C++ doesn’t have a finally statement. With raw pointers, any resources you’ve allocated need to be deleted by the exception handlers, and that is often impossible because you don’t know where the exception has been thrown from. As a result, exceptions often cause memory leaks.
C++ automatically deletes objects when the objects’ context is destroyed, e.g., when a function ends. unique_ptr leverages that mechanism to automatically delete the referents as well, keeping the memory safe.
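The following sketch illustrates the exception-safety point. Tracked, mayThrow, and aliveAfterCall are hypothetical names used only here; the live-object counter is an assumption of the example, not part of unique_ptr.

```cpp
#include <memory>
#include <stdexcept>

// Hypothetical instrumentation type: counts how many instances are alive.
struct Tracked {
    static int alive;
    Tracked()  { ++alive; }
    ~Tracked() { --alive; }
};
int Tracked::alive = 0;

// Allocates a Tracked on the heap and optionally throws afterwards.
void mayThrow(bool fail) {
    std::unique_ptr<Tracked> p = std::make_unique<Tracked>();
    if (fail) {
        throw std::runtime_error("failure after allocation");
    }
    // No delete is needed on either path: p cleans up when it is destroyed.
}

// Returns the number of Tracked objects still alive after the call.
int aliveAfterCall(bool fail) {
    try {
        mayThrow(fail);
    } catch (const std::runtime_error&) {
        // Stack unwinding has already destroyed p and its referent.
    }
    return Tracked::alive;
}
```

Whether the function returns normally or exits through the exception, the count comes back to zero.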
The simplest way to use unique_ptr is to create the pointer and the referent at the same time, using make_unique (available since C++14). Both are declared in the <memory> header, in namespace std, so:
#include <memory>
using namespace std; // keeps the snippets in this article short
unique_ptr<MyClass> c = make_unique<MyClass>(p1, p2, …);
c->field = 0;
As you see, the creation of the pointer and the referent looks very similar to before: make_unique takes the same parameters as MyClass’s constructor; and if MyClass has multiple, overloaded constructors, they work just as you’d expect.
We can prove this works by defining an instrumentation class:
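#include <iostream>
#include <memory>
using namespace std;
class Test {
public:
    Test() {
        cout << "Test ctor" << endl;
    }
    ~Test() {
        cout << "Test dtor" << endl;
    }
};
int main() {
    unique_ptr<Test> t = make_unique<Test>();
} // t goes out of scope here, so the Test object is destroyed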
This outputs:
Test ctor
Test dtor
But if we try to copy the unique pointer to another:
unique_ptr<Test> q = t;
You get a compile-time error, because unique_ptr’s copy constructor is deleted. That makes sense: Copying the pointer would give the referent two owners (t and q), but the whole purpose of the unique_ptr is to ensure the referent has only one.
So, you can’t copy unique pointers, but you can move them. For example:
unique_ptr<Test> make_test () {
return make_unique<Test>();
}
unique_ptr<Test> s = make_test();
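Returning a unique_ptr from a function is one form of move. An explicit transfer between two named variables needs std::move, as in this sketch. MoveResult and transferOwnership are hypothetical names for the illustration; the static_asserts use standard type traits to confirm the copy and move rules at compile time.

```cpp
#include <memory>
#include <type_traits>
#include <utility>

// Copying is rejected at compile time; moving is allowed.
static_assert(!std::is_copy_constructible<std::unique_ptr<int>>::value,
              "unique_ptr must not be copyable");
static_assert(std::is_move_constructible<std::unique_ptr<int>>::value,
              "unique_ptr must be movable");

// Hypothetical result type for the demonstration below.
struct MoveResult {
    bool sourceIsNull;
    int valueSeenByTarget;
};

// Moves ownership from one unique_ptr to another and reports what each holds.
MoveResult transferOwnership() {
    std::unique_ptr<int> source = std::make_unique<int>(42);
    std::unique_ptr<int> target = std::move(source);  // ownership moves to target
    return { source == nullptr, *target };
}
```

After the move, the source pointer is null and the target is the sole owner, so there is still exactly one owner at every moment.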
You can also place them into STL containers:
vector<unique_ptr<Test>> v;
v.emplace_back (make_unique<Test>());
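As a slightly fuller sketch of the container case, the vector below owns its elements, and an element can be moved back out again. Tracked and aliveWhileInVector are hypothetical names for this example, with the counter used only to observe lifetimes.

```cpp
#include <memory>
#include <utility>
#include <vector>

// Hypothetical instrumentation type: counts how many instances are alive.
struct Tracked {
    static int alive;
    Tracked()  { ++alive; }
    ~Tracked() { --alive; }
};
int Tracked::alive = 0;

// Returns the live-object count while the vector and `taken` still exist.
int aliveWhileInVector() {
    std::vector<std::unique_ptr<Tracked>> v;
    v.push_back(std::make_unique<Tracked>());
    v.emplace_back(std::make_unique<Tracked>());

    // Take ownership of the last element, then drop the now-null slot.
    std::unique_ptr<Tracked> taken = std::move(v.back());
    v.pop_back();

    return Tracked::alive;  // one object owned by v, one owned by taken
}                           // v and taken are destroyed here, freeing both
```

Destroying the vector destroys every unique_ptr it holds, which in turn frees every referent.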
Or even swap them:
unique_ptr<Test> a; // default-constructed: holds no referent ("null")
unique_ptr<Test> b = make_unique<Test>();
swap (a, b); // b is now null, a has a referent, and nothing has been deleted
As you saw above, it is possible to create a unique_ptr that isn’t associated with any referent. unique_ptr exposes an explicit operator bool, which tests whether there’s a referent:
unique_ptr<Test> a;
if (a) { /* use *a */ }
In fact, there’s a whole collection of methods that allow you to reach in and “help” the unique_ptr do its work:
Generally, when you use these functions, you’re giving up a large part of a unique pointer’s safety net. If you call a function with a reference to your unique_ptr, you don’t know what changes the function can make to it before it comes back to you. The pointer may own a completely different referent from what you thought—or none at all.
Except in certain, very specific circumstances, you shouldn’t allow functions to move, release, or reset your unique pointers. You can prevent this by declaring your unique_ptr to be const:
const unique_ptr<Test> c = make_unique<Test>();
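The sketch below walks through three of the standard member functions in question, get(), reset(), and release(), and then checks at compile time that a const unique_ptr cannot be moved from. Tracked and modifierDemo are hypothetical names for the illustration.

```cpp
#include <memory>
#include <type_traits>

// Hypothetical instrumentation type: counts how many instances are alive.
struct Tracked {
    static int alive;
    Tracked()  { ++alive; }
    ~Tracked() { --alive; }
};
int Tracked::alive = 0;

// A const unique_ptr cannot give up its referent by being moved from.
static_assert(!std::is_constructible<std::unique_ptr<Tracked>,
                                     const std::unique_ptr<Tracked>&&>::value,
              "moving from a const unique_ptr must not compile");

// Returns the live-object count at the end, or -1 if release() left p non-null.
int modifierDemo() {
    std::unique_ptr<Tracked> p = std::make_unique<Tracked>();

    Tracked* view = p.get();      // non-owning view; p still owns the object
    (void)view;

    p.reset(new Tracked());       // old referent is deleted, p owns the new one

    Tracked* raw = p.release();   // p becomes null; the caller now owns raw
    bool isNull = (p == nullptr);
    delete raw;                   // manual cleanup is our job again

    return isNull ? Tracked::alive : -1;
}
```

Note how release() hands the problem of deletion back to the programmer, which is exactly the safety net being given up.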
Unique pointers behave a lot like conventional pointers, but there’s one thing they don’t do: indexing and pointer arithmetic. For example:
unique_ptr<C> c = make_unique<C>();
auto x = c[2]; // forbidden
auto y = *(c+1); // forbidden
++c; // forbidden
It makes sense: the referent is only one object, not an array of them, and arrays are treated very differently in the heap from single objects, which is why we have new and new[]; delete and delete[].
But there is a way for unique_ptr to handle arrays. Like this:
unique_ptr<C[]> cc = make_unique<C[]>(5);
cout << *(cc.get()+1) << endl;
cout << cc[3] << endl;
++cc; // still forbidden!
This only works when the template type is an array of unknown bound, i.e., C[] not C[5]. With make_unique, it also only works if C has a default constructor, because the elements are value-initialized. In other words, you can’t initialize the array from an initializer list.
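Here is a small self-contained sketch of the array form, using int elements so that it compiles on its own. The function name sumOfSquares is hypothetical and exists only for this example.

```cpp
#include <cstddef>
#include <memory>

// Allocates n ints on the heap, fills them with squares, and returns their sum.
int sumOfSquares(std::size_t n) {
    // Elements are value-initialized, so every int starts at 0.
    std::unique_ptr<int[]> a = std::make_unique<int[]>(n);

    for (std::size_t i = 0; i < n; ++i) {
        a[i] = static_cast<int>(i * i);   // operator[] is available for T[]
    }

    int sum = 0;
    for (std::size_t i = 0; i < n; ++i) {
        sum += *(a.get() + i);            // arithmetic on the raw pointer from get()
    }
    return sum;
}                                         // delete[] runs automatically here
```

The array specialization calls delete[] rather than delete, so the new[]/delete[] pairing is handled for you.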
unique_ptr is a very simple way to protect your programs against memory errors. It binds the lifetime of heap objects to the lifetimes of other, more predictable objects, and you can transfer the ownership at any time. Used properly, it can prevent both null-pointer and stray-pointer errors, and can all but eliminate memory leaks. Lastly, unique_ptr involves virtually no runtime overhead and hardly any memory overhead when compared with C-style pointers.
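The memory-overhead claim can be checked on your own toolchain with a one-line comparison. This is a sketch under the assumption of the default deleter; the C++ standard does not guarantee the result, although mainstream standard library implementations typically make the two sizes equal. The function name sameSizeAsRawPointer is hypothetical.

```cpp
#include <memory>

// Reports whether a unique_ptr with the default deleter occupies
// the same number of bytes as a raw pointer on this implementation.
bool sameSizeAsRawPointer() {
    return sizeof(std::unique_ptr<int>) == sizeof(int*);
}
```

A custom deleter that carries state can make the unique_ptr larger, so the comparison applies to the default case only.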
Before you use them, though, you should check that they’re the right solution to your problem:
Assuming you decide you need to allocate and deallocate heap memory, follow these rules:
Provided you follow these rules, your program will be much easier to write and read than if you’d used simple pointers. You also won’t suffer from memory leaks or memory collisions.
The errors you’re likely to encounter will, for example, involve releasing a pointer prematurely by moving ownership of the referent to something short-lived, like a function parameter. The errors caused by this usually show up right away: the unique_ptr becomes null, so an attempt to dereference it typically crashes immediately (strictly speaking, it is undefined behavior), and the debugger will show you the moment the ownership gets passed.
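A minimal sketch of that failure mode follows. The names Tracked, consume, Outcome, and passToSink are hypothetical; consume stands in for any function that takes a unique_ptr by value.

```cpp
#include <memory>
#include <utility>

// Hypothetical instrumentation type: counts how many instances are alive.
struct Tracked {
    static int alive;
    Tracked()  { ++alive; }
    ~Tracked() { --alive; }
};
int Tracked::alive = 0;

// Takes ownership by value; the referent dies with the parameter.
void consume(std::unique_ptr<Tracked> p) {
    (void)p;  // a real function would use *p here
}

// Hypothetical result type for the demonstration below.
struct Outcome {
    bool callerIsNull;
    int aliveAfter;
};

// Moves a pointer into consume() and reports the state afterwards.
Outcome passToSink() {
    std::unique_ptr<Tracked> t = std::make_unique<Tracked>();
    consume(std::move(t));   // ownership passes to the short-lived parameter
    // t is now null; dereferencing it here would be a bug.
    return { t == nullptr, Tracked::alive };
}
```

Testing the pointer with if (t) before using it after such a call turns the bug into an ordinary, checkable condition.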
Use smart pointers comprehensively, and you’ll rarely need to break out memory-checking tools like Valgrind.
Looking to learn more? The definitive source for unique_ptr is, of course, cppreference.com. Otherwise, stay tuned for Part 2 of this series, where we will discuss shared pointers and compare them with unique pointers!