Modern memory management using smart pointers in C++23 – Part 1

Incredibuild Team 

Memory management and unique pointers

C++ has a reputation for having a memory model that’s difficult to handle, especially for programmers coming from managed-memory languages. It is notorious for out-of-bounds reference errors and memory leaks. 

Still, modern C++ is considerably safer than it used to be. Used with smart pointers, it can come close to the memory safety of managed-memory languages while usually remaining more performant.

In Part 1 of this two-part series, we’ll explain the principles of memory management in both managed-memory languages and traditional C and C++, explain the problems with each of those approaches, and then suggest how smart pointers can help. Finally, we’ll explore in depth an important built-in smart pointer, the unique pointer (unique_ptr). 

The second part of this article will introduce another useful smart pointer: the shared pointer, along with some of its friends, and compare it with the unique pointer.

How conventional memory management works

Take a managed-memory language, where you instantiate a variable—on the stack, say, or inside a new object—like this:

MyClass p = new MyClass(p1, p2, …);

p.field = 0;

Behind the scenes there are two steps:

  1. The new object is allocated in the heap and then initialized according to its constructor.
  2. A “pointer” is created in the context, which indicates where the new object, the “referent,” can be found.

Figure 1: A pointer on the stack referencing an object on the heap (managed memory version)

If you don’t initialize the variable, then the pointer is still created, but doesn’t indicate anything—this is a null. Either way, it looks like the variable you’ve created is of class MyClass, but in fact, it’s only a pointer to the real MyClass object.

Memory management in C, C++

In C and C++, there is a clear distinction between the MyClass object and the pointer to it, which is indicated by a *. The C++ equivalent of the above code looks like this (C has no new and allocates with malloc instead, but the pointer syntax is the same):

MyClass* c = new MyClass(p1, p2, …);

(*c).field = 0; // or, equivalently, c->field = 0;

Figure 2: A pointer on the stack referencing an object on the heap (C++ version)

The reason for this distinction is that, unlike in a managed-memory language, it is possible to create objects in places other than the heap: on the stack, inside another object, or even inside some protected piece of memory. 

Here’s how you’d create the object on the stack:

MyClass s(p1, p2, …);

Figure 3: Creating the object on the stack instead of the heap

Garbage collection: Managed memory vs. C++

There’s another difference between managed memory and C++’s memory: cleanup.  

In a managed-memory language, the memory used by objects is reclaimed automatically by a garbage collector. There are many different strategies for this, but ultimately they all work by searching through the objects allocated by a program, looking for any that are no longer accessible. The collector then deletes those orphaned objects and reclaims the memory they used, returning it to the heap.

The process is not perfect. It can cause unexpected slowdowns, because many garbage collectors have to pause the program while they work. You also have no idea when an object will be reclaimed, so other, expensive resources that it holds (file handles or network connections, for example) might be left lying around unnecessarily. But it is automatic.

In contrast, a C++ object is destroyed the moment it goes out of scope. Take the MyClass object s that we created on the stack: its destructor releases its resources, and its memory is reclaimed, as soon as the enclosing block (typically the containing function) ends.
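The scope rule can be observed directly. The following is a minimal sketch, assuming a hypothetical helper type Noisy (not part of the article's MyClass example) that records its construction and destruction in a log:

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper type: records construction and destruction in a log.
struct Noisy {
    std::vector<std::string>& log;
    explicit Noisy(std::vector<std::string>& l) : log(l) { log.push_back("ctor"); }
    ~Noisy() { log.push_back("dtor"); }
};

// Returns the sequence of events observed around a nested scope.
std::vector<std::string> scope_demo() {
    std::vector<std::string> log;
    {
        Noisy s(log);               // created on the stack
        log.push_back("in scope");
    }                               // s is destroyed here, at the closing brace
    assert(log.size() == 3);        // "ctor", "in scope", "dtor" so far
    log.push_back("after scope");
    return log;
}
```

The "dtor" entry lands before "after scope", with no explicit cleanup call anywhere in the function.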

But what about the pointer c? The pointer gets deleted, but the referent is still there, and unless the programmer takes pains to delete it when it’s no longer accessible, nobody else will ever be able to delete it because nobody else has a reference to it. 

This is what we call a memory leak. Sometimes—during an exception, for instance— it’s not even possible for the creator to safely delete the referent.

Figure 4: A deleted pointer. The MyClass object can never be freed, so its memory has “leaked.”

Smart pointers to the rescue!

In C and C++, a pointer implicitly exposes an interface containing operator * and operator ->. Both these operators dereference the pointer: that is, they return the pointer’s referent.

In C++ we can create classes that implement these operators. The idea is that they look and work like those simple pointers we just saw, but they can be somewhat customized. They’re ordinary classes, so they can apply special processing while they are being constructed, while they’re being destroyed, while they’re being dereferenced, and even based on signals from elsewhere in the system. These are called “smart pointers.”
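To make the idea concrete, here is a deliberately minimal sketch of such a class. SimplePtr is a hypothetical name used only for illustration; it is not how the standard library implements its smart pointers, and real code should use the built-in ones.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical, deliberately minimal smart pointer (illustration only).
template <typename T>
class SimplePtr {
public:
    explicit SimplePtr(T* p = nullptr) : ptr_(p) {}
    ~SimplePtr() { delete ptr_; }            // special processing on destruction

    SimplePtr(const SimplePtr&) = delete;    // no copies, so no double delete
    SimplePtr& operator=(const SimplePtr&) = delete;

    T& operator*() const { return *ptr_; }   // dereference, like *p
    T* operator->() const { return ptr_; }   // member access, like p->m

private:
    T* ptr_;
};

// Uses SimplePtr the same way one would use a raw std::string*.
std::size_t simple_ptr_demo() {
    SimplePtr<std::string> p(new std::string("hello"));
    *p += ", world";                 // operator*
    assert(*p == "hello, world");
    return p->size();                // operator->
}                                    // ~SimplePtr deletes the string here
```

From the caller's side, p reads like an ordinary pointer; the only visible difference is that nobody has to call delete.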

This series explores two kinds of built-in smart pointers: unique_ptr and shared_ptr. However, here in Part 1, we will only focus on unique_ptr, which is generally the more useful one.

unique_ptr and the idea of ownership

unique_ptr was introduced in C++11. It's a smart pointer (so it provides operator * and operator ->), but it adds the concept of ownership, rather like the idea of ownership in Rust.

Just like a conventional pointer, a unique pointer refers to an object that has been allocated on the heap. But unlike a conventional pointer, when the unique_ptr is destroyed, it also destructs and frees the referent.

Figure 5: Deleting the unique_ptr automatically deletes the referent

Of course, this works only because each referent has exactly one unique_ptr managing it: just one owner. The trick is that unique_ptr itself enforces that, as you will see.

Compared to managed-memory garbage collection, there are tremendous benefits:

  • A referent is deleted, and all its resources are recycled, as soon as it is no longer accessible, so there is no delay before memory is recycled, as there is with managed memory.
  • There is no pause in the program’s execution when memory starts to run low and the garbage collector starts up.

Compared to C++'s conventional new and delete, unique_ptr also has significant advantages. Memory recycling is completely automatic: there's no need for the programmer to delete the memory manually, and doing so by accident is hard, because it means reaching past the smart pointer with functions such as get() or release().

It's also inherently exception-safe. C++ doesn't have a finally statement, so with raw pointers, any resources you've allocated need to be deleted by the exception handlers. That is often impossible, because the handler doesn't know where the exception was thrown from or what had been allocated by then. As a result, exceptions can cause memory leaks.

C++ automatically deletes objects when the objects’ context is destroyed, e.g., when a function ends. unique_ptr leverages that mechanism to automatically delete the referents as well, keeping the memory safe.
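The following sketch illustrates that exception safety. It assumes a hypothetical Tracked type that counts its live instances; the names are illustrative and the example is not meant as a definitive pattern.

```cpp
#include <cassert>
#include <memory>
#include <stdexcept>

// Hypothetical instrumentation type: counts how many instances are alive.
struct Tracked {
    static int live;
    Tracked()  { ++live; }
    ~Tracked() { --live; }
};
int Tracked::live = 0;

// Throws while a unique_ptr owns a heap object.
void may_throw(bool fail) {
    auto c = std::make_unique<Tracked>();
    assert(Tracked::live == 1);
    if (fail) {
        throw std::runtime_error("something went wrong");
    }
    // No explicit delete on either path: c's destructor frees the referent.
}

// Returns how many Tracked objects remain alive after the call.
int live_after_call(bool fail) {
    try {
        may_throw(fail);
    } catch (const std::exception&) {
        // Stack unwinding has already destroyed c and its referent.
    }
    return Tracked::live;
}
```

Whether may_throw returns normally or throws, the count drops back to zero, which a raw pointer with a manual delete at the end of the function would not achieve on the throwing path.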

Basic usage

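The simplest way to use unique_ptr is to create the pointer and the referent at the same time, using make_unique (available since C++14). Both are declared in the <memory> header, in namespace std; the snippets in this article assume a using namespace std; directive: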

#include <memory>

unique_ptr<MyClass> c = make_unique<MyClass>(p1, p2, …);

c->field=0;

As you see, the creation of the pointer and the referent looks very similar to before: make_unique takes the same parameters as MyClass’s constructor; and if MyClass has multiple, overloaded constructors, they work just as you’d expect.

We can prove this works by defining an instrumentation class:

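#include <iostream>

#include <memory>

using namespace std;

class Test {

public:

    Test () {

        cout << "Test ctor" << endl;

    }

    ~Test () {

        cout << "Test dtor" << endl;

    }

};

int main () {

    unique_ptr<Test> t = make_unique<Test>(); // prints "Test ctor"

} // t goes out of scope here, which prints "Test dtor"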

This outputs:

Test ctor

Test dtor

But if we try to copy the unique pointer to another:

unique_ptr<Test> q = t;

This produces a compile-time error, because unique_ptr's copy constructor is deleted. That makes sense: Copying the pointer would give the referent two owners (t and q), but the whole purpose of the unique_ptr is to ensure the referent has only one.
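The same restriction can be checked at compile time with the standard type traits. In this sketch, the empty Test struct is only a stand-in for the instrumentation class, and unique_ptr_is_copyable is a hypothetical helper name:

```cpp
#include <cassert>
#include <memory>
#include <type_traits>

struct Test {};  // stand-in for the Test class used in the text

// The copy operations of unique_ptr are deleted; the move operations are not.
static_assert(!std::is_copy_constructible<std::unique_ptr<Test>>::value,
              "unique_ptr cannot be copy-constructed");
static_assert(!std::is_copy_assignable<std::unique_ptr<Test>>::value,
              "unique_ptr cannot be copy-assigned");
static_assert(std::is_move_constructible<std::unique_ptr<Test>>::value,
              "unique_ptr can be move-constructed");

// Runtime mirror of the first check, so it can be queried from ordinary code.
bool unique_ptr_is_copyable() {
    return std::is_copy_constructible<std::unique_ptr<Test>>::value;
}
```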

So, you can’t copy unique pointers, but you can move them. For example:

unique_ptr<Test> make_test () {

    return make_unique<Test>();

}

unique_ptr<Test> s = make_test();
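Returning from a function moves implicitly. Between two named variables, the move has to be requested with std::move. A small sketch, assuming a simplified stand-in for Test with a single int member:

```cpp
#include <cassert>
#include <memory>
#include <utility>

struct Test { int value = 42; };  // simplified stand-in for the Test class

// Transfers ownership from one unique_ptr to another with std::move.
// Returns true if the source ends up null and the target owns the same object.
bool move_demo() {
    std::unique_ptr<Test> s = std::make_unique<Test>();
    Test* raw = s.get();                      // remember the address for checking

    std::unique_ptr<Test> q = std::move(s);   // ownership moves from s to q

    return s == nullptr && q.get() == raw && q->value == 42;
}
```

After the move, the referent has not been copied or destroyed; only its owner has changed, and s is guaranteed to be null.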

You can also place them into STL containers:

vector<unique_ptr<Test>> v;

v.emplace_back (make_unique<Test>());

Or even swap them:

unique_ptr<Test> a; // initialized to “null”

unique_ptr<Test> b = make_unique<Test>();

swap (a, b); // b is now null, a has a referent, and nothing has been deleted

Advanced usage

As you saw above, it is possible to create a unique_ptr that isn’t associated with any referent. unique_ptr exposes operator bool, which tests whether there’s a referent:

unique_ptr<Test> a;

if (a) { /* use *a */ }

In fact, there’s a whole collection of member functions that allow you to reach in and “help” the unique_ptr do its work:

  • You can get a pointer to the referent with get().  
  • You can take ownership of the referent with release(). This returns the pointer to the referent, but it also sets the unique_ptr to null. The referent is now your problem!
  • You can reset() the unique_ptr. This deletes whatever referent the unique_ptr had (just as if the unique_ptr were being destroyed) and takes ownership of the pointer you pass in, if any. Called with no argument, reset() leaves the unique_ptr null.
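The three member functions in the list above can be exercised as follows. This is a sketch only: Widget is a hypothetical type that counts its live instances, and the raw new and delete appear purely to show what reset() and release() do, not as recommended style.

```cpp
#include <cassert>
#include <memory>

// Hypothetical type that counts its live instances.
struct Widget {
    static int live;
    Widget()  { ++live; }
    ~Widget() { --live; }
};
int Widget::live = 0;

// Walks through get(), reset() and release(); returns the final live count.
int accessor_demo() {
    std::unique_ptr<Widget> p = std::make_unique<Widget>();

    Widget* observed = p.get();      // non-owning view; p still owns the object
    assert(observed != nullptr && p);

    p.reset(new Widget());           // p adopts the new Widget, the old one is deleted
    assert(Widget::live == 1);

    Widget* raw = p.release();       // p is now null; the caller owns raw
    assert(!p && Widget::live == 1);
    delete raw;                      // manual cleanup is required after release()

    p.reset();                       // resetting a null unique_ptr does nothing
    return Widget::live;
}
```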

Generally, when you use these functions, you’re giving up a large part of a unique pointer’s safety net. If you pass your unique_ptr to a function by non-const reference, you don’t know what changes the function will make to it before it comes back to you. The pointer may own a completely different referent from what you thought, or none at all.

Except in certain, very specific circumstances, you shouldn’t allow functions to move, release, or reset your unique pointers. You can prevent this by declaring your unique_ptr to be const:

const unique_ptr<Test> c = make_unique<Test>();
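A short sketch of what const does and does not prevent, assuming a hypothetical Counter type. Note that const applies to the pointer, not to the object it owns:

```cpp
#include <cassert>
#include <memory>
#include <type_traits>

struct Counter { int value = 0; };  // hypothetical referent type

// A const unique_ptr can be neither moved from nor assigned to...
static_assert(!std::is_move_constructible<const std::unique_ptr<Counter>>::value,
              "cannot move out of a const unique_ptr");
static_assert(!std::is_move_assignable<const std::unique_ptr<Counter>>::value,
              "cannot assign to a const unique_ptr");

// ...but the referent itself stays mutable.
int const_demo() {
    const std::unique_ptr<Counter> c = std::make_unique<Counter>();
    c->value = 5;               // fine: modifies the referent, not the pointer
    // c.reset();               // would not compile: reset() is a non-const member
    // auto d = std::move(c);   // would not compile: the copy constructor is deleted
    return c->value;
}
```

If the referent should be read-only as well, the element type can be made const too, as in unique_ptr<const Counter>.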

unique_ptr arrays

Unique pointers behave a lot like conventional pointers, but there’s one thing they don’t do: indexing and pointer arithmetic. For example:

unique_ptr<C> c = make_unique<C>();

auto x = c[2]; // forbidden

auto y = *(c+1); // forbidden

++c; // forbidden

It makes sense: the referent is only one object, not an array of them, and arrays are treated very differently in the heap from single objects, which is why we have new and new[]; delete and delete[].

But there is a way for unique_ptr to handle arrays. Like this:

unique_ptr<C[]> cc = make_unique<C[]>(5);

cout << *(cc.get()+1) << '\n'; // assumes operator<< is defined for C

cout << cc[3] << '\n';

++cc; // still forbidden!

This only works when the template argument is an array of unknown bound, i.e., C[] not C[5]. In addition, make_unique<C[]>(n) value-initializes every element, so C needs a default constructor, and you can’t initialize the elements from an initializer list.
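A self-contained version of the array form, using int as the element type so that no user-defined class is needed. The function name sum_of_squares is illustrative:

```cpp
#include <cassert>
#include <cstddef>
#include <memory>

// Fills a heap array of n ints with squares and returns their sum.
int sum_of_squares(std::size_t n) {
    std::unique_ptr<int[]> cc = std::make_unique<int[]>(n);  // elements start at 0
    for (std::size_t i = 0; i < n; ++i) {
        cc[i] = static_cast<int>(i * i);   // operator[] is available for T[]
    }
    int sum = 0;
    for (std::size_t i = 0; i < n; ++i) {
        sum += *(cc.get() + i);            // arithmetic on the raw pointer from get()
    }
    return sum;                            // delete[] runs automatically here
}
```

The array is released with delete[] rather than delete, which is exactly the distinction the C[] template argument exists to encode.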

Conclusion

unique_ptr is a very simple way to protect your programs against memory errors. It binds the lifetime of heap objects to the lifetimes of other, more predictable objects, and you can transfer the ownership at any time. Used properly, it can prevent both null-pointer and stray-pointer errors, and can all but eliminate memory leaks. Lastly, unique_ptr involves virtually no runtime overhead and hardly any memory overhead when compared with C-style pointers.

Before you use them, though, you should check that they’re the right solution to your problem:

  • Do you need to reclaim memory at all? If yours is only a short-running program (like a command-line utility or a compiler), there may be no need to recycle memory because it will all get recycled anyway when the program terminates.
  • Are your allocations and deallocations synchronized with the stack? It’s much faster to use the stack for transient memory than to use the heap; it’s also safer because your objects will automatically get cleaned up as part of the normal stack metabolism. (Beware! The heap has much greater capacity than the stack, and the stack might not be big enough for all your memory needs.)

Further tips and suggested reading

Assuming you decide you need to allocate and deallocate heap memory, follow these rules:

  • Use make_unique in preference to new and delete. You can use new (without delete) if your program runs only for a short time, but you should never mix new/delete with unique_ptr.
  • Declare every unique_ptr to be const to avoid surprises, unless you really need to move, reset, or release it. (A const unique_ptr still lets you modify the referent itself.)
  • When passing a unique_ptr<C> to another function, pass it either as C& or as const unique_ptr<C>&. Pass a unique_ptr by value (which requires std::move at the call site) only when you really mean to hand over ownership: the parameter becomes the owner and deletes the referent when it goes out of scope as the function ends. Most of the time, that is not what you intended.
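The parameter-passing rule can be sketched like this. Account and the three functions are hypothetical names chosen for illustration; the by-value variant is included only to show the ownership transfer that the rule warns about.

```cpp
#include <cassert>
#include <memory>
#include <utility>

struct Account { int balance = 0; };  // hypothetical type

// Needs only the object: take a reference to the referent.
void deposit(Account& a, int amount) { a.balance += amount; }

// May inspect the pointer, but cannot move, reset, or release it.
bool has_account(const std::unique_ptr<Account>& p) { return p != nullptr; }

// Takes ownership: the referent is destroyed when this function returns.
int close_account(std::unique_ptr<Account> p) { return p->balance; }

int passing_demo() {
    auto acct = std::make_unique<Account>();
    deposit(*acct, 100);
    assert(has_account(acct));                            // acct still owns the object
    int final_balance = close_account(std::move(acct));   // explicit hand-over
    assert(!has_account(acct));                           // acct is now null
    return final_balance;
}
```

The std::move at the call site is what makes the hand-over visible to a reader; without it, the call to close_account would not compile.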

Provided you follow these rules, your program will be much easier to write and read than if you’d used simple pointers. You’ll also be far less likely to suffer from memory leaks or double deletions.

The errors you’re most likely to encounter involve giving up a referent prematurely by moving its ownership to something short-lived, like a function parameter. These errors tend to show up right away: the moved-from unique_ptr becomes null, so an attempt to dereference it will typically crash immediately (formally, it is undefined behavior), and the debugger will show you the moment the ownership was passed.

Use smart pointers comprehensively, and you’ll rarely need to break out memory-checking tools like Valgrind.

Looking to learn more? The definitive reference for unique_ptr is, of course, cppreference.com. Otherwise, stay tuned for Part 2 of this series, where we will discuss shared pointers and compare them with unique pointers!