C++ Coroutines - Let's Play with Them!

C++20 added a feature that a lot of us were waiting for – coroutines. (In another post we talked about other features that came out with C++20 and in other previous posts we also discussed related topics: modernizing your C++ code and the evolution of C++.)

In this post we are going to play a bit with C++ Coroutines.

Let’s start with just a piece of code.

template<typename T> 
unique_generator<T> range(T fromInclusive, T toExclusive) { 
    for (T v = fromInclusive; v < toExclusive; ++v) { 
        co_yield v; 
    } 
} 


int main() { 
    for (auto val : range(1, 10)) { 
        std::cout << val << '\n'; 
    } 
}

You can find the above code in Compiler Explorer, here: https://coro.godbolt.org/z/zK3E9TEce

Let’s explain what we have above.

A coroutine is a special function that can suspend its execution and resume later at the exact point where execution was suspended. When suspending, the function may return (yield) a value. When a coroutine finishes execution, it may also return a value.

The state of a coroutine (its parameters, its local variables and the point where it was suspended) is not kept on the stack. It lives in a separately allocated object, usually on the heap, that we call the coroutine “frame”. The frame is created when the coroutine is first called, and that call returns some kind of “handle” to the caller. The values that the coroutine produces are then obtained through this handle.

In the main above we use “range” as a coroutine function. The thing that makes “range” become a coroutine function is having a “co_yield”, “co_return” or “co_await” within it.

In the above function we use “co_yield”, which returns a value while keeping the “frame” of the function, so we can get back to it for the next iteration and the function would preserve its state for us.

Note that it is not the same as using a static variable to preserve state, as we can call the coroutine from different threads, or recursively, and each call would preserve its own “frame” independently. To achieve that, the state of the function has to be allocated into a “frame” which is managed through the return value of the coroutine.

The return “handle” of the coroutine is set by the return type. This “handle” holds an inner promise_type (note that this is not related to std::promise). The promise_type must have the function get_return_object(). For other requirements for promise_type, see cppreference on coroutines promise_type.

The machinery of handling the promise_type and the lifetime of the coroutine “frame” is a real burden. To avoid that you can use an existing implementation and focus on the implementation of the coroutine itself. The cppreference usage example for std::coroutine_handle presents such an implementation for a generator class. We used a similar generator from another library in the example above. This library brings us the unique_generator type, which behaves as an iterator (i.e., we can iterate over the values being yielded from our coroutine, using the return value type unique_generator) in a similar way to the Generator presented in the above cppreference link.

The job of unique_generator is not a trivial one. It has to handle the coroutine frame allocation and deallocation. If you want to have a peek at the nitty-gritty of handling coroutine frames, take a look at this bug fix for unique_generator.

A coroutine ends execution when it reaches a co_return, or when it reaches the end of the function. In our case, the range function finishes when we reach the toExclusive value in the loop.

Some limitations for coroutines, as of C++20:

Coroutines:

cannot use return, only co_return
cannot use C-style variadic arguments (as printf does)
cannot be constexpr
cannot be a constructor or a destructor
cannot be the main function
cannot use auto or concepts as return type (the programmer needs to specify the return type so that the compiler knows what handle type to use, e.g. generator<int>; this obviously can’t be inferred from the function body’s contents)

The hazard of passing parameters to coroutines by reference

Let’s look at an example taken from Arthur O’Dwyer’s blog:

unique_generator<char> explode(const std::string& s) { 
    for (char ch : s) { 
        co_yield ch; 
    } 
}  

int main() { 
    for (char ch : explode("hello world")) { 
        std::cout << ch << '\n'; 
    } 
}

The above code creates a temporary string in the call to the coroutine function “explode”. However, this temporary string is dead before the actual first use of the coroutine, as the lifetime of temporaries is not extended as part of the coroutine frame creation.

The bug is revealed when we run with address sanitizer (-fsanitize=address) and is not detected without that flag. That means this is one of those bugs that can work in your environment and crash in production.

Note that the problem would not be solved even if we try to copy the temporary string to another string that would outlive the coroutine lifetime:

unique_generator<char> explode(const std::string& s) { 
    auto ps = std::make_unique<std::string>(s); 
    for (char ch : *ps) { 
        co_yield ch; 
    } 
}

The above code still has undefined behavior, as the first call to the coroutine just creates it without executing even the first line of the body. Then, on the first execution the temporary is already dead and we try to create a heap-allocated string (by calling make_unique) from a dead temporary. Note again that the bug in this example is revealed when we run with address sanitizer (-fsanitize=address) and is not detected in this case without it.

To better understand the separation between creating the coroutine and actually calling it, we can separate the lines in main into two:

auto coro = explode("hello world"); // (1) coroutine being created 
for (char ch : coro) {  // (2) coroutine being called 
    std::cout << ch << '\n'; 
}

The first line, marked with (1) is still okay, but the second line marked as (2) executes coro, the coroutine, at a point where the temporary string created from “hello world” is already dead. The creation of the unique_ptr from the temporary string is done on the first call of line (2), which is too late, as the temporary string is already dead by then.

We can change the code to make it valid by sending a string that is not a temporary:

int main() { 
    std::string s = "hello world"; 
    // may_explode is a coroutine getting const string& 
    for (char ch : may_explode(s)) { // ok doesn't explode now 
        std::cout << ch << '\n'; 
    } 
}

But the above has changed only the call and not the function itself, so the function can still be called with a temporary, and we are still exposed to undefined behavior.

We could change the function to expect something that outlives the coroutine, such as a unique_ptr:

unique_generator<char> doesnt_explode(std::unique_ptr<std::string> ps) { 
    for (char ch : *ps) { 
        co_yield ch; 
    } 
}  

int main() { 
    for (char ch : doesnt_explode(std::make_unique<std::string>("good"))) { 
        std::cout << ch << '\n'; 
    } 
}

However, one may argue that the above API is not too friendly.

We could also pass the string by value, an option that we will discuss later on.

Code that sometimes works, depending on the parameter

As seen above, a coroutine that takes a const lvalue reference can work if we pass an lvalue that outlives the coroutine, or can explode if we pass, for example, a temporary. This is also the case with the following code, which takes a std::string_view:

unique_generator<char> extract(std::string_view s) { 
    for (char ch : s) { 
        co_yield ch; 
    } 
}  

int main() { 
    // this works ok 
    for (char ch : extract("hello world")) { 
        std::cout << ch << '\n'; 
    } 
 
    // this doesn't 
    using namespace std::string_literals; 
    for (char ch : extract("hello world"s)) { 
        std::cout << ch << '\n'; 
    } 
}

Again, the undefined behavior is revealed with address sanitizer (-fsanitize=address) and is not revealed in this code example without it.

No, don’t pass params by value!

Some sources (such as SonarSource, for instance) have advised that when it comes to coroutines it is better to get parameters by value, to be safer and avoid the dangling reference scenarios presented above.

I beg to differ.

First, getting by value doesn’t always help, as we can see in the string_view example above. (One may argue that views are a kind of reference-semantic type, analogous to `const T&`, so passing a string_view by value isn’t really passing “by value”. That’s true. Yet technically speaking, the argument of “pass-by-value should save you from troubles” doesn’t always hold as is.)

Second, the problem is less with the parameter that we expect and more with sending a temporary, which was a known issue way before coroutines arrived.

And third, passing by value can be very inefficient, especially with coroutines.

Let’s make our coroutine a bit more generic so it can extract items from any container and for “safety reasons” (that we question) we will get the container by value:

template<typename T> 
unique_generator<const typename T::value_type&> extract(T s) { 
    for (const auto& val : s) { 
        co_yield val; 
    } 
}

Note that since coroutines are not allowed to use auto for their return type, at least in C++20, we need to express the return type explicitly.

In our main we will compare a coroutine loop and a simple loop using objects of type MyString, as internal values of the container, so we can add printouts in its constructors and destructor:

int main() { 
    std::array arr{MyString("Hello"), MyString("World"), MyString("!!!") }; 
    std::cout << "========================\n"; 
    std::cout << "coroutine loop:\n"; 
    std::cout << "------------------------\n"; 
    for (const auto& val : extract(arr)) { 
        std::cout << val << '\n'; 
    } 
    std::cout << "========================\n"; 
    std::cout << "simple loop:\n"; 
    std::cout << "------------------------\n"; 
    for (const auto& val : arr) { 
        std::cout << val << '\n'; 
    } 
}

The effect of our coroutine getting the container by value can be clearly seen in the printout:

======================== 
coroutine loop: 
------------------------ 
MyString copy ctor: Hello (0x7ffefe1f5790) 
MyString copy ctor: World (0x7ffefe1f57b0) 
MyString copy ctor: !!! (0x7ffefe1f57d0) 
MyString copy ctor: Hello (0x610000000070) 
MyString copy ctor: World (0x610000000090) 
MyString copy ctor: !!! (0x6100000000b0) 
~MyString: !!! (0x7ffefe1f57d0) 
~MyString: World (0x7ffefe1f57b0) 
~MyString: Hello (0x7ffefe1f5790) 
Hello (0x610000000070) 
World (0x610000000090) 
!!! (0x6100000000b0) 
~MyString: !!! (0x6100000000b0) 
~MyString: World (0x610000000090) 
~MyString: Hello (0x610000000070) 
======================== 
simple loop: 
------------------------ 
Hello (0x7ffefe1f5710) 
World (0x7ffefe1f5730) 
!!! (0x7ffefe1f5750)

In this example we could actually get the container by reference, as we pass an actual lvalue that outlives the coroutine. This is the change (note the const T& parameter):

template<typename T> 
unique_generator<const typename T::value_type&> extract(const T& s) { 
    for (const auto& val : s) { 
        co_yield val; 
    } 
}

And the output would now become much nicer for the coroutine:

======================== 
coroutine loop: 
------------------------ 
Hello (0x7fff7b224350) 
World (0x7fff7b224370) 
!!! (0x7fff7b224390) 
======================== 
simple loop: 
------------------------ 
Hello (0x7fff7b224350) 
World (0x7fff7b224370) 
!!! (0x7fff7b224390)

However, the current code still allows getting a temporary that would result in undefined behavior:

for (const auto& val : extract(std::array{MyString("Hello"), MyString("World"), MyString("!!!")})) { 
    std::cout << val << '\n'; 
}

Looking at the output, it is clear that we have undefined behavior, as we print the strings after they were destroyed:

======================== 
coroutine loop: 
------------------------ 
MyString ctor from char*: Hello (0x7ffe650e0fc0) 
MyString ctor from char*: World (0x7ffe650e0fe0) 
MyString ctor from char*: !!! (0x7ffe650e1000) 
~MyString: !!! (0x7ffe650e1000) 
~MyString: World (0x7ffe650e0fe0) 
~MyString: Hello (0x7ffe650e0fc0) 
Hello (0x7ffe650e0fc0) 
World (0x7ffe650e0fe0) 
!!! (0x7ffe650e1000)

And again, the code crashed with -fsanitize=address, and didn’t crash without the address sanitizer. In this case it was a hidden bug waiting for production.

My proposed solution to achieve the efficiency of getting by reference while preventing dangling reference bugs is simple and not new to coroutines. Implement the const reference and delete the rvalue reference:

void extract(const std::string&& s) = delete; 

unique_generator<char> extract(const std::string& s) { 
    for (char ch : s) { 
        co_yield ch; 
    } 
} 

int main() { 
    std::string s = "hello world"; 
    for (char ch : extract(s)) { 
        std::cout << ch << '\n'; 
    } 

    // doesn't compile! Good!! 
    // for (char ch : extract("temp")) { 
    //     std::cout << ch << '\n'; 
    // } 
}

Note that the above idea of deleting the rvalue version resolves the undefined behavior in this case, but is not bulletproof and is considered by some as a bad practice (see Abseil Tip of the Week #149: Object Lifetimes vs. = delete for an interesting discussion on the subject). Though controversial and not bulletproof, I still find this solution useful.

Binary Tree inorder traversal with coroutines

This example is inspired by Adi Shavit’s talk at CppCon 2019 on coroutines.

Suppose that we want to traverse over a binary tree inorder like this:

BinaryTree<int> t1{5, 3, 14, 2, -3, 100, 56, 82, 72, 45}; 
for (auto val : t1.inorder()) { 
    std::cout << val << '\n'; 
}

Can we implement a member coroutine function in class BinaryTree? Well, the answer is: yes, we can!

Here it is:

template<typename T> 
class BinaryTree { 
    struct TreeNode { 
        T value; 
        TreeNode* left = nullptr; 
        TreeNode* right = nullptr; 
        // [...] 
        unique_generator<T> inorder() { 
            if(left) { 
                for(auto v: left->inorder()) { 
                    co_yield v; 
                } 
            } 
            co_yield value; 
            if(right) { 
                for(auto v: right->inorder()) { 
                    co_yield v; 
                } 
            } 
        } 
    }; 
    TreeNode* head = nullptr; 
    // [...] 
public: 
    auto inorder() { 
        return head->inorder(); 
    } 
    // [...] 
};

The above would fail for an empty BinaryTree, like this one:

BinaryTree<int> t2{}; 
for (auto val : t2.inorder()) { // crashes here, head is null 
    std::cout << val << '\n'; 
}

There are several nice and simple ways to solve the empty tree traversal, keeping the coroutine approach. You can find one here.

To Summarize

We played with simple coroutines, specifically with generator coroutines. The main idea of coroutines is to have a function that preserves its state while releasing control back to the caller. Coroutines in C++ are a complex beast. The implementer of a coroutine return type has to manage the frame that keeps this state alive between suspensions, but we used an external library that manages this for us.

Coroutines are sensitive to dangling references to temporary objects, arguably even more than ordinary functions: it may look as if we use the temporary while it is still alive, but a reference copied into the coroutine frame can be used long after the temporary is gone. If you hear advice to pass objects by value to coroutines, don’t be tempted to do that when it is costly. This is the same advice as for ordinary function calls: pass-by-value can be safer than a const reference, but can be expensive for large non-trivial types. We discussed the hazard of references to temporaries and ways to avoid it.

Resources and additional reading

Amir Kirsh

Amir Kirsh, Incredibuild's Dev Advocate, is a C++ lecturer at the Academic College of Tel-Aviv-Yaffo and at Tel-Aviv University, previously the Chief Programmer at Comverse, after being CTO and VP R&D at a startup acquired by Comverse. He is also a co-organizer of the annual Core C++ conference and a member of the ISO C++ Israeli National Body.
