SegFault

Don't capture in a coroutine lambda

Sep 10 2022

Mixing lambda functions and C++20 coroutines can be risky if done incorrectly. One important issue is the lifetime of variables captured in a lambda.

Lambda functions have been around since C++11 and provide a convenient way to create small yet extremely powerful callables that can greatly simplify the use of callback-based APIs. One feature that makes lambdas so powerful is the ability to capture variables in the current scope, including the this pointer, and keep them accessible inside the function body.

Another great feature, introduced much more recently, is C++20's coroutine support. It allows the creation of functions that can be suspended and later resumed. Because the resumption does not need to happen within the same scope, it allows for easy use of asynchronous operations in code that still reads like synchronous code. This greatly simplifies the design of asynchronous code, because language features that used to be hard or impossible to use with callback-based code simply work. It also has the potential to drastically increase performance, as the coroutine frame is only allocated once and suspension/resumption of the coroutine is extremely efficient.

The issue

Given the benefits of both features, it seems natural to mix and match them. However, this can cause lifetime issues that might not be immediately visible. One example would be code that uses a lambda to spin off a second asynchronous task. I am using my personal coroutine library along with its curl wrapper for illustration, but the problem applies to any library.

std::string url = "https://example.org";
launch([url]() -> task<> {
    auto req = http_request::make_get(url);
    auto resp = co_await req.execute_async();
    std::cout << "Downloaded " << resp.body.size() << " Byte from " << url << "\n";
}());

What happens here is that the code uses a lambda that captures a variable url from the current scope, uses it to build a web request, awaits the result, and prints the url along with the received body size. launch() is a helper that allows starting an asynchronous task without waiting for it, similar in spirit to std::async with the std::launch::async policy.

This innocent-looking code, however, contains a pretty serious use-after-free bug. The call to the lambda returns at its first suspension point (i.e. the co_await statement), and the lambda object is destroyed right after, because it is a temporary. However, the coroutine is still active and will be resumed once the asynchronous HTTP transfer is done. The coroutine frame only refers to the lambda object and does not copy its captures. Because the lambda, which in turn contains the url variable, has already been destroyed, the print statement will read from already freed and possibly reused memory.

This issue is so insidious that there's even a C++ Core Guideline for it (CP.51: Do not use capturing lambdas that are coroutines).

Now that we know the issue, how do we solve it?

Pass via parameter

The most obvious solution is to pass the required values as parameters to the function. Function parameters, unlike lambda captures, are copied into the coroutine frame, which is allocated on call and destroyed after the coroutine ends. While this requires some additional typing, it is the cleanest and least error-prone solution.

std::string url = "https://example.org";
launch([](auto url) -> task<> {
    auto req = http_request::make_get(url);
    auto resp = co_await req.execute_async();
    std::cout << "Downloaded " << resp.body.size() << " Byte from " << url << "\n";
}(url));

Note that we explicitly don't take the url string as a std::string_view or by reference, because that would result in a similar problem unless we can ensure the string outlives the coroutine.

Move the value away

Adding additional parameters, however, is only possible if we control both the lambda creation and the call site. While this is often the case, we might want to pass the function around in a generic container to be executed sometime later, in which case we can't simply add new parameters. In this case we can move or copy the lambda captures out of the lambda object into local variables, which are stored in the coroutine frame. We can do this at any point while the lambda is still in scope, which in most cases means before the first suspension point. However, it's best to do this right at the top of the function.

std::string url = "https://example.org";
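launch([url]() mutable -> task<> {
    // Move the capture into the coroutine frame before the first suspension point.
    // mutable is needed for a real move; without it url is const and this copies.
    auto local_url = std::move(url);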
    auto req = http_request::make_get(local_url);
    auto resp = co_await req.execute_async();
    std::cout << "Downloaded " << resp.body.size() << " Byte from " << local_url << "\n";
}());

Because the captured variables remain accessible by name even after they have been destroyed and can still be used by accident, it is preferable to use this approach only if adding parameters is not possible.

Keep the lambda around

The third option is to keep the lambda in scope until the asynchronous coroutine has finished. This involves some synchronization between the coroutine and the parent function and thus might not always be an option. It is, however, important to note that capturing is perfectly fine as long as the lambda lives longer than the coroutine.

std::string url = "https://example.org";
auto async_task = [&url]() -> task<> {
    auto req = http_request::make_get(url);
    auto resp = co_await req.execute_async();
    std::cout << "Downloaded " << resp.body.size() << " Byte from " << url << "\n";
};
as_promise(async_task()).get();

Here the lambda is stored in a local variable and we use the as_promise helper to get a std::future that is fulfilled once the coroutine is done. Because waiting on the returned future ensures the lambda stays in scope for the entire duration of the coroutine, we can simply capture the variable as we normally would, and in fact can even do so using a reference capture. Doing so in this simple example is pointless, as we could simply execute a synchronous HTTP request. However, it can make sense if coroutines are nested within each other and something like async_launch_scope is used to join all child coroutines back to the parent at the end.