How to flatten nested std::optional?

Note: this question was briefly marked as a duplicate of this, but it is not an exact duplicate, since I am asking about std::optional specifically. It is still a good question to read if you care about the general case.

Assume I have nested optionals, something like this (dumb toy example):

struct Person{
    const std::string first_name;
    const std::optional<std::string> middle_name;
    const std::string last_name;
};
struct Form{
    std::optional<Person> person;
};

and this spammy function:

void PrintMiddleName(const std::optional<Form> form){
    if (form.has_value() && form->person.has_value() && form->person->middle_name.has_value()) {
        std::cout << *(*(*form).person).middle_name << std::endl; 
    } else {
        std::cout << "<none>"  << std::endl; 
    }
}

What would be the best way to flatten this optional check? I have made something like this. It is not variadic, but I do not care that much about that (I can add one more level, an overload with membr3, if really necessary, and everything beyond that is terrible code anyway).

template<typename T, typename M>
auto flatten_opt(const std::optional<T> opt, M membr){
    if (opt.has_value() && (opt.value().*membr).has_value()){
        return std::optional{*((*opt).*membr)};
    }
    return decltype(std::optional{*((*opt).*membr)}){};
}

template<typename T, typename M1, typename M2>
auto ret_val_helper(){
    // better code would use declval here since T might not be 
    // default constructible.
    T t;
    M1 m1;
    M2 m2;
    return ((t.*m1).value().*m2).value();
}

template<typename T, typename M1, typename M2>
std::optional<decltype(ret_val_helper<T, M1, M2>())> flatten_opt(const std::optional<T> opt, M1 membr1, M2 membr2){
    if (opt.has_value() && (opt.value().*membr1).has_value()){
        const auto& deref1 = *((*opt).*membr1);
        if ((deref1.*membr2).has_value()) {
            return std::optional{*(deref1.*membr2)};
        }
    }
    return {};
}

void PrintMiddleName2(const std::optional<Form> form){
    auto flat  = flatten_opt(form, &Form::person, &Person::middle_name);
    if (flat) {
        std::cout << *flat << std::endl;
    }
    else {
        std::cout << "<none>"  << std::endl; 
    }
}

godbolt

notes:

  • I do not want to switch away from std::optional to some better optional.
  • I do not care that much about performance. Unless I return a pointer, I must make a copy (unless the argument is a temporary), since std::optional does not support references.
  • I do not care about a flatten_has_value function (although it is useful), since if there is a way to nicely flatten the nested optionals, there is also a way to write that function.
  • I know my code looks like it works, but it is quite ugly, so I am wondering if there is a nicer solution.

Answer

The operation you’re looking for is called the monadic bind operation, and is sometimes spelled and_then (as it is in P0798 and Rust).

You’re taking an optional<T> and a function T -> optional<U> and want to get back an optional<U>. In this case the function is a pointer to data member, but it really does behave as a function in this sense. &Form::person takes a Form and gives back an optional<Person>.
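To make the "pointer to data member behaves as a function" point concrete, here is a minimal sketch using std::invoke on the types from the question. The helper name person_of is made up for illustration.

```cpp
#include <functional>
#include <optional>
#include <string>

struct Person {
    const std::string first_name;
    const std::optional<std::string> middle_name;
    const std::string last_name;
};
struct Form {
    std::optional<Person> person;
};

// person_of is a hypothetical helper, named only for this example.
// std::invoke accepts a pointer to data member as its callable and the
// object as its argument, so this call is equivalent to `form.person`.
inline const std::optional<Person>& person_of(const Form& form) {
    return std::invoke(&Form::person, form);
}
```

Seen this way, &Form::person really is a function Form -> optional<Person>, which is why a generic and_then does not need to treat member pointers specially.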

You should write this in a way that is agnostic to the kind of function. The fact that it’s specifically a pointer to member data isn’t really important here, and maybe tomorrow you’ll want a pointer to member function or even a free function. So that’s:

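// mp_first is presumably Boost.Mp11's mp_first (it extracts U from std::optional<U>);
// SpecializationOf is a user-written concept that is not shown here.
template <typename T,
          typename F,
          typename R = std::remove_cvref_t<std::invoke_result_t<F, T>>,
          typename U = mp_first<R>>
    requires SpecializationOf<R, std::optional>
constexpr auto and_then(std::optional<T> o, F f) -> std::optional<U>
{
    if (o) {
        return std::invoke(f, *o);
    } else {
        return std::nullopt;
    }
}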

This is one of the many kinds of function declarations that are just miserable to write in C++, even with concepts. I’ll leave it as an exercise to properly add references into there. I choose to specifically write it as -> optional<U> rather than -> R because I think it’s important for readability that you can see that it does, in fact, return some kind of optional.
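For readers who do not have Boost.Mp11 or a SpecializationOf concept at hand, here is a rough C++17 sketch of the same idea. The is_optional trait and the middle_name_or_none helper are illustrative names that do not come from the answer, and taking the optional by const reference is only one of several ways to deal with the reference question.

```cpp
#include <functional>
#include <optional>
#include <string>
#include <type_traits>

struct Person {
    const std::string first_name;
    const std::optional<std::string> middle_name;
    const std::string last_name;
};
struct Form {
    std::optional<Person> person;
};

// Illustrative trait: true only for specializations of std::optional.
template <typename T> struct is_optional : std::false_type {};
template <typename T> struct is_optional<std::optional<T>> : std::true_type {};

// Monadic bind: apply f to the contained value, or propagate emptiness.
// R is what f returns (with references and cv stripped), U is its value type.
template <typename T, typename F,
          typename R = std::decay_t<std::invoke_result_t<F&, const T&>>,
          typename U = typename R::value_type>
auto and_then(const std::optional<T>& o, F f) -> std::optional<U>
{
    static_assert(is_optional<R>::value, "f must return a std::optional");
    if (o) {
        return std::invoke(f, *o);  // copies the optional that f yields
    }
    return std::nullopt;
}

// Hypothetical helper that mirrors PrintMiddleName but returns the text.
inline std::string middle_name_or_none(const std::optional<Form>& form)
{
    auto flat = and_then(and_then(form, &Form::person), &Person::middle_name);
    return flat ? *flat : std::string{"<none>"};
}
```

The nested call reads inside-out, which is exactly the readability problem that the chaining syntax discussed next is meant to solve.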

Now, the question is how we chain this across multiple functions. Haskell uses >>= for monadic bind, but in C++ that has the wrong associativity (>>= is right-associative, so o >>= f >>= g would evaluate f >>= g first and require parentheses). So the next closest choice of operator would be >> (which means something different in Haskell, but we’re not Haskell, so it’s okay). Or you could implement this by borrowing the | model that Ranges uses.

So we’d either end up syntactically with:

auto flat  = form >> &Form::person >> &Person::middle_name;

or

auto flat = form | and_then(&Form::person)
                 | and_then(&Person::middle_name);
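Neither spelling exists for std::optional out of the box, so both need a little plumbing. The following is a minimal C++17 sketch of one way to provide them: and_then_fn, the one-argument and_then factory, and the two middle_name_via_* helpers are names invented for this example, and the operators are left unconstrained apart from SFINAE on std::invoke_result_t.

```cpp
#include <functional>
#include <optional>
#include <string>
#include <type_traits>
#include <utility>

struct Person {
    const std::string first_name;
    const std::optional<std::string> middle_name;
    const std::string last_name;
};
struct Form {
    std::optional<Person> person;
};

// Wrapper produced by and_then(f); it only stores the callable.
template <typename F>
struct and_then_fn { F f; };

template <typename F>
and_then_fn<F> and_then(F f) { return {std::move(f)}; }

// Pipe form: opt | and_then(f)
template <typename T, typename F>
auto operator|(const std::optional<T>& o, const and_then_fn<F>& step)
    -> std::decay_t<std::invoke_result_t<const F&, const T&>>
{
    if (o) return std::invoke(step.f, *o);
    return std::nullopt;
}

// Shift form: opt >> f
template <typename T, typename F>
auto operator>>(const std::optional<T>& o, F f)
    -> std::decay_t<std::invoke_result_t<F&, const T&>>
{
    if (o) return std::invoke(f, *o);
    return std::nullopt;
}

inline std::string middle_name_via_pipe(const std::optional<Form>& form)
{
    auto flat = form | and_then(&Form::person)
                     | and_then(&Person::middle_name);
    return flat ? *flat : std::string{"<none>"};
}

inline std::string middle_name_via_shift(const std::optional<Form>& form)
{
    auto flat = form >> &Form::person >> &Person::middle_name;
    return flat ? *flat : std::string{"<none>"};
}
```

Both operators are left-associative, so each step receives the optional produced by the previous one. In real code it would be safer to put the operators in a namespace and constrain them, since a template operator>> on std::optional is fairly greedy.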

A different way to compose multiple monadic binds together is an operation that Haskell spells >=>, which is called Kleisli composition. In this case, it takes a function T -> optional<U> and a function U -> optional<V> and produces a function T -> optional<V>. Constraints for this are exceedingly annoying to write, so I’m just going to skip them. It would look something like this (using the Haskell operator spelling):

template <typename F, typename G>
constexpr auto operator>=>(F f, G g) {
    return [=]<typename T>(T t){
        using R1 = std::remove_cvref_t<std::invoke_result_t<F, T>>;
        static_assert(SpecializationOf<R1, std::optional>);
        using R2 = std::remove_cvref_t<std::invoke_result_t<G, mp_first<R1>>>;
        static_assert(SpecializationOf<R2, std::optional>);

        if (auto o = std::invoke(f, t)) {
            return std::invoke(g, *o);
        } else {
            // can't return nullopt here, have to specify the type
            return R2();
        }
    };
}

And then you could write (or at least you could if >=> were an operator you could use):

auto flat  = form | and_then(&Form::person >=> &Person::middle_name);

Because the result of >=> is now a function that takes a Form and returns an optional<string>.
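Since >=> cannot be overloaded in C++, a compilable version has to use an ordinary name. Below is a C++17 sketch under that assumption: kleisli, is_optional and middle_name_or_none are names chosen for this example, and and_then is a plain two-argument function rather than a pipeable adaptor.

```cpp
#include <functional>
#include <optional>
#include <string>
#include <type_traits>

struct Person {
    const std::string first_name;
    const std::optional<std::string> middle_name;
    const std::string last_name;
};
struct Form {
    std::optional<Person> person;
};

template <typename T> struct is_optional : std::false_type {};
template <typename T> struct is_optional<std::optional<T>> : std::true_type {};

// Kleisli composition: (T -> optional<U>) and (U -> optional<V>)
// combine into a single T -> optional<V>.
template <typename F, typename G>
auto kleisli(F f, G g) {
    return [f, g](const auto& t) {
        using T  = std::decay_t<decltype(t)>;
        using R1 = std::decay_t<std::invoke_result_t<const F&, const T&>>;
        static_assert(is_optional<R1>::value, "f must return a std::optional");
        using U  = typename R1::value_type;
        using R2 = std::decay_t<std::invoke_result_t<const G&, const U&>>;
        static_assert(is_optional<R2>::value, "g must return a std::optional");

        const auto& o = std::invoke(f, t);
        if (o) {
            return R2(std::invoke(g, *o));
        }
        return R2();  // the empty case must name the type, nullopt will not deduce
    };
}

// Plain monadic bind, repeated here so that the example is self-contained.
template <typename T, typename F,
          typename R = std::decay_t<std::invoke_result_t<F&, const T&>>>
auto and_then(const std::optional<T>& o, F f) -> R
{
    static_assert(is_optional<R>::value, "f must return a std::optional");
    if (o) return std::invoke(f, *o);
    return std::nullopt;
}

inline std::string middle_name_or_none(const std::optional<Form>& form)
{
    auto get_middle = kleisli(&Form::person, &Person::middle_name);
    auto flat = and_then(form, get_middle);
    return flat ? *flat : std::string{"<none>"};
}
```

The composed get_middle can be stored and reused like any other callable, which is the main practical difference from chaining and_then calls at each use site.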