Consume the Input Only When `Some<T>` is returned

Option<T> is ubiquitous in Rust. It is used to represent a value that may or may not be present: every Option<T> is either Some<T> or None. This is a powerful concept that allows for safe handling of optional values without the need for null references.

Problem Statement. Sometimes, given a function fn f<T, F>(t: T) -> Option<F> that takes ownership of the input t and returns an Option<F>, we want to consume¹ the input only when Some<F> is returned. This is particularly useful when we want to avoid unnecessary cloning or copying of data. Consider the following example:

        
        use rand::Rng;

        #[derive(Debug)]
        struct ZST;

        fn rnd_zst() -> Option<ZST> {
            if rand::random::<bool>() { Some(ZST) }
            else { None }
        }

        fn foo<T: std::fmt::Debug>(p: Option<T>) -> Option<String> {
            if p.is_some() {
                return Some(format!("{:?}", p));
            }
            None
        }

        fn main() {
            let input: Option<ZST> = rnd_zst();
            match foo(input) {
                Some(x) => { /* Do something with x */ }
                None => {
                    let x = input;
                }
            }
        }

The Rust compiler rejects this code with the following error:

        
        
        error[E0382]: use of moved value: `input`
          --> src/main.rs:23:21
           |
        19 |     let input: Option<ZST> = rnd_zst();
           |         ----- move occurs because `input` has type `Option<ZST>`, which does not implement the `Copy` trait
        20 |     match foo(input) {
           |               ----- value moved here
        ...
        23 |             let x = input;
           |                     ^^^^^ value used here after move

The error occurs because input is moved into the function foo and is then used again in the None branch of the match expression (line 23 in the error above). Rust's ownership rules forbid using a value after it has been moved.

Unideal Solutions. To solve this problem, we could consider the following three solutions:

1. Implement the Copy trait, as the compiler note hints. Copy marks types that can be duplicated by simply copying their bits. However, this is not always possible or desirable: types that own heap allocations, such as String or Vec, cannot implement Copy, and implicitly copying large values is wasteful. For instance, imagine that struct ZST; in the example above were a large data structure instead of a zero-sized type (ZST).
2. Pass a shared reference to foo by changing its signature to fn foo<T: std::fmt::Debug>(p: &Option<T>) -> Option<String>. This way, the input is borrowed instead of moved, and it can still be used in the None branch of the match expression. However, this is not always possible or desirable, especially when we do want to consume the input.
3. Migrate to the following code:
```
            
use rand::Rng;

fn rnd_zst() -> Option<ZST> { if rand::random::<bool>() { Some(ZST) } else { None } }

#[derive(Debug, Default)]
struct ZST;

fn foo<T: std::fmt::Debug>(p: &mut Option<T>) -> Option<String> {
    // Move the value out of the slot, leaving `None` behind.
    if let Some(p) = std::mem::take(p) {
        return Some(format!("{:?}", p));
    }
    None
}

fn main() {
    let input: &mut Option<ZST> = &mut rnd_zst();
    match foo(input) {
        Some(x) => { /* Do something with x */ }
        None => {
            // The reference `input` is still usable here.
            let x = input;
        }
    }
}
            
            
```
The main change with respect to the previous code is that foo takes a mutable reference to the input and uses std::mem::take to replace the value of the input with its default value (i.e., None in this case). In this way, we can simulate the consumption of the input without actually moving it. Note that the same result can be achieved by using std::option::Option::take.
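As a minimal sketch of the Option::take alternative mentioned above (the rand dependency is dropped here so the snippet stands alone, and the values passed in main are only illustrative), foo could look like this:

```rust
#[derive(Debug)]
struct ZST;

// Same observable behaviour as the `std::mem::take` version:
// after the call, the caller's slot holds `None`.
fn foo<T: std::fmt::Debug>(p: &mut Option<T>) -> Option<String> {
    // `Option::take` moves the value out and leaves `None` in its place.
    p.take().map(|v| format!("{:?}", v))
}

fn main() {
    let mut full = Some(ZST);
    let out = foo(&mut full);
    // The slot was emptied even though `foo` only received a reference.
    println!("{:?} {:?}", out, full);

    let mut empty: Option<ZST> = None;
    println!("{:?} {:?}", foo(&mut empty), empty);
}
```

The signature still gives no hint that the caller's value is emptied, which is exactly the readability concern with this approach.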

However, this solution is not idiomatic Rust, and I personally find it a horrible one. The most important point is that the signature of foo alone does not make clear what the function does, which violates the principle of least surprise.² In addition, if foo is part of a public API, its behavior can be understood only through documentation. This makes the code less readable and maintainable, and it can lead to confusion and bugs in the future.

Anyway, since we want to consume the input (i.e., keep the type of the input unchanged) only when Some<F> is returned, none of these three solutions suits our purpose.

Decent Solutions. By rearranging the return value of the function foo we may be able to solve the problem. For instance, the following are possible solutions:

1. A simple strategy is to return a tuple of the output and the input, and then use pattern matching to extract the values. This way, the input is consumed only when Some<F> is returned. The signature of the function becomes fn foo<T: std::fmt::Debug>(p: Option<T>) -> (Option<String>, Option<T>). The 2-tuple is either (Some(x), None), when the input is consumed, or (None, input), when it is handed back. However, this is not really elegant: the caller has to deal with a tuple instead of a single value, and nothing in the type rules out surprising combinations such as (Some(x), Some(input)).
2. A more elegant solution is to use the Result<T, E> type to represent the success or failure of the operation. Since the input is an Option<T>, the signature of the function becomes fn foo<T: std::fmt::Debug>(p: Option<T>) -> Result<String, Option<T>>. In this way, the input is consumed only when Ok(x) is returned, while Err(input) hands the input back without consuming it. This is more idiomatic Rust and easier to understand and use. However, again because of the principle of least surprise, if Err(input) does not semantically represent an error, this could be misleading and may cause confusion.
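The two return-value strategies above can be sketched as follows. This is only an illustration under assumptions: the names foo_tuple and foo_result are hypothetical, the fallback type is Option<T> so that the whole input can be handed back, and the random input of the earlier examples is replaced by fixed values.

```rust
use std::fmt::Debug;

#[derive(Debug)]
struct ZST;

// Tuple variant (hypothetical name): at most one component carries a value.
fn foo_tuple<T: Debug>(p: Option<T>) -> (Option<String>, Option<T>) {
    if p.is_some() {
        return (Some(format!("{:?}", p)), None); // input consumed
    }
    (None, p) // input handed back
}

// Result variant (hypothetical name): `Ok` means the input was consumed,
// `Err` carries the untouched input back to the caller.
fn foo_result<T: Debug>(p: Option<T>) -> Result<String, Option<T>> {
    if p.is_some() {
        return Ok(format!("{:?}", p));
    }
    Err(p)
}

fn main() {
    match foo_result(Some(ZST)) {
        Ok(s) => println!("consumed: {}", s),
        Err(input) => println!("handed back: {:?}", input),
    }

    let (out, back) = foo_tuple::<ZST>(None);
    println!("{:?} {:?}", out, back);
}
```

With the Result variant the caller gets the input back by ordinary pattern matching, at the cost of labelling a non-error outcome as Err.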

Ideal Solution. The ideal solution is to use a custom \(\Sigma\)-type to represent the semantic meaning of the operation. In Rust, we can use an enum to represent \(\Sigma\)-types.

We define an enum FooResult<F> that has two variants: Ok(String) and Fallback(F). The Ok(String) variant represents the success case, in which the input is consumed, while the Fallback(F) variant represents the "failure" case, in which the input is given back without being consumed.

The signature of the function becomes fn foo<T: std::fmt::Debug>(p: Option<T>) -> FooResult<Option<T>>. This is more idiomatic Rust and easier to understand and use, because it keeps surprise to a minimum. In addition, this solution is more flexible and extensible, as we can add more variants to the enum in the future if needed.

        
        use rand::Rng;

        fn rnd_zst() -> Option<ZST> { if rand::random::<bool>() { Some(ZST) } else { None } }

        #[derive(Debug)]
        struct ZST;

        enum FooResult<F> {
            Ok(String),
            Fallback(F),
        }

        fn foo<T: std::fmt::Debug>(p: Option<T>) -> FooResult<Option<T>> {
            if p.is_some() {
                return FooResult::Ok(format!("{:?}", p));
            }
            FooResult::Fallback(p)
        }

        fn main() {
            let input = rnd_zst();
            match foo(input) {
                FooResult::Ok(x) => { /* Do something with x */ }
                FooResult::Fallback(input) => {
                    let x = input;
                }
            }
        }

One last small improvement might be to make the FooResult \(\Sigma\)-type more generic and expressive in the type system. That is, we could:

- use a generic type T for the Ok variant instead of the fixed type String;
- introduce the variant Err(E) to represent the error case, where E is a generic type that can represent any error type.

This way, we can use the same \(\Sigma\)-type in different contexts and with different types. The semantic meaning of the operation is still preserved, but we gain flexibility and expressiveness in the type system. This is the definition of the \(\Sigma\)-type that I would personally use in my code.

        
        enum ResultWithFallback<T, E, F> {
            Ok(T),
            Err(E),
            Fallback(F),
        }
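To illustrate how the three variants might be told apart at a call site, here is a small sketch. The function parse_tagged and its '#' prefix convention are hypothetical and exist only for this example: untagged input is handed back untouched, tagged input is consumed and either parsed or reported as an error.

```rust
use std::num::ParseIntError;

enum ResultWithFallback<T, E, F> {
    Ok(T),
    Err(E),
    Fallback(F),
}

// Hypothetical function: consumes strings of the form "#<number>".
fn parse_tagged(input: String) -> ResultWithFallback<u32, ParseIntError, String> {
    if !input.starts_with('#') {
        // Not our format: give the input back without consuming it.
        return ResultWithFallback::Fallback(input);
    }
    // '#' is a single byte, so slicing from index 1 is valid.
    match input[1..].parse::<u32>() {
        Ok(n) => ResultWithFallback::Ok(n),
        Err(e) => ResultWithFallback::Err(e),
    }
}

fn main() {
    for raw in ["#42", "#x", "plain"] {
        match parse_tagged(raw.to_string()) {
            ResultWithFallback::Ok(n) => println!("parsed {}", n),
            ResultWithFallback::Err(e) => println!("error: {}", e),
            ResultWithFallback::Fallback(s) => println!("handed back {:?}", s),
        }
    }
}
```

Keeping Err and Fallback as separate variants lets the caller distinguish "this went wrong" from "this was not handled, here is your input".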

¹ Consume means to take ownership of the input and make it unavailable for further use. In Rust, this is typically done by passing the input to a function that takes ownership of it. Of course, this is in contrast to borrowing, where the input is passed to a function that does not take ownership of it and can be used later.

² Principle of least surprise, also known as principle of least astonishment, is a design principle that states that a component of a system should behave in a way that most users will expect it to behave, and therefore not astonish or surprise users. More info: https://en.wikipedia.org/wiki/Principle_of_least_astonishment.
