Nutype: the newtype with guarantees!

Serhii Potapov February 13, 2023 #rust #macro #newtype #nutype

The newtype pattern

I am a big fan of the newtype pattern in Rust. In my projects I use it as much as I can: it makes my code self-documenting and can even help enforce domain logic through Rust's type system.

I often find myself writing code similar to this:

use derive_more::{AsRef, Display, Into, From};
use serde::{Deserialize, Serialize};

#[derive(Debug, Clone, PartialEq, Eq, AsRef, Into, From)]
pub struct Email(String);

impl Email {
    pub fn new(value: &str) -> Result<Self, EmailError> {
        let sanitized_value = value.trim().to_lowercase();
        if let Some(error) = Self::validate(&sanitized_value) {
            Err(error)
        } else {
            Ok(Self(sanitized_value))
        }
    }

    fn validate(value: &str) -> Option<EmailError> {
        if value.is_empty() {
            Some(EmailError::Missing)
        } else if !value.contains('@') {
            Some(EmailError::Invalid)
        } else {
            None
        }
    }
}

#[derive(Debug, Clone, Copy, Display, Serialize, Deserialize)]
pub enum EmailError {
    #[display(fmt = "can not be blank")]
    Missing,

    #[display(fmt = "is invalid")]
    Invalid,
}

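It is easy to grasp what is going on here: Email::new() sanitizes the input by trimming whitespace and lowercasing it, validates the result, and returns either an Email or an EmailError.

Generally, this approach works well. However, there are a few pain points.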

Pain point 1: no constraints

As long as everyone uses ::new() to obtain an instance of Email we can be sure that every instance of Email in the system has an inner string value that complies with the validation rules.

But that is a convention, not a constraint. Chances are high that at some point someone will create an Email that bypasses the validation rules:

let email = Email("Oopsy".to_string());

or

let email = Email::from("Bang".to_string());
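Both bypasses can be reproduced in a small std-only sketch. The hand-written From impl stands in for the derive_more derive, and bypass_validation() and as_str() are hypothetical helpers added only for illustration. Note that the direct tuple constructor only works from code that can see the private field, that is, from within the defining module:

```rust
// Std-only approximation of the hand-written newtype from the example above.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Email(String);

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum EmailError {
    Missing,
    Invalid,
}

impl Email {
    /// The intended entry point: sanitize first, then validate.
    pub fn new(value: &str) -> Result<Self, EmailError> {
        let sanitized = value.trim().to_lowercase();
        if sanitized.is_empty() {
            Err(EmailError::Missing)
        } else if !sanitized.contains('@') {
            Err(EmailError::Invalid)
        } else {
            Ok(Self(sanitized))
        }
    }

    /// Hypothetical accessor, used here only to inspect the inner value.
    pub fn as_str(&self) -> &str {
        &self.0
    }
}

// Assumption: this mimics what a derived `From<String>` does,
// wrapping the string without any checks.
impl From<String> for Email {
    fn from(value: String) -> Self {
        Self(value)
    }
}

/// Hypothetical helper: returns two `Email` values that never saw `new()`.
pub fn bypass_validation() -> (Email, Email) {
    // Allowed here because this code lives in the module that defines `Email`.
    let direct = Email("Oopsy".to_string());
    // Allowed anywhere `From<String>` is in scope.
    let converted = Email::from("Bang".to_string());
    (direct, converted)
}
```

Neither value contains an '@', yet both have the type Email, which is exactly the problem.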

Pain point 2: cumbersomeness

As I mentioned above, for the sake of type safety and documentation, I use the newtype pattern a lot. For example, a typical user structure may look like the following one:

struct User {
    id: UserId,
    email: Email,
    username: Username,
    birthday: Birthday,
    first_name: FirstName,
    last_name: LastName,
    // etc
}

You can imagine that defining the newtype structs, error types, and validation for every single type quickly becomes cumbersome.

This made me seek a way to DRY my code.

Nutype: a new hope

Breaking the problem down, for most of my newtypes I just want the same few things: sanitization, validation, and a guarantee that neither can be bypassed.

So I came up with the nutype library, which relies heavily on proc macros and does exactly what I need.

For example, the same Email type can be defined in a much shorter and more declarative way:

use nutype::nutype;

#[nutype(
    sanitize(trim, lowercase)
    validate(present, with = |s| s.contains('@'))
)]
#[derive(Debug, Clone, PartialEq, Eq, AsRef, Into, TryFrom, Serialize, Deserialize)]
pub struct Email(String);

The code above also defines EmailError implicitly. I know this may be too much magic for some, but I wanted to experiment with it.

What about constraints and guarantees?

One of the key features of nutype is that it disallows obtaining a value of a particular type bypassing the validation rules. To my knowledge, there is no way to do it in safe Rust. If you find any, please let me know!

We can try:

let email = Email("Oopsy".to_string());

and we get an error:

error[E0423]: cannot initialize a tuple struct that contains private fields

This is because nutype wraps the type in an extra module, similar to the following:

mod email {
    pub struct Email(String);
}
use email::Email;

This makes the inner tuple field inaccessible to the outer world.
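To make the mechanism concrete, here is a hand-written, std-only approximation of such a module. It is a sketch of the idea rather than nutype's actual macro expansion; the method names new() and into_inner() and the shape of EmailError are assumptions chosen for illustration:

```rust
mod email {
    #[derive(Debug, Clone, PartialEq, Eq)]
    pub struct Email(String); // public type, private field

    #[derive(Debug, Clone, Copy, PartialEq, Eq)]
    pub enum EmailError {
        Missing,
        Invalid,
    }

    impl Email {
        /// The only way to build an `Email` from outside this module.
        pub fn new(raw: &str) -> Result<Self, EmailError> {
            let sanitized = raw.trim().to_lowercase();
            if sanitized.is_empty() {
                Err(EmailError::Missing)
            } else if !sanitized.contains('@') {
                Err(EmailError::Invalid)
            } else {
                Ok(Self(sanitized))
            }
        }

        /// Gives the inner string back by value; no `&mut String` is handed out.
        pub fn into_inner(self) -> String {
            self.0
        }
    }
}
use email::{Email, EmailError};

// Outside `mod email` the tuple constructor is not usable:
// let email = Email("Oopsy".to_string()); // error[E0423]
```

Because the field is private to the module and no From impl or mutable accessor is exposed, every Email seen by the rest of the program has passed through new().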

Looking ahead, I will say that it is impossible to obtain an invalid Email even by deriving DerefMut and modifying an instance of a valid email. You can try it yourself.

As a result, we can firmly rely on the types: whenever we have a value of type Email, we can be confident that it is indeed a valid email!

What is next?

For now, nutype works with serde (this requires the serde1 feature flag). But I'd like it to play well with some other major crates in the ecosystem too.

Arbitrary support

In the future, I'd like to add support for the Arbitrary crate.

Consider the following example:

#[nutype(validate(min = 18, max = 99))]
#[derive(Arbitrary)]
struct Age(u8);

This would automatically derive the Arbitrary trait so that it generates only valid values for Age, within the range 18..=99.

That said, I suspect it will be impossible to combine Arbitrary with custom validation rules.
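For a simple range rule, the generation step could look roughly like the std-only sketch below. It does not use the Arbitrary crate at all; from_arbitrary_byte() is a hypothetical helper that shows one way to fold any raw byte into the valid range, which is possible precisely because the rule is a plain min/max and not an opaque closure:

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Age(u8);

impl Age {
    pub const MIN: u8 = 18;
    pub const MAX: u8 = 99;

    /// Validating constructor: accepts only values in 18..=99.
    pub fn new(value: u8) -> Option<Self> {
        (Self::MIN..=Self::MAX)
            .contains(&value)
            .then_some(Self(value))
    }

    pub fn into_inner(self) -> u8 {
        self.0
    }

    /// Hypothetical generator step: maps any raw byte onto a valid age.
    pub fn from_arbitrary_byte(byte: u8) -> Self {
        let span = Self::MAX - Self::MIN + 1; // 82 valid values
        // `byte % span` is in 0..=81, so the sum stays within 18..=99.
        Self(Self::MIN + byte % span)
    }
}
```

With a custom predicate there is no such mapping to derive, so a generator could only fall back on trial and error.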

Deriving Ord and Eq for floats

If you've worked with the float types (f32, f64), you have probably learned by now that they implement neither Ord nor Eq.

This causes some pain: it's not possible to call sort() on a Vec<f64> or to use a BTreeSet<f64>. Both require the Ord trait.

The float types do not implement Ord and Eq because of NaN: f64::NAN is not equal to itself and has no defined order relative to any other value. Infinite values such as f64::INFINITY compare fine, but they are rarely wanted in domain types either. The is_finite method checks for both: it returns false for NaN and for the infinities.
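A quick std-only check shows which values actually break comparisons. The helpers try_compare() and equals_itself() are hypothetical names introduced for this sketch:

```rust
use std::cmp::Ordering;

/// Compares two floats; `None` means the pair has no defined ordering.
pub fn try_compare(a: f64, b: f64) -> Option<Ordering> {
    a.partial_cmp(&b)
}

/// Reflexivity check: true for every float except NaN.
pub fn equals_itself(x: f64) -> bool {
    x == x
}
```

Here try_compare(f64::NAN, 1.0) returns None and equals_itself(f64::NAN) returns false, while f64::INFINITY orders normally against finite values. is_finite() returns false for both NaN and the infinities.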

It should be possible to rule out NaN and infinity with validation and thereby allow deriving Ord and Eq:

#[nutype(validate(finite))]
#[derive(PartialOrd, Ord, PartialEq, Eq)]
struct Price(f64);

This will come in the future versions of nutype! =)
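Written by hand, the same idea might look like the following std-only sketch. It is not what nutype generates; Price::new() and into_inner() are assumed names, and in real code the type would live in its own module so that the field cannot be set directly:

```rust
use std::cmp::Ordering;

#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Price(f64);

impl Price {
    /// Validating constructor: rejects NaN and the infinities.
    pub fn new(value: f64) -> Option<Self> {
        value.is_finite().then_some(Self(value))
    }

    pub fn into_inner(self) -> f64 {
        self.0
    }
}

// Sound only because `new` rejects NaN, so `==` is reflexive for every Price.
impl Eq for Price {}

impl PartialOrd for Price {
    fn partial_cmp(&self, other: &Self) -> Option<Ordering> {
        Some(self.cmp(other))
    }
}

impl Ord for Price {
    fn cmp(&self, other: &Self) -> Ordering {
        // Finite floats are always comparable, so this never panics
        // for values built through `Price::new`.
        self.0
            .partial_cmp(&other.0)
            .expect("Price is always finite")
    }
}
```

With these impls a Vec<Price> can be sorted with a plain sort(), and Price can be stored in a BTreeSet.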
