Rust: Trait Objects vs Generics

30 Mar

I’m taking a quick detour from LogStore to talk about a great comment on an HN post, “100 days with Rust, or, a series of brick walls”. The comment is from kibwen, and I’m basically going to copy-and-paste it into this blog post for two reasons: 1) hopefully it’ll be easier for folks to find; 2) I want to be able to find it again more easily myself, as I think it’s great advice and a great explanation! Now, on to his comment; the formatting and editing are mostly mine.

Don’t Use Trait Objects, Just Use Generics. You’ll Thank Me.

Here’s the setup: we have several different types, and those types implement the same trait (think of it like an interface from other languages).

    // Define two different types
    struct Chihuahua;
    struct GreatDane;
    
    // Define a trait with a method
    trait Bark {
        fn bark(&self);
    }
    
    // Implement that method for both types
    impl Bark for Chihuahua {
        fn bark(&self) { println!("woof") }
    }
    impl Bark for GreatDane {
        fn bark(&self) { println!("WOOF") }
    }

Using instances of these types looks like so:

    let rover = Chihuahua;
    let marmaduke = GreatDane;
    
    rover.bark();  // woof
    marmaduke.bark();  // WOOF

Now say that you want to write a function that accepts any type that implements the Bark trait. As I mentioned before, there are two ways to do it: the static way and the dynamic way. Here’s what both versions of the function look like:

    fn speak_static<T: Bark>(dog: T) {
        dog.bark();
    }

    fn speak_dynamic(dog: Bark) {  // wait for it...
        dog.bark();
    }

In the first one, there’s a generic type (the T), which we have bounded by the Bark trait. At compile time, for each different type that you use with this function, the compiler generates a new copy of the function with T replaced by whatever type you actually used, a process known as monomorphization (this might seem excessive, but it’s crucial for further optimizations such as inlining).

Furthermore, calling this function is trivial:

    speak_static(rover);  // woof
    speak_static(marmaduke);  // WOOF

The fact that it’s so easy to use these functions is what we mean when we say that Rust “prefers” static dispatch. I’ll come back to this in a moment.
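To make the "one copy per type" idea a little more concrete, here is a minimal sketch. The helper `instantiated_for` is hypothetical (it is not part of the original comment); it has the same shape as `speak_static` but also reports, via `std::any::type_name`, which concrete type each generated copy was compiled for.

```rust
use std::any::type_name;

struct Chihuahua;
struct GreatDane;

trait Bark {
    fn bark(&self);
}

impl Bark for Chihuahua {
    fn bark(&self) { println!("woof") }
}
impl Bark for GreatDane {
    fn bark(&self) { println!("WOOF") }
}

// Hypothetical helper with the same bound as speak_static. The compiler
// generates a separate copy of it for every concrete T it is called with,
// and each copy reports the name of its own T.
fn instantiated_for<T: Bark>(dog: T) -> &'static str {
    dog.bark(); // resolved at compile time, no vtable lookup
    type_name::<T>()
}
```

Calling `instantiated_for(Chihuahua)` and `instantiated_for(GreatDane)` should yield two different type names, which is a visible hint that two separate functions exist in the compiled program. The exact string returned by `type_name` is not guaranteed by the standard library, so treat it as a diagnostic only.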

For the dynamic version, it’s different because there are no generics at all. Instead, the function is just taking a normal parameter of type Bark. Looks simple, right? In fact, it even looks simpler than the static version! The illusion of simplicity is what makes this so pernicious to beginners. In fact, I’ve lied to you completely: despite seeming like this should work, it doesn’t even compile. That’s because, unlike many other languages, Rust doesn’t heap-allocate (or “box”) things by default. It has to pass function parameters, unboxed, on the stack. And trying to generate a single version of a function whose parameters have unknown size is pretty fundamentally unsafe.

So we have to give this parameter a size. If you’re coming from a high-level language, even this is already probably an alien concept (especially since “size on the stack”, which is what we care about here, isn’t the same thing as “total size of every memory allocation this type might transitively point to”).
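As a rough illustration of "size on the stack" versus "everything the value points to", the sketch below uses `std::mem::size_of` and `std::mem::size_of_val`. The helper names `chihuahua_size` and `vec_handle_size` are made up for this example.

```rust
use std::mem::{size_of, size_of_val};

struct Chihuahua; // no fields, so a value of this type occupies zero bytes

// Size of the Chihuahua type itself, known at compile time.
fn chihuahua_size() -> usize {
    size_of::<Chihuahua>()
}

// Size of a Vec *handle* as it sits on the stack. The elements live in a
// separate heap buffer, so this number does not change with `len`.
fn vec_handle_size(len: usize) -> usize {
    let v: Vec<u8> = vec![0; len];
    size_of_val(&v)
}
```

Here `vec_handle_size(0)` and `vec_handle_size(10_000)` return the same number: the compiler only needs the size of the handle, not of the data behind it. A bare trait like Bark has no such compile-time size at all, which is the problem the next paragraph addresses.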

Anyway, we give this type a size by sticking it behind a pointer. There are many different pointer types we can use depending on one’s need. The simplest is probably Box:

    fn speak_dynamic_box(dog: Box<Bark>) {
        dog.bark();
    }

Of course, using a Box implies a heap allocation, and, since Rust loves speed, it prefers stack allocation to heap allocation. So what you might actually want to do instead is use a reference, which lets you avoid the heap altogether:

    fn speak_dynamic_ref(dog: &Bark) {
        dog.bark();
    }

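Now you have a function that takes a reference to a value that can live on the stack. The reference itself is a “fat” pointer: it carries the address of the value along with the address of a vtable for its type. There are still two pointer indirections involved in calling bark(), which isn’t great, but at least we’ve gotten rid of that heap allocation.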
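The extra cost of a trait-object reference can be seen in its size. The following is a sketch only: it is written with the `dyn` keyword that current Rust requires for trait object types, and the helpers `thin_ref_size` and `fat_ref_size` are invented for this example. The Rust Reference describes the "twice a pointer" size as current behavior rather than a guarantee.

```rust
use std::mem::size_of;

struct GreatDane;

trait Bark {
    fn bark(&self);
}

impl Bark for GreatDane {
    fn bark(&self) { println!("WOOF") }
}

// Current Rust spells a trait object type with `dyn`.
fn speak_dynamic_ref(dog: &dyn Bark) {
    dog.bark(); // the method is looked up through the vtable at run time
}

// A reference to a concrete, sized type is a single pointer...
fn thin_ref_size() -> usize {
    size_of::<&GreatDane>()
}

// ...while a trait-object reference carries a data pointer and a vtable pointer.
fn fat_ref_size() -> usize {
    size_of::<&dyn Bark>()
}
```

On today's compilers `fat_ref_size()` is twice `thin_ref_size()`, and `speak_dynamic_ref(&GreatDane)` needs no heap allocation.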

It doesn’t end there, though. If you try to just call speak_dynamic_box(marmaduke), which is how easy it was for speak_static, the compiler will error. That’s because speak_dynamic_box doesn’t take a GreatDane, it takes a Box<Bark>, which isn’t even close to the same thing. So you have to call it like this:

    speak_dynamic_box(Box::new(marmaduke) as Box<Bark>);

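Not only do you have to box it up manually, but the box also has to be converted into a trait object. The cast above spells that conversion out; at a call site like this one the compiler will also perform it implicitly, so speak_dynamic_box(Box::new(marmaduke)) works too. Either way it’s not pretty, and definitely not worth avoiding generics for.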
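For readers who want to try the dynamic versions on a current compiler, here is a hedged sketch. It assumes two changes that are not in the original comment: trait object types are written with `dyn` (required by current Rust), and `bark` returns the sound instead of printing it so the result is easy to inspect. The `call_both` function is a hypothetical driver.

```rust
struct Chihuahua;
struct GreatDane;

// Variant of the post's trait: `bark` returns the sound instead of printing
// it (an assumption made here so the result can be checked).
trait Bark {
    fn bark(&self) -> &'static str;
}

impl Bark for Chihuahua {
    fn bark(&self) -> &'static str { "woof" }
}
impl Bark for GreatDane {
    fn bark(&self) -> &'static str { "WOOF" }
}

fn speak_dynamic_box(dog: Box<dyn Bark>) -> &'static str {
    dog.bark()
}

fn speak_dynamic_ref(dog: &dyn Bark) -> &'static str {
    dog.bark()
}

// Hypothetical driver showing both call styles.
fn call_both() -> (&'static str, &'static str) {
    // Boxed call with the explicit cast from the post, plus `dyn`.
    let boxed = speak_dynamic_box(Box::new(GreatDane) as Box<dyn Bark>);
    // Borrowed call: no heap allocation, `&Chihuahua` becomes `&dyn Bark`.
    let borrowed = speak_dynamic_ref(&Chihuahua);
    (boxed, borrowed)
}
```

Compare either call with `speak_static(rover)`: the generic version needs no boxing, no borrowing decision, and no trait object type in the signature.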

And all this is still understating the restrictions on trait objects. For example, once you cast to a trait object, you can’t simply cast back to the original type (the original type is erased, and if we let you cast back freely then you’d be able to turn rover into a GreatDane!). Furthermore, because of inherent restrictions in how vtables work, not all traits can even be used as trait objects (and trying to explain the technical justification behind these rules, known collectively as “object safety”, is enough to make anyone’s eyes glaze over). Finally, getting back to the speak_dynamic_ref example, it only looks as simple as it does (and it doesn’t really look simple) because our example is so simple. If you try to expand it into anything useful, you quickly need to really know what you’re doing with lifetimes, lest you fall into despair.
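To give a flavor of where lifetimes creep in, consider storing borrowed trait objects in a struct. The sketch below is an illustration, not something from the original comment: `Kennel`, `admit`, `roll_call`, and `roll_call_demo` are hypothetical names, `dyn` is the current spelling for trait object types, and `bark` returns its sound so the result can be inspected.

```rust
struct Chihuahua;
struct GreatDane;

trait Bark {
    fn bark(&self) -> &'static str;
}

impl Bark for Chihuahua {
    fn bark(&self) -> &'static str { "woof" }
}
impl Bark for GreatDane {
    fn bark(&self) -> &'static str { "WOOF" }
}

// Hypothetical struct: as soon as it stores borrowed trait objects, it needs
// a lifetime parameter tying it to the dogs it refers to.
struct Kennel<'a> {
    dogs: Vec<&'a dyn Bark>,
}

impl<'a> Kennel<'a> {
    fn new() -> Self {
        Kennel { dogs: Vec::new() }
    }

    // The borrowed dog must live at least as long as 'a.
    fn admit(&mut self, dog: &'a dyn Bark) {
        self.dogs.push(dog);
    }

    fn roll_call(&self) -> Vec<&'static str> {
        self.dogs.iter().map(|dog| dog.bark()).collect()
    }
}

fn roll_call_demo() -> Vec<&'static str> {
    // The dogs are declared before the kennel, so they outlive it.
    let rover = Chihuahua;
    let marmaduke = GreatDane;
    let mut kennel = Kennel::new();
    kennel.admit(&rover);
    kennel.admit(&marmaduke);
    kennel.roll_call()
}
```

Even this small step adds a lifetime parameter to the struct, its impl block, and the method that accepts a dog; a generic struct that owns its dogs would avoid all three.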

Summary

To summarize, trait objects are an advanced feature that should only be attempted by people who need dynamic dispatch. Rust is designed to favor static dispatch. Don’t be fooled by the apparent simplicity of defining functions or structs that take traits as types. In fact, in the near future we’ll be introducing a new keyword to make it absolutely clear when trait objects are being used, solely so that new users don’t fall into the trap of thinking that they’re a simpler path forward than generics. (That keyword has since arrived as dyn: current Rust writes these types as Box<dyn Bark> and &dyn Bark.)
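As one example of what "needing dynamic dispatch" can look like, the sketch below contrasts a homogeneous slice (fine with generics) with a mixed collection (which calls for trait objects). All function names here are hypothetical, `dyn` is the current spelling for trait object types, and `bark` returns its sound so the output can be inspected.

```rust
struct Chihuahua;
struct GreatDane;

trait Bark {
    fn bark(&self) -> &'static str;
}

impl Bark for Chihuahua {
    fn bark(&self) -> &'static str { "woof" }
}
impl Bark for GreatDane {
    fn bark(&self) -> &'static str { "WOOF" }
}

// Static dispatch: every element of the slice must be the same type T.
fn chorus_static<T: Bark>(dogs: &[T]) -> Vec<&'static str> {
    dogs.iter().map(|dog| dog.bark()).collect()
}

// Dynamic dispatch: the slice may mix any types that implement Bark.
fn chorus_dynamic(dogs: &[Box<dyn Bark>]) -> Vec<&'static str> {
    dogs.iter().map(|dog| dog.bark()).collect()
}

// A collection holding two different concrete types behind one trait.
fn mixed_pack() -> Vec<Box<dyn Bark>> {
    let pack: Vec<Box<dyn Bark>> = vec![Box::new(Chihuahua), Box::new(GreatDane)];
    pack
}
```

`chorus_static(&[Chihuahua, Chihuahua])` works, but a slice mixing a Chihuahua and a GreatDane cannot be passed to it because there is no single T. If you do not have that kind of mixed collection, the generic version is usually the simpler choice.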
