
5 patterns in Rust that are kind of a big deal


Recently I've been learning and building in Rust to the point where I'm feeling productive with it.

Like many others, I've found "The Rust Programming Language" book (or "the book", as my rustacean friends like to call it) to be an incredibly helpful resource. But as I've progressed beyond the book, I've found that reading the source code of the standard library and of the crates I was using was extremely illuminating and revealed common motifs in Rust development.

I've collected here a few of the things I found that might be especially helpful for those new to Rust.

Rc<RefCell<T>> (or Arc<RwLock<T>> or Arc<Mutex<T>>)

This one is in the book, but it doesn't come up until late. If you're jumping straight into the language, you might worry that data structures that were easy in other languages are impossible in Rust.

Rc<T> and its thread-safe counterpart Arc<T> allow you to reference-count (or in the case of Arc, atomically reference-count) allocated memory.

Calling .clone() on an Rc<T> does not clone (and copy) the wrapped value. Instead, a counter tracking how many owners the Rc has is incremented, and the same value is effectively owned in multiple places. As these owners are dropped the counter decrements, and when the last one is finally dropped, Rc drops the wrapped value before it is removed itself.

Rc implements Deref (but not DerefMut), which allows you to easily access the value it wraps via "Deref coercion".
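To make the counting concrete, here is a minimal sketch using only the standard library. The rc_counts function is a made-up helper for illustration; Rc::strong_count and Rc::ptr_eq are real std APIs.

```rust
use std::rc::Rc;

/// Returns the strong count after creation, after a clone, and after a drop.
fn rc_counts() -> (usize, usize, usize) {
    let first = Rc::new(String::from("shared"));
    let created = Rc::strong_count(&first);

    // Rc::clone only bumps the counter; the String itself is not copied.
    let second = Rc::clone(&first);
    let cloned = Rc::strong_count(&first);
    // Both handles point at the same allocation.
    assert!(Rc::ptr_eq(&first, &second));

    // Deref coercion: String methods can be called directly on the Rc.
    assert_eq!(second.len(), 6);

    drop(second);
    let dropped = Rc::strong_count(&first);
    (created, cloned, dropped)
}

fn main() {
    let (created, cloned, dropped) = rc_counts();
    println!("strong count: {created} -> {cloned} -> {dropped}");
}
```

Writing Rc::clone(&first) rather than first.clone() is a common convention that makes it obvious at the call site that only the count changes.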

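Meanwhile, RefCell<T> (or RwLock<T> and Mutex<T>) lets you separate the mutability of the type that they wrap from the struct or reference that holds it. This is called "interior mutability". Anywhere a RefCell, RwLock or Mutex is borrowed as immutable, you can try to mutably borrow the interior value with .borrow_mut() (or .write() and .lock(), respectively). RefCell enforces the borrowing rules at runtime rather than at compile time: .borrow_mut() panics if the value is already borrowed, while .try_borrow_mut() returns an error instead.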
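A small sketch of interior mutability, assuming a hypothetical Counter type invented for this illustration. Note that record takes &self, not &mut self.

```rust
use std::cell::RefCell;

struct Counter {
    hits: RefCell<u32>,
}

impl Counter {
    // Takes &self, yet still mutates the value inside the RefCell.
    fn record(&self) {
        *self.hits.borrow_mut() += 1;
    }
}

/// Returns the final hit count and whether a second mutable borrow was refused.
fn demo() -> (u32, bool) {
    let counter = Counter { hits: RefCell::new(0) };
    counter.record();
    counter.record();

    // While one mutable borrow is alive, a second attempt is refused at runtime.
    let guard = counter.hits.borrow_mut();
    let second_attempt_failed = counter.hits.try_borrow_mut().is_err();
    drop(guard);

    let hits = *counter.hits.borrow();
    (hits, second_attempt_failed)
}

fn main() {
    let (hits, refused) = demo();
    println!("hits = {hits}, second mutable borrow refused = {refused}");
}
```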

RefCell<T> and Rc<T> (and their respective thread-safe counterparts) can be combined to create a value that can be shared (with Rc) and mutated in multiple places (inside of a RefCell, which temporarily provides a borrowed mut value).
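For the thread-safe counterparts, a rough sketch might look like the following. The parallel_sum function is a made-up example; each worker thread gets its own Arc handle to the same Mutex-protected total.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

/// Spawns `workers` threads; thread n adds n to a shared total.
fn parallel_sum(workers: u64) -> u64 {
    let total = Arc::new(Mutex::new(0u64));

    let handles: Vec<_> = (1..=workers)
        .map(|n| {
            // Atomically increments the reference count; the Mutex is shared.
            let total = Arc::clone(&total);
            thread::spawn(move || {
                // lock() blocks until this thread holds the mutex.
                let mut guard = total.lock().expect("mutex poisoned");
                *guard += n;
            })
        })
        .collect();

    for handle in handles {
        handle.join().expect("worker panicked");
    }

    let result = *total.lock().expect("mutex poisoned");
    result
}

fn main() {
    println!("sum = {}", parallel_sum(4));
}
```

Rc<RefCell<T>> would not compile here, because neither Rc nor RefCell can be shared across threads.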

One data structure that might be impossible or needlessly difficult to create in Rust without reference counting and interior mutability is a graph. The example below uses Rc<T> to create nodes, one of which is referenced from multiple other nodes. RefCell<T> is used to mutably borrow each node so that it can be marked as visited, which lets a depth-first search avoid visiting a node more than once.

use std::cell::RefCell;
use std::error::Error;
use std::rc::Rc;

fn main() -> Result<(), Box<dyn Error>> {
    graph_demo()
}

#[derive(Debug)]
struct Node {
    name: String,
    value: i64,
    children: Vec<Rc<RefCell<Node>>>,
    visited: bool,
}

impl Node {
    fn new(name: String, value: i64) -> Self {
        Self {
            name,
            value,
            children: vec![],
            visited: false,
        }
    }
}

fn graph_demo() -> Result<(), Box<dyn Error>> {
    let a = Rc::new(RefCell::new(Node::new("A".into(), 5)));
    let b = Rc::new(RefCell::new(Node::new("B".into(), 10)));
    let c = Rc::new(RefCell::new(Node::new("C".into(), 10)));
    a.try_borrow_mut()?.children.push(b.clone());
    // try_borrow_mut() gets a mutable reference to the interior of the
    // RefCell, or errors if it is already borrowed (mutably or immutably).
    // b.clone() increments the reference count instead of copying the
    // whole struct.
    b.try_borrow_mut()?.children.push(c.clone());
    a.try_borrow_mut()?.children.push(c.clone());
    c.try_borrow_mut()?.value = 100;

    let mut stack: Vec<_> = vec![a.clone()];
    while let Some(current) = stack.pop() {
        let mut current = current.try_borrow_mut()?;
        if current.visited {
            println!("Already visited {:?}", current);
            continue;
        }
        println!("Visiting {:?}", current);
        current.visited = true;
        for child in current.children.iter() {
            stack.push(child.clone());
        }
    }
    Ok(())
}

std::collections

Rust beginners know about Vec<T>. A Vec is essentially a growable region of linear memory. As in a C array, the memory location of the nth element is just the memory location of the first element + n * the size of the stored type.

Appending to the end of a Vec<T> takes constant time. (...sometimes -- if more capacity is needed, a new contiguous buffer is allocated and the data copied over, which takes linear time.)
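Both claims can be checked with a short sketch. The two helper functions are invented for this illustration, and the exact growth strategy is an implementation detail of the standard library, so the code only counts how often the capacity changes rather than assuming specific values.

```rust
/// True if element `n` sits exactly `n * size_of::<u64>()` bytes after element 0.
fn nth_is_offset_from_first(values: &[u64], n: usize) -> bool {
    let first = values.as_ptr() as usize;
    let nth = &values[n] as *const u64 as usize;
    nth == first + n * std::mem::size_of::<u64>()
}

/// Pushes `count` values one at a time and returns how often the capacity changed.
fn count_capacity_changes(count: usize) -> usize {
    let mut values: Vec<u64> = Vec::new();
    let mut changes = 0;
    for i in 0..count {
        let before = values.capacity();
        // Usually O(1); O(len) on the occasions when the buffer has to grow.
        values.push(i as u64);
        if values.capacity() != before {
            changes += 1;
        }
    }
    changes
}

fn main() {
    let values: Vec<u64> = vec![10, 20, 30, 40];
    println!("contiguous: {}", nth_is_offset_from_first(&values, 3));
    println!(
        "capacity changes over 1000 pushes: {}",
        count_capacity_changes(1000)
    );
}
```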

But if you need to insert or delete values at the beginning of the collection or somewhere in the middle, values will have to be shuffled in memory to make sure the list stays in order.

Luckily, Rust makes other ways of storing collections of values available in the std::collections module. The module's documentation provides a good summary of when to use the different options.
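As one example, VecDeque supports cheap insertion and removal at both ends. This is only a sketch of one option from the module; build_queue is a hypothetical helper.

```rust
use std::collections::VecDeque;

/// Builds a small queue using operations at both ends.
fn build_queue() -> Vec<i32> {
    let mut queue: VecDeque<i32> = VecDeque::new();
    queue.push_back(2);
    queue.push_back(3);

    // Unlike Vec::insert(0, ..), this does not shift the existing elements.
    queue.push_front(1);

    // Removing from the front is cheap as well.
    let first = queue.pop_front();
    assert_eq!(first, Some(1));

    queue.push_front(0);
    queue.into_iter().collect()
}

fn main() {
    println!("{:?}", build_queue());
}
```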

Sea shells 1 by Leonardo Aguiar, CC BY 2.0

.awaiting for events

Developers using dynamic languages like JavaScript and Python may be used to using event handlers for new connections or incoming messages.

For example, if you are setting up a WebSocket server, you attach a handler to handle incoming connections. That handler sets up other handlers to define what happens when that connection receives an incoming message.

// javascript example

import { WebSocketServer } from 'ws';

const wss = new WebSocketServer({ port: 8080 });

wss.on('connection', function connection(ws) {
  ws.on('error', console.error);

  ws.on('message', function message(data) {
    console.log('received: %s', data);
  });

  ws.send('hello!');
});

Rust tends to use a different pattern: an infinite loop waits on a listener, which blocks the thread or future until it has received a connection. The connection can then be moved to a new thread or future that handles it, while the loop keeps running to accept more connections.
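The thread-based variant of this pattern can be sketched with only the standard library. This is plain TCP rather than WebSockets, and spawn_server, handle_connection and read_greeting are hypothetical helpers written for this illustration.

```rust
use std::io::{BufRead, BufReader, Write};
use std::net::{SocketAddr, TcpListener, TcpStream};
use std::thread;

/// Binds to an OS-assigned port and runs the accept loop on a background thread.
fn spawn_server() -> std::io::Result<SocketAddr> {
    let listener = TcpListener::bind("127.0.0.1:0")?;
    let addr = listener.local_addr()?;
    thread::spawn(move || {
        // accept() blocks this thread until a client connects.
        while let Ok((stream, _peer)) = listener.accept() {
            // Move the connection into its own thread so the loop keeps accepting.
            thread::spawn(move || handle_connection(stream));
        }
    });
    Ok(addr)
}

/// Greets the client, then prints every line it sends until it disconnects.
fn handle_connection(mut stream: TcpStream) -> std::io::Result<()> {
    stream.write_all(b"hello!\n")?;
    let reader = BufReader::new(stream.try_clone()?);
    for line in reader.lines() {
        println!("received: {}", line?);
    }
    Ok(())
}

/// Connects as a client and returns the greeting line sent by the server.
fn read_greeting(addr: SocketAddr) -> std::io::Result<String> {
    let stream = TcpStream::connect(addr)?;
    let mut greeting = String::new();
    BufReader::new(stream).read_line(&mut greeting)?;
    Ok(greeting.trim_end().to_string())
}

fn main() -> std::io::Result<()> {
    let addr = spawn_server()?;
    println!("server said: {}", read_greeting(addr)?);
    Ok(())
}
```

The async version that follows has the same shape, with tokio::spawn in place of thread::spawn and .await in place of blocking calls.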

use futures_util::{SinkExt, StreamExt};
use tokio::net::TcpListener;
use tokio_websockets::{Error, Message, ServerBuilder};

#[tokio::main]
async fn main() -> Result<(), Error> {
    let listener = TcpListener::bind("0.0.0.0:3000").await?;
    while let Ok((stream, _)) = listener.accept().await {
        tokio::spawn(async move {
            let mut ws_stream = ServerBuilder::new().accept(stream).await?;

            ws_stream.send(Message::text("hello!")).await?;

            while let Some(Ok(msg)) = ws_stream.next().await {
                if let Some(txt) = msg.as_text() {
                    println!("received: {}", txt);
                }
            }
            Ok::<_, Error>(())
        });
    }
    Ok(())
}

In this example, it's worth taking a look at SinkExt and StreamExt, extension traits provided by futures_util. These two traits provide an interface for asynchronously polling a sink or a stream with .await syntax. ws_stream in this example is both a futures_sink::Sink and a futures_core::Stream. Because a Future is, at its essence, an interface that is polled to see if a result is ready, it is possible for SinkExt and StreamExt to wrap a similarly pollable Sink or Stream and provide a Future interface with ws_stream.send and ws_stream.next.
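To illustrate what "polled to see if a result is ready" means, here is a minimal sketch using only the standard library. Countdown, NoopWaker and poll_to_completion are invented for this example. A real executor such as tokio parks the task until the waker fires instead of busy-polling as this loop does.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// A toy future that reports Pending a fixed number of times before finishing.
struct Countdown {
    remaining: u32,
}

impl Future for Countdown {
    type Output = &'static str;

    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Self::Output> {
        if self.remaining == 0 {
            Poll::Ready("done")
        } else {
            self.remaining -= 1;
            // Ask to be polled again; a real future would wake when its I/O is ready.
            cx.waker().wake_by_ref();
            Poll::Pending
        }
    }
}

/// A waker that does nothing, which is enough for a busy-polling loop.
struct NoopWaker;

impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

/// Polls the future in a loop, returning its output and the number of polls taken.
fn poll_to_completion<F: Future>(future: F) -> (F::Output, u32) {
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);
    let mut future = std::pin::pin!(future);
    let mut polls = 0;
    loop {
        polls += 1;
        if let Poll::Ready(output) = future.as_mut().poll(&mut cx) {
            return (output, polls);
        }
    }
}

fn main() {
    let (output, polls) = poll_to_completion(Countdown { remaining: 2 });
    println!("{output} after {polls} polls");
}
```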

Note that the stream is owned by the function or closure using it. If we also want to wait for messages from a second source, such as a channel, so that we can forward them to the stream while still waiting for messages from the stream to send back to the channel, we need something like tokio::select!, which acts on the first Future to become ready and cancels the others:

use futures_util::{SinkExt, StreamExt};
use std::error::Error;
use tokio::net::TcpListener;
use tokio::sync::watch;
use tokio_websockets::{Message, ServerBuilder};

#[tokio::main]
async fn main() -> Result<(), Box<dyn Error>> {
    let listener = TcpListener::bind("127.0.0.1:3000").await?;
    let (pubsub_tx, pubsub_rx) = watch::channel("initial value".to_string());

    while let Ok((stream, _)) = listener.accept().await {
        let tx = pubsub_tx.clone();
        let mut rx = pubsub_rx.clone();
        rx.borrow_and_update(); // marks latest value as seen
        tokio::spawn(async move {
            let Ok(mut ws_stream) = ServerBuilder::new().accept(stream).await else {
                return;
            };

            loop {
                tokio::select! {
                    msg = ws_stream.next() => {
                        if let Some(Ok(msg)) = msg {
                            if let Some(txt) = msg.as_text() {
                                println!("received: {}", txt);
                                if tx.send(txt.to_string()).is_err() {
                                    return;
                                }
                            }
                        } else {
                            break;
                        }
                    }
                    result = rx.changed() => {
                        if result.is_err() {
                            return;
                        }
                        let msg = rx.borrow_and_update().clone();
                        if ws_stream.send(Message::text(msg)).await.is_err() {
                            return;
                        }
                    }
                }
            }
        });
    }
    Ok(())
}

Error handling with thiserror and anyhow

The first time I implemented an Error, I found that it required more boilerplate than I was used to from other languages.
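To make that boilerplate concrete, here is a hand-written sketch using only the standard library. The enum, its variants and the check function are illustrative, not taken from any particular codebase.

```rust
use std::error::Error;
use std::fmt;

#[derive(Debug)]
pub enum ErrorEnum {
    NotFound,
    PermissionRequired(String),
}

// Display supplies the human-readable message for each variant.
impl fmt::Display for ErrorEnum {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            ErrorEnum::NotFound => write!(f, "Not found"),
            ErrorEnum::PermissionRequired(name) => {
                write!(f, "Permission {name} required")
            }
        }
    }
}

// Error requires Debug + Display; all of its own methods have defaults.
impl Error for ErrorEnum {}

/// Hypothetical check that fails with a boxed ErrorEnum.
fn check(allowed: bool) -> Result<(), Box<dyn Error>> {
    if allowed {
        Ok(())
    } else {
        Err(Box::new(ErrorEnum::PermissionRequired("admin".into())))
    }
}

fn main() {
    if let Err(err) = check(false) {
        println!("error: {err}");
    }
}
```

Every new variant means another match arm, and wrapping an underlying error also means implementing source() by hand.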

thiserror provides macros to implement the Error trait. Error types can be created with an enum or a struct:

use thiserror::Error;

#[derive(Error, Debug)]
pub enum ErrorEnum {
    #[error("Not found")]
    NotFound,
    #[error("Permission {0} required")]
    PermissionRequired(String),
}

#[derive(Error, Debug)]
#[error("{msg}")] // thiserror needs a message to generate the Display impl
pub struct RemoteError {
    msg: String,
    source: anyhow::Error,
}

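anyhow works well with thiserror (or on its own). Its Result type can carry any error that implements std::error::Error, and its Context trait lets you attach a message to an error as it propagates: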

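use anyhow::{bail, Context, Result};

// Stand-in for any fallible operation; not part of anyhow.
fn this_will_fail() -> Result<()> {
    bail!("Something went wrong")
}

fn fail_for_some_reason() -> Result<()> {
    this_will_fail().context("Failed in example fn")?;
    Ok(())
}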

There's more to anyhow than this brief example, and the docs are a good resource.

Any

The Any trait -- which "most types implement" (specifically, every 'static type) -- lets you downcast generic types to concrete ones.

In the following example, we can pass any type of Vehicle to the take_public_transit function. Since we are using static dispatch, the compiler generates a separate copy of take_public_transit for each type of Vehicle that we pass to it. vehicle_any.downcast_ref::<T>() checks whether the TypeId of the passed value is the same as that of the specified type, and if it is, it returns Some(&T). Within each compiled copy this is an equality comparison between two constants, so for optimized builds the type check and the dead code paths for other types should be optimized away! This is very cool!

use anyhow::{bail, Result};
use std::any::Any;

trait Vehicle {
    fn alert(&self);
}

#[derive(Default)]
struct Car {}

impl Vehicle for Car {
    fn alert(&self) {
        println!("Honk! Honk!");
    }
}

impl Car {
    fn drive(&self) {
        println!("Driving the car!");
    }
}

#[derive(Default)]
struct Trolley {}

impl Vehicle for Trolley {
    fn alert(&self) {
        println!("Ding! Ding!");
    }
}

impl Trolley {
    fn ride(&self) {
        println!("Riding the trolley!");
    }
}

#[derive(Default)]
struct Dog {}

fn watch_out(vehicle: &impl Vehicle) {
    vehicle.alert();
}

fn take_public_transit<V: Any + Vehicle>(vehicle: &V) -> Result<()> {
    let vehicle_any = vehicle as &dyn Any;
    if let Some(trolley) = vehicle_any.downcast_ref::<Trolley>() {
        trolley.ride();
    } else {
        bail!("Not sure how to ride this vehicle!");
    }
    Ok(())
}

fn main() -> Result<()> {
    let car = Car::default();
    let trolley = Trolley::default();
    let rover = Dog::default();
    take_public_transit(&trolley)?;
    car.drive();
    watch_out(&car);
    // take_public_transit(&rover);
    // The above line produces a compiler error
    // since it doesn't satisfy trait bounds
    Ok(())
}
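The TypeId comparison that downcast_ref performs can also be written out directly. In this rough sketch, is_trolley and describe are hypothetical helpers, and the unit structs stand in for the types above.

```rust
use std::any::{Any, TypeId};

struct Trolley;
struct Car;

/// Compiled once per `V`, so both TypeIds are fixed within each copy.
fn is_trolley<V: Any>(_vehicle: &V) -> bool {
    TypeId::of::<V>() == TypeId::of::<Trolley>()
}

/// The same kind of check through a `&dyn Any`, where the type is only known at runtime.
fn describe(value: &dyn Any) -> &'static str {
    if value.downcast_ref::<Trolley>().is_some() {
        "trolley"
    } else if value.is::<Car>() {
        "car"
    } else {
        "unknown"
    }
}

fn main() {
    println!("{}", is_trolley(&Trolley));
    println!("{}", is_trolley(&Car));
    println!("{}", describe(&Car));
}
```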

Any can also downcast trait objects. Trait objects use "dynamic dispatch". Unlike in our last example, where take_public_transit is compiled separately for every type that it supports, each trait object carries a pointer to a "vtable" for its concrete type, which records where that type's implementation of each trait method lives. At runtime, the address of the implementation is looked up in this vtable when a method is called. This has a very small performance cost, but it lets us put objects of different types side by side in a collection like a Vec<_>. (It's worth noting here that the ability to do this is useful by itself. It doesn't necessarily need to be paired with Any -- only if you want to be able to downcast.)

To make downcasting work, a function that casts the trait object to Any (.as_any in this example) must be added to the trait.

use anyhow::{bail, Result};
use std::any::Any;

trait Vehicle {
    fn alert(&self);
    fn as_any(&self) -> &dyn Any;
}

#[derive(Default)]
struct Car {}

impl Vehicle for Car {
    fn alert(&self) {
        println!("Honk! Honk!");
    }

    fn as_any(&self) -> &dyn Any {
        self
    }
}

impl Car {
    fn drive(&self) {
        println!("Driving the car!");
    }
}

#[derive(Default)]
struct Trolley {}

impl Vehicle for Trolley {
    fn alert(&self) {
        println!("Ding! Ding!");
    }

    fn as_any(&self) -> &dyn Any {
        self
    }
}

impl Trolley {
    fn ride(&self) {
        println!("Riding the trolley!");
    }
}

fn take_public_transit(vehicle: &dyn Vehicle) -> Result<()> {
    let vehicle_any = vehicle.as_any();
    if let Some(trolley) = vehicle_any.downcast_ref::<Trolley>() {
        trolley.ride();
    } else {
        bail!("Not sure how to ride this vehicle!");
    }
    Ok(())
}

fn main() {
    let vehicles: Vec<Box<dyn Vehicle>> =
        vec![Box::new(Car::default()), Box::new(Trolley::default())];
    for vehicle in vehicles.iter() {
        if take_public_transit(vehicle).is_err() {
            println!("Didn't ride public transit!");
        }
        vehicle.alert();
    }
}
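As noted earlier, trait objects are useful on their own, without Any. This minimal sketch uses a hypothetical Shape trait (not part of the example above) and calls a method on a mixed collection through dynamic dispatch, with no downcasting.

```rust
trait Shape {
    fn area(&self) -> f64;
}

struct Square(f64);
struct Rect(f64, f64);

impl Shape for Square {
    fn area(&self) -> f64 {
        self.0 * self.0
    }
}

impl Shape for Rect {
    fn area(&self) -> f64 {
        self.0 * self.1
    }
}

/// Each call to area() is resolved through the vtable of the boxed value.
fn total_area(shapes: &[Box<dyn Shape>]) -> f64 {
    shapes.iter().map(|shape| shape.area()).sum()
}

fn main() {
    let shapes: Vec<Box<dyn Shape>> =
        vec![Box::new(Square(2.0)), Box::new(Rect(3.0, 4.0))];
    println!("total area: {}", total_area(&shapes));
}
```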

That's all for now

So those are a few cool things about Rust that, once I figured them out, unlocked the ability to write programs that really do useful things. I hope they help some folks out there climb the Rust learning curve!

Thanks to Jordan for reviewing an early draft of this post!