Advice for myself on how to avoid verbose Rust code
When you're introduced to Rust's data types, you will often be introduced to their usage via match:
fn divide(numerator: f64, denominator: f64) -> Option<f64> {
    if denominator == 0.0 {
        None
    } else {
        Some(numerator / denominator)
    }
}

// The return value of the function is an option
let result = divide(2.0, 3.0);

// Pattern match to retrieve the value
match result {
    // The division was valid
    Some(x) => println!("Result: {}", x),
    // The division was invalid
    None => println!("Cannot divide by 0"),
}
In practice, though, when writing more procedural code, simply doing a bunch of matches will lead to a lot of indentation that doesn't match the "conceptual indentation" of the logic you are trying to implement. Fortunately, there are many options available for avoiding this.
Here are a handful of things that I have figured out in order to try and simplify the aesthetics of my code.
Write constructors for your enums
When writing a struct, the first thing I will generate is a pub fn new(...) -> MyString {}. But often I won't do this for enumerations, as these objects generally feel "simple enough" that I don't need one.
Here is an example enum I wrote working on a Redis clone:
#[derive(Clone, Debug, PartialEq)]
pub enum RedisData {
    SimpleString(Vec<u8>),
    BulkString(Vec<u8>),
    ErrString(Vec<u8>),
    Int(u64),
    Array(Vec<RedisData>),
    Nil,
}
These are mostly just wrappers around bags of bytes, so there isn't any complex setup required (unlike something like a connection manager).
I then start writing out code with these enumerations, and it gets... annoying.
fn test_cmd_read() {
    use super::RedisCmd::*;
    use crate::client_read::RedisData::*;

    assert_eq!(read_cmd(SimpleString(b"Ping".to_vec())), Ok(Ping));
    assert_eq!(read_cmd(BulkString(b"PIng".to_vec())), Ok(Ping));
}
Usually I think "OK it's just test code, not like this will affect my real code, and I should just suck it up." Then, of course, I write my application code and run into the same thing. With a constructor on the enum, the test line becomes:
assert_eq!(read_cmd(RedisData::simple_string(b"Ping")), Ok(Ping));
Is this a lot better? It's still kind of long (you could opt for writing a plain function to avoid the RedisData qualification), but at least you're not writing out to_vec every other line.
One could take this even further with a RedisData::string function that chooses the "right string" for your use case depending on whether there is a newline or not.
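Here is a minimal sketch of what such constructors might look like. The enum is repeated so the snippet stands alone. The impl AsRef<[u8]> parameter, the bulk_string helper, and the exact rule inside string (fall back to a bulk string when the bytes contain a carriage return or newline) are my assumptions for illustration, not the code from the actual project.

```rust
#[derive(Clone, Debug, PartialEq)]
pub enum RedisData {
    SimpleString(Vec<u8>),
    BulkString(Vec<u8>),
    ErrString(Vec<u8>),
    Int(u64),
    Array(Vec<RedisData>),
    Nil,
}

impl RedisData {
    /// Accepts anything byte-like (b"Ping", "Ping", a Vec<u8>, ...)
    /// so callers do not have to write to_vec themselves.
    pub fn simple_string(bytes: impl AsRef<[u8]>) -> RedisData {
        RedisData::SimpleString(bytes.as_ref().to_vec())
    }

    /// Same idea for bulk strings (hypothetical helper).
    pub fn bulk_string(bytes: impl AsRef<[u8]>) -> RedisData {
        RedisData::BulkString(bytes.as_ref().to_vec())
    }

    /// Picks a variant based on the content (assumed rule):
    /// anything containing a line break becomes a BulkString.
    pub fn string(bytes: impl AsRef<[u8]>) -> RedisData {
        let bytes = bytes.as_ref();
        if bytes.contains(&b'\r') || bytes.contains(&b'\n') {
            RedisData::bulk_string(bytes)
        } else {
            RedisData::simple_string(bytes)
        }
    }
}
```

Taking impl AsRef<[u8]> instead of Vec<u8> is the design choice that removes the to_vec noise at call sites, at the cost of always copying the bytes.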
Study Option and Result methods deeply
Option and Result are probably some of the most common datatypes in the code I end up writing. Option shows up just as a consequence of dealing with data, but my heavy Result usage comes from how many tools are available for writing flat code, thanks to the ? operator in particular.
The nice ones
// provide a default
Some(x).unwrap_or(y) = x
None.unwrap_or(y) = y
// provide a default (gotten from a function call)
// useful if calculating the default is expensive
Some(x).unwrap_or_else(f) = x
None.unwrap_or_else(f) = f()
// apply a function "inside" the Option
Some(x).map(f) = Some(f(x))
None.map(f) = None
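The equations above are shorthand rather than real Rust, so here is a small runnable sketch of the same three methods. The expensive_default function and the concrete numbers are made up for the example.

```rust
// Stand-in for a default that is costly to compute (hypothetical).
fn expensive_default() -> i32 {
    42
}

fn option_helpers_demo() {
    let present: Option<i32> = Some(3);
    let absent: Option<i32> = None;

    // unwrap_or: fall back to a ready-made default
    assert_eq!(present.unwrap_or(0), 3);
    assert_eq!(absent.unwrap_or(0), 0);

    // unwrap_or_else: the function is only called when the value is missing
    assert_eq!(present.unwrap_or_else(expensive_default), 3);
    assert_eq!(absent.unwrap_or_else(expensive_default), 42);

    // map: transform the value "inside" the Option
    assert_eq!(present.map(|x| x * 2), Some(6));
    assert_eq!(absent.map(|x| x * 2), None);
}
```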
There's even nice stuff that lets you replicate JavaScript-style option overlaying:
let config_value = from_commandline
    .or(from_config_file)
    .unwrap_or(DEFAULT_VALUE);
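As a self-contained sketch of that overlay, assuming the setting is a port number (the resolve_port name and the DEFAULT_PORT constant are invented for the example):

```rust
// Assumed fallback value, purely for illustration.
const DEFAULT_PORT: u16 = 6379;

// The command line wins over the config file, which wins over the default.
fn resolve_port(from_commandline: Option<u16>, from_config_file: Option<u16>) -> u16 {
    from_commandline
        .or(from_config_file)
        .unwrap_or(DEFAULT_PORT)
}
```

Each layer is just an Option, so adding another source of configuration is one more .or(...) in the chain.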
Result has a lot of its own helper methods that work in a similar fashion (though with the ability to manipulate the Err case as well).
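A rough sketch of a few of those Result helpers, using integer parsing as the fallible operation. The function names here are hypothetical; map, map_err and unwrap_or are the standard library methods.

```rust
use std::num::ParseIntError;

fn parse_count(raw: &str) -> Result<u32, ParseIntError> {
    raw.trim().parse::<u32>()
}

// map only touches the Ok side; an Err passes through unchanged
fn doubled(raw: &str) -> Result<u32, ParseIntError> {
    parse_count(raw).map(|n| n * 2)
}

// map_err only touches the Err side; here it becomes a readable String
fn described(raw: &str) -> Result<u32, String> {
    parse_count(raw).map_err(|e| format!("bad count {:?}: {}", raw, e))
}

// unwrap_or works just like it does on Option
fn or_zero(raw: &str) -> u32 {
    parse_count(raw).unwrap_or(0)
}
```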
There are also a handful of methods letting you jump between Result and Option, and loads of the standard library uses these types, so you can take advantage of a lot of things by sticking to them instead of ending up with a "not-quite-Result/Option" datatype.
Take full advantage of the ? operator
When you have a set of functions returning Results as your base API, you can get very nice error propagation by using ?, turning your Go-like:
use std::fs::File;
use std::io::{self, Write};

fn write_message() -> Result<(), io::Error> {
    let mut file = match File::create("important_file.txt") {
        Ok(f) => f,
        Err(e) => return Err(e),
    };
    match file.write_all(b"Important bytes") {
        Ok(_) => return Ok(()),
        Err(e) => return Err(e),
    }
}
into something that will automatically do your error propagation nicely:
fn write_message() -> Result<(), io::Error> {
    let mut file = File::create("important_file.txt")?;
    file.write_all(b"Important bytes")?;
    Ok(())
}
Your code should end up pretty flat, just as if you had peppered unwrap through your code, but you'll actually have proper error reporting for consumers of the functions.
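If you want something to poke at that doesn't touch the filesystem, here is a tiny sketch of the same pattern. The add_strs function is made up for the example; both steps share one error type, so ? works with no conversion.

```rust
use std::num::ParseIntError;

// Each ? returns early with the parse error, so the happy path
// reads top to bottom with no nesting.
fn add_strs(a: &str, b: &str) -> Result<i32, ParseIntError> {
    let x = a.trim().parse::<i32>()?;
    let y = b.trim().parse::<i32>()?;
    Ok(x + y)
}
```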
In practice this gets messy when you want to mix and match error types, though:
struct AccountDetails {}
struct User {}

fn lookup_user(user_id: u64) -> Result<User, String> {
    todo!();
}

fn fetch_user_details(user: User) -> Result<AccountDetails, String> {
    todo!();
}

fn process_user_request(user_id: u64) -> Result<AccountDetails, String> {
    let user = lookup_user(user_id)?;
    let account_details = fetch_user_details(user)?;
    // this line does not type check: the error here is an io::Error, not a String
    write_message()?;
    Ok(account_details)
}
The above process_user_request function will fail to type check: most of the called functions use a String for their error type, but write_message uses a std::io::Error.
In this sort of case, when our error types are out of our control, we have to do a bit of manual work here to convert errors into something acceptable.
// applying map_err on a Result<T,E> with
// a function E -> F will let us process the error
// with a passed in function, to get a Result<T, F>
write_message().map_err(|e| e.to_string())?;
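A self-contained sketch of that conversion, assuming a made-up count_lines function whose error type is String while the underlying read returns an io::Error:

```rust
use std::fs;

// read_to_string fails with io::Error, but this function promises a String
// error, so map_err converts it before ? propagates it.
fn count_lines(path: &str) -> Result<usize, String> {
    let text = fs::read_to_string(path)
        .map_err(|e| format!("could not read {}: {}", path, e))?;
    Ok(text.lines().count())
}
```

Adding the path to the message is optional, but since the original error is being flattened into a string anyway, this is a convenient place to attach context.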
All this means that the end consumer of your function will be getting strings as error messages. This is less than ideal if people want to handle certain errors differently without some string parsing.
Especially if you have a decent-sized program, it's probably worth defining your own error type.
enum RequestErr {
    UserNotFound,
    ConnectionReset,
    GenericErr(String),
}

type RequestResult<T> = Result<T, RequestErr>;
Armed with this, let's change our lookup_user and process_user_request function signatures...
fn lookup_user(user_id: u64) -> RequestResult<User> {
    todo!();
}

fn process_user_request(user_id: u64) -> RequestResult<AccountDetails> {
    let user = lookup_user(user_id)?;
    let account_details = fetch_user_details(user)?;
    write_message().map_err(|e| e.to_string())?;
    Ok(account_details)
}
Now if you just change the code this way, you'll have similar errors to before, where the error type isn't unified. Fortunately, this can be solved by providing From implementations so your code can transform one kind of error into another:
// for strings, just throw it into a generic wrapper
impl From<String> for RequestErr {
    fn from(elt: String) -> Self {
        RequestErr::GenericErr(elt)
    }
}

// for io errors, do a bit more work to identify potential issues
impl From<std::io::Error> for RequestErr {
    fn from(err: std::io::Error) -> RequestErr {
        match err.kind() {
            // provide special handling for connection resets
            std::io::ErrorKind::ConnectionReset => RequestErr::ConnectionReset,
            _ => RequestErr::GenericErr(err.to_string()),
        }
    }
}
With this you can actually go back and remove the map_err application, as the From implementation provides a default way of handling io errors while working with RequestErr:
fn process_user_request(user_id: u64) -> RequestResult<AccountDetails> {
    let user = lookup_user(user_id)?;
    let account_details = fetch_user_details(user)?;
    write_message()?;
    Ok(account_details)
}
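The payoff for callers is that they can match on variants instead of parsing strings. Below is a compact sketch of that, repeating a trimmed-down RequestErr so it compiles on its own. The flaky_io, handle_request and describe functions are hypothetical stand-ins, and the derive line is added only so the results can be compared.

```rust
use std::io;

#[derive(Debug, PartialEq)]
enum RequestErr {
    UserNotFound,
    ConnectionReset,
    GenericErr(String),
}

impl From<io::Error> for RequestErr {
    fn from(err: io::Error) -> Self {
        match err.kind() {
            io::ErrorKind::ConnectionReset => RequestErr::ConnectionReset,
            _ => RequestErr::GenericErr(err.to_string()),
        }
    }
}

// Stand-in for an IO call that fails with the given kind (hypothetical).
fn flaky_io(kind: io::ErrorKind) -> Result<(), io::Error> {
    Err(io::Error::new(kind, "simulated failure"))
}

fn handle_request(kind: io::ErrorKind) -> Result<(), RequestErr> {
    // ? calls From::from, turning the io::Error into a RequestErr
    flaky_io(kind)?;
    Ok(())
}

// A consumer can now branch on the kind of failure directly.
fn describe(result: Result<(), RequestErr>) -> &'static str {
    match result {
        Ok(()) => "ok",
        Err(RequestErr::ConnectionReset) => "retry",
        Err(RequestErr::UserNotFound) => "not found",
        Err(RequestErr::GenericErr(_)) => "give up",
    }
}
```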
Figure out how to use macro_rules!
Especially when coming from more dynamic languages, it can be tough when you need to create a bunch of functions that have very similar implementations but just with small changes in the internals. Or you really just want to copy a snippet and are having trouble defining ownership of that snippet.
Macros are a good hacky tool that can solve a lot of little headaches, at least until you are more confident of what you need.
One useful use case for me was in defining a test suite. As part of writing a Lua implementation, I had a lot of test files written in Lua themselves. I wanted there to be one Rust test per Lua test file (to aid in reporting of errors and test selection) but was having trouble.
Instead, I ended up defining a macro to declare a test function (just like I'd do it by hand) and generated a bunch of functions that way:
// this takes a set of file paths and test names
// and then generates one test per name
macro_rules! lua_tests {
    ($($name: ident: $file: expr,)*) => {
        $(
            #[test]
            fn $name() {
                let file_name = $file;
                run_lua_integration_test!(&file_name);
            }
        )*
    }
}
// the actual code has ~50 lines
lua_tests! {
    test_all: "lua_tests/all.lua",
    test_api: "lua_tests/api.lua",
    test_attrib: "lua_tests/attrib.lua",
    test_base: "lua_tests/base.lua",
}
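If you want to play with the pattern outside of a test harness, here is a stripped-down sketch that generates plain functions instead of #[test] functions. The path_fns macro and the "would run" string are invented for the example, and there is no real Lua runner behind it.

```rust
// Generates one function per `name: path,` pair.
macro_rules! path_fns {
    ($($name:ident: $file:expr,)*) => {
        $(
            fn $name() -> String {
                // Stand-in for actually running the file.
                format!("would run {}", $file)
            }
        )*
    };
}

path_fns! {
    run_all: "lua_tests/all.lua",
    run_api: "lua_tests/api.lua",
}
```

Because each entry expands to a separately named function, listing the same name twice is a compile error, which is a handy guard against copy-paste duplicates in a long list.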
It can be tough to figure out how to write macros, but when you have a bunch of spaghetti code anyways, macros can help to get your head straight.
I'm far from a Rust expert, but I still end up much happier with my codebase after some refactoring along any of these lines.