Rust Benchmark

using the criterion crate

This is a very very basic demo I made to learn how to benchmark my own function.

Rust Benchmark
red is the previous version

I made a new project, added a lib.rs and then made a function inside lib.rs called “cycler”

This is a tiny function so won’t really make full use of the benchmarks but I wanted to keep it simple, and then make the function deliberately slower to see what happened! (As you can see above, the performance REGRESSED)!

The code

To follow along you add a directory called “benches” and then create your file – mine is called “range_benchmark.rs” – this is where you add all the “criterion stuff”.

Note how the name of the file matches what is in the Cargo.toml file?

❯ tree -L 2
.
├── benches
│   └── range_benchmark.rs
├── Cargo.lock
├── Cargo.toml
├── src
│   ├── lib.rs
│   └── main.rs
└── target
    ├── CACHEDIR.TAG
    ├── criterion
    ├── debug
    ├── nextest
    ├── release
    └── tmp
[package]
name = "p180"
version = "0.1.0"
edition = "2021"

[dependencies]

[dev-dependencies]
criterion = {version = "0.3", features = ["html_reports"]}

[[bench]]
name = "range_benchmark"
harness = false

As you can see, the bench name doesn’t need to match the project name…

pub fn cycler(range:std::ops::Range<usize>)->Vec<usize>{
        // Collect the range into a Vec<usize>
    let range = range.into_iter().collect::<Vec<usize>>();
    
    // Perform some computation to make the operation more complex
    range.iter().map(|&x| x * 2).collect() // Just doubling each value for now
}
use criterion::{black_box, criterion_group, criterion_main, Criterion};
use p180::cycler; // function to profile

pub fn criterion_benchmark(c: &mut Criterion) {
    // Profile the cycler function
    c.bench_function("cycler", |b| {
        b.iter(|| black_box(cycler(0..11))); // Use black_box to prevent optimization
    });
}

criterion_group!(benches, criterion_benchmark);
criterion_main!(benches);

❯ pwd
/home/rag/p180/target

❯ tree -L 2
.
├── CACHEDIR.TAG
├── criterion
│   ├── cycler
│   └── report
├── debug
│   ├── build
│   ├── deps
│   ├── examples
│   ├── incremental
│   ├── libp180.d
│   ├── libp180.rlib
│   ├── p180
│   └── p180.d
├── nextest
│   └── default
├── release
│   ├── build
│   ├── deps
│   ├── examples
│   ├── incremental
│   └── p180
└── tmp

16 directories, 6 files
~/p180/target main* ❯

You can navigate to report and then use “firefox index.html” to open the html report which has the plots that we see on this page.

Why did it regress?

because I deliberately added this to my original function, in src/lib.rs, so as to make a noticeable difference for criterion for observe!

// Perform some computation to make the operation more complex
range.iter().map(|&x| x * 2).collect() // Just doubling each value for now

Improve the performance

pub fn cycler(range:std::ops::Range<usize>)->Vec<usize>{
        // Collect the range into a Vec<usize>
    let range = range.into_iter().collect::<Vec<usize>>();
    range
}

What does it show when the code is improved?

❯ cargo bench
   Compiling p180 v0.1.0 (/home/rag/p180)
    Finished `bench` profile [optimized] target(s) in 1.24s
     Running unittests src/lib.rs (target/release/deps/p180-2596068b6c8fb8ad)

running 0 tests

test result: ok. 0 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.00s

     Running unittests src/main.rs (target/release/deps/p180-be45702ecfb86b95)

running 1 test
test tests::test_cycler ... ignored

test result: ok. 0 passed; 0 failed; 1 ignored; 0 measured; 0 filtered out; finished in 0.00s

     Running benches/range_benchmark.rs (target/release/deps/range_benchmark-5fc0b38294993391)
Gnuplot not found, using plotters backend
cycler                  time:   [19.741 ns 19.796 ns 19.862 ns]                    
                        change: [-45.275% -44.752% -44.284%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 12 outliers among 100 measurements (12.00%)
  3 (3.00%) low mild
  5 (5.00%) high mild
  4 (4.00%) high severe
improved performance

Tip:

Run with verbose

Running `/home/rag/p180/target/release/deps/range_benchmark-5fc0b38294993391 --bench`
Gnuplot not found, using plotters backend
cycler                  time:   [19.614 ns 19.696 ns 19.829 ns]                    
                        change: [-0.9919% -0.2145% +0.6550%] (p = 0.60 > 0.05)
                        No change in performance detected.
Found 10 outliers among 100 measurements (10.00%)
  1 (1.00%) low mild
  5 (5.00%) high mild
  4 (4.00%) high severe

~/p180 main* 12s ❯ cargo bench --verbose

Compare functions – select the best

If you had 2 functions which produce the same output, you could evaluate both in terms of performance and select the quickest one?

recycler-rust
mean 16ns
cycler-rust
mean 19ns