Checksum Calculator in Rust
A checksum is often used over something like an MD5 hash for speed and simplicity when you only need to quickly verify data integrity or detect errors.
For practice, let’s brute-force a checksum: we supply a target checksum and then search for data that produces it.
Note the use of if let Some(...): if guess_checksum returns Some, the matching bytes are bound to guessed_data and the first branch runs; if it returns None, the else branch runs instead.
The checksum itself is calculated by applying the XOR operation to each byte of the data in turn, starting from 0.
Also note the use of fold: folding is useful whenever you have a collection of something and want to produce a single value from it.
fn guess_checksum(target_checksum: u8) -> Option<Vec<u8>> {
    // Iterate through all combinations of two byte values (0x00 to 0xFF)
    for i in 0..=255 {
        for j in 0..=255 {
            let data = vec![i, j];
            // Calculate checksum using XOR operation on both bytes
            let checksum = data.iter().fold(0u8, |acc, &byte| acc ^ byte);
            if checksum == target_checksum {
                return Some(data); // Return the first matching pair
            }
        }
    }
    None
}

fn main() {
    // Let's run the function with a checksum we know (e.g., 0xd5 or 0xc7)
    let target_checksum = 0xd5; // Target checksum in hexadecimal (213 in decimal)
    if let Some(guessed_data) = guess_checksum(target_checksum) {
        let checksum = guessed_data.iter().fold(0u8, |acc, &byte| acc ^ byte); // XOR checksum
        println!("Found matching data: {:?}, with checksum: {:x}", guessed_data, checksum);
    } else {
        println!("Checksum not found");
    }
}
#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn test_guess_checksum() {
        let target_checksum = 0xd5; // Example checksum
        let guessed_data = guess_checksum(target_checksum).expect("Checksum not found");
        // Calculate checksum of guessed data using XOR to verify
        let checksum = guessed_data.iter().fold(0u8, |acc, &byte| acc ^ byte);
        assert_eq!(checksum, target_checksum);
        assert_eq!(guessed_data.len(), 2); // Ensure the guessed data is of length 2 (two bytes)
        println!("Test - Found matching data: {:?}, with checksum: {:x}", guessed_data, checksum);
    }
}
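Because the loops start at zero, the brute-force search above always returns the pair [0, target] first. It is far from the only answer: XOR undoes itself, so for any first byte the second byte can be computed directly as first ^ target. The sketch below illustrates this; solve_pair and count_pairs are hypothetical helper names that are not part of the original code.

```rust
/// XOR checksum over a byte slice (the same fold used above).
fn xor_checksum(data: &[u8]) -> u8 {
    data.iter().fold(0u8, |acc, &byte| acc ^ byte)
}

/// Hypothetical helper: pick any first byte and derive the second one
/// so that the pair XORs to `target`.
fn solve_pair(first: u8, target: u8) -> [u8; 2] {
    [first, first ^ target]
}

/// Hypothetical helper: count every two-byte pair whose XOR equals `target`.
fn count_pairs(target: u8) -> usize {
    let mut count = 0;
    for i in 0..=255u8 {
        for j in 0..=255u8 {
            if i ^ j == target {
                count += 1;
            }
        }
    }
    count
}

fn main() {
    let target = 0xd5u8;
    let pair = solve_pair(0x42, target);
    println!("pair: {:02x?}, checksum: {:x}", pair, xor_checksum(&pair));
    println!("pairs matching {:#x}: {}", target, count_pairs(target));
}
```

Each first byte has exactly one matching second byte, so count_pairs reports 256 pairs for any target. A one-byte checksum therefore says very little about which data produced it.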
Why would you use a checksum instead of a hash?
Here’s why you might choose a checksum over MD5:
- Performance: Checksum algorithms (like XOR or simple sums) are computationally cheaper and faster than MD5. If you’re working in resource-constrained environments or need to process data quickly, a checksum is more efficient.
- Error Detection: A simple checksum is enough to catch common accidental errors (like a single-bit flip) during transmission or storage. MD5 mixes every input bit into a 128-bit digest, so it catches far more kinds of accidental corruption, but it comes with a performance cost.
- Simplicity: Checksum methods are straightforward and easier to implement for tasks like quick data validation. MD5, on the other hand, requires more processing power and is more complex.
When to use MD5:
- Data Integrity: MD5 is more suitable when you need to detect even small or complex accidental changes in data. It is no longer considered secure against deliberate tampering, because practical collision attacks exist; where cryptographic security is important, a modern hash such as SHA-256 is the usual choice.
In short, checksums are faster and simpler, while MD5 is slower but much better at detecting accidental changes.
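To make the error-detection trade-off concrete, the sketch below corrupts a small buffer in two ways and compares XOR checksums. The buffer contents and the xor_checksum helper name are assumptions chosen for illustration.

```rust
/// XOR checksum over a byte slice.
fn xor_checksum(data: &[u8]) -> u8 {
    data.iter().fold(0u8, |acc, &byte| acc ^ byte)
}

fn main() {
    let original = [0x10u8, 0x20, 0x30, 0x40];
    let expected = xor_checksum(&original);

    // A single flipped bit changes the checksum, so the error is detected.
    let mut single_flip = original;
    single_flip[1] ^= 0b0000_0100;
    assert_ne!(xor_checksum(&single_flip), expected);

    // Flipping the same bit position in two different bytes cancels out,
    // so this corruption goes unnoticed.
    let mut double_flip = original;
    double_flip[0] ^= 0b0000_0100;
    double_flip[3] ^= 0b0000_0100;
    assert_eq!(xor_checksum(&double_flip), expected);

    println!("single flip detected, double flip missed");
}
```

The second case is the kind of corruption a wider hash would be expected to catch, which is the extra robustness the hash's additional cost buys.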
Analysing the XOR
To show each step with binary representations, let’s modify the code to print the values in binary. This will help visualize the XOR operation at the bit level.
fn main() {
    let data = vec![1, 2, 3];
    let initial_value = 4u8;
    println!("Initial Value: {:08b} ({})", initial_value, initial_value);
    // Compute checksum with debug output at each step
    let checksum = data.iter().fold(initial_value, |acc, &byte| {
        let result = acc ^ byte;
        println!(
            "acc: {:08b} ({}), byte: {:08b} ({}), acc ^ byte: {:08b} ({})",
            acc, acc, byte, byte, result, result
        );
        result
    });
    println!("Final Checksum: {:08b} ({})", checksum, checksum);
}
Explanation of Binary Output
Running this code would produce the following breakdown:
- Initial Value: 00000100 (4)
- Step-by-Step Iteration:
  - Iteration 1: acc = 00000100 (4), byte = 00000001 (1), acc ^ byte = 00000100 ^ 00000001 = 00000101 (5)
  - Iteration 2: acc = 00000101 (5), byte = 00000010 (2), acc ^ byte = 00000101 ^ 00000010 = 00000111 (7)
  - Iteration 3: acc = 00000111 (7), byte = 00000011 (3), acc ^ byte = 00000111 ^ 00000011 = 00000100 (4)
- Final Checksum: 00000100 (4)
Binary XOR Explanation:
- In binary, XOR (^) produces a 1 wherever exactly one of the operands has a 1, so every 1 bit in the byte flips the corresponding bit of the accumulator.
- 00000100 ^ 00000001 = 00000101
  - The byte has only its last bit set, so the accumulator’s last bit is flipped from 0 to 1.
- 00000101 ^ 00000010 = 00000111
  - The byte has only its second-to-last bit set, so that bit is flipped from 0 to 1.
- 00000111 ^ 00000011 = 00000100
  - The byte has its last two bits set, so both are flipped from 1 to 0, bringing the value back to the original 00000100.
This output makes it clear how each XOR operation works at the binary level to reach the final checksum.
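If you would rather check the intermediate values than read them from printed output, the running XOR can be collected with the standard library’s scan adaptor. This is a minimal sketch; xor_steps is a hypothetical helper name, and the input values are the same ones used in the walkthrough above.

```rust
/// Hypothetical helper: return the accumulator value after each XOR step.
fn xor_steps(initial: u8, data: &[u8]) -> Vec<u8> {
    data.iter()
        .scan(initial, |acc, &byte| {
            // Update the running value and emit it for this step.
            *acc ^= byte;
            Some(*acc)
        })
        .collect()
}

fn main() {
    let steps = xor_steps(4, &[1, 2, 3]);
    // Matches the three iterations listed above: 5, 7, then back to 4.
    assert_eq!(steps, vec![5, 7, 4]);
    for value in &steps {
        println!("{:08b} ({})", value, value);
    }
}
```

Where fold only hands back the final value, scan yields every intermediate one, which makes it convenient for assertions like the one in main.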
XOR (exclusive OR) is a bitwise operation that outputs 1 only if one of the bits is 1 and the other is 0. This makes it ideal for toggling bits, as repeating XOR with the same value reverts data to its original state. XOR is widely used in cryptography, checksums, and error detection because it is cheap to compute and reversible, and the running result fits in a single accumulator without needing additional storage.
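The toggling property can be checked in a few lines. In this sketch the toggle function and the particular value and mask are assumptions chosen for illustration.

```rust
/// Hypothetical helper: flip every bit of `value` that is set in `mask`.
fn toggle(value: u8, mask: u8) -> u8 {
    value ^ mask
}

fn main() {
    let value = 0b1010_1100u8;
    let mask = 0b0000_1111u8;

    // The first application flips the low four bits.
    let toggled = toggle(value, mask);
    assert_eq!(toggled, 0b1010_0011);

    // Applying the same mask again restores the original value.
    let restored = toggle(toggled, mask);
    assert_eq!(restored, value);

    println!("{:08b} -> {:08b} -> {:08b}", value, toggled, restored);
}
```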
XOR is a simple way to simulate a checksum. It works well here because it’s fast and allows you to "combine" all data bits into a single result. Every byte influences the final checksum, but the order of the bytes does not: XOR is commutative, so reordering the data leaves the checksum unchanged, and two identical bytes cancel each other out. For these reasons XOR is not a robust checksum for error detection in real applications, but it effectively demonstrates the concept by condensing data into a single, reproducible value.
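The lack of robustness is easy to see in a short sketch: XOR is commutative and self-cancelling, so reordering the bytes, or adding the same byte twice, leaves the checksum unchanged. The sample data and the xor_checksum helper name are assumptions for illustration.

```rust
/// XOR checksum over a byte slice.
fn xor_checksum(data: &[u8]) -> u8 {
    data.iter().fold(0u8, |acc, &byte| acc ^ byte)
}

fn main() {
    let original = [1u8, 2, 3];
    let reordered = [3u8, 1, 2];
    let padded = [1u8, 2, 3, 0x7f, 0x7f];

    // Same bytes in a different order: identical checksum.
    assert_eq!(xor_checksum(&original), xor_checksum(&reordered));

    // Two extra identical bytes cancel each other out.
    assert_eq!(xor_checksum(&original), xor_checksum(&padded));

    println!("checksum: {:08b}", xor_checksum(&original));
}
```

All three inputs differ, yet they share one checksum, so none of these changes would be detected.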