Bitwise operations – Combining 1D Arrays into 4-Byte Variables in Rust


When working with data processing in Rust, there are times you may need to efficiently combine elements from an array into a single variable. This tutorial explains how to take four elements from a 1D array of bytes (u8) at a time and pack them into a single 4-byte variable (u32).

This approach is useful in applications like file processing, network protocols, or custom data serialization.

I needed this recently when working on a pixelate program that I built in Rust.


The Challenge

Imagine we have a 1D array of bytes, for example:

let arr: [u8; 8] = [1, 2, 3, 4, 5, 6, 7, 8];  

Our goal is to process this array in chunks of 4 bytes and combine each chunk into a single u32 variable.

Each u32 is built in little-endian order, meaning the first byte of a chunk becomes the least significant byte:

  • The first byte occupies bits 0–7.
  • The second byte occupies bits 8–15.
  • The third byte occupies bits 16–23.
  • The fourth byte occupies bits 24–31.

For the above array:

  • [1, 2, 3, 4] becomes 0x04030201.
  • [5, 6, 7, 8] becomes 0x08070605.
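
As a quick check of this layout, here is a minimal sketch that packs a single 4-byte chunk two ways: with the standard library's u32::from_le_bytes, and by hand with shifts. The function names pack_le and pack_le_manual are made up for this illustration.

```rust
/// Packs four bytes into a u32 with the first byte in bits 0-7
/// (little-endian layout), using the standard library.
fn pack_le(bytes: [u8; 4]) -> u32 {
    u32::from_le_bytes(bytes)
}

/// The same layout written out by hand, for comparison.
/// Each byte is widened to u32 before it is shifted.
fn pack_le_manual(bytes: [u8; 4]) -> u32 {
    (bytes[0] as u32)
        | ((bytes[1] as u32) << 8)
        | ((bytes[2] as u32) << 16)
        | ((bytes[3] as u32) << 24)
}
```

Both functions return 0x04030201 for [1, 2, 3, 4] and 0x08070605 for [5, 6, 7, 8].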

The Rust Code

Here’s how we can implement this in Rust:

fn main() {  
    let arr: [u8; 8] = [1, 2, 3, 4, 5, 6, 7, 8];  
    let combined: Vec<u32> = combine_into_u32(&arr);  
    println!("{:?}", combined); // Output: [67305985, 134678021] (0x04030201 and 0x08070605 in hex)
}  

fn combine_into_u32(arr: &[u8]) -> Vec<u32> {  
    arr.chunks(4) // Split the slice into chunks of up to 4 bytes
        .map(|chunk| {  
            let mut iter = chunk.iter();  
            let b1 = *iter.next().unwrap_or(&0) as u32;  
            let b2 = *iter.next().unwrap_or(&0) as u32;  
            let b3 = *iter.next().unwrap_or(&0) as u32;  
            let b4 = *iter.next().unwrap_or(&0) as u32;  

            // Combine bytes into a single u32  
            (b4 << 24) | (b3 << 16) | (b2 << 8) | b1  
        })  
        .collect()  
}  

Explanation

  1. Splitting the Array
    The chunks(4) method divides the slice into sub-slices of up to 4 elements each. If the length isn’t a multiple of 4, the last chunk will have fewer elements.
  2. Processing Each Chunk
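    For each chunk, we pull out up to four bytes (b1, b2, b3, and b4) and cast each one to u32. When the last chunk is short, iter.next() returns None and unwrap_or(&0) substitutes 0 for each missing byte.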
  3. Combining Bytes into a u32
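    Using bit shifts (<<) and bitwise OR (|), we pack the four bytes into a single u32. The bytes must already be widened to u32 at this point, because a u8 has no room for bits shifted past position 7.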
    • b1 goes into bits 0–7 (no shift).
    • b2 is shifted left by 8 bits.
    • b3 is shifted left by 16 bits.
    • b4 is shifted left by 24 bits.
  4. Collecting Results
    The collect() method converts the iterator into a vector of u32.
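
To make step 3 more concrete, the sketch below splits the combining step in two so the intermediate values are visible. The helper names shifted_parts and or_together are hypothetical and exist only for this illustration; they are not part of the program above.

```rust
/// Returns each byte widened to u32 and shifted into its final position.
fn shifted_parts(chunk: [u8; 4]) -> [u32; 4] {
    [
        chunk[0] as u32,         // bits 0-7, no shift
        (chunk[1] as u32) << 8,  // bits 8-15
        (chunk[2] as u32) << 16, // bits 16-23
        (chunk[3] as u32) << 24, // bits 24-31
    ]
}

/// ORs the shifted parts together. The parts occupy disjoint bit ranges,
/// so OR simply merges them without any bits colliding.
fn or_together(parts: [u32; 4]) -> u32 {
    parts.iter().copied().fold(0u32, |acc, part| acc | part)
}
```

For the chunk [1, 2, 3, 4], shifted_parts yields 0x00000001, 0x00000200, 0x00030000, and 0x04000000, and or_together merges them into 0x04030201.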

Output

For the input array [1, 2, 3, 4, 5, 6, 7, 8], the program outputs:

[67305985, 134678021]  

In hexadecimal, these values are:

[0x04030201, 0x08070605]  
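
The {:?} formatter prints decimal, so seeing the hexadecimal form takes a different format specifier. One possible approach, sketched here with a hypothetical helper named to_hex_strings, is to format each value with {:#010X}: the # adds the 0x prefix, and 010 zero-pads to ten characters in total (the prefix plus eight hex digits).

```rust
/// Formats each packed value as zero-padded hexadecimal, e.g. "0x04030201".
fn to_hex_strings(values: &[u32]) -> Vec<String> {
    values.iter().map(|v| format!("{:#010X}", v)).collect()
}
```

Calling to_hex_strings(&[67305985, 134678021]) gives the strings "0x04030201" and "0x08070605".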

Edge Cases

  • If the array’s length is not a multiple of 4, the missing bytes of the final chunk are treated as 0, so the leftover bytes end up in the low-order positions. For example:

    let arr: [u8; 5] = [9, 10, 11, 12, 13];

    Output (in hexadecimal): [0x0C0B0A09, 0x0000000D]
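
Zero-padding is not always what you want, since a padded 0x0000000D is indistinguishable from a real chunk of [13, 0, 0, 0]. As an alternative sketch (not the approach used above), the standard chunks_exact method yields only full chunks and exposes the leftover bytes through remainder(). The function name combine_exact is made up for this example.

```rust
/// Packs only the full 4-byte chunks and returns the leftover bytes
/// separately instead of padding them with zeros.
fn combine_exact(arr: &[u8]) -> (Vec<u32>, &[u8]) {
    let chunks = arr.chunks_exact(4);
    // Bytes that did not fit into a full chunk (0 to 3 of them).
    let remainder = chunks.remainder();
    let words = chunks
        // Every chunk here has exactly 4 elements, so indexing cannot panic.
        .map(|c| u32::from_le_bytes([c[0], c[1], c[2], c[3]]))
        .collect();
    (words, remainder)
}
```

For [9, 10, 11, 12, 13] this returns the single word 0x0C0B0A09 together with the remainder [13], leaving the caller to decide how to handle the trailing byte.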

Applications

  1. Data Serialization: Packing data for transmission or storage in compact formats.
  2. File Processing: Reading binary files into structured formats.
  3. Custom Protocols: Encoding multi-byte fields for network communication.
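
For serialization in particular, the packed values usually need to be turned back into bytes at some point. A minimal sketch of that reverse direction, assuming the same little-endian layout and using a hypothetical helper named split_into_bytes, could rely on the standard to_le_bytes method:

```rust
/// Unpacks u32 values back into bytes, taking the first byte
/// of each value from bits 0-7 (little-endian layout).
fn split_into_bytes(words: &[u32]) -> Vec<u8> {
    words.iter().flat_map(|w| w.to_le_bytes()).collect()
}
```

Applied to [0x04030201, 0x08070605], this returns the original bytes [1, 2, 3, 4, 5, 6, 7, 8]. Note that a zero-padded final chunk comes back with its padding bytes included.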

Why Use Rust for This?

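Rust’s performance and memory safety make it a good fit for low-level data manipulation: slice methods such as chunks are bounds-checked, so a short final chunk cannot cause an out-of-bounds read. Iterators also allow for expressive, functional-style code, and the compiler can typically optimize such chains into efficient loops.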


By following this guide, you can pack byte arrays into u32 values and handle leftover bytes predictably in Rust. Happy coding! 🎉
