Chapter 5.6. Range Analysis in Rust

Range analysis is a type of static analysis used to track the range (interval) of possible values that a variable can take during the execution of a program. By maintaining an upper and lower bound for each variable, the analysis helps optimize code by narrowing the values that a program may handle. In Rust, range analysis is particularly important in verifying safety properties, such as ensuring that an array access is always within bounds or that certain calculations do not result in overflows.

Range Analysis Trait

#![allow(unused)]
fn main() {
pub trait RangeAnalysis<'tcx, T: IntervalArithmetic + ConstConvert + Debug>: Analysis {
    fn get_fn_range(&self, def_id: DefId) -> Option<RAResult<'tcx, T>>;
    fn get_fn_ranges_percall(
        &self,
        def_id: DefId,
    ) -> Option<Vec<RAResult<'tcx, T>>>;
    fn get_all_fn_ranges(&self) -> RAResultMap<'tcx, T>;
    fn get_all_fn_ranges_percall(&self) -> RAVecResultMap<'tcx, T>;
    fn get_fn_local_range(
        &self,
        def_id: DefId,
        local: Place<'tcx>,
    ) -> Option<Range<T>>;
    fn get_fn_path_constraints(
        &self,
        def_id: DefId,
    ) -> Option<PathConstraint<'tcx>>;
    fn get_all_path_constraints(&self) -> PathConstraintMap<'tcx>;
}
}

Quick Usage Guide

To test the feature via terminal command:

cargo rapx -range

or show the entire analysis process via terminal command:

RAP_LOG=trace cargo rapx -range

To use the feature in Rust code:

#![allow(unused)]
fn main() {
let mut range_analysis = RangeAnalyzer::<i128>::new(self.tcx, false);
range_analysis.run();
let path_constraint = range_analysis.get_all_path_constraints();
rap_info!("{}", PathConstraintMapWrapper(path_constraint));
let result = analyzer.get_all_fn_ranges();
}

Supported Integer Types For Ranges:

i32, i64, i128
u32, u64, u128
usize

Default Implementation

RAPx provides a default implementation of RangeAnalysis trait in range_analysis.rs. The implementation is inspired by the following CGO paper.

Raphael Ernani Rodrigues, Victor Hugo Sperle Campos, and Fernando Magno Quintao Pereira. "A fast and low-overhead technique to secure programs against integer overflows." In Proceedings of the 2013 IEEE/ACM international symposium on code generation and optimization (CGO), pp. 1-11. IEEE, 2013.

#![allow(unused)]
fn main() {
pub struct RangeAnalyzer<'tcx, T: IntervalArithmetic + ConstConvert + Debug> {
    pub tcx: TyCtxt<'tcx>,
    pub debug: bool,
    pub ssa_def_id: Option<DefId>,
    pub essa_def_id: Option<DefId>,
    pub final_vars: FxHashMap<DefId, HashMap<Local, Range<T>>>,
    pub ssa_locals_mapping: FxHashMap<DefId, HashMap<Local, HashSet<Local>>>,
}
}

Why SSA Form?

Before performing range analysis, the MIR (Mid-level Intermediate Representation) is first transformed into Static Single Assignment (SSA) form. SSA guarantees that each variable is assigned exactly once, and every use of a variable refers to a unique definition. This transformation simplifies the analysis in several ways:

It makes data flow explicit, allowing the analysis to accurately track how values propagate through the program.
It allows precise modeling of control flow joins using phi-like constructs.
It improves precision by allowing the interval of each version of a variable to be analyzed separately after each assignment.

Don’t worry about losing track of variables after SSA transformation: The DefaultRange maintains a mapping ssa_locals_mapping, which is a HashMap from the original MIR Local variables to their corresponding SSA Locals. This ensures that even after SSA conversion, you can still query the intervals for the variables you care about using their original MIR identity. The SSA transformation is essential for sound and precise interval analysis and is a foundational preprocessing step in this system.

Range Analysis Features

Flow Sensitivity

Interval analysis in Rust can be flow-sensitive, meaning that it accounts for the different execution paths a program might take. This allows the analysis to track how intervals change as variables are assigned or modified during the execution flow, improving the precision of analysis.

Lattice-based Approach

In this approach, values of variables are represented in a lattice, where each element represents an interval. A lattice ensures that each combination of intervals has a defined result, and merging different paths of a program is done by taking the least upper bound (LUB) of intervals from each path.

For example, if a variable x can have an interval [0, 10] on one path and [5, 15] on another path, the merged interval would be [0, 15] because that represents the union of both possible value ranges.

Meet-over-all-paths (MOP) Approach

In the meet-over-all-paths (MOP) approach, the analysis is performed by considering every possible path through a program and merging the results into a final interval. This approach is path-sensitive but may be less scalable on large programs because it needs to account for all paths explicitly.

Precise Interprocedural Analysis

Although each callee function is analyzed once globally for performance,Every call site (i.e., function invocation) triggers a separate numeric evaluation using the actual arguments passed in.

This hybrid approach preserves analysis precision without sacrificing performance.

Range Analysis Test Examples

RAP provides several small test programs under rapx/tests/range to demonstrate different aspects of the range analysis. This section shows the source code of three representative tests (range_1, range_2, and range_symbolic) and explains what each of them is checking.

Example 1: `range_1` — Intra-procedural loop ranges

Source: rapx/tests/range/range_1/src/main.rs

fn main() {
    let mut k = 0;
    while k < 100 {
        let mut i = 0;
        let mut j = k;
        while i < j {
            i += 1;
            j -= 1;
        }
        k += 1;
    }
}

This example is an intra-procedural test that exercises nested loops and variable updates inside them.

k starts from 0 and is increased until it reaches 100, so the analysis should infer:
- k (and the corresponding SSA locals) have ranges within [0, 100] at the appropriate program points.
Inside the inner loop, i and j move towards each other:
- i is incremented, j is decremented.
- The condition while i < j ensures that the loop stops when the two meet or cross.
The range analysis needs to:
- Track how k grows across the outer loop.
- Track how i and j evolve in the inner loop.
- Derive precise intervals for these locals at the end of each loop.

The corresponding unit test (test_range_analysis in rapx/tests/tests.rs) checks that the printed ranges for several locals (e.g., _1, _4, _6, _11, _12, _34) match the expected intervals such as [0, 100], [0, 99], [1, 100], etc.

Example 2: `range_2` — Inter-procedural range propagation

Source: rapx/tests/range/range_2/src/main.rs

fn main() {
    let para1 = 42;
    foo1(para1);

    let para2 = 52;
    let _x = foo2(55, para2);

}

// This function tests passing ranges of parameters between functions.
fn foo1(mut k: usize) {
    while k < 100 {
        k += 1;
    }
}

// This function tests whether the range of returned value is processed as expected.
fn foo2(mut k: usize, _c: usize) -> usize {
    while k < 100 {
        k += 1;
    }
    k
}

This example focuses on inter-procedural range analysis:

In main, para1 is initialized to 42 and passed to foo1 as the argument k.
- Inside foo1, k is incremented in a loop while k < 100, so at the end of the function the range of k should be [100, 100].
For foo2, k is initialized with the literal 55 and incremented until it reaches 100, then returned.
- The analysis should infer that the return value of foo2 is always 100, i.e. range [100, 100].

The test (test_interprocedual_range_analysis) checks that:

The parameter and local values in foo1 and foo2 have precise ranges like [42, 42], [52, 52], and [100, 100].
These ranges are correctly propagated across function calls (from caller to callee and back via return values).

This demonstrates that the range analysis is not limited to a single function, but can reason about how ranges flow across call edges.

Example 3: `range_symbolic` — Symbolic bound expressions

Source: rapx/tests/range/range_symbolic/src/main.rs

fn foo1(x: i32) -> i32 {
    let a = x + 1;
    let y = x;
    let mut result ;
    let _b = a - y; // _11/_8. [1,1] can be inferred before range analysis
    if a >= y {    // always true
        result =  a;
    } else {
        result =  y;
    }
    return result;  // result is always a, but its upper/lower bound 
                    // symbexpr is hard to be inferred without range analysis
}

fn main(){
    let y = 2;
    let x = y;
    foo1(2);
}

This test is designed to show symbolic range expressions:

Inside foo1, we have:
- a = x + 1
- y = x
- _b = a - y
Even before full range analysis, _b can be known to be 1:
- _b = (x + 1) - x = 1, so its range is [1, 1].
The conditional if a >= y is always true because a = x + 1 and y = x.
- Therefore, result will always be a.
However, to express the bounds of result precisely, the analysis needs to:
- Track that a is x + 1 symbolically.
- Use symbolic expressions (such as Binary(AddWithOverflow, Place(_1), Constant(1))) to represent lower and upper bounds of intervals.

The unit test (test_symbolic_interval) asserts the existence of specific symbolic interval strings in the analysis output, for example:

A symbolic lower and upper bound based on x + 1.
A symbolic interval that refers to another place (e.g. Place(_1)).
A symbolic interval for a constant (e.g. [1, 1] as a constant expression).

This demonstrates that the range analysis in RAP does not only compute numeric intervals, but also keeps track of symbolic expressions for lower and upper bounds. This is important when the exact numeric values cannot be known statically, but their relationship to program variables can still be expressed and exploited for further analyses.

Rust Analysis Platform