Chapter 5.6. Range Analysis in Rust
Range analysis is a type of static analysis used to track the range (interval) of possible values that a variable can take during the execution of a program. By maintaining an upper and lower bound for each variable, the analysis helps optimize code by narrowing the values that a program may handle. In Rust, range analysis is particularly important in verifying safety properties, such as ensuring that an array access is always within bounds or that certain calculations do not result in overflows.
Range Analysis Trait
#![allow(unused)] fn main() { pub trait RangeAnalysis<'tcx, T: IntervalArithmetic + ConstConvert + Debug>: Analysis { fn get_fn_range(&self, def_id: DefId) -> Option<RAResult<'tcx, T>>; fn get_fn_ranges_percall( &self, def_id: DefId, ) -> Option<Vec<RAResult<'tcx, T>>>; fn get_all_fn_ranges(&self) -> RAResultMap<'tcx, T>; fn get_all_fn_ranges_percall(&self) -> RAVecResultMap<'tcx, T>; fn get_fn_local_range( &self, def_id: DefId, local: Place<'tcx>, ) -> Option<Range<T>>; fn get_fn_path_constraints( &self, def_id: DefId, ) -> Option<PathConstraint<'tcx>>; fn get_all_path_constraints(&self) -> PathConstraintMap<'tcx>; } }
Quick Usage Guide
To test the feature via terminal command:
cargo rapx -range
or show the entire analysis process via terminal command:
RAP_LOG=trace cargo rapx -range
To use the feature in Rust code:
#![allow(unused)] fn main() { let mut range_analysis = RangeAnalyzer::<i128>::new(self.tcx, false); range_analysis.run(); let path_constraint = range_analysis.get_all_path_constraints(); rap_info!("{}", PathConstraintMapWrapper(path_constraint)); let result = analyzer.get_all_fn_ranges(); }
Supported Integer Types For Ranges:
- i32, i64, i128
- u32, u64, u128
- usize
Default Implementation
RAPx provides a default implementation of RangeAnalysis trait in range_analysis.rs. The implementation is inspired by the following CGO paper.
- Raphael Ernani Rodrigues, Victor Hugo Sperle Campos, and Fernando Magno Quintao Pereira. "A fast and low-overhead technique to secure programs against integer overflows." In Proceedings of the 2013 IEEE/ACM international symposium on code generation and optimization (CGO), pp. 1-11. IEEE, 2013.
#![allow(unused)] fn main() { pub struct RangeAnalyzer<'tcx, T: IntervalArithmetic + ConstConvert + Debug> { pub tcx: TyCtxt<'tcx>, pub debug: bool, pub ssa_def_id: Option<DefId>, pub essa_def_id: Option<DefId>, pub final_vars: FxHashMap<DefId, HashMap<Local, Range<T>>>, pub ssa_locals_mapping: FxHashMap<DefId, HashMap<Local, HashSet<Local>>>, } }
Why SSA Form?
Before performing range analysis, the MIR (Mid-level Intermediate Representation) is first transformed into Static Single Assignment (SSA) form. SSA guarantees that each variable is assigned exactly once, and every use of a variable refers to a unique definition. This transformation simplifies the analysis in several ways:
-
It makes data flow explicit, allowing the analysis to accurately track how values propagate through the program.
-
It allows precise modeling of control flow joins using phi-like constructs.
-
It improves precision by allowing the interval of each version of a variable to be analyzed separately after each assignment.
Don’t worry about losing track of variables after SSA transformation: The
DefaultRangemaintains a mappingssa_locals_mapping, which is a HashMap from the original MIR Local variables to their corresponding SSA Locals. This ensures that even after SSA conversion, you can still query the intervals for the variables you care about using their original MIR identity. The SSA transformation is essential for sound and precise interval analysis and is a foundational preprocessing step in this system.
Range Analysis Features
Flow Sensitivity
Interval analysis in Rust can be flow-sensitive, meaning that it accounts for the different execution paths a program might take. This allows the analysis to track how intervals change as variables are assigned or modified during the execution flow, improving the precision of analysis.
Lattice-based Approach
In this approach, values of variables are represented in a lattice, where each element represents an interval. A lattice ensures that each combination of intervals has a defined result, and merging different paths of a program is done by taking the least upper bound (LUB) of intervals from each path.
For example, if a variable x can have an interval [0, 10] on one path and [5, 15] on another path, the merged interval would be [0, 15] because that represents the union of both possible value ranges.
Meet-over-all-paths (MOP) Approach
In the meet-over-all-paths (MOP) approach, the analysis is performed by considering every possible path through a program and merging the results into a final interval. This approach is path-sensitive but may be less scalable on large programs because it needs to account for all paths explicitly.
Precise Interprocedural Analysis
Although each callee function is analyzed once globally for performance,Every call site (i.e., function invocation) triggers a separate numeric evaluation using the actual arguments passed in.
This hybrid approach preserves analysis precision without sacrificing performance.
Range Analysis Test Examples
RAP provides several small test programs under rapx/tests/range to demonstrate different aspects of the range analysis. This section shows the source code of three representative tests (range_1, range_2, and range_symbolic) and explains what each of them is checking.
Example 1: range_1 — Intra-procedural loop ranges
Source: rapx/tests/range/range_1/src/main.rs
fn main() { let mut k = 0; while k < 100 { let mut i = 0; let mut j = k; while i < j { i += 1; j -= 1; } k += 1; } }
This example is an intra-procedural test that exercises nested loops and variable updates inside them.
kstarts from0and is increased until it reaches100, so the analysis should infer:k(and the corresponding SSA locals) have ranges within[0, 100]at the appropriate program points.
- Inside the inner loop,
iandjmove towards each other:iis incremented,jis decremented.- The condition
while i < jensures that the loop stops when the two meet or cross.
- The range analysis needs to:
- Track how
kgrows across the outer loop. - Track how
iandjevolve in the inner loop. - Derive precise intervals for these locals at the end of each loop.
- Track how
The corresponding unit test (test_range_analysis in rapx/tests/tests.rs) checks that the printed ranges for several locals (e.g., _1, _4, _6, _11, _12, _34) match the expected intervals such as [0, 100], [0, 99], [1, 100], etc.
Example 2: range_2 — Inter-procedural range propagation
Source: rapx/tests/range/range_2/src/main.rs
fn main() { let para1 = 42; foo1(para1); let para2 = 52; let _x = foo2(55, para2); } // This function tests passing ranges of parameters between functions. fn foo1(mut k: usize) { while k < 100 { k += 1; } } // This function tests whether the range of returned value is processed as expected. fn foo2(mut k: usize, _c: usize) -> usize { while k < 100 { k += 1; } k }
This example focuses on inter-procedural range analysis:
- In
main,para1is initialized to42and passed tofoo1as the argumentk.- Inside
foo1,kis incremented in a loop whilek < 100, so at the end of the function the range ofkshould be[100, 100].
- Inside
- For
foo2,kis initialized with the literal55and incremented until it reaches100, then returned.- The analysis should infer that the return value of
foo2is always100, i.e. range[100, 100].
- The analysis should infer that the return value of
The test (test_interprocedual_range_analysis) checks that:
- The parameter and local values in
foo1andfoo2have precise ranges like[42, 42],[52, 52], and[100, 100]. - These ranges are correctly propagated across function calls (from caller to callee and back via return values).
This demonstrates that the range analysis is not limited to a single function, but can reason about how ranges flow across call edges.
Example 3: range_symbolic — Symbolic bound expressions
Source: rapx/tests/range/range_symbolic/src/main.rs
fn foo1(x: i32) -> i32 { let a = x + 1; let y = x; let mut result ; let _b = a - y; // _11/_8. [1,1] can be inferred before range analysis if a >= y { // always true result = a; } else { result = y; } return result; // result is always a, but its upper/lower bound // symbexpr is hard to be inferred without range analysis } fn main(){ let y = 2; let x = y; foo1(2); }
This test is designed to show symbolic range expressions:
- Inside
foo1, we have:a = x + 1y = x_b = a - y
- Even before full range analysis,
_bcan be known to be1:_b = (x + 1) - x = 1, so its range is[1, 1].
- The conditional
if a >= yis always true becausea = x + 1andy = x.- Therefore,
resultwill always bea.
- Therefore,
- However, to express the bounds of
resultprecisely, the analysis needs to:- Track that
aisx + 1symbolically. - Use symbolic expressions (such as
Binary(AddWithOverflow, Place(_1), Constant(1))) to represent lower and upper bounds of intervals.
- Track that
The unit test (test_symbolic_interval) asserts the existence of specific symbolic interval strings in the analysis output, for example:
- A symbolic lower and upper bound based on
x + 1. - A symbolic interval that refers to another place (e.g.
Place(_1)). - A symbolic interval for a constant (e.g.
[1, 1]as a constant expression).
This demonstrates that the range analysis in RAP does not only compute numeric intervals, but also keeps track of symbolic expressions for lower and upper bounds. This is important when the exact numeric values cannot be known statically, but their relationship to program variables can still be expressed and exploited for further analyses.