Chapter 5.1. Alias Analysis
Alias analysis involves determining if two identifiers point to the same memory allocation. The task is challenging, with various options that balance precision and cost, including flow sensitivity, field sensitivity, and path sensitivity. In general, there are two main approaches to analysis: the lattice-based approach and the meet-over-all-paths (MOP) approach.
5.1.1 Alias Analysis Trait
RAPx provides the AliasAnalysis trait for alias analysis. The trait has several methods, which enables users to query the aliases among the arguments and return value of a function based on the function DefId, or the aliases of all functions as a FxHashMap. Developers can implement the trait based on their needs.
#![allow(unused)] fn main() { pub trait AliasAnalysis: Analysis { fn get_fn_alias(&self, def_id: DefId) -> Option<FnAliasPairs>; fn get_all_fn_alias(&self) -> FnAliasMap; fn get_local_fn_alias(&self) -> FnAliasMap; } }
The alias analysis result for each function is stored as the FnAliasPairs type, which contains a HashSet of multiple alias relationships, and each alias relationship is recorded as AliasPair.
#![allow(unused)] fn main() { pub struct FnAliasPairs { arg_size: usize, alias_set: HashSet<AliasPair>, } pub struct AliasPair { pub left_local: usize, // parameter id; the id of return value is `0`; pub lhs_fields: Vec<usize>, // field-sensive: sequence of (sub) field numbers for left_local pub right_local: usize, // parameter id, which is an alias of left_local pub rhs_fields: Vec<usize>, // field-sensive: sequence of (sub) field numbers for right_local } }
RAPx has implemented a default alias analysis algorithm based on MOP.
5.1.2 Default Alias Analysis
The MOP-based alias approach is achieved via a struct AliasAnalyzer, which implements the AliasAnalysis trait. The detailed implementation can be found in mop.rs.
#![allow(unused)] fn main() { pub struct AliasAnalyzer<'tcx> { pub tcx: TyCtxt<'tcx>, pub fn_map: FxHashMap<DefId, MopFnAliasPairs>, } }
The results can be retrieved by decoding the data structure of FnAliasPairs .
Suposing the task is to analyze the alias relationship among the return values and the arguments, the approach performs alias analysis for each execution path of a target function and merges the results from different paths into a final result. When encountering function calls, it recursively analyzes the callees until all dependencies are resolved. This approach is path-sensitive and field-sensitive but context-insensitive.
5.1.2.1 Feature: Path Sensitive with Light Reachability Constraints
In the following code, there are two conditional branches. If adopting a lattice-based approach, a could be an alias of either x or y at program point ①. Consequently, the return value could also be an alias of either x or y.
However, this analysis is inaccurate, and a cannot be an alias of y in this program.
In fact, the function foo has four execution paths induced by the two conditional branches. Among them, only one path involves an alias relationship between the return value and y. This path requires choice to be Selector::Second in the first conditional statement and Selector::First in the second conditional statement, which is impossible. Therefore, this path is unreachable, and a should not be considered an alias of y in this program.
#![allow(unused)] fn main() { enum Selector { First, Second, } fn foo<'a>(x: &'a i32, y: &'a i32, choice: Selector) -> &'a i32 { let a = match choice { Selector::First => x, Selector::Second => y, }; // program point ①. match choice { Selector::First => a, Selector::Second => x, } } }
Our MOP-based approach can address this issue by explicitly extracting each execution path and justifying its reachability through the maintenance of a set of constant constraints.
We use the following MIR code for illustration, which corresponds to the previous source code. In bb0, there is a SwitchInt() instruction that branches control flow to bb3 if _5 is 0, to bb1 if _5 is 1, or to bb2 otherwise.
Accordingly, we impose constant constraints on _5 along each branch: in bb3, _5 is constrained to be 0; in bb1, _5 is constrained to be 1; and in bb2, _5 is constrained to be neither 0 nor 1. With these constraints in place, when execution reaches the second conditional statement in bb4, the analysis can rule out two unreachable combinations of branch conditions.
#![allow(unused)] fn main() { fn foo(_1: &i32, _2: &i32, _3: Selector) -> &i32 { bb0: { StorageLive(_4); _5 = discriminant(_3); switchInt(move _5) -> [0: bb3, 1: bb1, otherwise: bb2]; } bb1: { _4 = _2; goto -> bb4; } bb2: { unreachable; } bb3: { _4 = _1; goto -> bb4; } bb4: { _6 = discriminant(_3); switchInt(move _6) -> [0: bb6, 1: bb5, otherwise: bb2]; } bb5: { _0 = _1; goto -> bb7; } bb6: { _0 = _4; goto -> bb7; } bb7: { StorageDead(_4); return; } } }
5.1.2.2 SCC Handling (default)
A major challenge for path-sensitive analysis lies in the presence of strongly connected components (SCCs), since loops may induce an unbounded number of execution paths. We address this challenge based on the following two observations:
- Each SCC in MIR has a dominator, i.e., every loop in MIR is a natural loop. This property allows us to construct SCC trees that capture the hierarchical structure of loops.
- Traversing the same loop path once or multiple times does not change the resulting alias relationships.
Based on these observations, we can soundly bound path exploration by collapsing repetitive loop traversals. In the following, we use a concrete example to demonstrate how these findings enable efficient and precise path enumeration in the presence of loops.
#![allow(unused)] fn main() { enum Selector { First, Second, } // Expected alias analysis result: (0, 1) (0, 2) fn foo(x: *mut i32, y: *mut i32, choice: Selector) -> *mut i32 { let mut r = x; let mut q = x; unsafe { while *r > 0 { let mut p = match choice { Selector::First => y, Selector::Second => x, }; loop { r = q; q = match choice { Selector::First => x, Selector::Second => p, }; *q -= 1; if *r <= 1 { break; } q = y; } if *r == 0 { break; } } } r } }
The following figure illustrates the control-flow graph of the code.

There is a large SCC
{1, 22, 23, 2, 4, 5, 6, 7, 8, 9, 10, 21, 24, 11, 20, 25, 19, 26, 13, 12, 18, 27, 15},
whose dominator is node 1, since every path from the entry node to any node in this SCC must pass through 1.
Next, we identify sub-SCCs by removing the dominator of the SCC. Starting from node 22, which is a successor of the removed dominator 1, we discover a sub-SCC
{7, 8, 9, 10, 21, 24, 11, 20, 25, 19, 26, 13},
with dominator 7.
These two dominators form a hierarchical relationship. In practice, an SCC may contain multiple sub-SCCs, and sub-SCCs may themselves contain further nested SCCs. This naturally yields a tree structure that represents multi-level SCC nesting.
In this way, we enumerate valuable paths of an SCC tree in a recursive manner. We define a path as a sequence from the dominator to an exit of the corresponding sub-SCC. For example, 7-8-10-21-24-11-20-25-19-26 is such a path.
A node sequence is worth further exploration only if the most recent traversal between two occurrences of the same dominator introduces at least one previously unvisited node. For instance,
7-8-10-21-24-11-20-25-19-13-7-8-10-21-24-11-20-25-19-13-7 is not worth exploring, since the second segment is a duplication of the first and does not introduce any new nodes.
In contrast, 7-8-10-21-24-11-20-25-19-13-7-9-10-21-24-11-20-25-19-13-7 is worth exploring because the second segment introduces a new node 9, which may lead to new alias relationships. The resulting path is
7-8-10-21-24-11-20-25-19-13-7-9-10-21-24-11-20-25-19-13-7-8-10-21-24-11-20-25-19-26.
Using this strategy, our approach generates the following 10 paths for the inner SCC.
7-8-10-21-24-11-20-25-19-267-9-10-21-24-11-20-25-19-267-8-10-21-24-11-20-25-19-13-7-8-10-21-24-11-20-25-19-267-8-10-21-24-11-20-25-19-13-7-9-10-21-24-11-20-25-19-267-9-10-21-24-11-20-25-19-13-7-8-10-21-24-11-20-25-19-267-9-10-21-24-11-20-25-19-13-7-9-10-21-24-11-20-25-19-267-8-10-21-24-11-20-25-19-13-7-9-10-21-24-11-20-25-19-13-7-8-10-21-24-11-20-25-19-267-8-10-21-24-11-20-25-19-13-7-9-10-21-24-11-20-25-19-13-7-9-10-21-24-11-20-25-19-267-9-10-21-24-11-20-25-19-13-7-8-10-21-24-11-20-25-19-13-7-8-10-21-24-11-20-25-19-267-9-10-21-24-11-20-25-19-13-7-8-10-21-24-11-20-25-19-13-7-9-10-21-24-11-20-25-19-26
Note that this path enumeration strategy can effectively detect the alias relationship between r and y, for example via the path 7-9-10-21-24-11-20-25-19-13-7-9-10-21-24-11-20-25-19-26.
In contrast, a straightforward DFS-based traversal fails to detect this alias relationship.
Our analysis adopts a recursive strategy instead of an iterative bottom-up approach. A purely bottom-up method, although potentially more efficient, ignores path constraints introduced outside an SCC and may therefore over-approximate reachable paths. To address this limitation, our analysis starts from the outermost SCC and proceeds inward. When the analysis reaches an inner SCC, it propagates the accumulated path constraints from the outer context and uses them to determine which of the 10 inner-SCC paths are reachable. For each reachable path, the analysis then concatenates the corresponding inner-SCC subsequence with the path of the outer SCC, continuing the exploration.
5.1.2.3 SCC Handling with Fix-Point (Experimental)
Motivating Case
The motivating example below illustrates a representative pattern of Loop-Carried State Propagation, featuring nested loops guarded by correlated constraints. The function includes an immutable configuration variable, choice, which synchronizes the behavior of an outer loop (initializing pointer p) and an inner loop (using p to assign r).
#![allow(unused)] fn main() { // Expected result: (r,x) (r,z) fn test_pipeline_correlation<'a>(x: &'a i32, y: &'a i32, z: &'a i32, choice: Selector) -> &'a i32 { let mut r = x; let mut q = z; // [Outer SCC]: Configuration Phase loop { // Defines the 'Source' of the pipeline based on 'choice' // This is the "Input Valve" let mut p = match choice { Selector::First => y, // Path A: Source is 'y' Selector::Second => x, // Path B: Source is 'x' }; // [Inner SCC]: Propagation Phase (The Pipeline) loop { if random() { break; } r = match choice { Selector::First => x, Selector::Second => p, }; p = q; } if random() { break; } } r } }
-
Path A (
Selector::First): The outer loop initializespto aliasy. However, the inner loop explicitly assignsrtox, structurally guardingyfrom reachingr. -
Path B (
Selector::Second): The outer loop initializespto aliasx. A critical mutation (whereqpoints toz) occurs at the end of the inner loop. In the first iteration,ris assignedx, but in subsequent iterations, the back-edge propagates the mutatedp, causingrto aliasz.
The core challenge is that r must strictly alias {x, z}. Traditional methods fail here: state merging (MFP) conflates the contexts of First and Second, falsely suggesting r aliases y. Conversely, linearization methods (spanning trees) sever the back-edge, failing to capture the loop-carried dependency where p transitions to z.
Method Overview
Slice-Splicing Path Generation Inside an SCC, the solver abandons rigid unrolling in favor of a Slice-Splicing strategy. It first decomposes the loop into fundamental behavioral units called Base Slices, which include Exit Slices (terminating the loop) and Loop Slices (completing a single iteration). The solver then dynamically synthesizes full execution traces by recursively splicing these loop slices onto the current path, simulating the loop's execution behavior.
Discrete Constraint Propagation (DCP) To ensure semantic feasibility, the framework integrates a lightweight Discrete Constraint Propagation engine that operates in lockstep with topological traversal. By projecting high-level constructs (enums, booleans) into a unified integer constraint space via Unified Discrete Projection , the solver treats control-flow edges as assertion generators. A Snapshot-Backtracking mechanism manages the constraint context, allowing the solver to speculatively bind variables and prune logically unreachable paths with complexity upon detecting conflicts.
State-Stability Guided Termination The recursion depth of loop unrolling is governed by dual validity checks: Logical Consistency and State Stability. The enumeration continues only if the spliced path is logically satisfiable and induces a substantive change in the abstract state. Once a local fixpoint is reached or an exit slice is appended, the resulting valid internal traces are integrated back into the global path via a Backfilling Operation, propagating precise context-sensitive states to the subsequent control flow.
Case Study
The figure below illustrates how the framework achieves path-sensitive analysis within the cyclic structure of the Motivating Example. The central challenge lies in simultaneously maintaining path constraints while computing a fixpoint for the loop. Leveraging our top-down analysis strategy, the solver inherits constraint information from the predecessor path upon entering the inner SCC. Specifically, the immutable predicate choice is already resolved to either First or Second within the current context, guiding the subsequent path generation.

Constraint-Guided Path Generation.
In the context where choice is determined to be First, the solver enforces strict constraint validation. Since traversing Node 10 requires the conflicting condition selector=Second, all paths attempting to pass through Node 10 are immediately pruned.
- Iteration 1: The path traverses Node 12, introducing a new alias relationship
p=q. Consequently, the trace (6-7-9-11-12) is extended into the second iteration. - Iteration 2: Base paths are spliced onto the current trace. Similarly, branches leading to Node 10 are pruned due to the mutual exclusivity of path constraints.
- Convergence: A fixpoint is reached in the second round. All traces with states marked as
exitare collected, resulting in a set of 4 valid internal traces ready for external integration.
A similar logic applies to the context where choice is Second, with a key distinction in state evolution.
- State Propagation: This execution flow traverses the assignments
r=pandp=q. - Convergence: During the second iteration, the alias relationship is updated to
r=p=z(capturing the loop-carried dependency). Consequently, the analysis requires a third iteration to reach a stable fixpoint. This process ultimately yields a set of 6 valid internal traces.
Global Integration via Backfilling. Upon completing the inner SCC analysis, the global analyzer executes a Backfilling Operation. These result paths are physically grafted onto the end of the paused external path. By carrying the precisely computed context state, this mechanism allows for the seamless resumption of global traversal from the outer successor nodes (e.g., Nodes 8 and 3), ensuring that the complex internal behavior of the loop is accurately reflected in the downstream analysis.
5.1.2.4 Feature: Field Sensitive
In the following example, the return value of foo() is an alias of the first field of its first argument.
#![allow(unused)] fn main() { struct Point { x: i32, y: i32, } fn foo(p1: &Point) -> &i32 { &p1.y } }
The corresponding MIR code is as follows:
#![allow(unused)] fn main() { fn foo(_1: &Point) -> &i32 { bb0: { _0 = &((*_1).1: i32); return; } } }
The alias analysis result should be (0, 1.1).
Key Steps of Our Algorithm
There are three key steps (source code):
#![allow(unused)] fn main() { let mut mop_graph = MopGraph::new(self.tcx, def_id); mop_graph.solve_scc(); mop_graph.check(0, &mut self.fn_map); }
- Graph preparation: Construct the control-flow graph for the target function. See the source code.
- SCC shrinkage: Extract the strongly connected components (SCCs) and shrink SCCs of the control-flow graph. See the source code.
- Alias Check: Traversal the control-flow graph and perform alias analysis. See the source code
Reference
The feature is based on our SafeDrop paper, which was published in TOSEM.
@article{cui2023safedrop,
title={SafeDrop: Detecting memory deallocation bugs of rust programs via static data-flow analysis},
author={Mohan Cui, Chengjun Chen, Hui Xu, and Yangfan Zhou},
journal={ACM Transactions on Software Engineering and Methodology},
volume={32},
number={4},
pages={1--21},
year={2023},
publisher={ACM New York, NY, USA}
}
## 5.1.3 MFP-based Alias Analysis
### 5.1.3.1 Overview
In addition to the path-sensitive MOP (Meet-Over-all-Paths) approach, RAPx provides an alternative alias analysis implementation based on the **MFP (Maximum Fixed Point)** framework. The MFP-based approach is built on Rust compiler's `rustc_mir_dataflow` infrastructure, which provides a general-purpose monotone dataflow analysis framework.
The MFP approach differs fundamentally from MOP in its analysis strategy:
- **Path Insensitivity**: Unlike MOP, which explicitly enumerates and analyzes individual execution paths, MFP merges information from all paths at control-flow join points. This means MFP does not track path-specific constraints (e.g., the value of a discriminant variable across branches).
- **Flow Sensitivity**: Both approaches are flow-sensitive, tracking how alias relationships evolve along the control flow. However, MFP uses a worklist-based fixed-point iteration over the control-flow graph (CFG), while MOP performs explicit path enumeration.
- **Precision Trade-offs**: MFP may produce more conservative (over-approximate) results compared to MOP due to path insensitivity, but it can be more efficient for functions with complex control flow where path enumeration becomes expensive.
#### Example: Path Insensitivity in MFP
The following example demonstrates how MFP's path insensitivity leads to more conservative results compared to MOP:
```rust
enum Selector {
First,
Second,
}
fn foo<'a>(x: &'a i32, y: &'a i32, choice: Selector) -> &'a i32 {
let a = match choice {
Selector::First => x,
Selector::Second => y,
};
match choice {
Selector::First => a,
Selector::Second => x,
}
}
MOP Analysis (Path-Sensitive):
- Path 1:
choice = First→a = x→ returnsa(i.e.,x) → alias(0, 1) - Path 2:
choice = Second→a = y→ returnsx→ alias(0, 1) - Result:
{(0, 1)}(return value aliases only with parameterx)
MFP Analysis (Path-Insensitive):
- After the first
match:amay alias with eitherxory - At the second
match, both branches are considered reachable:- Branch
First: returnsa, which may bexory→ aliases(0, 1)and(0, 2) - Branch
Second: returnsx→ alias(0, 1)
- Branch
- Join at the merge point:
{(0, 1), (0, 2)}(return value may alias with bothxandy) - Result:
{(0, 1), (0, 2)}(includes spurious alias with parametery)
The key difference is that MFP loses the correlation between the two match statements on the same variable choice, leading to a spurious alias (0, 2) that MOP correctly excludes.
The MFP alias analyzer is implemented in the MfpAliasAnalyzer struct (source code), which also implements the AliasAnalysis trait. Like the MOP approach, MFP analysis is field-sensitive and context-insensitive, supporting interprocedural analysis through function summaries.
5.1.3.2 Lattice Design
The MFP approach models alias relationships using a lattice-based abstract domain built on the Union-Find (disjoint-set) data structure.
Abstract Domain
The abstract domain AliasDomain represents alias relationships among program places:
#![allow(unused)] fn main() { pub struct AliasDomain { /// Parent array for Union-Find parent: Vec<usize>, /// Rank for union-by-rank optimization rank: Vec<usize>, } }
Each place in the program is assigned a unique index, and the Union-Find structure tracks equivalence classes of aliased places. Two places are considered aliased if they belong to the same equivalence class (i.e., they have the same root in the Union-Find tree).
Lattice Structure
The AliasDomain forms a join semi-lattice with the following properties:
-
Bottom (⊥): The initial state where no aliases exist. Each place is in its own equivalence class:
parent[i] = ifor alli. -
Partial Order (⊑): Domain
D₁ ⊑ D₂if and only if every alias pair inD₁is also present inD₂. In other words,D₂contains at least as many alias relationships asD₁. -
Join Operation (⊔): The join of two domains
D₁ ⊔ D₂produces a domain containing all alias relationships from both. This is implemented by extracting all alias pairs from one domain and unioning them in the other:
#![allow(unused)] fn main() { impl JoinSemiLattice for AliasDomain { fn join(&mut self, other: &Self) -> bool { let mut changed = false; let pairs = other.get_all_alias_pairs(); for (i, j) in pairs { if self.union(i, j) { changed = true; } } changed } } }
Monotonicity and Convergence
The lattice design ensures monotonicity: transfer functions always produce domains that are higher in the lattice (contain at least as many aliases). This property, combined with the finite height of the lattice (bounded by the number of possible alias pairs), guarantees that the fixed-point iteration will converge.
The Union-Find structure provides efficient operations:
- Find: Determines the root of an equivalence class with path compression:
O(α(n))amortized time. - Union: Merges two equivalence classes using union-by-rank:
O(α(n))amortized time. - Are-Aliased: Checks if two places belong to the same equivalence class:
O(α(n))amortized time.
where α(n) is the inverse Ackermann function, which grows extremely slowly and is effectively constant for all practical purposes.
5.1.3.3 Transfer Functions
The MFP analysis defines transfer functions for each type of MIR statement and terminator using a Kill-Gen pattern: first remove (kill) stale alias relationships, then establish (generate) new ones.
Kill-Gen Pattern
For an assignment lv = ..., the general pattern is:
-
Kill Phase: Remove all existing aliases involving
lvand its field projections (e.g.,lv.0,lv.1.2). This is necessary because assigning tolvoverwrites its previous value and invalidates old aliases. -
Gen Phase: Establish new alias relationships based on the right-hand side of the assignment.
The kill phase is implemented using remove_aliases_with_prefix, which efficiently removes all places that have lv as a prefix:
#![allow(unused)] fn main() { pub fn remove_aliases_with_prefix(&mut self, place_id: &PlaceId, place_info: &PlaceInfo) { let mut indices_to_remove = Vec::new(); for idx in 0..self.parent.len() { if let Some(pid) = place_info.get_place(idx) { if pid.has_prefix(place_id) { indices_to_remove.push(idx); } } } for idx in indices_to_remove { self.remove_aliases(idx); } } }
Assignment: lv = rv
For a simple assignment where rv is a place (via Copy or Move):
#![allow(unused)] fn main() { pub fn transfer_assign<'tcx>( state: &mut AliasDomain, lv: Place<'tcx>, rv: &Operand<'tcx>, place_info: &PlaceInfo<'tcx>, ) { let lv_id = mir_place_to_place_id(lv); let lv_idx = match place_info.get_index(&lv_id) { Some(idx) => idx, None => return, }; if !place_info.may_drop(lv_idx) { return; } // Kill: remove old aliases for lv and all its fields state.remove_aliases_with_prefix(&lv_id, place_info); // Gen: add alias lv ≈ rv if rv is a place if let Some(rv_id) = operand_to_place_id(rv) { if let Some(rv_idx) = place_info.get_index(&rv_id) { if place_info.may_drop(rv_idx) { state.union(lv_idx, rv_idx); sync_fields(state, &lv_id, &rv_id, place_info); } } } } }
Only places that "may drop" (contain pointer-like data) are tracked, as only these are relevant for memory safety analysis.
Reference Creation: lv = &rv or lv = &raw rv
Reference creation is treated similarly to assignment, creating an alias between the reference lv and the referent rv:
#![allow(unused)] fn main() { pub fn transfer_ref<'tcx>( state: &mut AliasDomain, lv: Place<'tcx>, rv: Place<'tcx>, place_info: &PlaceInfo<'tcx>, ) { let lv_id = mir_place_to_place_id(lv); let rv_id = mir_place_to_place_id(rv); // Kill: remove old aliases for lv state.remove_aliases_with_prefix(&lv_id, place_info); // Gen: add alias lv ≈ rv state.union(lv_idx, rv_idx); sync_fields(state, &lv_id, &rv_id, place_info); } }
Aggregate Construction: lv = (op₁, op₂, ...)
For aggregate construction (tuples, structs, arrays), each field of lv is aliased with the corresponding operand:
#![allow(unused)] fn main() { pub fn transfer_aggregate<'tcx>( state: &mut AliasDomain, lv: Place<'tcx>, operands: &[Operand<'tcx>], place_info: &PlaceInfo<'tcx>, ) { let lv_id = mir_place_to_place_id(lv); // Kill: remove old aliases for lv and all its fields state.remove_aliases_with_prefix(&lv_id, place_info); // Gen: for each field, add alias lv.i ≈ operand[i] for (field_idx, operand) in operands.iter().enumerate() { if let Some(rv_id) = operand_to_place_id(operand) { let lv_field_id = lv_id.project_field(field_idx); if let Some(lv_field_idx) = place_info.get_index(&lv_field_id) { if let Some(rv_idx) = place_info.get_index(&rv_id) { state.union(lv_field_idx, rv_idx); sync_fields(state, &lv_field_id, &rv_id, place_info); } } } } } }
Field Synchronization
To ensure field-sensitive precision, when two places are aliased, their corresponding fields should also be aliased. The sync_fields function recursively establishes these relationships:
#![allow(unused)] fn main() { pub fn sync_fields<'tcx>( state: &mut AliasDomain, lv: &PlaceId, rv: &PlaceId, place_info: &PlaceInfo<'tcx>, ) { const MAX_SYNC_DEPTH: usize = 3; // Try to sync common fields (0..N), N = 16 here for field_idx in 0..16 { let lv_field = lv.project_field(field_idx); let rv_field = rv.project_field(field_idx); if let (Some(lv_field_idx), Some(rv_field_idx)) = ( place_info.get_index(&lv_field), place_info.get_index(&rv_field), ) { if place_info.may_drop(lv_field_idx) && place_info.may_drop(rv_field_idx) { state.union(lv_field_idx, rv_field_idx); // Recursively sync nested fields up to MAX_SYNC_DEPTH } } } } }
Function Calls: ret = f(args)
Function calls apply both kill and gen effects:
- Kill: Remove aliases for the return place
ret. - Gen: Apply the function summary (if available) or use a conservative fallback.
The gen effect involves mapping the callee's summary (aliases between its parameters and return value) to the caller's context by substituting actual arguments.
5.1.3.4 Intraprocedural Analysis
The intraprocedural analysis for a single function is implemented by the FnAliasAnalyzer struct, which implements the rustc_mir_dataflow::Analysis trait.
PlaceInfo: Field-Sensitive Place Management
To support field-sensitive analysis, the analyzer maintains a PlaceInfo structure that tracks all program places and their properties:
#![allow(unused)] fn main() { pub struct PlaceInfo<'tcx> { place_to_index: FxHashMap<PlaceId, usize>, index_to_place: Vec<PlaceId>, may_drop: Vec<bool>, need_drop: Vec<bool>, num_places: usize, } }
PlaceId Design: Places are represented using a recursive structure that supports field projections:
#![allow(unused)] fn main() { pub enum PlaceId { /// A local variable (e.g., _1) Local(usize), /// A field projection (e.g., _1.0) Field { base: Box<PlaceId>, field_idx: usize, }, } }
This allows precise tracking of aliases at the field level, e.g., distinguishing _1.0 from _1.1.
Recursive Field Creation: During initialization, PlaceInfo::build analyzes the MIR body and recursively creates field projections for struct and tuple types:
#![allow(unused)] fn main() { fn create_fields_for_type( &mut self, tcx: TyCtxt<'tcx>, ty: Ty<'tcx>, base_place: PlaceId, field_depth: usize, deref_depth: usize, ty_env: TypingEnv<'tcx>, ) { const MAX_FIELD_DEPTH: usize = 5; const MAX_DEREF_DEPTH: usize = 3; match ty.kind() { ty::Adt(adt_def, substs) => { for (field_idx, field) in adt_def.all_fields().enumerate() { let field_ty = field.ty(tcx, substs); let field_place = base_place.project_field(field_idx); self.register_place(field_place.clone(), may_drop, need_drop); // Recursively create nested fields } } // Handle tuples, references, etc. } } }
The analysis limits recursion depth to avoid infinite loops for recursive types.
Fixed-Point Iteration
The dataflow analysis uses the standard worklist algorithm provided by rustc_mir_dataflow:
#![allow(unused)] fn main() { let analyzer = FnAliasAnalyzer::new(tcx, def_id, body, fn_summaries); let mut results = analyzer .iterate_to_fixpoint(tcx, body, None) .into_results_cursor(body); }
The analysis computes the standard dataflow equations:
- OUT[B] = F_B(IN[B]): The output state of a basic block is obtained by applying its transfer function to the input state.
- IN[B] = ⊔_{P ∈ pred(B)} OUT[P]: The input state of a block is the join of the output states of all its predecessors.
The iteration continues until a fixed point is reached, i.e., no block's output state changes between iterations.
Analysis Trait Implementation
The FnAliasAnalyzer implements key methods of the Analysis trait:
#![allow(unused)] fn main() { impl<'tcx> Analysis<'tcx> for FnAliasAnalyzer<'tcx> { type Domain = AliasDomain; fn bottom_value(&self, _body: &Body<'tcx>) -> Self::Domain { // Bottom is no aliases AliasDomain::new(self.place_info.num_places()) } fn apply_primary_statement_effect( &self, state: &mut Self::Domain, statement: &Statement<'tcx>, _location: Location, ) { // Apply transfer functions for each statement kind } fn apply_primary_terminator_effect( &self, state: &mut Self::Domain, terminator: &Terminator<'tcx>, _location: Location, ) -> TerminatorEdges { // Handle function calls, branches, etc. } } }
5.1.3.5 Interprocedural Analysis
The MFP approach achieves interprocedural precision through function summaries and global fixed-point iteration.
Function Summaries
A function summary captures the alias relationships between a function's parameters and its return value, abstracted as a set of AliasPair instances:
#![allow(unused)] fn main() { pub struct FnAliasPairs { arg_size: usize, alias_set: HashSet<AliasPair>, } pub struct AliasPair { pub left_local: usize, // 0 for return value, 1+ for parameters pub lhs_fields: Vec<usize>, pub right_local: usize, pub rhs_fields: Vec<usize>, } }
For example, if a function returns a reference to the first field of its first parameter, the summary would contain AliasPair { left_local: 0, lhs_fields: [], right_local: 1, rhs_fields: [0] }, representing the alias (0, 1.0).
Global Fixed-Point Iteration
The interprocedural analysis operates in three phases:
Phase 1: Collect Reachable Functions
Starting from the local crate's functions (mir_keys), the analyzer recursively collects all reachable functions by traversing call sites:
#![allow(unused)] fn main() { fn collect_reachable_functions(&self, def_id: DefId, reachable: &mut FxHashSet<DefId>) { if reachable.contains(&def_id) || !self.tcx.is_mir_available(def_id) { return; } reachable.insert(def_id); let body = self.tcx.optimized_mir(def_id); for bb_data in body.basic_blocks.iter() { if let TerminatorKind::Call { func, .. } = &terminator.kind { if let Operand::Constant(c) = func { if let ty::FnDef(callee_def_id, _) = c.ty().kind() { self.collect_reachable_functions(*callee_def_id, reachable); } } } } } }
Phase 2: Initialize Summaries
All reachable functions are initialized with empty summaries (⊥):
#![allow(unused)] fn main() { for def_id in reachable_functions.iter() { if let Some(arg_count) = self.get_arg_count(*def_id) { self.fn_map.insert(*def_id, FnAliasPairs::new(arg_count)); } } }
Phase 3: Iterate to Fixed Point
The analyzer iteratively analyzes all functions until their summaries converge:
#![allow(unused)] fn main() { const MAX_ITERATIONS: usize = 10; loop { iteration += 1; let mut changed = false; // Sync current summaries to shared storage { let mut summaries = fn_summaries.borrow_mut(); summaries.clear(); for (def_id, summary) in &self.fn_map { summaries.insert(*def_id, summary.clone()); } } // Re-analyze each function with current summaries for def_id in reachable_functions { let old_summary = self.fn_map.get(def_id).cloned().unwrap(); self.analyze_function(*def_id, &fn_summaries); if let Some(new_summary) = self.fn_map.get(def_id) { if old_summary != new_summary { changed = true; } } } // Check convergence if !changed { break; } if iteration >= MAX_ITERATIONS { break; } } }
This outer fixed-point iteration ensures that when analyzing a function f that calls function g, the analysis uses the most up-to-date summary of g. Convergence is guaranteed by the monotonicity of the lattice.
Summary Extraction
Extracting a function summary from its intraprocedural analysis results involves collecting aliases between function parameters and the return value at return points. The extraction process handles transitive connections through temporary variables and filters redundant information.
The analyzer examines all Return terminators and extracts the alias state at those program points. Since aliases may be connected through temporary variables (e.g., parameter _1 aliases with _15, and _15 aliases with return value _0), the extraction performs a transitive closure to identify all parameter-return alias relationships, even when they are not directly connected in the Union-Find structure.
The extracted candidate aliases are then normalized and filtered to remove redundancy:
- Self-aliases like
(0, 0)are removed - Prefix-subsumed aliases are removed (e.g.,
(0, 1)subsumes(0.0, 1.0))
This ensures the summary is minimal while retaining precision.
5.1.4 Quick Usage Guide
Developers can test the MOP-based alias analysis using the following command:
cargo rapx -alias
or the MFP-based alias analysis using:
cargo rapx -alias-mfp
For example, we can apply the mop analysis to the first case, and the result is as follows:
Checking alias_mop_field...
21:50:18|RAP|INFO|: Start analysis with RAP.
21:50:18|RAP|INFO|: Alias found in Some("::boxed::{impl#0}::new"): {(0.0,1)}
21:50:18|RAP|INFO|: Alias found in Some("::foo"): {(0,1.1),(0,1.0)}
When applying the mop analysis to the first case, and the result is as follows:
Checking alias_mop_switch...
21:53:37|RAP|INFO|: Start analysis with RAP.
21:53:37|RAP|INFO|: Alias found in Some("::foo"): {(0,2),(0,1)}
21:53:37|RAP|INFO|: Alias found in Some("::boxed::{impl#0}::new"): {(0.0,1)}
To utilize the analysis results in other analytical features, developers can refer the following example:
#![allow(unused)] fn main() { // MOP-based approach let mut alias_analysis = AliasAnalyzer::new(self.tcx); alias_analysis.run(); let result = alias_analysis.get_local_fn_alias(); rap_info!("{}", AAResultMapWrapper(result)); }
#![allow(unused)] fn main() { // MFP-based approach let mut analyzer = MfpAliasAnalyzer::new(tcx); analyzer.run(); let alias = analyzer.get_local_fn_alias(); rap_info!("{}", FnAliasMapWrapper(alias)); }
The code above performs alias analysis for each function, recording the alias pairs between two arguments or between an argument and the return value.