LLVM IR Generation for EH and Cleanups
Overview
This document describes how Clang’s LLVM IR generation represents exception handling (EH) and C++ cleanups. It focuses on the data structures and control flow patterns used to model normal and exceptional exits, and it outlines how the generated IR differs across common ABI models.
For details on the LLVM IR representation of exception handling, see LLVM Exception Handling.
Core Model
EH and cleanup handling is centered around an EHScopeStack that records nested scopes for:
Cleanups, which run on normal control flow, exceptional control flow, or both. These are used for destructors, full-expression cleanups, and other scope-exit actions.
Catch scopes, which represent try/catch handlers.
Filter scopes, used to model dynamic exception specifications and some platform-specific filters.
Terminate scopes, used for noexcept and similar termination paths.
Each cleanup is a small object with an Emit method. When a cleanup scope is
popped, the IR generator decides whether it must materialize a normal cleanup
block (for fallthrough, branch-through, or unresolved goto fixups) and/or an
EH cleanup entry (when exceptional control flow can reach the cleanup). This
results in a flattened CFG where cleanup lifetime is represented by the blocks
and edges that flow into those blocks.
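The push/pop discipline described above can be illustrated with a toy model. This is only a sketch and not Clang's implementation: ToyCleanup, ToyScopeStack, and runToyScopes are hypothetical names, and "emitting" a cleanup just appends a line to a log instead of building IR blocks.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a cleanup object: it only knows how to "emit"
// its action, here by appending a line to a log instead of building IR.
struct ToyCleanup {
  std::string Name;
  bool NormalPath; // reachable on normal exits
  bool EHPath;     // reachable on exceptional exits

  void Emit(std::vector<std::string> &Log, const std::string &Kind) const {
    Log.push_back(Kind + " cleanup: " + Name);
  }
};

// Hypothetical scope stack: cleanups are pushed as scopes are entered and
// materialized when they are popped, innermost first.
class ToyScopeStack {
  std::vector<ToyCleanup> Scopes;

public:
  void push(ToyCleanup C) { Scopes.push_back(std::move(C)); }
  bool empty() const { return Scopes.empty(); }

  // Popping decides which flavors of the cleanup need to exist.
  void pop(std::vector<std::string> &Log) {
    assert(!Scopes.empty() && "pop on an empty scope stack");
    ToyCleanup C = std::move(Scopes.back());
    Scopes.pop_back();
    if (C.NormalPath)
      C.Emit(Log, "normal");
    if (C.EHPath)
      C.Emit(Log, "EH");
  }
};

// Pushes two cleanups and pops them, returning what was "emitted".
std::vector<std::string> runToyScopes() {
  ToyScopeStack Stack;
  std::vector<std::string> Log;
  Stack.push({"~A", true, true});  // needed on both normal and EH paths
  Stack.push({"~B", true, false}); // nothing after it can throw
  while (!Stack.empty())
    Stack.pop(Log);
  return Log;
}
```

In this sketch the inner cleanup is materialized first, and only the cleanup marked as reachable from exceptional control flow gets an EH entry.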
Key Components
The LLVM IR generation for EH and cleanups is spread across several core components:
CodeGenModule owns module-wide state such as the LLVM module, target information, and the selected EH personality function. It provides access to ABI helpers via CGCXXABI and target-specific hooks.
CodeGenFunction manages per-function state and IR building. It owns the EHScopeStack, tracks the current insertion point, and emits blocks, calls, and branches. Most cleanup and EH control flow is built here.
EHScopeStack is the central stack of scopes used to model EH and cleanup semantics. It stores EHCleanupScope entries for cleanups, along with EHCatchScope, EHFilterScope, and EHTerminateScope for handlers and termination logic.
EHCleanupScope stores the cleanup object plus state data (active flags, fixup depth, and enclosing scope links). When a cleanup scope is popped, CodeGenFunction decides whether to emit a normal cleanup block, an EH cleanup entry, or both.
Cleanup emission helpers implement the mechanics of branching through cleanups, threading fixups, and emitting cleanup blocks.
Exception emission helpers implement landing pads, dispatch blocks, personality selection, and helper routines for try/catch, filters, and terminate handling.
CGCXXABI (and its ABI-specific implementations such as ItaniumCXXABI and MicrosoftCXXABI) provide ABI-specific lowering for throws, catch handling, and destructor emission details.
The cleanup and exception handling code generation is driven by the flow of CodeGenFunction and its helper classes traversing the AST to emit IR for C++ expressions, classes, and statements.
AST traversal in CodeGenFunction emits code and pushes cleanups or EH scopes,
EHScopeStack records scope nesting, cleanup and exception helpers materialize
the CFG as scopes are popped, and CGCXXABI supplies ABI-specific details for
landing pads or funclets.
Cleanup Destination Routing
When multiple control flow exits (return, break, continue,
fallthrough) pass through the same cleanup, the generated IR shares a single
cleanup block among them. Before entering the cleanup, each exit path stores a
unique index into a “cleanup destination” slot. After the cleanup code runs, a
switch instruction loads this index and dispatches to the appropriate final
destination. This avoids duplicating cleanup code for each exit while preserving
correct control flow.
For example, if a function has both a return and a break that exit
through the same destructor cleanup, both paths branch to the shared cleanup
block after storing their respective destination indices. The cleanup epilogue
then switches on the stored index to reach either the return block or the
loop-exit block.
When only a single exit passes through a cleanup (the common case), the switch is unnecessary and the cleanup block branches directly to its sole destination.
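The routing scheme can be mimicked by hand in ordinary C++. The following is a rough sketch rather than what Clang emits: routeThroughSharedCleanup, the Dest enumerators, and the "cleanup" log entry are all hypothetical, and a local variable plays the role of the cleanup destination slot.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hand-lowered illustration of several exits sharing one cleanup block.
// Source being modeled (hypothetical):
//   for (int X : Values) {
//     Guard G;                      // has a destructor
//     if (X < 0) return "return";
//     if (X == 0) break;
//   }
//   return "loop-exit";
std::string routeThroughSharedCleanup(const std::vector<int> &Values,
                                      std::vector<std::string> &Log) {
  enum Dest { Continue = 0, LoopExit = 1, Return = 2 };
  for (int X : Values) {
    Dest CleanupDest; // plays the role of the "cleanup destination" slot
    if (X < 0)
      CleanupDest = Return; // return path stores its index
    else if (X == 0)
      CleanupDest = LoopExit; // break path stores its index
    else
      CleanupDest = Continue; // fallthrough path stores its index

    // Shared cleanup block: written once, entered from every exit.
    Log.push_back("cleanup");

    // Cleanup epilogue: switch on the stored index to reach the real target.
    switch (CleanupDest) {
    case Return:
      return "return";
    case LoopExit:
      goto loop_exit;
    default:
      assert(CleanupDest == Continue && "unknown cleanup destination");
      break; // proceed to the next iteration
    }
  }
loop_exit:
  return "loop-exit";
}
```

Each exit costs one store before the cleanup and one switch case after it, while the cleanup body itself appears only once.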
Branch Fixups for Forward Gotos
A goto statement that jumps forward to a label not yet seen poses a special
problem. The destination’s enclosing cleanup scope is unknown at the point the
goto is emitted. This is handled by emitting an optimistic branch and
recording a “fixup.” When the cleanup scope is later popped, any recorded fixups
are resolved by rewriting the branch to thread through the cleanup block and
adding the destination to the cleanup’s switch.
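A source-level example of the situation that requires a fixup is sketched below. Tracer and forwardGotoThroughCleanup are hypothetical names chosen for illustration. The example only shows the observable behavior that the fixup mechanism has to preserve, not the IR itself.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical RAII type that records when its destructor runs.
struct Tracer {
  std::vector<std::string> &Log;
  explicit Tracer(std::vector<std::string> &L) : Log(L) {}
  ~Tracer() { Log.push_back("~Tracer"); }
};

std::vector<std::string> forwardGotoThroughCleanup(bool Jump) {
  std::vector<std::string> Log;
  {
    Tracer T(Log);
    if (Jump)
      goto done; // label not seen yet: which cleanups it leaves is unknown
    Log.push_back("body");
  } // popping this scope is where a pending branch would be threaded
  Log.push_back("after-scope");
done:
  // On either path the destructor has already run by the time we get here.
  assert(!Log.empty() && "cleanup must run before reaching the label");
  Log.push_back("done");
  return Log;
}
```

When Jump is true, the destructor still runs before control reaches the label, which is the guarantee that rewriting the optimistic branch through the cleanup block provides.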
Exceptional Cleanups and EH Dispatch
Exceptional exits (throw, invoke unwinds) are routed through EH cleanup
entries, which are reached via a landing pad or a funclet dispatch block,
depending on the target ABI.
For Itanium-style EH (such as is used on x86-64 Linux), the IR uses invoke
to call potentially-throwing operations and a landingpad instruction to
capture the exception and selector values. The landing pad aggregates any
catch and cleanup clauses for the current scope, and branches to a dispatch
block that compares the selector to type IDs and jumps to the appropriate
handler.
For Windows, LLVM IR uses funclet-style EH: catchswitch and catchpad for
handlers, and cleanuppad for cleanups, with catchret and cleanupret
edges to resume normal flow. The personality function determines how these pads
are interpreted by the backend.
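The ordering that both lowering styles must preserve can be observed from plain C++. The sketch below uses hypothetical names (Tracer, dispatchException) and says nothing about the exact IR produced. It only shows that the EH cleanup runs before the selected handler, and that the handler is chosen by the thrown type.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical RAII type; its destructor stands in for an EH cleanup.
struct Tracer {
  std::vector<std::string> &Log;
  explicit Tracer(std::vector<std::string> &L) : Log(L) {}
  ~Tracer() { Log.push_back("~Tracer"); }
};

// Kind selects the path: 0 = no throw, 1 = std::runtime_error, 2 = int.
std::vector<std::string> dispatchException(int Kind) {
  assert(Kind >= 0 && Kind <= 2 && "unsupported Kind");
  std::vector<std::string> Log;
  try {
    Tracer T(Log); // live across throwing code, so it needs an EH cleanup
    if (Kind == 1)
      throw std::runtime_error("boom");
    if (Kind == 2)
      throw 42;
    Log.push_back("no-throw");
  } catch (const std::runtime_error &) {
    // Reached when dispatch matches the first handler's type.
    Log.push_back("catch runtime_error");
  } catch (int) {
    // Reached when dispatch matches the second handler's type.
    Log.push_back("catch int");
  }
  return Log;
}
```

In each throwing case the log records the destructor first and the handler second. On the non-throwing path the same destructor runs as a normal cleanup.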
Personality and ABI Selection
Each function with exception handling constructs is associated with a personality function (e.g., __gxx_personality_v0 for C++ on Linux). The personality function determines the ABI-specific EH behavior of the function. IR generation selects a personality function based on language options and the target ABI (e.g., Itanium, MSVC SEH, SJLJ, Wasm EH). This decision affects:
Whether the IR uses landing pads or funclet pads.
The shape of dispatch logic for catch and filter scopes.
How termination or rethrow paths are modeled.
Whether certain helper functions such as exception filters must be outlined.
Because the personality choice is made during IR generation, the CFG shape directly reflects ABI-specific details.
Example: Array of Objects with Throwing Constructor
Consider:
    class MyClass {
    public:
      MyClass(); // may throw
      ~MyClass();
    };

    void doSomething(); // may throw

    void f() {
      MyClass arr[4];
      doSomething();
    }
High-level behavior
Construction of arr proceeds element-by-element. If an element constructor throws, destructors must run for any elements that were successfully constructed before the throw in reverse order of construction.
After full construction, the call to doSomething may throw, in which case the destructors for all constructed elements must run, in reverse order.
On normal exit, destructors for all elements run in reverse order.
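These rules can be checked with an instrumented variant of the example. The sketch below is illustrative only: Element, its static counters, ThrowOnId, and buildArray are hypothetical additions that make the destruction order observable.

```cpp
#include <cassert>
#include <stdexcept>
#include <vector>

// Hypothetical instrumented element type. The static members exist only to
// make construction and destruction order observable.
struct Element {
  static int NextId;
  static int ThrowOnId;              // id whose constructor throws; 0 disables
  static std::vector<int> Destroyed; // ids in the order destructors ran

  int Id;
  Element() : Id(++NextId) {
    if (Id == ThrowOnId)
      throw std::runtime_error("constructor failed");
  }
  ~Element() { Destroyed.push_back(Id); }
};
int Element::NextId = 0;
int Element::ThrowOnId = 0;
std::vector<int> Element::Destroyed;

// Builds a four-element array, optionally failing in the ThrowOn-th
// constructor, and returns the ids in the order their destructors ran.
std::vector<int> buildArray(int ThrowOn) {
  assert(ThrowOn >= 0 && ThrowOn <= 4 && "array has four elements");
  Element::NextId = 0;
  Element::ThrowOnId = ThrowOn;
  Element::Destroyed.clear();
  try {
    Element Arr[4];
    (void)Arr; // silence unused-variable warnings
  } catch (const std::runtime_error &) {
    // Only fully constructed elements have been destroyed at this point.
  }
  return Element::Destroyed;
}
```

With ThrowOn set to 3, only the first two elements are destroyed, second element first. With ThrowOn set to 0, all four are destroyed in reverse order on normal scope exit.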
Codegen flow and key components
The surrounding compound statement enters a CodeGenFunction::LexicalScope, which is a RunCleanupsScope and is responsible for popping local cleanups at the end of the block.
CodeGenFunction::EmitDecl routes the local variable to CodeGenFunction::EmitVarDecl and then CodeGenFunction::EmitAutoVarDecl, which in turn calls EmitAutoVarAlloca, EmitAutoVarInit, and EmitAutoVarCleanups.
CodeGenFunction::EmitCXXAggrConstructorCall emits the array constructor loop. While emitting the loop body, it enters a RunCleanupsScope and uses CodeGenFunction::pushRegularPartialArrayCleanup to register a cleanup before calling CodeGenFunction::EmitCXXConstructorCall for one element in the loop iteration. If this constructor were to throw an exception, the cleanup handler would destroy the previously constructed elements in reverse order.
CodeGenFunction::EmitAutoVarCleanups calls emitAutoVarTypeCleanup, which ultimately registers a DestroyObject cleanup via CodeGenFunction::pushDestroy/pushFullExprCleanup for the full-array destructor path.
DestroyObject uses CodeGenFunction::destroyCXXObject, which emits the actual destructor call via CodeGenFunction::EmitCXXDestructorCall.
Cleanup emission helpers (e.g., CodeGenFunction::PopCleanupBlock and CodeGenFunction::EmitBranchThroughCleanup) thread both normal and EH exits through the cleanup blocks as scopes are popped.
The cleanup is represented as an EHCleanupScope on EHScopeStack, and its Emit method generates a loop that calls the destructor on the initialized range in reverse order.
The above function names and flow are accurate as of LLVM 22.0, but this is subject to change as the code evolves, and this document might not be updated to reflect the exact functions used.
Example: Temporary object materialization
Consider:
    class MyClass {
    public:
      MyClass();
      ~MyClass();
    };

    // Takes a const reference so that a temporary can bind to the parameter.
    void useMyClass(const MyClass &);

    void f() {
      useMyClass(MyClass());
    }
High-level behavior
The temporary MyClass is materialized for the call argument.
The temporary must be destroyed at the end of the full-expression, both on the normal path and on the exceptional path if useMyClass throws.
If the constructor throws, the temporary is not considered constructed and no destructor runs.
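The first two rules can be observed with a small instrumented program. This is a sketch under stated assumptions: Temp, useTemp, callWithTemporary, and the global Events log are hypothetical names, and the callee takes a const reference so that the temporary can bind to it.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical global event log used to observe ordering.
static std::vector<std::string> Events;

struct Temp {
  Temp() { Events.push_back("ctor"); }
  ~Temp() { Events.push_back("dtor"); }
};

// Hypothetical callee that optionally throws after using its argument.
void useTemp(const Temp &, bool Throw) {
  Events.push_back("use");
  if (Throw)
    throw std::runtime_error("use failed");
}

std::vector<std::string> callWithTemporary(bool Throw) {
  Events.clear();
  try {
    useTemp(Temp(), Throw); // temporary lives until the end of this statement
    Events.push_back("after-call");
  } catch (const std::runtime_error &) {
    Events.push_back("caught");
  }
  // Both paths record ctor, use, dtor, and one trailing event.
  assert(Events.size() == 4 && "temporary must be destroyed on both paths");
  return Events;
}
```

On both paths the destructor entry appears immediately after the use entry: before the next statement on the normal path, and before the handler on the exceptional path.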
Codegen flow and key functions
CodeGenFunction::EmitExprWithCleanups wraps the full-expression in a RunCleanupsScope so that full-expression cleanups are run after the call.
CodeGenFunction::EmitMaterializeTemporaryExpr creates storage for the temporary via createReferenceTemporary and initializes it. For record temporaries this flows through EmitAnyExprToMem and CodeGenFunction::EmitCXXConstructExpr, which calls CodeGenFunction::EmitCXXConstructorCall.
pushTemporaryCleanup registers the destructor as a full-expression cleanup by calling CodeGenFunction::pushDestroy for SD_FullExpression temporaries.
The cleanup ultimately uses DestroyObject and CodeGenFunction::destroyCXXObject, which emits CodeGenFunction::EmitCXXDestructorCall.
The above function names and flow are accurate as of LLVM 22.0, but this is subject to change as the code evolves, and this document might not be updated to reflect the exact functions used.