1. Force-Linker Headers
Warning
The framework is rapidly evolving. The documentation might be out-of-sync with the implementation. The purpose of this documentation is to give context for upcoming reviews.
1.1. The problem
SSAF uses llvm::Registry<> for decentralized registration of summary extractors and serialization formats. Each registration is a file-scope static object whose constructor adds an entry to the global registry:
// In MyExtractor.cpp
static TUSummaryExtractorRegistry::Add<MyExtractor>
    RegisterExtractor("MyExtractor", "My summary extractor");
When the translation unit containing this static object is compiled into a
static library (.a / .lib), the static linker will only pull in
object files that resolve an undefined symbol in the consuming binary.
Because no code ever calls anything in MyExtractor.o directly, the linker
discards the object file — and the registration never runs.
This is not a problem for shared libraries (.so / .dylib), because
the dynamic linker loads the entire shared object and runs all global
constructors unconditionally.
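To make the mechanism concrete, here is a minimal single-file sketch of constructor-based registration. It does not use llvm::Registry<>; ExtractorEntry, extractorRegistry, AddExtractor, and isRegistered are hypothetical stand-ins invented for illustration. Because everything lives in one translation unit, the registration always runs here; the problem described above only appears once the static object sits in an archive member that nothing references.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for an llvm::Registry<> entry; not the real LLVM API.
struct ExtractorEntry {
  std::string Name;
  std::string Description;
};

// Function-local static: constructed on first use, so it is ready before
// any file-scope registration object touches it.
inline std::vector<ExtractorEntry> &extractorRegistry() {
  static std::vector<ExtractorEntry> Entries;
  return Entries;
}

// Plays the role of Registry<>::Add<>: the constructor does the registering.
struct AddExtractor {
  AddExtractor(std::string Name, std::string Description) {
    assert(!Name.empty() && "registry entries need a name");
    extractorRegistry().push_back({std::move(Name), std::move(Description)});
  }
};

// File-scope static object, as in MyExtractor.cpp. Its constructor runs
// during static initialization, but only if this object file is linked in.
static AddExtractor RegisterExtractor("MyExtractor", "My summary extractor");

// Returns true if an entry with the given name was registered.
inline bool isRegistered(const std::string &Name) {
  for (const ExtractorEntry &E : extractorRegistry())
    if (E.Name == Name)
      return true;
  return false;
}
```

If the object file holding RegisterExtractor were dropped by the static linker, a lookup such as isRegistered("MyExtractor") would return false at run time without any build error, which is what makes this failure mode easy to miss.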
1.2. The solution: anchor symbols
Each registration translation unit defines a volatile int anchor symbol:
// In MyExtractor.cpp — next to the registry Add<> object
// NOLINTNEXTLINE(misc-use-internal-linkage)
volatile int SSAFMyExtractorAnchorSource = 0;
A force-linker header declares the symbol as extern and reads it into a
[[maybe_unused]] static int destination:
// In SSAFBuiltinForceLinker.h
extern volatile int SSAFMyExtractorAnchorSource;
[[maybe_unused]] static int SSAFMyExtractorAnchorDestination =
    SSAFMyExtractorAnchorSource;
Any translation unit that #includes this header now has a reference to
SSAFMyExtractorAnchorSource, which forces the linker to pull in
MyExtractor.o — and with it, the static Add<> registration object.
The volatile qualifier is essential: without it the compiler could treat the
read of SSAFMyExtractorAnchorSource as dead (or, with link-time optimization,
constant-fold the 0) and eliminate the reference entirely.
1.2.1. Header hierarchy
SSAFForceLinker.h (umbrella — include this in binaries)
└── SSAFBuiltinForceLinker.h (upstream built-in anchors only)
- clang/include/clang/Analysis/Scalable/SSAFBuiltinForceLinker.h: anchors for upstream-provided (built-in) extractors and formats (e.g. JSONFormat).
- clang/include/clang/Analysis/Scalable/SSAFForceLinker.h: umbrella header that includes SSAFBuiltinForceLinker.h. This is the header that downstream projects should modify to add their own force-linker includes (see How to Extend the Framework).
Include the umbrella header with // IWYU pragma: keep in any translation
unit that must guarantee all registrations are active — typically the entry
point of a binary that uses clangAnalysisScalable:
// In ExecuteCompilerInvocation.cpp
#include "clang/Analysis/Scalable/SSAFForceLinker.h" // IWYU pragma: keep
1.2.2. Naming convention
Anchor symbols follow the pattern SSAF<Component>AnchorSource and
SSAF<Component>AnchorDestination. For example:
- SSAFJSONFormatAnchorSource / SSAFJSONFormatAnchorDestination
- SSAFMyExtractorAnchorSource / SSAFMyExtractorAnchorDestination
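The naming pattern can be spelled out mechanically. The sketch below uses two hypothetical macros, SSAF_DEFINE_ANCHOR and SSAF_REFERENCE_ANCHOR, which are not part of SSAF, and two made-up components, DemoFormat and DemoExtractor. Both halves are placed in one file so the block compiles on its own; in the real layout the definition would live in the registration .cpp file and the reference in a force-linker header.

```cpp
#include <cassert>

// Hypothetical helper macros, NOT part of SSAF. They only spell out the
// SSAF<Component>AnchorSource / SSAF<Component>AnchorDestination pattern.

// Would go in the registration .cpp file, next to the Add<> object.
#define SSAF_DEFINE_ANCHOR(Component)                                          \
  volatile int SSAF##Component##AnchorSource = 0

// Would go in a force-linker header: declares the source and reads it
// into an unused, internal-linkage destination.
#define SSAF_REFERENCE_ANCHOR(Component)                                       \
  extern volatile int SSAF##Component##AnchorSource;                           \
  [[maybe_unused]] static int SSAF##Component##AnchorDestination =             \
      SSAF##Component##AnchorSource

// Two made-up components, used only to show the generated names.
SSAF_DEFINE_ANCHOR(DemoFormat);
SSAF_REFERENCE_ANCHOR(DemoFormat);

SSAF_DEFINE_ANCHOR(DemoExtractor);
SSAF_REFERENCE_ANCHOR(DemoExtractor);

// Each destination was initialized at startup by reading its source,
// so both hold the source's initial value of 0.
inline void checkDemoAnchors() {
  assert(SSAFDemoFormatAnchorDestination == 0);
  assert(SSAFDemoExtractorAnchorDestination == 0);
}
```

Generating both names from a single component token keeps the source and destination spellings in sync, at the cost of making the symbols harder to find with a plain text search. The documentation above writes the declarations out by hand.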
1.3. Considered alternatives
1.3.1. --whole-archive / -force_load
The linker can be instructed to include every object file from a static library, regardless of whether any symbols are referenced:
# GNU ld / lld (Linux, BSD)
-Wl,--whole-archive -lclangAnalysisScalable -Wl,--no-whole-archive
# Apple ld
-Wl,-force_load,libclangAnalysisScalable.a
Since CMake 3.24, the $<LINK_LIBRARY:WHOLE_ARCHIVE,...> generator expression
provides a portable way to do the same:
target_link_libraries(clang PRIVATE
  "$<LINK_LIBRARY:WHOLE_ARCHIVE,clangAnalysisScalable>")
Why we did not choose this approach:
- It is a blunt instrument: all object files in the library are pulled in, increasing binary size. The anchor approach only targets specific object files: only registrations whose anchors are referenced in a force-linker header are pulled in.
- --whole-archive semantics vary across platforms and toolchains, requiring platform-specific CMake logic or the relatively new WHOLE_ARCHIVE generator expression.
1.3.2. Explicit initialization functions
An alternative is a central initializeSSAFRegistrations() function that
explicitly calls into each registration module:
void initializeSSAFRegistrations() {
  initializeJSONFormat();
  initializeMyExtractor();
  // ... one entry per registration
}
Why we did not choose this approach:
- It reintroduces a centralized list that must be maintained manually, defeating the decoupled-registration benefit of llvm::Registry.
- Adding a new extractor or format requires modifying a central file, which increases merge-conflict risk for downstream users.