1. Force-Linker Headers

Warning

The framework is rapidly evolving. The documentation might be out-of-sync with the implementation. The purpose of this documentation is to give context for upcoming reviews.

1.1. The problem

SSAF uses llvm::Registry<> for decentralized registration of summary extractors and serialization formats. Each registration is a file-scope static object whose constructor adds an entry to the global registry:

// In MyExtractor.cpp
static TUSummaryExtractorRegistry::Add<MyExtractor>
    RegisterExtractor("MyExtractor", "My summary extractor");

When the translation unit containing this static object is compiled into a static library (.a / .lib), the static linker only extracts those object files from the archive that resolve a symbol still undefined in the consuming binary. Because nothing in the binary references a symbol defined in MyExtractor.o, the linker discards that object file, the static object is never constructed, and the registration never runs.

This is not a problem for shared libraries (.so / .dylib), because the dynamic linker loads the entire shared object and runs all global constructors unconditionally.
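The registration pattern described above can be sketched without LLVM. The snippet below is a minimal illustration, not the real llvm::Registry<> API: ExtractorEntry, extractorRegistry(), Add<> and isRegistered() are hypothetical stand-ins that only mimic its shape.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for one registry entry; not the LLVM type.
struct ExtractorEntry {
  std::string Name;
  std::string Description;
};

// A function-local static is constructed on first use, so the registry
// exists before any Add<> constructor appends to it.
std::vector<ExtractorEntry> &extractorRegistry() {
  static std::vector<ExtractorEntry> Entries;
  return Entries;
}

// Mimics the shape of Registry::Add<T>: constructing one appends an entry.
// T is unused in this sketch; a real registry would use it to create
// instances of the registered type.
template <typename T> struct Add {
  Add(std::string Name, std::string Description) {
    assert(!Name.empty() && "a registry entry needs a name");
    extractorRegistry().push_back({std::move(Name), std::move(Description)});
  }
};

struct MyExtractor {};

// File-scope static object: its constructor runs during static
// initialization, but only if this object file ends up in the binary.
static Add<MyExtractor> RegisterExtractor("MyExtractor",
                                          "My summary extractor");

// Returns true if an entry with the given name has been registered.
bool isRegistered(const std::string &Name) {
  for (const ExtractorEntry &E : extractorRegistry())
    if (E.Name == Name)
      return true;
  return false;
}
```

In a single source file like this one, the registration always runs because the object file is necessarily linked. The failure mode only appears once the registration lives in its own object file inside a static library and nothing else references a symbol from that file.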

1.2. The solution: anchor symbols

Each registration translation unit defines a volatile int anchor symbol:

// In MyExtractor.cpp, next to the registry Add<> object in the clang::ssaf namespace
// NOLINTNEXTLINE(misc-use-internal-linkage)
volatile int MyExtractorAnchorSource = 0;

For in-tree anchors, add a single ANCHOR(...) entry to BuiltinAnchorSources.def (in alphabetical order):

// In clang/include/clang/ScalableStaticAnalysisFramework/BuiltinAnchorSources.def
ANCHOR(JSONFormatAnchorSource)
ANCHOR(MyExtractorAnchorSource) // <-- Add here, in alphabetical order

SSAFBuiltinForceLinker.h includes this .def file automatically to generate the extern declarations and the AnchorSources array — there is no need to edit that header directly.

Any translation unit that #includes this header now has a reference to MyExtractorAnchorSource, which forces the linker to pull in MyExtractor.o — and with it, the static Add<> registration object.

The volatile qualifier is essential: without it the compiler could constant-fold the 0 and eliminate the reference entirely.

These anchor symbols must also be mutable (not const). The C++ standard gives const volatile variables external linkage, but MSVC treats them as having internal linkage, so a const anchor would not be visible to other translation units there.
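The way a single list of ANCHOR(...) entries can produce both the extern declarations and the array is the usual X-macro technique. The following is a rough sketch under stated assumptions, not the contents of SSAFBuiltinForceLinker.h: the SSAF_ANCHOR_LIST macro stands in for including the .def file, the element type of AnchorSources and the NumAnchorSources constant are guesses, and the anchors are defined in the same file only to keep the example self-contained.

```cpp
#include <cstddef>

// Hypothetical stand-in for BuiltinAnchorSources.def. A real header would
// instead define ANCHOR, #include the .def file, and then #undef ANCHOR.
#define SSAF_ANCHOR_LIST(ANCHOR)                                               \
  ANCHOR(JSONFormatAnchorSource)                                               \
  ANCHOR(MyExtractorAnchorSource)

namespace clang::ssaf {

// In the real layout each definition lives in its own registration .cpp
// file; they are defined here only so that this sketch links on its own.
volatile int JSONFormatAnchorSource = 0;
volatile int MyExtractorAnchorSource = 0;

// First expansion: one extern declaration per list entry.
#define ANCHOR(NAME) extern volatile int NAME;
SSAF_ANCHOR_LIST(ANCHOR)
#undef ANCHOR

// Second expansion: an array holding the address of every anchor. Taking
// the address is what creates the undefined-symbol reference that makes
// the linker extract the defining object file from the archive.
#define ANCHOR(NAME) &NAME,
static volatile int *const AnchorSources[] = {SSAF_ANCHOR_LIST(ANCHOR)};
#undef ANCHOR

// Number of entries generated from the list (assumed helper).
constexpr std::size_t NumAnchorSources =
    sizeof(AnchorSources) / sizeof(AnchorSources[0]);

} // namespace clang::ssaf
```

With this arrangement, adding an anchor means adding one line to the list, and both expansions pick it up without further edits.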

1.2.1. Header hierarchy

SSAFForceLinker.h                   (umbrella; include this in binaries)
└── SSAFBuiltinForceLinker.h        (upstream built-in anchors only)

  • clang/include/clang/ScalableStaticAnalysisFramework/SSAFBuiltinForceLinker.h: anchors for upstream-provided (built-in) extractors and formats (e.g. JSONFormat).

  • clang/include/clang/ScalableStaticAnalysisFramework/SSAFForceLinker.h: umbrella header that includes SSAFBuiltinForceLinker.h. This is the header that downstream projects should modify to add their own force-linker includes (see How to Extend the Framework).

Include the umbrella header with // IWYU pragma: keep in any translation unit that must guarantee all registrations are active — typically the entry point of a binary that uses clangScalableStaticAnalysisFrameworkCore:

// In ExecuteCompilerInvocation.cpp
#include "clang/ScalableStaticAnalysisFramework/SSAFForceLinker.h" // IWYU pragma: keep

1.2.2. Naming convention

Anchor symbols follow the pattern <Component>AnchorSource in the clang::ssaf namespace. For example:

  • JSONFormatAnchorSource

  • MyExtractorAnchorSource

All anchor sources are aggregated into a single BuiltinAnchorDestination lambda in the force-linker header (see SSAFBuiltinForceLinker.h).
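One plausible shape for such an aggregation is a static variable initialized by an immediately invoked lambda that reads every anchor. This is a sketch only: the actual definition in SSAFBuiltinForceLinker.h may differ, the scope chosen for BuiltinAnchorDestination is an assumption, and the anchors are defined locally just to keep the snippet self-contained.

```cpp
namespace clang::ssaf {
// Normally defined in the individual registration .cpp files.
volatile int JSONFormatAnchorSource = 0;
volatile int MyExtractorAnchorSource = 0;
} // namespace clang::ssaf

// The lambda runs once during static initialization of the including
// translation unit. Each read of a volatile object is an observable side
// effect, so the compiler cannot fold the reads away, and each read
// references the anchor's symbol.
[[maybe_unused]] static int BuiltinAnchorDestination = [] {
  int Sum = 0;
  Sum += clang::ssaf::JSONFormatAnchorSource;
  Sum += clang::ssaf::MyExtractorAnchorSource;
  return Sum;
}();
```

The computed value is irrelevant; what matters is that the including translation unit now references every anchor symbol.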

1.3. Considered alternatives

1.3.1. --whole-archive / -force_load

The linker can be instructed to include every object file from a static library, regardless of whether any symbols are referenced:

# GNU ld / lld (Linux, BSD)
-Wl,--whole-archive -lclangScalableStaticAnalysisFrameworkCore -Wl,--no-whole-archive

# Apple ld
-Wl,-force_load,libclangScalableStaticAnalysisFrameworkCore.a

Since CMake 3.24, the $<LINK_LIBRARY:WHOLE_ARCHIVE,...> generator expression provides a portable way to do the same:

target_link_libraries(clang PRIVATE
  "$<LINK_LIBRARY:WHOLE_ARCHIVE,clangScalableStaticAnalysisFrameworkCore>")

Why we did not choose this approach:

  • It is a blunt instrument — all object files in the library are pulled in, increasing binary size.

  • By contrast, the anchor approach is targeted: only the object files whose anchors are referenced in a force-linker header are pulled in.

  • --whole-archive semantics vary across platforms and toolchains, requiring either platform-specific CMake logic or the WHOLE_ARCHIVE generator expression, which is only available since CMake 3.24.

1.3.2. Explicit initialization functions

An alternative is a central initializeSSAFRegistrations() function that explicitly calls into each registration module:

void initializeSSAFRegistrations() {
  initializeJSONFormat();
  initializeMyExtractor();
  // ... one entry per registration
}

Why we did not choose this approach:

  • It reintroduces a centralized list that must be maintained manually, defeating the decoupled-registration benefit of llvm::Registry.

  • Adding a new extractor or format requires modifying a central file, which increases merge-conflict risk for downstream users.