Hacking on Clang

This document provides some hints for how to get started hacking on Clang for developers who are new to the Clang and/or LLVM codebases.

Coding Standards

Clang follows the LLVM Coding Standards. When submitting patches, please follow these standards and match the style of the surrounding Clang code (for example, in terms of indentation, bracing, and statement spacing).

Clang has a few additional coding standards:

Developer Documentation

Both Clang and LLVM use doxygen to provide API documentation. Their respective web pages (generated nightly) are here:

For work on the LLVM IR generation, the LLVM assembly language reference manual is also useful.

Debugging

Inspecting data structures in a debugger:

Debugging using Visual Studio

The files llvm/utils/LLVMVisualizers/llvm.natvis and clang/utils/ClangVisualizers/clang.natvis provide debugger visualizers that make debugging of more complex data types much easier.

Depending on how you configure the project, Visual Studio may use these visualizers automatically when debugging. Otherwise, you may need to copy the files into %USERPROFILE%\Documents\Visual Studio <version>\Visualizers, or create symbolic links there so they update automatically. See Microsoft's documentation for more details on the use of NATVIS.

Testing

Testing on Unix-like Systems

Clang includes a basic regression suite in the tree which can be run with make test from the top-level clang directory, or just make in the test sub-directory. make VERBOSE=1 can be used to show more detail about what is being run.

If you built LLVM and Clang using CMake, the test suite can be run with make check-clang from the top-level LLVM directory.

The tests primarily consist of a test runner script that runs the compiler under test on individual test files, grouped in the directories under the test directory. Each test file begins with comments that tell the test runner which Clang compile options to use. Embedded comments can also tell the test runner, for example, that an error is expected at the current line. Any output files produced by a test are placed under a created Output directory.
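As a rough illustration of the test file layout described above, the following Python sketch pulls the runner directives out of a small sample test. The sample file is hypothetical, and the "RUN:" and "expected-error" spellings are the conventions used by Clang's lit-based tests rather than something spelled out in the text above. This is not how the real test runner is implemented.

```python
import re

# Hypothetical test file contents, written for illustration only.
SAMPLE_TEST = """\
// RUN: %clang_cc1 -fsyntax-only -verify %s

int f(void) {
  return undeclared; // expected-error {{use of undeclared identifier 'undeclared'}}
}
"""

RUN_LINE = re.compile(r"RUN:\s*(.*)$")
EXPECTED = re.compile(r"expected-(error|warning|note)")


def run_commands(text):
    """Return the command text of every 'RUN:' comment, in file order."""
    commands = []
    for line in text.splitlines():
        match = RUN_LINE.search(line)
        if match:
            commands.append(match.group(1).strip())
    return commands


def expected_diagnostic_lines(text):
    """Return (line number, kind) for each 'expected-*' comment."""
    found = []
    for number, line in enumerate(text.splitlines(), start=1):
        match = EXPECTED.search(line)
        if match:
            found.append((number, match.group(1)))
    return found


print(run_commands(SAMPLE_TEST))
print(expected_diagnostic_lines(SAMPLE_TEST))
```

The first function shows where the compile options live. The second shows that an expected diagnostic is attached to the source line it sits on.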

During the run of make test, the terminal output will display a line similar to the following:

--- Running clang tests for i686-pc-linux-gnu ---

followed by a line continually overwritten with the current test file being compiled, and an overall completion percentage.

After the make test run completes, the absence of any Failing Tests (count): message indicates that no tests failed unexpectedly. If any tests did fail, the Failing Tests (count): message will be followed by a list of the test source file paths that failed. For example:

  Failing Tests (3):
      /home/john/llvm/tools/clang/test/SemaCXX/member-name-lookup.cpp
      /home/john/llvm/tools/clang/test/SemaCXX/namespace-alias.cpp
      /home/john/llvm/tools/clang/test/SemaCXX/using-directive.cpp
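If you capture the test output in a log, a small script can pull out the failing paths. The sketch below is a minimal Python example that assumes the output has exactly the shape shown above (a "Failing Tests (count):" header followed by one path per line). The function name is hypothetical.

```python
import re

# Sample output copied from the example above.
SAMPLE_OUTPUT = """\
--- Running clang tests for i686-pc-linux-gnu ---
  Failing Tests (3):
      /home/john/llvm/tools/clang/test/SemaCXX/member-name-lookup.cpp
      /home/john/llvm/tools/clang/test/SemaCXX/namespace-alias.cpp
      /home/john/llvm/tools/clang/test/SemaCXX/using-directive.cpp
"""

HEADER = re.compile(r"^\s*Failing Tests \((\d+)\):\s*$")


def failing_tests(output):
    """Return the paths listed under 'Failing Tests (count):', or []."""
    lines = output.splitlines()
    for index, line in enumerate(lines):
        match = HEADER.match(line)
        if match:
            count = int(match.group(1))
            listed = lines[index + 1:index + 1 + count]
            return [path.strip() for path in listed]
    # No header means no test failed unexpectedly.
    return []


for path in failing_tests(SAMPLE_OUTPUT):
    print(path)
```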

If you used the make VERBOSE=1 option, the terminal output will reflect the error messages from the compiler and test runner.

The regression suite can also be run with Valgrind by running make test VG=1 in the top-level clang directory.

For more intensive changes, running the LLVM Test Suite with clang is recommended. Currently the best way is to override LLVMGCC, as in: make LLVMGCC="clang -std=gnu89" TEST=nightly report (make sure clang is in your PATH or use the full path).

Testing using Visual Studio on Windows

The Clang test suite can be run from either Visual Studio or the command line.

Note that the test runner is based on Python, which must be installed. Find Python at: https://www.python.org/downloads/. Download the latest stable version.

The GNU core utilities included in Git for Windows are also required to run the tests. Git for Windows is available from https://git-scm.com/download. To override the location of the GNU core utilities used for testing, pass LLVM_LIT_TOOLS_DIR to CMake explicitly.

CMake is set up to create Visual Studio project files for running the tests, with "check-clang" as the root. To run the tests from Visual Studio, right-click the check-clang project and select "Build".

Please see also Getting Started with the LLVM System using Microsoft Visual Studio and Building LLVM with CMake.

Testing on the Command Line

If you want more control over how the tests are run, it may be convenient to run the test harness directly on the command line. Running the check-clang build target generates a script that starts the LLVM Integrated Tester (lit) and can be used to run tests for your current configuration. Because the script is generated before any tests run, you can stop the tests with Ctrl+C once they have started.

Once that is done, all the tests can be executed from the command line by running the generated llvm-lit script as follows:

  (build dir)\bin\llvm-lit (path to llvm)\clang\test
  
For example, if you have a Ninja build in the llvm-project\build_ninja directory, the command to execute from the llvm-project directory would be:

  build_ninja\bin\llvm-lit clang\test

Or, for a Visual Studio Debug build in the llvm-project\build directory, the lit start command to execute from the llvm-project directory would be:

  build\Debug\bin\llvm-lit clang\test
  

You can run a single test or all tests in a specific folder by providing the target test or folder to lit. For example, we can run the wchar.c test:

    build_ninja\bin\llvm-lit clang\test\Sema\wchar.c
  

or all tests in the Sema folder:

    build_ninja\bin\llvm-lit clang\test\Sema
  

Pass in the --no-progress-bar option if you wish to disable progress indications while the tests are running.
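The command lines above all follow the same pattern, which the following Python sketch makes explicit. The function name and its parameters are hypothetical. It only builds the argument list and does not launch lit.

```python
from pathlib import PureWindowsPath


def lit_command(build_dir, test_path, config=None, progress_bar=True):
    """Return the argument list for running llvm-lit on test_path.

    config is the Visual Studio configuration directory (e.g. "Debug");
    leave it as None for single-configuration builds such as Ninja.
    """
    lit = PureWindowsPath(build_dir)
    if config:
        lit = lit / config
    lit = lit / "bin" / "llvm-lit"
    args = [str(lit), str(PureWindowsPath(test_path))]
    if not progress_bar:
        args.append("--no-progress-bar")
    return args


# Single test with a Ninja build, then a whole folder with a VS Debug build.
print(" ".join(lit_command("build_ninja", "clang/test/Sema/wchar.c")))
print(" ".join(lit_command("build", "clang/test/Sema", config="Debug",
                           progress_bar=False)))
```

PureWindowsPath is used so that the separators come out as backslashes regardless of the platform the sketch runs on.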

Your output might look something like this:

lit.py: lit.cfg:152: note: using clang: 'C:\Tools\llvm\bin\Release\clang.EXE'
-- Testing: 2534 tests, 4 threads --
Testing: 0 .. 10.. 20.. 30.. 40.. 50.. 60.. 70.. 80.. 90..
Testing Time: 81.52s
  Passed           : 2503
  Expectedly Failed:   28
  Unsupported      :    3

The "Failed" statistic (not shown if all tests pass) is the important one.
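When the run is scripted, the summary can be checked mechanically. The sketch below is a minimal Python example that assumes the summary lines have the "Name : count" shape shown in the sample output. The function name is hypothetical, and other lit versions may format the summary differently.

```python
import re

# Summary lines copied from the sample output above.
SAMPLE_SUMMARY = """\
Testing Time: 81.52s
  Passed           : 2503
  Expectedly Failed:   28
  Unsupported      :    3
"""

# A statistic line is a name made of letters and spaces, a colon,
# and an integer count; "Testing Time: 81.52s" does not match.
STAT = re.compile(r"^\s*([A-Za-z ]+?)\s*:\s*(\d+)\s*$")


def summary_counts(output):
    """Return a dict mapping each statistic name to its count."""
    counts = {}
    for line in output.splitlines():
        match = STAT.match(line)
        if match:
            counts[match.group(1)] = int(match.group(2))
    return counts


counts = summary_counts(SAMPLE_SUMMARY)
# "Failed" is absent when every test passes, so default to zero.
print("unexpected failures:", counts.get("Failed", 0))
```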

Testing changes affecting libc++

Some changes in Clang affect libc++, for example:

After adjusting libc++ to work with the changes, the next revision will be tested by libc++'s pre-commit CI.

For most configurations, the pre-commit CI uses a recent nightly build of Clang from LLVM's main branch. These configurations do not use the Clang changes in the patch. They only use the libc++ changes.

The "Bootstrapping build" builds Clang and uses it to build and test libc++. This build does use the Clang changes in the patch.

Libc++ supports multiple versions of Clang. Therefore, when a patch changes the diagnostics, it may be necessary to use a regex in the "expected" tests so that they pass the CI with both the old and the new wording.
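To illustrate the idea of a regex that tolerates two wordings of a diagnostic, here is a small Python sketch. Both diagnostic strings are illustrative examples rather than text taken from a particular patch. In an actual test the regex would be written in the verifier's own syntax (Clang's -verify mode offers regex variants such as expected-error-re), not in Python.

```python
import re

# Illustrative old and new wordings of the same diagnostic.
OLD_WORDING = "error: static_assert failed"
NEW_WORDING = "error: static assertion failed"

# One pattern that accepts either wording.
EITHER = re.compile(r"static(_assert| assertion) failed")


def matches_either(diagnostic):
    """Return True if the diagnostic uses the old or the new wording."""
    return EITHER.search(diagnostic) is not None


print(matches_either(OLD_WORDING), matches_either(NEW_WORDING))
```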

Libc++ has more documentation about the pre-commit CI. For questions regarding libc++, the best place to ask is the #libcxx channel on LLVM's Discord server.

Creating Patch Files

To contribute changes to Clang, see LLVM's Getting Started page.

LLVM IR Generation

The LLVM IR generation part of Clang converts the AST nodes output by the Sema module to the LLVM Intermediate Representation (IR). Historically, this was referred to as "codegen", and the Clang code for this lives in lib/CodeGen.

The output is most easily inspected using the -emit-llvm option to clang (possibly in conjunction with -o -). You can also use -emit-llvm-bc to write an LLVM bitcode file which can be processed by the suite of LLVM tools like llvm-dis, llvm-nm, etc. See the LLVM Command Guide for more information.
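A scripted way to look at the generated IR is sketched below in Python. The helper names and the input file name are hypothetical. The -S flag is an addition not mentioned above: it is assumed here so that the clang driver writes textual IR rather than attempting to link.

```python
import shutil
import subprocess


def emit_llvm_command(source, clang="clang"):
    """Return the argument list that prints textual LLVM IR to stdout."""
    # -S is an assumption added for this sketch; -emit-llvm and -o -
    # come from the text above.
    return [clang, "-S", "-emit-llvm", source, "-o", "-"]


def emit_llvm(source, clang="clang"):
    """Run clang and return the textual IR, or None if clang is not found."""
    if shutil.which(clang) is None:
        return None
    result = subprocess.run(emit_llvm_command(source, clang),
                            capture_output=True, text=True, check=True)
    return result.stdout


# "hello.c" is a placeholder file name.
print(" ".join(emit_llvm_command("hello.c")))
```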