HIP Support

HIP (Heterogeneous-Compute Interface for Portability, https://github.com/ROCm-Developer-Tools/HIP) is a C++ runtime API and kernel language. It lets developers write portable applications that offload computation to different hardware platforms from a single source code base.

AMD GPU Support

Clang provides HIP support on AMD GPUs via the ROCm platform (https://rocm.docs.amd.com/en/latest/#). The ROCm runtime forms the base for HIP host APIs, while HIP device APIs are realized through HIP header files and the ROCm device library. The Clang driver uses the HIPAMD toolchain to compile HIP device code to AMDGPU ISA via the AMDGPU backend. The compiled code is then bundled and embedded in the host executables.

Intel GPU Support

Clang provides partial HIP support on Intel GPUs using the CHIP-Star project (https://github.com/CHIP-SPV/chipStar). CHIP-Star implements the HIP runtime on top of the oneAPI Level Zero or OpenCL runtime. The Clang driver uses the HIPSPV toolchain to compile HIP device code into LLVM IR, which is subsequently translated to SPIR-V via the SPIR-V backend or the out-of-tree LLVM-SPIRV translator. The SPIR-V is then bundled and embedded into the host executables.

Note

While Clang does not directly provide HIP support for NVIDIA GPUs and CPUs, these platforms are supported via other means.

Example Usage

To compile a HIP program, use the following command:

clang++ -c --offload-arch=gfx906 -xhip sample.cpp -o sample.o

The -xhip option indicates that the source is a HIP program. If the file has a .hip extension, Clang will automatically recognize it as a HIP program:

clang++ -c --offload-arch=gfx906 sample.hip -o sample.o

To link a HIP program, use this command:

clang++ --hip-link --offload-arch=gfx906 sample.o -o sample

In the above command, the --hip-link flag instructs Clang to link the HIP runtime library. The flag is unnecessary when a HIP input file is already among the inputs to the command.

For convenience, Clang also supports compiling and linking in a single step:

clang++ --offload-arch=gfx906 -xhip sample.cpp -o sample

In the above commands, gfx906 is the GPU architecture that the code is being compiled for. The supported GPU architectures are listed in the AMDGPU Processor Table. Alternatively, you can use the amdgpu-arch tool that comes with Clang to list the GPU architectures present on your system:

amdgpu-arch

You can use --offload-arch=native to automatically detect the GPU architectures on your system:

clang++ --offload-arch=native -xhip sample.cpp -o sample
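For reference, a source file such as the sample.cpp used in the commands above might look like the following sketch. It is an illustration only: the names add_kernel and vector_add are hypothetical, the HIP runtime header hip/hip_runtime.h is assumed to be installed, and the __HIP__ guard with a CPU fallback is added here purely so the same file also builds with a non-HIP compiler.

```cpp
// sample.cpp: element-wise vector addition (illustrative sketch).
#include <cstddef>
#include <vector>

#if defined(__HIP__)
#include <hip/hip_runtime.h>

// Device kernel: each thread adds one pair of elements.
__global__ void add_kernel(const float *a, const float *b, float *out,
                           std::size_t n) {
  std::size_t i =
      static_cast<std::size_t>(blockIdx.x) * blockDim.x + threadIdx.x;
  if (i < n)
    out[i] = a[i] + b[i];
}
#endif

// Hypothetical helper: returns the element-wise sum of a and b, truncated to
// the shorter input. Uses the GPU when compiled as HIP, a CPU loop otherwise.
std::vector<float> vector_add(const std::vector<float> &a,
                              const std::vector<float> &b) {
  const std::size_t n = a.size() < b.size() ? a.size() : b.size();
  std::vector<float> out(n);
  if (n == 0)
    return out;
#if defined(__HIP__)
  const std::size_t bytes = n * sizeof(float);
  float *d_a = nullptr, *d_b = nullptr, *d_out = nullptr;
  // Return codes are discarded to keep the sketch short; real code should
  // compare each one against hipSuccess.
  (void)hipMalloc(reinterpret_cast<void **>(&d_a), bytes);
  (void)hipMalloc(reinterpret_cast<void **>(&d_b), bytes);
  (void)hipMalloc(reinterpret_cast<void **>(&d_out), bytes);
  (void)hipMemcpy(d_a, a.data(), bytes, hipMemcpyHostToDevice);
  (void)hipMemcpy(d_b, b.data(), bytes, hipMemcpyHostToDevice);

  const unsigned threads = 256;
  const unsigned blocks = static_cast<unsigned>((n + threads - 1) / threads);
  add_kernel<<<blocks, threads>>>(d_a, d_b, d_out, n);

  // This blocking copy is ordered after the kernel on the same default stream.
  (void)hipMemcpy(out.data(), d_out, bytes, hipMemcpyDeviceToHost);
  (void)hipFree(d_a);
  (void)hipFree(d_b);
  (void)hipFree(d_out);
#else
  // Fallback for non-HIP compilers: plain CPU loop.
  for (std::size_t i = 0; i < n; ++i)
    out[i] = a[i] + b[i];
#endif
  return out;
}
```

The file compiles as-is with the -c commands above. To link it into an executable, add a main function that calls vector_add.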

Path Setting for Dependencies

Compiling a HIP program depends on the HIP runtime and the HIP device library. Their paths can be specified with compiler options or with environment variables. They can also be derived from the ROCm path when the installation follows the ROCm directory structure.

Order of Precedence for HIP Path

  1. --hip-path compiler option

  2. HIP_PATH environment variable (use with caution)

  3. --rocm-path compiler option

  4. ROCM_PATH environment variable (use with caution)

  5. Default automatic detection (relative to Clang or at the default ROCm installation location)

Order of Precedence for Device Library Path

  1. --hip-device-lib-path compiler option

  2. HIP_DEVICE_LIB_PATH environment variable (use with caution)

  3. --rocm-path compiler option

  4. ROCM_PATH environment variable (use with caution)

  5. Default automatic detection (relative to Clang or at the default ROCm installation location)
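Both lists follow the same pattern, which can be sketched as a small lookup. This is not Clang's implementation: choose_path, env_or_empty, PathSource and PathChoice are hypothetical names introduced for illustration, and an empty string stands for "not set".

```cpp
#include <cstdlib>
#include <string>

// Where a path setting came from, in the documented order of precedence.
enum class PathSource { Option, EnvVar, RocmOption, RocmEnvVar, AutoDetect };

struct PathChoice {
  PathSource source;
  std::string value; // empty when the path is left to automatic detection
};

// Hypothetical helper, not Clang's implementation. Returns the first setting
// that is non-empty, checked in the order given in the lists above.
PathChoice choose_path(const std::string &option, const std::string &env_var,
                       const std::string &rocm_option,
                       const std::string &rocm_env) {
  if (!option.empty())
    return {PathSource::Option, option};          // 1. specific option
  if (!env_var.empty())
    return {PathSource::EnvVar, env_var};         // 2. specific env variable
  if (!rocm_option.empty())
    return {PathSource::RocmOption, rocm_option}; // 3. --rocm-path
  if (!rocm_env.empty())
    return {PathSource::RocmEnvVar, rocm_env};    // 4. ROCM_PATH
  return {PathSource::AutoDetect, ""};            // 5. automatic detection
}

// Reads an environment variable, mapping "unset" to the empty string.
std::string env_or_empty(const char *name) {
  const char *v = std::getenv(name);
  return v ? std::string(v) : std::string();
}
```

For the HIP path, a caller would pass the --hip-path value, env_or_empty("HIP_PATH"), the --rocm-path value and env_or_empty("ROCM_PATH"). For steps 3 and 4 the returned value is the ROCm root, not the final directory. The HIP runtime or device library location inside it still follows from the ROCm directory structure.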

Compiler Option               Environment Variable  Description                                          Default Value
----------------------------  --------------------  ---------------------------------------------------  --------------------------------------
--rocm-path=<path>            ROCM_PATH             Specifies the ROCm installation path.                Automatic detection
--hip-path=<path>             HIP_PATH              Specifies the HIP runtime installation path.         Determined by ROCm directory structure
--hip-device-lib-path=<path>  HIP_DEVICE_LIB_PATH   Specifies the HIP device library installation path.  Determined by ROCm directory structure

Note

We recommend using the compiler options as the primary method for specifying these paths. While the environment variables ROCM_PATH, HIP_PATH, and HIP_DEVICE_LIB_PATH are supported, their use can lead to implicit dependencies that might cause issues in the long run. Use them with caution.

Predefined Macros

Macro                                  Description
-------------------------------------  ------------------------------------------------------------
__CLANG_RDC__                          Defined when Clang is compiling code in Relocatable Device Code (RDC) mode. RDC, enabled with the -fgpu-rdc compiler option, is necessary for linking device code across translation units.
__HIP__                                Defined when compiling with HIP language support, indicating that the code targets the HIP environment.
__HIPCC__                              Alias to __HIP__.
__HIP_DEVICE_COMPILE__                 Defined during device code compilation. Clang compiles the source separately for the host and for each offloading GPU architecture.
__HIP_MEMORY_SCOPE_SINGLETHREAD        Represents single-thread memory scope in HIP (value is 1).
__HIP_MEMORY_SCOPE_WAVEFRONT           Represents wavefront memory scope in HIP (value is 2).
__HIP_MEMORY_SCOPE_WORKGROUP           Represents workgroup memory scope in HIP (value is 3).
__HIP_MEMORY_SCOPE_AGENT               Represents agent memory scope in HIP (value is 4).
__HIP_MEMORY_SCOPE_SYSTEM              Represents system-wide memory scope in HIP (value is 5).
__HIP_NO_IMAGE_SUPPORT__               Defined with a value of 1 when the target device lacks support for HIP image functions.
__HIP_NO_IMAGE_SUPPORT                 Alias to __HIP_NO_IMAGE_SUPPORT__. Deprecated.
__HIP_API_PER_THREAD_DEFAULT_STREAM__  Defined when the GPU default stream is set to per-thread mode.
HIP_API_PER_THREAD_DEFAULT_STREAM      Alias to __HIP_API_PER_THREAD_DEFAULT_STREAM__. Deprecated.
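As an illustration of how these macros tend to be used, the following sketch guards HIP-only code with __HIP__, selects a code path with __HIP_DEVICE_COMPILE__, and checks the memory scope values from the table. The names HD, compiled_for_device and scope_name are hypothetical, and the header hip/hip_runtime.h is assumed to be available when compiling as HIP.

```cpp
#include <string>

#if defined(__HIP__)
#include <hip/hip_runtime.h>
#define HD __host__ __device__
// When compiling as HIP, verify the values listed in the table above.
static_assert(__HIP_MEMORY_SCOPE_SINGLETHREAD == 1, "unexpected scope value");
static_assert(__HIP_MEMORY_SCOPE_WAVEFRONT == 2, "unexpected scope value");
static_assert(__HIP_MEMORY_SCOPE_WORKGROUP == 3, "unexpected scope value");
static_assert(__HIP_MEMORY_SCOPE_AGENT == 4, "unexpected scope value");
static_assert(__HIP_MEMORY_SCOPE_SYSTEM == 5, "unexpected scope value");
#else
#define HD // plain C++: no host/device attributes
#endif

// Hypothetical helper: returns true only in the copy of this function that
// was produced by a device compilation pass.
HD inline bool compiled_for_device() {
#if defined(__HIP_DEVICE_COMPILE__)
  return true;
#else
  return false;
#endif
}

// Hypothetical helper (host side): maps a memory scope value to a readable
// name, using the numeric values documented in the table above.
inline std::string scope_name(int scope) {
  switch (scope) {
  case 1: return "singlethread";
  case 2: return "wavefront";
  case 3: return "workgroup";
  case 4: return "agent";
  case 5: return "system";
  default: return "unknown";
  }
}
```

Because the host and each GPU architecture are compiled separately, compiled_for_device() returns false when called from host code and true when called from a kernel.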