clang 22.0.0git
clang::syntax::TokenBuffer Class Reference

A list of tokens obtained by preprocessing a text buffer and operations to map between the expanded and spelled tokens.

#include "clang/Tooling/Syntax/Tokens.h"

Classes

struct  Expansion
 An expansion produced by the preprocessor; this includes macro expansions and preprocessor directives.

Public Member Functions

 TokenBuffer (const SourceManager &SourceMgr)
 TokenBuffer (TokenBuffer &&)=default
 TokenBuffer (const TokenBuffer &)=delete
TokenBuffer & operator= (TokenBuffer &&)=default
TokenBuffer & operator= (const TokenBuffer &)=delete
llvm::ArrayRef< syntax::Token > expandedTokens () const
 All tokens produced by the preprocessor after all macro replacements, directives, etc.
void indexExpandedTokens ()
 Builds a cache to make future calls to expandedTokens(SourceRange) faster.
llvm::ArrayRef< syntax::Token > expandedTokens (SourceRange R) const
 Returns the subrange of expandedTokens() corresponding to the closed token range R.
std::optional< llvm::ArrayRef< syntax::Token > > spelledForExpanded (llvm::ArrayRef< syntax::Token > Expanded) const
 Returns the subrange of spelled tokens corresponding to AST node spanning Expanded.
llvm::SmallVector< llvm::ArrayRef< syntax::Token >, 1 > expandedForSpelled (llvm::ArrayRef< syntax::Token > Spelled) const
 Find the subranges of expanded tokens, corresponding to Spelled.
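std::optional< Expansion > expansionStartingAt (const syntax::Token *Spelled) const
 If Spelled starts a mapping (e.g. if it's a macro name or '#' starting a preprocessor directive), returns the subrange of expanded tokens that the macro expands to.
std::vector< Expansion > expansionsOverlapping (llvm::ArrayRef< syntax::Token > Spelled) const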
 Returns all expansions (partially) expanded from the specified tokens.
llvm::ArrayRef< syntax::Token > spelledTokens (FileID FID) const
 Lexed tokens of a file before preprocessing.
const syntax::Token * spelledTokenContaining (SourceLocation Loc) const
 Returns the spelled token containing Loc, or nullptr if there is no such token.
std::vector< const syntax::Token * > macroExpansions (FileID FID) const
 Get all tokens that expand a macro in FID.
const SourceManager & sourceManager () const
std::string dumpForTests () const

Friends

class TokenCollector

Detailed Description

A list of tokens obtained by preprocessing a text buffer and operations to map between the expanded and spelled tokens.

TokenBuffer has information about two token streams:

  1. Expanded tokens: tokens produced by the preprocessor after all macro replacements.
  2. Spelled tokens: tokens corresponding directly to the source code of a file before any macro replacements occurred.

Here's an example to illustrate the difference between the two:

    #define FOO 10
    int a = FOO;

Spelled tokens are {'#','define','FOO','10','int','a','=','FOO',';'}. Expanded tokens are {'int','a','=','10',';','eof'}.

Note that the expanded token stream has a tok::eof token at the end; the spelled tokens never store an 'eof' token.
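To make the two streams concrete, here is a minimal standalone sketch. It is a toy model, not clang's implementation: the names Line, toySpelled and toyExpand are hypothetical, tokens are plain strings, and only object-like #define directives are handled.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical toy type: the tokens spelled on one source line.
using Line = std::vector<std::string>;

// Flattens the lines into the spelled stream. No eof marker is stored.
std::vector<std::string> toySpelled(const std::vector<Line> &Lines) {
  std::vector<std::string> Out;
  for (const Line &L : Lines)
    Out.insert(Out.end(), L.begin(), L.end());
  return Out;
}

// Produces the expanded stream: directives vanish, macro names are replaced
// by their bodies, and an "eof" marker is appended at the end.
// Assumption: only `# define NAME body...` without parameters is supported.
std::vector<std::string> toyExpand(const std::vector<Line> &Lines) {
  std::map<std::string, Line> Macros;
  std::vector<std::string> Expanded;
  for (const Line &L : Lines) {
    if (L.size() >= 3 && L[0] == "#" && L[1] == "define") {
      // Directive tokens are spelled but never reach the expanded stream.
      Macros[L[2]] = Line(L.begin() + 3, L.end());
      continue;
    }
    for (const std::string &Tok : L) {
      auto It = Macros.find(Tok);
      if (It == Macros.end())
        Expanded.push_back(Tok);
      else
        Expanded.insert(Expanded.end(), It->second.begin(), It->second.end());
    }
  }
  Expanded.push_back("eof");
  return Expanded;
}
```

Feeding the two lines of the FOO example above to toySpelled and toyExpand reproduces the spelled and expanded lists shown in the text.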

The full list of expanded tokens can be obtained with expandedTokens(). Spelled tokens for each of the files can be obtained via spelledTokens(FileID).

To map between the expanded and spelled tokens use spelledForExpanded() and expandedForSpelled().

To build a token buffer use the TokenCollector class. You can also compute the spelled tokens of a file using the tokenize() helper.

FIXME: allow mappings into macro arguments.

Definition at line 174 of file Tokens.h.

Constructor & Destructor Documentation

◆ TokenBuffer() [1/3]

clang::syntax::TokenBuffer::TokenBuffer ( const SourceManager & SourceMgr)
inline

Definition at line 176 of file Tokens.h.

Referenced by operator=(), operator=(), TokenBuffer(), and TokenBuffer().

◆ TokenBuffer() [2/3]

clang::syntax::TokenBuffer::TokenBuffer ( TokenBuffer && )
default

References TokenBuffer().

◆ TokenBuffer() [3/3]

clang::syntax::TokenBuffer::TokenBuffer ( const TokenBuffer & )
delete

References TokenBuffer().

Member Function Documentation

◆ dumpForTests()

std::string TokenBuffer::dumpForTests ( ) const

Definition at line 911 of file Tokens.cpp.

References clang::File, and clang::T.

◆ expandedForSpelled()

llvm::SmallVector< llvm::ArrayRef< syntax::Token >, 1 > TokenBuffer::expandedForSpelled ( llvm::ArrayRef< syntax::Token > Spelled) const

Find the subranges of expanded tokens, corresponding to Spelled.

Some spelled tokens may not be present in the expanded token stream, so this function can return an empty vector, e.g. for tokens of macro directives or disabled preprocessor branches.

Some spelled tokens can be duplicated in the expanded token stream multiple times and this function will return multiple results in those cases. This happens when Spelled is inside a macro argument.

FIXME: return correct results on macro arguments. For now, we return an empty list.

(!) Returns an empty vector for tokens from a #define body. E.g. for the following example:

    #define FIRST(A) f1 A = A f2
    #define SECOND s

    a FIRST(arg) b SECOND c // expanded tokens are: a f1 arg = arg f2 b s c

the results would be:

    spelled        => expanded
    ------------------------
    #define FIRST  => {}
    a FIRST(arg)   => {a f1 arg = arg f2}
    arg            => {arg, arg} // arg #1 is before '=' and arg #2 is
                                 // after '=' in the expanded tokens.
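The duplication of a macro argument in the expanded stream can be sketched with a small standalone model. This is a toy illustration of the table's `arg` row only, not clang's code: ToyExpandedTok, toySubstitute and toyExpandedPositions are hypothetical names, and per the FIXME above the real function currently returns an empty list for macro arguments.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical toy token: its text plus the index of the spelled token it
// was copied from, or -1 if it came from the macro body.
struct ToyExpandedTok {
  std::string Text;
  int SpelledIndex;
};

// Substitutes one argument into the body of a one-parameter macro.
// ArgSpelledIndex is the position of the argument in the spelled stream.
std::vector<ToyExpandedTok> toySubstitute(const std::vector<std::string> &Body,
                                          const std::string &Param,
                                          const std::string &Arg,
                                          int ArgSpelledIndex) {
  std::vector<ToyExpandedTok> Out;
  for (const std::string &Tok : Body) {
    if (Tok == Param)
      Out.push_back({Arg, ArgSpelledIndex}); // one copy per use of the parameter
    else
      Out.push_back({Tok, -1});
  }
  return Out;
}

// Every expanded position that was copied from the given spelled token.
std::vector<std::size_t>
toyExpandedPositions(const std::vector<ToyExpandedTok> &Expanded,
                     int SpelledIndex) {
  std::vector<std::size_t> Positions;
  for (std::size_t I = 0; I < Expanded.size(); ++I)
    if (Expanded[I].SpelledIndex == SpelledIndex)
      Positions.push_back(I);
  return Positions;
}
```

With the body of FIRST and the argument `arg`, the single spelled token yields two expanded positions, one on each side of '='.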

Definition at line 321 of file Tokens.cpp.

References clang::File.

◆ expandedTokens() [1/2]

llvm::ArrayRef< syntax::Token > clang::syntax::TokenBuffer::expandedTokens ( ) const
inline

All tokens produced by the preprocessor after all macro replacements, directives, etc.

Source locations found in the clang AST will always point to one of these tokens. Tokens are in TU order (per SourceManager::isBeforeInTranslationUnit()).

FIXME: figure out how to handle token splitting, e.g. '>>' can be split into two '>' tokens by the parser. However, TokenBuffer currently keeps it as a single '>>' token.

Definition at line 190 of file Tokens.h.

Referenced by expandedTokens().

◆ expandedTokens() [2/2]

llvm::ArrayRef< syntax::Token > TokenBuffer::expandedTokens ( SourceRange R) const

Returns the subrange of expandedTokens() corresponding to the closed token range R.

Consider calling indexExpandedTokens() beforehand for faster lookups.
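The idea of a closed token range and of a build-once lookup cache can be sketched as follows. This is a standalone toy under stated assumptions, not the clang implementation: ToyTokens, buildIndex and range are hypothetical names, and each token is identified only by the offset of its first character.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <optional>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical toy container: token start offsets in increasing order.
class ToyTokens {
public:
  explicit ToyTokens(std::vector<unsigned> Offs) : Offsets(std::move(Offs)) {}

  // Builds the offset -> index cache once; later calls do nothing.
  void buildIndex() {
    if (Index)
      return;
    Index.emplace();
    for (std::size_t I = 0; I < Offsets.size(); ++I)
      (*Index)[Offsets[I]] = I;
  }

  // Half-open index range [first, last) of the tokens covered by the closed
  // token range [BeginOffset, EndOffset]. EndOffset names the start of the
  // last token, and that token is included in the result.
  std::pair<std::size_t, std::size_t> range(unsigned BeginOffset,
                                            unsigned EndOffset) const {
    assert(BeginOffset <= EndOffset);
    if (Index) { // fast path: two hash lookups
      auto B = Index->find(BeginOffset);
      auto E = Index->find(EndOffset);
      if (B != Index->end() && E != Index->end())
        return {B->second, E->second + 1};
    }
    // Slow path: binary search over the sorted offsets.
    auto B = std::lower_bound(Offsets.begin(), Offsets.end(), BeginOffset);
    auto E = std::upper_bound(Offsets.begin(), Offsets.end(), EndOffset);
    return {static_cast<std::size_t>(B - Offsets.begin()),
            static_cast<std::size_t>(E - Offsets.begin())};
  }

private:
  std::vector<unsigned> Offsets;
  std::optional<std::unordered_map<unsigned, std::size_t>> Index;
};
```

In this sketch both paths return the same range; the cache only changes how the endpoints are found.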

Definition at line 239 of file Tokens.cpp.

References expandedTokens(), clang::SourceRange::getBegin(), clang::SourceRange::getEnd(), and clang::SourceRange::isInvalid().

◆ expansionsOverlapping()

std::vector< TokenBuffer::Expansion > TokenBuffer::expansionsOverlapping ( llvm::ArrayRef< syntax::Token > Spelled) const

Returns all expansions (partially) expanded from the specified tokens.

These are the expansions whose Spelled range intersects Spelled.

Definition at line 502 of file Tokens.cpp.

References clang::File.

◆ expansionStartingAt()

std::optional< TokenBuffer::Expansion > TokenBuffer::expansionStartingAt ( const syntax::Token * Spelled) const

If Spelled starts a mapping (e.g. if it's a macro name or '#' starting a preprocessor directive), returns the subrange of expanded tokens that the macro expands to.

Definition at line 489 of file Tokens.cpp.

References clang::File.

◆ indexExpandedTokens()

void TokenBuffer::indexExpandedTokens ( )

Builds a cache to make future calls to expandedTokens(SourceRange) faster.

Creates the index only once. Further calls are no-ops.

Definition at line 226 of file Tokens.cpp.

References clang::SourceLocation::isValid().

◆ macroExpansions()

std::vector< const syntax::Token * > TokenBuffer::macroExpansions ( FileID FID) const

Get all tokens that expand a macro in FID.

For the following input

    #define FOO B
    #define FOO2(X) int X
    FOO2(XY)
    int B;
    FOO;

macroExpansions() returns {"FOO2", "FOO"} (from lines 3 and 5 respectively).

Definition at line 559 of file Tokens.cpp.

References clang::File, and clang::syntax::Token::kind().

◆ operator=() [1/2]

TokenBuffer & clang::syntax::TokenBuffer::operator= ( const TokenBuffer & )
delete

References TokenBuffer().

◆ operator=() [2/2]

TokenBuffer & clang::syntax::TokenBuffer::operator= ( TokenBuffer && )
default

References TokenBuffer().

◆ sourceManager()

const SourceManager & clang::syntax::TokenBuffer::sourceManager ( ) const
inline

◆ spelledForExpanded()

std::optional< llvm::ArrayRef< syntax::Token > > TokenBuffer::spelledForExpanded ( llvm::ArrayRef< syntax::Token > Expanded) const

Returns the subrange of spelled tokens corresponding to AST node spanning Expanded.

This is the text that should be replaced if a refactoring were to rewrite the node. If Expanded is empty, the returned value is std::nullopt.

Will fail if the expanded tokens do not correspond to a sequence of spelled tokens. E.g. for the following example:

    #define FIRST f1 f2 f3
    #define SECOND s1 s2 s3
    #define ID2(X, Y) X Y

    a FIRST b SECOND c // expanded tokens are: a f1 f2 f3 b s1 s2 s3 c
    d ID2(e f g, h) i  // expanded tokens are: d e f g h i

the results would be:

    expanded   => spelled
    ------------------------
             a => a
      s1 s2 s3 => SECOND
    a f1 f2 f3 => a FIRST
          a f1 => can't map
         s1 s2 => can't map
           e f => e f
           g h => can't map

EXPECTS: Expanded is a subrange of expandedTokens(). Complexity is logarithmic.
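The "can't map" rows can be reproduced with a small standalone model of the expansion mappings. This is a sketch under simplifying assumptions, not clang's algorithm: ToyMapping, toySpelledBoundary and toySpelledForExpanded are hypothetical names, only object-like macros are modelled (no directives, no macro arguments), and the scan is linear rather than logarithmic.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <utility>
#include <vector>

// One macro expansion: spelled tokens [BeginSpelled, EndSpelled) produced
// expanded tokens [BeginExpanded, EndExpanded). All ranges are half-open.
struct ToyMapping {
  std::size_t BeginSpelled, EndSpelled, BeginExpanded, EndExpanded;
};

// Maps a boundary between expanded tokens back to the spelled stream.
// Fails if the boundary falls strictly inside an expansion.
// Assumption: Ms is sorted and the mappings do not overlap.
std::optional<std::size_t>
toySpelledBoundary(const std::vector<ToyMapping> &Ms, std::size_t Expanded) {
  std::size_t SpelledBase = 0, ExpandedBase = 0;
  for (const ToyMapping &M : Ms) {
    if (Expanded <= M.BeginExpanded)
      break; // boundary lies before this expansion
    if (Expanded < M.EndExpanded)
      return std::nullopt; // boundary cuts through this expansion
    SpelledBase = M.EndSpelled;
    ExpandedBase = M.EndExpanded;
  }
  // Outside expansions, tokens correspond one to one.
  return SpelledBase + (Expanded - ExpandedBase);
}

// Spelled range for the expanded range [Begin, End), or nullopt if the
// range is empty or either end splits an expansion.
std::optional<std::pair<std::size_t, std::size_t>>
toySpelledForExpanded(const std::vector<ToyMapping> &Ms, std::size_t Begin,
                      std::size_t End) {
  if (Begin >= End)
    return std::nullopt;
  auto B = toySpelledBoundary(Ms, Begin);
  auto E = toySpelledBoundary(Ms, End);
  if (!B || !E)
    return std::nullopt;
  return std::make_pair(*B, *E);
}
```

For the line `a FIRST b SECOND c`, the mappings would be {1, 2, 1, 4} for FIRST and {3, 4, 5, 8} for SECOND. The expanded range [0, 4) (a f1 f2 f3) then maps to the spelled range [0, 2) (a FIRST), while [0, 2) (a f1) fails because it ends inside the FIRST expansion.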

Definition at line 402 of file Tokens.cpp.

References clang::File, clang::First, clang::Last, and Next.

◆ spelledTokenContaining()

const syntax::Token * TokenBuffer::spelledTokenContaining ( SourceLocation Loc) const

Returns the spelled token containing Loc, or nullptr if there is no such token.

Definition at line 385 of file Tokens.cpp.

References clang::SourceLocation::isFileID(), spelledTokens(), and Tok.

◆ spelledTokens()

llvm::ArrayRef< syntax::Token > TokenBuffer::spelledTokens ( FileID FID) const

Lexed tokens of a file before preprocessing.

E.g. for the following input

    #define DECL(name) int name = 10
    DECL(a);

spelledTokens() returns {"#", "define", "DECL", "(", "name", ")", "int", "name", "=", "10", "DECL", "(", "a", ")", ";"}.

Definition at line 378 of file Tokens.cpp.

Referenced by clang::syntax::spelledIdentifierTouching(), spelledTokenContaining(), and clang::syntax::spelledTokensTouching().

◆ TokenCollector

friend class TokenCollector
friend

Definition at line 350 of file Tokens.h.

References TokenCollector.

Referenced by TokenCollector.


The documentation for this class was generated from the following files: