clang 19.0.0git
clang::syntax::TokenBuffer Class Reference

A list of tokens obtained by preprocessing a text buffer and operations to map between the expanded and spelled tokens.

#include "clang/Tooling/Syntax/Tokens.h"

Classes

struct  Expansion
 An expansion produced by the preprocessor; this includes macro expansions and preprocessor directives.
 

Public Member Functions

 TokenBuffer (const SourceManager &SourceMgr)
 
 TokenBuffer (TokenBuffer &&)=default
 
 TokenBuffer (const TokenBuffer &)=delete
 
TokenBuffer & operator= (TokenBuffer &&)=default
 
TokenBuffer & operator= (const TokenBuffer &)=delete
 
llvm::ArrayRef< syntax::Token > expandedTokens () const
 All tokens produced by the preprocessor after all macro replacements, directives, etc.
 
void indexExpandedTokens ()
 Builds a cache to make future calls to expandedTokens(SourceRange) faster.
 
llvm::ArrayRef< syntax::Token > expandedTokens (SourceRange R) const
 Returns the subrange of expandedTokens() corresponding to the closed token range R.
 
std::optional< llvm::ArrayRef< syntax::Token > > spelledForExpanded (llvm::ArrayRef< syntax::Token > Expanded) const
 Returns the subrange of spelled tokens corresponding to the AST node spanning Expanded.
 
llvm::SmallVector< llvm::ArrayRef< syntax::Token >, 1 > expandedForSpelled (llvm::ArrayRef< syntax::Token > Spelled) const
 Find the subranges of expanded tokens, corresponding to Spelled.
 
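std::optional< Expansion > expansionStartingAt (const syntax::Token *Spelled) const
 If Spelled starts a mapping (e.g. if it's a macro name or '#' starting a preprocessor directive) return the subrange of expanded tokens that the macro expands to.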
 
std::vector< Expansion > expansionsOverlapping (llvm::ArrayRef< syntax::Token > Spelled) const
 Returns all expansions (partially) expanded from the specified tokens.
 
llvm::ArrayRef< syntax::Token > spelledTokens (FileID FID) const
 Lexed tokens of a file before preprocessing.
 
const syntax::Token * spelledTokenAt (SourceLocation Loc) const
 Returns the spelled Token starting at Loc, or nullptr if there is no such token.
 
std::vector< const syntax::Token * > macroExpansions (FileID FID) const
 Get all tokens that expand a macro in FID.
 
const SourceManager & sourceManager () const
 
std::string dumpForTests () const
 

Friends

class TokenCollector
 

Detailed Description

A list of tokens obtained by preprocessing a text buffer and operations to map between the expanded and spelled tokens.

TokenBuffer has information about two token streams:

  1. Expanded tokens: tokens produced by the preprocessor after all macro replacements.
  2. Spelled tokens: corresponding directly to the source code of a file before any macro replacements occurred.

Here's an example to illustrate the difference between those two:

    #define FOO 10
    int a = FOO;

Spelled tokens are {'#','define','FOO','10','int','a','=','FOO',';'}. Expanded tokens are {'int','a','=','10',';','eof'}.

Note that the expanded token stream has a tok::eof token at the end, whereas the spelled tokens never store an 'eof' token.
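The relationship between the two streams can be sketched with a small standalone model. This is a toy illustration, not the Clang implementation: `ToyStreams` and `toyPreprocess` are hypothetical names, tokens are plain strings, the input is assumed to be pre-tokenized one line at a time, and the only directive handled is an object-like `#define` with a single replacement token.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Toy model, not the Clang implementation: tokens are plain strings, each
// inner vector is one pre-tokenized line, and the only directive understood
// is `# define NAME VALUE` with a single replacement token.
struct ToyStreams {
  std::vector<std::string> Spelled;   // tokens exactly as written
  std::vector<std::string> Expanded;  // tokens after macro replacement
};

ToyStreams toyPreprocess(const std::vector<std::vector<std::string>> &Lines) {
  ToyStreams Out;
  std::map<std::string, std::string> Macros;
  for (const auto &Line : Lines) {
    // Every token of every line is a spelled token, directives included.
    Out.Spelled.insert(Out.Spelled.end(), Line.begin(), Line.end());
    if (!Line.empty() && Line[0] == "#") {
      assert(Line.size() == 4 && Line[1] == "define" && "toy restriction");
      Macros[Line[2]] = Line[3];
      continue;  // a directive contributes nothing to the expanded stream
    }
    for (const auto &Tok : Line) {
      auto It = Macros.find(Tok);
      Out.Expanded.push_back(It == Macros.end() ? Tok : It->second);
    }
  }
  Out.Expanded.push_back("eof");  // only the expanded stream ends with eof
  return Out;
}
```

Feeding the two lines of the FOO example through this model reproduces the spelled and expanded lists quoted above, including the trailing 'eof' that appears only in the expanded stream.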

The full list of expanded tokens can be obtained with expandedTokens(). Spelled tokens for each of the files can be obtained via spelledTokens(FileID).

To map between the expanded and spelled tokens use spelledForExpanded() and expandedForSpelled().

To build a token buffer use the TokenCollector class. You can also compute the spelled tokens of a file using the tokenize() helper.

FIXME: allow mappings into macro arguments.

Definition at line 174 of file Tokens.h.

Constructor & Destructor Documentation

◆ TokenBuffer() [1/3]

clang::syntax::TokenBuffer::TokenBuffer ( const SourceManager & SourceMgr)
inline

Definition at line 176 of file Tokens.h.

◆ TokenBuffer() [2/3]

clang::syntax::TokenBuffer::TokenBuffer ( TokenBuffer &&  )
default

◆ TokenBuffer() [3/3]

clang::syntax::TokenBuffer::TokenBuffer ( const TokenBuffer & )
delete

Member Function Documentation

◆ dumpForTests()

std::string TokenBuffer::dumpForTests ( ) const

Definition at line 912 of file Tokens.cpp.

References clang::File, clang::SourceManager::getFileEntryRefForID(), and clang::T.

◆ expandedForSpelled()

llvm::SmallVector< llvm::ArrayRef< syntax::Token >, 1 > TokenBuffer::expandedForSpelled ( llvm::ArrayRef< syntax::Token > Spelled) const

Find the subranges of expanded tokens, corresponding to Spelled.

Some spelled tokens may not be present in the expanded token stream, so this function can return an empty vector, e.g. for tokens of macro directives or disabled preprocessor branches.

Some spelled tokens can be duplicated in the expanded token stream multiple times and this function will return multiple results in those cases. This happens when Spelled is inside a macro argument.

FIXME: return correct results on macro arguments. For now, we return an empty list.

(!) Will return an empty vector on tokens from a #define body. E.g. for the following example:

    #define FIRST(A) f1 A = A f2
    #define SECOND s

    a FIRST(arg) b SECOND c // expanded tokens are: a f1 arg = arg f2 b s c

the results would be:

    spelled       => expanded
    ---------------------------
    #define FIRST => {}
    a FIRST(arg)  => {a f1 arg = arg f2}
    arg           => {arg, arg} // arg #1 is before '=' and arg #2 is
                                // after '=' in the expanded tokens.

Definition at line 323 of file Tokens.cpp.

References clang::File.

◆ expandedTokens() [1/2]

llvm::ArrayRef< syntax::Token > clang::syntax::TokenBuffer::expandedTokens ( ) const
inline

All tokens produced by the preprocessor after all macro replacements, directives, etc.

Source locations found in the clang AST will always point to one of these tokens. Tokens are in TU order (per SourceManager::isBeforeInTranslationUnit()).

FIXME: figure out how to handle token splitting, e.g. '>>' can be split into two '>' tokens by the parser. However, TokenBuffer currently keeps it as a single '>>' token.

Definition at line 190 of file Tokens.h.

Referenced by expandedTokens(), clang::syntax::TreeBuilder::finalize(), and clang::syntax::TreeBuilder::TreeBuilder().

◆ expandedTokens() [2/2]

llvm::ArrayRef< syntax::Token > TokenBuffer::expandedTokens ( SourceRange  R) const

Returns the subrange of expandedTokens() corresponding to the closed token range R.

Consider calling indexExpandedTokens() beforehand for faster lookups.
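To make "closed token range" concrete: the end of the range names the start of the last token to include, not a position one past it. The sketch below shows one way such a lookup could work over a sorted token list using two binary searches. It is an assumption-laden toy, not the code in Tokens.cpp: `ToyToken` and `toyTokensInRange` are hypothetical, and plain integer offsets stand in for SourceLocation.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Toy stand-in for a token: only its start offset matters here.
struct ToyToken {
  unsigned Offset;
};

// Returns the half-open index range [first, second) of the tokens covered by
// the closed token range [Begin, End], where End is the start offset of the
// last token to include. Assumes Tokens is sorted by Offset.
std::pair<std::size_t, std::size_t>
toyTokensInRange(const std::vector<ToyToken> &Tokens, unsigned Begin,
                 unsigned End) {
  assert(Begin <= End && "range must not be reversed");
  // First token that does not start before Begin.
  auto First = std::partition_point(
      Tokens.begin(), Tokens.end(),
      [&](const ToyToken &T) { return T.Offset < Begin; });
  // First token that starts after End; the token starting at End is kept.
  auto Last = std::partition_point(
      First, Tokens.end(),
      [&](const ToyToken &T) { return T.Offset <= End; });
  return {static_cast<std::size_t>(First - Tokens.begin()),
          static_cast<std::size_t>(Last - Tokens.begin())};
}
```

With tokens starting at offsets 0, 4, 6, 8 and 10, the closed range [4, 8] selects indices 1 through 3, and a range that matches no token start yields an empty result.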

Definition at line 241 of file Tokens.cpp.

References expandedTokens(), clang::SourceRange::getBegin(), clang::SourceRange::getEnd(), and clang::SourceRange::isInvalid().

◆ expansionsOverlapping()

std::vector< TokenBuffer::Expansion > TokenBuffer::expansionsOverlapping ( llvm::ArrayRef< syntax::Token > Spelled) const

Returns all expansions (partially) expanded from the specified tokens.

These are the expansions whose Spelled range intersects Spelled.
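The intersection test can be illustrated with half-open index ranges over the spelled tokens. This is a simplified sketch under stated assumptions, not the actual implementation: `ToySpan` and `toyOverlapping` are hypothetical, ranges are token indices rather than ArrayRefs, and the scan is linear.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Half-open range [Begin, End) of spelled-token indices (toy stand-in).
struct ToySpan {
  std::size_t Begin, End;
};

// Returns the indices of all expansions whose spelled range shares at least
// one token with Query. An empty query overlaps nothing.
std::vector<std::size_t> toyOverlapping(const std::vector<ToySpan> &Expansions,
                                        ToySpan Query) {
  assert(Query.Begin <= Query.End && "malformed query span");
  std::vector<std::size_t> Hits;
  if (Query.Begin == Query.End)
    return Hits;
  for (std::size_t I = 0; I < Expansions.size(); ++I) {
    const ToySpan &E = Expansions[I];
    // Two half-open ranges intersect iff each starts before the other ends.
    if (E.Begin < Query.End && Query.Begin < E.End)
      Hits.push_back(I);
  }
  return Hits;
}
```

A partial overlap is enough for a hit, which matches the "(partially) expanded from" wording above.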

Definition at line 503 of file Tokens.cpp.

References clang::File.

◆ expansionStartingAt()

std::optional< TokenBuffer::Expansion > TokenBuffer::expansionStartingAt ( const syntax::Token * Spelled) const

If Spelled starts a mapping (e.g. if it's a macro name or '#' starting a preprocessor directive) return the subrange of expanded tokens that the macro expands to.

Definition at line 490 of file Tokens.cpp.

References clang::File.

◆ indexExpandedTokens()

void TokenBuffer::indexExpandedTokens ( )

Builds a cache to make future calls to expandedTokens(SourceRange) faster.

Creates the index only once; further calls are no-ops.
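The build-once behaviour can be sketched as a lazily populated lookup table. The class below is a hypothetical toy, assuming integer offsets in place of SourceLocation and an unordered map as the cache; it is not a description of how TokenBuffer stores its index.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <optional>
#include <unordered_map>
#include <utility>
#include <vector>

// Toy container of sorted token offsets with an optional offset -> index cache.
class ToyIndexedTokens {
public:
  explicit ToyIndexedTokens(std::vector<unsigned> Sorted)
      : Offsets(std::move(Sorted)) {
    assert(std::is_sorted(Offsets.begin(), Offsets.end()));
  }

  // Builds the cache on the first call; later calls return immediately.
  void buildIndex() {
    if (!Index.empty())
      return;
    Index.reserve(Offsets.size());
    for (std::size_t I = 0; I < Offsets.size(); ++I)
      Index.emplace(Offsets[I], I);
    ++Builds;
  }

  // Uses the cache when it exists, otherwise falls back to binary search.
  std::optional<std::size_t> find(unsigned Offset) const {
    if (!Index.empty()) {
      auto It = Index.find(Offset);
      if (It == Index.end())
        return std::nullopt;
      return It->second;
    }
    auto It = std::lower_bound(Offsets.begin(), Offsets.end(), Offset);
    if (It == Offsets.end() || *It != Offset)
      return std::nullopt;
    return static_cast<std::size_t>(It - Offsets.begin());
  }

  int builds() const { return Builds; }  // how many times the cache was built

private:
  std::vector<unsigned> Offsets;
  std::unordered_map<unsigned, std::size_t> Index;
  int Builds = 0;
};
```

Lookups return the same answers with or without the cache; the index only changes how quickly they are found.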

Definition at line 228 of file Tokens.cpp.

References clang::SourceLocation::isValid().

◆ macroExpansions()

std::vector< const syntax::Token * > TokenBuffer::macroExpansions ( FileID  FID) const

Get all tokens that expand a macro in FID.

For the following input

    #define FOO B
    #define FOO2(X) int X
    FOO2(XY)
    int B;
    FOO;

macroExpansions() returns {"FOO2", "FOO"} (from lines 3 and 5 respectively).
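As a rough illustration of which spelled tokens count as "expanding a macro", the following toy scans pre-tokenized lines and reports every token outside a directive that names a previously defined macro. `toyMacroExpansions` is hypothetical and deliberately simplified: it returns strings rather than token pointers, understands only `#define`, and does not treat macro names inside macro arguments specially.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Toy model: returns the spelled tokens that name a previously defined macro.
// Assumptions (not Clang behaviour): each inner vector is one pre-tokenized
// line and `# define NAME ...` is the only directive.
std::vector<std::string>
toyMacroExpansions(const std::vector<std::vector<std::string>> &Lines) {
  std::set<std::string> Defined;
  std::vector<std::string> Result;
  for (const auto &Line : Lines) {
    if (!Line.empty() && Line[0] == "#") {
      assert(Line.size() >= 3 && Line[1] == "define" && "toy restriction");
      Defined.insert(Line[2]);  // the macro name follows `# define`
      continue;                 // directive tokens are never reported
    }
    for (const auto &Tok : Line)
      if (Defined.count(Tok))
        Result.push_back(Tok);
  }
  return Result;
}
```

Run on the five-line input above, this model reports FOO2 and then FOO, in source order.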

Definition at line 560 of file Tokens.cpp.

References clang::File, and clang::syntax::Token::kind().

◆ operator=() [1/2]

TokenBuffer & clang::syntax::TokenBuffer::operator= ( const TokenBuffer & )
delete

◆ operator=() [2/2]

TokenBuffer & clang::syntax::TokenBuffer::operator= ( TokenBuffer &&  )
default

◆ sourceManager()

const SourceManager & clang::syntax::TokenBuffer::sourceManager ( ) const
inline

Definition at line 309 of file Tokens.h.

◆ spelledForExpanded()

std::optional< llvm::ArrayRef< syntax::Token > > TokenBuffer::spelledForExpanded ( llvm::ArrayRef< syntax::Token > Expanded) const

Returns the subrange of spelled tokens corresponding to the AST node spanning Expanded.

This is the text that should be replaced if a refactoring were to rewrite the node. If Expanded is empty, the returned value is std::nullopt.

Will fail if the expanded tokens do not correspond to a sequence of spelled tokens. E.g. for the following example:

    #define FIRST f1 f2 f3
    #define SECOND s1 s2 s3
    #define ID2(X, Y) X Y

    a FIRST b SECOND c // expanded tokens are: a f1 f2 f3 b s1 s2 s3 c
    d ID2(e f g, h) i  // expanded tokens are: d e f g h i

the results would be:

    expanded   => spelled
    ------------------------
             a => a
      s1 s2 s3 => SECOND
    a f1 f2 f3 => a FIRST
          a f1 => can't map
         s1 s2 => can't map
           e f => e f
           g h => can't map

EXPECTS: Expanded is a subrange of expandedTokens(). Complexity is logarithmic.
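The "can't map" rows for object-like macros follow from one rule: an expanded range may start or end inside a macro expansion only if it covers that expansion up to its boundary. The sketch below models just that rule. It is a toy under explicit assumptions: `ToyMapping`, `ToyRange` and `toySpelledForExpanded` are hypothetical, ranges are token indices, mappings are assumed sorted and non-overlapping, macro arguments (the ID2 rows) are not modelled, and the scan is linear rather than logarithmic.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <utility>
#include <vector>

// One expansion: spelled tokens [BeginSpelled, EndSpelled) produced expanded
// tokens [BeginExpanded, EndExpanded). A directive has an empty expanded range.
struct ToyMapping {
  std::size_t BeginSpelled, EndSpelled;
  std::size_t BeginExpanded, EndExpanded;
};

using ToyRange = std::pair<std::size_t, std::size_t>;  // half-open

// Maps the expanded range [Begin, End) back to spelled indices, or returns
// nullopt when the range cuts through the middle of an expansion.
std::optional<ToyRange>
toySpelledForExpanded(const std::vector<ToyMapping> &Mappings,
                      std::size_t Begin, std::size_t End) {
  if (Begin >= End)
    return std::nullopt;  // empty input has no spelled counterpart

  // Finds the mapping containing expanded index Pos, if any. Otherwise
  // returns the spelled index of the unexpanded token at Pos.
  auto Locate = [&Mappings](std::size_t Pos, const ToyMapping *&Inside) {
    std::size_t Spelled = Pos;  // no earlier mapping: streams are aligned
    Inside = nullptr;
    for (const ToyMapping &M : Mappings) {
      if (M.BeginExpanded <= Pos && Pos < M.EndExpanded) {
        Inside = &M;
        break;
      }
      if (M.EndExpanded <= Pos)
        Spelled = M.EndSpelled + (Pos - M.EndExpanded);
    }
    return Spelled;
  };

  const ToyMapping *First = nullptr;
  const ToyMapping *Last = nullptr;
  std::size_t SpelledBegin = Locate(Begin, First);
  std::size_t SpelledEnd = Locate(End - 1, Last) + 1;
  if (First) {
    if (Begin != First->BeginExpanded)
      return std::nullopt;  // starts in the middle of an expansion
    SpelledBegin = First->BeginSpelled;
  }
  if (Last) {
    if (End != Last->EndExpanded)
      return std::nullopt;  // ends in the middle of an expansion
    SpelledEnd = Last->EndSpelled;
  }
  assert(SpelledBegin < SpelledEnd);
  return ToyRange(SpelledBegin, SpelledEnd);
}
```

For the FIRST/SECOND line of the example, the two #define directives occupy spelled indices 0 to 11, so 'a' maps to spelled index 12, 's1 s2 s3' maps to the single SECOND token, and 'a f1' or 's1 s2' cannot be mapped.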

Definition at line 403 of file Tokens.cpp.

References clang::File, clang::First, clang::SourceManager::getFileID(), clang::SourceManager::isMacroArgExpansion(), and clang::Last.

◆ spelledTokenAt()

const syntax::Token * TokenBuffer::spelledTokenAt ( SourceLocation  Loc) const

Returns the spelled Token starting at Loc, or nullptr if there is no such token.

Definition at line 386 of file Tokens.cpp.

References clang::SourceManager::getFileID(), clang::SourceLocation::isFileID(), clang::syntax::Token::location(), and spelledTokens().

◆ spelledTokens()

llvm::ArrayRef< syntax::Token > TokenBuffer::spelledTokens ( FileID  FID) const

Lexed tokens of a file before preprocessing.

E.g. for the following input

    #define DECL(name) int name = 10
    DECL(a);

spelledTokens() returns {"#", "define", "DECL", "(", "name", ")", "int", "name", "=", "10", "DECL", "(", "a", ")", ";"}

Definition at line 380 of file Tokens.cpp.

Referenced by spelledTokenAt().

Friends And Related Function Documentation

◆ TokenCollector

friend class TokenCollector
friend

Definition at line 350 of file Tokens.h.


The documentation for this class was generated from the following files: