clang-tools 20.0.0git
Namespaces | |
namespace | detail |
Classes | |
struct | Chunk |
NOTE: This is an implementation detail. More... | |
class | Corpus |
class | Dex |
In-memory Dex trigram-based index implementation. More... | |
class | Iterator |
Iterator is the interface for a Query Tree node. More... | |
class | PostingList |
PostingList is the storage of DocIDs which can be inserted into the Query Tree as a leaf by constructing an Iterator over the PostingList object. More... | |
class | Token |
A Token represents an attribute of a symbol, such as a particular trigram present in the name (used for fuzzy search). More... | |
class | Trigram |
Typedefs | |
using | DocID = uint32_t |
Symbol position in the list of all index symbols sorted by a pre-computed symbol quality. | |
Functions | |
llvm::StringRef | findPathInURI (llvm::StringRef S) |
llvm::SmallVector< llvm::StringRef, ProximityURILimit > | generateProximityURIs (llvm::StringRef) |
Returns search tokens for a number of parent directories of the given path. | |
std::vector< std::pair< DocID, float > > | consume (Iterator &It) |
Advances the iterator until it is exhausted. | |
template<typename Func > | |
static void | identifierTrigrams (llvm::StringRef Identifier, Func Out) |
void | generateIdentifierTrigrams (llvm::StringRef Identifier, std::vector< Trigram > &Out) |
Produces a list of unique fuzzy-search trigrams from an unqualified symbol name. | |
std::vector< Token > | generateQueryTrigrams (llvm::StringRef Query) |
Returns a list of unique fuzzy-search trigrams for a given query. | |
Variables | |
constexpr unsigned | ProximityURILimit = 5 |
using clang::clangd::dex::DocID = uint32_t |
Symbol position in the list of all index symbols sorted by a pre-computed symbol quality.
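As an illustration of the definition above, the following minimal sketch orders a handful of symbols by quality so that each symbol's index in the resulting list plays the role of its DocID. The `SketchSymbol` struct and `namesByDocID` helper are hypothetical, and sorting in descending order of quality (best symbol first) is an assumption, not something this page states.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <limits>
#include <string>
#include <vector>

using DocID = uint32_t;

// Hypothetical symbol record: just a name and a pre-computed quality.
struct SketchSymbol {
  std::string Name;
  float Quality;
};

// Returns symbol names ordered so that the index of each name is its DocID.
// Assumption: higher quality sorts first.
std::vector<std::string> namesByDocID(std::vector<SketchSymbol> Symbols) {
  // Every position must be representable as a DocID.
  assert(Symbols.size() <= std::numeric_limits<DocID>::max());
  std::stable_sort(Symbols.begin(), Symbols.end(),
                   [](const SketchSymbol &A, const SketchSymbol &B) {
                     return A.Quality > B.Quality;
                   });
  std::vector<std::string> Names;
  for (const SketchSymbol &S : Symbols)
    Names.push_back(S.Name);
  return Names;
}
```

With this ordering, a smaller DocID always means an equal or better quality, so a consumer that walks DocIDs in increasing order sees the highest-quality symbols first.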
Definition at line 45 of file Iterator.h.
std::vector< std::pair< DocID, float > > clang::clangd::dex::consume | ( | Iterator & | It | ) |
Advances the iterator until it is exhausted.
Returns pairs of document IDs with the corresponding boosting score.
Boosting can be seen as a compromise between two extremes: retrieving too many items and calculating the final score for each of them (which might be very expensive), and retrieving too few items, so that items with a very high final score are never processed. The boosting score is a computationally cheap way to obtain preliminary scores for the requested items.
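The loop described above can be sketched with a stand-in iterator. `VectorIterator` and `consumeAll` are hypothetical names used only for this illustration; the member functions mirror the `advance()`, `peek()`, `consume()` and `reachedEnd()` calls listed in the references below, but the real `Iterator` interface may differ in detail.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

using DocID = uint32_t;

// Hypothetical stand-in for dex::Iterator: walks a sorted list of DocIDs and
// reports the same boosting score for every document.
class VectorIterator {
public:
  VectorIterator(std::vector<DocID> Docs, float Boost)
      : Docs(std::move(Docs)), Boost(Boost) {}

  bool reachedEnd() const { return Index == Docs.size(); }
  void advance() {
    assert(!reachedEnd());
    ++Index;
  }
  DocID peek() const {
    assert(!reachedEnd());
    return Docs[Index];
  }
  // Boosting score of the current document.
  float consume() const {
    assert(!reachedEnd());
    return Boost;
  }

private:
  std::vector<DocID> Docs;
  float Boost;
  std::size_t Index = 0;
};

// Drains the iterator, pairing each DocID with its boosting score.
std::vector<std::pair<DocID, float>> consumeAll(VectorIterator &It) {
  std::vector<std::pair<DocID, float>> Result;
  for (; !It.reachedEnd(); It.advance())
    Result.emplace_back(It.peek(), It.consume());
  return Result;
}
```

A caller would then typically keep only the entries with the best boosting scores before computing the more expensive final score for each of them.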
Definition at line 357 of file Iterator.cpp.
References clang::clangd::dex::Iterator::advance(), clang::clangd::dex::Iterator::consume(), clang::clangd::dex::Iterator::peek(), and clang::clangd::dex::Iterator::reachedEnd().
Referenced by clang::clangd::dex::Dex::fuzzyFind().
llvm::StringRef clang::clangd::dex::findPathInURI | ( | llvm::StringRef | S | ) |
Definition at line 405 of file Dex.cpp.
References C, and findPathInURI().
Referenced by findPathInURI(), and generateProximityURIs().
void clang::clangd::dex::generateIdentifierTrigrams | ( | llvm::StringRef | Identifier, |
std::vector< Trigram > & | Out | ||
) |
Produces a list of unique fuzzy-search trigrams from an unqualified symbol name.
The trigrams give the 3-character query substrings this symbol can match.
The symbol's name is broken into segments, e.g. "FooBar" has two segments. Trigrams can start at any character in the input. Then we can choose to move to the next character or to the start of the next segment.
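Short trigrams (length 1-2) are used for short queries. These are the prefixes of the identifier of length 1 and 2, and the first character followed by the next head character.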
For "FooBar" we get the following trigrams: {f, fo, fb, foo, fob, fba, oob, oba, bar}.
Trigrams are lowercase, as trigram matching is case-insensitive. Trigrams in the list are deduplicated.
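The rules above can be approximated in a few lines. This is a simplified sketch, not the code in Trigram.cpp: the hypothetical `sketchIdentifierTrigrams` treats the first character and every uppercase letter as the start of a segment, whereas the real function obtains segment boundaries from `calculateRoles()`, and it returns a `std::set` instead of filling a `std::vector< Trigram >`.

```cpp
#include <array>
#include <cassert>
#include <cctype>
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// Simplified trigram generation. Assumption: a segment starts at index 0 and
// at every uppercase letter.
std::set<std::string> sketchIdentifierTrigrams(const std::string &Identifier) {
  const std::size_t N = Identifier.size();
  std::set<std::string> Out; // The set deduplicates trigrams.
  if (N == 0)
    return Out;

  std::string Lower;
  for (char C : Identifier)
    Lower += static_cast<char>(std::tolower(static_cast<unsigned char>(C)));

  // NextHead[I] is the index of the first segment start after I, or N.
  std::vector<std::size_t> NextHead(N, N);
  std::size_t Head = N;
  for (std::size_t I = N; I-- > 0;) {
    NextHead[I] = Head;
    if (I == 0 || std::isupper(static_cast<unsigned char>(Identifier[I])))
      Head = I;
  }

  // From each position, step either to the next character or to the start of
  // the next segment, twice.
  for (std::size_t I = 0; I < N; ++I) {
    for (std::size_t J : std::array<std::size_t, 2>{I + 1, NextHead[I]}) {
      if (J >= N)
        continue;
      for (std::size_t K : std::array<std::size_t, 2>{J + 1, NextHead[J]}) {
        if (K >= N)
          continue;
        assert(I < J && J < K); // Positions always move forward.
        Out.insert(std::string{Lower[I], Lower[J], Lower[K]});
      }
    }
  }

  // Short trigrams: prefixes of length 1 and 2, and first character plus the
  // next segment start.
  Out.insert(Lower.substr(0, 1));
  if (N >= 2)
    Out.insert(Lower.substr(0, 2));
  if (NextHead[0] < N)
    Out.insert(std::string{Lower[0], Lower[NextHead[0]]});
  return Out;
}
```

For "FooBar" this sketch yields the same nine trigrams as the example above. Identifiers with digits, underscores or runs of uppercase letters are segmented differently by the real implementation.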
Definition at line 100 of file Trigram.cpp.
References clang::clangd::Identifier, and identifierTrigrams().
llvm::SmallVector< llvm::StringRef, 5 > clang::clangd::dex::generateProximityURIs | ( | llvm::StringRef | ) |
Returns search tokens for a number of parent directories of the given path.
Should be used within the index build process.
This function is exposed for testing only.
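To give a feel for what "a number of parent directories" means, here is a minimal sketch that works on plain slash-separated paths rather than URIs. The names `sketchProximityPaths` and `SketchProximityLimit` are hypothetical. Including the input itself as the first entry, walking from the nearest parent outwards, and leaving out the root are all assumptions of this sketch. Only the limit of 5 comes from this page (`ProximityURILimit`).

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Mirrors ProximityURILimit = 5 from this page.
constexpr unsigned SketchProximityLimit = 5;

// Returns the path followed by its parent directories, nearest first, capped
// at SketchProximityLimit entries. Assumption: the root "/" is not emitted.
std::vector<std::string> sketchProximityPaths(const std::string &Path) {
  std::vector<std::string> Result;
  if (Path.empty())
    return Result;
  Result.push_back(Path);
  std::string Current = Path;
  while (Result.size() < SketchProximityLimit) {
    std::size_t Slash = Current.rfind('/');
    if (Slash == std::string::npos || Slash == 0)
      break; // No further parent directory below the root.
    Current.erase(Slash); // Chop the last path component.
    Result.push_back(Current);
  }
  assert(Result.size() <= SketchProximityLimit);
  return Result;
}
```

The cap keeps the number of proximity entries per file constant, no matter how deeply the file is nested.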
Definition at line 420 of file Dex.cpp.
References findPathInURI(), generateProximityURIs(), and ProximityURILimit.
Referenced by generateProximityURIs().
std::vector< Token > clang::clangd::dex::generateQueryTrigrams | ( | llvm::StringRef | Query | ) |
Returns a list of unique fuzzy-search trigrams for a given query.
The query is segmented using the FuzzyMatch API and converted to lowercase. Then the simplest trigrams (sequences of three consecutive letters and digits) are extracted and returned after deduplication.
For short queries (fewer than 3 characters with Head or Tail roles in the FuzzyMatch segmentation), this returns a single trigram containing the first characters (up to 3), which is used to perform a prefix match.
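The extraction step can be sketched as a sliding window over the query. This is an approximation under stated assumptions: the hypothetical `sketchQueryTrigrams` treats every ASCII letter or digit as a Head or Tail character and skips everything else, whereas the real function takes the roles from `calculateRoles()`, and it returns plain strings rather than `Token` objects.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// Simplified query trigram generation. Assumption: letters and digits are the
// only characters that take part in matching.
std::vector<std::string> sketchQueryTrigrams(const std::string &Query) {
  // Keep only letters and digits, lowercased.
  std::string Chars;
  for (char C : Query) {
    unsigned char U = static_cast<unsigned char>(C);
    if (std::isalnum(U))
      Chars += static_cast<char>(std::tolower(U));
  }
  if (Chars.empty())
    return {};
  // Short query: a single short trigram used for prefix matching.
  if (Chars.size() < 3)
    return {Chars};

  // Slide a window of three characters over the query, keeping first
  // occurrences only.
  std::vector<std::string> Result;
  std::set<std::string> Seen;
  for (std::size_t I = 0; I + 3 <= Chars.size(); ++I) {
    std::string T = Chars.substr(I, 3);
    assert(T.size() == 3);
    if (Seen.insert(T).second)
      Result.push_back(T);
  }
  return Result;
}
```

Unlike identifier trigrams, the query side only uses consecutive characters: the jumps between segments are generated once, when the identifier is indexed, so the query does not need to enumerate them again.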
Definition at line 123 of file Trigram.cpp.
References clang::clangd::calculateRoles(), clang::clangd::Head, clang::clangd::Tail, and clang::clangd::dex::Token::Trigram.
Referenced by clang::clangd::dex::Dex::fuzzyFind().
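template<typename Func >
static void clang::clangd::dex::identifierTrigrams | ( | llvm::StringRef | Identifier, |
Func | Out | ||
) |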
Definition at line 30 of file Trigram.cpp.
References clang::clangd::calculateRoles(), clang::clangd::Head, clang::clangd::Identifier, K, Out, and clang::clangd::Tail.
Referenced by generateIdentifierTrigrams().
constexpr unsigned clang::clangd::dex::ProximityURILimit = 5 |
Definition at line 417 of file Dex.cpp.
Referenced by generateProximityURIs().