Class LexicalTokenizerName
java.lang.Object
com.azure.core.util.ExpandableStringEnum<LexicalTokenizerName>
com.azure.search.documents.indexes.models.LexicalTokenizerName
- All Implemented Interfaces:
com.azure.core.util.ExpandableEnum<String>
public final class LexicalTokenizerName
extends com.azure.core.util.ExpandableStringEnum<LexicalTokenizerName>
Defines the names of all tokenizers supported by the search engine.
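LexicalTokenizerName is an expandable string enum: well-known tokenizers are exposed as constants, while fromString(String) can represent any name, including tokenizers added to the service after this SDK version shipped. A minimal sketch of the round trip between a constant, its service-side string value, and fromString:

import com.azure.search.documents.indexes.models.LexicalTokenizerName;

public class TokenizerNameRoundTrip {
    public static void main(String[] args) {
        // Well-known tokenizers are exposed as constants.
        LexicalTokenizerName classic = LexicalTokenizerName.CLASSIC;

        // toString() yields the string value sent to the service; fromString()
        // parses it back. Equality is based on the underlying string value.
        LexicalTokenizerName parsed = LexicalTokenizerName.fromString(classic.toString());
        System.out.println(parsed.equals(classic)); // prints: true
    }
}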
Field Summary
Fields
static final LexicalTokenizerName CLASSIC
    Grammar-based tokenizer that is suitable for processing most European-language documents.
static final LexicalTokenizerName EDGE_NGRAM
    Tokenizes the input from an edge into n-grams of the given size(s).
static final LexicalTokenizerName KEYWORD
    Emits the entire input as a single token.
static final LexicalTokenizerName LETTER
    Divides text at non-letters.
static final LexicalTokenizerName LOWERCASE
    Divides text at non-letters and converts them to lower case.
static final LexicalTokenizerName MICROSOFT_LANGUAGE_STEMMING_TOKENIZER
    Divides text using language-specific rules and reduces words to their base forms.
static final LexicalTokenizerName MICROSOFT_LANGUAGE_TOKENIZER
    Divides text using language-specific rules.
static final LexicalTokenizerName NGRAM
    Tokenizes the input into n-grams of the given size(s).
static final LexicalTokenizerName PATH_HIERARCHY
    Tokenizer for path-like hierarchies.
static final LexicalTokenizerName PATTERN
    Tokenizer that uses regex pattern matching to construct distinct tokens.
static final LexicalTokenizerName STANDARD
    Standard Lucene analyzer; composed of the standard tokenizer, lowercase filter and stop filter.
static final LexicalTokenizerName UAX_URL_EMAIL
    Tokenizes URLs and emails as one token.
static final LexicalTokenizerName WHITESPACE
    Divides text at whitespace.
Constructor Summary
Constructors
LexicalTokenizerName()
    Deprecated. Use the fromString(String) factory method.
Method Summary
static LexicalTokenizerName fromString(String name)
    Creates or finds a LexicalTokenizerName from its string representation.
static Collection<LexicalTokenizerName> values()
    Gets known LexicalTokenizerName values.
Methods inherited from class com.azure.core.util.ExpandableStringEnum
equals, fromString, getValue, hashCode, toString, values
Field Details
CLASSIC
Grammar-based tokenizer that is suitable for processing most European-language documents. See http://lucene.apache.org/core/4_10_3/analyzers-common/org/apache/lucene/analysis/standard/ClassicTokenizer.html.

EDGE_NGRAM
Tokenizes the input from an edge into n-grams of the given size(s). See https://lucene.apache.org/core/4_10_3/analyzers-common/org/apache/lucene/analysis/ngram/EdgeNGramTokenizer.html.

KEYWORD
Emits the entire input as a single token. See http://lucene.apache.org/core/4_10_3/analyzers-common/org/apache/lucene/analysis/core/KeywordTokenizer.html.

LETTER
Divides text at non-letters. See http://lucene.apache.org/core/4_10_3/analyzers-common/org/apache/lucene/analysis/core/LetterTokenizer.html.

LOWERCASE
Divides text at non-letters and converts them to lower case. See http://lucene.apache.org/core/4_10_3/analyzers-common/org/apache/lucene/analysis/core/LowerCaseTokenizer.html.

MICROSOFT_LANGUAGE_TOKENIZER
Divides text using language-specific rules.

MICROSOFT_LANGUAGE_STEMMING_TOKENIZER
Divides text using language-specific rules and reduces words to their base forms.

NGRAM
Tokenizes the input into n-grams of the given size(s). See http://lucene.apache.org/core/4_10_3/analyzers-common/org/apache/lucene/analysis/ngram/NGramTokenizer.html.

PATH_HIERARCHY
Tokenizer for path-like hierarchies. See http://lucene.apache.org/core/4_10_3/analyzers-common/org/apache/lucene/analysis/path/PathHierarchyTokenizer.html.

PATTERN
Tokenizer that uses regex pattern matching to construct distinct tokens. See http://lucene.apache.org/core/4_10_3/analyzers-common/org/apache/lucene/analysis/pattern/PatternTokenizer.html.

STANDARD
Standard Lucene analyzer; composed of the standard tokenizer, lowercase filter and stop filter. See http://lucene.apache.org/core/4_10_3/analyzers-common/org/apache/lucene/analysis/standard/StandardTokenizer.html.

UAX_URL_EMAIL
Tokenizes URLs and emails as one token. See http://lucene.apache.org/core/4_10_3/analyzers-common/org/apache/lucene/analysis/standard/UAX29URLEmailTokenizer.html.

WHITESPACE
Divides text at whitespace. See http://lucene.apache.org/core/4_10_3/analyzers-common/org/apache/lucene/analysis/core/WhitespaceTokenizer.html.
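These constants are typically used as the tokenizer of a CustomAnalyzer in an index definition. A minimal sketch, assuming the two-argument CustomAnalyzer(String, LexicalTokenizerName) constructor found in recent azure-search-documents releases (verify against your SDK version); "pathAnalyzer" is a hypothetical analyzer name:

import com.azure.search.documents.indexes.models.CustomAnalyzer;
import com.azure.search.documents.indexes.models.LexicalTokenizerName;

public class CustomAnalyzerSketch {
    public static void main(String[] args) {
        // "pathAnalyzer" is a hypothetical name; PATH_HIERARCHY tokenizes
        // values such as "/usr/local/bin" into path-segment tokens.
        CustomAnalyzer analyzer =
                new CustomAnalyzer("pathAnalyzer", LexicalTokenizerName.PATH_HIERARCHY);
        System.out.println(analyzer.getTokenizer());
    }
}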
Constructor Details
LexicalTokenizerName
Deprecated. Use the fromString(String) factory method.
Creates a new instance of LexicalTokenizerName value.
Method Details
fromString
Creates or finds a LexicalTokenizerName from its string representation.
Parameters:
    name - a name to look for.
Returns:
    the corresponding LexicalTokenizerName.
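Because the type is expandable, fromString does not validate the name against the known set: an unrecognized name produces a new LexicalTokenizerName value rather than throwing, which keeps older clients compatible with tokenizers added to the service later. A short sketch ("classic" is the service-side value of the CLASSIC constant; "not_a_real_tokenizer" is a made-up name):

import com.azure.search.documents.indexes.models.LexicalTokenizerName;

public class FromStringSample {
    public static void main(String[] args) {
        // Known name: resolves to the same value as the CLASSIC constant.
        LexicalTokenizerName known = LexicalTokenizerName.fromString("classic");
        System.out.println(known.equals(LexicalTokenizerName.CLASSIC)); // true

        // Unknown name: no exception; a new expandable value is created.
        LexicalTokenizerName custom = LexicalTokenizerName.fromString("not_a_real_tokenizer");
        System.out.println(custom); // prints: not_a_real_tokenizer
    }
}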
values
Gets known LexicalTokenizerName values.
Returns:
    known LexicalTokenizerName values.
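Note that values() returns only the names known to this SDK version, not necessarily every tokenizer the service supports. A quick way to list them:

import com.azure.search.documents.indexes.models.LexicalTokenizerName;

import java.util.Collection;

public class ValuesSample {
    public static void main(String[] args) {
        Collection<LexicalTokenizerName> known = LexicalTokenizerName.values();
        // Prints each known tokenizer name as its service-side string value.
        known.forEach(System.out::println);
    }
}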