Class LexicalTokenizerName
java.lang.Object
com.azure.core.util.ExpandableStringEnum<LexicalTokenizerName>
com.azure.search.documents.indexes.models.LexicalTokenizerName
- All Implemented Interfaces:
com.azure.core.util.ExpandableEnum<String>
public final class LexicalTokenizerName
extends com.azure.core.util.ExpandableStringEnum<LexicalTokenizerName>
Defines the names of all tokenizers supported by the search engine.
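LexicalTokenizerName is an expandable string enum: well-known tokenizers are exposed as constants, while fromString(String) can represent any name, including tokenizers added to the service after this SDK version shipped. A minimal sketch of the round trip between a constant, its service-side string value, and fromString:

import com.azure.search.documents.indexes.models.LexicalTokenizerName;

public class TokenizerNameRoundTrip {
    public static void main(String[] args) {
        // Well-known tokenizers are exposed as constants.
        LexicalTokenizerName classic = LexicalTokenizerName.CLASSIC;

        // toString() yields the string value sent to the service; fromString()
        // parses it back. Equality is based on the underlying string value.
        LexicalTokenizerName parsed = LexicalTokenizerName.fromString(classic.toString());
        System.out.println(parsed.equals(classic)); // prints: true
    }
}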
Field Summary
Fields
static final LexicalTokenizerName CLASSIC
    Grammar-based tokenizer that is suitable for processing most European-language documents.
static final LexicalTokenizerName EDGE_NGRAM
    Tokenizes the input from an edge into n-grams of the given size(s).
static final LexicalTokenizerName KEYWORD
    Emits the entire input as a single token.
static final LexicalTokenizerName LETTER
    Divides text at non-letters.
static final LexicalTokenizerName LOWERCASE
    Divides text at non-letters and converts them to lower case.
static final LexicalTokenizerName MICROSOFT_LANGUAGE_STEMMING_TOKENIZER
    Divides text using language-specific rules and reduces words to their base forms.
static final LexicalTokenizerName MICROSOFT_LANGUAGE_TOKENIZER
    Divides text using language-specific rules.
static final LexicalTokenizerName NGRAM
    Tokenizes the input into n-grams of the given size(s).
static final LexicalTokenizerName PATH_HIERARCHY
    Tokenizer for path-like hierarchies.
static final LexicalTokenizerName PATTERN
    Tokenizer that uses regex pattern matching to construct distinct tokens.
static final LexicalTokenizerName STANDARD
    Standard Lucene analyzer; composed of the standard tokenizer, lowercase filter and stop filter.
static final LexicalTokenizerName UAX_URL_EMAIL
    Tokenizes URLs and emails as one token.
static final LexicalTokenizerName WHITESPACE
    Divides text at whitespace.
Constructor Summary
Constructors
LexicalTokenizerName()
    Deprecated. Use the fromString(String) factory method.
Method Summary
static LexicalTokenizerName fromString(String name)
    Creates or finds a LexicalTokenizerName from its string representation.
static Collection<LexicalTokenizerName> values()
    Gets known LexicalTokenizerName values.
Methods inherited from class com.azure.core.util.ExpandableStringEnum
equals, fromString, getValue, hashCode, toString, values
Field Details
CLASSIC
Grammar-based tokenizer that is suitable for processing most European-language documents. See http://lucene.apache.org/core/4_10_3/analyzers-common/org/apache/lucene/analysis/standard/ClassicTokenizer.html.

EDGE_NGRAM
Tokenizes the input from an edge into n-grams of the given size(s). See https://lucene.apache.org/core/4_10_3/analyzers-common/org/apache/lucene/analysis/ngram/EdgeNGramTokenizer.html.

KEYWORD
Emits the entire input as a single token. See http://lucene.apache.org/core/4_10_3/analyzers-common/org/apache/lucene/analysis/core/KeywordTokenizer.html.

LETTER
Divides text at non-letters. See http://lucene.apache.org/core/4_10_3/analyzers-common/org/apache/lucene/analysis/core/LetterTokenizer.html.

LOWERCASE
Divides text at non-letters and converts them to lower case. See http://lucene.apache.org/core/4_10_3/analyzers-common/org/apache/lucene/analysis/core/LowerCaseTokenizer.html.

MICROSOFT_LANGUAGE_TOKENIZER
Divides text using language-specific rules.

MICROSOFT_LANGUAGE_STEMMING_TOKENIZER
Divides text using language-specific rules and reduces words to their base forms.

NGRAM
Tokenizes the input into n-grams of the given size(s). See http://lucene.apache.org/core/4_10_3/analyzers-common/org/apache/lucene/analysis/ngram/NGramTokenizer.html.

PATH_HIERARCHY
Tokenizer for path-like hierarchies. See http://lucene.apache.org/core/4_10_3/analyzers-common/org/apache/lucene/analysis/path/PathHierarchyTokenizer.html.

PATTERN
Tokenizer that uses regex pattern matching to construct distinct tokens. See http://lucene.apache.org/core/4_10_3/analyzers-common/org/apache/lucene/analysis/pattern/PatternTokenizer.html.

STANDARD
Standard Lucene analyzer; composed of the standard tokenizer, lowercase filter and stop filter. See http://lucene.apache.org/core/4_10_3/analyzers-common/org/apache/lucene/analysis/standard/StandardTokenizer.html.

UAX_URL_EMAIL
Tokenizes URLs and emails as one token. See http://lucene.apache.org/core/4_10_3/analyzers-common/org/apache/lucene/analysis/standard/UAX29URLEmailTokenizer.html.

WHITESPACE
Divides text at whitespace. See http://lucene.apache.org/core/4_10_3/analyzers-common/org/apache/lucene/analysis/core/WhitespaceTokenizer.html.
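These constants are typically used as the tokenizer of a CustomAnalyzer in an index definition. A minimal sketch, assuming the two-argument CustomAnalyzer(String, LexicalTokenizerName) constructor found in recent azure-search-documents releases (verify against your SDK version); "pathAnalyzer" is a hypothetical analyzer name:

import com.azure.search.documents.indexes.models.CustomAnalyzer;
import com.azure.search.documents.indexes.models.LexicalTokenizerName;

public class CustomAnalyzerSketch {
    public static void main(String[] args) {
        // "pathAnalyzer" is a hypothetical name; PATH_HIERARCHY tokenizes
        // values such as "/usr/local/bin" into path-segment tokens.
        CustomAnalyzer analyzer =
                new CustomAnalyzer("pathAnalyzer", LexicalTokenizerName.PATH_HIERARCHY);
        System.out.println(analyzer.getTokenizer());
    }
}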
Constructor Details
LexicalTokenizerName
Deprecated. Use the fromString(String) factory method.
Creates a new instance of LexicalTokenizerName value.
Method Details
fromString
Creates or finds a LexicalTokenizerName from its string representation.
Parameters:
    name - a name to look for.
Returns:
    the corresponding LexicalTokenizerName.
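Because the type is expandable, fromString does not validate the name against the known set: an unrecognized name produces a new LexicalTokenizerName value rather than throwing, which keeps older clients compatible with tokenizers added to the service later. A short sketch ("classic" is the service-side value of the CLASSIC constant; "not_a_real_tokenizer" is a made-up name):

import com.azure.search.documents.indexes.models.LexicalTokenizerName;

public class FromStringSample {
    public static void main(String[] args) {
        // Known name: resolves to the same value as the CLASSIC constant.
        LexicalTokenizerName known = LexicalTokenizerName.fromString("classic");
        System.out.println(known.equals(LexicalTokenizerName.CLASSIC)); // true

        // Unknown name: no exception; a new expandable value is created.
        LexicalTokenizerName custom = LexicalTokenizerName.fromString("not_a_real_tokenizer");
        System.out.println(custom); // prints: not_a_real_tokenizer
    }
}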
values
Gets known LexicalTokenizerName values.
Returns:
    known LexicalTokenizerName values.
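Note that values() returns only the names known to this SDK version, not necessarily every tokenizer the service supports. A quick way to list them:

import com.azure.search.documents.indexes.models.LexicalTokenizerName;

import java.util.Collection;

public class ValuesSample {
    public static void main(String[] args) {
        Collection<LexicalTokenizerName> known = LexicalTokenizerName.values();
        // Prints each known tokenizer name as its service-side string value.
        known.forEach(System.out::println);
    }
}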