Enumeration KnownTokenFilterNames

Known values of TokenFilterName that the service accepts.

Enumeration Members

Apostrophe: "apostrophe"

Strips all characters after an apostrophe (including the apostrophe itself). See http://lucene.apache.org/core/4_10_3/analyzers-common/org/apache/lucene/analysis/tr/ApostropheFilter.html
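
As a rough illustration of the behavior described above, the following sketch truncates a single token at its first apostrophe. The function name `stripAfterApostrophe` is hypothetical, and treating both `'` and U+2019 as apostrophes is an assumption; the service-side filter may differ in detail.

```typescript
// Illustrative sketch only, not the service implementation.
// Assumes both the ASCII apostrophe and U+2019 count as apostrophes.
function stripAfterApostrophe(token: string): string {
  const index = token.search(/['\u2019]/);
  // Keep everything before the apostrophe; leave tokens without one untouched.
  return index === -1 ? token : token.slice(0, index);
}

console.log(stripAfterApostrophe("T\u00fcrkiye'de")); // "Türkiye"
```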

ArabicNormalization: "arabic_normalization"

A token filter that applies the Arabic normalizer to normalize the orthography. See http://lucene.apache.org/core/4_10_3/analyzers-common/org/apache/lucene/analysis/ar/ArabicNormalizationFilter.html

AsciiFolding: "asciifolding"

Converts alphabetic, numeric, and symbolic Unicode characters which are not in the first 128 ASCII characters (the "Basic Latin" Unicode block) into their ASCII equivalents, if such equivalents exist. See http://lucene.apache.org/core/4_10_3/analyzers-common/org/apache/lucene/analysis/miscellaneous/ASCIIFoldingFilter.html
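
The idea of folding to ASCII can be approximated as follows. This sketch only covers accented letters that decompose into a base letter plus combining marks; it is a simplification under that assumption and does not reproduce the full mapping table of the real filter (for example, it leaves "ø" and "ß" unchanged). The name `foldToAscii` is hypothetical.

```typescript
// Approximation only: decompose characters (NFD), then drop combining marks.
// The real filter uses a much larger, hand-built mapping table.
function foldToAscii(text: string): string {
  return text.normalize("NFD").replace(/[\u0300-\u036f]/g, "");
}

console.log(foldToAscii("caf\u00e9")); // "cafe"
```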

CjkBigram: "cjk_bigram"

Forms bigrams of CJK terms that are generated from the standard tokenizer. See http://lucene.apache.org/core/4_10_3/analyzers-common/org/apache/lucene/analysis/cjk/CJKBigramFilter.html
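
A minimal sketch of bigram formation over one run of CJK characters is shown below. It assumes the input is already a contiguous CJK run and that a lone character is emitted as-is; `cjkBigrams` is a hypothetical helper, not part of the SDK.

```typescript
// Illustrative sketch: overlapping two-character grams from one CJK run.
function cjkBigrams(run: string): string[] {
  const chars = Array.from(run); // split by code point, not UTF-16 unit
  if (chars.length < 2) {
    return chars; // assumption: a single character stays a unigram
  }
  const bigrams: string[] = [];
  for (let i = 0; i < chars.length - 1; i++) {
    bigrams.push(chars[i] + chars[i + 1]);
  }
  return bigrams;
}

console.log(cjkBigrams("\u6771\u4eac\u90fd")); // ["東京", "京都"]
```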

CjkWidth: "cjk_width"

Normalizes CJK width differences. Folds fullwidth ASCII variants into the equivalent basic Latin, and half-width Katakana variants into the equivalent Kana. See http://lucene.apache.org/core/4_10_3/analyzers-common/org/apache/lucene/analysis/cjk/CJKWidthFilter.html
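
The fullwidth-to-basic-Latin half of this folding can be sketched as a fixed code point offset. The sketch deliberately omits the half-width Katakana case, and `foldFullwidthAscii` is a hypothetical name.

```typescript
// Illustrative sketch: fullwidth ASCII variants (U+FF01..U+FF5E) sit exactly
// 0xFEE0 above their basic Latin counterparts (U+0021..U+007E).
function foldFullwidthAscii(text: string): string {
  return text.replace(/[\uFF01-\uFF5E]/g, (ch) =>
    String.fromCharCode(ch.charCodeAt(0) - 0xfee0)
  );
}

console.log(foldFullwidthAscii("\uFF21\uFF22\uFF23\uFF11")); // "ABC1"
```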

Classic: "classic"

CommonGram: "common_grams"

Constructs bigrams for frequently occurring terms while indexing. Single terms are still indexed too, with bigrams overlaid. See http://lucene.apache.org/core/4_10_3/analyzers-common/org/apache/lucene/analysis/commongrams/CommonGramsFilter.html
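
To make the "bigrams overlaid" idea concrete, here is a minimal sketch that keeps every single term and adds a bigram whenever either neighbor is a common word. The helper `commonGrams`, the `_` separator, and the flat output list (real token streams carry position information) are assumptions made for illustration.

```typescript
// Illustrative sketch: emit each term, plus a bigram for any adjacent pair
// in which at least one term is in the common-word set.
function commonGrams(tokens: string[], commonWords: Set<string>): string[] {
  const out: string[] = [];
  tokens.forEach((token, i) => {
    out.push(token); // single terms are always kept
    const next = tokens[i + 1];
    if (next !== undefined && (commonWords.has(token) || commonWords.has(next))) {
      out.push(`${token}_${next}`);
    }
  });
  return out;
}

console.log(commonGrams(["the", "quick", "brown"], new Set(["the"])));
// ["the", "the_quick", "quick", "brown"]
```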

EdgeNGram: "edgeNGram_v2"

Generates n-grams of the given size(s) starting from the front or the back of an input token. See http://lucene.apache.org/core/4_10_3/analyzers-common/org/apache/lucene/analysis/ngram/EdgeNGramTokenFilter.html
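
The following sketch shows what edge n-gram generation looks like for one token. The function `edgeNGrams` and its parameters are illustrative, assume `minGram` is at least 1, and are not the service's implementation.

```typescript
// Illustrative sketch: n-grams anchored at the front or back of a token.
function edgeNGrams(
  token: string,
  minGram: number,
  maxGram: number,
  side: "front" | "back" = "front"
): string[] {
  const grams: string[] = [];
  const upper = Math.min(maxGram, token.length);
  for (let size = minGram; size <= upper; size++) {
    grams.push(
      side === "front" ? token.slice(0, size) : token.slice(token.length - size)
    );
  }
  return grams;
}

console.log(edgeNGrams("search", 1, 3)); // ["s", "se", "sea"]
console.log(edgeNGrams("search", 1, 3, "back")); // ["h", "ch", "rch"]
```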

Elision: "elision"

Removes elisions. For example, "l'avion" (the plane) will be converted to "avion" (plane). See http://lucene.apache.org/core/4_10_3/analyzers-common/org/apache/lucene/analysis/util/ElisionFilter.html
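
The "l'avion" example can be reproduced with a short sketch. The article list below is an assumed, partial set of French elided forms chosen for illustration, and `removeElision` is a hypothetical helper.

```typescript
// Assumption: a small illustrative set of elided French articles.
const elidedArticles = new Set(["l", "d", "j", "m", "t", "s", "n", "c", "qu"]);

// Illustrative sketch: drop a leading elided article and its apostrophe.
function removeElision(token: string, articles: Set<string> = elidedArticles): string {
  const index = token.search(/['\u2019]/);
  if (index > 0 && articles.has(token.slice(0, index).toLowerCase())) {
    return token.slice(index + 1);
  }
  return token; // no known article before the apostrophe
}

console.log(removeElision("l'avion")); // "avion"
console.log(removeElision("aujourd'hui")); // "aujourd'hui"
```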

GermanNormalization: "german_normalization"

Normalizes German characters according to the heuristics of the German2 snowball algorithm. See http://lucene.apache.org/core/4_10_3/analyzers-common/org/apache/lucene/analysis/de/GermanNormalizationFilter.html

HindiNormalization: "hindi_normalization"

Normalizes text in Hindi to remove some differences in spelling variations. See http://lucene.apache.org/core/4_10_3/analyzers-common/org/apache/lucene/analysis/hi/HindiNormalizationFilter.html

IndicNormalization: "indic_normalization"

KStem: "kstem"

KeywordRepeat: "keyword_repeat"

Length: "length"

Limit: "limit"

Lowercase: "lowercase"

NGram: "nGram_v2"

PersianNormalization: "persian_normalization"

Phonetic: "phonetic"

PorterStem: "porter_stem"

Uses the Porter stemming algorithm to transform the token stream. See http://tartarus.org/~martin/PorterStemmer

Reverse: "reverse"

ScandinavianFoldingNormalization: "scandinavian_folding"

Folds Scandinavian characters åÅäæÄÆ->a and öÖøØ->o. It also collapses the double vowels aa, ae, ao, oe, and oo, keeping only the first vowel. See http://lucene.apache.org/core/4_10_3/analyzers-common/org/apache/lucene/analysis/miscellaneous/ScandinavianFoldingFilter.html
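
A simplified sketch of the two folding steps is shown below. It collapses the listed double vowels first and then folds the Scandinavian letters, always producing lowercase "a" and "o"; the ordering, the case handling, and the name `foldScandinavian` are assumptions, so edge cases may differ from the real filter.

```typescript
// Illustrative sketch, not the service implementation.
function foldScandinavian(token: string): string {
  return token
    // aa, ae, ao -> a and oe, oo -> o (keep only the first vowel)
    .replace(/[aA][aAeEoO]|[oO][eEoO]/g, (pair) => pair.charAt(0))
    // å Å ä Ä æ Æ -> a
    .replace(/[\u00e5\u00c5\u00e4\u00c4\u00e6\u00c6]/g, "a")
    // ö Ö ø Ø -> o
    .replace(/[\u00f6\u00d6\u00f8\u00d8]/g, "o");
}

console.log(foldScandinavian("bl\u00e5b\u00e6rsyltet\u00f8j")); // "blabarsyltetoj"
console.log(foldScandinavian("blaabaarsyltetoej")); // "blabarsyltetoj"
```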

ScandinavianNormalization: "scandinavian_normalization"

Shingle: "shingle"

Snowball: "snowball"

SoraniNormalization: "sorani_normalization"

Stemmer: "stemmer"

Stopwords: "stopwords"

Trim: "trim"

Truncate: "truncate"

Unique: "unique"

Uppercase: "uppercase"

WordDelimiter: "word_delimiter"

Splits words into subwords and performs optional transformations on subword groups.
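
As a rough picture of subword splitting, the sketch below breaks a token on non-alphanumeric characters, case changes, and letter/digit boundaries. It handles ASCII only, omits the optional transformations mentioned above, and `splitSubwords` is a hypothetical helper whose rules are assumptions rather than the filter's exact behavior.

```typescript
// Illustrative sketch: ASCII-only subword splitting.
function splitSubwords(token: string): string[] {
  const subwords: string[] = [];
  // First split on delimiters such as "-" or "_".
  for (const part of token.split(/[^A-Za-z0-9]+/)) {
    // Then split on case changes and letter/digit boundaries.
    const pieces = part.match(/[A-Z]+(?![a-z])|[A-Z]?[a-z]+|[0-9]+/g);
    if (pieces) {
      subwords.push(...pieces);
    }
  }
  return subwords;
}

console.log(splitSubwords("Wi-Fi")); // ["Wi", "Fi"]
console.log(splitSubwords("PowerShot")); // ["Power", "Shot"]
console.log(splitSubwords("SD500")); // ["SD", "500"]
```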