Class LuceneStandardTokenizer

java.lang.Object
    com.azure.search.documents.indexes.models.LexicalTokenizer
        com.azure.search.documents.indexes.models.LuceneStandardTokenizer
All Implemented Interfaces:
com.azure.json.JsonSerializable<LexicalTokenizer>

public final class LuceneStandardTokenizer extends LexicalTokenizer
Breaks text following the Unicode Text Segmentation rules. This tokenizer is implemented using Apache Lucene.
  • Constructor Details

    • LuceneStandardTokenizer

      public LuceneStandardTokenizer(String name)
      Creates an instance of LuceneStandardTokenizer.
      Parameters:
      name - The name of the tokenizer. It must contain only letters, digits, spaces, dashes, or underscores; it can start and end only with alphanumeric characters; and it is limited to 128 characters.
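      A minimal sketch of constructing the tokenizer (the name "my-standard-tokenizer" is an illustrative value chosen here, not one from this reference):

      ```java
      import com.azure.search.documents.indexes.models.LuceneStandardTokenizer;

      public class CreateTokenizer {
          public static void main(String[] args) {
              // "my-standard-tokenizer" is a hypothetical name for this sketch;
              // it satisfies the rules above: only letters, digits, spaces,
              // dashes, or underscores, alphanumeric at both ends, <= 128 chars.
              LuceneStandardTokenizer tokenizer =
                      new LuceneStandardTokenizer("my-standard-tokenizer");

              // getName() is inherited from LexicalTokenizer.
              System.out.println(tokenizer.getName());
          }
      }
      ```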
  • Method Details

    • getMaxTokenLength

      public Integer getMaxTokenLength()
      Get the maxTokenLength property: The maximum token length. Default is 255. Tokens longer than the maximum length are split.
      Returns:
      the maxTokenLength value.
    • setMaxTokenLength

      public LuceneStandardTokenizer setMaxTokenLength(Integer maxTokenLength)
      Set the maxTokenLength property: The maximum token length. Default is 255. Tokens longer than the maximum length are split.
      Parameters:
      maxTokenLength - the maxTokenLength value to set.
      Returns:
      the LuceneStandardTokenizer object itself.
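      Because setMaxTokenLength returns the LuceneStandardTokenizer itself, the call chains fluently off the constructor. A short sketch (the tokenizer name and the length of 300 are illustrative values, not defaults from this reference):

      ```java
      import com.azure.search.documents.indexes.models.LuceneStandardTokenizer;

      public class ConfigureTokenizer {
          public static void main(String[] args) {
              // The name and the 300-character limit are hypothetical values
              // for this sketch; setMaxTokenLength returns the tokenizer, so
              // the setter chains directly off the constructor.
              LuceneStandardTokenizer tokenizer =
                      new LuceneStandardTokenizer("long-token-tokenizer")
                              .setMaxTokenLength(300);

              // getMaxTokenLength() returns the Integer that was set.
              System.out.println(tokenizer.getMaxTokenLength());
          }
      }
      ```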
    • toJson

      public com.azure.json.JsonWriter toJson(com.azure.json.JsonWriter jsonWriter) throws IOException
      Description copied from class: LexicalTokenizer
      Writes the object's JSON representation to the provided JsonWriter.
      Specified by:
      toJson in interface com.azure.json.JsonSerializable<LexicalTokenizer>
      Overrides:
      toJson in class LexicalTokenizer
      Throws:
      IOException
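      Since toJson targets a com.azure.json.JsonWriter, one way to obtain a writer is JsonProviders.createWriter from the com.azure.json package. A hedged sketch, assuming a writer backed by a StringWriter (the tokenizer name is an illustrative value):

      ```java
      import com.azure.json.JsonProviders;
      import com.azure.json.JsonWriter;
      import com.azure.search.documents.indexes.models.LuceneStandardTokenizer;

      import java.io.IOException;
      import java.io.StringWriter;

      public class SerializeTokenizer {
          public static void main(String[] args) throws IOException {
              // Hypothetical tokenizer for this sketch.
              LuceneStandardTokenizer tokenizer =
                      new LuceneStandardTokenizer("json-example-tokenizer")
                              .setMaxTokenLength(255);

              // try-with-resources flushes and closes the JsonWriter,
              // ensuring the buffered JSON reaches the StringWriter.
              StringWriter out = new StringWriter();
              try (JsonWriter writer = JsonProviders.createWriter(out)) {
                  tokenizer.toJson(writer);
              }

              // Prints the tokenizer's JSON representation.
              System.out.println(out);
          }
      }
      ```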