Custom Filters
Custom filters can be viewed and modified from the index’s configuration page under the Index Settings section. Any custom filters that are configured for the current index can be viewed by expanding the Custom Filters panel. If no custom filters have been configured for the index, the Custom Filters panel will be empty.
Add Custom Filter
To add a custom filter to a Full Text Index via the Couchbase Capella UI, the following permissions are required:
- You must have Project View privileges for the project that contains the cluster.
- You must have a database user associated with your organization’s user account. The database user must have Read/Write permissions for the bucket on which the index was created.
The 'Custom Filters' panel shows no existing custom filters.
The following four options are provided:
Character Filter
Adds a new character filter to the list of those available. The new filter becomes available for inclusion in custom-created analyzers.
Left-click + Add Character Filter to display the Custom Character Filter dialog.
The following interactive fields are provided:
- Name: A suitable, user-defined name for the new character filter.
- Type: The type of filtering to be performed. Available options can be accessed from the pull-down menu to the right of the field. (Currently, only regexp is available.)
- Regular Expression: The regular expression that the new character filter applies. Character strings that match the expression are affected; others are not.
- Replacement: The replacement text substituted for each character string matched by the regular expression. If no replacement text is specified, the matched character string is omitted.
For example, the fields can be completed to define a character filter that deletes leading whitespace.
When saved, the new character filter is displayed on its own row, with options for further editing and deleting.
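The effect of a regexp character filter can be previewed outside the UI. The following is a minimal sketch, not the Capella implementation: it assumes the filter behaves like a global search-and-replace, and the function name apply_char_filter and the pattern ^\s+ are illustrative choices rather than values taken from the dialog. Python's regular expression syntax also differs in places from the syntax the Search Service accepts, so more complex patterns should be checked in the UI.

```python
import re


def apply_char_filter(text: str, pattern: str, replacement: str = "") -> str:
    """Replace every match of `pattern` in `text` with `replacement`.

    An empty replacement removes the matched characters, mirroring the
    behaviour described for a blank Replacement field.
    """
    return re.sub(pattern, replacement, text)


# Hypothetical field values for a leading-whitespace filter (an assumption,
# not copied from the Custom Character Filter dialog).
LEADING_WHITESPACE = r"^\s+"

print(apply_char_filter("   full text search", LEADING_WHITESPACE))  # full text search
```

Passing a non-empty replacement, for example apply_char_filter("a-b", "-", " "), substitutes the text instead of deleting it.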
Tokenizer
Adds a new tokenizer to the list of those available.
The new tokenizer becomes available for inclusion in custom-created analyzers.
Left-click + Add Tokenizer to display the Custom Tokenizer dialog.
The following interactive fields are provided:
- Name: A suitable, user-defined name for the new tokenizer.
- Type: The process used in tokenizing. Available options can be accessed from the pull-down menu to the right of the field. (Currently, regexp and exception are available.)
- Regular Expression: The regular expression used by the tokenizing process.
For example, the fields can be completed to define a tokenizer that removes uppercase characters.
When saved, the new tokenizer is displayed on its own row, with options for further editing and deleting.
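To see how a regular expression can drive tokenizing, the sketch below assumes that a regexp tokenizer emits each match of the expression as one token, so characters that the expression never matches are dropped. That matching rule, the function name regexp_tokenize, and the pattern [a-z]+ are assumptions made for illustration; confirm the actual behaviour against a test index.

```python
import re


def regexp_tokenize(text: str, pattern: str) -> list[str]:
    """Return each non-empty match of `pattern` in `text` as a token."""
    return [m.group(0) for m in re.finditer(pattern, text) if m.group(0)]


# Hypothetical pattern: runs of lowercase letters only, so uppercase
# characters never appear in the resulting tokens.
LOWERCASE_RUNS = r"[a-z]+"

print(regexp_tokenize("Hello World abc", LOWERCASE_RUNS))  # ['ello', 'orld', 'abc']
```

Under this assumption, "removing uppercase characters" is a side effect of what the expression matches, not a separate deletion step.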
Token Filter
Adds a new token filter to the list of those available. The new token filter becomes available for inclusion in custom-created analyzers.
Left-click + Add Token Filter to display the Custom Token Filter dialog.
The following interactive fields are provided:
- Name: A suitable, user-defined name for the new token filter.
- Type: The type of post-processing to be provided by the new token filter. The default is length, which keeps only tokens whose length is at least the integer provided in the Min field and at most the integer provided in the Max field. Additional post-processing types can be selected from the pull-down menu to the right of the field. The type selection determines which interactive fields appear in the Custom Token Filter dialog after Name and Type. For descriptions of the available types, see the section Token Filters, on the page Understanding Analyzers.
- Min: The minimum length of the token, in characters. This field is displayed for the length type, and may be absent or replaced when other types are specified. The default value is 3.
- Max: The maximum length of the token, in characters. This field is displayed for the length type, and may be absent or replaced when other types are specified. The default value is 255.
For example, the fields can be completed to define a token filter that restricts token length to a minimum of 3 and a maximum of 255 characters.
When saved, the new token filter is displayed on its own row, with options for further editing and deleting.
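The length type can be pictured as a simple range check on each token. The sketch below is illustrative only: the function name length_filter is hypothetical, and it assumes both bounds are inclusive and that length is counted in characters, as the field descriptions suggest.

```python
def length_filter(tokens: list[str], min_len: int = 3, max_len: int = 255) -> list[str]:
    """Keep tokens whose character count lies within [min_len, max_len].

    The defaults mirror the Min and Max defaults described for the dialog;
    treating both bounds as inclusive is an assumption.
    """
    return [t for t in tokens if min_len <= len(t) <= max_len]


print(length_filter(["a", "to", "the", "search"]))  # ['the', 'search']
```

With the default values, one- and two-character tokens are discarded before they reach the index.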
Wordlist
Adds a list of words to be removed from the current search.
Left-click + Add Word List to display the Custom Word List dialog.
To create a custom word list, first type a suitable name into the Name field. Then add words by typing each one individually into the field that bears the placeholder text, word to be added.
After typing each word, left-click the + Add button at the lower right. The word is added to the central Words panel.
Continue adding as many words as are required.
To remove a word, select the word within the Words panel and left-click on the Remove button.
To save, left-click on Save. The new word list is displayed on its own row, with options for further editing and deleting.
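The way a word list removes words can be sketched as a membership test applied to each token. This is a simplified illustration, not the Search Service code: the function name remove_listed_words and the sample words are hypothetical, and the comparison is assumed to be exact and case-sensitive.

```python
def remove_listed_words(tokens: list[str], word_list: list[str]) -> list[str]:
    """Drop every token that appears in `word_list` (exact match assumed)."""
    blocked = set(word_list)  # set lookup keeps the check fast for long lists
    return [t for t in tokens if t not in blocked]


# Hypothetical word list, as it might be entered in the Words panel.
custom_words = ["the", "and", "of"]

print(remove_listed_words(["the", "history", "of", "search"], custom_words))
# ['history', 'search']
```

Because the match is assumed to be exact, a token such as "The" would pass through unless the text is lowercased earlier in the analyzer.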