Pattern replace token filter

Replaces token substrings using regex matches.

The pattern_replace filter uses the syntax of Rust’s regex crate. By default, the filter replaces every substring that matches the regex with an empty string ("").

This filter provides raw regex access. Be wary of degenerate regular expressions: they may run slowly or crash the service.

Notable differences from Java regex syntax

  • Capture groups must be referenced using ${1} through ${9} instead of $1 through $9. The upside to this is that named capture groups are supported.
  • Unicode character classes are always enabled.
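
As a sketch of the capture group syntax described above, the filter definition below uses a named group in place of a numbered one. The filter name add_watch_to_dog_named and the group name animal are hypothetical, chosen only for illustration, and the (?P<name>...) form is the regex crate's named-group syntax. Assuming named references follow the same ${...} form as numbered ones, this should behave like a filter with the pattern "(dog)" and the replacement "watch${1}".

```json
"filter": {
  "add_watch_to_dog_named": {
    "type": "pattern_replace",
    "pattern": "(?P<animal>dog)",
    "replacement": "watch${animal}"
  }
}
```

A named reference can make a replacement easier to read when a pattern contains several groups.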

Add to an analyzer

pattern_replace can be added to any analyzer as a filter.

JSON
"analysis": {
  "analyzer": {
    "watchdogs": {
      "type": "custom",
      "tokenizer": "standard",
      "filter": ["lowercase", "add_watch_to_dog"]
    }
  },
  "filter": {
    "add_watch_to_dog": {
      "type": "pattern_replace",
      "pattern": "(dog)",
      "replacement": "watch${1}"
    }
  }
}

Given the text "foxes jump lazy dogs", this filter would produce the following tokens:

[ foxes, jump, lazy, watchdogs ]

Configurable parameters

  • all (Optional, Boolean) If true, all substrings matching the pattern parameter’s regex are replaced. If false, only the first matching substring is replaced. Defaults to true.
  • pattern (string) Regular expression, written in Rust’s regex crate syntax. Token substrings matching this pattern are replaced with the substring in the replacement parameter.
  • replacement (Optional, string) Replacement substring. Defaults to an empty substring ("").
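
To illustrate how the all parameter changes the result, here is a minimal sketch of a filter that sets it to false. The filter name first_o_to_zero is hypothetical. Based on the parameter descriptions above, the token "voodoo" would be expected to become "v0odoo" with this configuration, and "v00d00" if all were left at its default of true.

```json
"filter": {
  "first_o_to_zero": {
    "type": "pattern_replace",
    "pattern": "o",
    "replacement": "0",
    "all": false
  }
}
```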
