Pattern replace token filter
Replaces token substrings using regex matches.
The pattern_replace filter uses Rust’s regex crate syntax. By default, the filter replaces substrings that match the regex with an empty substring ("").
This filter provides raw regex access. Be wary of degenerate regular expressions; they may run slowly or crash the service.
Notable differences from Java regex syntax
- Capture groups must be referenced using ${1} through ${9} instead of $1 through $9. The upside to this is that named capture groups are supported.
- Unicode character classes are always enabled.
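As a minimal sketch of the named-group form, the filter definition below captures the match under a name and refers to it by that name in the replacement. The filter name add_watch_to_animal and the group name animal are hypothetical, chosen only for illustration; the (?P<name>...) group syntax and the ${name} reference are taken from Rust’s regex crate.

```json
"filter": {
  "add_watch_to_animal": {
    "type": "pattern_replace",
    "pattern": "(?P<animal>dog)",
    "replacement": "watch${animal}"
  }
}
```

Under these assumptions, the definition should behave the same as one that uses a numbered group, "(dog)" with "watch${1}": a token dogs would become watchdogs.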
Add to an analyzer
pattern_replace can be added to any analyzer as a filter.
    "analysis": {
      "analyzer": {
        "watchdogs": {
          "type": "custom",
          "tokenizer": "standard",
          "filter": ["lowercase", "add_watch_to_dog"]
        }
      },
      "filter": {
        "add_watch_to_dog": {
          "type": "pattern_replace",
          "pattern": "(dog)",
          "replacement": "watch${1}"
        }
      }
    }
Given the text "foxes jump lazy dogs", this filter would produce the following tokens:
[ foxes, jump, lazy, watchdogs ]
Configurable parameters
all
(Optional, Boolean) If true, all substrings matching the pattern parameter’s regex are replaced. If false, only the first matching substring is replaced. Defaults to true.

pattern
(string) Regular expression, written in Rust’s regex crate syntax. Token substrings matching this pattern are replaced with the substring in the replacement parameter.

replacement
(Optional, string) Replacement substring. Defaults to an empty substring ("").
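
To illustrate how the three parameters combine, here is a sketch of a filter that replaces only the first match in each token. The filter name first_dash_to_underscore is hypothetical, and the pattern and replacement values are chosen only for illustration.

```json
"filter": {
  "first_dash_to_underscore": {
    "type": "pattern_replace",
    "pattern": "-",
    "replacement": "_",
    "all": false
  }
}
```

Going by the parameter descriptions, a token a-b-c reaching this filter would become a_b-c. With all left at its default of true, every match would be replaced and the token would become a_b_c.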