Pattern replace token filter
Replaces token substrings using regex matches.
The pattern_replace filter uses Rust’s regex crate syntax. By default, the filter replaces each substring that matches the regex with an empty substring (""), which removes it from the token.
This filter provides raw regex access. Be wary of degenerate regular expressions: they may run slowly or crash the service.
Notable differences from Java regex syntax
- Capture groups must be referenced using ${1} through ${9} instead of $1 through $9. The upside to this is that named capture groups are supported.
- Unicode character classes are always enabled.
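As an illustration of the named capture groups mentioned above, the fragment below sketches a filter that reorders a date-like token. The filter name swap_year_month is hypothetical, and the sketch assumes the filter accepts the regex crate's (?P<name>...) group syntax and ${name} replacement references unchanged.

```json
"filter": {
  "swap_year_month": {
    "type": "pattern_replace",
    "pattern": "(?P<year>\\d{4})-(?P<month>\\d{2})",
    "replacement": "${month}/${year}"
  }
}
```

Under those assumptions, a token such as 2024-05 would become 05/2024. The backslashes are doubled because the pattern is written inside a JSON string.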
Add to an analyzer
pattern_replace can be added to any analyzer as a filter.
"analysis": {
  "analyzer": {
    "watchdogs": {
      "type": "custom",
      "tokenizer": "standard",
      "filter": ["lowercase", "add_watch_to_dog"]
    }
  },
  "filter": {
    "add_watch_to_dog": {
      "type": "pattern_replace",
      "pattern": "(dog)",
      "replacement": "watch${1}"
    }
  }
}
Given the text "foxes jump lazy dogs", an analyzer using this filter would produce the following tokens:
[ foxes, jump, lazy, watchdogs ]
Configurable parameters
- all (Optional, Boolean) If true, all substrings matching the pattern parameter’s regex are replaced. If false, only the first matching substring is replaced. Defaults to true.
- pattern (string) Regular expression, written in Rust’s regex crate syntax. Token substrings matching this pattern are replaced with the substring in the replacement parameter.
- replacement (Optional, string) Replacement substring. Defaults to an empty substring ("").
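To show how the all and replacement parameters interact, here is a minimal sketch of a filter that sets all to false and omits replacement. The filter name strip_first_hyphen is hypothetical and used only for illustration.

```json
"filter": {
  "strip_first_hyphen": {
    "type": "pattern_replace",
    "pattern": "-",
    "all": false
  }
}
```

Going by the parameter descriptions above, a token a-b-c would become ab-c, because only the first match is replaced and the replacement defaults to an empty substring. With all left at its default of true, the same token would become abc.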