Limiting Regex Alternation
The pipe operator (|, or the 'alternation' operator) is the 'or' of Regexes.
Here's a JavaScript example:
> 'hello'.match(/^hello|goodbye$/)
[ 'hello', index: 0, input: 'hello' ]
> 'goodbye'.match(/^hello|goodbye$/)
[ 'goodbye', index: 0, input: 'goodbye' ]
You might think this Regex can be true in only two situations: the string is exactly 'hello' or exactly 'goodbye', because the Regex includes the start-of-line and end-of-line anchors (^ and $). I have bad news:
> 'hello there'.match(/^hello|goodbye$/)
[ 'hello', index: 0, input: 'hello there' ]
What's going on here? This is truthy because the | has a low precedence. It's evaluated last, after everything else in the pattern, including anchors like ^ and $, groups like (), and quantifiers like ?. To put this Regex into words: "does this string match everything before the pipe (including the start-of-line anchor), or everything after the pipe (including the end-of-line anchor)?" The expression matches on ^hello and ignores anything after that.
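To make the precedence visible, here's a small sketch that compares the original Regex with a version where the implicit grouping is written out by hand. The helper name `matches`, the `spelledOut` pattern, and the sample inputs are my own, chosen for illustration. If the two patterns really mean the same thing, they should agree on every input.

```javascript
// Hypothetical helper: true when the pattern matches anywhere in the input.
const matches = (pattern, input) => pattern.test(input);

const original = /^hello|goodbye$/;
// The same pattern with the grouping the engine applies written out explicitly.
const spelledOut = /(^hello)|(goodbye$)/;

const inputs = ['hello', 'hello there', 'say goodbye', 'goodbye now'];

// Run both patterns against every input and collect the results side by side.
const results = inputs.map((input) => ({
  input,
  original: matches(original, input),
  spelledOut: matches(spelledOut, input),
}));

console.log(results);
```

Both patterns accept 'hello there' (it starts with hello) and 'say goodbye' (it ends with goodbye), and both reject 'goodbye now', which does neither.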
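We can contain the pipe by limiting how far it reaches. Parentheses work because grouping binds more tightly than alternation: inside the group, the | only chooses between hello and goodbye, and the ^ and $ outside the group apply to whichever alternative matched. Here's our new Regex: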
> 'hello there'.match(/^(hello|goodbye)$/)
null
> 'say goodbye'.match(/^(hello|goodbye)$/)
null
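The exact strings still match with the grouped version. As a quick sketch (the `isGreeting` helper is a name I made up for this example), wrapping the Regex in a function makes it easy to check all four cases at once:

```javascript
// Hypothetical helper: true only when the whole string is 'hello' or 'goodbye'.
const isGreeting = (input) => /^(hello|goodbye)$/.test(input);

// Pair each sample input with the result of the grouped Regex.
const checks = ['hello', 'goodbye', 'hello there', 'say goodbye'].map(
  (input) => [input, isGreeting(input)]
);

console.log(checks);
```

Only 'hello' and 'goodbye' come back true; the two longer strings are rejected because the anchors now wrap the whole group.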
Bottom line: when using the pipe the way I've described, use parentheses.