Today I Learned

A Hashrocket project

Limiting Regex Alternation

The pipe operator (|, or the ‘alternation’ operator) is the ‘or’ of Regexes. Here’s a JavaScript example:

> 'hello'.match(/^hello|goodbye$/)
[ 'hello', index: 0, input: 'hello' ]
> 'goodbye'.match(/^hello|goodbye$/)
[ 'goodbye', index: 0, input: 'goodbye' ]

You might think this Regex can be true in two situations: the string is 'hello' or 'goodbye', because the Regex includes the start-of-line and end-of-line characters (^ and $). I have bad news:

> 'hello there'.match(/^hello|goodbye$/)
[ 'hello', index: 0, input: 'hello there' ]

What’s going on here? This is truthy because the | has a low precedence. It’s evaluated last, after other characters like () and ?. To put this Regex into words (I think): “does this string match anything before the pipe (including the start of line character), or anything after the pipe (including the end of line character)?”. The expression matches on ^hello and ignores anything after that.

We can contain it by telling the pipe to stop evaluating. Parentheses work because they have a higher order of precedence. Here’s our new Regex:

> 'hello there'.match(/^(hello|goodbye)$/)
> 'say goodbye'.match(/^(hello|goodbye)$/)

Bottom line: when using the pipe the way I’ve described, use parentheses.

Looking for help? At Hashrocket, our JavaScript experts launch scalable, performant apps on the Web, Android and iOS. Contact us and find out how we can help you.