Today I Learned

hashrocket A Hashrocket project

Limiting Regex Alternation

The pipe operator (|, or the 'alternation' operator) is the 'or' of Regexes. Here's a JavaScript example:

> 'hello'.match(/^hello|goodbye$/)
[ 'hello', index: 0, input: 'hello' ]
> 'goodbye'.match(/^hello|goodbye$/)
[ 'goodbye', index: 0, input: 'goodbye' ]

You might think this Regex can be true in two situations: the string is 'hello' or 'goodbye', because the Regex includes the start-of-line and end-of-line characters (^ and $). I have bad news:

> 'hello there'.match(/^hello|goodbye$/)
[ 'hello', index: 0, input: 'hello there' ]

What's going on here? This is truthy because the | has a low precedence. It's evaluated last, after other characters like () and ?. To put this Regex into words (I think): "does this string match anything before the pipe (including the start of line character), or anything after the pipe (including the end of line character)?". The expression matches on ^hello and ignores anything after that.

We can contain it by telling the pipe to stop evaluating. Parentheses work because they have a higher order of precedence. Here's our new Regex:

> 'hello there'.match(/^(hello|goodbye)$/)
null
> 'say goodbye'.match(/^(hello|goodbye)$/)
null

Bottom line: when using the pipe the way I've described, use parentheses.

See More #javascript TILs
Looking for help? At Hashrocket, our JavaScript experts launch scalable, performant apps on the Web, Android and iOS. Contact us and find out how we can help you.