Fuzzy translation key merging with Gettext
Gettext is a great tool for i18n in Elixir. It provides a mix task for extracting translation keys from your code. The translation keys (or message ids) are natural language and look like this:
gettext("Hi, Welcome to Tilex!")
After running mix gettext.extract && mix gettext.merge priv/gettext, an already translated Italian locale file would look like:
msgid "Hi, Welcome to Tilex!"
msgstr "Italian version of Welcome!"
There's a chance that the natural language key (which also serves as the default string) will change.
If it changes only slightly, then after the next extract and merge the Italian locale file will look like:
#, fuzzy
msgid "Hi, Welcome to Tilex!!"
msgstr "Italian version of Welcome!"
The entry gets marked as #, fuzzy, and the new msgid replaces the old msgid while the existing msgstr is kept.
Gettext uses String.jaro_distance to decide how big a change can be and still count as a fuzzy match.
iex> String.jaro_distance("something", "nothing")
0.8412698412698413
iex> String.jaro_distance("peanuts", "bandersnatch")
0.576984126984127
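To see how this applies to the msgid change from earlier, here is a minimal sketch that picks the existing msgid closest to a new one. FuzzySketch and closest/2 are hypothetical names for illustration, not part of Gettext, and Gettext's own merge logic may differ in its details.

```elixir
defmodule FuzzySketch do
  # Hypothetical helper, not part of Gettext: returns the existing msgid
  # with the highest Jaro distance to the new msgid.
  def closest(existing_msgids, new_msgid) do
    Enum.max_by(existing_msgids, &String.jaro_distance(&1, new_msgid))
  end
end

existing = ["Hi, Welcome to Tilex!", "peanuts"]

# The extra "!" barely changes the string, so the old msgid wins easily.
IO.inspect(FuzzySketch.closest(existing, "Hi, Welcome to Tilex!!"))
# => "Hi, Welcome to Tilex!"
```

Only one character was appended, so the distance between the old and new msgid is very close to 1.0.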
The higher the number, the closer the match. fuzzy_threshold is the configuration option that determines whether a msgid is fuzzy or not, and its default is 0.8.
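If the default is too lenient for your project, the threshold can be raised. The following is a sketch of how that might look in mix.exs, assuming the Gettext mix tasks read their options from the :gettext key of the project configuration as the Gettext documentation describes. MyApp and the 0.9 value are placeholders.

```elixir
# mix.exs (MyApp and :my_app are placeholder names)
defmodule MyApp.MixProject do
  use Mix.Project

  def project do
    [
      app: :my_app,
      version: "0.1.0",
      # Assumed option: require a closer match than the 0.8 default
      # before an old translation is carried over and marked fuzzy.
      gettext: [fuzzy_threshold: 0.9]
    ]
  end
end
```

The merge task is also documented to accept a --fuzzy-threshold flag and a --no-fuzzy flag, so check mix help gettext.merge for the options your Gettext version supports.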