Flippin' machines trying to be smart at me! I expected
egrep '^[a-z]{4,5}$' "$WORDS" > wordlist.txt
to match 4- or 5-character words that contain only the characters 'a' (U+0061 LATIN SMALL LETTER A) to 'z' (U+007A LATIN SMALL LETTER Z). But nooo, someone is being clever with their locales, and I get words like élan, which, if you missed it, contains 'é' (U+00E9 LATIN SMALL LETTER E WITH ACUTE) (assuming it hasn't been decomposed).
Fix:
LC_ALL=C egrep '^[a-z]{4,5}$' "$WORDS" > wordlist.txt
(Roughly translated, this means 'Ignore anything that happened after about 1963 and just give me ASCII'.)
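For anyone who wants to poke at this without a real dictionary file, here is a minimal sketch. The file `sample_words` and its contents are made up for illustration (they stand in for $WORDS), and it uses `grep -E`, which takes the same patterns as `egrep`.

```shell
# Hypothetical sample list standing in for $WORDS (not a real dictionary).
sample_words=$(mktemp)
printf '%s\n' 'élan' 'lamp' 'words' 'cat' 'Lamp' 'puzzle' > "$sample_words"

# In the C locale, [a-z] covers only the 26 ASCII lowercase letters,
# so the bytes that make up 'é' cannot match.
result=$(LC_ALL=C grep -E '^[a-z]{4,5}$' "$sample_words")
printf '%s\n' "$result"

# Clean up the temporary file.
rm -f "$sample_words"
```

This should print only lamp and words: élan is rejected for the 'é', cat and puzzle for their length, and Lamp for the capital. Dropping the LC_ALL=C may or may not let élan through, depending on whatever locale your machine has decided you deserve.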
(I should probably take the word list with the funky characters, since I'm using it to generate passwords. On the other hand, I think I should probably stick to ASCII, since I'm using it for passwords...)