The latest update of the documentation of one of my CPAN modules, String::CaseProfile, includes sample text in alphabets other than the Roman alphabet (namely, Armenian, Greek, and Cyrillic).
So I had to convert the file's encoding to UTF-8 and add the use utf8 pragma. While checking the documentation of several CPAN modules that contain Unicode characters outside Latin-1, though, I noticed that many of them did not display these characters correctly. I wanted to make sure that everything was OK before the release, so I asked Joaquín Ferrero, a walking encyclopedia of Perl and administrator of the Spanish Perl Forum (a sort of PerlMonks for the Spanish-speaking community), and he spoke with the wisdom that can only come from experience: adding the use utf8 pragma is not enough. You also have to add the following line at the beginning of the Pod text:
=encoding utf8
The use utf8 pragma only tells perl how to read the source code; it says nothing to Pod processors. With the =encoding line in place, CPAN's Pod processor knows which encoding the module uses and converts these Unicode characters into the corresponding HTML entities, which display correctly under the ISO-8859-1 encoding that CPAN uses by default for documentation in HTML format.
When you think about it, it's quite logical, but the fact is that there are many CPAN modules that don't get this right.
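To make the two pieces concrete, here is a minimal sketch of how they might sit together in one file. The module name My::Module and the sample_text helper are made up for illustration (they are not taken from String::CaseProfile), and the file is assumed to be saved as UTF-8:

```perl
package My::Module;    # hypothetical name, for illustration only

use strict;
use warnings;
use utf8;              # the source code below contains UTF-8 literals

=encoding utf8

=head1 NAME

My::Module - hypothetical module documented with non-Latin sample text

=head1 DESCRIPTION

Sample text in Greek: Ελληνικά. In Cyrillic: Русский.

=cut

# Hypothetical helper that returns a Greek sample string.
sub sample_text {
    return 'Ελληνικά';
}

1;
```

The pragma covers the string literal in the code, while the =encoding line covers the sample text in the Pod. Each one only does its own half of the job.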
