Displaying non-Latin1 Unicode Characters in CPAN Documentation


The latest update of the documentation of one of my CPAN modules, String::CaseProfile, includes sample text in alphabets other than the Roman alphabet (namely, Armenian, Greek, and Cyrillic).

So I had to convert the file's encoding to UTF-8 and add the use utf8 pragma. While checking the documentation of several CPAN modules that contain non-Latin1 Unicode characters, however, I noticed that many of them did not display these characters correctly. I wanted to make sure that everything was OK before the release, so I asked Joaquín Ferrero, a walking encyclopedia of Perl and administrator of the Spanish Perl Forum (a sort of PerlMonks for the Spanish-speaking community), and he spoke with the wisdom that can only come from experience: adding the use utf8 pragma is not enough, because that pragma only tells perl how to read the source code. You also have to add the following line at the beginning of the Pod text:

=encoding utf8

This way, CPAN's Pod processor knows which encoding the module uses and converts these Unicode characters into the corresponding HTML entities, which display correctly under the ISO-8859-1 encoding used by default for CPAN documentation in HTML format.
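To make the entity conversion concrete, here is a rough sketch of the idea in Perl. It is not the code CPAN's Pod processor actually runs; the to_entities function is a hypothetical helper written only for illustration, and it assumes the text has already been decoded into Perl characters.

```perl
use strict;
use warnings;
use utf8;    # this script contains UTF-8 string literals

# Hypothetical helper: replace every character outside the Latin-1
# range (code points above 0xFF) with a numeric HTML entity.
sub to_entities {
    my ($text) = @_;
    $text =~ s/([^\x{00}-\x{FF}])/sprintf('&#%d;', ord $1)/ge;
    return $text;
}

# After the conversion, the output is safe to emit as ISO-8859-1.
binmode STDOUT, ':encoding(ISO-8859-1)';
print to_entities("Greek: αβγ"), "\n";
```

Run as written, this should print Greek: &#945;&#946;&#947; which a browser renders as the original Greek letters even when the page is served as ISO-8859-1.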

When you think about it, it's quite logical, but the fact is that there are many CPAN modules that don't get this right.
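For reference, a minimal module layout that follows this advice might look like the sketch below. The module name and the sample text are made up for illustration; the two relevant lines are use utf8 in the code and =encoding utf8 at the start of the Pod, before any non-ASCII text appears.

```perl
package My::Module;    # hypothetical module name

use strict;
use warnings;
use utf8;              # tells perl the source code is UTF-8

our $VERSION = '0.01';

1;

__END__

=encoding utf8

=head1 NAME

My::Module - sample documentation with non-Latin1 text

=head1 DESCRIPTION

Armenian: աբգ

Greek: αβγ

Cyrillic: абв

=cut
```

The file itself must also be saved as UTF-8, since the =encoding line only declares the encoding and does not convert anything.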

 


This page contains a single entry by Enrique Nell published on February 18, 2010 8:00 PM.
