January 2009 Archives

UTF-16 Processing Issue in Perl

Here is a tip on avoiding problems when editing UTF-16 or UTF-32 files with Perl.
Some time ago I wanted to read a UTF-16LE-encoded XML file, modify some strings, and then write out the
updated version. I followed the usual guidelines and used PerlIO layers to convert between the file encoding and Perl's internal format:

open my $in, "<:encoding(UTF-16LE)", $in_path;
# some code goes here
open my $out, ">:encoding(UTF-16LE)", $out_path;

but I couldn't get it to work this way: the non-ASCII characters came out wrong. After some research on the Internet, I found a solution in this thread of the perl-unicode list. Dan Kogai mentions that BOMed UTF files are not suitable for streaming models and must be slurped as binary files instead:

use Encode qw(decode encode);

# read the whole file as raw bytes
open(my $in, '<:raw', $in_path) or die "Couldn't open file: $!";
my $text = do { local $/; <$in> };
close $in;

# now decode to use character semantics (with an explicit 'UTF-16LE',
# a leading BOM is kept as the character U+FEFF at the start of the text)
my $content = decode('UTF-16LE', $text);
my @lines = split /\n/, $content;

# some code that turns @lines into @processed goes here

# write file
my $output_str = join "\n", @processed;

open(my $out, '>:raw', $out_path) or die "Couldn't open file: $!";
print $out encode("UTF-16LE", $output_str);
close $out;

and that did the trick.
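
For reference, here is a minimal sketch of the same slurp-and-decode idea wrapped in two small helpers. The names read_utf16 and write_utf16le are hypothetical, not part of any module. The sketch relies on the behavior documented for Encode: the plain 'UTF-16' encoding requires a BOM, uses it to pick the byte order and drops it from the decoded string, while 'UTF-16LE' treats a BOM as an ordinary U+FEFF character and never writes one by itself.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Hypothetical helper: slurp a BOMed UTF-16 file as raw bytes and decode it.
# 'UTF-16' (no LE/BE suffix) detects the byte order from the BOM and
# removes the BOM from the returned character string.
sub read_utf16 {
    my ($path) = @_;
    open(my $in, '<:raw', $path) or die "Couldn't open $path: $!";
    my $bytes = do { local $/; <$in> };
    close $in;
    return decode('UTF-16', $bytes);
}

# Hypothetical helper: encode a character string as UTF-16LE and write it
# as raw bytes. 'UTF-16LE' emits no BOM, so U+FEFF is prepended explicitly.
sub write_utf16le {
    my ($path, $text) = @_;
    open(my $out, '>:raw', $path) or die "Couldn't open $path: $!";
    print {$out} encode('UTF-16LE', "\x{FEFF}" . $text);
    close $out or die "Couldn't close $path: $!";
    return;
}
```

With these helpers, the processing code never sees the BOM, and the output always starts with a little-endian one. Whether that is preferable to keeping the BOM inside the string is a matter of taste.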

Code Syntax Highlighting

I found an easy way to add syntax-highlighted code samples to my blog entries, based on Vim's "convert to HTML" command. This method works for every language supported by Vim.

Select the code in the editor and type :TOhtml. This command opens a new buffer containing your code converted to HTML (including the colors). Now you only have to copy the HTML and paste it into your entry.
As an example, here's my solution to Project Euler's #2:

#!/usr/bin/perl -w
use strict;
use feature 'say';
use Memoize;

memoize('fib');

my $n = 1;
my $fib = fib($n);
my $sum = 0;

while ($fib < 1_000_000) {
    if ($fib % 2 == 0) {
        say $fib;
        $sum += $fib;
    }
    $n++;
    $fib = fib($n);
}

say "Result: $sum";


sub fib {
    my $n = shift;
    
    return 1 if $n < 2;
    return fib($n - 1) + fib($n - 2);
}

As you can see, the syntax coloring in a default MacVim setup doesn't recognize the new Perl 5.10 features (like the say function), but the result is quite readable.

NOTE: For an MT4 blog, the entry format should be None (pure HTML).

There is a similar solution for Emacs/XEmacs users. You can find the details here: htmlize.

Movable Type

After a slow start, I have resumed work on my blog. The free weblogs provided by WordPress.com are nice, but I wanted a more flexible solution, so today I installed Movable Type (which is written in Perl) on my server and migrated the blog. So far so good.
Now I'm checking the different options to add code syntax-highlighting functionality to my posts. It seems that the available plugins are outdated and won't work well with MT4, but there are other possibilities...

About this Archive

This page is an archive of entries from January 2009 listed from newest to oldest.
