Reading planet-php.net today, I see there's some talk about multi-byte encodings in PHP for internationalized text handling. This is one of the issues I have to deal with in the Text_Wiki project (say, with a Hebrew wiki).

John Lim points out in this post that there are already good multibyte functions in PHP. This is fantastic news!

The only downside is that you need to compile them into your PHP installation explicitly (they're not on by default). Also, in order to use them effectively, you need to either (1) code for the multibyte functions to begin with, say mb_substr() instead of substr(), or (2) (and this is the cool one) turn on multibyte overloading so that the multibyte function is used when you call the related single-byte function -- this means you don't need to re-code your app, just recompile PHP and change your php.ini.
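To make option (1) concrete, here's a rough sketch of the difference between the byte-oriented and multibyte calls. It assumes the mbstring extension is loaded and the file is saved as UTF-8; the Hebrew sample word and the variable names are just my illustration, not anything from Text_Wiki.

```php
<?php
// Assumption: mbstring is available and this source file is UTF-8.
$word = "שלום"; // Hebrew "shalom": 4 characters, 2 bytes each in UTF-8

// strlen() counts bytes; mb_strlen() counts characters.
var_dump(strlen($word));             // int(8)
var_dump(mb_strlen($word, 'UTF-8')); // int(4)

// mb_substr() cuts on character boundaries.
$first = mb_substr($word, 0, 1, 'UTF-8');

// substr() cuts on byte boundaries, so this is half a character.
$broken = substr($word, 0, 1);

var_dump(mb_check_encoding($first, 'UTF-8'));  // bool(true)
var_dump(mb_check_encoding($broken, 'UTF-8')); // bool(false)
```

For option (2), the php.ini setting involved is mbstring.func_overload. As far as I know, that directive was deprecated in PHP 7.2 and removed in PHP 8.0, so on newer installations only option (1) applies.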

I love this stuff. :-)
