Reading planet-php.net today, I see there’s some talk about multi-byte encodings in PHP for internationalized text handling. This is one of the issues I have to deal with in the Text_Wiki project (say, for a Hebrew wiki).
The only downside is that you need to compile the multibyte string (mbstring) functions into your PHP installation explicitly (they’re not on by default). Also, in order to use them effectively, you need to either (1) code for the multibyte functions to begin with, say mb_substr() instead of substr(), or (2) (and this is the cool one) turn on multibyte overloading so that the multibyte function is used when you call the related single-byte function. This means you don’t need to re-code your app, just recompile PHP and change your php.ini.
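To make option (1) concrete, here is a minimal sketch of the difference between the single-byte and multibyte calls. It assumes the mbstring extension is loaded (PHP built with --enable-mbstring), that the file is saved as UTF-8, and that overloading is off; the sample word and variable names are just mine for illustration.

```php
<?php
// Assumption: mbstring is available and this source file is UTF-8 encoded.
$word = 'שלום'; // 4 Hebrew letters, each 2 bytes in UTF-8

$byteLength = strlen($word);             // counts bytes
$charLength = mb_strlen($word, 'UTF-8'); // counts characters

$byteSlice = substr($word, 0, 2);             // first 2 bytes (only one letter)
$charSlice = mb_substr($word, 0, 2, 'UTF-8'); // first 2 characters

printf("%d bytes, %d characters\n", $byteLength, $charLength);
printf("substr: %s | mb_substr: %s\n", $byteSlice, $charSlice);
```

For option (2), the php.ini directive is mbstring.func_overload; as far as I know, a value of 2 covers the string functions like substr() and strlen(). Note that this directive was deprecated in PHP 7.2 and removed in PHP 8.0, so on current versions calling the mb_* functions directly is the only route.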
I love this stuff. 🙂