| BstringMultibyteCharacterIssue |
This bit me recently while I was looking at the source of the excellent ReName! program, which I downloaded from BeShare. The program would crash when renaming files whose names contained Chinese characters. The cause was a confusion between CountChars() and Length() in BString: Length() returns the size in bytes, while CountChars() returns the number of characters, and the two differ as soon as a string contains multi-byte characters. The API needs to be cleaned up so that it clearly separates functions that work on potentially multi-byte characters from functions that work on raw bytes. Functions that work on raw bytes should be deprecated, and corresponding functions that work on multi-byte characters should be implemented.
Please note that there are a number of potential implementation pitfalls here. For example, a trivial implementation of Length() (which returns the length in bytes) simply uses the length of the underlying array. Such an implementation is incorrect for CountChars(). One way to implement CountChars() is to walk the array, parsing the UTF-8 to count the characters. That is slow, since every call takes time proportional to the length of the string. Other solutions use more space but can be considerably faster.
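To make the trade-off concrete, here is a minimal sketch in plain C++ using std::string rather than BString. CountUtf8Chars and CachedString are hypothetical names made up for illustration, not part of the BString API, and the code assumes the input is valid UTF-8. It shows the slow approach (scan the bytes on every call) next to one possible space-for-speed approach (store the character count alongside the bytes and update it on modification).

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>

// Hypothetical helper (not a BString method): counts characters by scanning
// every byte and skipping UTF-8 continuation bytes, which look like 10xxxxxx.
// Assumes valid UTF-8. Cost is linear in the byte length.
std::size_t CountUtf8Chars(const std::string& bytes)
{
    std::size_t count = 0;
    for (unsigned char c : bytes) {
        if ((c & 0xC0) != 0x80)
            ++count;
    }
    return count;
}

// Hypothetical wrapper that spends one extra field to cache the character
// count, so CountChars() no longer has to rescan the string.
class CachedString {
public:
    explicit CachedString(std::string bytes)
        : fBytes(std::move(bytes)), fCharCount(CountUtf8Chars(fBytes)) {}

    // Length in bytes: just the size of the underlying array.
    std::size_t Length() const { return fBytes.size(); }

    // Length in characters: served from the cached count.
    std::size_t CountChars() const { return fCharCount; }

    // Only the appended part needs to be scanned to keep the cache current.
    void Append(const std::string& more)
    {
        fBytes += more;
        fCharCount += CountUtf8Chars(more);
    }

private:
    std::string fBytes;
    std::size_t fCharCount;
};

void Example()
{
    // Two Chinese characters (3 bytes each in UTF-8) followed by ".txt".
    CachedString name("\xE4\xB8\xAD\xE6\x96\x87.txt");
    assert(name.Length() == 10);    // bytes
    assert(name.CountChars() == 6); // characters
}
```

Code that uses the character count as a byte offset into such a file name will land in the wrong place, which is the kind of mix-up described above. The cached count is only one option; every mutating operation has to keep it in sync, which is the price of the faster lookup.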