toString vs. toUtf8

Sean Kelly sean at f4.ca
Mon Nov 19 16:46:59 PST 2007


Christopher Wright wrote:
> Sean Kelly wrote:
>> I was looking at converting Tango's use of toUtf8 to toString today 
>> and ran into a bit of a quandary....
> 
> toUtf8 is ugly.
> toString/toWString/toDString are opaque and ugly, hard to distinguish 
> from each other.
> 
> toString, toStringW, toStringD? Still ugly.
> 
> toUtf, toUtf16, toUtf32? Slightly less clear, but easier to type.
> 
> toString, toUtf16, toUtf32? Inconsistent, but readable, and it fits well 
> with other conventions.

I tend to place a tremendous amount of value on consistency, because the 
more consistent an API is, the more likely my guesses about it are to be 
correct.  In my opinion, that precludes the last option you suggest 
(toString, toUtf16, toUtf32).

Walter's suggestion that alternate encodings not be stored in strings 
seems to me sufficient reason not to bother with the encoding format in 
the function name (i.e. toUtf8/toUtf16/toUtf32).  I might counter that 
I don't see any reason to lose meaning where it is so easily provided, 
but on the other hand, I agree that new users are more likely to know 
what a function named toString does than one named toUtf8.  I consider 
these two points a wash.

The remaining concerns are less substantive.  I find toWString and 
toDString difficult to read, but that feeling holds little more weight 
than "toUtf8 is ugly."  I also feel that the term "string" is largely 
meaningless in programming.  But I certainly couldn't win a debate on 
either point.

I don't suppose there is anyone who does a lot of internationalization 
programming who can comment on the utility of one convention vs. the 
other?  I would love to hear some more practical concerns regarding the 
naming convention for these functions.


Sean


