To Walter, about char[] initialization by FF

Andrew Fedoniouk news at terrainformatica.com
Mon Jul 31 20:46:53 PDT 2006


"Derek Parnell" <derek at nomail.afraid.org> wrote in message 
news:8n0koj5wjiio.qwc8ok4mrvr3$.dlg at 40tude.net...
> On Mon, 31 Jul 2006 18:23:19 -0700, Andrew Fedoniouk wrote:
>
>> "Walter Bright" <newshound at digitalmars.com> wrote in message
>> news:eam1ec$10e1$1 at digitaldaemon.com...
>>> Andrew Fedoniouk wrote:
>>>> The problem as I see it is this:
>>>> D proposes to use a transport encoding for manipulation purposes,
>>>> which is the main problem here imo - transport encodings are not
>>>> designed for manipulation - it is extremely difficult to use
>>>> them for manipulation in practice, as we can see.
>>>
>>> I disagree with the characterization that it is "extremely difficult" to use
>>> for manipulation. foreach's direct support for it, as well as the
>>> functions in std.utf, make it straightforward. DMDScript is built around
>>> UTF-8, and manipulating multibyte characters in it has not turned out to
>>> be a significant problem.
>>
>> Sorry, but strings in DMDScript are quite different, in that:
>> 0) there is no such thing as a char in JavaScript.
>> 1) strings are Strings, not vectors of octets - js::string[] and d::char[]
>> are different things.
>> 2) they are not supposed to be used by any OS API.
>> 3) there are 12 or so methods in the JS String class - a limited perimeter -
>> so the model you have chosen to store them is irrelevant;
>> in some implementations they are even represented by a list of fixed runs.
>
> For what it's worth, to do *character* manipulation I convert strings to
> UTF-32, do my stuff and convert back to the initial format.
>
> char[] somefunc(char[] x)
> {
>   return std.utf.toUTF8( somefunc( std.utf.toUTF32(x) ) );
> }
>
> wchar[] somefunc(wchar[] x)
> {
>   return std.utf.toUTF16( somefunc( std.utf.toUTF32(x) ) );
> }
>
> dchar[] somefunc(dchar[] x)
> {
>   dchar[] result;
>   ...
>   return result;
> }
>
> This seems to work fast enough for my purposes. DBuild (nee Build) uses
> this a lot.

Derek, using dchar (the ultimate char) is perfectly fine in DBuild's(*)
circumstances - you are parsing, not dealing with the OS on every line.

Using dchar has a drawback: you need to recreate all the string primitive
ops from scratch, including RegExp, etc.

Again, dchar is OK - the only thing that is not OK is the strange choice of
dchar's null/nothing/nihil/nil/whatever value.

(* "dbuild" does not sound good in Russian - it is very close to "idiot" in
the medical sense. Consider builDer/buildDer/creaDor, for example - with a
red D in the middle - stylish, at least.)

Andrew.