Why foreach(c; someString) must yield dchar

Jonathan M Davis jmdavisprog at gmail.com
Thu Aug 19 12:49:49 PDT 2010


On Thursday, August 19, 2010 12:24:22 Kagamin wrote:
> Jonathan M Davis Wrote:
> > Not to mention, the Linux I/O stuff uses UTF-8, and
> > the Windows I/O stuff uses UTF-16, so dstring is less efficient for
> > dealing with I/O.
> 
> Take dil as an example of an application that does a lot of string
> processing. How much string processing does it do, and how intensively
> does it communicate with the OS (with string transcoding)?

I have never heard of dil, so I have no idea. How big a hit the string type has 
on I/O is likely to depend strongly on the type of I/O you're doing, the 
characteristics of your strings (such as the average number of code units per 
code point and the average string length), and whatever other CPU- or 
memory-intensive work you're doing. However, if you want to maximize 
efficiency, it does make sense for your string type to use the same code unit 
size as the OS's native string type (UTF-8 on Linux, UTF-16 on Windows), since 
that avoids transcoding at the I/O boundary.
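
To make "code units per code point" concrete, here is a minimal sketch in D. 
The helper name countCodePoints and the sample strings are made up for 
illustration; the helper relies only on foreach decoding a string into dchars.

```d
import std.stdio : writefln;

// Hypothetical helper (not part of Phobos): counts code points by letting
// foreach decode the string one dchar at a time.
size_t countCodePoints(S)(S s)
{
    size_t n = 0;
    foreach (dchar c; s)
        ++n;
    return n;
}

void main()
{
    string ascii = "hello";       // 5 code points, 5 UTF-8 code units
    string mixed = "na\u00EFve";  // 5 code points, 6 UTF-8 code units

    foreach (s; [ascii, mixed])
        writefln("%s code units / %s code points = %.2f",
                 s.length, countCodePoints(s),
                 cast(double) s.length / countCodePoints(s));

    // The same text as UTF-16: U+00EF fits in a single wchar.
    wstring mixed16 = "na\u00EFve"w;
    assert(mixed16.length == 5);
    assert(countCodePoints(mixed16) == 5);
}
```

Note that .length reports code units, not code points, so the ratio depends 
both on the text and on the encoding chosen for it.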

More important, however, is that UTF-32 strings cost a lot of memory if you 
have a lot of them, since every code point takes 4 bytes, even plain ASCII. 
The string processing itself could actually be more efficient with dstring, 
because each code unit is a whole code point and you can use random access 
operations on it. Or it could be less efficient, because of the extra memory 
involved.
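
A rough sketch of that trade-off in D, assuming a made-up sample string and a 
hypothetical helper byteSize (neither comes from the thread): the same text is 
stored in all three string types, and only the dstring can be indexed by code 
point.

```d
import std.conv : to;
import std.stdio : writefln;

// Hypothetical helper: bytes occupied by the character data of a string.
size_t byteSize(S)(S s)
{
    return s.length * typeof(s[0]).sizeof;
}

void main()
{
    // 10 code points, two of them non-ASCII (U+00EF and U+00E9).
    string  s8  = "na\u00EFve caf\u00E9";
    wstring s16 = to!wstring(s8);
    dstring s32 = to!dstring(s8);

    writefln("string : %s bytes", byteSize(s8));   // 12
    writefln("wstring: %s bytes", byteSize(s16));  // 20
    writefln("dstring: %s bytes", byteSize(s32));  // 40

    // dstring: index i is code point i, so random access is safe.
    assert(s32[2] == '\u00EF');

    // string: index 2 is only the first of the two UTF-8 code units
    // that encode U+00EF.
    assert(s8[2] == 0xC3);
}
```

For mostly-ASCII text the dstring is several times larger than the string, 
which is the memory cost in question; what it buys is indexing by code point.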

- Jonathan M Davis