How to get a substring?

Jonathan M Davis jmdavisProg at gmx.com
Sun Oct 27 01:34:59 PDT 2013


On Sunday, October 27, 2013 09:14:28 Nicolas Sicard wrote:
> On Sunday, 27 October 2013 at 07:44:06 UTC, Jakob Ovrum wrote:
> > On Saturday, 26 October 2013 at 21:23:13 UTC, Gautam Goel wrote:
> >> Dumb Newbie Question: I've searched through the library
> >> reference, but I haven't figured out how to extract a
> >> substring from a string. I'd like something like
> >> string.substring("Hello", 0, 2) to return "Hel", for example.
> >> What method am I looking for? Thanks!
> > 
> > There are a lot of good answers in this thread but I also think
> > they miss the real issue here.
> 
> I don't think so. It's indeed worth noticing that Phobos'
> algorithms work with Unicode nicely, but:
> a) working on indices is sometimes the actual functionality you
> need

Sometimes, but it usually isn't. If you find that you frequently need to use 
indices for a string, then you should probably rethink how you're using 
strings. Phobos aims at operating on ranges, which rarely means using indices, 
and _very_ rarely means using indices on strings. In general, indices only get 
used on strings when you're trying to optimize a particular algorithm for 
strings and make sure that you slice the string so that the result is a string 
rather than a wrapper range.

Sure, indexing strings can be very useful, but the way that Phobos is 
designed does not lend itself to using string indices (quite the opposite in 
fact), and in my experience, using string indices is rarely needed even when 
doing heavy string manipulation.
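
As a rough sketch of the difference (fromFirst is a made-up helper name, and 
the sample text is only for illustration): range functions such as find and 
take work in code points, and find hands back a slice of the original string, 
whereas a raw index counts UTF-8 code units.

```d
import std.algorithm : equal, find;
import std.range : take;

// Hypothetical helper: the part of haystack starting at the first
// occurrence of needle. For strings, find returns a slice of the
// original, so nothing is allocated.
string fromFirst(string haystack, string needle)
{
    return haystack.find(needle);
}

void main()
{
    // "héllo wörld", written with escapes so the encoding is unambiguous.
    string s = "h\u00E9llo w\u00F6rld";

    assert(fromFirst(s, "w\u00F6rld") == "w\u00F6rld");

    // take counts code points, so the first three are 'h', 'é', 'l'.
    // The result is a wrapper range of dchar, not a string.
    assert(s.take(3).equal("h\u00E9l"));

    // Slicing counts code units: 'h' is one byte and 'é' is two,
    // so three code units cover only "hé".
    assert(s[0 .. 3] == "h\u00E9");
}
```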

> b) you need to allocate a new string from the range they return
> (the slice functions in this thread don't)

You _rarely_ want to do that. Allocating a new string is just plain wasteful 
in most cases. The fact that the elements in a string are immutable makes it 
so that you can slice without worrying about allocating new strings. You 
should pretty much only be allocating new strings when slicing when the 
original was something like char[] rather than string. The main place where 
that's usually forced is when reading from a file (since buffers are frequently 
reused when reading files and therefore not immutable).
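
A minimal sketch of that distinction, assuming a mutable buffer that will be 
overwritten later (ownedPrefix is a made-up helper name, not something from 
Phobos):

```d
// Hypothetical helper: copy the first n code units of a mutable buffer
// into a new immutable string. This is the case where allocating is
// warranted, because the buffer may be reused.
string ownedPrefix(const(char)[] buf, size_t n)
{
    return buf[0 .. n].idup;
}

void main()
{
    // Slicing a string allocates nothing: the slice refers to the
    // same immutable memory as the original.
    string s = "Hello, world";
    string hel = s[0 .. 3];
    assert(hel == "Hel");
    assert(hel.ptr is s.ptr);

    // A char[] buffer can change underneath a slice, so copy what
    // needs to outlive the next write.
    char[] buf = "Hello, world".dup;
    string kept = ownedPrefix(buf, 3);
    buf[0] = 'J';
    assert(kept == "Hel");
    assert(buf[0 .. 3] == "Jel");
}
```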

> c) do they really handle grapheme clusters? (I don't know)

I believe that that sort of thing is properly supported by the updated std.uni 
in 2.064, but it is the sort of thing that you have to code for. Phobos as a 
whole operates on ranges of dchar - which is correct most of the time but not 
enough when you need full-on grapheme support. I haven't yet looked in detail 
at what std.uni now provides though. I just know that it's added some grapheme 
support.
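
For illustration only, here is how the three levels differ on a combining 
sequence. This sketch uses byGrapheme as found in current std.uni; whether it 
is available in the release discussed here is an assumption, and 
graphemeCount is a made-up helper name.

```d
import std.range : walkLength;
import std.uni : byGrapheme;

// Hypothetical helper: number of grapheme clusters in s.
size_t graphemeCount(string s)
{
    return s.byGrapheme.walkLength;
}

void main()
{
    // 'e' followed by U+0301 COMBINING ACUTE ACCENT: displayed as a
    // single character, stored as two code points.
    string s = "e\u0301";

    assert(s.length == 3);          // UTF-8 code units (1 + 2)
    assert(s.walkLength == 2);      // code points, i.e. a range of dchar
    assert(graphemeCount(s) == 1);  // grapheme clusters
}
```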

- Jonathan M Davis

