How to get a substring?
Jonathan M Davis
jmdavisProg at gmx.com
Sun Oct 27 01:34:59 PDT 2013
On Sunday, October 27, 2013 09:14:28 Nicolas Sicard wrote:
> On Sunday, 27 October 2013 at 07:44:06 UTC, Jakob Ovrum wrote:
> > On Saturday, 26 October 2013 at 21:23:13 UTC, Gautam Goel wrote:
> >> Dumb Newbie Question: I've searched through the library
> >> reference, but I haven't figured out how to extract a
> >> substring from a string. I'd like something like
> >> string.substring("Hello", 0, 2) to return "Hel", for example.
> >> What method am I looking for? Thanks!
> >
> > There are a lot of good answers in this thread but I also think
> > they miss the real issue here.
>
> I don't think so. It's indeed worth noticing that Phobos'
> algorithms work with Unicode nicely, but:
> a) working on indices is sometimes the actual functionality you
> need
Sometimes, but it usually isn't. If you find that you frequently need to use
indices for a string, then you should probably rethink how you're using
strings. Phobos aims at operating on ranges, which rarely means using indices,
and _very_ rarely means using indices on strings. In general, indices only get
used on strings when you're trying to optimize a particular algorithm for
strings and make sure that you slice the string so that the result is a string
rather than a wrapper range.
Sure, indexing strings can be very useful, but the way that Phobos is
designed does not lend itself to using string indices (quite the opposite in
fact), and in my experience, using string indices is rarely needed even when
doing heavy string manipulation.
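
As a minimal sketch of the difference (the helper name fromFirst is
hypothetical and only for illustration): slicing by index answers the original
question directly, while the range-based functions usually remove the need for
indices. Note that slice indices count UTF-8 code units and that the end index
is exclusive, so "Hel" is [0 .. 3].

```d
import std.algorithm : equal, find;
import std.range : take;

/// Hypothetical helper: the part of `haystack` from the first `needle` onward.
string fromFirst(string haystack, string needle)
{
    // For a string haystack, find returns a slice of the original string.
    return haystack.find(needle);
}

void main()
{
    string s = "Hello, world";

    // Index-based slicing: code-unit indices, exclusive end.
    assert(s[0 .. 3] == "Hel");

    // Range-based search: no indices involved, and the result is still a string.
    assert(fromFirst(s, "world") == "world");

    // take yields a lazy wrapper range of dchar rather than a string.
    auto firstThree = s.take(3);
    static assert(!is(typeof(firstThree) == string));
    assert(firstThree.equal("Hel"));
}
```

The take case shows the "wrapper range" mentioned above: it is fine for
further range processing, but it is not a string unless it is converted.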
> b) you need to allocate a new string from the range they return
> (the slice functions in this thread don't)
You _rarely_ want to do that. Allocating a new string is just plain wasteful
in most cases. The fact that the elements in a string are immutable makes it
so that you can slice without worrying about allocating new strings. You
should pretty much only be allocating new strings when slicing when the
original was something like char[] rather than string. The main place where
that's usually forced is when reading from a file (since buffers are frequently
reused when reading files and therefore not immutable).
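
A small sketch of that distinction, assuming two hypothetical helpers (prefix
and keep) written only for this example: slicing a string shares the original
memory, while a slice of a reused char[] buffer has to be copied with idup if
it must outlive the buffer's next use.

```d
/// Hypothetical helper: the first `n` code units of `s`, as a slice.
string prefix(string s, size_t n)
{
    return s[0 .. n]; // no allocation, points into the original string
}

/// Hypothetical helper: copy part of a mutable buffer into a new string.
string keep(const(char)[] buf, size_t from, size_t to)
{
    return buf[from .. to].idup; // allocates an immutable copy
}

void main()
{
    string s = "Hello";
    string sub = prefix(s, 3);
    assert(sub == "Hel");
    assert(sub.ptr == s.ptr); // same memory as the original

    // Stand-in for a buffer that gets reused, e.g. while reading a file.
    char[] buffer = "line one".dup;
    string saved = keep(buffer, 0, 4);
    buffer[0] = 'X';          // overwrite the buffer
    assert(saved == "line");  // the copy is unaffected
}
```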
> c) do they really handle grapheme clusters? (I don't know)
I believe that that sort of thing is properly supported by the updated std.uni
in 2.064, but it is the sort of thing that you have to code for. Phobos as a
whole operates on ranges of dchar - which is correct most of the time but not
enough when you need full-on grapheme support. I haven't yet looked in detail
at what std.uni now provides though. I just know that it's added some grapheme
support.
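
For illustration, a minimal sketch of the three levels involved, using
graphemeStride and byGrapheme from std.uni as found in current Phobos. Treat
their availability in 2.064 specifically as an assumption; byGrapheme in
particular may have been added in a later release. The helper graphemeCount is
hypothetical.

```d
import std.range : walkLength;
import std.uni : byGrapheme, graphemeStride;

/// Hypothetical helper: number of grapheme clusters in `s`.
size_t graphemeCount(string s)
{
    return s.byGrapheme.walkLength;
}

void main()
{
    // 'e' followed by U+0301 COMBINING ACUTE ACCENT:
    // one user-perceived character made of two code points.
    string s = "e\u0301";

    assert(s.length == 3);               // UTF-8 code units (1 + 2)
    assert(s.walkLength == 2);           // code points (range of dchar)
    assert(graphemeCount(s) == 1);       // grapheme clusters
    assert(graphemeStride(s, 0) == 3);   // code units in the first grapheme
}
```

So a slice such as s[0 .. 1] is valid UTF-8 here but cuts the accent off its
base letter, which is the kind of case that has to be coded for explicitly.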
- Jonathan M Davis