[phobos] UTF-8 string slicing
Martin Nowak
dawg at dawgfoto.de
Fri Aug 19 07:06:27 PDT 2011
This is because popFront and front actually decode Unicode code points, so
takeExactly works on 32-bit dchars, the ElementType of all strings.
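
A minimal sketch of that behaviour, assuming a Phobos release in which drop
is already available. The helper name sliceByCodePoint and the sample string
are made up for illustration; they are not part of Phobos:

```d
import std.array : array;
import std.conv : to;
import std.range;

// Hypothetical helper (not in Phobos): slice by code point, not by code unit.
// It has to decode from the start of the string, so it is O(n), not O(1).
string sliceByCodePoint(string str, size_t first, size_t last)
{
    // drop and takeExactly go through front/popFront, which decode UTF-8
    // into dchars; to!string re-encodes the resulting dchar[] as UTF-8.
    return str.drop(first).takeExactly(last - first).array.to!string;
}

void main()
{
    string s = "привет, мир";

    // The range element type of a string is dchar, not char.
    static assert(is(ElementType!string == dchar));

    assert(s.length == 20);     // UTF-8 code units (bytes)
    assert(s.walkLength == 11); // decoded code points
    assert(s.front == 'п');     // front decodes a whole code point

    // Plain slicing indexes code units: O(1), but the indices are bytes.
    assert(s[14 .. 20] == "мир");

    // The range-based version indexes code points.
    assert(sliceByCodePoint(s, 8, 11) == "мир");
}
```

The two "мир" assertions show the trade-off discussed below: the built-in
slice is constant time but needs byte offsets, while the range-based version
takes code point indices at the cost of walking the string.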
On Fri, 19 Aug 2011 12:07:53 +0200, unDEFER <undefer at gmail.com> wrote:
> On Fri, 19 Aug 2011 06:53:37 +0400, Jonathan M Davis
> <jmdavisProg at gmx.com> wrote:
>
>> Hmmm. Such a function isn't entirely a bad idea, but it also makes me a
>> bit nervous. Slicing is efficient. The slice function that you suggest
>> is not. I mean, it's efficient enough for what it's doing, but it's not
>> O(1) like slicing is, so having a slice function could be a bit
>> misleading.
>
> I know that it is not efficient, but that raises the question of why
> D decided not to support 8-bit encodings. Only they make operations
> like this efficient.
>
>> Once drop has been merged in, you'll be able to do this
>> auto s = takeExactly(drop(str, firstIndex), lastIndex - firstIndex);
>> to get the same effect. It may be worth adding such a function though.
>
> I'm sorry, but it looks like there is no "drop()" function.
> Anyway, thank you. I really don't understand how takeExactly works, but
> it works. For newbies it is really not obvious that std.range works fine
> with UTF-8 strings.
>
>> Certainly
>> auto s = slice(firstIndex, lastIndex);
>> is cleaner. If we add it though, then we should probably give it a
>> different name. Maybe sliceByElementType? That does seem a bit long
>> though, if accurate.
>
> In many other languages this function is named "subString".
>
>> We'd probably put it in std.range though rather than std.utf, since it
>> could be useful for any range which isn't actually sliceable. And then
>> there's the question of whether it would be better to make it lazy. It
>> would make it so that it wasn't actually a string anymore, but it would
>> make it more efficient for all of the cases where you don't actually
>> end up using the whole slice.
>>
>> You can make a pull request for it if you want to, and the best way to
>> handle it - as well as whether we actually want such a function - can
>> be discussed in the pull request. I do think that some thought is going
>> to have to go into what behavior we really want such a function to have
>> though (as well as the best name for it).
>
> I'm not familiar with Git, but I'll try to come up with something.
>