[phobos] UTF-8 string slicing

Martin Nowak dawg at dawgfoto.de
Fri Aug 19 07:06:27 PDT 2011


This is because popFront and front actually decode Unicode code points. So
takeExactly works on 32-bit dchars, the ElementType of all strings.
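
To make the difference between code units and code points concrete, here is
a minimal sketch. The helper name subByCodePoint is hypothetical (it is not
part of Phobos), and it assumes first <= last <= the number of code points
in the string. It uses popFrontN rather than drop, since drop may not be
available in every Phobos release.

```d
import std.algorithm : equal;
import std.range;

// Hypothetical helper, not part of Phobos: lazily yields the code points
// in [first, last) of a UTF-8 string. It has to decode everything before
// `last`, so the cost is linear, not O(1) like a real slice.
auto subByCodePoint(string str, size_t first, size_t last)
{
    str.popFrontN(first);                  // decode and skip `first` code points
    return takeExactly(str, last - first); // lazy range of dchar
}

void main()
{
    // Every Cyrillic letter below takes two UTF-8 code units (bytes).
    string str = "привет, мир";

    static assert(is(ElementType!string == dchar));

    assert(str.length == 20);     // .length counts UTF-8 code units
    assert(str.walkLength == 11); // walking the range counts code points

    // Range-based access is indexed by code point ...
    assert(equal(subByCodePoint(str, 8, 11), "мир"));

    // ... while plain slicing is indexed by code unit.
    assert(str[14 .. 20] == "мир");
}
```

Note that the result is a lazy range of dchar, not a string, so it has to be
compared with equal (or copied into an array) rather than with ==.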

On Fri, 19 Aug 2011 12:07:53 +0200, unDEFER <undefer at gmail.com> wrote:

> On Fri, 19 Aug 2011 06:53:37 +0400, Jonathan M Davis  
> <jmdavisProg at gmx.com> wrote:
>
>> Hmmm. Such a function isn't entirely a bad idea, but it also makes me a
>> bit nervous. Slicing is efficient. The slice function that you suggest is
>> not. I mean, it's efficient enough for what it's doing, but it's not O(1)
>> like slicing is, so having a slice function could be a bit misleading.
>
> I know that it is not efficient, but this raises the question of why
> D has decided not to support 8-bit encodings. Only they make operations
> like this efficient.
>
>> Once drop has been merged in, you'll be able to do this
>> auto s = takeExactly(drop(str, firstIndex), lastIndex - firstIndex);
>> to get the same effect. It may be worth adding such a function though.
>
> I'm sorry, but it looks like there is no "drop()" function.
> Anyway, thank you. I really don't understand how takeExactly works, but
> it works. For newbies it is really not obvious that std.range works fine
> with UTF-8 strings.
>
>> Certainly
>> auto s = slice(firstIndex, lastIndex);
>> is cleaner. If we add it though, then we should probably give it a  
>> different name. Maybe sliceByElementType? That does seem a bit long  
>> though, if accurate.
>
> In many other languages this function is named "subString".
>
>> We'd probably put it in std.range though rather than std.utf, since it
>> could be useful for any range which isn't actually sliceable. And then
>> there's the question of whether it would be better to make it lazy. It
>> would make it so that it wasn't actually a string anymore, but it would
>> make it more efficient for all of the cases where you don't actually end
>> up using the whole slice.
>>
>> You can make a pull request for it if you want to, and the best way to  
>> handle it - as well as whether we actually want such a function - can  
>> be discussed in the pull request. I do think that some thought is going  
>> to have to go into what behavior we really want such a function to have  
>> though (as well as the best name for it).
>
> I'm not familiar with Git, but I'll try to come up with something.
>



