Find Semantically Correct Word Splits in UTF-8 Strings
    monarch_dodra via Digitalmars-d-learn 
    digitalmars-d-learn at puremagic.com
       
    Wed Oct  1 10:09:56 PDT 2014
    
    
  
On Wednesday, 1 October 2014 at 11:47:41 UTC, Nordlöw wrote:
> On Wednesday, 1 October 2014 at 11:06:24 UTC, Nordlöw wrote:
>> I'm looking for a way to make my algorithm
>>
>
> Update:
>
>     S[] findMeaningfulWordSplit(S)(S word,
>                                    HLang[] langs = [])
>         if (isSomeString!S)
>     {
>         for (size_t i = 1; i + 1 < word.length; i++)
>         {
>             const first = word.takeExactly(i).to!string;

Does that even work? takeExactly(i) pops i *code points*, whereas 
word.length counts *code units*, so for a non-ASCII word the index 
can run past the number of code points the string actually has.
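
To illustrate the mismatch, here is a minimal sketch (the sample word is my own choice, not taken from the original code). It only shows that .length and the number of decoded code points differ once a string contains non-ASCII characters:

```d
import std.range : walkLength;

unittest
{
    string word = "Nordlöw";       // 'ö' is encoded as two UTF-8 code units

    assert(word.length == 8);      // .length counts code units (bytes for string)
    assert(word.walkLength == 7);  // walkLength decodes and counts code points
}
```

Compiling with -unittest -main runs the checks. Any loop bounded by word.length but stepping in code points will eventually ask for more elements than the string can provide.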
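
Something like this should work instead:

    for (auto second = str; !second.empty; second.popFront())
    {
        // first ends exactly where second begins, so both halves
        // always start and end on a code point boundary
        auto first = str[0 .. $ - second.length];
        ...
    }
    // special-case the last split (first = str, second = str[$ .. $])
    // here, or adapt your loop
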
That would also be Unicode-correct, since popFront advances by whole 
code points, without increasing the original complexity.
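
For completeness, a self-contained sketch of that loop wrapped in a function. The name codePointSplits and the choice to return an array of tuples are assumptions of mine for illustration; the original findMeaningfulWordSplit would do its dictionary lookup where this version just collects the pairs:

```d
import std.range : empty, popFront;
import std.traits : isSomeString;
import std.typecons : Tuple, tuple;

/// Hypothetical helper: returns every (first, second) split of str
/// taken at a code point boundary, including the two trivial splits.
Tuple!(S, S)[] codePointSplits(S)(S str) if (isSomeString!S)
{
    Tuple!(S, S)[] result;

    // popFront on a string decodes and drops one whole code point,
    // so second always starts on a code point boundary.
    for (auto second = str; !second.empty; second.popFront())
    {
        auto first = str[0 .. $ - second.length];
        result ~= tuple(first, second);
    }

    // The loop stops before second becomes empty, so add that split here.
    result ~= tuple(str, str[$ .. $]);
    return result;
}

unittest
{
    auto splits = codePointSplits("löw");

    assert(splits.length == 4);
    assert(splits[0] == tuple("", "löw"));
    assert(splits[1] == tuple("l", "öw"));
    assert(splits[2] == tuple("lö", "w"));   // never cuts inside 'ö'
    assert(splits[3] == tuple("löw", ""));
}
```

Each iteration only slices, so the loop itself stays linear in the length of the string; collecting the pairs into an array is just for the demonstration.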