Major performance problem with std.array.front()
w0rp
devw0rp at gmail.com
Tue Mar 11 05:44:33 PDT 2014
On Sunday, 9 March 2014 at 21:38:06 UTC, Nick Sabalausky wrote:
> On 3/9/2014 7:47 AM, w0rp wrote:
>>
>> My knowledge of Unicode pretty much just comes from having
>> to deal with foreign language customers and discovering the
>> problems
>> with the code unit abstraction most languages seem to use.
>> (Java and
>> Python suffer from similar issues, but they don't really have
>> algorithms
>> in the way that we do.)
>>
>
> Python 2 or 3 (out of curiosity)? If you're including Python3,
> then that somewhat surprises me as I thought greatly improved
> Unicode was one of the biggest reasons for the jump from 2 to
> 3. (Although it isn't *completely* surprising since, as we all
> know far too well here, fully correct Unicode is *not* easy.)
Late reply here. Python 3 is a lot better in terms of Unicode
support than 2. The situation in Python 2 was this:
1. The default string type is 'str', an immutable array of bytes.
2. A 'str' could hold text in any of many encodings, including UTF-16, etc.
3. There is an extra 'unicode' type for when you want a Unicode
string.
4. Python implicitly converts between the two, often in the wrong
way, often causing exceptions to appear where you didn't expect
them.
In 3, this changed to...
1. The default string type is still named 'str', only now it's
like the 'unicode' of olde.
2. 'bytes' is a new immutable byte array type, like the Python
2 'str'.
3. Conversion between 'str' and 'bytes' is always explicit.
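To make point 3 concrete, here is a rough sketch of what explicit conversion looks like in Python 3. The helper names (round_trip, mixing_is_rejected) are made up for illustration; only encode, decode and the TypeError are standard behaviour.

```python
def round_trip(text, encoding="utf-8"):
    """Encode a str to bytes and decode it back; both steps are explicit."""
    data = text.encode(encoding)         # str -> bytes
    return data, data.decode(encoding)   # bytes -> str


def mixing_is_rejected(text, data):
    """Return True if Python refuses to concatenate a str with bytes."""
    try:
        text + data                      # no implicit conversion in Python 3
    except TypeError:
        return True
    return False


data, back = round_trip("na\u00efve")    # "naive" with U+00EF in the middle
print(type(data).__name__, len(data))    # bytes 6
print(back == "na\u00efve")              # True
print(mixing_is_rejected("a", b"a"))     # True
```

In Python 2 the equivalent of the last call would have tried an implicit ASCII decode instead of refusing outright, which is where the surprise exceptions came from.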
However, Python 3 works at the code point level (possibly at some
code unit level, depending on the build), and you don't see many
algorithms which take, say, combining characters into account. So
Python suffers from similar issues.
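A small sketch of the combining character problem, assuming Python 3.3 or later. The function names are hypothetical; unicodedata.normalize is the standard library call.

```python
import unicodedata

COMPOSED = "\u00e9"      # e-acute as a single code point
DECOMPOSED = "e\u0301"   # 'e' followed by COMBINING ACUTE ACCENT


def code_point_count(s):
    """len() on a str counts code points, not user-perceived characters."""
    return len(s)


def nfc_equal(a, b):
    """Compare two strings after normalising both to NFC."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


print(code_point_count(COMPOSED))       # 1
print(code_point_count(DECOMPOSED))     # 2
print(COMPOSED == DECOMPOSED)           # False, although both render the same
print(nfc_equal(COMPOSED, DECOMPOSED))  # True
print(DECOMPOSED[::-1] == "\u0301e")    # True: reversing detaches the accent
```

Normalisation only helps where a precomposed form exists; it does not make slicing or reversal aware of combining sequences in general.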