isAsciiString in Phobos?
monarch_dodra
monarchdodra at gmail.com
Mon Oct 7 13:14:05 PDT 2013
On Monday, 7 October 2013 at 16:23:12 UTC, Andrej Mitrovic wrote:
> On 10/7/13, monarch_dodra <monarchdodra at gmail.com> wrote:
>> If we want even more efficiency, we could iterate on the string,
>> interpreting it as a size_t[]. We mask each of its elements with
>> 0x80808080/0x80808080_80808080, and if one of the resulting
>> masked elements is not null, then the string isn't ASCII.
>
> Clever! So I think we should definitely try and push it to the
> library.
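To make the masking idea concrete, here is a minimal sketch on a single 8-byte chunk. The name chunkIsASCII is hypothetical (it is not a Phobos function), and the chunk is packed by hand so the example does not depend on pointer alignment.

```d
// Hypothetical helper (not part of Phobos): tests one 8-byte chunk
// with a single mask instead of eight per-byte comparisons.
bool chunkIsASCII(const(char)[] chunk)
{
    assert(chunk.length == 8, "this sketch only handles 8-byte chunks");

    ulong word = 0;
    foreach (i, c; chunk)
        word |= cast(ulong)c << (8 * i); // pack the bytes; their order does not matter here

    // Every ASCII code unit is below 0x80, so bit 7 of each byte must be clear.
    return (word & 0x80808080_80808080) == 0;
}

unittest
{
    assert( chunkIsASCII("hello wo"));
    // "\xC3\xA9" is the UTF-8 encoding of 'é'; both bytes have bit 7 set.
    assert(!chunkIsASCII("h\xC3\xA9llo w"));
}
```

The real function reads the words straight from memory instead of packing them, which is where the alignment handling comes from.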
I wrote this (only lightly tested):
//--------
bool isASCII(const(char[]) str)
{
    static if (size_t.sizeof == 8)
    {
        enum size = 8;
        enum size_t mask = 0x80808080_80808080;
        enum size_t alignMask = ~cast(size_t)0b111;
    }
    else
    {
        enum size = 4;
        enum size_t mask = 0x80808080;
        enum size_t alignMask = ~cast(size_t)0b11;
    }

    //Strings shorter than one block are checked char by char.
    if (str.length < size)
    {
        foreach (c; str)
            if (c & 0x80)
                return false;
        return true;
    }

    //First aligned address past str.ptr, and last aligned address
    //at or before the end of the string.
    immutable start = (cast(size_t)str.ptr & alignMask) + size;
    immutable end = cast(size_t)(str.ptr + str.length) & alignMask;

    //We start with the blocks, because it is faster, and chances
    //are the start is aligned anyway (so we check it later).
    for ( auto p = cast(size_t*)start ; p != cast(size_t*)end ; ++p )
        if (*p & mask)
            return false;

    //Then the trailing chars.
    for ( auto p = cast(char*)end ; p != str.ptr + str.length ; ++p )
        if (*p & 0x80)
            return false;

    //Finally, the first chars.
    for ( auto p = str.ptr ; p != cast(char*)start ; ++p )
        if (*p & 0x80)
            return false;

    return true;
}
//--------
assert( "hello".isASCII());
assert( "heellohelloellohelloellohelloellohellollohello".isASCII());
assert( "hellellohelloellohelloo"[3 .. $].isASCII());
assert(!"heéppellohelloellohelloellohelloellohelloellohellollo".isASCII());
assert(!"heppellohelloellohelloellohéelloellohelloellohellollo".isASCII());
assert(!"heppellohelloellohelloellohelloellohelloellohellolléo".isASCII());
//--------
What do you think? I have some doubts though:
1. Does x64 require qword alignment for size_t, or is dword
enough?
2. Isn't there some built-in that'll give me the wanted
alignment, instead of doing it by hand?
3. Are those casts 100% correct?
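
For comparison, here is a minimal sketch of a variant that avoids the hand-written alignment masks and never casts const away. It is only one possible arrangement, not a definitive answer to the questions above. The name isASCIIWords is hypothetical, and the sketch assumes that an address that is a multiple of size_t.sizeof is suitably aligned for a size_t read.

```d
// Hypothetical variant (not part of Phobos): byte-wise until the pointer
// is aligned, then word-wise, then byte-wise for the tail.
bool isASCIIWords(const(char)[] str)
{
    // The explicit cast truncates the constant to 0x80808080 on 32-bit targets.
    enum size_t mask = cast(size_t)0x80808080_80808080UL;

    auto p = str.ptr;
    const end = str.ptr + str.length;

    // Leading bytes, until p is aligned for size_t or the input runs out.
    while (p != end && (cast(size_t)p % size_t.sizeof) != 0)
    {
        if (*p & 0x80)
            return false;
        ++p;
    }

    // Whole words, read through a pointer to const so the qualifier is kept.
    while (cast(size_t)(end - p) >= size_t.sizeof)
    {
        if (*(cast(const(size_t)*)p) & mask)
            return false;
        p += size_t.sizeof;
    }

    // Trailing bytes.
    for (; p != end; ++p)
        if (*p & 0x80)
            return false;

    return true;
}

unittest
{
    assert( isASCIIWords(""));
    assert( isASCIIWords("hellellohelloellohelloo"[3 .. $]));
    assert(!isASCIIWords("heppellohelloellohelloellohellolle\xC3\xA9o"));
}
```

Walking forward with a single pointer also removes the need for the separate short-string branch, at the cost of checking the leading bytes before the blocks.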