Most performant way of converting int to string
cym13 via Digitalmars-d-learn
digitalmars-d-learn at puremagic.com
Tue Dec 22 10:27:12 PST 2015
On Tuesday, 22 December 2015 at 17:52:52 UTC, Andrew Chapman wrote:
> On Tuesday, 22 December 2015 at 17:43:00 UTC, H. S. Teoh wrote:
>
>> I wonder if the slowdown is caused by GC collection cycles
>> (because calling to!string will allocate, and here you're
>> making a very large number of small allocations, which is
>> known to cause GC performance issues).
>>
>> Try inserting this before the loop:
>>
>> import core.memory;
>> GC.disable();
>>
>> Does this make a difference in the running time?
>>
>>
>> T
>
> Thanks! Unfortunately that actually makes it run slightly
> slower :-)
I dislike guessing, so here are my thoughts on it and the method I
used to get there.

First, I used the following code:
void f() {
    import std.conv;

    string s;
    foreach (i; 0 .. 1_000_000) {
        s = i.to!string;
    }
}

void g() {
    import core.memory;
    import std.conv;

    GC.disable;
    string s;
    foreach (i; 0 .. 1_000_000) {
        s = i.to!string;
    }
}

extern (C) int snprintf(char*, size_t, const(char)*, ...);

void h() {
    char[10] s;
    foreach (i; 0 .. 1_000_000) {
        snprintf(s.ptr, s.length, "%d", i);
    }
}

void main(string[] args) {
    f;
    // g;  // uncomment one function at a time
    // h;
}
Note that for h I didn't really use a string but a char[10]. We
should keep in mind that you may want to convert it to a proper
string later.
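For reference, here is one way that conversion to a proper string
could look (a minimal sketch, not part of the benchmark above; note
that the idup brings back one GC allocation, which is exactly the
cost h avoids):

    import core.stdc.stdio : snprintf;

    void main() {
        char[12] buf;  // enough for any 32-bit int, sign and '\0' included
        const len = snprintf(buf.ptr, buf.length, "%d", -123456);
        // Slice the written part and copy it into GC memory as a string.
        string s = buf[0 .. len].idup;
        assert(s == "-123456");
    }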
I compiled the program three times (once per function) with
`dmd -profile -profile=gc` and ran it to get profiling data. I then
compiled it with plain `dmd` and timed each execution six times,
discarding the first run (which only serves to warm the cache). The
reason for the recompilation is that the C code can't be profiled
and profiling adds overhead, so I wanted that out of the way while
timing.
Here are the average times:
f: 0.50 user 0.00 system 95% CPU 0.530 total
g: 0.49 user 0.01 system 96% CPU 0.522 total
h: 0.17 user 0.00 system 92% CPU 0.188 total
The GC profiler didn't find any allocation, be it in f, g, or h.
Profiling output was only available for f and g, with very similar
results (they are really the same function); here is the
interesting part:
======== Timer Is 3579545 Ticks/Sec, Times are in Microsecs ========

  Num Calls    Tree Time    Func Time  Per Call  Function
    1000000    390439368    265416744       265  pure nothrow @safe char[] std.array.array!(std.conv.toChars!(10, char, 1, int).toChars(int).Result).array(std.conv.toChars!(10, char, 1, int).toChars(int).Result)
    1000000     83224994     83224994        83  pure nothrow ref @nogc @safe std.conv.toChars!(10, char, 1, int).toChars(int).Result std.conv.toChars!(10, char, 1, int).toChars(int).Result.__ctor(int)
   23666670   1182768507     73190160         3  _Dmain
    1000000    525732328     35191373        35  pure nothrow @trusted immutable(char)[] std.conv.toImpl!(immutable(char)[], int).toImpl(int, uint, std.ascii.LetterCase)
    5888890     43695659     33204064         5  pure nothrow ref @nogc @safe char std.conv.emplaceRef!(char, char).emplaceRef(ref char, ref char)
    1000000     32745909     32745909        32  pure nothrow char[] std.array.arrayAllocImpl!(false, char[], ulong).arrayAllocImpl(ulong)
    1000000    100101586     16876591        16  pure nothrow @nogc @safe std.conv.toChars!(10, char, 1, int).toChars(int).Result std.conv.toChars!(10, char, 1, int).toChars(int)
    5888890     13070014     13070014         2  pure nothrow @property @nogc @safe char std.conv.toChars!(10, char, 1, int).toChars(int).Result.front()
    1000000     44899444     12153535        12  pure nothrow @trusted char[] std.array.uninitializedArray!(char[], ulong).uninitializedArray(ulong)
    5888890     10491595     10491595         1  pure nothrow ref @nogc @safe char std.conv.emplaceImpl!(char).emplaceImpl!(char).emplaceImpl(ref char, ref char)
    6888890      9952971      9952971         1  pure nothrow @property @nogc @safe bool std.conv.toChars!(10, char, 1, int).toChars(int).Result.empty()
    1000000     53086148      8186704         8  pure nothrow @trusted char[] std.array.array!(std.conv.toChars!(10, char, 1, int).toChars(int).Result).array(std.conv.toChars!(10, char, 1, int).toChars(int).Result).__lambda2()
    1000000    530599849      4867521         4  pure nothrow @safe immutable(char)[] std.conv.toImpl!(immutable(char)[], int).toImpl(int)
    1000000    534801618      4201769         4  pure nothrow @safe immutable(char)[] std.conv.to!(immutable(char)[]).to!(int).to(int)
    5888890      2520677      2520677         0  pure nothrow @nogc @safe void std.conv.toChars!(10, char, 1, int).toChars(int).Result.popFront()
    1000000      2138919      2138919         2  pure nothrow @nogc @trusted char[] std.array.array!(std.conv.toChars!(10, char, 1, int).toChars(int).Result).array(std.conv.toChars!(10, char, 1, int).toChars(int).Result).__lambda3()
    1000000       558232       558232         0  pure nothrow @property @nogc @safe ulong std.conv.toChars!(10, char, 1, int).toChars(int).Result.length()
You may not be used to reading such a bare listing, so here are
some pointers:
- Num Calls is the number of calls made to that function
- Tree Time is the total cumulative time spent in the function and
  its subfunctions
- Func Time is the total time spent in the function itself
- Per Call is the average time spent in the function itself
The functions are sorted by their Func Time. As we can see, what
took the most time was converting the range of chars to an array in
order to store it in a string. This is a cost that we obviously
don't have in h.

It seems that using snprintf would be the fastest option here,
although one could replicate its behaviour in D. I don't think
there is anything in the standard library that would really help,
as (if I read it correctly) this code is slow mainly because of the
conversion from ranges to arrays.
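To give an idea of what replicating that behaviour in D could look
like, here is a rough allocation-free sketch (the name intToChars
is made up for illustration; this is not how std.conv actually
implements it):

    // Sketch: write the decimal form of x into buf, return the used slice.
    // No allocation and no GC work; the caller owns the buffer.
    char[] intToChars(int x, char[] buf) {
        size_t i = buf.length;
        const bool neg = x < 0;
        // Work with the negative magnitude so int.min doesn't overflow.
        int n = neg ? x : -x;
        do {
            buf[--i] = cast(char)('0' - (n % 10));
            n /= 10;
        } while (n != 0);
        if (neg)
            buf[--i] = '-';
        return buf[i .. $];
    }

    void main() {
        char[12] buf;  // enough for any 32-bit int, sign included
        assert(intToChars(42, buf) == "42");
        assert(intToChars(-7, buf) == "-7");
        assert(intToChars(0, buf) == "0");
    }

Like h, this reuses a stack buffer, so a .idup is still needed
whenever a proper string must outlive the buffer.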