[Issue 9646] New: std.algorithm.splitter for strings has opportunities for improvement
d-bugmail at puremagic.com
d-bugmail at puremagic.com
Mon Mar 4 11:10:43 PST 2013
http://d.puremagic.com/issues/show_bug.cgi?id=9646
Summary: std.algorithm.splitter for strings has opportunities
for improvement
Product: D
Version: D2
Platform: All
OS/Version: All
Status: NEW
Severity: enhancement
Priority: P2
Component: Phobos
AssignedTo: nobody at puremagic.com
ReportedBy: schveiguy at yahoo.com
--- Comment #0 from Steven Schveighoffer <schveiguy at yahoo.com> 2013-03-04 11:10:42 PST ---
Based on the way std.algorithm.splitter is implemented, it should perform at
least as well as a simple hand-written range that does a brute force search for
separators.
However, with certain test data, and a simple range, it does not perform as
well.
An example (take from a post on the newsgroups), along with a simple MySplitter
type that can split on any substring. I have not done any in-depth research as
to why this works better:
import std.stdio;
import std.datetime;
import std.algorithm;
struct MySplitter
{
private string s;
private string separator;
private string source;
this(string src, string sep)
{
source = src;
separator = sep;
popFront();
}
@property string front()
{
return s;
}
@property bool empty()
{
return s.ptr == null;
}
void popFront()
{
if(!source.length)
{
s = source;
source = null;
}
else
{
size_t i = 0;
bool found = false;
outer:
for(; i + separator.length <= source.length; i++)
{
if(source[i] == separator[0])
{
found = true;
for(size_t j = i+1, k=1; k < separator.length; ++j, ++k)
if(source[j] != separator[k])
{
found = false;
continue outer;
}
break;
}
}
s = source[0..i];
if(found)
source = source[i + separator.length..$];
else
source = source[$..$];
}
}
}
auto message = "REGISTER sip:example.com SIP/2.0\r
Content-Length: 0\r
Contact:
<sip:12345 at 10.1.3.114:59788;transport=tls>;expires=4294967295;events=\"message-summaryq\";q=0.9\r
To: <sip:12345 at comm.example.com>\r
User-Agent: (\"VENDOR=MyCompany\" \"My User Agent\")\r
Max-Forwards: 70\r
CSeq: 1 REGISTER\r
Via: SIP/2.0/TLS
10.1.3.114:59788;branch=z9hG4bK2910497772630690\r
Call-ID: 2910497622026445\r
From: <sip:12345 at comm.example.com>;tag=2910497618150713\r\n\r\n";
void main()
{
enum iterations = 5_000_000;
auto t1 = Clock.currTime();
foreach(i; 0..iterations)
{
foreach(notused; splitter(message, "\r\n"))
{
if(!i)
writeln(notused);
}
}
writefln("std.algorithm.splitter took %s", Clock.currTime()-t1);
t1 = Clock.currTime();
foreach(i; 0..iterations)
{
foreach(notused; MySplitter(message, "\r\n"))
{
if(!i)
writeln(notused);
}
}
writefln("MySplitter took %s", Clock.currTime()-t1);
}
result (macbook pro 2.2GHz i7):
REGISTER sip:example.com SIP/2.0
Content-Length: 0
Contact:
<sip:12345 at 10.1.3.114:59788;transport=tls>;expires=4294967295;events="message-summaryq";q=0.9
To: <sip:12345 at comm.example.com>
User-Agent: ("VENDOR=MyCompany" "My User Agent")
Max-Forwards: 70
CSeq: 1 REGISTER
Via: SIP/2.0/TLS
10.1.3.114:59788;branch=z9hG4bK2910497772630690
Call-ID: 2910497622026445
From: <sip:12345 at comm.example.com>;tag=2910497618150713
std.algorithm.splitter took 5 secs, 197 ms, and 89 μs
REGISTER sip:example.com SIP/2.0
Content-Length: 0
Contact:
<sip:12345 at 10.1.3.114:59788;transport=tls>;expires=4294967295;events="message-summaryq";q=0.9
To: <sip:12345 at comm.example.com>
User-Agent: ("VENDOR=MyCompany" "My User Agent")
Max-Forwards: 70
CSeq: 1 REGISTER
Via: SIP/2.0/TLS
10.1.3.114:59788;branch=z9hG4bK2910497772630690
Call-ID: 2910497622026445
From: <sip:12345 at comm.example.com>;tag=2910497618150713
MySplitter took 3 secs, 668 ms, and 714 μs
--
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
More information about the Digitalmars-d-bugs
mailing list