optimized array operations

Denis Koroskin 2korden at gmail.com
Mon Sep 22 10:15:15 PDT 2008


On Mon, 22 Sep 2008 20:33:40 +0400, Eugene Pelekhay <pelekhay at gmail.com>  
wrote:

> bearophile Wrote:
>
>> Eugene Pelekhay:
>> > I've finished an optimized version of array operations, using SSE2
>> instructions.
>>
>> Where are the benchmarks to compare the performance of the old version  
>> with the new one?
>
> It's in the benchmark() function; the old implementations begin with an underscore.
>

I renamed benchmark() to main() and ran the tests with -release -inline.
Well done! According to the results, your implementation does indeed perform
up to 2 times better than the built-in version. Here they are:

length=1
_add Time elapsed: 280.319/0 ticks OK
_sub Time elapsed: 256.594/0 ticks OK
add Time elapsed: 214.942/0 ticks OK
sub Time elapsed: 154.829/0 ticks OK
mul Time elapsed: 150.332/0 ticks OK
div Time elapsed: 157.477/0 ticks OK
length=2
_add Time elapsed: 156.033/151.279 ticks OK
_sub Time elapsed: 156.565/152.039 ticks OK
add Time elapsed: 150.95/157.855 ticks OK
sub Time elapsed: 156.006/159.271 ticks OK
mul Time elapsed: 154.156/158.504 ticks OK
div Time elapsed: 195.399/187.488 ticks OK
length=3
_add Time elapsed: 171.777/171.285 ticks OK
_sub Time elapsed: 172.744/173.135 ticks OK
add Time elapsed: 158.411/171.716 ticks OK
sub Time elapsed: 158.756/167.616 ticks OK
mul Time elapsed: 159.1/168.461 ticks OK
div Time elapsed: 236.523/230.023 ticks OK
length=4
_add Time elapsed: 187.279/187.507 ticks OK
_sub Time elapsed: 187.64/186.197 ticks OK
add Time elapsed: 158.703/173.175 ticks OK
sub Time elapsed: 157.075/174.885 ticks OK
mul Time elapsed: 160.297/174.931 ticks OK
div Time elapsed: 275.418/262.012 ticks OK
length=5
_add Time elapsed: 204.594/200.33 ticks OK
_sub Time elapsed: 203.379/207.323 ticks OK
add Time elapsed: 163.637/183.095 ticks OK
sub Time elapsed: 164.883/179.372 ticks OK
mul Time elapsed: 166.262/178.485 ticks OK
div Time elapsed: 317.013/317.882 ticks OK
length=6
_add Time elapsed: 209.717/210.563 ticks OK
_sub Time elapsed: 209.713/211.452 ticks OK
add Time elapsed: 163.643/181.725 ticks OK
sub Time elapsed: 164.512/187.003 ticks OK
mul Time elapsed: 162.201/185.62 ticks OK
div Time elapsed: 352.298/340.421 ticks OK
length=7
_add Time elapsed: 238.152/232.379 ticks OK
_sub Time elapsed: 232.444/226.117 ticks OK
add Time elapsed: 167.371/190.912 ticks OK
sub Time elapsed: 164.427/188.946 ticks OK
mul Time elapsed: 163.788/189.1 ticks OK
div Time elapsed: 385.258/396.481 ticks OK
length=8
_add Time elapsed: 235.129/230.914 ticks OK
_sub Time elapsed: 184.234/229.212 ticks OK
add Time elapsed: 165.899/192.069 ticks OK
sub Time elapsed: 167.237/191.622 ticks OK
mul Time elapsed: 167.428/196.7 ticks OK
div Time elapsed: 452.226/439.266 ticks OK
length=9
_add Time elapsed: 242.522/242.835 ticks OK
_sub Time elapsed: 198.431/186.926 ticks OK
add Time elapsed: 178.967/202.155 ticks OK
sub Time elapsed: 173.741/197.983 ticks OK
mul Time elapsed: 176.276/211.608 ticks OK
div Time elapsed: 457.407/479.647 ticks OK
length=10
_add Time elapsed: 260.412/257.576 ticks OK
_sub Time elapsed: 207.76/206.046 ticks OK
add Time elapsed: 169.342/202.356 ticks OK
sub Time elapsed: 169.686/201.182 ticks OK
mul Time elapsed: 168.23/201.069 ticks OK
div Time elapsed: 504.553/499.578 ticks OK
length=11
_add Time elapsed: 270.73/266.472 ticks OK
_sub Time elapsed: 226.691/221.545 ticks OK
add Time elapsed: 171.588/212.554 ticks OK
sub Time elapsed: 172.838/213.734 ticks OK
mul Time elapsed: 177.953/223.758 ticks OK
div Time elapsed: 525.984/539.466 ticks OK
length=12
_add Time elapsed: 302.982/305.531 ticks OK
_sub Time elapsed: 247.649/236.579 ticks OK
add Time elapsed: 175.799/218.129 ticks OK
sub Time elapsed: 174.062/217.84 ticks OK
mul Time elapsed: 175.902/221.012 ticks OK
div Time elapsed: 573.274/560.699 ticks OK
length=13
_add Time elapsed: 313.339/311.964 ticks OK
_sub Time elapsed: 250.683/242.214 ticks OK
add Time elapsed: 181.252/219.131 ticks OK
sub Time elapsed: 182.489/220.372 ticks OK
mul Time elapsed: 177.835/227.545 ticks OK
div Time elapsed: 598.186/618.626 ticks OK
length=14
_add Time elapsed: 326.999/320.797 ticks OK
_sub Time elapsed: 264.637/257.676 ticks OK
add Time elapsed: 173.004/226.604 ticks OK
sub Time elapsed: 176.729/226.711 ticks OK
mul Time elapsed: 179.686/234.603 ticks OK
div Time elapsed: 649.893/640.516 ticks OK
length=15
_add Time elapsed: 364.665/354.617 ticks OK
_sub Time elapsed: 274.117/266.381 ticks OK
add Time elapsed: 179.939/233.329 ticks OK
sub Time elapsed: 179.812/233.004 ticks OK
mul Time elapsed: 179.694/235.165 ticks OK
div Time elapsed: 673.612/699.527 ticks OK
length=16
_add Time elapsed: 223.35/327.695 ticks OK
_sub Time elapsed: 227.287/272.908 ticks OK
add Time elapsed: 185.393/236.471 ticks OK
sub Time elapsed: 182.69/236.823 ticks OK
mul Time elapsed: 186.721/243.525 ticks OK
div Time elapsed: 717.472/725.396 ticks OK
length=17
_add Time elapsed: 248.901/230.275 ticks OK
_sub Time elapsed: 247.695/229.207 ticks OK
add Time elapsed: 191.686/253.113 ticks OK
sub Time elapsed: 192.252/255.335 ticks OK
mul Time elapsed: 197.645/255.323 ticks OK
div Time elapsed: 757.526/783.176 ticks OK
length=18
_add Time elapsed: 256.292/251.377 ticks OK
_sub Time elapsed: 283.259/272.567 ticks OK
add Time elapsed: 239.513/350.044 ticks OK
sub Time elapsed: 230.761/312.844 ticks OK
mul Time elapsed: 216.002/295.263 ticks OK
div Time elapsed: 835.063/882.578 ticks OK
length=19
_add Time elapsed: 269.324/266.079 ticks OK
_sub Time elapsed: 268.632/260.185 ticks OK
add Time elapsed: 202.924/277.364 ticks OK
sub Time elapsed: 204.761/282.057 ticks OK
mul Time elapsed: 212.51/280.277 ticks OK
div Time elapsed: 859.039/862.403 ticks OK
length=20
_add Time elapsed: 491.68/478.187 ticks OK
_sub Time elapsed: 302.505/292.327 ticks OK
add Time elapsed: 251.343/365.493 ticks OK
sub Time elapsed: 253.363/364.301 ticks OK
mul Time elapsed: 283.059/409.972 ticks OK
div Time elapsed: 1000.98/1085.22 ticks OK
length=21
_add Time elapsed: 503.658/492.33 ticks OK
_sub Time elapsed: 518.409/509.397 ticks OK
add Time elapsed: 288.601/403.432 ticks OK
sub Time elapsed: 253.521/355.318 ticks OK
mul Time elapsed: 247.759/342.146 ticks OK
div Time elapsed: 1070.34/1187.87 ticks OK
length=22
_add Time elapsed: 504.76/489.249 ticks OK
_sub Time elapsed: 511.489/493.343 ticks OK
add Time elapsed: 287.776/419.468 ticks OK
sub Time elapsed: 290.898/428.521 ticks OK
mul Time elapsed: 350.474/504.08 ticks OK
div Time elapsed: 1023.04/1086.01 ticks OK
length=23
_add Time elapsed: 364.767/356.623 ticks OK
_sub Time elapsed: 437.096/440.046 ticks OK
add Time elapsed: 239.832/351.762 ticks OK
sub Time elapsed: 215.47/304.019 ticks OK
mul Time elapsed: 234.817/332.294 ticks OK
div Time elapsed: 978.904/995.593 ticks OK
length=24
_add Time elapsed: 331.951/333.321 ticks OK
_sub Time elapsed: 273.76/333.263 ticks OK
add Time elapsed: 205.865/305.647 ticks OK
sub Time elapsed: 205.496/299.6 ticks OK
mul Time elapsed: 208.869/308.821 ticks OK
div Time elapsed: 1049.6/1086.68 ticks OK
length=25
_add Time elapsed: 351.851/349.097 ticks OK
_sub Time elapsed: 302.465/285.624 ticks OK
add Time elapsed: 218.116/306.209 ticks OK
sub Time elapsed: 219.07/311.153 ticks OK
mul Time elapsed: 220.504/305.367 ticks OK
div Time elapsed: 1053.65/1094.66 ticks OK
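
To make the "up to 2 times" figure easier to check, here is a minimal sketch that computes the ratio between the built-in functions (the names with a leading underscore) and the new ones. The sample lines are copied from the length=15 results above. The names SAMPLE, parse() and speedups() are made up for this illustration. The output does not say what the two slash-separated figures mean, so the sketch assumes the first one is the one to compare and ignores the second.

```python
import re

# Lines copied verbatim from the length=15 results above.
SAMPLE = """\
_add Time elapsed: 364.665/354.617 ticks OK
_sub Time elapsed: 274.117/266.381 ticks OK
add Time elapsed: 179.939/233.329 ticks OK
sub Time elapsed: 179.812/233.004 ticks OK
"""

# name, first tick figure, second tick figure
LINE = re.compile(r"^(\w+) Time elapsed: ([\d.]+)/([\d.]+) ticks OK$")


def parse(text):
    """Map each function name to its (first, second) tick figures."""
    timings = {}
    for line in text.splitlines():
        m = LINE.match(line.strip())
        if m:
            timings[m.group(1)] = (float(m.group(2)), float(m.group(3)))
    return timings


def speedups(timings):
    """Built-in (_name) time divided by new (name) time.

    Assumption: only the first of the two figures is compared.
    """
    return {
        name: timings["_" + name][0] / ticks[0]
        for name, ticks in timings.items()
        if not name.startswith("_") and "_" + name in timings
    }


if __name__ == "__main__":
    for name, ratio in speedups(parse(SAMPLE)).items():
        print(f"{name}: {ratio:.2f}x")
```

For the length=15 sample this gives roughly 2.03x for add and 1.52x for sub. There are no built-in counterparts for mul and div in the results, so no ratio can be computed for them.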


More information about the Digitalmars-d-announce mailing list