Help optimize D solution to phone encoding problem: extremely slow performance.

Renato renato at athaydes.com
Wed Jan 17 10:24:31 UTC 2024


On Wednesday, 17 January 2024 at 09:15:02 UTC, evilrat wrote:
> On Wednesday, 17 January 2024 at 07:11:02 UTC, Renato wrote:
>>
>> If you want to check your performance, you know you can run 
>> the `./benchmark.sh` yourself?
>
> Out of curiosity I've tried to manually run this on Windows and 
> it seems that Java generator for these numbers files is 
> "broken", the resulting count or print runs fine for both Java 
> and D versions provided in your D branch, but fails with 
> generated files.
>
> D version complains about bad utf8 encoding.
> I've opened the generated file in text editor and it is UTF-16 
> (little-endian with BOM).
>
> Tried with adoptium jdk 17 and 21 (former openjdk), but I guess 
> it doesn't matter since UTF-16 is default on Windows.

It's not Java writing the file, it's the bash script 
[`benchmark.sh`](https://github.com/renatoathaydes/prechelt-phone-number-encoding/blob/master/benchmark.sh#L31):

```
java -cp "build/util" util.GeneratePhoneNumbers 1000 > phones_1000.txt
```

Java is just printing to stdout. I wasn't aware that redirecting 
output to a file like this would use the OS default encoding. 
Unfortunately I don't have Windows here, but maybe try changing 
the encoding used by the bash script?
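
As a possible workaround, something like the sketch below could check 
the generated file for a UTF-16 little-endian BOM (bytes `ff fe`) and 
convert it to UTF-8 before running the benchmark. This is untested on 
Windows: the function names are made up for illustration and are not 
part of `benchmark.sh`, and it assumes `head`, `od`, `tr` and `iconv` 
are available in the shell being used.

```shell
# Hypothetical helpers, not part of benchmark.sh.

# Print the first two bytes of a file as hex digits,
# e.g. "fffe" when the file starts with a UTF-16LE BOM.
first_two_bytes() {
  head -c 2 "$1" | od -An -tx1 | tr -d ' \n'
}

# Convert the file to UTF-8 in place if it starts with a UTF-16LE BOM.
# Files without that BOM are left untouched.
to_utf8_if_utf16le() {
  if [ "$(first_two_bytes "$1")" = "fffe" ]; then
    # iconv reads the BOM to pick the byte order and drops it on output.
    iconv -f UTF-16 -t UTF-8 "$1" > "$1.tmp" && mv "$1.tmp" "$1"
  fi
}
```

After generating the file, calling `to_utf8_if_utf16le phones_1000.txt` 
should leave a file that the D version can read, if the encoding really 
is the only problem.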

