A little story

Sun Jun 24 14:52:35 PDT 2012

Just a recent little coding story.

Python is a good language for "exploratory programming", because 
the programs are succinct, it's flexible, its data structures are 
easy to use and good, there are already written modules to do 
most things, there are very easy to use libraries to plot results 
or import data and so on.

Doing this kind of programing I have written a small 
single-module command line Python program, just few hundred lines 
long. Once debugged it seemed to work, but only up to N=18, 
beyond that it uses too much memory and CPU time for me.

To solve the performance problem I've ported it to D. This 
translation didn't took a lot of time, and the resulting 
single-module D program was just 10-15% longer than the Python 
module. But it didn't work, the results were different from the 
Python veersion. I have spent several hours to find the bug, that 
was a nasty D associative array bug already present in Bugzilla.

This was not nice, but once fixed that, it gave the same results 
as the Python version from N=1 to N=18. Now the program is tens 
of times faster and I'm able to solve for N=19 and even N=20, 
good. Being this exploratory programming I don't know the correct 
results for those N=19 and N=20. I am only able to estimate the 
correct results and the estimate is compatible with the results 
given by the D program. This is good, but spending some hours to 
look for a bug has made me suspicious of the D program, maybe it 
was buggy still.

So I translate the Python program to FreePascal (a modern free 
Object Pascal, similar to Delphi). This translation wasn't hard 
to do, I have used one library of mine that implements 
associative arrays in FreePascal, but it was a little slow and 
boring, and the resulting program was long.

When I run the FreePascal program it gives the same results from 
N=1 to N=18 as the Python version, this is good. But with N=19 it 
gave a run-time integer overflow error. Up to N=18 a certain 32 
bit int variable was enough, but right with N=19 the program 
tried to store inside it a value past 2 billions. A compilation 
switch of FreePascal allows to turn run-time overflows of all 
integral values used by the program into run-time errors.

I have replaced that 32 bit int with a 64 bit int in the 
FreePascal code. Running again the FreePascal program again I 
have found another integral overflow error. If I replace that 
variable too with a 64 bit int, the program runs and it gives a 
number slightly different from the number given by the D code. If 
I compile and run the original FreePascal code without the 
run-time overflow tests it gives the same results as the D 
version for N=19 and N=20.

I can study the D code, looking for variables that cause 
overflow, but this study requires time, because the program is 
small, but its algorithms are intricate.

So for this program born from explorative programming that does 
some integral number crunching:
- Python language is not fit because it's too much slow and 
because in certain cases I prefer a little stronger static type 
safety, that's useful to not waste time debugging the usage of 
intricate data structures.
- FreePascal is not fit because it's not flexible enough and 
because it's too much low-level for a exploration that must be 
quick.
- And D is too much unsafe for such kind of programs, because 
integral numbers can silently overflow.

Maybe C# running on Mono is a good enough language for those 
purposes (it's probably fast enough despite not running on the 
dotnet and it has structs to save memory, it detects run-time 
integral overflows, it has data structures and maybe it's 
flexible enough).

Bye,
bearophile