Rosetta Commatizing numbers
Ivan Kazmenko via Digitalmars-d-learn
digitalmars-d-learn at puremagic.com
Tue May 30 21:31:14 PDT 2017
On Tuesday, 30 May 2017 at 10:54:49 UTC, Solomon E wrote:
> I ran into a Rosetta code solution in D that had obvious
> errors. It's like the author or the previous editor wasn't even
> trying to do it right, like a protest against how many detailed
> rules the task had. I assumed that's not the way we want to do
> things in D.
> ...
> Does anyone have any thoughts about this? Did I do right by D?
I'd say the previous version (by bearophile) suited the task much
better, but neither is perfect.
As a general note, consider the following paragraph of the
problem statement:
"Some of the commatizing rules (specified below) are arbitrary,
but they'll be a part of this task requirements, if only to make
the results consistent amongst national preferences and other
disciplines."
This literally means that, while there are complex rules in the
real world for commatizing numbers, the problem is kept simple by
enforcing strict rules. The minute concerns of the Real World,
like "Current New Zealand dollar format overrides old Zimbabwe
dollar format", are irrelevant to the formal problem being
solved. Perhaps the example inputs section ("Strings to be used
as a minimum") is misleading, but that's what they are:
examples, not general rules. By the way, as it's a wiki page,
the problem statement text could also be improved ;) .
Why? For example, look at the Indian numbering system, where
commatizing is visibly different
(https://en.wikipedia.org/wiki/Indian_numbering_system) - and we
don't know whether the string should use it or not without the
context. Or consider that hexadecimal numbers are usually split
in groups of four digits, not three - and we don't know whether a
[0-9]+ number is decimal or hexadecimal without the context.
See, trying to provide an ultimate solution to real-world
commatizing, while keeping it a single function without the
context, can't possibly succeed.
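To make the context dependence concrete, here is a minimal sketch
in D. The helper groupDigits and its parameters are hypothetical
names used only for illustration; they are not taken from either
Rosetta Code solution. The same kind of digit string gets grouped
differently depending on what the caller knows about it:

```d
import std.stdio : writeln;

/// Hypothetical helper: insert `sep` between groups of `step`
/// characters, counting from the right end of `digits`.
string groupDigits(string digits, size_t step = 3, string sep = ",")
{
    assert(step > 0);
    string result;
    foreach (i, c; digits)
    {
        // Characters remaining, including the current one.
        immutable remaining = digits.length - i;
        if (i > 0 && remaining % step == 0)
            result ~= sep;
        result ~= c;
    }
    return result;
}

void main()
{
    writeln(groupDigits("1234567"));           // 1,234,567
    writeln(groupDigits("1234567", 4, " "));   // 123 4567
    writeln(groupDigits("DEADBEEF", 4, "_"));  // DEAD_BEEF
}
```

Even this parameterized helper cannot produce Indian-style
grouping (12,34,567), where the last three digits form one group
and the rest are split in twos. That needs a different rule, and
only the caller's context can tell which rule applies.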
What can be done, then? Well, the page authors already did the
difficult part for us: they extracted the essence of a complex
real-world problem into a small set of formal rules, which are
now the formal problem statement. Now comes the easy part: to do
exactly what is asked in the problem statement. The flexibility
comes from having function parameters. If we have a solution to
a formal problem, using it for the real-world version of the
problem is either just specifying the right parameters
(hopefully), or changing the function if the real world gets too
complex for it. In the latter case, the shorter and more readable
the existing solution is, the faster we can change the function
to suit our real-world case.
-----
Now, where is the old version wrong? Turns out it just calls the
function with default parameters for every line of input - which
is wrong since the first two input lines need to be handled
specially. Well, that's what the function parameters are for.
To have a correct solution, we have to use custom parameters for
the first two lines of input. The function itself is fine.
Your solution addresses this problem by special-casing the inputs
inside the function, perhaps because of the misleading inputs
section in the problem statement. That's a wrong approach.
First, it introduces magic numbers 33 and 36 into the code, which
is a bad programming practice (see here:
https://en.wikipedia.org/wiki/Magic_number_(programming)#Unnamed_numerical_constants).
Second, it's plain wrong. According to the problem statement, we
don't have these rules for every possible line of >33 standalone
decimals, or >36 characters in total. We just have to call our
function with one concrete set of custom parameters for one
concrete example, and another set of parameters for another
example. That's to demonstrate that our function accepts and
makes proper use of custom parameters! Special-casing example
inputs inside the function is not a solution: if we go down this
path, the perfect solution would be a bunch of "if" statements
for every possible example input producing the respective example
outputs, and an empty function for all other possible inputs.
So, how do we call with special parameters? Currently, we can
look at every other language except C# as inspiration: ALGOL 68,
J, Java, Perl 6, Phix, Racket, and REXX. Your solution also has
a good way to check example inputs: a unittest block. It even
shows one of D's strengths compared to other languages. And
there, you do use custom parameters to check that the function
works. A good approach would be to put all the examples in the
unittest instead of reading them from a file. This way, the
program will be immediately usable and runnable: no need to
create an additional arbitrarily-named file just to test it.
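As a rough sketch of that approach, the examples and their custom
parameters can live together in a unittest block. The commatize
below is deliberately simplified and is an assumption for
illustration, not bearophile's or Solomon's code: it only groups
the first run of digits that starts with a non-zero digit at or
after `start`, and ignores the task's other rules (fractional
parts, for instance). The parameter names are hypothetical too.

```d
import std.ascii : isDigit;

/// Simplified sketch, NOT the full task rules: group the first run
/// of digits that starts with a non-zero digit at or after `start`.
string commatize(string text, size_t start = 0, size_t step = 3,
                 string sep = ",")
{
    assert(step > 0);

    // Find the first non-zero digit at or after `start`.
    size_t begin = start;
    while (begin < text.length
           && !(isDigit(text[begin]) && text[begin] != '0'))
        ++begin;
    if (begin >= text.length)
        return text;  // nothing to commatize

    // Extend to the end of that run of digits.
    size_t end = begin;
    while (end < text.length && isDigit(text[end]))
        ++end;

    // Rebuild the run, inserting `sep` every `step` digits,
    // counted from the right end of the run.
    string grouped;
    foreach (i; begin .. end)
    {
        if (i > begin && (end - i) % step == 0)
            grouped ~= sep;
        grouped ~= text[i];
    }
    return text[0 .. begin] ~ grouped ~ text[end .. $];
}

unittest
{
    // Default parameters.
    assert(commatize("1234567") == "1,234,567");
    assert(commatize("James was never known as 0000000007")
           == "James was never known as 0000000007");
    assert(commatize("no digits here") == "no digits here");

    // Custom parameters belong to the call site, not to the function.
    assert(commatize("The author has two Z$100000000000000 "
                     ~ "Zimbabwe notes (100 trillion).", 0, 3, ".")
           == "The author has two Z$100.000.000.000.000 "
              ~ "Zimbabwe notes (100 trillion).");
    assert(commatize("123456 and 7890123", 7, 3, " ")
           == "123456 and 7 890 123");
}

void main() {}
```

Saved under some file name, say commatize.d, this should run with
"dmd -unittest -run commatize.d": silence means every example
passed, and no input file is needed.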
-----
All in all, the only thing I'd change in bearophile's solution is
to remove the file reading loop, add the unittest block from your
solution instead, and place all the examples there. Printing the
result does not seem imperative on Rosettacode, and there are at
least some entries in D which already use unittest for checking
the problem requirements (for example,
https://rosettacode.org/wiki/Sorting_algorithms/Cocktail_sort#D).
Lastly, please note that Rosettacode supports multiple versions
in a single language (example:
http://rosettacode.org/wiki/99_Bottles_of_Beer#D). As
bearophile's version certainly has its merits, I strongly suggest
keeping it available, either merged with your current version to
produce the right solution, or as a second version.
Ivan Kazmenko.