[phobos] datetime review

Andrei Alexandrescu andrei at erdani.com
Sat Oct 9 12:27:26 PDT 2010


On 10/8/10 16:04 CDT, Jonathan M Davis wrote:
> Okay. Here is my current datetime code for review.

Solid work. I'm on vacation and I gave it a thorough read last night. 
David's parallelism library would have come first, but that's not 
vacation reading...

A word of caution. I've been on the boost mailing list for a good while 
and they have the best review process ever. The result is invariably 
twofold: (a) a very good quality library; (b) a suicidal-depressive 
library author.

You need to expect to take quite a few hits from me and others, but 
please don't take it personally. Also, a word for the other reviewers: 
it's much easier (and therefore more luring) to bury a submission for 
its weaknesses instead of suggesting concrete steps to make it stronger.

High-level thoughts
===================

* At 40KLOC, the library is probably on a par with the first Netscape 
browser. This is a bit extravagant for a date and time library, and last 
thing we want is Phobos to become a collection of elaborate trivialities 
(I'm not saying date/time is a trivial matter, but of the 40KLOC there 
are probably only a couple thousand that do actual work, of which a 
couple hundred that do nontrivial work). We should look into ways to 
reduce its sheer size, and, most importantly, the size of its 
user-visible API. (More concrete suggestions to follow.)

* The library has a fundamental inconsistency: it can't decide whether 
it goes with sheer "long" representation for various durations (days, 
months, hnseconds etc.) versus typed representations (MonthDuration, 
HNDuration, JointDuration). This hurts everybody everywhere: the library 
spends real estate to implement durations, and then the API explodes 
because it doesn't use them so it needs to be explicit about measurement 
units. For example, we have a hecatomb of addXxx functions: addDays, 
addWeeks, addThis, addThat, addTheOther. The design should have 
exploited typed durations and support += for durations. Then you write d 
+= add(weeks(3)); and so on.

* Speaking of which, we come back at Walter's point: should we go with 
"long" for representing various durations and then make the meaning 
clear with function names, OR should we go with explicit strong types 
for times and durations? It's a good question, and arguments could be 
made for either case. Using "long" would probably trim away 60% of the 
code size. On the other hand, if we look at successful date and time 
libraries, we see that they do tend to be elaborate about granular 
types. So my suggestion is to not mess with success and go with 
elaborate typing. BUT I do want to cut a good fraction of sheer 
repetition by use of the language. AND I also want to not be able to 
find "long" in the library except in e.g. multiplication of a duration 
with a scalar and in the private definition of the typed abstractions.

* This brings me to one beef I have: the library is repetitive and 
tedious. This is because it makes very little use of generative and 
generic capabilities. Most of it could have been written in C with 
structs and functions. I'll follow up with concrete suggestions.

* The library uses at least one actively unrecommended technique (global 
alias for built-ins) and one questionable or at least 
not-yet-widely-used technique (use of a class as a namespace). 
Explanations follow.

Details
=======

* core.d is by far the weakest part of the library. In fact, it being 
the first one I've read, I got biased early on against the library. I'm 
very glad I continued to read through.

* Global aliases of the form

alias short  Year;              /// Unit for holding a Gregorian year.

are a poor technique in C, C++, and D. I've read articles and rants 
about it in several places. In brief, the problem is that such aliases 
lure their user into thinking they are genuine types with the usual 
safeguards that types offer (such as disallowing illegal operations 
during compilation time and/or run time).

DayOfMonth dom;
Month m;
...
m += dom; // won't compile I guess?!?
dom = 32; // throw I suppose?!?

Recommendation: eliminate these. At best make them private and give them 
the "Rep" (representation) suffix. But I suspect you don't even need them.

* This enum:

enum Month : MonthOfYear { jan = 1, /// January.
                            feb,     /// February.
                            mar,     /// March.
                            apr,     /// April.
                            may,     /// May.
                            jun,     /// June.
                            jul,     /// July.
                            aug,     /// August.
                            sep,     /// September.
                            oct,     /// October.
                            nov,     /// November.
                            dec      /// December.
                          }

The doc comments are not needed. Just put "/// Month" for the first and 
"/// Ditto" for all others.

* Similar remark about comments to:

enum DayOfWeek : ushort { sunday = 0, /// Sunday.
                           monday,     /// Monday.
                           tuesday,    /// Tuesday.
                           wednesday,  /// Wednesday.
                           thursday,   /// Thursday.
                           friday,     /// Friday.
                           saturday    /// Saturday.
                         }

The names are self-explanatory, no need to repeat them! I also notice 
that (a) DayOfWeek uses ushort directly (why not ubyte?) instead of 
going through a global alias (like Month) that I'd recommend you remove 
anyway.

* These types suggest to me that we need to finally implement a Bounded 
type such that Bounded!(ubyte, 1, 7) would be a good representation for 
a day of week. We discussed Bounded a fair amount in the newsgroup a 
while ago. I think that could save a lot of code down the road (e.g. in 
duration.d), in addition to being a good library artifact of its own.

* This enum and its kin:

enum AllowDayOverflow
{
     /// Yes, allow day overflow.
     yes,

     /// No, don't allow day overflow.
     no
}

should be like this:

enum AllowDayOverflow
{
     /// No, don't allow day overflow.
     no,
     /// Yes, allow day overflow.
     yes
}

such that code can write:

void fun(AllowDayOverflow allowDayOverflow) {
    ...
    if (allowDayOverflow) ...
    ...
    if (!allowDayOverflow) ...
}

Right now writing such works but with the reverse effect.

* "this(string msg, string file = __FILE__, size_t line = __LINE__, 
Throwable next = null) nothrow" -> the __FILE__ and __LINE__ are useless 
(they are yours, not your caller's). You need to make them template 
parameters to work, see other places in Phobos. At least that's what I 
recall. Walter?

* "final class TUnitConv" used as a namespace. Can't we organize things 
such that we introduce few enough names for such namespaces to be unneeded?

* "static long to(TUnit tuFrom, TUnit tuTo)(long value)" is a spark of 
brilliance that I'd love to see a lot more of. It's essentially the 
difference between reading and memorizing 90 function names versus 
memorizing one function name and an enum with 10 values. That being 
said, the rest of the code sadly doesn't transform this spark into a 
full-blown explosion - it goes right back to triviality such as e.g. 
hnsecsToDays. Those are hardly needed in the implementation (use 
template constraints) and DEFINITELY not needed in the documentation. 
They just pollute it.

Suggestions: (a) throw TUnitConv away; (b) define this and friends at 
global scope:

long convert(TUnit from, TUnit to)(long);

That being said, convert() does go against the grain of elaborate typing 
for durations. I'd accept it as a low-level function, but really the 
library currently fosters elaborate types for durations... so there's a 
disconnect here.

* To be specific: throw away yearsToMonths et al. No need for them.

* I suggest even experimenting with strings instead of TUnit. Really 
TUnit is a vestige from C++ and Java, but we do accept string template 
parameters so how about

assert(convert!("years", "months")(2) == 24);

etc. Opinions?

* All string functions in core.d make me nervous because they hint to no 
support for locales, and are very pedestrian as they go. I'd prefer at 
least some sort of table with names for "year", "years" etc. in 
conjunction with reusable code, instead of rigid, nonreusable code with 
inline strings. Same about monthOfYearToString etc.

* validXxx and validXxxOfYyy is in desperate need for a template 
approach a la to/convert. Virtually anything that encodes types in 
function names is such a candidate.

* As another random example:

DayOfWeek getDayOfWeek(DayOfGregorianCal day) pure nothrow

could/should be (in client use):

get!(TUnit.dayOfWeek, TUnit.dayOfGregorianCal)(day);

* Generally, it's clear from above why I was displeased with core.d. At 
1952 lines, it's a lot of pain for very little gain. It encodes date 
artifacts in function names leading to an explosion of API functions. It 
uses too little generic mechanisms to reduce both API size and 
implementation size.

* On to duration.d. It's simple: there's no need for three durations. 
Rename JointDuration to Duration and throw everything else. I'd find 
myself hard pressed to produce situations in which efficiency 
considerations make it impossible to cope with a 128-bit representation.

* If you do want to go fine-grained durations without the aggravation, 
use templates. Then you can have Duration!HNSeconds, Duration!Seconds, 
Duration!Months, whatever you want, without inflicting pain on yourself 
and others. If you play your cards right, such durations would match 
100% the templates you define in core.d and elsewhere so you don't need 
to make a lot of implementation effort. BUT again I think one duration 
is enough. Maximum two - short duration and duration.

* timezone.d looks okay.

* At 23KLOC, timepoint.d better had a walkOnWater() function in there.

* currLocalTime, currUTC, currOtherTZ should be 
current!(TUnit.localTime), current!(TUnit.utc) - bringing again API 
consolidation into discussion.

(need to finish this just about here)

* other.d - bad name. Like Walter, I think the physical design should be 
one .di file and one or more .d files.

* Duration and Interval should be consolidated.


Andrei


More information about the phobos mailing list