[phobos] datetime review
Andrei Alexandrescu
andrei at erdani.com
Sat Oct 9 12:27:26 PDT 2010
On 10/8/10 16:04 CDT, Jonathan M Davis wrote:
> Okay. Here is my current datetime code for review.
Solid work. I'm on vacation and I gave it a thorough read last night.
David's parallelism library would have come first, but that's not
vacation reading...
A word of caution. I've been on the boost mailing list for a good while
and they have the best review process ever. The result is invariably
twofold: (a) a very good quality library; (b) a suicidal-depressive
library author.
You need to expect to take quite a few hits from me and others, but
please don't take it personally. Also, a word for the other reviewers:
it's much easier (and therefore more luring) to bury a submission for
its weaknesses instead of suggesting concrete steps to make it stronger.
High-level thoughts
===================
* At 40KLOC, the library is probably on a par with the first Netscape
browser. This is a bit extravagant for a date and time library, and last
thing we want is Phobos to become a collection of elaborate trivialities
(I'm not saying date/time is a trivial matter, but of the 40KLOC there
are probably only a couple thousand that do actual work, of which a
couple hundred that do nontrivial work). We should look into ways to
reduce its sheer size, and, most importantly, the size of its
user-visible API. (More concrete suggestions to follow.)
* The library has a fundamental inconsistency: it can't decide whether
it goes with sheer "long" representation for various durations (days,
months, hnseconds etc.) versus typed representations (MonthDuration,
HNDuration, JointDuration). This hurts everybody everywhere: the library
spends real estate to implement durations, and then the API explodes
because it doesn't use them so it needs to be explicit about measurement
units. For example, we have a hecatomb of addXxx functions: addDays,
addWeeks, addThis, addThat, addTheOther. The design should have
exploited typed durations and support += for durations. Then you write d
+= add(weeks(3)); and so on.
* Speaking of which, we come back at Walter's point: should we go with
"long" for representing various durations and then make the meaning
clear with function names, OR should we go with explicit strong types
for times and durations? It's a good question, and arguments could be
made for either case. Using "long" would probably trim away 60% of the
code size. On the other hand, if we look at successful date and time
libraries, we see that they do tend to be elaborate about granular
types. So my suggestion is to not mess with success and go with
elaborate typing. BUT I do want to cut a good fraction of sheer
repetition by use of the language. AND I also want to not be able to
find "long" in the library except in e.g. multiplication of a duration
with a scalar and in the private definition of the typed abstractions.
* This brings me to one beef I have: the library is repetitive and
tedious. This is because it makes very little use of generative and
generic capabilities. Most of it could have been written in C with
structs and functions. I'll follow up with concrete suggestions.
* The library uses at least one actively unrecommended technique (global
alias for built-ins) and one questionable or at least
not-yet-widely-used technique (use of a class as a namespace).
Explanations follow.
Details
=======
* core.d is by far the weakest part of the library. In fact, it being
the first one I've read, I got biased early on against the library. I'm
very glad I continued to read through.
* Global aliases of the form
alias short Year; /// Unit for holding a Gregorian year.
are a poor technique in C, C++, and D. I've read articles and rants
about it in several places. In brief, the problem is that such aliases
lure their user into thinking they are genuine types with the usual
safeguards that types offer (such as disallowing illegal operations
during compilation time and/or run time).
DayOfMonth dom;
Month m;
...
m += dom; // won't compile I guess?!?
dom = 32; // throw I suppose?!?
Recommendation: eliminate these. At best make them private and give them
the "Rep" (representation) suffix. But I suspect you don't even need them.
* This enum:
enum Month : MonthOfYear { jan = 1, /// January.
feb, /// February.
mar, /// March.
apr, /// April.
may, /// May.
jun, /// June.
jul, /// July.
aug, /// August.
sep, /// September.
oct, /// October.
nov, /// November.
dec /// December.
}
The doc comments are not needed. Just put "/// Month" for the first and
"/// Ditto" for all others.
* Similar remark about comments to:
enum DayOfWeek : ushort { sunday = 0, /// Sunday.
monday, /// Monday.
tuesday, /// Tuesday.
wednesday, /// Wednesday.
thursday, /// Thursday.
friday, /// Friday.
saturday /// Saturday.
}
The names are self-explanatory, no need to repeat them! I also notice
that (a) DayOfWeek uses ushort directly (why not ubyte?) instead of
going through a global alias (like Month) that I'd recommend you remove
anyway.
* These types suggest to me that we need to finally implement a Bounded
type such that Bounded!(ubyte, 1, 7) would be a good representation for
a day of week. We discussed Bounded a fair amount in the newsgroup a
while ago. I think that could save a lot of code down the road (e.g. in
duration.d), in addition to being a good library artifact of its own.
* This enum and its kin:
enum AllowDayOverflow
{
/// Yes, allow day overflow.
yes,
/// No, don't allow day overflow.
no
}
should be like this:
enum AllowDayOverflow
{
/// No, don't allow day overflow.
no,
/// Yes, allow day overflow.
yes
}
such that code can write:
void fun(AllowDayOverflow allowDayOverflow) {
...
if (allowDayOverflow) ...
...
if (!allowDayOverflow) ...
}
Right now writing such works but with the reverse effect.
* "this(string msg, string file = __FILE__, size_t line = __LINE__,
Throwable next = null) nothrow" -> the __FILE__ and __LINE__ are useless
(they are yours, not your caller's). You need to make them template
parameters to work, see other places in Phobos. At least that's what I
recall. Walter?
* "final class TUnitConv" used as a namespace. Can't we organize things
such that we introduce few enough names for such namespaces to be unneeded?
* "static long to(TUnit tuFrom, TUnit tuTo)(long value)" is a spark of
brilliance that I'd love to see a lot more of. It's essentially the
difference between reading and memorizing 90 function names versus
memorizing one function name and an enum with 10 values. That being
said, the rest of the code sadly doesn't transform this spark into a
full-blown explosion - it goes right back to triviality such as e.g.
hnsecsToDays. Those are hardly needed in the implementation (use
template constraints) and DEFINITELY not needed in the documentation.
They just pollute it.
Suggestions: (a) throw TUnitConv away; (b) define this and friends at
global scope:
long convert(TUnit from, TUnit to)(long);
That being said, convert() does go against the grain of elaborate typing
for durations. I'd accept it as a low-level function, but really the
library currently fosters elaborate types for durations... so there's a
disconnect here.
* To be specific: throw away yearsToMonths et al. No need for them.
* I suggest even experimenting with strings instead of TUnit. Really
TUnit is a vestige from C++ and Java, but we do accept string template
parameters so how about
assert(convert!("years", "months")(2) == 24);
etc. Opinions?
* All string functions in core.d make me nervous because they hint to no
support for locales, and are very pedestrian as they go. I'd prefer at
least some sort of table with names for "year", "years" etc. in
conjunction with reusable code, instead of rigid, nonreusable code with
inline strings. Same about monthOfYearToString etc.
* validXxx and validXxxOfYyy is in desperate need for a template
approach a la to/convert. Virtually anything that encodes types in
function names is such a candidate.
* As another random example:
DayOfWeek getDayOfWeek(DayOfGregorianCal day) pure nothrow
could/should be (in client use):
get!(TUnit.dayOfWeek, TUnit.dayOfGregorianCal)(day);
* Generally, it's clear from above why I was displeased with core.d. At
1952 lines, it's a lot of pain for very little gain. It encodes date
artifacts in function names leading to an explosion of API functions. It
uses too little generic mechanisms to reduce both API size and
implementation size.
* On to duration.d. It's simple: there's no need for three durations.
Rename JointDuration to Duration and throw everything else. I'd find
myself hard pressed to produce situations in which efficiency
considerations make it impossible to cope with a 128-bit representation.
* If you do want to go fine-grained durations without the aggravation,
use templates. Then you can have Duration!HNSeconds, Duration!Seconds,
Duration!Months, whatever you want, without inflicting pain on yourself
and others. If you play your cards right, such durations would match
100% the templates you define in core.d and elsewhere so you don't need
to make a lot of implementation effort. BUT again I think one duration
is enough. Maximum two - short duration and duration.
* timezone.d looks okay.
* At 23KLOC, timepoint.d better had a walkOnWater() function in there.
* currLocalTime, currUTC, currOtherTZ should be
current!(TUnit.localTime), current!(TUnit.utc) - bringing again API
consolidation into discussion.
(need to finish this just about here)
* other.d - bad name. Like Walter, I think the physical design should be
one .di file and one or more .d files.
* Duration and Interval should be consolidated.
Andrei
More information about the phobos
mailing list