[phobos] datetime review (new attempt at URL)
Steve Schveighoffer
schveiguy at yahoo.com
Thu Oct 14 19:04:12 PDT 2010
----- Original Message ----
> From: Jonathan M Davis <jmdavisProg at gmx.com>
> To: phobos at puremagic.com
> Sent: Thu, October 14, 2010 7:07:34 PM
> Subject: Re: [phobos] datetime review (new attempt at URL)
>
> On Thursday, October 14, 2010 12:38:57 Steve Schveighoffer wrote:
> > 1. Are you going to use an extended Gregorian calendar, or use a Julian
> > calendar for the appropriate dates? I recommend using extended Gregorian,
> > because it's much easier to deal with, and anyone who wants to deal with
> > such historic accuracy should be using a much more complete library. In
> > any case, it should be noted what you are using.
>
> It uses the Proleptic Gregorian Calender (so, it uses the Gregorian Calender
> calculations for its whole length), and it follows ISO 8601 by using 0 for 1
> B.C. It's mentioned in the ddoc comments.
Yes, this is what I meant (except when I did it for Tango, I used 1 B.C. for the
year before 1 A.D.). I think I even used the term proleptic, but for some
reason in my head I thought it was called extended :)
>
> >
> > 2. I highly recommend ignoring the concept of leap-seconds, as it just adds
> > constant maintenance (since leap seconds cannot be predicted) and doesn't
> > add much to the library. However, it should be noted whether you support
> > them.
>
> PosixTimeZone will support them if you use one of the time zones that has them
>
> (the ones starting with right/ I believe), but that's it. Since, they come
>from
>
> the tz files, no maintenance is required. Whether LocalTime or UTC uses leap
> seconds is completely system-dependent, but I wouldn't expect them to (and I
> know that they won't on posix systems since posix ignores leap seconds). So,
> truth be told, the right/ time zones will act differently if used with
> PosixTimeZone than if you were to have your system time using them.
What I mean was, if you subtract two points in time that cross a leap-second
boundary, will you take into account the extra second?
>
> >
> > ----
> >
> > I don't like all the aliases for Year, etc. This accomplishes almost
> > nothing except documentation. And even that doesn't add much. I don't
> > see the point of doing:
> >
> > int foo(Year years)
> >
> > vs.
> >
> > int foo(short years)
> >
> > Both seem equally documented to me.
>
> All such aliases have been removed at Andrei's request, though I think that
>one
>
> of the main reasons that I used them originally was because he'd put them in
> std.gregorian. Discussion on that is what led to the discussion of a bounded
> integral type (which I'm not worrying about at this point, but perhaps we can
> change std.datetime to use it later).
One of the benefits of using a single value to represent a duration/point in
time is everything gets normalized. For example, if ask for a date/time that
represents 13/40/2010 at 33 o'clock, it just normalizes out to the correct point
in time.
>
> >
> > ----
> >
> > I like the to!(TUnit, TUnit) conversion function, but I think it is
> > redundant to also have daysTohnsecs (btw, this isn't properly cased, I
> > think it should have been daysToHnsecs). I see the first uses the other,
> > but a better implementation is possible without requiring all the others.
> > Also, you risk unnecessary truncation in your calculations.
> >
> > I'd say get rid of all the extra functions. I would also renumber the enum
> > for TUnit to go from smallest to largest, and I would rewrite the to
> > conversions as:
> >
> > template hnsecPer!(TUnit un) if(TUnit >= TUnit.week) // note reverse this
> > if you reorder enum
> > {
> > static if(un == TUnit.hnsec)
> > enum hnsecPer = 1L;
> > else static if(un == TUnit.usec)
> > enum hnsecPer = 10L;
> > else static if(un == TUnit.msec)
> > enum hnsecPer = 1000 * hnsecPer!TUnit.usec;
> > else static if(un == TUnit.second)
> > enum hnsecPer = 1000 * hnsecPer!TUnit.msec;
> > else static if(un == TUnit.minute)
> > enum hnsecPer = 60 * hnsecPer!TUnit.second;
> > else static if(un == TUnit.hour)
> > enum hnsecPer = 60 * hnsecPer!TUnit.minute;
> > else static if(un == TUnit.day)
> > enum hnsecPer = 24 * hnsecPer!TUnit.hour;
> > else static if(un == TUnit.week)
> > enum hnsecPer = 7 * hnsecPer!TUnit.day;
> > else static assert(0);
> > }
> >
> > ...
> >
> > static long to(TUnit tuFrom, TUnit tuTo)(long value) pure nothrow
> > if(tuFrom >= TUnit.week && tuFrom <= TUnit.hnsec &&
> > tuTo >= TUnit.week && tuTo <= TUnit.hnsec)
> > {
> > static if(tuFrom > tuTo)
> > return value * (hnsecPer!tuTo / hnsecPer!tuFrom);
> > else
> > return value / (hnsecPer!tuFrom / hnsecPer!tuTo);
> > }
> >
> > (Note -- untested)
>
> I'm very torn on the ordering of time units when it matters because years are
> larger units than months, etc. but they have a smaller resolution. So, for
> instance, maxResolution() on Date returns days, while minResolution() returns
> years. And if you treat years as the largest unit, then the units returned
>from
>
> maxResolution() are actually less than the ones returned from
>minResolution()...
I skipped over maxResolution and minResolution. But I thought everything in the
conversion functions used longs?
Oh wait, this is in response to the "reorder the enums" comment, ok.
But now that I'm looking at it, minResolution makes little sense. It only makes
sense about crossing the month/year boundary, but this is well defined by
functions that require it. In other words, when would minResolution matter?
>
> As for rewriting the conversions, I've pretty much just taken the bodies of
>the
>
> other conversion functions that to!() was using and put them directly in to!()
>
> (and renamed it to convert!() at Andrei's request).
The template hnsecsPer I defined above can have many uses throughout the
date/time code, and in other code as well. I think if you consistently based
things off of something like that, and consistently used TUnit as a template
parameter, the code gets much smaller. Look at my conversion function, it's 4
lines of code, and handles every possible conversion besides months (with
maximum resolution).
>
> >
> > ----
> >
> > toString!(TUnit) -- this seems like an extremely fringe need. Not often
do
> > I care about printing a value representing seconds. More often I care
> > about printing a duration, time of day, or a date. Can we drop this and
> > functions that depend on it?
> >
> > Also note that string representation of date/time is one of those things
> > that is highly sensitive to locale. Tango has a ginormous library
> > dedicated to printing locale-dependent stuff including date/time.
> >
> > I agree that having a default print for date/time is fine, esp. for
> > debugging, but let's not try to reinvent formatted printing in the date
> > time module.
>
> Most of the stuff like toString!(TUnit) is helper code for the few functions
> which actually print them out as strings. It wouldn't really hurt to make them
>
> private, thereby restricting whatever locale issues they present to datetime
> itself.
>
> As for printing functions, what I did follows Boost. The various time point
> types have toISOString(), toISOExtendedString(), and toSimpleString(). The
>only
>
> one of the three which would care about locale is toSimpleString() since it
>uses
>
> an abbreviation for the month in it.
>
> Other than that, I believe that the only locale-specific stuff is for printing
>out
>
> units of time which is done primarily (maybe even solely) in Duration. Very
> little is actually locale-specific. It's primarily ISO stuff which ignores
> locales. As such, we could just stick to English for the little that has
>locale-
> specific stuff. For instance, I wouldn't expect printing a Duration to really
>need
>
> to be locale-specific. I would expect it to be primarily for debugging.
> toSimpleString() (and its corresponding fromSimpleString()) would matter a bit
>
> more, but I'd expect code that really cared about intercommunicating with
>other
>
> stuff would use the ISO or ISO Extended strings. However, the simple strings
>do
>
> have the virtue of being somewhat more legible, so I don't think that I'd want
>
> to get rid of them.
I'm looking at things like UseLongName and all those other specific options that
really belong in formatting code. I'd say don't print text versions of
anything, just print numerical representation in a standard format. Libs
outside of datetime will handle text representation, specific to the locale of
the user. Trimming out all this string printing stuff will save a lot on code
size.
>
> >
> > ----
> >
> > timeT2StdTime and stdTime2TimeT -- I really don't like the names here. Can
> > we call it C Time? T has such a known usage as representing a generic
> > type, this was my immediate thought of what it does. In Tango, we called
> > it toUnixTime and fromUnixTime. Also, don't abbreviate to as 2. I hate
> > that :)
>
> I believe that I used the 2 because all of the t's that were already there
>made
>
> using To harder to read. It's not as bad if you use the term unixTime
though.
>
> > Also, I don't think we need another version of this
> > (fromTimeTEpoch2StdTimeEpoch), if you have the wrong unit, just use the
to!
> > functions defined above.
>
> Hmm. The problem is that stdTime2TimeT() and its reverse are not only
>converting
>
> between epochs but also (at least potentially) converting between types
> (depending on the size of time_t on the system in question) as well as the
>unit
>
> type, and stdTime2TimeT() and its reverse have special code for handling the
> fact that time_t and long aren't necessarily the same size, so I'm not sure
>how
>
> you could really combine them with fromTimeTEpoch2StdTime() and its reverse.
> They do similar but different things.
// pseudocode
immutable UnixEpoch = StdTime(1/1/1970);
// if you want to convert to unix epoch time, use:
x - UnixEpoch;
If you want to convert from a time_t, use the defined special function. Note, I
resisted in Tango for a long time defining the to/fromUnixTime functions, but
people complained constantly -- it will definitely be a used feature :) But
nobody asked for converting epochs.
> > I'll echo what I've read from Andrei -- I don't like using classes as
> > namespaces. Find another way.
>
> At the moment, it's down to just Clock and IRange. And personally, I think
>that
>
> they really improve code legibility, so for the moment, I'm leaving them in.
>If
>
> when it finally comes down to it, Andrei absolutely insists that they go, then
>
> I'll obviously have to get rid of them, but personally, I think that they
> definitely make the code that uses them easier to understand (especially for
> IRange).
I'll wait for your updated code before passing judgment ;)
> > General nitpick comment, your ddoc is over 80 chars wide (over 100 in some
> > spots), can you fix this? I don't mind code being wider than necessary,
> > but comments should fit within an 80xN terminal.
>
> I may take the time to fix it, but I generally find trying to restrict stuff
>to 80
>
> characters to be highly annoying, and I won't even consider doing it for code
>-
>
> that definitely harms code readabliity. The only use case that makes any sense
>to
>
> me to really care about is for if you're trying to print out code on paper,
> since (unless you're printing landscape) the number of columns on paper is
> pretty limited.
>
> How many people even use 80xN terminals? There are so many better options. So,
>I
>
> may or may not take the time to fix the ddoc comments to fit in 80 characters,
>but
>
> I don't see much point, and I have enough else to do that I may not get around
>
> to it.
At least one :) I usually do d development by starting an xterm and loading
vim.
It's a simple few keystrokes in my vim to do it (per paragraph), if you want I
can reformat them once you are done. I have certain OCD issues, and this is one
of them :)
>
> > In e.g. HNSecDuration.msecs, you are using a very convoluted way to get the
> > number of milliseconds :) Use mod instead.
>
> Hmmm. That convulated way is pretty much necessary for larger units, but I
>guess
>
> that it wouldn't be for msecs, usecs, or hnsecs. Actually, I'm halfway tempted
>
> to make it use FracSec instead, since splitting out the msecs, usecs, and
>hnsecs
>
> doesn't make sense in the same way that days, hours, minutes, etc. do. You're
> really dealing with the fractional seconds beyond the second at different
>levels
>
> of precision, not full-on separate units.
Here's what I mean. Change:
@property long msecs() const pure nothrow
{
auto hnsecs = removeUnitsFromHNSecs!(TUnit.week)(_hnsecs);
hnsecs = removeUnitsFromHNSecs!(TUnit.day)(hnsecs);
hnsecs = removeUnitsFromHNSecs!(TUnit.hour)(hnsecs);
hnsecs = removeUnitsFromHNSecs!(TUnit.minute)(hnsecs);
hnsecs = removeUnitsFromHNSecs!(TUnit.second)(hnsecs);
return getUnitsFromHNSecs!(TUnit.msec)(hnsecs);
}
to
@property long msecs() const pure nothrow
{
return (_hnsecs % hnsecsPer!(TUnit.second)) / hnsecsPer!(TUnit.msec);
}
Note, TUnit.seconds could be replaced with (TUnit.msec + 1) or whatever, and
then you have a pretty well defined pattern that can be changed into a template.
Hey look, there's that cool hnsecsPer template again :)
>
> >
> > And in general, can we just use a template? If we continuously
> > parameterize everything based on the unit type, generic programming is
> > going to be much easier.
> >
> > i.e.
> >
> > @property long get(TUnit unit type)() {...}
> >
> > alias get!(TUnit.msec) msecs;
> > ...
>
> That's going to make for an annoying large number of static ifs, but it
>probably
>
> should be done.
No static ifs necessary if you use hnsecsPer.
> > JointDuration -- aside from operators, do we need to wrap the other methods
> > of HNSecDuration and MonthDuration? Can we just provide accessors for the
> > MonthDuration and HNSecDuration (in fact, you may want this).
>
> Well, sadly, I removed MonthDuration and JointDuration, so that's no longer an
>
> issue.
Understood, I responded to your first message without reading most of the
responses (I had about 100 unread phobos messages to go through, and I wanted to
express my opinions without reading all of them first).
> > TimeOfDay: Can we use HNSecDuration with an invariant that it's < 24 hours?
> > I can't see why you'd want to reimplement all this. FWIW, Tango uses
> > Span (the duration type) as it's timeofday component.
> >
> > Date is one thing -- the durations are based on a point in time. But time
> > of day is always the same no matter the day.
>
> Hmm. I didn't think of that. It doesn't quite work though. Aside from the fact
>
> that that would take more memory (for better or for worse), stuff like
> rollHours() wouldn't work if you just used an HNSecDuration, and if you're
> wrapping an HNSecDuration, you have to do a lot more calculations for that
>sort
>
> of code. I think that the result is a net-loss, personally.
More memory for more resolution and less types to deal with (no FracSec)
But let's calculate it out, there are 10 * 1000 * 1000 * 60 * 60 * 24 hnsecs in
a day. That comes out to require 5 bytes of storage (max value 864 billion).
In the grand scheme of things 3 wasted bytes aren't that bad.
as for rollHours, I skipped over that as well. Let me read it...
Ah ok, here's a complete implementation (with hnsecsPer):
long roll!(TUnit un)(long orig, long nOfUnit)
{
auto x = orig / hnsecsPer!(un - 1) * hnsecsPer!(un - 1);
return (orig + nOfUnit * hnsecsPer!(un)) % hnsecsPer!(un - 1) + x;
}
and accompany this with the ensuing aliases e.g.:
alias roll!(TUnit.days) rollDays;
>
> >
> > I'd expect the following structs in timepoint.d:
> >
> > Date -- a date with the fields year month day
> > Time -- A HNSecDuration with the limitation that it must be less than 24
> > hours DateTime -- a combination of both Date and Time
> >
> > As far as time zone, I've not yet dealt with it. It was on my plate to add
> > time zones to Tango, but I never got around to it. I think a type that
> > combines a point in time with a time zone might be the best solution,
> > similar to how an interval combines a point in time with a duration.
>
> SysTime combines the time in hnsecs from midnight January 1st, 1 AD UTC with a
>
> time zone object.
>
> >
> > You have other time types in timepoint which I think are not necessary.
> > Like FracSec.
>
> I think that it's a very good idea to have FracSec. The reason is that when
> you're dealing with the units smaller than a second, it doesn't really make
>much
>
> sense to treat them individually anymore. They're just differing
> precisions/resolutions for/of the some thing - the fractional seconds. So, I
> think that it's clearer and cleaner with FracSec.
But we already have that in HNSecDuration.
>
> > OK, so that's what I have. I think it's a very well thought out lib, it
> > just needs to be trimmed down.
> >
> > One final thought -- after reading through all the stuff, unit tests take
> > up the vast bulk of the lines of code. I think it's safe to say the .di
> > file would be like 5000 LOC. I think it definitely should be one file.
>
> I'll be turning it into a single .di file / .d file pair, and I concur that
>the
>
> .di file will not be particularly large, but if all the ddoc comments are in
> there, I have no idea how large that would make it. We'll see. But there's no
> question that the unit tests take up the most space (and they've been a life-
> saver too; much of the code would likely be broken in a lot of subtle ways if
>I
>
> didn't have them), and that won't cost in space won't transfer over to the .di
>
> file.
I'd say just the d file please. The only point of having a .di file is to hide
implementation. There are almost no .di files in phobos/druntime except where
it makes sense (mostly druntime). And there is only one manually maintained .di
file -- object.di.
-Steve
More information about the phobos
mailing list