RFC: SI Units facility for Phobos

Wed Jan 5 14:06:59 PST 2011

In conclusion (yes, I know this normally goes at the bottom), I think we want different and contradictory things from this library.

Andrei Alexandrescu <SeeWebsiteForEmail at erdani.org> wrote:
> On 1/5/11 10:32 AM, BCS wrote:
> > Andrei Alexandrescu<SeeWebsiteForEmail at erdani.org>  wrote:
> [snip]
> > Ah. I see what you are getting at. OTOH I'm still not convinced it's any better.
> >
> > A quick check shows that 1 light years = 9.4605284 × 10^25
> > angstroms. A mere 25 orders of magnitude differences, where IEEE754
> > doubles have a range of 307 orders of magnitude. As to the issue of
> > where to do the conversions: I suspect that the majority of
> > computation will be between unit carrying types (particularly if the
> > library is used the way I'm intending it to be) and as such, I expect
> > that both performance and precision will benefit from having a
> > unified internal representation.
> People might want to use float for compactness, which has range 1e38 or 
> so. But that's not necessarily the largest issue (see below).
> > There /might/ be reason to have a very limited set of scaling factors
> > (e.g. atomic scale, human scale, astro scale) and define each of the
> > other units from one of them. but then you run into issues of what to
> > do when you do computations that involve more than one (for example,
> > computing the resolution of an X-ray telescope involves all three
> > scales).
>
> There are two issues apart from scale. One, creeping errors due to 
> conversions. Someone working in miles would not like that after a few 
> calculations that look integral they get 67.9999998 miles. Second, let's 
> not forget the cost of implicit conversions to and from. The way I see 
> it, forcing an internal unit for representation has definite issues that 
> reduce its potential applicability.

IMHO both of these are somewhat synthetic; that is, they aren't significant issues in the real world. For the first, anyone who expects FP to give exact answers needs to learn more about FP. If you need exact answers, use an integer or rational type as your base type. As for the second point, about perf: the usage mode I designed for will only perform conversions in I/O operations. Values are converted to unit-bearing types at the first opportunity and remain there until the last possible moment. As such, I expect that any operation that is doing more than a handful of conversions will be I/O bound, not compute bound.
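
A minimal sketch of that usage mode, assuming a hypothetical `Length` type with made-up method names (this is an illustration only, not the API of the library under review). Conversions appear only where values enter or leave; the arithmetic in between works on the single internal representation:

```d
import std.stdio : writefln;

// Hypothetical type for illustration; not the API of the library under review.
struct Length
{
    private double meters; // single internal representation

    enum double metersPerMile = 1609.344;

    // Conversions happen only here, at the edges.
    static Length fromMeters(double v) { return Length(v); }
    static Length fromMiles(double v) { return Length(v * metersPerMile); }
    double asMeters() const { return meters; }
    double asMiles() const { return meters / metersPerMile; }

    // Arithmetic between unit-carrying values needs no conversion.
    Length opBinary(string op)(Length rhs) const
        if (op == "+" || op == "-")
    {
        return Length(mixin("meters " ~ op ~ " rhs.meters"));
    }
}

void main()
{
    auto leg1 = Length.fromMiles(26.0);      // input edge
    auto leg2 = Length.fromMeters(5000.0);   // input edge
    auto total = leg1 + leg2;                // no conversion in the middle
    writefln("%.3f miles", total.asMiles()); // output edge
}
```

With this shape, the number of multiplications by a scale factor grows with the amount of I/O, not with the amount of computation.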

> > When I started writing the library, I looked at these issue just
> > enough that I knew sorting it wasn't going to be a fun project. So,
> > rather than hash out these issue my self, I copied as much as I could
> > from the best units handling tool I know of: MathCAD. As best I can
> > tell, it uses the same setup I am.
> I don't know MathCAD, but as far as I understand that's a system, not a 
> library, and as such might have a slightly different charter. In terms 
> of charter Boost units 
> (http://www.boost.org/doc/libs/1_38_0/doc/html/boost_units.html) is the 
> closest library to this. I haven't looked at it for a while, but indeed 
> it does address the issue of scale as I suggested: it allows people to 
> store numbers in their own units instead of forcing a specific unit. In 
> fact the library makes it a point to distinguish itself from an "SI" 
> library as early as its second page:
> "While this library attempts to make simple dimensional computations 
> easy to code, it is in no way tied to any particular unit system (SI or 
> otherwise). Instead, it provides a highly flexible compile-time system 
> for dimensional analysis, supporting arbitrary collections of base 
> dimensions, rational powers of units, and explicit quantity conversions. 
> It accomplishes all of this via template metaprogramming techniques."
> Like it or not, Boost units will be the yardstick against which anything 
> like it in D will be compared. I hope that D being a superior language 
> it will make it considerably easier to implement anything 
> metaprogramming-heavy.

Reiterating my prior point, what I'm interested in developing is a library that handles the physical units, of which there is a known, finite set of base units plus the units derived from them. You might be able to talk me into doing statically scaled units as distinct types, but I'm not at all interested in allowing an arbitrary number of base units or in treating physically equivalent units (e.g. feet and meters) as different base units. My unwillingness to go there is because I see very little value in doing a little of that (what other dimensions can be added that act like the SI base units?) and enormous cost in doing more of it (if you allow dimensions to act differently: how? in what ways? where do you stop?).

> >> The crux of the matter is that Radians and Degrees should be distinct
> >> types, and that a conversion should be defined taking one to the other.
> >> How can we express that in the current library, or what could be added
> >> to it to make that possible?
> >>
> >
> > I don't think there /is/ a good solution to that problem because many
> > of the computations that result in radians naturally give scalar
> > values (arc-length/radius). As a result, the type system has no way
> > to determine what the correct type for the expression is without the
> > user forcing a cast or the like.
> Not a cast, but a conversion. Consider:

That's what I was referring to by "and the like". :)

> void computeFiringSolution(Radians angle)
> {
>      auto s = sin(angle.value);
>      ...
>      auto newAngle = Radians(arcsin(s));
> }

The way I would like that code to look would be:

void computeFiringSolution(Radians angle)
{
     auto s = angle.sin(); // only exists for Radians (and Scalar)
     ...
     auto newAngle = std.units.arcsin(s);  // returns Radians
     static assert(is(typeof(newAngle) : Radians));
}

> Much of the point of using units is that there is a good amount of being 
> explicit in their handling.

The objective I was going for is that you are explicit at the edges (converting to and from other types) and ignore it in the middle.

> The user knows that sin takes a double which 
> is meant in Radians.

I'd rather the user know they can take the sin of something that is an angle and not worry about the units.

> Her program encodes that assumption in a type, but 
> is also free to simply fetch the value when using the untyped primitives.

The way I wrote it, accessing the value directly is much the same as using a reinterpret_cast; a blunt hack. Rather than doing that, the user is forced (by design) to explicitly state what unit the value should be returned in or what it is being provided as.

> > If angles are treated as an alias for scalar then the conversion to
> > degrees can be handled in a reasonable way (but that would also allow
> > converting any scalar value to degrees). I again punted on this one
> > because people who have put more time than I have available (MathCAD
> > again) couldn't come up with anything better.
>
> ArcDegrees and Radians would be two distinct types. You wouldn't be able 
> to add Angles to Radians without explicitly stating where you want to be:
> ArcDegrees a1;
> Radians a2;
> auto a = a1 + a2; // error!

That expression is something I explicitly want to be valid (thus the reason the type aliases are named Length, Mass, Time, ... rather than Meter, Kilogram, Second, ...). They are both a measure of angle so should be addable. One of the fundamental requirements of the library is that things that measure the same property can be used interchangeably. This ability is a very large part of the reason that I wrote the library in the first place and I have no interest in continuing without it.
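
To make the intent concrete, here is a rough sketch assuming a hypothetical `Angle` type that always stores radians internally. The names (`Angle`, `degrees`, `radians`, `arcsin`) are illustrative assumptions, not the submitted library's API:

```d
static import std.math;

// Hypothetical sketch; names are illustrative, not the submitted library's API.
struct Angle
{
    private double rad; // always stored in radians

    enum double radPerDeg = std.math.PI / 180.0;

    static Angle radians(double v) { return Angle(v); }
    static Angle degrees(double v) { return Angle(v * radPerDeg); }
    double asRadians() const { return rad; }
    double asDegrees() const { return rad / radPerDeg; }

    // Both operands measure angle, so they add without an explicit conversion.
    Angle opBinary(string op)(Angle rhs) const
        if (op == "+" || op == "-")
    {
        return Angle(mixin("rad " ~ op ~ " rhs.rad"));
    }

    // Trig lives on the angle, so callers never touch the raw value.
    double sin() const { return std.math.sin(rad); }
}

// Inverse trig hands back an Angle rather than a bare double.
Angle arcsin(double s) { return Angle(std.math.asin(s)); }

void main()
{
    auto a1 = Angle.degrees(30.0);
    auto a2 = Angle.radians(std.math.PI / 3);
    auto a = a1 + a2; // valid: degrees and radians are the same kind of quantity
    assert(a.asDegrees() > 89.999 && a.asDegrees() < 90.001);

    auto back = arcsin(a1.sin());
    assert(back.asDegrees() > 29.999 && back.asDegrees() < 30.001);
}
```

The unit is named only when a value is constructed or read out, which is the "explicit at the edges" behavior described above.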

> auto b = a1 + ArcDegrees(a2); // fine, b is stored in ArcDegrees
> auto c = Radians(a1) + a2;    // fine, c is stored in Radians
> The same goes about Kilometers and Miles:
> Kilometers d1;
> Miles d2;
> ...
> auto a = d1 + d2; // error!
> auto b = d1 + Kilometers(d2); // fine, b is stored in Kilometers
> auto c = Miles(d1) + d2;    // fine, c is stored in Miles

Same as above.

> >>> Again, that sounds to me like what the library does. All distance units
> >>> are of the same type and internally are encoded as meters, The rest of
> >>> the units are converted on access.
> >> The issue is that the choice of the unified format may be problematic.
> >
> > The issue I see is that the choice of a non unified format will be
> > problematic. Unless you can show examples (e.g. benchmarks, etc.) of
> > where the current solution has precision or performance problems or
> > where its expressive power is inadequate, I will remain reluctant to
> > change it.
> Examples of precision issues with scaling back and forth by means of a 
> multiplier shouldn't be necessary as the problem is obvious. Here's an 
> example that took me a couple of minutes to produce:
> immutable real metersPerLightyear = 9.4605284e15;
> auto a1 = metersPerLightyear * 15.3;
> auto a2 = metersPerLightyear * 16.3;
> auto a3 = metersPerLightyear * 1;
> writeln("Total distance in lightyears: ", (a1 - a2 + a3) / 
> metersPerLightyear);
> auto b1 = 15.3;
> auto b2 = 16.3;
> auto b3 = 1;
> writeln("Total distance in lightyears: ", (b1 - b2 + b3));

I wasn't asking for cases where values come out unequal but where they come out unusable.
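
For anyone who wants to see the size of the discrepancy rather than just its existence, the quoted example can be turned into a small standalone program. This is a sketch that uses `double` instead of `real` and hypothetical helper names (`viaMeters`, `direct`); it prints both results so they can be compared against the exact answer of 0:

```d
import std.stdio : writefln;

enum double metersPerLightyear = 9.4605284e15;

// Scale each term to meters, combine, then scale back (the quoted example).
double viaMeters(double a, double b, double c)
{
    immutable a1 = metersPerLightyear * a;
    immutable a2 = metersPerLightyear * b;
    immutable a3 = metersPerLightyear * c;
    return (a1 - a2 + a3) / metersPerLightyear;
}

// The same arithmetic carried out directly in light years.
double direct(double a, double b, double c)
{
    return a - b + c;
}

void main()
{
    writefln("via meters: %.3e light years", viaMeters(15.3, 16.3, 1.0));
    writefln("direct:     %.3e light years", direct(15.3, 16.3, 1.0));
}
```

Neither path should be expected to print exactly 0, since 15.3 and 16.3 are not exactly representable in binary floating point; the interesting question is whether either residue is large enough to make a result unusable.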

> Regarding expressiveness, it is quite clear that there are features 
> simply missing: working in Celsius vs. Fahrenheit vs. Kelvin,

I'll grant that I don't have Celsius and Fahrenheit, but they are very special cases as they have nonzero origins. OTOH, it will give differences in Fahrenheit via the Rankine scale.
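
As an illustration of the Rankine point, here is a sketch assuming a hypothetical `Temperature` type stored in kelvin (illustrative names only, not the library's API). Because Rankine has a zero origin and a degree the same size as a Fahrenheit degree, a difference read out in Rankine is also the difference in Fahrenheit:

```d
// Hypothetical sketch; not the API of the library under review.
struct Temperature
{
    private double kelvin; // absolute scale with a zero origin

    enum double kelvinPerRankine = 5.0 / 9.0;

    static Temperature fromKelvin(double v) { return Temperature(v); }
    static Temperature fromRankine(double v) { return Temperature(v * kelvinPerRankine); }
    double asKelvin() const { return kelvin; }
    double asRankine() const { return kelvin / kelvinPerRankine; }

    // Simplification: a difference is represented by the same type here.
    Temperature opBinary(string op : "-")(Temperature rhs) const
    {
        return Temperature(kelvin - rhs.kelvin);
    }
}

void main()
{
    auto freezing = Temperature.fromKelvin(273.15);
    auto boiling = Temperature.fromKelvin(373.15);
    auto delta = boiling - freezing;

    // 100 K corresponds to 180 degrees Rankine, i.e. a 180 degree Fahrenheit difference.
    assert(delta.asRankine() > 179.999 && delta.asRankine() < 180.001);
}
```

Absolute Celsius or Fahrenheit readings would need an offset as well as a scale factor, which is what makes them special cases.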

> allowing 
> the user to define and use their own units, allowing the user to define 
> units with runtime multipliers (monetary) etc. There's always a need to 
> stop somewhere as the list could go on forever,

Agreed

> but I think the current submission stops a bit too early.

I think that the current point is the only logical point (in that any other is just arbitrary).

> If you believe that the library is good as it is, that's definitely 
> fine. Don't forget, however, that a good part of the review's purpose is 
> to improve the library, not to defend its initial design and 
> implementation. A submitter who is willing to go with the library as-is 
> although there are beneficial suggested improvements (and that refers to 
> everything including e.g. documentation) may be less likely to maintain 
> the library in the future. At least that's my perception.

To be clear, many of your points are very relevant and I omitted commenting on them because a long list of Yup, yup, yup, ... is just noise. Also, given that we are thrashing out one or two fundamental points about it, I think those "lesser" issues can wait. 

> In contrast, 
> I'm quite hopeful Jonathan will follow through with std.datetime because 
> he has been willing to act on all sensible feedback.
> Andrei