[GSoC Proposal] Statically Checked Measurement Units

David Nadlinger see at klickverbot.at
Tue Mar 29 04:51:56 PDT 2011


I am in a slight dilemma, because although I would love to share my work 
and ideas with you, right now this would automatically weaken my own 
units proposal in comparison to yours. However, as this would be grossly 
against the open source spirit, and the point of GSoC certainly can't be 
to encourage that, I'll just do it anyway.

Regarding IDs: As I wrote in my previous post, the only point of the 
unit IDs in Boost.Units is to provide a strict total order over the set 
of units. If you can achieve it without that (see below), you won't need 
any artificial numbers which you have to manage.

But why would you need to be able to sort the base units in the first 
place? The answer is simple: To define a single type representation for 
each possible unit, i.e. to implement type canonicalization. To 
illustrate this point, consider the following (pseudocode) example:

auto force = 5.0 * newton;
auto distance = 3.0 * meter;
Quantity!(Newton, Meter) torque = force * distance;
torque = distance * force;

Both of the assignments to »torque« should obviously work, because the 
types of »force * distance« and »distance * force« are semantically the 
same. In a naïve implementation, however, the actual types would be 
different because the pairs of base units and exponents would be 
arranged in a different order, so at least one of the assignments would 
lead to type mismatch – because a tuple of units is, well, a tuple and 
not an (unordered) set.
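
To make the mismatch concrete, here is a minimal sketch. It assumes empty structs as base unit types and a hypothetical naive Quantity template that simply stores its unit list in the order written; neither is taken from an actual library.

```d
// Hypothetical base unit types, for illustration only.
struct Meter {}
struct Newton {}

// Naive quantity: the unit list becomes part of the type exactly as written.
struct Quantity(Units...)
{
    double value;
}

// Same units, different order: the compiler sees two unrelated types,
// so assigning one to the other would be rejected.
static assert(!is(Quantity!(Newton, Meter) == Quantity!(Meter, Newton)));

void main() {}
```

Nothing in the template tells the compiler that the argument order is irrelevant, which is why some canonical ordering is needed.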

And this is exactly where the strictly ordered IDs enter the scheme. By 
using them to sort the base unit/exponent pairs, you can guarantee that 
semantically equivalent quantities always end up with the same 
»physical« type.

Luckily, there is no need to require the user to manually assign 
sortable, unique IDs to each base type, because we can access the mangled 
names of types at compile time (via .mangleof), and these fulfill both 
requirements. There are probably other feasible approaches as well, but 
using mangled names worked out well for me (you can't rely on .stringof 
to give unique strings). When implementing the type sorting code, you 
will probably run into some difficulties and/or CTFE bugs. Feel free to 
contact me with related questions (as I have already wasted enough time 
on this to get a working solution…^^).
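
A rough sketch of the idea, not my actual implementation: a compile-time insertion sort over a list of types, ordered by .mangleof. The names Seq, Insert, Canonical and Quantity are made up for this example, and exponents as well as merging of repeated units are left out to keep it short.

```d
// Hypothetical base unit types, for illustration only.
struct Meter {}
struct Newton {}
struct Second {}

// Helper that bundles its arguments into a compile-time sequence.
template Seq(T...) { alias T Seq; }

// Inserts U into the already sorted sequence Sorted,
// comparing the compiler-generated mangled names.
template Insert(U, Sorted...)
{
    static if (Sorted.length == 0)
        alias Seq!(U) Insert;
    else static if (U.mangleof < Sorted[0].mangleof)
        alias Seq!(U, Sorted) Insert;
    else
        alias Seq!(Sorted[0], Insert!(U, Sorted[1 .. $])) Insert;
}

// Insertion sort: sort the tail, then insert the head into it.
template Canonical(Units...)
{
    static if (Units.length == 0)
        alias Seq!() Canonical;
    else
        alias Insert!(Units[0], Canonical!(Units[1 .. $])) Canonical;
}

struct Quantity(Units...)
{
    double value;
}

// Both orders collapse to the same canonical type.
static assert(is(Quantity!(Canonical!(Newton, Meter)) ==
                 Quantity!(Canonical!(Meter, Newton))));

void main() {}
```

The user never has to hand out IDs here; the only assumption is that distinct types have distinct mangled names.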

Regarding strings: I might not have expressed my doubts clearly, but I 
didn't assume that your proposed system would use strings as its internal 
representation at all. What I meant is that I don't see how, given 
»Quantity!("Widgets/Gadgets")«, the Widget and Gadget types could be 
brought into scope inside Quantity. Incidentally, this is exactly the 
reason why you can't use arbitrary functions/types in the »string 
lambdas« from std.algorithm.
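
The scoping problem can be sketched like this, assuming a hypothetical TypeFromString template that turns a name into a type via a string mixin. The mixin is evaluated in the scope where the template is declared, so it only sees names visible there. A function-local struct stands in here for a type from a user module that the library cannot see.

```d
// Visible at the template's declaration site.
struct Gadget {}

// Hypothetical helper: resolves a type name given as a string.
template TypeFromString(string name)
{
    mixin("alias " ~ name ~ " TypeFromString;");
}

// Hypothetical quantity that receives its unit as a type.
struct Quantity(Unit) {}

void main()
{
    // Only known inside this function, like a type from a user's module.
    struct Widget {}

    // Works: Gadget is in scope where TypeFromString is declared.
    static assert(is(TypeFromString!("Gadget") == Gadget));

    // Fails to instantiate: the template cannot look up Widget.
    static assert(!__traits(compiles, TypeFromString!("Widget")));

    // Passing the type itself is fine, no name lookup is needed.
    static assert(__traits(compiles, Quantity!(Widget)));
}
```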

David


On 3/28/11 9:43 PM, Cristi Cobzarenco wrote:
> - I too was playing around with a units project before GSoC, that is why
> I thought doing this project was a good idea. The way I was doing it
> without numerical IDs was simply by having more complicated algorithms
> for equality, multiplications etc. For example, equality would be
> implemented as:
> template UnitTuple(P...) {
>    alias P Units;
> }
>
> template contains( Unit, UT ) {
>    /* do a linear search for Unit in UT.Units (since UT is a UnitTuple) - O(n) */
> }
>
> template includes( UT1, UT2 ) {
>    /* check for each Unit in UT1 that it is also in UT2 (using contains) - O(n^2) */
> }
>
> template equals( UT1, UT2 ) {
>    immutable bool equals = includes!(UT1,UT2) && includes!(UT2, UT1);
> }
> Granted this means that each check takes O(n^2) where n is the number of
> different units, but it might be worth it - or not. On the small tests
> I've done it didn't seem to increase compile time significantly, but
> more research needs to be done. I think that as long as there aren't
> values with _a lot_ of units (like ten), the extra compile time
> shouldn't be noticeable. The biggest problem I have with adding IDs is
> that one will have to manage systems afterwards or have to deal with
> collisions. Neither one is very nice.
>
> - You're right, you don't need dimensions for implicit conversions, of
> course. And you're also right about possibly making the decision later
> about implicit conversions. The thing is F#, where units are very
> popular, only has explicit conversions, and I was trying to steer more
> towards that model.
>
> - I seem not to have been too clear about the way I would like to use
> strings. The names of the units in the strings have to be the type names
> that determine the units. Then one needs a function that would convert a
> string like "Meter/Second" to Division!(Meter, Second), I'm not sure how
> you would do that in C++. Maybe I'm wrong, but I can't see it.
>
> - I hope it is by now clear that my proposal is not, in fact, string
> based at all. The strings are just there to be able to write derived
> units in infix notation, something boost solves by using dummy objects
> with overloaded operators. The lack of ADL is a problem which I
> completely missed; I have immersed myself in C++ completely lately and
> I've gotten used to specializing templates in different scopes. These
> are the solutions I can come up with, but I will have to think some more:
> 1. There is an intrusive way of solving this, by making the conversion
> factors static members of the unit types, but this would not allow, for
> example, having a Widget/Gadget counter the way I intended.
> 2. The way they get away with it in F# is to use global conversion
> factors, that one manually uses. That actually is not bad at all. The
> only problem was that I was hoping that conversion between derived units
> could automatically be done using the conversion factors of the
> fundamental units: (meter/second) -> (kilometer/hour) knowing
> meter->kilometer and second->hour.
>
> Again I will have to think some more about the latter point. And I'll do
> some more tests on the performance of doing linear searches. Is there a
> way to get the name of a type (as a string) at compile time (not the
> mangled name you get at runtime)? I wasn't able to find any way to do
> this. My original idea was actually to use the fully qualified typenames
> to create the ordering.
>
> Thanks a lot for your feedback, it's been very helpful, especially in
> pointing out the lack of ADL. Hope to hear from you again.
