Adding Unicode operators to D
KennyTM~
kennytm at gmail.com
Sun Oct 26 10:53:03 PDT 2008
Andrei Alexandrescu wrote:
> Bruno Medeiros wrote:
>> Andrei Alexandrescu wrote:
>>> Spacen Jasset wrote:
>>>> Bill Baxter wrote:
>>>>> On Thu, Oct 23, 2008 at 7:27 AM, Andrei Alexandrescu
>>>>> <SeeWebsiteForEmail at erdani.org> wrote:
>>>>>> Please vote up before the haters take it down, and discuss:
>>>>>>
>>>>>> http://www.reddit.com/r/programming/comments/78rjk/allowing_unicode_operators_in_d_similarly_to/
>>>>>>
>>>>>>
>>>>>
>>>>> (My comment cross posted here from reddit)
>>>>>
>>>>> I think the right way to do it is not to make everything Unicode. All
>>>>> the pressure on the existing symbols would be dramatically relieved by
>>>>> the addition of just a handful of new symbols.
>>>>>
>>>>> The truth is keyboards aren't very good for inputting Unicode. That
>>>>> isn't likely to change. Yes they've dealt with the problem in Asian
>>>>> languages by using IMEs but in my opinion IMEs are horrible to use.
>>>>>
>>>>> Some people seem to argue it's a waste to go to Unicode only for a few
>>>>> symbols. If you're going to go Unicode, you should go whole hog. I'd
>>>>> argue the exact opposite. If you're going to go Unicode, it should be
>>>>> done in moderation. Use as little Unicode as necessary and no more.
>>>>>
>>>>> As for how to input unicode -- Microsoft Word solved that problem ages
>>>>> ago, assuming we're talking about small numbers of special characters.
>>>>> It's called AutoCorrect. You just register your unicode symbol as a
>>>>> misspelling for "(X)" or something unique like that and then every
>>>>> time you type "(X)" a funky unicode character instantly replaces those
>>>>> chars.
>>>>>
>>>>> Yeh, not many editors support such a feature. But it's very easy to
>>>>> implement. And with that one generic mechanism, your editor is ready
>>>>> to support input of Unicode chars in any language just by adding the
>>>>> right definitions.
>>>>>
>>>>> --bb
>>>> I am not entirely sure that 30 or (x amount) of new operators would
>>>> be a good thing anyway. How hard is it to say m3 =
>>>> m1.crossProduct(m2) ? vs m3 = m1 X m2 ? and how often will that
>>>> happen? It's also going to make the language more difficult to learn
>>>> and understand.
>>>
>>> I have noticed that in pretty much all scientific code, the f(a, b)
>>> and a.f(b) notations fall off a readability cliff when the number of
>>> operators grows only to a handful. Lured by simple examples like
>>> yours, people don't see that as a problem until they actually have to
>>> read or write such code. Adding temporaries and such is not that
>>> great because it further takes the algorithm away from its
>>> mathematical form just for serving a notation that was the problem in
>>> the first place.
>>>
>>
>> But what operators would be added? Some mathematician programmers
>> might want vector and matrix operators, others set operators, others
>> still derivation/integration operators, and so on. Where would we stop?
>> I don't deny it might be useful for them, but it does seem like too
>> specific a need to integrate in the language.
>
> I was thinking of allowing a general way of defining one Unicode
> character to stand in as one operator, and then have libraries implement
> the actual operators.
>
> There's the remaining problem of different libraries defining the same
> character to mean different operators. This may not be huge as math
> subdomains tend to be rather consistent in their use of operators.
> Across math subdomains, types and overloading can take care of things.
>
> Also, ascii representation should be allowed for operators, and one nice
> thing about Unicode characters is that many have HTML ascii and
> human-readable names, see
> http://www.fileformat.info/format/w3c/htmlentity.htm. So
> \unicodecharname may be a good alternate way to enter these operators.
> For example, the empty set could be \empty, and the cross-product could
> be written as \times. So
>
> c = a \times b;
>
> doesn't quite look bad to me.
>
> One nice thing about this is that we don't need to pore over naming and
> such, we just use stuff that others (creators and users alike) have
> already pored over. Saves on documentation writing too :o).
>
>
> Andrei
LaTeX in D? :p

Anyway, we already have \&times; and \&empty; as string escapes, so we
could reuse them at the source-code level, as I've described elsewhere
in this thread:

    auto torque = position \&times; force;

This is uglier than

    auto torque = position \times force;

but it gives a uniform syntax between escape sequences inside and
outside strings.
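
For reference, here is a small sketch of the "inside strings" half, which already compiles today: D string literals accept named character entities as escape sequences. The constant names timesSign and emptySet are made up for this example; only the escapes themselves come from the language.

```d
import std.stdio : writeln;

// Named character entities are ordinary escape sequences in D string
// literals. (timesSign and emptySet are hypothetical names for this sketch.)
enum string timesSign = "\&times;"; // U+00D7 multiplication sign
enum string emptySet  = "\&empty;"; // U+2205 empty set

// Each escape denotes exactly the same string as the \u form.
static assert(timesSign == "\u00D7");
static assert(emptySet == "\u2205");

void main()
{
    // Inside a string the entity is just a character; the proposal is to
    // accept the same spelling as an operator token outside strings.
    writeln("position ", timesSign, " force");
}
```

The operator form (position \&times; force) is the proposed part and does not compile.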

The problem is that you may have to invent some names: e.g. the
composition operator ∘ (U+2218 ring operator) has no name among the SGML
entities. In LaTeX it is written \circ, but \&circ; is already taken by
ˆ (U+02C6 modifier letter circumflex accent).

And you'll need to predefine the associativity and operator precedence
too. ;) See my other post in this thread.
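
To illustrate why precedence has to be fixed up front, here is a rough sketch using the operators D already has. It is written against D2's opBinary, and putting the cross product on '*' is only an assumption for the example, not something proposed in this thread. The Vec type is hypothetical.

```d
// Hypothetical 3-vector, used only to show where precedence comes from.
struct Vec
{
    double x, y, z;

    // Component-wise addition and subtraction.
    Vec opBinary(string op)(Vec r) const if (op == "+" || op == "-")
    {
        return mixin("Vec(x " ~ op ~ " r.x, y " ~ op ~ " r.y, z " ~ op ~ " r.z)");
    }

    // Cross product, borrowed onto '*' purely for illustration.
    Vec opBinary(string op : "*")(Vec r) const
    {
        return Vec(y * r.z - z * r.y, z * r.x - x * r.z, x * r.y - y * r.x);
    }
}

void main()
{
    auto a = Vec(1, 0, 0), b = Vec(0, 1, 0), c = Vec(0, 0, 1);

    // The grammar, not Vec, decides that '*' binds tighter than '+'.
    assert(a + b * c == a + (b * c));
    assert(a + b * c != (a + b) * c);
}
```

Vec only supplies the meaning of each operator; how a + b * c groups is decided by the grammar before any library code is seen. A library-defined \&times; would need the same kind of slot assigned by the language.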