numericValue for (unicode) characters
monarch_dodra
monarchdodra at gmail.com
Thu Jan 10 10:19:56 PST 2013
On Thursday, 10 January 2013 at 18:09:31 UTC, Dmitry Olshansky
wrote:
> 10-Jan-2013 03:21, H. S. Teoh wrote:
>> On Mon, Jan 07, 2013 at 07:51:19PM +0100, monarch_dodra wrote:
>>> On Saturday, 5 January 2013 at 00:47:14 UTC, H. S. Teoh wrote:
>>>> [...]
>>>> I, for one, would love to know why isNumeric !=
>>>> hasNumericValue.
>> [...]
>>> I guess it's just bad wording from the standard.
>>>
>>> The standard defined 3 groups that make up Number:
>>> [Nd] Number, Decimal Digit
>>> [Nl] Number, Letter
>>> [No] Number, Other
>>>
>>> However, there are a couple of characters that *are* numbers, but
>>> aren't in those groups.
>>>
>>> The "Good" news is that the standard *does* define number_types to
>>> classify the kind of number a char is:
>>> * Null: Not a number
>>> * Digit: Obvious
>>> * Decimal: Any decimal number that is NOT a digit
>>> * Numeric: Everything else.
>>>
>>> So they used "Numeric" as wild, and "Number" as their general
>>> category.
>>>
>>> This leaves us with ambiguity when choosing our word:
>>> Technically '5' does not classify as "numeric", although you could
>>> consider it "has a numeric value".
>>>
>>> I hope that makes sense.
>>
>> Hmph. I guess we need to differentiate between the Unicode category
>> called "numeric", and the property of having a numerical value. So we'd
>> need both isNumeric and hasNumericValue. Ugh. It's ugly but if that's
>> what the standard is, then that's what it is.
>
> isNumber - _Number_ General category (as defined by Unicode 1:1)
>
> isNumeric - as having NumericType != None (again going by the
> definition of Unicode properties)
>
> And that's all, correct and to the letter.

Are you sure about that? The four values of Numeric_Type are:
* Decimal
* Digit
* None
* Numeric <= !!!

http://unicode.org/cldr/utility/properties.jsp?a=Numeric_Type#Numeric_Type

Hopefully, we'll have "isDecimal", "isDigit", and eventually
"isNumeric", which, by that definition, would simply be
"Numeric_Type == Numeric_Type.Numeric".
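
To make the four Numeric_Type values concrete, here is a rough sketch in Python rather than D, since Python's standard `unicodedata` module already exposes the underlying data. The function name `numeric_type` is made up for illustration and is not a proposed std.uni API. It relies on my understanding that `unicodedata.decimal` succeeds only for Numeric_Type=Decimal, `unicodedata.digit` for Decimal or Digit, and `unicodedata.numeric` for any character with a numeric value.

```python
import unicodedata


def numeric_type(ch):
    """Return the Numeric_Type of a single character as a string.

    Hypothetical helper: the name and the string return values are
    illustrative only, not part of any existing library.
    """
    # decimal() only has a value for Numeric_Type=Decimal characters.
    if unicodedata.decimal(ch, None) is not None:
        return "Decimal"
    # digit() has a value for Decimal and Digit; Decimal was handled above.
    if unicodedata.digit(ch, None) is not None:
        return "Digit"
    # numeric() has a value for every character with a numeric value.
    if unicodedata.numeric(ch, None) is not None:
        return "Numeric"
    return "None"


if __name__ == "__main__":
    # '5', SUPERSCRIPT TWO, VULGAR FRACTION ONE HALF, ROMAN NUMERAL EIGHT, 'A'
    for ch in ["5", "\u00b2", "\u00bd", "\u2167", "A"]:
        print("U+%04X" % ord(ch), numeric_type(ch))
```

With this reading, "isDecimal", "isDigit" and "isNumeric" would each be a comparison against one of the first three return values.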

The problem is that, by the definitions of the Unicode properties,
there is no name for "not in Numeric_Type.None".

"hasNumericValue" is the best name I could come up with to
differentiate between "Not Numeric_Type.None" and
"Numeric_Type.Numeric".
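
The three predicates being argued over can be told apart with a short Python sketch, again using the standard `unicodedata` module as a stand-in for std.uni. All three function names below are hypothetical and only mirror the names used in this thread. The sketch assumes `unicodedata.numeric` returns a value exactly when Numeric_Type is not None, and that `unicodedata.digit` covers both Decimal and Digit.

```python
import unicodedata


def is_number(ch):
    # General category Number: Nd, Nl or No.
    return unicodedata.category(ch).startswith("N")


def has_numeric_value(ch):
    # Numeric_Type != None: the character carries some numeric value.
    return unicodedata.numeric(ch, None) is not None


def is_numeric(ch):
    # Numeric_Type == Numeric: has a value, but is neither Decimal nor Digit.
    return has_numeric_value(ch) and unicodedata.digit(ch, None) is None


if __name__ == "__main__":
    # '5', VULGAR FRACTION ONE HALF, CJK ideograph for three (category Lo)
    for ch in ["5", "\u00bd", "\u4e09"]:
        print("U+%04X" % ord(ch), unicodedata.category(ch),
              is_number(ch), has_numeric_value(ch), is_numeric(ch))
```

The CJK ideograph is the kind of character that has a numeric value without being in the Number category, and '5' is the case where "has a numeric value" holds but "Numeric_Type.Numeric" does not.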