numericValue for (unicode) characters

monarch_dodra monarchdodra at gmail.com
Thu Jan 10 10:19:56 PST 2013


On Thursday, 10 January 2013 at 18:09:31 UTC, Dmitry Olshansky 
wrote:
> 10-Jan-2013 03:21, H. S. Teoh wrote:
>> On Mon, Jan 07, 2013 at 07:51:19PM +0100, monarch_dodra wrote:
>>> On Saturday, 5 January 2013 at 00:47:14 UTC, H. S. Teoh wrote:
>>>> [...]
>>>> I, for one, would love to know why isNumeric != hasNumericValue.
>> [...]
>>> I guess it's just bad wording from the standard.
>>>
>>> The standard defined 3 groups that make up Number:
>>> [Nd] 	Number, Decimal Digit
>>> [Nl] 	Number, Letter
>>> [No] 	Number, Other
>>>
>>> However, there are a couple of characters that *are* numbers, but
>>> aren't in those groups.
>>>
>>> The "Good" news is that the standard *does* define number_types to
>>> classify the kind of number a char is:
>>> * Null: Not a number
>>> * Digit: Obvious
>>> * Decimal: Any decimal number that is NOT a digit
>>> * Numeric: Everything else.
>>>
>>> So they used "Numeric" as a wildcard, and "Number" as their general
>>> category.
>>>
>>> This leaves us with ambiguity when choosing our word:
>>> Technically '5' does not classify as "numeric", although you could
>>> consider it "has a numeric value".
>>>
>>> I hope that makes sense.
>>
>> Hmph. I guess we need to differentiate between the unicode category
>> called "numeric", and the property of having a numerical value. So we'd
>> need both isNumeric and hasNumericValue. Ugh. It's ugly but if that's
>> what the standard is, then that's what it is.
>
> isNumber - _Number_ General category (as defined by Unicode 1:1)
>
> isNumeric - as having NumericType != None (again going by the
> definition of Unicode properties)
>
> And that's all, correct and to the letter.

Are you sure about that? The four values of Numeric_Type are:
* Decimal
* Digit
* None
* Numeric <= !!!
http://unicode.org/cldr/utility/properties.jsp?a=Numeric_Type#Numeric_Type

Hopefully, we'll have "isDecimal", "isDigit", and eventually
"isNumeric", which, according to the definition, would simply be
"Numeric_Type == Numeric_Type.Numeric".
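
To make the four values concrete, here is a rough sketch in Python rather
than D, only because Python's unicodedata module exposes the three value
fields directly. The function name numeric_type is made up for illustration
and is not a std.uni API. It assumes that unicodedata.decimal, .digit and
.numeric mirror the Decimal, Digit and Numeric fields of the Unicode data,
which is how I understand them to work.

```python
import unicodedata

def numeric_type(ch):
    """Best-effort Numeric_Type of one character (hypothetical helper)."""
    # Decimal: the character has a decimal-digit value.
    if unicodedata.decimal(ch, None) is not None:
        return "Decimal"
    # Digit: it has a digit value but no decimal value.
    if unicodedata.digit(ch, None) is not None:
        return "Digit"
    # Numeric: it has some numeric value but is neither of the above.
    if unicodedata.numeric(ch, None) is not None:
        return "Numeric"
    return "None"

for ch in ["5", "\u00b2", "\u00bd", "a"]:
    print(repr(ch), numeric_type(ch))
```

Under that assumption, '5' comes out as Decimal, SUPERSCRIPT TWO (U+00B2) as
Digit, VULGAR FRACTION ONE HALF (U+00BD) as Numeric, and 'a' as None.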

The problem is that by the definitions of Unicode properties,
there is no name for "not in Numeric_Type.None".

"hasNumericValue" is the best name I could come up with to
differentiate between "Not Numeric_Type.None" and
"Numeric_Type.Numeric".
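
A small sketch of the three predicates side by side may help show why the
names matter. Again this is Python, not D, and the function names are
hypothetical stand-ins for the names under discussion, not existing library
calls. It assumes Python's unicodedata tables are a fair proxy for the
Unicode properties.

```python
import unicodedata

def is_number(ch):
    # General_Category is one of Nd, Nl, No.
    return unicodedata.category(ch).startswith("N")

def has_numeric_value(ch):
    # Numeric_Type is anything other than None.
    return unicodedata.numeric(ch, None) is not None

def is_numeric(ch):
    # Numeric_Type is exactly Numeric: a value exists, but no digit value.
    return has_numeric_value(ch) and unicodedata.digit(ch, None) is None

for ch in ["5", "\u4e09", "x"]:
    print(repr(ch), is_number(ch), has_numeric_value(ch), is_numeric(ch))
```

With these definitions, '5' is a Number and has a numeric value but is not
"Numeric", while the CJK ideograph for three (U+4E09) is not in the Number
category at all, yet has a numeric value and is "Numeric".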

