eliminate junk from std.string?

Jerry Quinn jlquinn at optonline.net
Tue Jan 11 22:09:14 PST 2011


Andrei Alexandrescu Wrote:

> On 1/11/11 1:45 PM, Jerry Quinn wrote:
> > Unclear if iswhite() refers to ASCII whitespace or Unicode.  If Unicode, which version of the standard?
> 
> Not sure.
> 
> enum dchar LS = '\u2028'; /// UTF line separator
> enum dchar PS = '\u2029'; /// UTF paragraph separator
> 
> bool iswhite(dchar c)
> {
>      return c <= 0x7F
>          ? indexOf(whitespace, c) != -1
>          : (c == PS || c == LS);
> }
> 
> Which version?

This looks pretty incomplete if the goal is to return true for any Unicode whitespace character. My point was that if we're going to offer functions like this, their behavior needs to be defined more completely.
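
To make "more completely defined" concrete, here is a minimal sketch of a check against the Unicode White_Space property. The name isUnicodeWhite is hypothetical and not part of Phobos, and the code point list is the one used by Unicode 6.3 and later (earlier versions also counted U+180E), so treat the version choice as an assumption:

```d
/// Hypothetical helper, not part of Phobos: true if c has the Unicode
/// White_Space property (code point list as of Unicode 6.3 and later).
bool isUnicodeWhite(dchar c)
{
    switch (c)
    {
        case 0x0009: .. case 0x000D: // TAB, LF, VT, FF, CR
        case 0x0020:                 // SPACE
        case 0x0085:                 // NEXT LINE
        case 0x00A0:                 // NO-BREAK SPACE
        case 0x1680:                 // OGHAM SPACE MARK
        case 0x2000: .. case 0x200A: // EN QUAD .. HAIR SPACE
        case 0x2028:                 // LINE SEPARATOR
        case 0x2029:                 // PARAGRAPH SEPARATOR
        case 0x202F:                 // NARROW NO-BREAK SPACE
        case 0x205F:                 // MEDIUM MATHEMATICAL SPACE
        case 0x3000:                 // IDEOGRAPHIC SPACE
            return true;
        default:
            return false;
    }
}

unittest
{
    assert(isUnicodeWhite(' '));
    assert(isUnicodeWhite('\u00A0'));
    assert(!isUnicodeWhite('a'));
    assert(!isUnicodeWhite('\u200B')); // ZERO WIDTH SPACE is not White_Space
}
```

Whatever list is chosen, stating it (and the Unicode version it comes from) in the documentation is the part that matters.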


> > Same comment for icmp().  Also, in the Unicode standard, case folding can depend on the specific language.
> 
> That uses toUniLower. Not sure how that works.

And the documentation doesn't say which version of the Unicode standard it implements.


> > You've got chop() marked as deprecated.  Is popBack() going to make
> > sense as something that removes a variable number of chars from a
> > string in the CR-LF case?  That might be a bit too magical.
> 
> Well I found little use for chop in e.g. Perl. People either use chomp 
> or want to remove the last character. I think chop is useless.

Agreed, chomp is more useful. My question is whether popBack() should automatically act like Perl's chomp() for strings.
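
A small sketch of the difference in the CR-LF case, assuming popBack() on a string keeps removing exactly one code point (which is how Phobos behaves as far as I know) and that chomp() with no delimiter strips one trailing line terminator:

```d
import std.range : popBack;   // removes exactly one code point from the end
import std.string : chomp;    // removes one trailing line terminator, if present

void main()
{
    string line = "text\r\n";

    // chomp treats CR-LF as a single terminator.
    assert(chomp(line) == "text");

    // popBack drops one code point, so the CR is left behind.
    string s = line;
    s.popBack();
    assert(s == "text\r");
}
```

If popBack() were to strip both characters here, it would stop being a plain "remove one element" range primitive for strings, which is the magic I'm worried about.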

> > One set of functions I'd like to see are startsWith() and endsWith().  I find them frequently useful in Java and an irritating lack in the C++ standard library.
> 
> Yah, those are in std.algorithm. Ideally we'd move everything that's 
> applicable beyond strings to std.algorithm.

Ah, missed those.
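
For anyone else who missed them, a minimal usage sketch (the inputs are made up for illustration):

```d
import std.algorithm : startsWith, endsWith;

void main()
{
    // Works on strings...
    assert("std.string".startsWith("std."));
    assert("std.string".endsWith(".string"));

    // ...and on other ranges too, which is the argument for keeping
    // them in std.algorithm rather than std.string.
    assert([1, 2, 3, 4].startsWith([1, 2]));
    assert([1, 2, 3, 4].endsWith([3, 4]));
}
```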

Jerry


