Notice/Warning on narrowStrings .length

H. S. Teoh hsteoh at quickfur.ath.cx
Thu Apr 26 18:36:25 PDT 2012


On Thu, Apr 26, 2012 at 09:03:59PM -0400, Nick Sabalausky wrote:
[...]
> Heh, any usage of Notepad *needs* to be justified. For example, it has an 
> undo buffer of exactly ONE change.

Don't laugh too hard. The original version of vi also had an undo buffer
of depth 1. In fact, one of the *current* vi's still only has an undo
buffer of depth 1. (Fortunately vim is much much saner.)


> And the stupid thing doesn't even handle Unix-style newlines.
> *Everything* handles Unix-style newlines these days, even on Windows.
> Windows *BATCH* files even accept Unix-style newlines, for 
> goddsakes! But not Notepad.
> 
> It is nice in its leanness and no-nonsense-ness. But it desperately needs
> some updates.

Back in the day, my favorite editor ever was Norton Editor. It's tiny
(only about 50k or less, IIRC) yet had innovative (for its day)
features... like split pane editing, ^V which flips capitalization to
EOL (so a single function serves for both uppercasing and lowercasing,
and you just apply it twice to do a single word).  Unfortunately it's a
DOS-only program.  I think it works in the command prompt, but I've
never tested it (the modern windows command prompt is subtly different
from the old DOS command prompt, so things may not quite work as they
used to).

It's ironic how useless Notepad is compared to an ancient DOS program
from the dinosaur age.


> At least it actually supports Unicode though. (Which actually I find 
> somewhat surprising.)

Now in that, at least, it surpasses Norton Editor. :-) But had Norton
not been bought over by Symantec, we'd have a modern, much more powerful
version of NE today. But, oh well. Things have moved on. Vim beats the
crap out of NE, Notepad, and just about any GUI editor out there. It
also beats the snot out of emacs, but I don't want to start *that*
flamewar. :-P


[...]
> > http://www.arthaey.com/conlang/ashaille/writing/sarapin.html
> >
> > whose components are graphically composed in, shall we say, entirely
> > non-trivial ways (see the composed samples at the bottom of the
> > page)?
> >
> 
> That's insane!
> 
> And yet, very very interesting...

Here's more:

	http://www.omniglot.com/writing/conscripts2.htm

Imagine if some of the more complicated scripts there were actually used
in a real language, and Unicode had to support it...  Like this one:

	http://www.omniglot.com/writing/talisman.htm

Or, if you *really* wanna go all-out:

	http://www.omniglot.com/writing/ssioweluwur.php

(Check out the sample text near the bottom of the page and gape in
awe at what creative minds let loose can produce... and horror at the
prospect of Unicode being required to support it.)


[...]
> > Currently, std.uni code (argh the pun!!)
> 
> Hah! :)
> 
> > is hand-written with tables of which character belongs to which
> > class, etc.. These hand-coded tables are error-prone and
> > unnecessary. For example, think of computing the layout width of a
> > UTF-8 stream. Why waste time decoding into dchar, and then doing all
> > sorts of table lookups to compute the width? Instead, treat the
> > stream as a byte stream, with certain sequences of bytes evaluating
> > to length 2, others to length 1, and yet others to length 0.
> >
> > A lexer engine is perfectly suited for recognizing these kinds of
> > sequences with optimal speed. The only difference from a real lexer
> > is that instead of spitting out tokens, it keeps a running total
> > (layout) length, which is output at the end.
> >
> > So what we should do is to write a tool that processes Unicode.txt
> > (the official table of character properties from the Unicode
> > standard) and generates lexer engines that compute various Unicode
> > properties (grapheme count, layout length, etc.) for each of the UTF
> > encodings.
> >
> > This way, we get optimal speed for these algorithms, plus we don't
> > need to manually maintain tables and stuff, we just run the tool on
> > Unicode.txt each time there's a new Unicode release, and the correct
> > code will be generated automatically.
> >
> 
> I see. I think that's a very good observation, and a great suggestion.
> In fact, I'd imagine it'd be considerably simpler than a typical lexer
> generator. Much less of the fancy regexy-ness would be needed. Maybe
> put together a pull request if you get the time...?
[...]

When I get the time? Hah... I really need to get my lazy bum back to
working on the new AA implementation first. I think that would
contribute greater value than optimizing Unicode algorithms. :-) I was
hoping *somebody* would be inspired by my idea and run with it...
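
To make the byte-stream idea quoted above a bit more concrete, here is a
minimal hand-written sketch in Python. It is not generated code and not
anything from std.uni. It assumes a deliberately tiny width table:
combining diacritical marks U+0300..U+036F count as 0 columns, CJK
unified ideographs U+4E00..U+9FFF count as 2, and everything else
(including every ASCII byte) counts as 1. The names layout_width and
_sequence_width are made up for illustration; a real tool would derive
the byte ranges from the Unicode data files instead of hard-coding them.

```python
def _sequence_width(lead: int, second: int) -> int:
    """Width of a multi-byte sequence, decided from its first two bytes.

    Assumed toy table (not the real Unicode width data):
      U+0300..U+036F -> 0   (UTF-8: CC 80..BF, CD 80..AF)
      U+4E00..U+9FFF -> 2   (UTF-8: E4 B8 80 .. E9 BF BF)
      anything else  -> 1
    """
    if lead == 0xCC or (lead == 0xCD and second <= 0xAF):
        return 0
    if (lead == 0xE4 and second >= 0xB8) or 0xE5 <= lead <= 0xE9:
        return 2
    return 1


def layout_width(data: bytes) -> int:
    """Sum column widths by scanning UTF-8 bytes; no code point is built."""
    width = 0
    lead = 0  # lead byte still waiting for its first continuation byte
    skip = 0  # continuation bytes left to skip in the current sequence
    for b in data:
        if skip or lead:
            if not 0x80 <= b <= 0xBF:
                raise ValueError("bad continuation byte")
            if skip:
                skip -= 1
            else:
                # The first two bytes are enough to classify the sequence.
                width += _sequence_width(lead, b)
                length = 2 if lead < 0xE0 else 3 if lead < 0xF0 else 4
                skip = length - 2
                lead = 0
        elif b < 0x80:
            width += 1  # simplification: every ASCII byte is one column
        elif 0xC2 <= b <= 0xF4:
            lead = b
        else:
            raise ValueError("bad lead byte")
    if lead or skip:
        raise ValueError("truncated sequence")
    return width
```

Under these assumptions, layout_width("e\u0301".encode("utf-8")) comes
out as 1, since the combining acute accent adds nothing, and a string of
two CJK ideographs comes out as 4. The running total is the only state
that survives between characters, which is the "lexer that keeps a sum
instead of emitting tokens" shape described in the quote.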


T

-- 
What do you mean the Internet isn't filled with subliminal messages? What about all those buttons marked "submit"??

