Arbitrary abbreviations in phobos considered ridiculous

H. S. Teoh hsteoh at quickfur.ath.cx
Thu Mar 8 18:34:28 PST 2012


On Thu, Mar 08, 2012 at 07:07:04PM -0500, Nick Sabalausky wrote:
[...]
> But yea, it would be interesting to see a language that was based on
> something very different. A German-based one would be fun. Or even
> better, something that doesn't use the Latin alphabet, like Japanese
> or Hebrew or Russian. Or Swahili (which is an awesome-sounding
> language). Designing/using an Arabic (right-to-left, IIRC) programming
> language would be a great mind-fuck. Heh one of us should hack up DMD
> to produce a NihonD, using (or at least allowing) kanji instead of the
> kanas wherever appropriate :) That'd be both fun to make and to use.
[...]

You *do* realize that D allows non-Latin characters for identifiers,
right? With suitable use of alias, you could write some really funky
code:

	import std.stdio;
	alias std.stdio.writeln показать;
	alias size_t размер;
	alias string строка;
	alias void пустой;

	размер количество(Тип)(Тип[] массив) {
		return массив.length;
	}

	пустой main(строка[] параметров) {
		размер к = количество(параметров);
		показать("Я получил ", к, " параметров из командной строки");
	}

(Pity D doesn't let you alias keywords or rename the entry point, else
you could get rid of "return" and "main" too. :-P)

However, this is only the lexical aspect of things. The grammar is still
essentially inherited from C, which in turn inherited it from a primarily
English-speaking tradition. So the above code is still rather awkward
because many nouns are in the wrong case.

We can do better. What about a language that has inflections, like Greek
or Russian? Function calls can then be indicated by putting the function
name (verb) in imperative mood, whereas using the function name in
nominative case (gerund) turns it into a function pointer.  Variables
(nouns) can have nominative case to indicate the object a particular
method should be invoked on, and accusative case for other parameters.

Now of course, to keep things manageable (and consistent), we'll have to
eliminate the nasty complicated special-cases, spelling exceptions, and
all that stuff that we find in natural languages. All word endings are
universally applied, and must all be unambiguous.  Furthermore, most of
human language is descriptive, whereas in a programming language,
especially an imperative one, you'd want to be using imperatives almost
all the time. So some natural language features won't be very useful.

So what we want is to identify common functionality that programming
languages need, and encode those in our "verbs" and "nouns". Here's my
first stab at it:

- Verbs represent functions, and can have an imperative form (function
  call), a gerund form (function pointer/delegate), or an indicative
  form (function declaration).

- Nouns represent variables, and can have a nominative form (variable
  declaration), an instrumental form (indicating the object a method is
  invoked from), a genitive form (indicating data source, i.e. "in" in
  D), a dative form (indicating data sink, or "out" in D), an accusative
  form (generic parameter), or a construct form (member access).

- Furthermore, plural nouns represent arrays, and have their own set of
  endings for nominative, instrumental, genitive, dative, accusative.
  (So you get array notation for free, no need for special symbols.)

- Adjectives represent types, and agree with the modified noun in case
  and number, so they can appear anywhere in a command without
  ambiguity, even separated from the modified noun proper. Adjectives
  have no construct form.

Here's an arbitrary assignment of endings to word forms, just for
illustration's sake:

	Imperative verb:	-ize
	Gerund verb:		-ing
	Indicative verb:	-ation

			Nouns		Adjectives
			sg	pl	sg	pl
	Nominative:	-on	-ons	-dic	-tic
	Instrumental:	-ect	-ects	-oid	-idic
	Genitive:	-in	-ins	-nous	-rous
	Dative:		-out	-outs	-ny	-ney
	Accusative:	-or	-ors	-like	-ive
	Construct:	-'s	-s'
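
Since every ending in the table is meant to identify exactly one word
form, a lexer for such a language could be little more than a suffix
lookup. Here is a minimal sketch in D, not a real implementation. The
names Ending, endings, hasEnding and classify, and the role labels, are
made up for illustration. It also assumes a longest-match rule, because
in the illustrative table "-ation" happens to end in "-on" and "-idic"
in "-dic".

```d
import std.stdio : writeln;

// One row of the ending table. The role labels are chosen for this
// sketch; only the suffixes come from the table above.
struct Ending
{
    string suffix;
    string role;
}

immutable Ending[] endings = [
    Ending("ize", "verb, imperative"),
    Ending("ing", "verb, gerund"),
    Ending("ation", "verb, indicative"),
    Ending("on", "noun, nominative sg"),
    Ending("ons", "noun, nominative pl"),
    Ending("ect", "noun, instrumental sg"),
    Ending("ects", "noun, instrumental pl"),
    Ending("in", "noun, genitive sg"),
    Ending("ins", "noun, genitive pl"),
    Ending("out", "noun, dative sg"),
    Ending("outs", "noun, dative pl"),
    Ending("or", "noun, accusative sg"),
    Ending("ors", "noun, accusative pl"),
    Ending("'s", "noun, construct sg"),
    Ending("s'", "noun, construct pl"),
    Ending("dic", "adjective, nominative sg"),
    Ending("tic", "adjective, nominative pl"),
    Ending("oid", "adjective, instrumental sg"),
    Ending("idic", "adjective, instrumental pl"),
    Ending("nous", "adjective, genitive sg"),
    Ending("rous", "adjective, genitive pl"),
    Ending("ny", "adjective, dative sg"),
    Ending("ney", "adjective, dative pl"),
    Ending("like", "adjective, accusative sg"),
    Ending("ive", "adjective, accusative pl"),
];

// True if word consists of a non-empty stem followed by suffix.
bool hasEnding(string word, string suffix)
{
    return word.length > suffix.length
        && word[$ - suffix.length .. $] == suffix;
}

// Returns the role of the longest matching ending, or "unknown" for
// particles such as "is" and anything else the table does not cover.
string classify(string word)
{
    string best = "unknown";
    size_t bestLen = 0;
    foreach (e; endings)
    {
        if (hasEnding(word, e.suffix) && e.suffix.length > bestLen)
        {
            best = e.role;
            bestLen = e.suffix.length;
        }
    }
    return best;
}

void main()
{
    foreach (w; ["truncation", "returnize", "stdoutout", "arguments'"])
        writeln(w, ": ", classify(w));
}
```

A real design would pick endings where no such overlap exists, so that
the longest-match rule becomes unnecessary.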

Some grammatical particles we might need:

	is		Introduces function body
	;		Separates statements
	.		Ends function body

So here's some sample code. For illustration purposes I'm just
transliterating D keywords, though in an actual implementation of such a
language you'd want language-specific words instead.

	stringtic truncation stringrous arrayins is
		returnize arrays' 1..$ons.

	intdic maination stringive argumentors is
		intdic lenon arguments' lengthin;
		writelnize stdoutout lenor;
		writelnize stdoutout truncize argumentors;
		returnize 0or.

Here's the equivalent D code:

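	import std.stdio;

	// Drops the first element of the array.
	string[] trunc(string[] array) {
		return array[1..$];
	}
	int main(string[] argument) {
		// .length is a size_t, so narrow it explicitly to int.
		int len = cast(int) argument.length;
		stdout.writeln(len);
		stdout.writeln(trunc(argument));
		return 0;
	}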

Note that word order is relatively free, because word endings make the
function of each word unambiguous. So the above code could be written
like this instead:

	stringtic stringrous arrayins truncation is
		1..$ons arrays' returnize.

	intdic maination stringive argumentors is
		intdic lenon lengthin arguments';
		lenor stdoutout writelnize;
		stdoutout writelnize truncize argumentors;
		0or returnize.

OK, this sounds like a horrendous butchering of English, but imagine if
the root words were non-English, and the endings weren't butcherings of
English endings. You'd have a really unique language with almost free
word order.
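
To make the free word order concrete, here is a rough D sketch of how a
parser might sort the words of one statement into slots by ending alone.
It is only an illustration under heavy assumptions: the Call struct and
the functions hasEnding, stem and parse are hypothetical, only the
singular noun endings and the imperative are handled, and nested calls
(such as a "truncize" inside a "writelnize") are not attempted.

```d
import std.array : split;
import std.stdio : writeln;

// The pieces of a single, non-nested function call.
struct Call
{
    string verb;      // stem of the imperative verb (-ize)
    string object;    // instrumental noun (-ect): object the method is invoked on
    string sink;      // dative noun (-out): where data goes
    string[] sources; // genitive nouns (-in): where data comes from
    string[] args;    // accusative nouns (-or): generic parameters
}

// True if word consists of a non-empty stem followed by ending.
bool hasEnding(string word, string ending)
{
    return word.length > ending.length
        && word[$ - ending.length .. $] == ending;
}

// Removes ending from word; the caller must have checked hasEnding first.
string stem(string word, string ending)
{
    return word[0 .. $ - ending.length];
}

// Files each whitespace-separated word into a slot based on its ending,
// regardless of where it appears in the statement.
Call parse(string statement)
{
    Call c;
    foreach (word; statement.split())
    {
        if (word.hasEnding("ize"))
            c.verb = word.stem("ize");
        else if (word.hasEnding("ect"))
            c.object = word.stem("ect");
        else if (word.hasEnding("out"))
            c.sink = word.stem("out");
        else if (word.hasEnding("in"))
            c.sources ~= word.stem("in");
        else if (word.hasEnding("or"))
            c.args ~= word.stem("or");
    }
    return c;
}

void main()
{
    // Two orderings of the same statement from the sample code.
    writeln(parse("writelnize stdoutout lenor"));
    writeln(parse("lenor stdoutout writelnize"));
}
```

Both calls to parse fill in the same verb, sink and argument, which is
the whole point: the endings carry the structure, so the order of the
words carries none.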

Or, if spelt-out endings are too annoying to type, we can use symbols
instead, like this:

	stdout> writeln! arguments<


T

-- 
Creativity is not an excuse for sloppiness.

