array of randomly generated names

Jonathan M Davis jmdavisProg at gmx.com
Fri Oct 15 17:46:03 PDT 2010


On Friday, October 15, 2010 12:50:53 spir wrote:
> Hello,
> 
> A few questions raised by a single func.
> 
> ===================
> alias char[] Text ;
> 
> Text letters = ['a','b','c',...] ;
> 
> Text[] nameSet (uint count , uint size) {
> 	/* set of count random names of size size */
> 	Text[] names ; names.length = count ;
> 	Text name ; name.length = size ;
> 	for (int i=0 ; i<count ; i++) {
> 		for (int j=0 ; j<size ; j++)
> 		    name[j] = letters[uniform(0u,26u)] ;
> 		names[i].length = size ;
> 		names[i][] = name ;
> 	}
> 	return names ;
> }
> ===================

First off, _never_ iterate over chars unless you're _sure_ that that's what you 
want. char and wchar are code units, not code points, so it can take several of 
them to make up a single code point. Letters are code points, not code units. 
If you need to iterate over characters, use dchar, so dchar[] or dstring. If 
you're just reading the string, you can have foreach do the conversion for you.

string s = "my string";

foreach(dchar c; s)
{
	//...
}

But that won't work for setting the characters.
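
To make the code unit / code point distinction concrete, here is a minimal 
sketch. The helper names codeUnits and codePoints are made up for illustration; 
the only language features used are .length and foreach with a dchar loop 
variable.

```d
/// Hypothetical helper: number of UTF-8 code units (bytes) in s.
size_t codeUnits(string s)
{
    return s.length;
}

/// Hypothetical helper: number of code points in s, counted by letting
/// foreach decode the UTF-8 into dchars.
size_t codePoints(string s)
{
    size_t n;
    foreach (dchar c; s)
        ++n;
    return n;
}
```

For a string such as "h\u00E9llo" the two counts differ, because the accented 
letter takes two code units in UTF-8 but is a single code point.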

> 1. In the  inner loop generating name, I have found neither a way to feed
> directly ints into name, nor a way to cast ints to chars using to! (also
> found no chr()). So, I had to list letters. But this wouldn't work with a
> wide range of unicode chars... How to build name directly from random
> ints?

I would expect to!dchar(num) to work where num is an integral value, but do 
_not_ do this with char unless you're specifically dealing with code units rather 
than code points. If, for some reason, to!dchar(num) does not work, then you can 
simply cast it with cast(dchar)(num), much as that's less desirable. However, 
given that many of the ASCII characters are not really for printing and that most 
of them aren't appropriate for names, let alone what the majority of unicode 
characters are like, you're probably going to want to do something rather fancier 
than simply generating a number in a certain range and converting that to a 
character.
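
As a rough sketch of the cast approach, restricted to a range that is known to 
contain only letters: randomLetter and randomName are hypothetical helpers, and 
the default range 'a' .. 'z' is just an assumption for the example. Only 
std.random.uniform comes from Phobos.

```d
import std.random : uniform;

/// Hypothetical helper: picks one code point uniformly from [first, last].
dchar randomLetter(dchar first, dchar last)
{
    // uniform!"[]" includes both end points; the cast turns the
    // resulting integer back into a dchar.
    return cast(dchar) uniform!"[]"(cast(uint) first, cast(uint) last);
}

/// Hypothetical helper: builds a name of `size` letters from that range.
dchar[] randomName(size_t size, dchar first = 'a', dchar last = 'z')
{
    auto name = new dchar[](size);
    foreach (ref c; name)
        c = randomLetter(first, last);
    return name;
}
```

Calling randomName(5, '\u03B1', '\u03C9') would draw from the lowercase Greek 
block instead. That only works this simply because every code point in that 
particular range is a letter, which is not true of arbitrary ranges.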

> 2. I was surprised to get all names equal... Seems that "names[i] = name"
> actually copies a ref to the name. Is there another way to produce a copy
> than "names[i][] = name"?

You really should take a look at http://www.digitalmars.com/d/2.0/arrays.html . 
Static arrays are value types, but dynamic arrays are reference types. You can 
even slice them without making any copies. e.g.

string a = "hello world";
string b = a[1 .. 7]; //it's a slice
assert(b == "ello w");

No copying is taking place there. If you want to copy an array, you use dup (or 
idup if you want an immutable copy). e.g.

string a = "hello world";
string b = a.idup; //It's an immutable copy.

Or, if you want to copy an array into an array, you'd do

string a = "hello world";
char[] b = new char[](a.length);
b[] = a[]; //it's a copy.
assert(b == "hello world");

Notice the empty []. That indicates a slice of the whole array. You could do 
partial slices instead:

string a = "hello world";
//literals are immutable on Linux, though I think that they're mutable on Windows
char[] b = "silly string".dup;

b[2..5] = a[4..7]; //a copy of part of the array.
assert(b == "sio w string");

Regardless of how much of the array you copy with [], notice that the slices of 
the arrays must be of the same length.
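
Applied to the question, this is why all the names came out equal with plain 
assignment. A small sketch of the difference, using two made-up functions 
(aliased and copied) that fill an array of names from one buffer:

```d
/// Hypothetical: every element ends up as a slice of the same buffer.
char[][] aliased(char[] name, size_t count)
{
    auto names = new char[][](count);
    foreach (ref n; names)
        n = name;       // copies the slice (pointer + length), not the characters
    return names;
}

/// Hypothetical: every element gets its own copy of the characters.
char[][] copied(char[] name, size_t count)
{
    auto names = new char[][](count);
    foreach (ref n; names)
        n = name.dup;   // allocates a fresh array with the same contents
    return names;
}
```

If name is modified afterwards, every element returned by aliased changes with 
it, while the elements returned by copied keep their old contents.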

> 3. As you see, I individually set the length of each names[i] in the outer
> loop. (This, only to be able to copy, else the compiler complains about
> unequals lengths.) How can I set the length of all elements of names once
> and for all?

You're dealing with a multi-dimensional array. Each inner array is empty until 
you give it a length, so of course it won't work to index it before then. If you 
want to set the whole thing at once, then do

auto names = new dchar[][](numNames, nameLength);
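
A short sketch of what that allocation gives you. allocateNames and its fill 
parameter are invented for the example; the point is only that every inner 
array already has its full length, so it can be indexed or slice-assigned right 
away.

```d
/// Hypothetical helper: numNames arrays of nameLength dchars each,
/// all set to the same fill character.
dchar[][] allocateNames(size_t numNames, size_t nameLength, dchar fill)
{
    // Outer length first, then inner length.
    auto names = new dchar[][](numNames, nameLength);
    foreach (row; names)
        row[] = fill;   // each inner array already has nameLength elements
    return names;
}
```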

Now, personally, I would argue that you really should be using string as much as 
possible (or dstring when you have to) and avoid mutable arrays of char, wchar, 
or dchar. That being the case, I'd advise doing this

auto names = new string[](numNames);

then use dchar[] in the for loop (maybe even make it a static one to avoid the 
memory allocation) and then use to!string() to create a string from it and put it 
in the list of names. e.g.

dchar[nameLength] name;
//...
//since it's a static array in this case, you have to slice it to pass it to to!()
names[i] = to!string(name[]);
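
Put together, the function from the question might look roughly like this. It 
is a sketch under a few assumptions: the name nameSet is kept from the original 
post, lowercase ASCII letters stand in for the letters table, and the buffer is 
a dynamic dchar[] rather than a static array so that size can be a runtime 
value.

```d
import std.conv : to;
import std.random : uniform;

/// Sketch: `count` random names of `size` lowercase letters each.
string[] nameSet(size_t count, size_t size)
{
    auto names = new string[](count);
    auto name = new dchar[](size);      // one scratch buffer, reused for every name
    foreach (ref n; names)
    {
        foreach (ref c; name)
            c = cast(dchar)('a' + uniform(0, 26));
        // to!string allocates a new UTF-8 string, so the stored names
        // do not share memory with the scratch buffer.
        n = to!string(name);
    }
    return names;
}
```

Because each call to to!string produces a fresh string, overwriting the buffer 
on the next iteration does not affect the names that are already stored.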

> 4. Is there a kind of map(), or a syntax like list comprehension, to
> generate array content from a formula? (This would here replace both
> loops.)

Not that I'm aware of. Though, if you could define a range (IIRC it would need to 
be an input range) which generates the next element in the array when popFront() 
is called, then you could use std.array.array() to create an array from such a 
range.
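
With a current compiler and Phobos, one way to get close to that is to combine 
std.range.iota and std.algorithm.map with std.array.array(). This is only a 
sketch: nameSetFromRanges is a made-up name, the letters are assumed to be 
lowercase ASCII, and the loop indices i and j exist only to drive the ranges.

```d
import std.algorithm : map;
import std.array : array;
import std.conv : to;
import std.random : uniform;
import std.range : iota;

/// Sketch: `count` random names of `size` letters, built without explicit loops.
string[] nameSetFromRanges(size_t count, size_t size)
{
    return iota(count)
        .map!(i => iota(size)
            .map!(j => cast(dchar)('a' + uniform(0, 26)))
            .array          // dchar[] holding one name
            .to!string)     // converted to an immutable UTF-8 string
        .array;             // string[] holding all the names
}
```

The ranges are lazy, so nothing is generated until array() walks them; each 
element is produced once during that walk.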

> 5. Seems there is no auto-conversion between char[] and string. Thus, I use
> only char[], else I'm constantly blocked with immutability. But for this
> reason I cannot use nice facilities to construct text expressions, like
> format(). Also, I need to convert every literal, even chars, when they
> must go into a char[]. Grrr! Even to initialise: Text text =
> to!Text("start text") ;
> Hints welcome.

It's typical in D to just use strings everywhere rather than char[]. However, 
there are definitely cases where you need mutable arrays of characters, so there 
has been some discussion of making many (perhaps all) of the std.string 
functions work with all string types, but that hasn't been done yet. Also, as I 
mentioned before, if you're dealing with individual characters, you really 
should be dealing with dchar[] or dstring.

I'd say that it's fairly typical to do whatever string processing that you need 
to do with a mutable string type and then idup it or use to!() to convert it 
to an immutable one (most typically string) when you're done.

If efficiency isn't a big issue, you can simply append to an immutable string type 
with ~= and forget about using indexing to set the individual characters.
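
A minimal sketch of the append approach; randomNameByAppending is a hypothetical 
name and the letters are again assumed to be lowercase ASCII:

```d
import std.random : uniform;

/// Sketch: builds a name by appending one character at a time.
string randomNameByAppending(size_t size)
{
    string name;
    foreach (i; 0 .. size)
        name ~= cast(dchar)('a' + uniform(0, 26));  // ~= encodes the dchar as UTF-8
    return name;
}
```

Appending may reallocate as the string grows, which is the efficiency cost 
mentioned above, but no mutable buffer or final conversion is needed.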

Also, as a side note, I wouldn't advise using an alias for char[] if you intend 
other people to be reading your code. It's just going to confuse people.

- Jonathan M Davis

