array trouble

Frits van Bommel fvbommel at REMwOVExCAPSs.nl
Tue May 1 02:40:00 PDT 2007


Jan Hanselaer wrote:
> Hi all,
> 
> When looking into the documentation of arrays I found something that 
> appeared rather "buggy" to me.
> In the documentation you can read this (it's an example to show how changing 
> the size of one array could affect the contents of another, namely when the 
> one is a slice of the other, here in the example b is a slice of a):
> 
> char[] a = new char[20];
> char[] b = a[0..10];
> char[] c = a[10..20];
> 
> b.length = 15; // always resized in place because it is sliced
>                     // from a[] which has enough memory for 15 chars

Correct.

> Now I have 2 questions:
> 1) How come that when I want to print all chars from b I get an error 
> (Error: 4invalid UTF-8 sequence)
> Normally I think that the new elements 10..15 from b should get the default 
> char, and then there would be no error.

They *do* get the default character. The default character, however, has 
value 0xFF, which is an invalid byte in UTF-8 (the encoding char[]s 
store). This is by design: it forces you to initialize your data 
properly. Similar values are used for wchar (0xFFFF) and dchar 
(0x0000FFFF).
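
A minimal sketch of the defaults described above, assuming a current D2 
compiler and Phobos (std.utf.validate and std.exception.assertThrown are 
the present-day names, not necessarily what existed when this was 
posted):

```d
import std.exception : assertThrown;
import std.utf : UTFException, validate;

void main()
{
    char c;    // no initializer, so c == char.init
    wchar w;
    dchar d;

    assert(c == 0xFF);
    assert(w == 0xFFFF);
    assert(d == 0x0000FFFF);

    // Every element of a freshly allocated char[] starts as char.init.
    char[] buf = new char[3];
    foreach (ch; buf)
        assert(ch == 0xFF);

    // 0xFF can never appear in well-formed UTF-8, so validation fails.
    assertThrown!UTFException(validate(buf));

    // After explicit initialization the buffer is valid text.
    buf[] = ' ';
    validate(buf);
}
```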

The same reasoning is applied to floating point variables, by the way 
(float, double, real and their imaginary and complex variants). They are 
initialized to a special value called NaN (Not a Number), which 
propagates: any calculation whose outcome depends on it also yields NaN.
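
As a small illustration of the NaN default (a sketch only; it relies on 
the IEEE rule that NaN compares unequal to itself, so it needs no 
library calls):

```d
void main()
{
    float f;     // no initializer, so f is NaN
    double x;    // likewise NaN

    // NaN is the only floating point value that is unequal to itself.
    assert(f != f);

    // Any result that depends on an uninitialized value is NaN as well.
    double sum = x + 1.0;
    assert(sum != sum);

    // An explicitly initialized variable behaves normally.
    double y = 0.0;
    assert(y + 1.0 == 1.0);
}
```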

> 2) Like they say in the documentation, also array a should be affected but 
> that doesn't seem to be true because I can still print the 20 original 
> chars.

Your attached code is flawed:
---
a = "abcdefghijklmnopqrstuv";
---
doesn't allocate the string on the heap; it makes 'a' refer to a 
statically-allocated string. Arrays not allocated on the heap can't be 
resized in place (to a larger length, anyway). Change it to
---
a = "abcdefghijklmnopqrstuv".dup;
---
to explicitly allocate on the heap.
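
A hedged sketch of the difference, written for a current D2 compiler 
(where string literals are immutable, so the assignment without .dup 
would not even compile). Whether the resize happens in place depends on 
the compiler and runtime version; later D runtimes may reallocate 
instead, so the sketch prints the outcome rather than asserting it:

```d
import std.stdio;

void main()
{
    string lit = "abcdefghijklmnopqrstuv";  // static, read-only data
    char[] a = lit.dup;                     // mutable copy on the GC heap

    a[0] = 'X';                 // writing to the copy is fine
    assert(lit[0] == 'a');      // the literal itself is untouched

    char[] b = a[0 .. 10];
    assert(b.ptr is a.ptr);     // a slice shares memory with its source

    b.length = 15;              // the new elements are char.init
    foreach (ch; b[10 .. 15])
        assert(ch == 0xFF);

    // Version-dependent: true if b was resized in place over a's memory.
    writeln("b still shares a's memory: ", b.ptr is a.ptr);
}
```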

You can change your "%s" format strings to "%x" to see the hexadecimal 
values of the characters in the string.
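
If the format-string route is awkward, the bytes can also be dumped one 
at a time. This is only a sketch for a current D2 compiler; hexDump is a 
hypothetical helper name, not something from Phobos:

```d
import std.format : format;
import std.stdio;

// Hypothetical helper: renders each code unit as two hex digits.
string hexDump(const(char)[] s)
{
    string result;
    foreach (i, ch; s)
    {
        if (i > 0)
            result ~= " ";
        result ~= format("%02x", cast(ubyte) ch);
    }
    return result;
}

void main()
{
    char[] b = "abcdefghij".dup;
    b.length = 15;   // elements 10..15 become char.init (0xFF)

    // Prints: 61 62 63 64 65 66 67 68 69 6a ff ff ff ff ff
    writeln(hexDump(b));
}
```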

> Now to make it a little bit stranger, when I try all this with ints instead 
> of chars it works like they say in the documentation. The new elements from 
> b get default int 0 and when I print the 3 arrays, I see they are all 
> affected by the change of length in b.

Your int code uses an array literal instead of a string literal 
(obviously, since those can't be used for int arrays), so it allocates 
on the heap.

> Can anyone explain all this? Am I just overlooking something?
> My test code for char and int is in the attachment.

See above.

