C is Brittle D is Plastic

Quirin Schroll qs.il.paperinik at gmail.com
Fri Apr 3 12:20:19 UTC 2026


On Sunday, 22 March 2026 at 04:47:41 UTC, Walter Bright wrote:
> It's true that writing code in C doesn't automatically make it 
> faster.
>
> For example, string manipulation. 0-terminated strings (the 
> default in C) are, frankly, an abomination. String processing 
> code is a tangle of strlen, strcpy, strncpy, strcat, all of 
> which require repeated passes over the string looking for the 
> 0. (Even worse, reloading the string into the cache just to 
> find its length makes things even slower.)
>
> Worse is the problem that, in order to slice a string, a malloc 
> is needed to copy the slice to. And then carefully manage the 
> lifetime of that slice.
>
> The fix is simple - use length-delimited strings.
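
To make the slicing point concrete, here is a minimal C++17 sketch contrasting the two approaches. The helper names `slice_c` and `slice_view` are made up for illustration; they are not from any library.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <string_view>

// 0-terminated: a slice from the middle of a string needs its own buffer,
// because the terminating 0 has to be written somewhere.
// Hypothetical helper; the caller owns the result and must free() it.
char* slice_c(const char* s, std::size_t from, std::size_t len) {
    char* out = static_cast<char*>(std::malloc(len + 1));
    if (out == nullptr) return nullptr;
    std::memcpy(out, s + from, len);
    out[len] = '\0';
    return out;
}

// Length-delimited: a slice is just a pointer and a length into the same
// memory. No allocation and no copy, but the view must not outlive `s`.
std::string_view slice_view(std::string_view s, std::size_t from,
                            std::size_t len) {
    return s.substr(from, len);
}
```

The second version does no copying at all, which is also why lifetime becomes the caller's problem.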

Working with C++ and having implemented my own `span` and 
`string_view` types, I think D’s strings are under-appreciated. 
They’re by far the best strings I’ve seen in a programming 
language.

C++ has: `char*` and `const char*`, `char[]`, `const char[]`, 
`std::string`, `std::string_view`, `std::span<char>`, and 
`std::span<const char>` (those times 7 because there’s `wchar_t`, 
`char8_t`, `char16_t` and `char32_t` and `signed`/`unsigned 
char`) (those times 2 for `volatile` to be pedantic). At least. 
Knowing when to use which is not straightforward. Comparing 
substrings efficiently is difficult, I always have to look up the 
arguments for `compare`. Most people just allocate substrings and 
compare those naïvely. Returning a length-delimited string with 
mutable content and with a default is impossible before C++20’s 
`span`: You can return `data()` or `nullptr` erasing the length, 
or return a `string_view` making the characters `const`. 
`std::string_view` and `span`s easily dangle, so using them e.g. 
for map keys is an issue: If *one* key would be dangling, you 
*have* to use `std::string` and copy all the keys, simply because 
C++ doesn’t have a GC; and then, you need `std::less<>` on your 
ordered map type or `std::equal_to<>` and a custom transparent hash 
on your unordered map type because you still want to look up using 
`string_view` keys without copying them. Appending is done with 
`+` and only ever returns `std::string`. C++ has no `switch`ing 
on strings; if they’re short, you can write a `constexpr` utility 
function that maps them to numbers and switch on them. To top it 
all off, (`unsigned`) `char*` are also used for random data 
(instead of `void*`), and (`un`)`signed char` as the smallest 
integer types.
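
A rough C++17 sketch of three of the workarounds mentioned above: comparing substrings without allocating, looking up in an ordered map with a `string_view` key, and switching on short strings via a `constexpr` mapping to numbers. All names here (`same_substring`, `Table`, `lookup_or`, `pack`, `keyword_id`) are invented for illustration, and packing the bytes into a `std::uint64_t` is only one possible mapping; it assumes strings of at most 8 bytes without embedded zeros.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <map>
#include <string>
#include <string_view>

// Compare s[pos1 .. pos1+len) with t[pos2 .. pos2+len) without allocating.
// substr throws std::out_of_range if a position is past the end.
bool same_substring(std::string_view s, std::size_t pos1,
                    std::string_view t, std::size_t pos2, std::size_t len) {
    return s.substr(pos1, len) == t.substr(pos2, len);
}

// Owning keys (so nothing dangles) plus the transparent comparator
// std::less<>, so find() accepts a string_view without building a
// temporary std::string.
using Table = std::map<std::string, int, std::less<>>;

int lookup_or(const Table& table, std::string_view key, int fallback) {
    auto it = table.find(key);
    return it == table.end() ? fallback : it->second;
}

// Map a short string to a number by packing its bytes into 64 bits.
// Collision-free only for strings of at most 8 bytes with no 0 bytes.
constexpr std::uint64_t pack(std::string_view s) {
    std::uint64_t v = 0;
    for (char c : s) v = (v << 8) | static_cast<unsigned char>(c);
    return v;
}

// "switch on a string": returns 0 for anything that is not a known keyword.
int keyword_id(std::string_view word) {
    if (word.size() > 8) return 0; // would not fit into the packed value
    switch (pack(word)) {
        case pack("if"):    return 1;
        case pack("else"):  return 2;
        case pack("while"): return 3;
        default:            return 0;
    }
}
```

None of this is hard, but every piece is something you have to know about and write yourself.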

C has a small subset of this, which makes it arguably worse.

D has: `immutable(char)[]`, `const(char)[]`, and `char[]` (times 
3 for `wchar` and `dchar`). Their use-cases are straightforward, 
you never need to decide vanilla/signed/unsigned or `wchar_t` vs 
`char16_t`/`char32_t`. You can just return a `char[]` you have 
and an empty one as the default. They can’t easily dangle. 
Appending has its own operator `~` and you can “just append” things. 
Comparing substrings is straightforward. You can just use 
`string` as map key type and just perform lookup with a `char[]`. 
You can just `switch` on strings. (Honestly, D should add 
`switch` for all slice types over switchable types: You should be 
able to switch over `int[]` and `int[][]`.) Sure, there’s also 
`shared` and `inout`, which are basically in the same camp as 
C++’s `volatile`: You rarely encounter them.

Built-in length-delimited strings (or slices/spans) would be a 
win for C, but C++ shows they’re not a panacea. The GC enables 
D’s length-delimited strings to be great instead of just good; 
when you disable the GC, they’re still good and do profit off of 
the GC existing at compile-time. You can build a static string at 
compile-time in D. That’s impossible in C++ before C++26; I’m not 
sure if C++26’s reflection can do it. A lot of C++ code is C++03, 
lacking even basics, and lots more is C++14 (still many Linux 
distros’ default compiler’s default), which has no `string_view`; 
`span` and transparent lookup in unordered containers only arrived 
with C++20. Before actually working with those, I couldn’t have 
imagined how terrible it was.

The worst parts of D’s strings are auto-decoding and that 
literals include a secret zero past the end without you asking 
for it.

About the latter, maybe in the next Edition, we can have 
`""z` strings that request a secret zero and only add the zero if 
a non-z string is used to initialize an `immutable(char)*` or 
`const(char)*` (or make it an error to omit the `z` in that 
case). That would allow for some compression to be done by the 
compiler: If you have strings in your code like `"BC"` and 
`"ABCD"`, it could just re-use a segment of the longer one. With 
the secret zero, it can only re-use suffixes.
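
The suffix restriction can be checked with a small model, written here in C++17 because its literals carry the same hidden terminator. It treats "can the compiler overlay this literal onto another one" as a plain substring search over the stored bytes; that is an assumption for illustration, not a description of what any D or C++ compiler actually does. The helper names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>

// View of a literal including its hidden terminating 0 byte.
template <std::size_t N>
constexpr std::string_view with_zero(const char (&lit)[N]) {
    return {lit, N};
}

// View of a literal without the terminator (length-delimited only).
template <std::size_t N>
constexpr std::string_view without_zero(const char (&lit)[N]) {
    return {lit, N - 1};
}

// True if the bytes of `literal` occur somewhere inside `pool`, i.e. the
// literal could point into the storage that already holds `pool`.
bool can_share(std::string_view pool, std::string_view literal) {
    return pool.find(literal) != std::string_view::npos;
}
```

Under this model, `can_share(without_zero("ABCD"), without_zero("BC"))` holds, while `can_share(with_zero("ABCD"), with_zero("BC"))` does not, because the byte after `BC` would have to be 0 and is `D`. Only a suffix such as `"CD"` still fits once the terminator is included.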

