Proposal for new compiler built-ins: __CTX__ and __CONTEXT__
Andrej Mitrovic
andrej.mitrovich at gmail.com
Thu Jun 22 17:18:25 UTC 2023
This isn't a DIP but I'd like to start this conversation to see
what people think.
One pattern I often see in D libraries is the capturing of
context of the call site, for example the file and line number
and sometimes the function and module names.
There are many examples of this in the real world. Here's a quick
search result of all the D modules containing `__FILE__` on
Github:
https://github.com/search?q=%22__FILE__%22+lang%3AD+&type=code
And here are similar results for `__FUNCTION__`:
https://github.com/search?q=%22__FUNCTION__%22+lang%3AD+&type=code
They're often passed in pairs, or sometimes quadruplets.
I have a few things in mind:
- It's repetitive having to list out all of these context
keywords and assign each one of them to their own free parameter.
- It's visually noisy seeing them and it can distract the reader.
- It's possible to accidentally pass the wrong argument to a
parameter that is default initialized to `__FILE__` or
`__LINE__`. Maybe your logging function takes a formatting string
but you accidentally pass the formatting string to the
`string file = __FILE__` parameter.
- Once these parameters are received it's cumbersome to pass them
around to other routines. Passing these downwards is a pretty
common use-case of many libraries.
- It consumes stack space due to the need to pass context
parameters on the stack, and might even eat up the EAX register
(according to the ABI:
https://dlang.org/spec/abi.html#parameters). This may or may not
be a problem for you, depending on your use-case and depending on
how many parameters you actually pass around.
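To make the points above concrete, here is a minimal sketch of the pattern as it can be written today. The names `describe` and `warn` are made up for illustration and don't come from any particular library.

```D
import std.format : format;
import std.stdio : writeln;

// Today's pattern: every piece of call-site context is its own
// defaulted parameter.
string describe(string msg, string file = __FILE__, size_t line = __LINE__)
{
    return format("%s(%s): %s", file, line, msg);
}

// Forwarding the context means re-declaring and re-passing each piece.
string warn(string msg, string file = __FILE__, size_t line = __LINE__)
{
    return describe("warning: " ~ msg, file, line);
}

void main()
{
    // Reports this file and this line, captured at the call site.
    writeln(warn("disk almost full"));

    // Compiles fine, but "oops" silently lands in the `file` parameter.
    writeln(describe("value: %s", "oops"));
}
```

The second call is the accidental misuse described above: nothing in the signature distinguishes a message argument from a file name.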
We could simplify this and maybe even make it more flexible.
I'm envisioning a new type `__CTX__` that bundles all of the
contexts which are currently separate keywords, listed at:
https://dlang.org/spec/expression.html#specialkeywords.
Here it is:
```D
struct __CTX__ {
    string file;
    string file_full_path;
    string mod;
    int line;
    string func;
    string pretty_func;
}
```
Having it defined as a struct serves a few objectives:
- It makes it easy to declare and use this type in user code.
- All library code will have one single compatible type they can
easily pass around to each other.
- Makes it harder to confuse parameters. For example it's
currently easy to pass a random string to a function expecting a
`__FILE__`, because the parameter is a string.
- It makes it possible to choose whether to receive and pass this
structure around by value or by reference.
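These objectives can be sketched in today's D with an ordinary struct. `Ctx` below is a hypothetical, trimmed-down stand-in for the proposed `__CTX__`, and it is filled in by hand at the call site since no compiler support exists; `describe` and `warn` are illustrative names only.

```D
import std.format : format;
import std.stdio : writeln;

// Hypothetical stand-in for the proposed __CTX__, trimmed to two fields.
struct Ctx
{
    string file;
    int line;
}

string describe(string msg, Ctx ctx)
{
    return format("%s(%s): %s", ctx.file, ctx.line, msg);
}

// Forwarding is a single argument, however many fields Ctx has.
string warn(string msg, Ctx ctx)
{
    return describe("warning: " ~ msg, ctx);
}

// A stray string no longer fits where the context is expected.
static assert(!__traits(compiles, describe("value: %s", "oops")));

void main()
{
    // Filled in by hand here; the proposal would have the compiler do it.
    writeln(warn("disk almost full", Ctx(__FILE__, __LINE__)));
}
```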
We also need to initialize it. So perhaps we'd call this
initialization keyword `__CONTEXT__`.
Here's what the client code might look like:
```D
// ctx passed by value (on the stack)
void infoStack(string msg, __CTX__ ctx = __CONTEXT__) {
    writefln("%s(%s): %s", ctx.file, ctx.line, msg);
}

// ctx passed by reference (a pointer under the hood)
void infoRef(string msg, ref __CTX__ ctx = __CONTEXT__) {
    writefln("%s(%s): %s", ctx.file, ctx.line, msg);
}

void main() {
    infoStack("Hello world");
    infoRef("Hello world");
}
```
Notice that the calls to `infoStack` and `infoRef` will generate
different assembly code, as `__CTX__` is passed by value when
calling `infoStack` and by reference when calling `infoRef`.
Here's a full example with some fake context just to give a
clearer picture:
```D
import std.stdio : writefln;

// Stand-in for the proposed built-in: returns fake context by reference.
ref __CTX__ __CONTEXT__() {
    static __CTX__ ctx = __CTX__("mymod.d",
        "/project/src/mymod.d", "mymod", 123,
        "mymod.func", "void mymod.func()");
    return ctx;
}

struct __CTX__ {
    string file;
    string file_full_path;
    string mod;
    int line;
    string func;
    string pretty_func;
}

// ctx passed by value (on the stack)
void infoStack(string msg, __CTX__ ctx = __CONTEXT__) {
    writefln("%s(%s): %s", ctx.file, ctx.line, msg);
}

// ctx passed by reference (a pointer under the hood)
void infoRef(string msg, ref __CTX__ ctx = __CONTEXT__) {
    writefln("%s(%s): %s", ctx.file, ctx.line, msg);
}

void main() {
    infoStack("Hello world"); // mymod.d(123): Hello world
    infoRef("Hello world"); // mymod.d(123): Hello world
}
```
I think it should be possible to receive `__CTX__` by reference
from the compiler, as all of the default arguments must be known
at compile-time. So the context could be stored in `RODATA` or
somewhere similar and its address taken from there (but I'm a bit
out of my depth here).
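As a rough illustration of that idea, the sketch below hand-writes what the compiler might emit for one call site: a single immutable record that the callee receives by reference. `Ctx`, `callSite` and `describe` are hypothetical names, and whether the record actually ends up in a read-only section depends on the compiler and target.

```D
import std.format : format;
import std.stdio : writeln;

// Hypothetical stand-in for the proposed __CTX__, trimmed to two fields.
struct Ctx
{
    string file;
    int line;
}

// Hand-written equivalent of what the compiler might emit per call site:
// one immutable record that can live in read-only data.
immutable Ctx callSite = Ctx("mymod.d", 123);

// `ref const` accepts the immutable record without copying it.
string describe(string msg, ref const Ctx ctx)
{
    return format("%s(%s): %s", ctx.file, ctx.line, msg);
}

void main()
{
    writeln(describe("Hello world", callSite)); // mymod.d(123): Hello world
}
```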
I can think of some downsides:
- Passing `__CTX__` by value can make you capture more than
you're interested in. Perhaps you're only interested in file +
line and don't care about module name, function name, etc. That
means you're suddenly consuming a lot more stack space.
  - Counter-point: You can always continue to use `__FILE__` and
  `__LINE__`. I'm not suggesting to deprecate or remove these.
- It's possible `__CTX__` could later grow if the compiler
developers decide to add a new context field to it. This would
somewhat negatively impact all user code which takes `__CTX__` by
value, as it would grow the stack usage.
  - Therefore people might opt to use `ref __CTX__`, but that adds
  pointer indirection.
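The size trade-off can be checked with a short sketch. `Ctx` here is a hypothetical stand-in with the same fields as the `__CTX__` in the full example above; the exact byte counts depend on the target, so the code only asserts lower bounds and prints the rest.

```D
import std.stdio : writefln;

// Hypothetical stand-in with the same fields as the full example's __CTX__.
struct Ctx
{
    string file;
    string file_full_path;
    string mod;
    int line;
    string func;
    string pretty_func;
}

// By value, the callee receives at least five slices plus an int.
static assert(Ctx.sizeof >= 5 * string.sizeof + int.sizeof);

// By reference, only a pointer-sized handle is passed.
static assert((Ctx*).sizeof == size_t.sizeof);

void main()
{
    writefln("by value: %s bytes, by ref: %s bytes",
        Ctx.sizeof, (Ctx*).sizeof);
}
```

Any field added to the struct later increases the first number but leaves the second unchanged.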
I believe context is often used in more expensive situations so
passing by reference might be okay in many cases, for example:
- File and line contexts are used when throwing exceptions.
Passing the file+line through a pointer indirection (`ref
__CTX__`) isn't that expensive compared to throwing the exception
itself.
- File and line contexts are used when logging messages. If this
involves File I/O as it often does then the pointer indirection
is a fraction of the total cost of the operation.
-----
Any other benefits / drawbacks, or unforeseen complications you
can think of?