struct Location size

max haughton maxhaton at gmail.com
Thu May 11 03:10:17 UTC 2023


On Wednesday, 10 May 2023 at 20:21:18 UTC, Walter Bright wrote:
> On 5/10/2023 10:16 AM, max haughton wrote:
>> I also don't see why its a perf issue?
>
> Every time the line number is needed, the file source has to be 
> scanned from the start of the file source. Line numbers are 
> needed not just for error messages, but for symbolic debug 
> info. Generally debug compiles should be fast.

That happens approximately once or never per symbol, in a pattern 
that is probably mostly linear (i.e. why do it from scratch every 
time?).
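
To make "not from scratch" concrete, here is a minimal sketch of a 
lookup that remembers where the previous query stopped and only 
scans the bytes in between. `LineCursor` and `lineOf` are 
hypothetical names for illustration, not anything in dmd, and the 
sketch assumes locations are stored as byte offsets into the source 
text.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: maps a byte offset to a 1-based line number.
// It caches the last answer, so mostly-increasing queries cost only
// the distance from the previous offset, not the whole file.
class LineCursor {
public:
    // The caller must keep `source` alive for the cursor's lifetime.
    explicit LineCursor(const std::string& source) : source_(source) {}

    std::size_t lineOf(std::size_t offset) {
        assert(offset <= source_.size());
        if (offset < lastOffset_) {
            // Query went backwards: fall back to a scan from the start.
            lastOffset_ = 0;
            lastLine_ = 1;
        }
        // Count newlines only in [lastOffset_, offset).
        for (std::size_t i = lastOffset_; i < offset; ++i) {
            if (source_[i] == '\n') ++lastLine_;
        }
        lastOffset_ = offset;
        return lastLine_;
    }

private:
    const std::string& source_;
    std::size_t lastOffset_ = 0;  // offset answered by the last query
    std::size_t lastLine_ = 1;    // line number at lastOffset_
};
```

If backwards queries turned out to be common, a table of line start 
offsets plus a binary search would be the obvious alternative; which 
one wins is exactly the kind of thing that needs measuring.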

We live in a world where we can *parse gigabytes* of JSON per 
second (https://github.com/simdjson/simdjson), so it can be made 
fast. We can fit most full western names in a SIMD register now, 
even.

> Scanning the source files also faults them into memory.

As does parsing them. If you aren't going to use it except for a 
burst at the beginning and the end, you can tell the kernel as 
much very easily and it can use the physical side of the memory 
map for something else.
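
As a rough sketch of what telling the kernel could look like on a 
POSIX system, assuming the source file is mapped read-only with 
`mmap`: the names `MappedSource`, `mapSource`, `releaseUntilNeeded` 
and `unmapSource` are made up for illustration, and I am not 
claiming dmd maps its files this way. On Linux, `MADV_DONTNEED` on a 
file-backed mapping that was never written to lets the kernel drop 
the pages and fault them back in from the file on the next access.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

// Hypothetical wrapper around a read-only file mapping.
struct MappedSource {
    const char* data = nullptr;
    std::size_t size = 0;
};

// Map a file read-only; returns an empty MappedSource on failure.
inline MappedSource mapSource(const char* path) {
    MappedSource src;
    int fd = open(path, O_RDONLY);
    if (fd < 0) return src;
    struct stat st;
    if (fstat(fd, &st) == 0 && st.st_size > 0) {
        std::size_t len = static_cast<std::size_t>(st.st_size);
        void* p = mmap(nullptr, len, PROT_READ, MAP_PRIVATE, fd, 0);
        if (p != MAP_FAILED) {
            src.data = static_cast<const char*>(p);
            src.size = len;
        }
    }
    close(fd);  // the mapping stays valid after the descriptor is closed
    return src;
}

// Hint that the pages will not be touched for a while, so the kernel
// may reuse the physical memory. The contents stay readable: a later
// access simply faults the page back in from the file.
inline bool releaseUntilNeeded(const MappedSource& src) {
    assert(src.data != nullptr);
    return madvise(const_cast<char*>(src.data), src.size, MADV_DONTNEED) == 0;
}

inline void unmapSource(MappedSource& src) {
    if (src.data != nullptr) munmap(const_cast<char*>(src.data), src.size);
    src = MappedSource{};
}
```

The intended use would be to call `releaseUntilNeeded` once parsing 
is done and let later line-number lookups fault pages back in only 
when they actually happen.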

Building some of the code we have at Symmetry literally takes up 
all the memory on my machine, and it's not the files.

Overarching point here is that it needs to be measured properly.

I would also like to point out that making `Loc` smaller is 
basically tittle-tattle at the scale of dmd's memory allocation 
at the moment. It only feels good in the numbers Dennis posted 
because (in relative terms) druntime doesn't stress the compiler 
that much, and (in absolute terms) dmd is still allocating way 
too many objects overall.

If you want another free saving, hold fewer `Array`s as a pointer 
to a struct (which itself has a pointer in it), and make the struct 
smaller. Last time I measured it the modal length was either 0 or 1 
depending on how I measured it, so a lot of them are just wasted 
memory even if they never actually allocate anything themselves. 
Anecdotally, most memory allocated with the bump-the-pointer scheme 
is never written to (or more precisely is always 0).
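
One way to read that suggestion, sketched under my own assumptions: 
embed the array by value and give it a single inline slot, so the 
common lengths of 0 and 1 never touch the allocator. `SmallArray` 
below is a hypothetical illustration, not dmd's `Array`, and it only 
handles trivial element types such as raw pointers.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <cstring>
#include <type_traits>

// Hypothetical container: one element lives inline, and the heap is
// only used from the second push onwards.
template <typename T>
class SmallArray {
    static_assert(std::is_trivial<T>::value,
                  "sketch only handles trivial element types");

public:
    SmallArray() = default;
    SmallArray(const SmallArray&) = delete;
    SmallArray& operator=(const SmallArray&) = delete;
    ~SmallArray() { if (onHeap()) std::free(storage_.heap); }

    std::size_t length() const { return length_; }
    bool onHeap() const { return capacity_ > 1; }

    void push(T value) {
        if (length_ == capacity_) grow();
        if (onHeap()) storage_.heap[length_] = value;
        else storage_.single = value;
        ++length_;
    }

    T& operator[](std::size_t i) {
        assert(i < length_);
        return onHeap() ? storage_.heap[i] : storage_.single;
    }

private:
    // Double the capacity, moving the inline element out if needed.
    void grow() {
        std::uint32_t newCapacity = capacity_ * 2;
        T* block = static_cast<T*>(std::malloc(newCapacity * sizeof(T)));
        assert(block != nullptr);
        if (onHeap()) {
            std::memcpy(block, storage_.heap, length_ * sizeof(T));
            std::free(storage_.heap);
        } else {
            block[0] = storage_.single;  // exactly one inline element
        }
        storage_.heap = block;
        capacity_ = newCapacity;
    }

    // The inline slot and the heap pointer share the same bytes.
    union Storage { T single; T* heap; } storage_;
    std::uint32_t length_ = 0;
    std::uint32_t capacity_ = 1;  // 1 means "inline storage"
};
```

With pointer-sized elements on a typical 64-bit target this comes 
to 16 bytes per array and no separate allocation until a second 
element shows up. Whether that beats the current layout in practice 
would have to be measured.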

Longer term:

Make dmd more GC friendly: Currently, activating -lowmem often 
makes no difference because there's usually a reference to 
everything somewhere. I'm not sure how you'd find easy things to 
change, though.

I suppose the academic solution would be to find a way to export 
the GC's graph and stare at it, the engineering solution might be 
to print a report of which objects were still alive when the 
program exits and then stare at that instead.
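
A very rough sketch of the engineering version, with everything in 
it assumed for illustration: `LiveObjectReport`, `onAlloc` and 
`onFree` are hypothetical hooks that an allocator or collector would 
have to call, keyed by some type name. It only shows the bookkeeping 
and the report, not how to wire it into a real GC.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <map>
#include <string>
#include <vector>

// Hypothetical bookkeeping: counts objects that are still alive, per kind.
class LiveObjectReport {
public:
    struct Entry {
        std::string kind;
        std::size_t count;
        std::size_t bytes;
    };

    // Called when an object of `kind` is allocated.
    void onAlloc(const std::string& kind, std::size_t bytes) {
        Totals& t = live_[kind];
        t.count += 1;
        t.bytes += bytes;
    }

    // Called when an object of `kind` is freed or collected.
    void onFree(const std::string& kind, std::size_t bytes) {
        auto it = live_.find(kind);
        assert(it != live_.end());
        assert(it->second.count > 0 && it->second.bytes >= bytes);
        it->second.count -= 1;
        it->second.bytes -= bytes;
        if (it->second.count == 0) live_.erase(it);
    }

    // Everything still alive, largest total size first.
    std::vector<Entry> survivors() const {
        std::vector<Entry> out;
        for (const auto& kv : live_) {
            out.push_back({kv.first, kv.second.count, kv.second.bytes});
        }
        std::stable_sort(out.begin(), out.end(),
                         [](const Entry& a, const Entry& b) {
                             return a.bytes > b.bytes;
                         });
        return out;
    }

private:
    struct Totals {
        std::size_t count = 0;
        std::size_t bytes = 0;
    };
    std::map<std::string, Totals> live_;
};

// Print the report, e.g. just before the program exits.
inline void printReport(const LiveObjectReport& report) {
    for (const auto& e : report.survivors()) {
        std::printf("%-16s %8zu objects %10zu bytes\n",
                    e.kind.c_str(), e.count, e.bytes);
    }
}
```

Sorting by total bytes puts the kinds that are worth staring at 
first.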



