What is the memory usage of my app?
via Digitalmars-d-learn
digitalmars-d-learn at puremagic.com
Fri Apr 17 07:49:18 PDT 2015
On Thursday, 16 April 2015 at 12:17:24 UTC, Adil wrote:
> I've written a simple socket-server app that securities (stock
> market shares) data and allows clients to query over them. The
> app starts by loading instrument information from a CSV file
> into
> some structs, then listens on a socket responding to queries. It
> doesn't mutate the data or allocate anything substantial.
>
> There are 2 main structs in the app. One stores security data,
> and the other groups together securities. They are defined as
> follows :
>
> ````
> __gshared Securities securities;
>
> struct Security
> {
> string RIC;
> string TRBC;
> string[string] fields;
> double[string] doubles;
>
> @nogc @property pure size_t bytes()
> {
> size_t bytes;
>
> bytes = RIC.sizeof + RIC.length;
> bytes += TRBC.sizeof + TRBC.length;
>
> foreach(k,v; fields) {
> bytes += (k.sizeof + k.length + v.sizeof +
> v.length);
> }
>
> foreach(k, v; doubles) {
> bytes += (k.sizeof + k.length + v.sizeof);
> }
>
> return bytes + Security.sizeof;
> }
> }
>
> struct Securities
> {
> Security[] securities;
> private size_t[string] rics;
>
> // Store offsets for each TRBC group
> ulong[2][string] econSect;
> ulong[2][string] busSect;
> ulong[2][string] IndGrp;
> ulong[2][string] Ind;
>
> @nogc @property pure size_t bytes()
> {
> size_t bytes;
>
> foreach(Security s; securities) {
> bytes += s.sizeof + s.bytes;
> }
>
> foreach(k, v; rics) {
> bytes += k.sizeof + k.length + v.sizeof;
> }
>
> foreach(k, v; econSect) {
> bytes += k.sizeof + k.length + v.sizeof;
> }
>
> foreach(k, v; busSect) {
> bytes += k.sizeof + k.length + v.sizeof;
> }
>
> foreach(k, v; IndGrp) {
> bytes += k.sizeof + k.length + v.sizeof;
> }
>
> foreach(k, v; Ind) {
> bytes += k.sizeof + k.length + v.sizeof;
> }
>
> return bytes + Securities.sizeof;
> }
> }
> ````
>
> Calling Securities.bytes shows "188 MB", but "ps" shows about
> 591
> MB of Resident memory. Where is the memory usage coming from?
> What am i missing?
After a quick look, it seems like you are only count the fields
memory in the associative arrays, but forgetting about the
internal data structure memory - this is a common mistake.
Depending on D's associative array implementation and growth
policies, (which I am not familiar with, yet), you might be
paying a lot of overhead from having so many of them, all of them
holding relatively small types,
which make the overhead/payload ratio very bad.
Unfortunately, to my knowledge, there is no way to query the
current capacity or load factor of an AA.
If I am reading druntime's code correctly, if your hash table
contains at least five elements, you are already paying at least
for sizeof(void*) * 31. The 31 grows based on predefined prime
number list you can see here:
https://github.com/D-Programming-Language/druntime/blob/master/src/rt/aaA.d#L36
I hope you can see how this overhead is gigantic for your case,
when you're mapping string -> double, or string -> ulong[2]
In addition, each allocation on the runtime heap incurs a booking
keeping cost of at least one pointer size, *often more*, and a
lot of times an addition extra padding cost for alignment
requirements.
There are a few more hidden costs that you can't easily avoid or
even calculate from within your binary that you will see in the
size the OS reports.
The solution in your case is to use more flat arrays and less AAs.
AAs are not a silver bullet! Sometimes it's faster to do
linear/binary search in a contiguous block of an array than to
search through an AA. This is very often the case for D's current
AA implementation.
Rant: I think D's associative array implementation is pretty bad
for such an integral and often used part of the language. Mostly
due to it being implemented in the runtime, as opposed to being
an inlineable library template, but also because it's using an
old-school linked-list approach which is pretty bad for you CPU
caches. I generally roll my own hash tables for perf sensitive
scenarios, which are more cpu efficient and almost always also
more memory efficient.
Sorry for the wall of text! I thought I'd elaborate a bit more
since I rarely see these hidden costs mentioned anywhere, in
addition to a general overuse of AAs.
More information about the Digitalmars-d-learn
mailing list