Large memory allocations
Janderson
ask at me.com
Sat Nov 15 13:35:09 PST 2008
bearophile wrote:
> While allocating lots of memory for a little memory-hungry program, I have found results that I don't understand. So I have written the following test programs. Maybe someone can give me some information on the matter.
> I am using a default install of 32-bit Windows XP with 2 GB of RAM (so, for example, I can't allocate 3 GB). (I presume the answers to my questions are Windows-related.)
>
> From C (MinGW 4.2.1), this is about the largest memory block I can allocate, 1_920_000_000 bytes (though it swaps and takes 7+ seconds to run):
>
> #include <stdio.h>
> #include <stdlib.h>
>
> #define N 480000000
>
> int main() {
>     unsigned int* a = (unsigned int*)malloc(N * sizeof(unsigned int));
>     unsigned int i;
>     if (a != NULL)
>         for (i = 0; i < N; ++i)
>             a[i] = i;
>     else
>         printf("null!");
>     return 0;
> }
>
>
> But from D, this is about the largest memory block I can allocate with std.c.stdlib.malloc: 1_644_000_000 bytes. Do you know the reason for the difference?
>
> //import std.gc: malloc;
> import std.c.stdio: printf;
> import std.c.stdlib: malloc;
>
> void main() {
>     const uint N = 411_000_000;
>     uint* a = cast(uint*)malloc(N * uint.sizeof);
>     if (a !is null)
>         for (uint i; i < N; ++i)
>             a[i] = i;
>     else
>         printf("null!");
> }
>
> (If I use std.gc.malloc the situation is different again, and generally worse.)
>
> -----------------------
>
> So I have tried allocating a sequence of smaller memory blocks instead; this is the C code (every block is about 1 MB):
>
> #include <stdio.h>
> #include <stdlib.h>
>
> #define N 250000
>
> int main(int argc, char** argv) {
>     unsigned int i, j;
>     unsigned int m = argc == 2 ? atoi(argv[1]) : 100;
>
>     for (j = 0; j < m; ++j) {
>         unsigned int* a = (unsigned int*)malloc(N * sizeof(unsigned int));
>
>         if (a != NULL) {
>             for (i = 0; i < N; ++i)
>                 a[i] = i;
>         } else {
>             printf("null! %u\n", j);
>             break;
>         }
>     }
>
>     return 0;
> }
>
>
> And the D code:
>
> //import std.gc: malloc;
> import std.c.stdio: printf;
> import std.c.stdlib: malloc;
> import std.conv: toUint;
>
> void main(string[] args) {
>     const uint N = 250_000;
>     // Default to 100 blocks when no argument is given, matching the C version.
>     uint m = (args.length == 2) ? toUint(args[1]) : 100;
>
>     for (uint j; j < m; ++j) {
>         uint* a = cast(uint*)malloc(N * uint.sizeof);
>
>         if (a !is null) {
>             for (uint i; i < N; ++i)
>                 a[i] = i;
>         } else {
>             printf("null! %u\n", j);
>             break;
>         }
>     }
> }
>
> With such code I can allocate 1_708_000_000 bytes from D and up to 2_038_000_000 bytes from C (but for the last 100-200 MB of RAM the C code swaps a lot).
> So why can't I use all my RAM from my D code? And do you know why?
>
> Bye,
> bearophile
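Rather than hand-tuning N, the largest grantable block can be probed directly with a binary search over malloc. Below is a minimal C sketch, assuming a 32-bit process; it treats success as roughly monotonic in size, which holds well enough when each trial block is freed immediately:

#include <stdio.h>
#include <stdlib.h>

/* Binary-search the largest single block malloc() will grant.
   Sketch only: assumes a 32-bit process, and that freeing each
   trial block keeps the heap state comparable between probes. */
int main(void) {
    size_t lo = 0;                 /* largest size known to succeed */
    size_t hi = (size_t)1 << 31;   /* 2 GB upper bound for 32-bit user space */

    while (lo + 1 < hi) {
        size_t mid = lo + (hi - lo) / 2;
        void* p = malloc(mid);
        if (p != NULL) {
            free(p);   /* release at once so later probes start clean */
            lo = mid;
        } else {
            hi = mid;
        }
    }
    printf("largest single malloc: %lu bytes\n", (unsigned long)lo);
    return 0;
}

Running the same probe under both runtimes would give the two thresholds directly, without editing and recompiling for each guess.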
Different allocation schemes have different strengths and weaknesses.
Some are fast, some fragment less, some have lower overhead, some allow
larger blocks. Often these goals conflict, so there are always
tradeoffs. For example, to improve speed an allocator may serve
requests from fixed-size buckets, which can restrict the maximum size
of a single allocation (see the sketch below).
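To make that concrete, here is a toy size-class allocator in C. It is a minimal sketch, not any real allocator's design, and the bucket sizes are hypothetical, chosen only to show how bucketing caps the largest request served from the allocator's own pools:

#include <stdio.h>
#include <stdlib.h>

/* Toy bucketed (size-class) allocator: every request is rounded up
   to one of a fixed set of bucket sizes.  Anything larger than the
   biggest bucket cannot be served at all (the cap described above).
   The bucket sizes here are hypothetical. */
static const size_t buckets[] = { 16, 64, 256, 1024, 4096 };
enum { NBUCKETS = sizeof(buckets) / sizeof(buckets[0]) };

void* bucket_alloc(size_t size) {
    int i;
    for (i = 0; i < NBUCKETS; ++i)
        if (size <= buckets[i])
            return malloc(buckets[i]);  /* round up to the bucket size */
    return NULL;  /* request exceeds the largest bucket */
}

int main(void) {
    void* small = bucket_alloc(100);  /* rounded up to 256 bytes */
    void* big = bucket_alloc(8192);   /* refused: no bucket is big enough */
    printf("small: %p, big: %p\n", small, big);
    free(small);
    return 0;
}

Real allocators typically fall back to the OS for requests larger than any bucket, but that fallback has its own bookkeeping and alignment overhead, which is one plausible source of the gap bearophile measured between the C and D runtimes.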
I wonder how nedmalloc or Hoard would perform with your tests?
-Joel