Are there any reasons not to use "mmap" to read files?

Ali Çehreli acehreli at yahoo.com
Sun Feb 6 16:45:59 UTC 2022


On 2/6/22 04:21, rempas wrote:
 > On Sunday, 6 February 2022 at 10:53:49 UTC, Temtaime wrote:
 >> Personally, I almost always use mmap for opening large files for r/w.
 >> It IS faster.

Ditto.

 > how big files are we talking about?

So big that they can't fit in memory. For example, I benefit from mmap 
on a 16G system where a file would be 30G.

As others said, it depends on the use case. If the entire file will be 
read anyway, especially in sequential order, then mmap may not have much 
benefit. In my use case, though, it is common to read small amounts of 
bytes from places in the huge file that are not known in advance. (Say, 
5G total out of 30G.)

Instead of my making multiple reads to those interesting parts of the 
file, mmap handles everything transparently: Just mmap the whole thing 
as a single array and access parts of that memory as needed.
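
To make that concrete, here is a minimal sketch in Python (not D, and not 
my actual code) of mapping a whole file once and slicing out scattered 
ranges. The names read_slices and make_demo_file are made up for the 
example, and the 1M temporary file only stands in for the huge one.

```python
import mmap
import os
import tempfile


def read_slices(path, ranges):
    """Return the bytes at each (offset, length) pair using one read-only mapping."""
    with open(path, "rb") as f:
        # Length 0 maps the whole file; the OS pages data in lazily on access.
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            # Slicing the mapping copies just the requested bytes out of it.
            return [mm[offset:offset + length] for offset, length in ranges]


def make_demo_file(size=1 << 20):
    """Create a small stand-in for the huge file (hypothetical demo data)."""
    fd, path = tempfile.mkstemp()
    with os.fdopen(fd, "wb") as f:
        f.write(bytes(i % 251 for i in range(size)))
    return path


path = make_demo_file()
try:
    chunks = read_slices(path, [(0, 4), (500_000, 8), (1_000_000, 16)])
    print([len(c) for c in chunks])
finally:
    os.remove(path)
```

The caller never issues a seek or a read; only the pages backing the 
requested ranges need to be brought into memory.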

One huge improvement is to add the madvise(2) system call to the picture 
to tell the system the exact amount of memory that will be touched so 
the OS reads it in a single shot. Otherwise, the system reads by a 
default amount, which I think is 4K, which can turn out to be 
pathetically slow, e.g. when the file is accessed over a slow network. 
(Why read 4K when the need is just 200 bytes, and why read in 4K steps 
when the need is already known to be 1M?)
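
A rough sketch of the madvise idea, again in Python rather than D. 
Assumptions: Python 3.8+ on a system that provides madvise, and 
MADV_WILLNEED as the advice value (I did not say above which advice to 
use, so that choice is only illustrative). advise_and_read and demo are 
made-up names. Because madvise(2) requires a page-aligned start address, 
the offset is rounded down to a page boundary first.

```python
import mmap
import os
import tempfile


def advise_and_read(mm, offset, length):
    """Hint the kernel about [offset, offset + length), then return those bytes."""
    # madvise is not available everywhere (e.g. Windows); skip the hint there.
    if hasattr(mm, "madvise") and hasattr(mmap, "MADV_WILLNEED"):
        # Round the start down to a page boundary and widen the range to match.
        start = offset - (offset % mmap.PAGESIZE)
        mm.madvise(mmap.MADV_WILLNEED, start, (offset - start) + length)
    return mm[offset:offset + length]


def demo():
    """Map a three-page temporary file and read 200 bytes from the middle."""
    fd, path = tempfile.mkstemp()
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(b"x" * (3 * mmap.PAGESIZE))
        with open(path, "rb") as f:
            with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
                return advise_and_read(mm, mmap.PAGESIZE + 100, 200)
    finally:
        os.remove(path)


print(len(demo()))
```

The hint is advisory only: the kernel is free to ignore it, and the read 
returns the same bytes either way.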

 > Also like another guy
 > told me in another (C) forum, "mmap" is for Unix systems so do you know
 > if Windows or MacOS can emulate that behavior with their memory
 > allocation system calls?

I haven't used mmap on Windows but it's in Phobos, so it should work. 
After all, mmap uses the virtual memory system of the OS, non-ancient 
Windows versions do use virtual memory, and std.mmfile does include 
'version (Windows)' sections; so, yes. :)

Ali



More information about the Digitalmars-d mailing list