Is there any reason not to use "mmap" to read files?

Steven Schveighoffer schveiguy at gmail.com
Sun Feb 13 19:35:45 UTC 2022


On 2/13/22 6:02 AM, Patrick Schluter wrote:
> On Sunday, 13 February 2022 at 03:13:43 UTC, H. S. Teoh wrote:
>> On Sat, Feb 12, 2022 at 07:01:09PM -0800, Ali Çehreli via 
>> Digitalmars-d wrote:
>>> On 2/12/22 05:17, rempas wrote:
>>>
>>> > a system call every single time
>>>
>>> I have a related experience: I realized that the very many ftell() 
>>> calls I was making were very costly. I saved a lot of time once I 
>>> realized that I did not need to make the calls, because I could 
>>> maintain a 'long' variable to keep track of where I was in the file.
>>>
>>> I assumed ftell() would do the same but apparently not.
>> [...]
>>
>> I think the reason is that ftell() involves an OS API call: fread() 
>> uses the underlying read() syscall, which reads from wherever it 
>> left off last, and there could be multiple threads reading from the 
>> same file descriptor, so the only way for fseek/ftell to work 
>> correctly is via a syscall into the kernel.  Obviously, this would 
>> be expensive, as it involves a kernel context switch as well as 
>> acquiring and releasing a lock on the file descriptor.
>>
> fread reads from its internal buffer when it can. By default it uses 1 
> page (4096 bytes on x86 and ARM). After a seek operation it will always 
> try to fill the buffer with 4096 bytes (of course the read syscall might 
> return less). As long as the reads are within the buffer fread() will 
> not invoke a read syscall.

If you seek within the buffer, it could in principle leave the buffer 
alone. But it chooses to flush the buffer completely, and I'm not sure 
why. It isn't to keep the buffer filled, either: it rereads the full 
buffer at that point, meaning all the previously buffered data was 
discarded.

This could be really slow if you were skipping a few bytes at a time 
using fseek, as it would reload the entire buffer on every seek.

-Steve

