Announcing Elembuf

Cyroxin 34924561+Cyroxin at users.noreply.github.com
Tue Dec 18 01:13:32 UTC 2018


On Monday, 17 December 2018 at 22:31:22 UTC, H. S. Teoh wrote:
> On Mon, Dec 17, 2018 at 09:16:16PM +0000, Cyroxin via 
> Digitalmars-d-announce wrote:
>> Elembuf is a library that allows writing efficient parsers and 
>> readers. It looks as if it were just a regular T[], making it 
>> work well with libraries and easy to use with slicing. To 
>> avoid copying, the buffer can only be at maximum one page long.
> [...]
>
> What advantage does this have over using std.mmfile to mmap() 
> the input file into the process' address space, and just using 
> it as an actual T[] -- which the OS itself will manage the 
> paging for, with basically no extraneous copying except for 
> what is strictly necessary to transfer it to/from disk, and 
> with no arbitrary restrictions?
>
> (Or, if you don't like the fact that std.mmfile uses a class, 
> calling mmap() / the Windows equivalent directly, and taking a 
> slice of the result?)
>
>
> T

Hello,

I would assume there is much value in having a mapping that can 
be reused, instead of having to remap files into memory whenever 
the source changes. While I cannot comment on the relative 
efficiency of a mapped file and a circular buffer without 
benchmarks, this may be of use: 
https://en.wikipedia.org/wiki/Memory-mapped_file#Drawbacks

An interesting fact I found out is that std.mmfile keeps a 
reference to the file handle, instead of relying on the system 
closing the handle after unmap. There also seem to be quite a 
lot of globals, which is odd, as Elembuf only has one.

In std.mmfile, opSlice returns a void[] instead of a T[], which 
makes it difficult to work with because it requires a cast. 
There would also be a need to do costly conversions should 
"T.sizeof != void.sizeof" be true.
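
As a rough illustration of the cast step described above, here is a 
minimal Python sketch. It is only an analogue and not D code: an 
anonymous mmap stands in for the untyped void[] slice, and 
memoryview.cast stands in for the cast to T[]. ELEMENT_FORMAT is a 
hypothetical element type chosen for the example. In this Python 
analogue the cast produces a view over the same memory; the sketch 
makes no claim about the cost of the equivalent step in D.

```python
import mmap
import struct

# Hypothetical element type for illustration: a native unsigned int.
ELEMENT_FORMAT = "I"
ELEMENT_SIZE = struct.calcsize(ELEMENT_FORMAT)

# An anonymous mapping stands in for the untyped slice of a mapped file.
buf = mmap.mmap(-1, mmap.PAGESIZE)

# Write one element through the raw byte interface.
struct.pack_into(ELEMENT_FORMAT, buf, 0, 42)

# Reinterpret the bytes as typed elements; this is the explicit cast step.
raw = memoryview(buf)
typed = raw.cast(ELEMENT_FORMAT)

element_count = len(typed)   # length is now counted in elements, not bytes
first_element = typed[0]     # reads the value written above

# Writes through the typed view land in the same underlying memory.
typed[1] = 7
second_raw = struct.unpack_from(ELEMENT_FORMAT, buf, ELEMENT_SIZE)[0]

# Views must be released before the mapping can be closed.
typed.release()
raw.release()
buf.close()
```

The point of the sketch is only that the caller has to name the 
element type and perform the reinterpretation explicitly, which a 
buffer that already presents itself as a T[] avoids.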

However, purely from a code perspective, Elembuf attempts to have 
minimal runtime arguments and variables, with heavy reliance on 
compile-time arguments. It also uses a newer system call for 
Linux (glibc) that is currently not in druntime. The reason for 
using this system call is that it allows for faster buffer 
construction. Read more about it here: 
https://dvdhrm.wordpress.com/2014/06/10/memfd_create2/

