randomIO, std.file, core.stdc.stdio
Charles Hixson via Digitalmars-d-learn
digitalmars-d-learn at puremagic.com
Tue Jul 26 10:57:52 PDT 2016
On 07/26/2016 10:18 AM, Steven Schveighoffer via Digitalmars-d-learn wrote:
> On 7/26/16 12:58 PM, Charles Hixson via Digitalmars-d-learn wrote:
>
>> Ranges aren't free, are they? If so then I should probably use stdfile,
>> because that is probably less likely to change than core.stdc.stdio.
>
> Do you mean slices?
>
>> When I see "f.rawRead(&item[0 .. 1])" it looks to me as if unneeded code
>> is being generated explicitly to be thrown away. (I don't like using
>> pointer/length either, but it's actually easier to understand than this
>> kind of thing, and this LOOKS like it's generating extra code.)
>
> This is probably a misunderstanding on your part.
>
> &item is accessing the item as a pointer. Since the compiler already
> has it as a reference, this is a noop -- just an expression to change
> the type.
>
> [0 .. 1] is constructing a slice out of a pointer. It's done all
> inline by the compiler (there is no special _d_constructSlice
> function), so that is very very quick. There is no bounds checking,
> because pointers do not have bounds checks.
>
> So there is pretty much zero overhead for this. Just push the pointer
> and length onto the stack (or registers, not sure of ABI), and call
> rawRead.
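
(A minimal sketch of that idiom, for reference; the Item struct and the
readOne wrapper are invented for the example.)

import std.stdio;

struct Item
{
    int id;
    double value;
}

void readOne(File f, ref Item item)
{
    // &item is just the address of the existing variable; no copy is made.
    // [0 .. 1] views that pointer as a one-element slice, so rawRead
    // fills the struct in place: no allocation, no bounds check.
    f.rawRead((&item)[0 .. 1]);
}
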
>
>> That said, perhaps I should use stdio anyway. When doing I/O it's the
>> disk speed that's the really slow part, and that so dominates things
>> that worrying about trivialities is foolish. And since it's going to be
>> wrapped anyway, the ugly will be confined to a very small routine.
>
> Having written a very templated io library
> (https://github.com/schveiguy/iopipe), I can tell you that in my
> experience, the slowdown comes from 2 things: 1) spending time calling
> the kernel, and 2) not being able to inline.
>
> This of course assumes that proper buffering is done. Buffering should
> mitigate most of the slowdown from the disk. It is expensive, but you
> amortize the expense by buffering.
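
(As a rough sketch of what that buffering buys with std.stdio, assuming a
placeholder file path and a trivial per-byte loop:)

import std.stdio;

void sumFile(string path)  // path is a placeholder
{
    auto f = File(path, "rb");
    ulong total;
    // Each chunk is filled by one buffered read of up to 64 KiB, so the
    // per-byte work in the inner loop never touches the kernel.
    foreach (ubyte[] chunk; f.byChunk(64 * 1024))
        foreach (b; chunk)
            total += b;
    writeln(total);
}
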
>
> C's i/o is pretty much as good as it gets for an opaque non-inlinable
> system, as long as your requirements are simple enough. The std.stdio
> code should basically inline into the calls you make, and it handles a
> bunch of stuff that optimizes those calls (such as locking the file
> handle once around a complex operation).
>
> -Steve
Thanks. Since there isn't any excess overhead I guess I'll use stdio.
Buffering, however, isn't going to help at all since I'm doing
random I/O. I know that most of the data the system reads from disk is
going to end up getting thrown away, since my records will generally be
smaller than 8K, but there's no help for that.
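
For reference, the wrapped routine could look something like this (the
Record layout and the fixed-size-record file format are assumptions for
the sketch):

import std.stdio;

struct Record
{
    long key;
    double[8] payload;
}

// Read record number `index` from a file of fixed-size records.
Record readRecord(File f, size_t index)
{
    Record r;
    f.seek(index * Record.sizeof);   // random access: jump straight to the record
    f.rawRead((&r)[0 .. 1]);         // fill it in place, no intermediate buffer
    return r;
}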