[OT] Re: How to read fastly files ( I/O operation)
Jay Norwood
jayn at prismnet.com
Wed Dec 18 14:17:40 PST 2013
On Friday, 8 February 2013 at 06:22:18 UTC, Denis Shelomovskij
wrote:
> On 06.02.2013 19:40, bioinfornatics wrote:
>> On Wednesday, 6 February 2013 at 13:20:58 UTC, bioinfornatics
>> wrote:
>> I agree the spec format is really bad, but it is heavily used
>> in biology, so I would like a fast parser so that I can develop
>> D applications instead of using C++.
>
> Yes, let's also create 1 GiB XML files and ask for fast
> encoding/decoding!
>
> The situation can be improved only if:
> 1. We will find and kill every text format creator;
> 2. We will create a really good binary format for each such
> task and support it in every application we create. So after
> some time text formats will just die because of evolution as
> everything will support better formats.
>
> (the second proposal is a real recommendation)
There is a binary resource format for EMF models, which normally
use XML files, and some timing improvements are reported at the
link below. It might be worth a look if you are thinking about
writing your own binary format.
http://www.slideshare.net/kenn.hussey/performance-and-extensibility-with-emf
There is also a fast binary compression library named Blosc,
used in some Python utilities. The measurements presented at the
link below show it can be faster than a plain memcpy when
multiple cores are available.
http://blosc.pytables.org/trac
On the sequential accesses ... I found that Windows writes blocks
of data all over the place, but the best way to get it to write
to more contiguous locations is to modify the file output
routines to specify write-through. Even so, the more sequential
layout didn't improve read times on an SSD.
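As a rough illustration of write-through output (not the exact routines changed in the experiment above), a POSIX-flavoured Python sketch might look like the following; on Windows the analogous knob is the FILE_FLAG_WRITE_THROUGH flag passed to CreateFile. The function name and file path here are made up for illustration.

```python
import os

def write_through(path, data):
    # O_SYNC asks the OS to complete each write to the device before
    # the call returns, which discourages scattered, lazily flushed
    # blocks. (On Windows, pass FILE_FLAG_WRITE_THROUGH to CreateFile
    # for a similar effect.)
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC | os.O_SYNC)
    try:
        return os.write(fd, data)  # returns the number of bytes written
    finally:
        os.close(fd)
```

The trade-off is that each write call now waits for the device, so throughput per call drops even if the resulting on-disk layout is more contiguous.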
Most decent SSDs can read big files at 300 MB/sec or more now,
and you can RAID 0 a few of them and read 800 MB/sec.
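For anyone wanting to check their own drive against numbers like these, a minimal Python sketch that times large sequential chunked reads could look like this (the chunk size and helper name are arbitrary choices, not from the thread):

```python
import time

def sequential_read(path, chunk_size=1 << 20):
    # Read the file front to back in 1 MiB chunks, returning the
    # total bytes read and the observed throughput in MB/s.
    total = 0
    start = time.perf_counter()
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            total += len(chunk)
    elapsed = time.perf_counter() - start
    return total, total / elapsed / 1e6
```

Note that the OS page cache makes a second run of this much faster than the first, so drop caches (or use a file larger than RAM) before trusting the number.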
More information about the Digitalmars-d-learn mailing list