A bit of binary I/O
Chris Nicholson-Sauls
ibisbasenji at gmail.com
Sat Jan 20 16:42:42 PST 2007
Heinz wrote:
> Jarrett Billingsley Wrote:
>
>> "Heinz" <billgates at microsoft.com> wrote in message
>> news:eou69k$8tf$1 at digitaldaemon.com...
>>
>>> The first way is to write primitives manually one by one:
>>>
>>> // primitive way
>>> ulong i = 9;
>>> char[] s = "hello world";
>>> myFile.writeExact(&i, i.sizeof);
>>> myFile.writeExact(&s, s.sizeof);
>>>
>>> Reading data:
>>> // Is done by reading each primitive.
>>> ulong i2; char[] s2;
>>> myFile.readExact(&i2, i2.sizeof);
>>> myFile.readExact(&s2, s2.sizeof);
>> You're writing the string wrong. All you're doing is writing the length and
>> pointer of the array data, without actually writing the data.
>>
>> The Stream class (and by extension, the File class) provides functions for
>> writing out every basic type:
>>
>> ulong i = 9;
>> char[] s = "hello world";
>> myFile.write(i);
>> myFile.write(s);
>>
>> ...
>> ulong i2;
>> char[] s2;
>> myFile.read(i2);
>> myFile.read(s2);
>>
>>> The second way is to write a structure with all the primitives as members:
>>>
>>> // struct way
>>> struct t
>>> {
>>>     ulong i;
>>>     char[] s;
>>> }
>>>
>>> t mt;
>>> mt.i = 9;
>>> mt.s = "hello world";
>>> myFile.writeExact(&mt, mt.sizeof);
>>>
>>> Reading data:
>>> // We read the entire struct.
>>> t mt2;
>>> myFile.readExact(&mt2, mt2.sizeof);
>> Again, you're just writing out the array reference without writing its
>> contents. You have to write out each member individually. If there were no
>> reference types in the struct, this would work fine.
>>
>>> And the third way is to write a class with all the primitives as members:
>>>
>>> // class way
>>> class tt
>>> {
>>>     ulong i;
>>>     char[] s;
>>> }
>>>
>>> tt mtt = new tt();
>>> mtt.i = 9;
>>> mtt.s = "hello world";
>>> myFile.writeExact(&mtt, mtt.sizeof);
>>>
>>> Reading data:
>>> // We read the entire class.
>>> tt mtt2;
>>> myFile.readExact(&mtt2, mtt2.sizeof);
>>>
>> This is incorrect, and is only working because of how you've written your
>> program. You're not writing the data out at all, you're writing a class
>> reference. The 00913FC0 is just the memory address of the class instance
>> that mtt points to, and when you read that address back in, you're just
>> looking at the data in memory. This program wouldn't work if you wrote the
>> file, exited, and then had another program read the data. You'd end up
>> with a memory access violation, and none of the data in the class is
>> actually written out.
>>
>> If you want to write a class out to a file, a common way is to have some
>> kind of generic "serialize" and "unserialize" functions for the class:
>>
>> class C
>> {
>>     ulong i;
>>     char[] s;
>>
>>     void serialize(Stream s)
>>     {
>>         s.write(i);
>>         // The parameter s shadows the member s, so name the member explicitly.
>>         s.write(this.s);
>>     }
>>
>>     static C unserialize(Stream s)
>>     {
>>         C c = new C();
>>         s.read(c.i);
>>         s.read(c.s);
>>         return c;
>>     }
>> }
>>
>> ...
>> C c = new C();
>> c.i = 5;
>> c.s = "foo";
>> c.serialize(myFile);
>>
>> ...
>>
>> C c = C.unserialize(myFile);
>>
>>> All of these methods work perfectly. I'm able to retrieve values from all
>>> of them. Now let's look at the outputs:
>>>
>>> // Primitive
>>>
>>> 09 00 00 00 00 00 00 00 0B 00 00 00 A0 C7 41 00
>>>
>>> // Structure
>>>
>>> 09 00 00 00 00 00 00 00 0B 00 00 00 A0 C7 41 00
>>>
>>> // Class
>>>
>>> C0 3F 91 00
>>>
>>> My questions are:
>>>
>>> 1) What's the best method to write data (in terms of data
>>> protection/encryption against reversion). The class way seems to me at
>>> first look the most secure way.
>> As explained before, the class method is wrong, and there is no encryption
>> going on here. It's just a memory address, and you should never, ever write
>> memory addresses to a file.
>>
>> That being said, the best way is probably to just use the primitive .read
>> and .write methods of File. Just .. never, ever write pointers or
>> references of any kind to a file.
>>
>>> 2) Which method is faster at retrieving data?
>> If you implement them correctly, all three sample programs should make the
>> exact same output file using the same number of writes (and read it in the
>> same number of reads), and so they are all the same in terms of performance.
>>
>>
>
> Wow, that covers all, thanks for your reply.
>
> But can I still write an entire structure with writeExact()? Or do you suggest writing each member of the structure with write()?
>
> Another question: does writing a char[] with write() write the string as ASCII? If so, it is a legible string; how can I protect that data?
>
> Thanks man
Well, technically it will write it as UTF-8, which is as near to ASCII as makes no
never mind. If you don't want it readable (and this is a binary file anyway), you could
just use some simple reversible encryption algorithm. Something like this, as a silly example:
<code>
module silly;

import tango .io .Stdout ;

struct SillyCrypt {
    alias process opCall ;

    static const CHUNK_SIZE = 32_U ;
    static const ROT        = 16_U ;
    static const XOR        = 24_U ;

    // Applying process twice returns the original text.
    static char[] process (char[] src) {
        char[] result ;
        foreach (ch; chunks(src)) {
            result ~= mutate(ch);
        }
        return result;
    }

    // Split the input into CHUNK_SIZE pieces; the last one may be shorter.
    private static char[][] chunks (char[] x) {
        char[]   source = x ;
        char[][] result     ;
        while (source.length >= CHUNK_SIZE) {
            result ~= source[0 .. CHUNK_SIZE] ;
            source  = source[CHUNK_SIZE .. $ ] ;
        }
        if (source.length) {
            result ~= source;
        }
        return result;
    }

    private static char[] mutate (char[] x) {
        char[] result ;
        // Rotate only full chunks: swapping the two halves of a CHUNK_SIZE
        // chunk undoes itself, whereas rotating a shorter trailing chunk
        // by ROT would not be reversed by a second pass.
        if (x.length == CHUNK_SIZE) {
            result = x[ROT .. $] ~ x[0 .. ROT];
        }
        else {
            result = x.dup;
        }
        foreach (inout c; result) {
            c ^= XOR;
        }
        return result;
    }
}

const SOURCE = "I would say hello to you, but you couldn't read it even if I did."c ;

void main () {
    auto enc = SillyCrypt(SOURCE) ;
    auto dec = SillyCrypt(enc   ) ;
    Stdout
        ("Source -> "c)(SOURCE).newline()
        ("Encrypt -> "c)(enc   ).newline()
        ("Decrypt -> "c)(dec   ).newline()
        .flush
    ;
}
</code>
The output when I tried it was this:
Source -> I would say hello to you, but you couldn't read it even if I did.
Encrypt -> w8lw8awm48zml8awQ8owmt|8kya8p}ttql8}n}v8q~8Q8|q|m8{wmt|v?l8j}y|86
Decrypt -> I would say hello to you, but you couldn't read it even if I did.
I know I don't personally know anyone who can read
"w8lw8awm48zml8awQ8owmt|8kya8p}ttql8}n}v8q~8Q8|q|m8{wmt|v?l8j}y|86" at all. :)
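A quick way to check that a transform like this really is its own inverse is to run it twice over inputs of every length up to a few chunks. What follows is only a sketch under a few assumptions: it is written in present-day D rather than the D1/Tango used above, the names sillyCrypt, chunkSize, rot and key are made up for illustration, and it rotates only full 32-byte chunks, because a half-chunk rotation undoes itself only when the chunk is exactly twice the rotation distance.

```d
import std.stdio : writeln;

enum size_t chunkSize = 32; // bytes per chunk, same value as in the post
enum size_t rot = 16;       // rotation distance: exactly half a chunk
enum ubyte key = 24;        // XOR key, same value as in the post

/// Hypothetical helper (not from the post): XORs every byte with `key`
/// and swaps the two halves of each full chunk.
ubyte[] sillyCrypt(const(ubyte)[] src)
{
    ubyte[] result;
    while (src.length > 0)
    {
        immutable n = src.length < chunkSize ? src.length : chunkSize;
        ubyte[] piece = src[0 .. n].dup;
        if (n == chunkSize)
            piece = piece[rot .. $] ~ piece[0 .. rot];
        foreach (ref b; piece)
            b ^= key;
        result ~= piece;
        src = src[n .. $];
    }
    return result;
}

void main()
{
    // Try every length from 0 to three chunks so that empty, short,
    // partial and full trailing chunks are all covered.
    foreach (len; 0 .. 3 * chunkSize + 1)
    {
        auto plain = new ubyte[](len);
        foreach (i, ref b; plain)
            b = cast(ubyte) ('A' + i % 26);
        assert(sillyCrypt(sillyCrypt(plain)) == plain);
    }
    writeln("round trip holds for every tested length");
}
```

If the rotation were applied to any chunk longer than 16 bytes instead, the assertion would be expected to fail for trailing chunks of 17 to 31 bytes, since rotating such a chunk by 16 twice does not bring it back to its starting position.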
-- Chris Nicholson-Sauls
More information about the Digitalmars-d-learn mailing list