Streaming library

Rainer Deyke rainerd at eldwood.com
Thu Oct 14 21:57:32 PDT 2010


On 10/14/2010 22:24, Andrei Alexandrescu wrote:
> On 10/14/10 21:22 CDT, Rainer Deyke wrote:
>> Character data must be encoded into bytes before it is written and
>> decoded before it is read.  The low-level OS functions only deal with
>> bytes, not characters.
> 
> I'm not so sure about that. For example, some code in std.stdio is
> dedicated to supporting fwide():
> 
> http://www.opengroup.org/onlinepubs/000095399/functions/fwide.html

I don't think that's a low-level OS function.  But it is true that I
may have overstated my case.  Still, the underlying file system and the
underlying hardware deal in bytes, not chars, on all platforms that matter.

Encoded text /is/ bytes.


> So the $1M question is, do we support text transports or not?

All text is encoded, and encoded text is logically bytes, not chars.
This distinction is somewhat confused in D because the native string
types in D do specify an encoding.  However, it would be a mistake to
conflate the internal encoding with the external encoding used by text
transports.
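
To illustrate the difference between the internal and the external
encoding, here is a minimal sketch.  The helper names asUtf8Bytes and
asUtf16Bytes are made up for this example, and UTF-16 in native byte
order merely stands in for whatever encoding a given transport might
require.

```d
import std.stdio : writeln;
import std.string : representation;
import std.utf : toUTF16;

/// Bytes of the string exactly as D stores it internally (UTF-8).
/// Hypothetical helper, not part of Phobos.
immutable(ubyte)[] asUtf8Bytes(string text)
{
    return text.representation;
}

/// Bytes of the same text transcoded to UTF-16 in native byte order,
/// standing in for an external encoding chosen by a transport.
/// Hypothetical helper, not part of Phobos.
const(ubyte)[] asUtf16Bytes(string text)
{
    wstring wide = toUTF16(text);
    return cast(const(ubyte)[]) wide; // reinterpret code units as raw bytes
}

void main()
{
    string text = "na\u00EFve"; // five characters, one of them non-ASCII

    writeln(asUtf8Bytes(text).length);  // 6: U+00EF takes two bytes in UTF-8
    writeln(asUtf16Bytes(text).length); // 10: five 16-bit code units
}
```

The same five characters become two different byte sequences, and the
transport only ever sees one of those byte sequences.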

It's also worth noting that some of these text transports are not 8-bit
clean.  This means that they cannot transport UTF-8 (without
transcoding), which means that they cannot transport D strings.
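
As a rough sketch of what "not 8-bit clean" means in practice, the
check below tests whether a byte sequence would survive a transport
that only preserves the low 7 bits of each byte.  The function name
isSevenBitSafe is an assumption made for this example.

```d
import std.algorithm.searching : all;
import std.stdio : writeln;
import std.string : representation;

/// True if every byte fits in 7 bits, i.e. the data would pass
/// unchanged through a transport that is not 8-bit clean.
/// Hypothetical helper, not part of Phobos.
bool isSevenBitSafe(const(ubyte)[] data)
{
    return data.all!(b => b < 0x80);
}

void main()
{
    writeln(isSevenBitSafe("plain ASCII".representation)); // true
    writeln(isSevenBitSafe("na\u00EFve".representation));  // false
}
```

Any D string containing a non-ASCII character fails this check, because
UTF-8 encodes such characters using bytes with the high bit set.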

> - email protocol and probably other Internet protocols

All internet protocols ultimately work over IP, and IP is a binary protocol.

> If we don't support text at the transport level, things can still be made
> to work but in a more fragile manner: upper-level protocols will need to
> _know_ that although the API accepts any ubyte[], in fact the results
> would be weird and malfunctioning if the wrong things are being passed.

The situation for text would be no different from the situation for any
other structured binary format.

> A text-based transport would clarify at the type level that a text
> stream accepts only UTF-encoded characters.

You can still have that, as a wrapper around the byte stream.
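
A minimal sketch of such a wrapper, assuming a byte-level sink with a
single put method.  ByteSink and Utf8TextSink are hypothetical types
invented for this example, not an existing or proposed Phobos API.

```d
import std.stdio : writeln;
import std.string : representation;
import std.utf : validate;

/// Hypothetical byte-level sink; stands in for a file or socket.
struct ByteSink
{
    ubyte[] data;

    void put(const(ubyte)[] bytes)
    {
        data ~= bytes;
    }
}

/// Hypothetical text layer: accepts only D strings, checks that they
/// are well-formed UTF-8, and forwards the encoded bytes to the sink.
struct Utf8TextSink
{
    ByteSink* sink;

    void put(string text)
    {
        validate(text); // throws UTFException on malformed UTF-8
        sink.put(text.representation);
    }
}

void main()
{
    ByteSink bytes;
    auto text = Utf8TextSink(&bytes);

    text.put("h\u00E9llo");
    writeln(bytes.data.length); // 6: five characters, one encoded as two bytes
}
```

The type-level guarantee lives entirely in the wrapper; the underlying
stream stays a plain byte stream.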


-- 
Rainer Deyke - rainerd at eldwood.com


More information about the Digitalmars-d mailing list