(Phobos - SocketStream) Am I doing something wrong or is this
Zane
zane.sims at gmail.com
Thu Nov 5 04:53:53 PST 2009
Travis Boucher Wrote:
> There are a few issues with your implementation.
>
> First, parse the headers properly. Below see my trivial implementation.
> You want to parse them properly so you can find the correct
> end-of-headers, and check the size of the content from the headers.
>
> readLine() looks to be designed for a text based protocol. The biggest
> issue is with the end-of-line detection. "\r", "\n" and "\r\n" are all
> valid end-of-line combinations and it doesn't seem to do the detection
> in a greedy manner. This leaves us with a trailing '\n' at the end of
> the headers.
>
> The implementation of readBlock() doesn't seem to really wait to fill
> the buffer. It fills the buffer, if it can. This is pretty standard
> for a read on a socket. So wrap it in a loop and read chunks. You want
> to do it this way anyway for many reasons. The implementation below
> double-buffers, which does result in an extra copy. Although logically
> this seems like a pointless copy, in a real application it is very
> useful for many reasons.
>
> Below is a working version (but still has its own issues).
>
> #!/usr/bin/gdmd -run
>
> import std.stream;
> import std.stdio;
> import std.socket;
> import std.socketstream;
>
> import std.string; // for header parsing
> import std.conv; // for toInt
>
> import std.c.time;
>
> int main()
> {
> char[] line;
> ubyte[] data;
> uint num = 0;
>
> TcpSocket socket = new TcpSocket(new InternetAddress("www.google.com",
> 80));
>
> socket.send("GET /logos/bigbird-hp.gif HTTP/1.0\r\n\r\n");
>
> SocketStream socketStream = new SocketStream(socket);
>
> string[] response; // Holds the lines in the response
> while(!socketStream.eof)
> {
> line = socketStream.readLine();
>
> if (line=="")
> break;
>
> // Append this line to array of response lines
> response ~= line;
> }
>
> // Due to how readLine() works, we might end up with a
> // trailing \n, so get rid of it if we do.
> ubyte ncr;
> socketStream.read(ncr);
> if (ncr != '\n')
> data ~= ncr;
>
>
> // D's builtin associative arrays (safe & easy hashtables!)
> string[char[]] headers;
>
> // Parse the HTTP response. NOTE: This is a REALLY bad HTTP
> // parser. a real parser would handle header parsing properly.
> // See RFC2616 for proper rules.
> foreach (v; response) {
> // There is likely a better way to do this than
> // a join(split()). Rejoin with the same ": " separator
> // so values that contain it are restored unchanged.
> string[] kv_pair = split(v, ": ");
> headers[tolower(kv_pair[0])] = join(kv_pair[1 .. $], ": ");
> }
>
> foreach (k, v; headers)
> writefln("[%s] [%s]", k, v);
>
> uint size;
> if (isNumeric(headers["content-length"])) {
> size = toInt(headers["content-length"]);
> } else {
> writefln("Unable to parse content length of '%s' to a number.",
> headers["content-length"]);
> return 0;
> }
> // This fully buffers the data. If you are fetching large files you
> // should process them in chunks rather than in one big buffer. Also,
> // this does not handle chunked encoding, see RFC2616 for details.
> while (data.length < size && !socketStream.eof) {
> ubyte[4096] buffer;
> num = socketStream.readBlock(buffer.ptr, 4096); // read 4k at a time
> writefln("Read %s bytes [%s/%s] (%s%%)",
> num, data.length, size, (cast(float)data.length/cast(float)size)*100);
>
> // Process the buffer, in this case just copy it to the data
> // buffer. This double buffering process may seem bad, but
> // has the advantage of allowing you to thread around data,
> // process the buffer in chunks, etc.
> data ~= buffer[0..num];
> }
>
> socketStream.close;
> socket.close;
>
> // It might be worthwhile to chunk this as well in some cases.
> File file = new File("logo.gif", FileMode.Out);
> file.write(data);
> file.close;
>
> return 0;
> }
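
The quoted code calls its own header parsing a "REALLY bad HTTP parser". As a minimal sketch of one small improvement, the function below splits each line at the first colon only, so values that contain colons (a Date header, for example) stay intact. It is written against current D (D2) Phobos rather than the D1 modules used above, which is an assumption; `parseHeaderLine` and the sample lines are hypothetical, and this is still far from a full RFC 2616 parser (no folded headers, no duplicate header handling).

```d
import std.string : indexOf, strip, toLower;

/// Hypothetical helper: split "Name: value" at the first colon.
/// Returns false when the line has no colon (e.g. the status line).
bool parseHeaderLine(string line, out string name, out string value)
{
    auto colon = line.indexOf(':');
    if (colon < 0)
        return false;

    name = line[0 .. colon].strip.toLower;  // header names are case-insensitive
    value = line[colon + 1 .. $].strip;     // keep everything after the first colon
    return true;
}

void main()
{
    import std.stdio : writefln;

    // Made-up response lines, for illustration only.
    string[] response = [
        "HTTP/1.0 200 OK",
        "Content-Type: image/gif",
        "Content-Length: 8558",
        "Date: Thu, 05 Nov 2009 12:00:00 GMT",
    ];

    string[string] headers;
    foreach (line; response)
    {
        string name, value;
        if (parseHeaderLine(line, name, value))
            headers[name] = value;
    }

    foreach (k, v; headers)
        writefln("[%s] [%s]", k, v);
}
```

Checking the return value also gives a natural place to skip the status line instead of storing it as a header with an empty value.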
Hey Travis,
I cannot begin to thank you enough for taking the time to explain this with such an exhaustive example! That is one reason I like the D community: I can ask ridiculously simple questions, and people still humor me with answers and take the time to care. I was able to put together a working app based on your example. The only part of your example I don't understand, and that piques my curiosity, is the buffer. Why the 4096 size? What I don't understand is how to know which buffer size I should use and, if 4096 is the best size, why. I hope that makes sense.
Thanks again,
Zane
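
The point in the quoted message that a socket read "fills the buffer, if it can" can be tried out without a network connection. The sketch below is an illustration under stated assumptions, not the code from the thread: it targets current D (D2), and `ReadFn`, `readExactly` and `shortReads` are hypothetical names. `shortReads` stands in for a socket that never returns more than 3 bytes per call, and `readExactly` loops until the requested byte count has arrived or the stream ends.

```d
import std.algorithm : min;

/// Hypothetical reader type: writes up to buf.length bytes into buf and
/// returns how many were written (0 means end of stream).
alias ReadFn = size_t delegate(ubyte[] buf);

/// Keep calling `read` until `want` bytes have arrived or the stream ends.
ubyte[] readExactly(ReadFn read, size_t want, size_t chunkSize = 4096)
{
    ubyte[] data;
    auto buffer = new ubyte[chunkSize];

    while (data.length < want)
    {
        // Never ask for more than is still missing.
        size_t n = read(buffer[0 .. min(chunkSize, want - data.length)]);
        if (n == 0)
            break; // stream ended before `want` bytes arrived
        data ~= buffer[0 .. n]; // copy only the bytes actually read
    }
    return data;
}

/// Stand-in for a socket: hands back at most 3 bytes per call.
ReadFn shortReads(const(ubyte)[] source)
{
    enum size_t maxPerCall = 3;
    size_t pos = 0;
    return (ubyte[] buf) {
        size_t n = min(buf.length, source.length - pos, maxPerCall);
        buf[0 .. n] = source[pos .. pos + n];
        pos += n;
        return n;
    };
}

void main()
{
    import std.stdio : writeln;

    ubyte[] payload = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10];
    auto got = readExactly(shortReads(payload), payload.length);
    writeln(got.length); // 10, even though each read returned at most 3 bytes
}
```

The result is the same for any `chunkSize`: the loop guarantees correctness, and the chunk size only changes how many read calls are needed. The default of 4096 here simply mirrors the quoted code.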