New web newsreader - requesting participation

Mon Jan 31 07:18:44 PST 2011

Nice newsreader! Fast and does what it needs to do, and written in D. I 
like that. :)

I'm currently writing an NNTP web frontend (reading and posting) for my
university. However, it's written in PHP, so it's not really a fit for a
new D homepage. But I'm curious how you do web programming with D. Do
you use CGI? How do you do all the HTTP stuff (parsing form data, etc.)
and templating?

But back to the NNTP reader:

# HTML formatting

The work you put into formatting messages as HTML is impressive. The
autodetection of source code could really come in handy. I found
[Markdown][1] to work relatively well with typical mails, so its syntax
might contain a few good ideas for e.g. quotes, links, lists, etc.

[1]: http://daringfireball.net/projects/markdown/syntax

# Topic list

Right now you display the newest few messages on the newsgroup. Building 
a topic list gets quite a bit more complex. To get a proper topic list 
with pagination etc. I query the overview information of all (!) 
messages in a newsgroup with the "over" command (the digitalmars.com 
server supports the older "xover" which is the same). This contains the 
message ID and the References header, which can be used to build a
message tree. All messages on the root level of the tree are topics, and
it's easy to get the number of replies and the latest reply. It's a bit
tricky sometimes, but all other algorithms I came up with tended to lose
some messages (e.g. if the topic post is deleted) or were even slower.
The overview also contains the subject and from header and some other 
useful stuff. I suppose the current newsreader does something similar 
without caching and this might be the reason why it is so slow.
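
As a rough illustration of the tree building described above, here is a
minimal Python sketch. It assumes the overview lines have already been
parsed into (message ID, References, subject, date) tuples. The names
`Node`, `build_topics` and `topic_stats` are made up for this example and
are not taken from either newsreader. It also assumes well-formed
References headers (no cycles).

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    message_id: str
    subject: str = ""
    date: int = 0  # assumed: seconds since the epoch, parsed from the overview
    children: list = field(default_factory=list)


def build_topics(overview):
    """Build a message tree and return its root nodes (the topics).

    overview: list of (message_id, references, subject, date) tuples, where
    references is the raw, space-separated References header ("" if absent).
    """
    nodes = {mid: Node(mid, subject, date) for mid, _, subject, date in overview}
    roots = []
    for mid, refs, _, _ in overview:
        node = nodes[mid]
        # The last entry in References is the direct parent. If that message
        # is gone, fall back to the nearest ancestor that is still known.
        parent = next(
            (nodes[r] for r in reversed(refs.split()) if r in nodes and r != mid),
            None,
        )
        if parent is None:
            # No known ancestor: treat the message as a topic of its own, so
            # replies to a deleted topic post are not lost.
            roots.append(node)
        else:
            parent.children.append(node)
    return roots


def topic_stats(topic):
    """Return (number of replies, date of the latest message) for a topic."""
    count, latest = 0, topic.date
    stack = list(topic.children)
    while stack:
        node = stack.pop()
        count += 1
        latest = max(latest, node.date)
        stack.extend(node.children)
    return count, latest
```

Sorting the returned roots by the latest date from `topic_stats` and
slicing the result would give one page of the topic list.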

This message tree and the overview information however can be cached 
very easily. The tree can also be extended on the fly, e.g. check for 
new messages with the newnews command and add them to the tree. This 
might require some locking but at least in PHP flock() was sufficient 
for that.
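
The locking idea can be sketched in Python with `fcntl.flock`, which wraps
the same system call as PHP's flock() on POSIX systems. This is only an
illustration under assumptions: the JSON cache layout and the name
`update_cache` are hypothetical, and fetching the new entries from the
server (e.g. via newnews) is left out.

```python
import fcntl
import json
import os


def update_cache(path, new_entries):
    """Merge new overview entries into a JSON cache file under a lock.

    new_entries: dict mapping message ID to its overview data (assumed to
    have been fetched from the news server beforehand).
    """
    # Open for reading and writing, creating the file if it does not exist
    # yet, without truncating data another process may have written.
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o644)
    with os.fdopen(fd, "r+") as f:
        fcntl.flock(f, fcntl.LOCK_EX)  # blocks until other holders release
        try:
            raw = f.read()
            cache = json.loads(raw) if raw else {}
            for message_id, entry in new_entries.items():
                # Keep entries that are already cached, add unknown ones.
                cache.setdefault(message_id, entry)
            f.seek(0)
            f.truncate()
            json.dump(cache, f)
            f.flush()
        finally:
            fcntl.flock(f, fcntl.LOCK_UN)
    return cache
```

Holding the exclusive lock across the whole read-modify-write cycle keeps
two concurrent requests from overwriting each other's additions.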

# Cache invalidation

The problem with the message tree cache or cached messages in general is 
the invalidation. Looks like the digitalmars news server does not delete
that many messages, so this might not be much of a problem. How do you
handle this right now?

# D website

I took a look at your current version of the D website 
(http://arsdnet.net/d-web-site/). I really like the layout. Looks good 
to get started with D. Just two small things:

- The compile and run button is a bit of a security risk. I was able to 
read the /etc/passwd file for example. Maybe it's possible to lock down 
the compiled binaries with SELinux. Denial of service attacks (e.g. 
endless loops) might still be a problem though. We built an "online D
compiler" for a presentation at our university but didn't publish it
because of these concerns.
- If you only display mails in the announcements which do not have a 
"References" header you will only get mails that started a new topic. 
This will filter out replies.
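
The References check from the second point could look roughly like this
in Python, using the standard `email` module. The function name
`is_topic_start` is made up for this sketch, and it assumes the raw
message text (headers plus body) is already available.

```python
from email import message_from_string


def is_topic_start(raw_message):
    """Return True if the message has no (or an empty) References header."""
    msg = message_from_string(raw_message)
    return not (msg.get("References") or "").strip()
```

Only messages for which this returns True would be shown in the
announcements list.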

If you want some help I could do some stuff. I'm a bit short on time
right now, but since I'm building an NNTP reader in PHP anyway I might be
able to help out with your D NNTP reader. I can also help with HTML and
CSS stuff if you want: support for older browsers and older IE versions
(if there is much traffic from these browsers) or some minor design
stuff (I'm not that much of a designer though). I might also start to
look into SELinux…

Happy programming
Stephan Soller

On 31.01.2011 04:08, Adam Ruppe wrote:
> In the other newsgroup, I've been talking about a little
> web news program I've been writing as a spinoff of the
> potential new homepage idea.
>
> It's to the point where it is usable, but still kinda buggy:
>
> http://arsdnet.net/d-web-site/nntp/thread-index?newsgroup=digitalmars.D
>
> Source code: http://arsdnet.net/d-web-site/nntp.d
>
> NOTE: it does /not/ automatically check for new posts. I have
> to manually trigger that right now (I don't want it annoying
> the news server automatically while still in the testing phase.)
>
> It will lazily load a message on demand though if you know
> its message ID:
> http://arsdnet.net/d-web-site/nntp/get-message
>
> Get it from the Message-ID header in the post.
>
>
>
> Anyway, here's the features:
>
> a) It isn't god awful slow. The PHP web news currently on digital
> mars, as best as I can tell, actually polls the news server every
> time you go to its index! This does aggressive local caching.
>
> b) It actually lets you select text...
>
> OK, if I list every annoyance with the current web news, I'll
> never stop. Moving on to new things:
>
> c) It tries to convert news posts to HTML, so the paragraphs
> wrap to the browser, links work, quotes are put into the proper
> tags for indentation, and it tries to auto-detect D code and
> put it in a <pre> block - which my javascript can make inline
> editable and runnable. Example:
>
> http://arsdnet.net/d-web-site/nntp/get-message?newsgroup=digitalmars.D&messageId=%3Cmailman.1085.1296409409.4748.digitalmars-d%40puremagic.com%3E
>
> With script disabled, you'll see the code in a different colored
> block. With script enabled, you'll see an Edit button there
> too.
>
> d) It tries to convert HTML emails back to plain text. (Ironically,
> so it can turn it back to html...) This gives uniformity across
> the various mime types. Similarly, if the type is
> multipart/alternative, it will only show the text version.
>
> e) It also makes an attempt to preserve deliberate whitespace,
> for things like ASCII art or purposefully short lines. If it
> can't make heads or tails of it, it bails out and shows the
> original message in a <pre> block for human consumption.
>
> f) Tries to be fast and lean.
>
> g) Written in D!
>
> h) Already read messages are tracked by your browser - if the link
> is visited, it puts up a different color url.
>
> Coming as I find time:
>
> a) References to bugzilla entries should be automatically
> converted to links.
>
> b) Viewing threads by date or by threaded view.
>
> c) Posting with the option of automatic quoting.
>
> d) Syntax highlighting of D code in posts.
>
> e) Maybe, maybe links to documentation of functions referenced,
>     if I can find a good way to get them automatically. Integration
>     with my dpldocs.info site is the way I'd do it.
>
> f) Any more ideas? I'm reluctant to add too much, but if I like
>     an idea - or if you want to write the code :) - I'll be open
>     to adding it.
>
>
> Known bugs:
>
> Lots of content types aren't handled right and it ignores
> character encoding.
>
> It doesn't always recognize code. This would be ok, but if it
> sees one line as code but doesn't include one of them, it would
> confuse the reader. Example:
>
> http://arsdnet.net/d-web-site/nntp/get-message?newsgroup=digitalmars.D&messageId=%3Cii4lbj%242bes%241%40digitalmars.com%3E
>
> (Look for "auto str =")
>
> The reason for this is it detects code lines by looking for
> semicolons and open braces. It will call something a generic
> <pre> if there's a lot of whitespace in it - figuring it is
> probably ascii art (if it thinks the whitespace has human
> significance, it tries to preserve it), but it still isn't
> a perfect detection function.
>
> I'm open to ideas. We want to detect code, but not flag
> regular English text.
>
>
>
> I'm also open to graphical styling ideas. I put up a dark
> theme here because the white was hurting my eyes, but I change
> on if I like light or dark almost at random. (Depends on the room's
> lighting conditions I think). But I didn't do any more graphic
> setup other than the max-width.
>
> Multiple color schemes is an idea I like.
>
>
>
> BTW, as a fun fact, this post is about 1/4th the size of the
> entire nntp.d code file!