Should std.conv:parse parse html entities?

Jonathan M Davis newsgroup.d at jmdavisprog.com
Wed Nov 13 14:52:46 UTC 2019


On Wednesday, November 13, 2019 7:41:45 AM MST Jonathan M Davis via 
Digitalmars-d wrote:
> On Wednesday, November 13, 2019 5:17:17 AM MST berni44 via Digitalmars-d
> wrote:
> > Concerning issue 9621 [1]: There are two things that parse
> > doesn't currently parse, namely octal numbers and HTML entities.
> > While there is no argument against the former (I actually wrote a
> > PR to add them), there has been some discussion around the latter,
> > because the whole table of those entities (about 3000) would end
> > up in the code, even if not needed at all.
> >
> > As I don't think I should try to decide this on my own, I'd like
> > to know your opinion: What is better: add the entities, or write
> > in the docs that they are not supported? What do you think?
> >
> > [1] https://issues.dlang.org/show_bug.cgi?id=9621
>
> I fail to see why std.conv.to or std.conv.parse should handle either octal
> literals or HTML entities, and I don't know why anyone would expect them
> to. HTML entities are the kind of thing that I would expect an HTML
> parser to handle, not the standard library. The compiler does handle some
> of them (which honestly, I think is kind of weird), which is the only
> argument I can see for supporting them in std.conv, but it's not like
> std.conv is designed to be parsing D code. Also, IIRC, octal literals
> were removed from the language. So, that's not an argument for adding
> them to std.conv. They're also not all that commonly needed by anything
> AFAIK. parse can already parse integer values of arbitrary bases if you
> give it an explicit base / radix.
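
To illustrate the radix point, here is a minimal sketch. parseOctal is a
hypothetical helper made up for this example, not something in Phobos; it
only forwards to the radix overloads of std.conv.parse and std.conv.to.

```d
import std.conv : parse, to;

// Hypothetical helper (not part of Phobos): read an octal number from a
// string at run time by passing an explicit radix of 8 to parse.
int parseOctal(string s)
{
    // parse takes its input by ref and consumes the characters it reads.
    return parse!int(s, 8);
}

void main()
{
    assert(parseOctal("755") == 493);  // 7*64 + 5*8 + 5
    assert(to!int("ff", 16) == 255);   // to accepts a radix as well

    // parse stops at the first character that is not a digit in the
    // given base and leaves the rest of the input in place.
    string rest = "17 apples";
    assert(parse!int(rest, 8) == 15);
    assert(rest == " apples");
}
```

So octal input can already be handled without parse having to recognize
any special octal prefix.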

Actually, it looks like you can still have octal literals in strings even
though support for octal integer literals was removed. Either way, given
that the compiler is going to translate a string literal with an octal or
HTML entity into what it represents rather than have it be something to
parse, unless someone is constructing strings that use these rather than
using string literals, there won't even be anything to parse. Personally, I
don't see much reason to support either. What's the use case?
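
A small sketch of what I mean, assuming a compiler that accepts octal escapes
and named character entities in string literals. The names decodedOctal,
decodedEntity and rawEntity are made up for this example.

```d
// Escapes in ordinary string literals are decoded by the compiler.
enum string decodedOctal = "\101";     // octal escape: 101 in octal is 65, 'A'
enum string decodedEntity = "\&amp;";  // named character entity for '&'

// A WYSIWYG (raw) literal, like a string built at run time, keeps the text.
enum string rawEntity = r"\&amp;";

void main()
{
    static assert(decodedOctal == "A");
    static assert(decodedEntity == "&");

    // Only here is there still an entity left that something could parse.
    static assert(rawEntity.length == 6);
}
```

Only the raw case leaves anything for parse to do, and it only comes up if
someone constructs such strings instead of writing literals.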

- Jonathan M Davis




