std.fileformats?

Tue Jan 7 22:52:37 UTC 2020

On Tuesday, January 7, 2020 5:44:40 AM MST berni44 via Digitalmars-d wrote:
> On Tuesday, 7 January 2020 at 01:10:08 UTC, Jonathan M Davis wrote:
> > Now, personally, I don't think that anything regarding file
> > formats should have been in the standard library in the first
> > place.
>
> Thinking about this whole thing, I noticed that there are two
> different points of view, which should be kept separate: the
> idealist view and the pragmatic view. IMHO both are important.
>
> So if I understand you correctly, from an idealist's view, you'd
> say these file formats should be removed from Phobos, but from a
> pragmatist's view this looks much more difficult.
>
> I think I share this point of view. But I'd like to get rid of
> them anyway.

std.xml is probably the only one from your list that I'd argue should be
seriously considered for being removed from Phobos and moved into undead
sooner rather than later. I don't know quite what state std.json is in, so
maybe it should have the same done to it, though it's had some work done on
it recently to try to improve it. I see no reason to rip out stuff like
std.csv or std.zip at this point though. They work and are useful. They also
aren't fundamentally broken in the way that std.xml is AFAIK.

> > For the most part, I don't see any point in removing any of
> > these modules, since that would break existing code,
>
> Well, every module that is kept inside Phobos produces (lots of)
> maintenance work. From my perspective, we are missing resources
> here. So I prefer a controlled breaking of code (with
> deprecation and all) instead of having the code rusting and
> breaking uncontrolled sooner or later.
>
> This question came up when I looked at my own comment on
> issue 17709 [1]: I found the reason for this issue and I think I
> could fix it in a reasonable amount of time. But is it worth
> doing so if that module might be removed or replaced in the near
> future? Wouldn't it be much better to use that time to fix a bug
> in a more important place?
>
> But on the other hand: What does such a comment look like to
> someone who is using std.xml and found that issue because they
> stumbled over the same problem? Wouldn't it be better to remove
> std.xml completely in the first place?
>
> [1] https://issues.dlang.org/show_bug.cgi?id=17709

std.xml is broken. We've agreed for years now that it should go. It's just
that there has been no agreement on removing it without having a
replacement, which is why it's still there. I don't think that it's worth
your time to work on it. However, out of the list of modules that you
provided, std.json is the only other one where I recall any real discussion
about replacing or removing it. Certainly, spending time fixing a bug in
something like std.base64 or std.zip is not a waste of time.

> > BTW, base64 isn't really a file format. It's an encoding.
>
> Really? So why isn't it in std.encoding? :-) I know that base64
> is somewhat different, maybe it's in the gray area... Or look at
> it the other way round: Isn't zipping also just encoding and
> unzipping decoding?

Encoding involves taking information and converting it into another format
which contains the same information in a different manner. File formats may
use encodings for some of the information that they contain, but an encoding
is only concerned with how a given piece of information is represented. For
example, Unicode code points are encoded with UTF-8, UTF-16, or UTF-32. All
three encodings contain exactly
the same information, but the way that that information is encoded differs.
And while UTF-8 may be used inside a file, it is in no way tied to files.
It's just a way that Unicode character information is encoded.
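
To make that concrete, here is a minimal sketch. It is written in Python
rather than D purely to keep it short, and the sample string and variable
names are arbitrary choices for illustration. The same code points produce
different byte sequences under each UTF encoding, yet every one of them
decodes back to the identical text:

```python
# The same sequence of Unicode code points, encoded three ways.
text = "na\u00efve \u2192 D"  # arbitrary sample with non-ASCII code points

utf8 = text.encode("utf-8")
utf16 = text.encode("utf-16-le")  # explicit byte order, so no BOM is added
utf32 = text.encode("utf-32-le")

# The byte representations differ in content and in length...
print(len(utf8), len(utf16), len(utf32))

# ...but each one decodes to exactly the same code points.
assert utf8.decode("utf-8") == text
assert utf16.decode("utf-16-le") == text
assert utf32.decode("utf-32-le") == text
```

Nothing in the sketch touches a file, which is the point: the encoding is a
property of the data, not of where it is stored.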

base64 is a binary encoding, whereas std.encoding deals with character
encodings. zip is a file / container format. It uses different compression
algorithms to encode binary information internally, but zip itself is a
container format, not an encoding. It does far more than encode a string of
data in a different way like an encoding does.
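
The difference can be sketched as follows. Again this uses Python's standard
library rather than std.base64 and std.zip, simply for brevity, and the
payload and member names are made up for the example. base64 maps bytes to
text and back and nothing more, whereas a zip archive holds named members
along with metadata about each of them:

```python
import base64
import io
import zipfile

payload = bytes(range(8))  # arbitrary binary data

# base64: a pure encoding. Bytes in, ASCII text out, nothing else attached.
encoded = base64.b64encode(payload)
assert base64.b64decode(encoded) == payload

# zip: a container. It stores named members plus per-member metadata
# (name, sizes, checksum, compression method); compression is only one
# part of what the format does.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w", compression=zipfile.ZIP_DEFLATED) as archive:
    archive.writestr("a.bin", payload)
    archive.writestr("notes/readme.txt", "hello")

with zipfile.ZipFile(buffer) as archive:
    names = archive.namelist()       # the archive knows its members by name
    info = archive.getinfo("a.bin")  # and keeps metadata for each of them
    restored = archive.read("a.bin")

assert restored == payload
```

The base64 half has no notion of names, members, or metadata at all; the zip
half cannot be described without them.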

std.encoding is also a bit of an oddball. It's an older module that probably
needs to be revamped / redesigned. It has some level of support for various
character encodings - including UTF-8, UTF-16, and UTF-32 - but we have
std.utf for UTF handling, and std.utf is what gets used by Phobos for
handling UTF encodings. std.encoding is class-based and has some range
support, but it isn't really range-based aside from improvements that have
been made to it over time. It does get some occasional tweaks, but largely,
it's an older module with an older design that doesn't necessarily fit all
that well into the rest of Phobos. I don't know what should be done with it
though. Some of what it does is stuff that we really should have, but it
probably needs to be redesigned. However, somebody would have to step up to
do that.

- Jonathan M Davis