std.path review: second update

Jonathan M Davis jmdavisProg at gmx.com
Mon Aug 1 13:50:21 PDT 2011


> On 01.08.2011 20:06, Lars T. Kyllingstad wrote:
> > On Mon, 01 Aug 2011 17:24:14 +0000, Jonathan M Davis wrote:
> >>> Here's a new update based on your comments and requests. Code and docs
> >>> are in the usual place,
> >>> [...]
> >> 
> >> Well, my previous comments on function names still stand - both in
> >> terms of names using Ext instead of Extension and Sep instead of sep -
> > 
> > Since the feedback has been minimal since my last update, I take it
> > people are more or less happy with the module's functionality now.
> > (Unless they're just tired of the whole discussion, of course. ;))
> > 
> > That means it is time to let the bikeshedding begin. Since we have been
> > unable to reach a consensus in previous discussions, and there are now
> > only three days left of the review, I would really like to get a good
> > sample of the general opinion here.
> > 
> > Folks, please state your preferences in terms of function names. I'll
> > try to put personal bias aside and compose a naming scheme that is both
> > internally consistent and consistent with the majority opinion.
> > 
> >> and also with regards to making non-property functions verbs (e.g.
> >> absolutePath and relativePath).
> > 
> > I'd be happy to change it, but I'm at loss for good alternatives. I seem
> > to remember you suggesting makeAbsolute and makeRelative, but not being
> > 100% happy with them yourself. Any other suggestions?
> 
> I tend to like the current naming scheme. I tried taking the list of
> functions and turning them into verbs, resulting in names like
> toAbsolute, ensureExtension, getBaseName, etc. It makes it a bit easier
> to tell what's a function and what's not. But given just a little bit
> of context, that's usually pretty clear. Then those "to" and "get"
> prefixes start to feel like more like noise.
> 
> Leaving out prefixes in some cases leaves more room for making the names
> descriptive. I guess that's another way of seeing it, so I thought I'd
> mention it. It makes it a bit harder to get an overview of the module
> by reading the symbol list at the top of the doc page, though.
> 
> >> Aside from that, wasn't it discussed that extensions should include the
> >> . in them in order to cover the case where a file ends in a .?
> >> Otherwise
> >> 
> >> assert(extension("file.") == extension("file"));
> >> 
> >> So, all of the stuff dealing with extensions should probably include
> >> the dot in the extension. It looks like both the current std.path and
> >> your std.path handle it by using "" for dot and null for nothing, but
> >> distinguishing between null and the empty string like that isn't
> >> necessarily a good idea, and your documentation doesn't make the
> >> distinction clear, unlike the documentation for the current std.path.
> >> 
> >> So, at minimum, the docs should be clearer about null vs empty when it
> >> comes to the extension (and there should be tests for it), but it would
> >> arguably be better to just include the . in the extension.
> > 
> > I'm slowly coming around to the idea of including the dot. Unless I hear
> > any loud protests I will probably do so.
> 
> I mentioned this in the other thread, but I'm hard pressed to find an
> actual use case for it. The only thing I can think of is that with some
> Windows applications specifying a file name that ends with a dot will
> suppress the addition of a default extension. Notepad is an example, it
> adds .txt by default. I'm not sure if that's really relevant, though.
> 
> On Windows, I believe it's generally true that a file name that ends
> with a dot is not considered to have an extension. You can't normally
> create a file name ending with a dot, IIRC, fopen() will just ignore the
> dot and create the file without it. In light of that, I think the
> current behavior of extension() makes perfect sense. I the rare case
> that a file name ending with a dot exists on the file system, it could
> easily fool naive user code that would likely never have seen such a
> file name in testing:
> 
> ---
> if (extension(fname).length == 0) {
> writeln("Missing extension, adding default!);
> setExtension(fname, "xyz");
> }
> ---
> 
> Returning null for the special case of a name ending with a dot seems
> reasonable, I'm fine with that. A slightly obscure solution for a
> slightly obscure problem.
> 
> I think setExtension could be smarter, same goes for defaultExtension.
> This is what I think they should behave like:
> 
> assert(setExtension("foo", "bar") == "foo.bar");
> assert(setExtension("foo", ".bar") == "foo.bar");
> assert(setExtension("foo.", "bar") == "foo.bar");
> assert(setExtension("foo.", ".bar") == "foo.bar");
> 
> This is what .NET does. It's treating extensions not just like
> substrings, but recognizing that there is syntax and conventions
> associated with them. Which is what you want 90% of the time.
> 
> Just my pragmatic, Windows-sentric view of things :)

The problem with that is that then you _can't_ have something like "foo..bar". 
Granted, that's not a normal thing to do, but it would be problematic if it 
ever happened.

The primary danger I see with using null for no extension and empty for "." is

assert(extension("file.") == extension("file"));

That could cause problems for any program that actually runs into a file which 
ends with a ".". Now, if Linux, Windows, and Mac OS X all prevent files ending 
in a dot (which I seriously doubt), then it isn't really an issue. Having the 
difference between null and empty does make it possible to distinguish if you 
start doing stuff like

auto ext1 = extension("file.");
auto ext2 = extension("file");
assert(ext1 !is null && ext2 !is null && ext1 == ext2);

but it's ugly and it means that your average program is going to deal with it 
incorrectly. True, it's unlikely to happen, but if it _does_ happen, the 
program will be buggy.

It might be reasonable to have all of the functions in std.path which return 
an extension include the dot and have them accept extensions with or without 
the dot (in which case they just add a dot if there wasn't one - regardless of 
whether the file's baseName ended with a dot or not). But that inconsistency 
might be problematic - though considering that the alternative is pretty much 
to assert or throw if given an extension which doesn't start with a dot, they 
might pretty much have to have the behavior of adding the dot if it isn't 
there.

Ideally, this would always be true

assert(setExtension(stripExtension(fileName), fileName.extension) == 
fileName);

In your suggestion, that would not be the case for a file such as "foo..bar", 
and I'm not quite sure what it does with Lars' current implementation for a 
file such as "file.". I'd have to check. But unless there's a really good 
reason why forcing

assert(setExtension(stripExtension(fileName), fileName.extension) == 
fileName);

to always be true wouldn't work, then I definitely think that it should 
_always_ be true.

- Jonathan M Davis


More information about the Digitalmars-d mailing list