std.path review: second update

Marco Leise Marco.Leise at gmx.de
Tue Aug 2 01:19:54 PDT 2011


On 02.08.2011 at 08:02, Jonathan M Davis <jmdavisProg at gmx.com> wrote:

> "file." and "file" do _not_ have the same extension. One has an empty  
> extension whereas the other has none.

Still, I would expect a get-extension function to return the empty string
for both. Why is that so? As Wikipedia states, the interpretation depends
on the filesystem (or maybe on the originating OS, although you can use
ext3 on Windows and NTFS on Linux nowadays).

But others seem to have problems with trailing dots as well:

Trailing dots disappear in Samba:
http://lists.samba.org/archive/rsync/2002-September/003636.html

On Windows files ending in a dot cannot be deleted:
http://cygwin.com/ml/cygwin/2004-01/msg00848.html
http://blog.dotsmart.net/2008/06/12/solved-cannot-read-from-the-source-file-or-disk/

Mozilla on Linux cannot open files ending in a dot:
https://bugzilla.mozilla.org/show_bug.cgi?id=149586

The file extension is whatever follows the last dot.
On Windows it cannot be empty, so 'foo.' is an inaccessible file.
Yet 'foo..bar' is perfectly fine, and that is what causes us trouble now:
'foo.' is 'foo..bar' stripped of its extension, but 'foo.' itself,
while valid on Posix, is an ambiguous name on Windows.

Camp A thinks:
- a name has no extension unless the last dot is followed by one
- changing the extension must result in 'foo..ext'
- getExtension should never return null, but be either '' or include the  
dot as in '.ext'
- disassembling and reassembling a filename by string concatenation should  
return the original filename in all cases

Camp B thinks:
- no dot = no extension, otherwise what follows the dot is the extension
- changing the extension must result in 'foo.ext'
- getExtension returns null if no dot is found, an empty string if the  
file ends in a dot or otherwise what is following the dot
- disassembling and reassembling a filename isn't a portable process
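
The two positions can be made concrete with a minimal sketch. This is only
an illustration in Python, not std.path code: the function names
(get_extension_a, set_extension_a, get_extension_b, set_extension_b) are
hypothetical, None stands in for null, the input is assumed to be a bare
filename without directory separators, and names with a leading dot are not
treated specially.

```python
from typing import Optional


def get_extension_a(name: str) -> str:
    """Camp A: '' unless the last dot is followed by something; keeps the dot."""
    dot = name.rfind(".")
    if dot == -1 or dot == len(name) - 1:
        return ""
    return name[dot:]


def set_extension_a(name: str, ext: str) -> str:
    """Camp A: drop the old extension (if any), then concatenate ext ('.ext')."""
    old = get_extension_a(name)
    return name[: len(name) - len(old)] + ext


def get_extension_b(name: str) -> Optional[str]:
    """Camp B: None without a dot, otherwise whatever follows the last dot."""
    _root, dot, tail = name.rpartition(".")
    return tail if dot else None


def set_extension_b(name: str, ext: str) -> str:
    """Camp B: cut at the last dot (if any), then append '.' plus ext ('ext')."""
    root, dot, _tail = name.rpartition(".")
    return (root if dot else name) + "." + ext


# Print both camps' answers side by side for the names discussed here.
for name in ("foo", "foo.", "foo.bar", "foo..bar"):
    print(repr(name), repr(get_extension_a(name)), repr(get_extension_b(name)))
```

Under these assumptions, camp A's pair always reassembles to the original
name by plain concatenation, while camp B's pair loses the trailing dot of
'foo.' once the extension is changed.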

I started in camp A, but now I'm really caught in the middle. Both camps'
arguments make equal sense.
Funnily enough, even Sun avoided file extension methods in their Java File
class, so I checked what Python does:
os.path.splitext("foo.bar")[1] -> '.bar'
os.path.splitext("foo.")[1] -> '.'
os.path.splitext("foo")[1] -> ''
Although Python has no routine to change the extension, the obvious approach
(take the root returned by splitext and append the new extension) would
result in changeExt('foo.', '.bar') == 'foo.bar'.
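
A minimal sketch of that obvious approach, for illustration only: the helper
name change_ext is hypothetical (Python ships no such function in os.path),
and it is built on nothing but os.path.splitext, which returns a
(root, ext) pair.

```python
import os.path


def change_ext(path: str, new_ext: str) -> str:
    """Replace whatever splitext considers the extension with new_ext.

    new_ext is expected to include its leading dot, e.g. '.bar'.
    """
    root, _old_ext = os.path.splitext(path)
    return root + new_ext


# A lone trailing dot counts as the extension, so it is replaced, not kept.
print(change_ext("foo.", ".bar"))
print(change_ext("foo.txt", ".bar"))
print(change_ext("foo", ".bar"))
```

Because splitext treats the trailing dot of 'foo.' as the extension '.', the
dot is swallowed when the new extension is appended.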

This is what Jonathan prefers, and I agree with this solution now that I
have made up my mind. It's just inconvenient that by this convention you
cannot change the extension of 'Keep my dot.' so that the result is 'Keep
my dot..ext'.

