std.path review: second update
Marco Leise
Marco.Leise at gmx.de
Tue Aug 2 01:19:54 PDT 2011
On 02.08.2011 at 08:02, Jonathan M Davis <jmdavisProg at gmx.com> wrote:
> "file." and "file" do _not_ have the same extension. One has an empty
> extension whereas the other has none.
Still, I would expect a get-extension function to return the empty string
for both. Why is that so? As Wikipedia states, the interpretation depends
on the filesystem (or maybe on the originating OS, but you can use ext3 on
Windows and NTFS on Linux nowadays).
But others seem to have problems as well:
Trailing dots disappear in Samba:
http://lists.samba.org/archive/rsync/2002-September/003636.html
On Windows files ending in a dot cannot be deleted:
http://cygwin.com/ml/cygwin/2004-01/msg00848.html
http://blog.dotsmart.net/2008/06/12/solved-cannot-read-from-the-source-file-or-disk/
Mozilla on Linux cannot open files ending in a dot:
https://bugzilla.mozilla.org/show_bug.cgi?id=149586
The file extension is whatever follows the last dot.
On Windows it cannot be empty, thus 'foo.' will be an inaccessible file.
Yet 'foo..bar' is perfectly fine, which is causing us trouble now, since
'foo.' is 'foo..bar' stripped of its extension, but 'foo.' itself,
while valid on Posix, is an ambiguous name on Windows.
Camp A thinks:
- a name has no extension as long as the dot isn't followed by one, so
'foo.' has none
- changing the extension of 'foo.' must result in 'foo..ext'
- getExtension should never return null, but be either '' or include the
dot as in '.ext'
- disassembling and reassembling a filename by string concatenation should
return the original filename in all cases
Camp B thinks:
- no dot = no extension, otherwise what follows the dot is the extension
- changing the extension of 'foo.' must result in 'foo.ext'
- getExtension returns null if no dot is found, an empty string if the
file ends in a dot or otherwise what is following the dot
- disassembling and reassembling a filename isn't a portable process
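To make the two positions concrete, here is a rough Python sketch of how I read them. The function names are made up for illustration, it only handles plain file names without directory parts, and it ignores special cases like leading-dot files, so take it as a sketch and not as a proposal for std.path:

```python
def get_ext_a(name):
    """Camp A: return '' or the extension including its dot.
    A trailing dot counts as part of the base name."""
    dot = name.rfind(".")
    if dot == -1 or dot == len(name) - 1:
        return ""
    return name[dot:]


def set_ext_a(name, ext):
    """Camp A: strip the old extension (if any), then append '.' + ext."""
    base = name[:len(name) - len(get_ext_a(name))]
    return base + "." + ext


def get_ext_b(name):
    """Camp B: None if there is no dot, otherwise whatever follows
    the last dot (possibly the empty string)."""
    dot = name.rfind(".")
    if dot == -1:
        return None
    return name[dot + 1:]


def set_ext_b(name, ext):
    """Camp B: cut at the last dot (if any), then append '.' + ext."""
    dot = name.rfind(".")
    base = name if dot == -1 else name[:dot]
    return base + "." + ext
```

With these definitions set_ext_a("foo.", "ext") gives 'foo..ext' while set_ext_b("foo.", "ext") gives 'foo.ext', and only camp A guarantees that base + get_ext_a(name) reassembles the original name.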
I started at camp A, but now I'm really caught in the middle. Both camps'
arguments make equal sense.
Funnily enough, even Sun avoided file extension methods in their Java File
class, so I checked Python for that matter:
os.path.splitext("foo.bar") -> ('foo', '.bar')
os.path.splitext("foo.") -> ('foo', '.')
os.path.splitext("foo") -> ('foo', '')
Although there is no routine to change the extension, the obvious approach
would result in changeExt('foo.', '.bar') == 'foo.bar'.
This is what Jonathan prefers and I agree with this solution now that I
have made up my mind. It's just inconvenient that by this convention you
cannot change the extension of 'Keep my dot.' so that the result is 'Keep
my dot..ext'.
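The "obvious approach" I have in mind for Python would look roughly like this. change_ext is a hypothetical helper of mine, not something in os.path, and I assume the new extension is passed with its leading dot:

```python
import os.path


def change_ext(path, new_ext):
    """Replace whatever os.path.splitext considers the extension.
    new_ext is expected to include its leading dot, e.g. '.bar'."""
    root, _old_ext = os.path.splitext(path)
    return root + new_ext
```

Since splitext treats the trailing dot of 'foo.' as the extension, change_ext('foo.', '.bar') yields 'foo.bar', and change_ext('Keep my dot.', '.ext') yields 'Keep my dot.ext', which is exactly the inconvenience mentioned above.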