Path as an object in std.path

Michel Fortin michel.fortin at michelf.ca
Thu Jun 6 14:47:21 PDT 2013


On 2013-06-06 20:25:58 +0000, Walter Bright <newshound2 at digitalmars.com> said:

> On 6/6/2013 1:02 PM, Michel Fortin wrote:
>> Have you never opened a local file in a Windows web browser and taken a look at
>> the URL? The drive letter is there.
>> 
>>      file:///c:/path/to/the%20file.txt
>> 
>> The drive letter is simply the first part of the path on Windows.
> 
> I didn't know that, but that doesn't make it a canonical path. It just 
> combines the notion of url with a path.

It's not a canonical path, but it's a platform-neutral representation 
of a path. You can perform the same operations with a URL (including 
regular expressions) irrespective of the underlying OS.
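
To illustrate what I mean, here is a rough sketch in Python (my own 
illustration, nothing to do with std.path; the helper name 
url_path_segments is made up for the example). The same code splits a 
Windows-style and a POSIX-style file URL, whatever OS it runs on:

```python
from urllib.parse import urlsplit, unquote

def url_path_segments(file_url):
    """Split a file: URL into decoded path segments, the same way on any OS."""
    parts = urlsplit(file_url)
    if parts.scheme != "file":
        raise ValueError("not a file URL: " + file_url)
    # Skip the empty string before the leading slash, then decode %XX escapes.
    return [unquote(seg) for seg in parts.path.split("/") if seg]

# The drive letter is just the first segment of the Windows-style URL.
windows_style = url_path_segments("file:///c:/path/to/the%20file.txt")
posix_style = url_path_segments("file:///path/to/the%20file.txt")

print(windows_style)  # ['c:', 'path', 'to', 'the file.txt']
print(posix_style)    # ['path', 'to', 'the file.txt']
```

Note that this only handles the representation. It says nothing about 
whether two such URLs name the same file.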

I was replying initially to your claim that there was no portable way 
to represent a path. I don't think the definition of a "portable path" 
needs to include any notion of canonical, because not even non-portable 
paths can be canonical these days.


>> Actually, it doesn't depend on Linux or Windows or OS X. It depends on the
>> filesystem used, be it FAT16, FAT32, NTFS, ext{1,2,3}, HFS+, Case-sensitive
>> HFS+, etc. If you assume a specific case sensitivity setting by looking at the
>> OS, that's a bug. You can mount NTFS and FAT on Linux or OS X, and Apple has
>> Case-sensitive HFS+ for OS X and it's the default on iOS. Then there's the whole
>> issue about which locale to use for Unicode case-insensitive comparisons. I'd
>> bet that different filesystems choose different approaches to this tricky
>> problem.
>> 
>> So there's no way to normalize for case-sensitivity just by looking at a path or
>> a URL, even if you know which OS you're on. If you want to know for sure
>> whether two paths are the same, or what is the normalized path, you need to ask
>> the filesystem at some point. Anything else is based on fragile assumptions.
> 
> It may be a bug, and I personally try to never depend on path code that 
> is case sensitive or not, but I bet there's a *lot* of code out there 
> that makes those assumptions.

That's a good way to deal with paths (don't assume anything). And I'd 
bet even case-sensitive filesystems differ in behaviour when presented 
with different Unicode normalization forms (precomposed characters 
vs. combining sequences).
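
A small Python sketch of the normalization problem (again my own 
illustration; the helper names are made up, and choosing NFC is an 
assumption, since a given filesystem may use NFD, some variant of it, or 
no normalization at all). Two spellings of the same visible name compare 
unequal as strings, and the only reliable test for existing files is to 
ask the filesystem:

```python
import os
import unicodedata

# The same visible name, spelled two ways.
precomposed = "\u00e9.txt"   # U+00E9 LATIN SMALL LETTER E WITH ACUTE
combining = "e\u0301.txt"    # "e" followed by U+0301 COMBINING ACUTE ACCENT

def same_after_nfc(a, b):
    """Compare two names after normalizing both to NFC.

    NFC is an assumption made for this example only; it does not tell you
    what any particular filesystem actually does.
    """
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)

def same_file(a, b):
    """Ask the filesystem whether two existing paths name the same file."""
    return os.path.samefile(a, b)

print(precomposed == combining)                # False
print(same_after_nfc(precomposed, combining))  # True
```

same_file() raises an error if either path does not exist, which is 
rather the point: without the filesystem there is nothing to ask.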

-- 
Michel Fortin
michel.fortin at michelf.ca
http://michelf.ca/


