Path as an object in std.path

H. S. Teoh hsteoh at quickfur.ath.cx
Thu Jun 6 12:27:16 PDT 2013


On Thu, Jun 06, 2013 at 02:38:41PM -0400, Andrei Alexandrescu wrote:
> On 6/6/13 2:13 PM, Jonathan M Davis wrote:
> >>An example of a strong justification for a redo is, for example,
> >>conversion to use ranges. std.zip needs that treatment.
> >
> >Agreed.
> 
> Key to success for Path: somehow get it on the ranges bandwagon :o).
[...]

Hmm. Let's see:

	assert(isInputRange!Path);
	version(Windows)
		auto p = Path(`..\blah\blah\..\bluh`);
	else version(linux)
		auto p = Path(`../blah/blah/../bluh`);

	// I'm assuming auto normalization; if you don't like that,
	// pretend I also wrote this line:
	//	p.normalize();

	assert(p.equal([
		"..",
		"blah",
		"bluh"
	]));

What about that? ;-)

While the above may *look* attractive, it's actually a minefield full of
pitfalls. Consider this directory tree in Posix:

	/home/user/test
	/home/user/test/symlink -> /home/user/test/real/1
	/home/user/test/real
	/home/user/test/real/1/myfile
	/home/user/test/real/2/anotherfile

Let's say the current working directory is /home/user. Now consider
this:

	auto p = Path(`test/symlink/../2/anotherfile`);
	assert(std.file.exists(p));	// should this work?

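The only way the above can actually work is if normalization queries the
filesystem. The OS resolves the symlink before applying "..", so ".."
means the parent of the symlink's target, whereas collapsing "symlink/.."
as text yields test/2/anotherfile, which doesn't exist. That is to say,
it is NOT mere string manipulation.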

However, *should* normalization always check the filesystem? What if the
program is constructing a list of paths that it's going to create, which
don't exist in the filesystem yet? Then normalization will fail, even
though the paths are valid.
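
To make the symlink case concrete, here is a rough sketch that rebuilds
the tree above under a scratch directory and compares what the OS does
with what string-only normalization does. It uses
std.path.buildNormalizedPath as the purely lexical normalizer. The
scratch directory name path_norm_demo and the helper rawVsLexical are
made up for illustration, and the demo is Posix-only because it creates
a symlink.

```d
import std.file : exists, mkdirRecurse, rmdirRecurse, symlink, tempDir, write;
import std.path : buildNormalizedPath, buildPath;

// Hypothetical helper: builds the example tree under a scratch directory
// and reports whether [raw path, lexically normalized path] exist.
version (Posix)
bool[2] rawVsLexical()
{
    // Normalize the root up front so the comparison below does not depend
    // on the exact form of whatever tempDir() returns.
    immutable root = buildNormalizedPath(tempDir(), "path_norm_demo");
    if (root.exists)
        rmdirRecurse(root);

    mkdirRecurse(buildPath(root, "test/real/1"));
    scope (exit) rmdirRecurse(root);
    mkdirRecurse(buildPath(root, "test/real/2"));
    write(buildPath(root, "test/real/1/myfile"), "");
    write(buildPath(root, "test/real/2/anotherfile"), "");
    symlink(buildPath(root, "test/real/1"), buildPath(root, "test/symlink"));

    // Handed to the OS as-is: the symlink is followed, then ".." applies.
    immutable raw = buildPath(root, "test/symlink/../2/anotherfile");

    // String-only normalization cancels "symlink/.." without looking.
    immutable lexical = buildNormalizedPath(raw);
    assert(lexical == buildPath(root, "test/2/anotherfile"));

    return [raw.exists, lexical.exists];
}

void main()
{
    version (Posix)
    {
        immutable result = rawVsLexical();
        assert(result[0]);   // the raw path reaches test/real/2/anotherfile
        assert(!result[1]);  // the normalized string names nothing
    }
}
```

Note that buildNormalizedPath never touches the filesystem, so it is
happy with paths that don't exist yet; the price is that it gives a
different answer from the OS as soon as a symlink is involved.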

Conclusion: correct path normalization depends on intent, which only the
programmer knows -- the library can't possibly figure this out without
being told. (And I haven't even started getting into OS-dependent path
manipulation yet... what should Path(`C:\Program Files\abc.def`) do on a
Posix system?) IOW, the programmer *already* has to know about
system-dependent details of paths, so I'm not sure what value Path is
really adding. At least, I'm not finding it compelling enough to eschew
plain old string manipulations.
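
As a small illustration of the OS-dependence, this sketch feeds the
Windows-style path from above to a few existing std.path functions. The
results depend on which platform the code is compiled for, since
std.path picks its separator rules at compile time. The name winPath is
just a label for this example.

```d
import std.path : baseName, extension, isAbsolute;

void main()
{
    enum winPath = `C:\Program Files\abc.def`;

    version (Posix)
    {
        // Backslash is an ordinary filename character here, so the whole
        // string is one relative path component.
        assert(!isAbsolute(winPath));
        assert(baseName(winPath) == winPath);
    }
    else version (Windows)
    {
        // Drive letter plus backslashes are understood as path syntax.
        assert(isAbsolute(winPath));
        assert(baseName(winPath) == "abc.def");
    }

    // The extension happens to come out the same either way.
    assert(extension(winPath) == ".def");
}
```

Wrapping the string in a Path type would not remove this split; somebody
still has to decide which platform's rules the string is supposed to
follow.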

Besides, should glob patterns like "/home/user/prog/*/*.d" be Paths or
strings? What about path regexes? Should Path export a whole suite of
parallel methods for constructing such patterns? One can always
interconvert to/from strings, of course, but if we'd started out with
strings in the first place, we wouldn't need any conversions. The OS
ultimately takes only strings anyway, so is there really a need to
insert a conversion to/from Path in between?
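
For what it's worth, glob matching already works on plain strings via
std.path.globMatch, with no intermediate type. A minimal sketch, where
the helper name matching and the sample file names are invented for the
example:

```d
import std.algorithm : filter;
import std.array : array;
import std.path : globMatch;

// Hypothetical helper: keep only the paths that match a glob pattern.
string[] matching(string[] paths, string pattern)
{
    return paths.filter!(p => globMatch(p, pattern)).array;
}

void main()
{
    auto paths = [
        "/home/user/prog/src/main.d",
        "/home/user/prog/test/util.d",
        "/home/user/prog/README",
    ];

    // Both the paths and the pattern are ordinary strings throughout.
    assert(matching(paths, "/home/user/prog/*/*.d") == paths[0 .. 2]);
}
```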

I do see a lot of value in providing *functions* for manipulating path
strings (normalizations, parsing path components, splitting file
extensions, etc.), but I've a hard time with encapsulating a path string
in an opaque object when it doesn't really give that much more value. If
you *really* like the idea of Path, nothing stops you from writing one
yourself and having it implicitly convert to string so that you can pass
it directly to OS functions that take paths. I just don't see value in
requiring Phobos functions to only take Path objects.
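
Here is roughly what I have in mind, as a sketch only. The first half
uses existing std.path functions directly on strings; the second half is
a hypothetical user-written Path struct (not anything in Phobos) that
normalizes lexically on construction and uses alias this to convert back
to string. takesString is a stand-in for any API that wants a plain
string.

```d
import std.algorithm : equal;
import std.path : baseName, buildNormalizedPath, extension, pathSplitter,
    stripExtension;

// Hypothetical thin wrapper, not part of Phobos.
struct Path
{
    string value;
    alias value this;   // lets a Path be passed where a string is expected

    this(string s)
    {
        value = buildNormalizedPath(s);   // lexical normalization only
    }

    // Range of path components, borrowed from std.path.
    auto components()
    {
        return pathSplitter(value);
    }
}

// Stand-in for any existing function that takes a plain string.
size_t takesString(string s)
{
    return s.length;
}

void main()
{
    // Plain functions on plain strings.
    assert(buildNormalizedPath("../blah/blah/../bluh")
        .pathSplitter.equal(["..", "blah", "bluh"]));
    assert(baseName("dir/file.ext") == "file.ext");
    assert(extension("dir/file.ext") == ".ext");
    assert(stripExtension("dir/file.ext") == "dir/file");

    // The wrapper adds a type, but converts back implicitly.
    auto p = Path("../blah/blah/../bluh");
    assert(p.components.equal(["..", "blah", "bluh"]));
    string s = p;
    assert(takesString(p) == s.length);
}
```

Everything the wrapper does is a one-line call into std.path, which is
more or less my point.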


T

-- 
WINDOWS = Will Install Needless Data On Whole System -- CompuMan

