phobo's std.file is completely broke!

Vladimir Panteleev thecybershadow.lists at gmail.com
Wed Sep 19 08:36:35 UTC 2018


On Wednesday, 19 September 2018 at 08:18:38 UTC, Nick Sabalausky 
(Abscissa) wrote:
>> Someone mentioned in this thread that .NET runtime does do the 
>> long-path workaround automatically. One thing we could do is 
>> copy EXACTLY what C# is doing.
>> 
>
> This is a complete textbook example of the "appeal to 
> authority" fallacy.
>
> If an approach is valid, then it stands on its own merits 
> regardless of whether or not Microsoft implemented it.
>
> If an approach is invalid, then it fails on its own demerits 
> regardless of whether or not Microsoft implemented it.
>
> What MS has or hasn't implemented and released is completely 
> irrelevant WRT validity and correctness.
>
> What it *might* be useful for is as a starting point for 
> further exploration. But that is all.

No, absolutely not.

Microsoft is in charge of the implementation. We can't know that 
any deviation from Microsoft's algorithm will work in all 
situations, or all past/future implementations of the API. There 
is the considerable possibility that there are situations which 
we cannot foresee from our limited knowledge of the problem and 
Windows API implementation; on the other hand, Microsoft not only 
has complete knowledge of the implementation, but also controls 
its future. They have an incentive to keep the .NET algorithm 
working.

If we deviate from the .NET algorithm and D breaks (but not C#), 
it is our fault.

If we implement the .NET algorithm, then we are as good as C#. If 
it breaks, it's Microsoft's fault.

You cannot evaluate any intrinsic merit here because the result 
is beyond your control.

> If one extra OS API call + allocation per std.file API call is 
> unacceptable, then explain how it is unacceptable. I disagree 
> that it is significant enough to be unacceptable.

It is not unacceptable, but it is a drawback.

> If a user needs to optimize their 
> already-working-for-all-accepted-inputs application, then they 
> are free to do so. I argue that building this into the standard 
> library's default behaviour amounts to mandatory premature 
> optimization, prioritizing premature optimization over 
> correctness. Prove me wrong.

You could extend this argument to any severity of workarounds. 
Where do you draw the line?

>> - Using paths longer than MAX_PATH is an exceptional 
>> situation. Putting the workaround in the main code path 
>> penalizes 99.9% of use cases.
>
> I have many filepaths on my system right now which exceed 
> MAX_PATH in total length. I submit that this "penalty" you 
> speak of is nothing more than a trivial performance boost at 
> the expense of correctness. Furthermore, I submit that long 
> paths which need extra optimization are MORE exceptional than 
> long paths which do NOT need extra optimization.

Optimization is the least concern.

>> - The registry switch in newer Windows versions removes the 
>> need for this workaround, so systems with it enabled are 
>> penalized as well.
>
> Using the Phobos-based workaround on a system WITH the longpath 
> setting supported and enabled results in slightly reduced 
> performance (which can be overridden and optimized when 
> necessary).

I'm more concerned about differences in behavior.

> OTOH, NOT using the Phobos-based workaround on a system where 
> the longpath setting is NOT supported *OR* NOT enabled results 
> in erroneous behavior.

I disagree that failure on paths exceeding MAX_PATH is 
necessarily erroneous behavior. The API reports an error given 
the user's path, so should Phobos.

> The superior default is clear: Use the workaround except where 
> the workaround is known to be safe to omit.

We don't even have an algorithm for determining for sure when the 
workaround is needed.

>> - There is still the matter regarding special filenames,
>
> If you're referring to NUL, COM1, COM2, etc, then this is 
> completely orthogonal.

Yes. How so? It is the same issue: paths with certain properties 
are valid on all platforms except on Windows. Phobos errors out 
when attempting to access/create them. A simple workaround is 
available: expand/normalize the path, prepend the \\?\ 
(extended-length) prefix, and use Unicode APIs.
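
To make those steps concrete, here is a rough sketch of the path 
rewriting in Python. It only does the string handling, using the 
standard ntpath module so it runs on any OS. The helper name 
to_extended_path and the cwd parameter are made up for illustration; 
this is not what Phobos or .NET actually does. A real implementation 
would presumably expand the path with GetFullPathNameW and then pass 
the result to the wide-character (W) file APIs, and it would have to 
deal with cases this sketch ignores, such as drive-relative paths 
like "D:file".

```python
import ntpath

# Prefixes that tell the Windows wide-character APIs to skip the usual
# path parsing (and with it the MAX_PATH limit and special-name handling).
EXTENDED_PREFIX = "\\\\?\\"          # \\?\
UNC_EXTENDED_PREFIX = "\\\\?\\UNC\\"  # \\?\UNC\


def to_extended_path(path, cwd="C:\\"):
    """Hypothetical helper: expand, normalize, and prefix a Windows path.

    `cwd` stands in for the process's current directory so that the
    sketch stays a pure string function.
    """
    # Already in extended form: leave it alone.
    if path.startswith(EXTENDED_PREFIX):
        return path

    # Step 1: expand relative paths and collapse "." / ".." / "/" by hand,
    # because a \\?\ path is passed to the file system without normalization.
    full = ntpath.normpath(ntpath.join(cwd, path))

    # Step 2: prepend the prefix. Network paths (\\server\share\...) use
    # the \\?\UNC\ form instead of plain \\?\.
    if full.startswith("\\\\"):
        return UNC_EXTENDED_PREFIX + full[2:]
    return EXTENDED_PREFIX + full


if __name__ == "__main__":
    print(to_extended_path("foo\\..\\bar\\NUL", cwd="C:\\work"))
    print(to_extended_path("\\\\server\\share\\a\\.\\b"))
```

Step 3, "use Unicode APIs", is not shown: the rewritten string only 
has the intended effect when handed to the W variants of the file 
functions.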

>> as well as whether the expected behavior is really to succeed 
>> and create paths inaccessible to most software, instead of 
>> failing.
>
> Ok, suppose we decide "Sure, we have reason to believe there 
> may be a significant amount of software on Windows which fails 
> to handle long paths and we want to ensure maximum 
> compatibility with those admittedly broken programs." That's 
> fine. I can get behind that. HOWEVER, that does NOT mean we 
> should leave our APIs as they are, because currently, our APIs 
> fail at that goal. Instead, what it really means is that our 
> APIs should be designed to *REJECT* long paths with an 
> appropriately meaningful error message - and a reasonable 
> workaround - and NOT to blindly just pass them along as they 
> currently do.

This is not possible, because you need to precisely know how the 
implementation will handle the path. Considering the 
implementation's behavior can be configured by the user, I don't 
think this is feasible.

> Either way, Phobos needs changed:
>
> Do you believe D should prevent its own software from being 
> broken on long paths? Then Phobos should be modified to detect 
> and fix long paths.
>
> Do you believe D should permit breakage on long paths and 
> encourage its programs to play nicely with other non-D Windows 
> software that is *also* broken on long paths? Then Phobos 
> should be modified to detect and *reject* long paths.
>
> Either way, the current Phobos behavior is clearly the worst of 
> both worlds and needs modification.

Sorry, I don't see how you're reaching that conclusion. Looks 
like a false dichotomy.
