Am I reading this wrong, or is std.getopt *really* this stupid?

Jon Degenhardt jond at noreply.com
Sat Mar 24 19:14:32 UTC 2018


On Saturday, 24 March 2018 at 16:11:18 UTC, Andrei Alexandrescu 
wrote:
> Anyhow. Right now the order of processing is the same as the 
> lexical order in which flags are passed to getopt. There may be 
> use cases for which that's the more desirable way to go about 
> things, so if you author a PR to change the order you'd need to 
> build an argument on why command-line order is better. FWIW the 
> traditional POSIX doctrine makes behavior of flags independent 
> of their order, which would imply the current choice is more 
> natural.

Several of the TSV tools I built rely on command-line order. 
There is an enhancement request here: 
https://issues.dlang.org/show_bug.cgi?id=16539.

A few of the tools use a paradigm where the user is entering a 
series of instructions on the command line, and there are times 
when the user-entered order matters. Two general cases:

* Display/output order - The tool produces delimited output, and 
the user wants to control the order. The order of command line 
options determines the order.

* Short-circuiting - tsv-filter in particular allows numeric 
tests like less-than, but also allows the user to short-circuit 
the test by testing if the data contains a valid number prior to 
making the numeric test. This is done by evaluating the command 
line arguments in left-to-right order.

Short-circuiting is also supported by the Unix `find` utility.
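
To make the short-circuiting case concrete, here is a minimal 
sketch in Python (not D, and not the tsv-filter source). The 
helper names `is_numeric`, `less_than` and `keep_line` are 
hypothetical; the only point is that the tests are evaluated in 
the order the user gave them, so a validity check can guard a 
numeric test that would otherwise fail on bad data.

```python
def is_numeric(field):
    """Build a test that passes when fields[field] parses as a number."""
    def test(fields):
        try:
            float(fields[field])
            return True
        except ValueError:
            return False
    return test


def less_than(field, limit):
    """Build a numeric test; it raises ValueError on non-numeric data."""
    def test(fields):
        return float(fields[field]) < limit
    return test


def keep_line(fields, tests):
    # all() stops at the first failing test, so tests later in the
    # list never see a line that an earlier test already rejected.
    return all(test(fields) for test in tests)


# Tests listed in the order the user typed them (hypothetical input).
tests = [is_numeric(1), less_than(1, 100.0)]
rows = [["a", "42"], ["b", "n/a"], ["c", "250"]]
kept = [row for row in rows if keep_line(row, tests)]
```

With the tests in this order only the first row is kept. If the 
two tests are swapped, the numeric comparison runs first and 
raises on the "n/a" row, which is why the evaluation order has to 
follow the command line.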

I have used this approach for CLI tools I've written in other 
languages. Perl's Getopt::Long processes args in command-line 
order, so it supports this.

I considered submitting a PR to getopt to change this, but 
decided against it. The approach used looks like it is central to 
the design, and changing it in a backward compatible way would be 
a meaningful undertaking. Instead I wrote a cover to getopt that 
processes arguments in command-line order. It is here: 
https://github.com/eBay/tsv-utils-dlang/blob/master/common/src/getopt_inorder.d. It handles most of what std.getopt handles.
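
As a rough illustration of the difference (a toy Python sketch, 
not how getopt_inorder.d or std.getopt is implemented), compare 
walking the command line left to right with walking the declared 
options and scanning the command line for each one. The function 
names, and the assumption that every option takes exactly one 
value, are mine and are only there to keep the example short.

```python
def parse_in_command_line_order(argv, handlers):
    """Call each option's handler in the order the user typed the options."""
    for i in range(0, len(argv), 2):
        flag, value = argv[i], argv[i + 1]
        handlers[flag](value)


def parse_in_declaration_order(argv, handlers):
    """For each declared option in turn, scan the whole command line for it."""
    for flag, handler in handlers.items():
        for i in range(0, len(argv), 2):
            if argv[i] == flag:
                handler(argv[i + 1])


def run(parse, argv):
    """Collect the requested operations in the order the parser reports them."""
    ops = []
    handlers = {  # declaration order: median first, then sum
        "--median": lambda field: ops.append(("median", field)),
        "--sum": lambda field: ops.append(("sum", field)),
    }
    parse(argv, handlers)
    return ops


argv = ["--sum", "3", "--median", "2", "--sum", "1"]
in_order = run(parse_in_command_line_order, argv)
declared = run(parse_in_declaration_order, argv)
```

Here `in_order` is sum 3, median 2, sum 1, matching what the user 
typed, while `declared` is median 2, sum 3, sum 1. For a tool 
whose output columns follow the option order, only the first 
result is usable.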

The TSV utilities documentation should help illustrate these 
cases. tsv-filter uses short-circuiting: 
https://github.com/eBay/tsv-utils-dlang/blob/master/docs/ToolReference.md#tsv-filter-reference. Look for "Short-circuiting expressions" toward the bottom of the section.

tsv-summarize obeys the command-line order for output/display. 
See: 
https://github.com/eBay/tsv-utils-dlang/blob/master/docs/ToolReference.md#tsv-summarize-reference.

There's one other general limitation I encountered with the 
current compile-time approach to command-line argument 
processing. I couldn't find a clean way to allow it to be 
extended in a plug-in manner.

In particular, the original goal for the tsv-summarize tool was 
to allow users to create custom operators. The tool has a fair 
number of built-in operators, like median, sum, min, max, etc. 
Each of these operators has a getopt arg invoking it, e.g. 
'--median', '--sum', etc. However, it is common for people to 
have custom analysis needs, so allowing extension of the set 
would be quite useful.

The code is set up to allow this. People would clone the repo, 
write their own operator, place it in a separate file they 
maintain, and rebuild. However, I couldn't figure out a clean way 
to allow additions to the command-line argument set. There may be a 
reasonable way and I just couldn't find it, but my current 
thinking is that I need to write my own command line argument 
handler to support this idea.

I think handling command-line argument processing at run-time 
would make this simpler, at the cost of losing some compile-time 
validation.
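
A minimal sketch of what I mean by run-time handling, again in 
Python rather than D and with hypothetical names (`OPERATORS`, 
`operator`, `summarize`). Each operator registers its own flag in 
a table, so a user-maintained file can add one without touching 
the central option list. The trade-off shows up directly: an 
unknown or duplicate flag is only detected when the program runs.

```python
import statistics

# Run-time table mapping a command-line flag to the operator behind it.
OPERATORS = {}


def operator(flag):
    """Decorator that registers a function as the operator for a flag."""
    def decorate(fn):
        if flag in OPERATORS:
            raise ValueError(f"duplicate operator flag: {flag}")
        OPERATORS[flag] = fn
        return fn
    return decorate


# Built-in operators.
@operator("--sum")
def op_sum(values):
    return sum(values)


@operator("--median")
def op_median(values):
    return statistics.median(values)


# A custom operator, as it might appear in a separate user file.
@operator("--range")
def op_range(values):
    return max(values) - min(values)


def summarize(flags, values):
    """Apply the requested operators in command-line order."""
    results = []
    for flag in flags:
        if flag not in OPERATORS:
            raise ValueError(f"unknown operator: {flag}")
        results.append((flag, OPERATORS[flag](values)))
    return results


result = summarize(["--range", "--sum", "--median"], [4, 1, 7])
```

The custom `--range` operator is picked up the same way as the 
built-in ones, and the results come back in the order the flags 
were given.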

--Jon

