Splitting std.algorithm
H. S. Teoh via Digitalmars-d
digitalmars-d at puremagic.com
Tue Jan 20 15:40:57 PST 2015
OK, so this past weekend, I was foolhardy enough to attempt (again) to
split std.algorithm into more manageable pieces... and unlucky enough to
actually succeed this time:
https://github.com/D-Programming-Language/phobos/pull/2879
However, right now this PR is in a rather precarious situation, because
git doesn't understand the concept of moving content between files; as a
result, *any* further changes to std/algorithm.d in master must be
manually merged into the branch (as in, the diffs must be applied by
hand). As we all (should) know, such by-hand merges are extremely
error-prone, and subtle bugs can get introduced inadvertently. A
careless mistake on my part could accidentally revert a bugfix PR, for
example.
Besides, such by-hand merges are also very time-consuming, and I'd really
rather do better things than sit around manually applying diffs all
day, and all that without knowing whether or not this PR is going to get
merged at all.
So I have a request: can we please decide ASAP whether or not this PR is
worth it, and, if it is, merge it ASAP? Since we're currently in the
process of improving docs, there's an extremely high chance that
std/algorithm.d will be touched again in the near future. I've already
spent hours manually applying the diffs (and coaxing git to behave,
which is challenging in this situation) for a *single* affected PR that
was merged today, and I really do not want to keep doing this if at all
possible. If this is a bad idea, I'd like to know right now and not
waste any more time on it.
So far, the arguments for splitting std.algorithm are:
- The file is too big. It increases compilation time and compiler memory
usage, and makes locating a particular piece of code more difficult
than it needs to be.
- I can't even run the Phobos unittests on my machine because dmd runs
out of memory and dies. This means either (1) I submit untested PRs so
that I can leverage the autotester to run the tests for me, which
means lots of wasted autotester resources and wasted time for me, as
I have to do the code-compile-test cycle on the autotester; or (2) I
have to manually delete large swaths of code from std.algorithm while
working on the PR, just so I can unittest my changes properly. I'm
sure I'm not the only one here who has trouble running Phobos
unittests; this means the barrier to contribution is needlessly high.
- It's a conglomeration of only tenuously-related functions, and as a
result, importing std.algorithm will pull in roughly half of Phobos as
dependencies that you may not actually need. In the process of
splitting, I have found that I could eliminate many module-level
imports, and/or otherwise reduce dependencies on other Phobos modules.
- Although this PR doesn't do this yet, having smaller submodules means
that other Phobos modules that need something from std.algorithm won't
have to import the entire thing (which in turn would cause a snowball
effect of also importing every dependency of std.algorithm, and their
respective recursive dependencies, most of which are unnecessary
anyway, since it may be just 1 or 2 functions that are actually
needed). With the new submodules, if you need map(), for example, you
could just import std.algorithm.iteration : map, and you won't incur
the cost of also importing stuff that only, say, cartesianProduct
needs.
- An overly large module makes it difficult for new users to understand
what the module does, or whether it happens to contain something they
need. This is to some extent alleviated by proper documentation, but
even then, functions categorized into 6 submodules are a lot more
browsable than a single gigantic module that contains everything
including the kitchen sink.
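
To illustrate the selective-import point above, here is a minimal sketch. It
assumes the split goes in and map ends up in std.algorithm.iteration as
proposed; the helper name squaresOf is made up for this example and is not
part of the PR.

```d
// Assumes the proposed submodule layout: map lives in std.algorithm.iteration.
// Only map is imported, not the whole of std.algorithm.
import std.algorithm.iteration : map;
import std.array : array; // only needed to turn the lazy range into an array

// Hypothetical helper, for illustration only.
int[] squaresOf(int[] xs)
{
    return xs.map!(x => x * x).array;
}

void main()
{
    assert(squaresOf([1, 2, 3]) == [1, 4, 9]);
}
```

Whether this actually cuts compile time or memory use depends on how much the
submodule itself still imports, so treat it as a sketch of the usage rather
than a measurement.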
The only argument against splitting std.algorithm (that I know of) is:
- Andrei doesn't approve because apparently some people think "big files
are not a problem".
So, what do you think? Should we merge this, or should we not?
T
--
Many open minds should be closed for repairs. -- K5 user