what are guidelines for when to split a module into a package?

Timothee Cour thelastmammoth at gmail.com
Thu Feb 22 07:44:49 UTC 2018


>  it's harder to find symbols

I don't understand this argument.

```
dscanner --declaration startsWith
./std/algorithm/searching.d(4105:6)
./std/algorithm/searching.d(4195:6)
./std/algorithm/searching.d(4265:6)
./std/algorithm/searching.d(4301:6)
```
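
As a minimal sketch (not part of the original message), the same point can be seen from the client side. It assumes a current Phobos in which `std.algorithm` is a package that publicly imports its submodules. `squaresOfOdds` is a hypothetical helper used only for illustration. The caller imports from the package and never names the submodule that declares each symbol:

```d
// Selective import through the std.algorithm package; the caller does not
// name the submodule that declares each symbol.
import std.algorithm : filter, map, startsWith;
import std.array : array;

// Hypothetical helper, named for illustration only.
int[] squaresOfOdds(int[] values)
{
    // filter and map are declared in std.algorithm.iteration
    return values
        .filter!(x => x % 2 == 1)
        .map!(x => x * x)
        .array;
}

void main()
{
    // startsWith is declared in std.algorithm.searching
    assert("std/algorithm/searching.d".startsWith("std/"));
    assert(squaresOfOdds([1, 2, 3, 4, 5]) == [1, 9, 25]);
}
```

The exact submodule only matters when code imports from it directly, which is the case a declaration lookup tool covers.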


On Wed, Feb 21, 2018 at 11:31 PM, Jonathan M Davis via Digitalmars-d
<digitalmars-d at puremagic.com> wrote:
> On Wednesday, February 21, 2018 23:13:33 Timothee Cour via Digitalmars-d
> wrote:
>> from my perspective it makes sense to split a module M into submodules
>> A, B when:
>> * M is large
>> * there's little interaction between A and B (eg only few symbols from
>> A are needed in B and vice versa)
>> * A and B are logically grouped (that is domain specific)
>> * it doesn't turn into an extreme (1 function per module)
>>
>> Advantages of splitting:
>> * easier to review
>> * easier to edit (no need to scroll much to see entirety of module
>> we're editing)
>> * less pollution from top-level imports as they only affect submodule
>> (likewise with top-level attributes etc)
>> * more modular
>> * doesn't affect existing code since `import M` will continue to work
>> after M is split into a package
>> * less memory when using separate compilation
>> * allows fine-grained compiler options (eg we can compile B with `-O` if
>> needed)
>> * allows to run unittests just for A instead of M
>> * allows selective import in client to avoid pulling in too many
>> dependencies (see arguments that were made for std.range.primitives)
>>
>> Disadvantages of splitting:
>> * more files; but not sure why that's a problem so long as we don't go
>> into extremes (eg 1 function per module or other things of bad taste)
>>
>> ---
>> while working on https://github.com/dlang/phobos/pull/6178 I had
>> initially split M:std.array into submodules:
>> A:std.array.util (the old std.array) and B:std.array.static_array
>> (everything added in the PR)
>> IMO this made sense according to my above criteria (in this case there
>> was 0 interaction between A and B), but the reviewers disagreed with
>> the split.
>>
>> So, what are the guidelines?
>
> It's decided on a case-by-case basis but is generally only done if the
> module is quite large. std.array is not particularly large. It's less than
> 4000 lines, including unit tests and documentation, and it only has 18
> top-level symbols.
>
> Also, remember that within Phobos, imports are supposed to be as localized
> as possible - both in terms of where the import is placed and in terms of
> selective imports - e.g. it would be
>
> import std.algorithm.searching : find;
>
> not
>
> import std.algorithm : find;
>
> which means that splitting the module then requires that all of those
> imports be even more specific. User code can choose to do that or not, but
> it does make having modules split up further that much more tedious. Related
> to that is the fact that anyone searching for these symbols now has more
> modules to search through. So, finding symbols will be harder. Take
> std.algorithm for instance. It was split, because it was getting large
> enough that compiling it on machines without large amounts of memory
> resulted in the compiler running out of memory. So, there was a very good
> argument for splitting it. However, now, even if you know that a symbol is
> in std.algorithm, do you know where in std.algorithm it is? Some are obvious
> - e.g. sort is in std.algorithm.sorting. However, others are not so
> obvious - e.g. where does startsWith live? Arguably, it could go in either
> std.algorithm.comparison or std.algorithm.searching. It turns out that it's
> in std.algorithm.searching, but I generally have to look it up. And where do
> functions like map or filter live? std.algorithm.mutation?
> std.algorithm.iteration? It's not necessarily obvious at all.
>
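
For readers unfamiliar with the Phobos convention described in the quoted paragraph, here is a minimal sketch of a localized, selective import (not from the original thread; `isDModulePath` is a hypothetical name chosen for illustration). The import sits inside the function that needs it and names the exact submodule, which is why a further split would force such lines to change:

```d
// Hypothetical helper, named for illustration only.
bool isDModulePath(string path)
{
    // Function-local, selective import: the symbols are visible only in this
    // function, and the import names the exact submodule declaring them.
    import std.algorithm.searching : endsWith, startsWith;

    return path.startsWith("std/") && path.endsWith(".d");
}

void main()
{
    assert(isDModulePath("std/algorithm/searching.d"));
    assert(!isDModulePath("src/dmd/mars.d"));
}
```
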
> From the perspective of users trying to find stuff, splitting modules up
> comes at a real cost, and I honestly don't understand why some folks are in
> a hurry to make modules really small. That means more import statements when
> using those modules, and it means that it's harder to find symbols.
>
> Personally, I think that we should be very slow to consider splitting
> modules and only do so when it's clear that there's a real need, and
> std.array is nowhere near that level.
>
> - Jonathan M Davis
>
>

