GSoC 2018 - Your project ideas
H. S. Teoh
hsteoh at quickfur.ath.cx
Wed Dec 13 20:13:17 UTC 2017
On Wed, Dec 13, 2017 at 07:50:44PM +0000, bpr via Digitalmars-d-announce wrote:
> On Tuesday, 5 December 2017 at 18:20:40 UTC, Seb wrote:
[...]
> Of the projects in [2], I like the general purpose betterC libraries
> most, and I think it's something where students could make a real
> impact in that time period.
[...]
> > [2] https://wiki.dlang.org/GSOC_2018_Ideas
The "Who's (using) who?" project can use my symbol dependency tool as a
starting point:
https://github.com/quickfur/symdep
Basically, as it stands, it can extract the list of symbols from the
program and work out which symbols reference which others, where "A
references B" means the disassembled code between A and the next symbol
in the executable contains a reference to an address somewhere between B
and the next symbol after B. This is done by inspecting the output of
the `objdump` tool. The list of dependencies can be produced either as
text or in GraphViz .dot format, which can be passed to a Graphviz
layout tool such as `dot` or `neato` to produce a graphical chart of
symbol dependencies.
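To make the "A references B" rule concrete, here is a minimal Python sketch of the address-range lookup. It is not symdep's actual code: the function names, the input shapes (a list of symbol start addresses and a list of from/to address pairs already pulled out of a disassembly), and the decision to drop self-references are all assumptions made for illustration.

```python
from bisect import bisect_right

def build_reference_graph(symbols, references):
    """symbols: list of (start_address, name) pairs.
    references: list of (from_address, to_address) pairs.
    Both input shapes are hypothetical, chosen for this sketch."""
    table = sorted(symbols)
    starts = [addr for addr, _ in table]

    def owner(address):
        # An address belongs to the last symbol that starts at or before it.
        # Addresses past the final symbol are attributed to that symbol,
        # since this sketch has no end-of-section information.
        i = bisect_right(starts, address) - 1
        return table[i][1] if i >= 0 else None

    graph = {name: set() for _, name in table}
    for src, dst in references:
        a, b = owner(src), owner(dst)
        # Skip addresses before the first symbol, and skip self-references
        # (a simplification: local jumps inside A would otherwise show up).
        if a is not None and b is not None and a != b:
            graph[a].add(b)
    return graph

def to_dot(graph):
    """Render the graph as GraphViz .dot text."""
    lines = ["digraph symdep {"]
    for a in sorted(graph):
        for b in sorted(graph[a]):
            lines.append(f'    "{a}" -> "{b}";')
    lines.append("}")
    return "\n".join(lines)
```

With symbols at 0x1000 (`_Dmain`), 0x1040 (`foo`) and 0x1080 (`bar`), a reference from 0x1010 to 0x1040 becomes the edge `_Dmain -> foo`, because 0x1010 falls between `_Dmain` and the next symbol.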
As of now, the following are possible points of improvement:
- Make it work on Windows and other OSes that don't have the `objdump`
  utility;
- Add better capability to limit the output to a subgraph of the full
  graph. Because of the huge number of symbols in a typical D program,
  outputting the entire dependency graph will produce a graph far too
  large to be easily understood.
  Currently, symdep has the capability of restricting the output to the
  subgraph of symbols reachable from a certain given symbol (useful for
  answering "what does function foo call?"), or the subgraph of symbols
  NOT reachable from a certain given symbol (e.g., "what are the symbols
  that aren't reachable from _Dmain?"). However, in medium-to-large D
  programs, the resulting subgraph is still far too large to be useful,
  so a better way of selecting a subgraph would be nice. Perhaps
  adding a maximum recursion depth to the existing subgraph functions
  might be a good start, i.e., "what are the symbols referenced by
  _Dmain up to 3 levels down the call chain / reference graph?".
- Better accuracy for dependency detection. Currently, it may not
  produce the most accurate results because if a module has private /
  static symbols that don't show up as public symbols in the
  executable, symdep can't tell when a reference is actually to such a
  private symbol, and will blindly assume that it's referencing the
  closest public symbol that comes before the private symbol in the
  executable. This makes the output graph inaccurate.
  Also, some references that go through indirection may not be detected
  correctly, e.g., if function F calls function G via a function pointer
  table or thunk. (I think the function table case should still work, as
  long as the function table itself has a public symbol; it will just
  show up in the output as F -> tableSym -> G. But this has not been
  rigorously tested.)
- Currently, symdep does not distinguish between code symbols and data
  symbols. For its stated purpose (i.e., find unexpected dependencies
  on Phobos modules that seemingly aren't used), this is not necessarily
  a bad thing. But being able to tell the difference helps to make the
  output more readable, e.g., use different node shapes for code vs.
  data symbols; it also allows subgraph queries to be restricted to a
  particular node type (show me the call graph vs. show me the data
  dependency graph), etc.
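As a rough illustration of the "maximum recursion level" idea from the subgraph item above, a depth-limited breadth-first search is one way it could be done. This is a minimal Python sketch under assumptions, not symdep's implementation: the graph is assumed to be a dict mapping each symbol name to the set of symbols it references, and `reachable_within` and `subgraph` are hypothetical names.

```python
from collections import deque

def reachable_within(graph, root, max_depth):
    """Return {symbol: depth} for every symbol reachable from root
    by following at most max_depth reference edges."""
    depth = {root: 0}
    queue = deque([root])
    while queue:
        node = queue.popleft()
        if depth[node] == max_depth:
            continue  # do not expand past the depth limit
        for nxt in graph.get(node, ()):
            if nxt not in depth:  # first visit is the shortest distance in BFS
                depth[nxt] = depth[node] + 1
                queue.append(nxt)
    return depth

def subgraph(graph, keep):
    """Restrict the graph to the symbols in keep, dropping edges that leave it."""
    return {a: {b for b in graph.get(a, ()) if b in keep} for a in keep}
```

Calling `subgraph(graph, reachable_within(graph, "_Dmain", 3))` would then answer "what are the symbols referenced by _Dmain up to 3 levels down?". Breadth-first order is used so that each symbol is recorded at its shortest distance from the root, and cycles in the reference graph do not cause repeated visits.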
T
--
Dogs have owners ... cats have staff. -- Krista Casada