SAOC LLDB D integration: 6th Weekly Update

Luís Ferreira contact at lsferreira.net
Thu Oct 28 01:17:11 UTC 2021


Hi D community!

I'm here again to describe what I did during the sixth week of
Symmetry Autumn of Code.

## LLVM Patches follow up

The first two patches were merged into the LLVM tree!

- https://reviews.llvm.org/D111947
- https://reviews.llvm.org/D111948

Hopefully we can now proceed with merging the demangling patches as the
next step.

## LLDB D Plugin

This week I primarily worked on getting the D plugin working. I added
two features to the plugin: generic handling of D slices and the
special case of string slices, which are now formatted as D string
literals according to their encoding.
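To make the idea concrete, here is a minimal sketch in Python of how a string slice could be turned into a D string literal. It is not the plugin's actual C++ code. The helper names (`read_slice`, `format_d_string`), the flat `memory` buffer standing in for process memory, and the 64-bit little-endian target are all assumptions made for illustration. It relies only on the D ABI layout of a slice (length first, then pointer) and on the `w` and `d` literal suffixes seen in the output below.

```python
import struct

# Hypothetical mapping from element size to (codec, D literal suffix).
# Little-endian data is an assumption made for this sketch.
CODECS = {1: ("utf-8", ""), 2: ("utf-16-le", "w"), 4: ("utf-32-le", "d")}


def read_slice(memory: bytes, slice_addr: int, elem_size: int) -> bytes:
    """Read a D slice header (length, then pointer) and return its raw data.

    `memory` is a flat buffer standing in for the process memory, so
    addresses are plain offsets into it. A 64-bit target is assumed.
    """
    length, ptr = struct.unpack_from("<QQ", memory, slice_addr)
    return memory[ptr:ptr + length * elem_size]


def format_d_string(data: bytes, elem_size: int) -> str:
    """Render raw character data as a D string literal."""
    codec, suffix = CODECS[elem_size]
    text = data.decode(codec, errors="replace")
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"{suffix}'


# Fake memory: a 16-byte slice header at offset 0, character data at offset 16.
payload = "wide atum".encode("utf-16-le")
memory = struct.pack("<QQ", 9, 16) + payload
print(format_d_string(read_slice(memory, 0, 2), 2))  # prints "wide atum"w
```

A real formatter would read the header and the data through the debugger's memory APIs and take the element size from the debug information instead of passing it by hand.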

This is a reduced example of what LLDB can show the user with the D
plugin.

```
* thread #1, name = 'app', stop reason = signal SIGSEGV: invalid
address (fault address: 0xdeadbeef)
    frame #0: 0x0000555555555edc app`app.foobar(p=0x00000000deadbeef,
a=([0] = 1, [1] = 2, [2] = 3), ...) at app.d:43:2
   40           immutable(dchar)[] sh = "double atum"d.dup;
   41           const(wchar)[] si = "wide atum"w.dup;
   42
-> 43           return *p;
   44   }
   45
   46   class CFoo {
(lldb) fr v
(int *) p = 0x00000000deadbeef
(int[]) a = ([0] = 1, [1] = 2, [2] = 3)
(long double) c = 123.122999999999999998
(Foo) f = {}
(string) sa = "atum"
(wstring) sb = "wide atum"w
(dstring) sc = "double atum"d
(char[]) sd = "atum"
(dchar[]) se = "double atum"d
(wchar[]) sf = "wide atum"w
(const(char)[]) sg = "atum"
(dstring) sh = "double atum"d
(const(wchar)[]) si = "wide atum"w
```

If you are eager to try it yourself, check out
[this](https://github.com/ljmf00/llvm-project/commits/llvm-plugin-d)
branch and compile LLDB. I suggest the following steps:

```bash
# Use clang to compile LLVM
export CC=clang
export CXX=clang++

# CMake flags (change LLVM_TARGETS_TO_BUILD if you are not on x86)
cmake -S llvm -B build -G Ninja \
	-DLLVM_ENABLE_PROJECTS="clang;lldb" \
	-DCMAKE_BUILD_TYPE=Debug \
	-DLLDB_EXPORT_ALL_SYMBOLS=OFF \
	-DLLVM_OPTIMIZED_TABLEGEN=ON \
	-DLLVM_ENABLE_ASSERTIONS=ON \
	-DLLDB_ENABLE_PYTHON=ON \
	-DLLVM_TARGETS_TO_BUILD="X86" \
	-DLLVM_CCACHE_BUILD=ON \
	-DLLVM_LINK_LLVM_DYLIB=ON \
	-DCLANG_LINK_CLANG_DYLIB=ON

ninja -C build lldb lldb-server
ldc2 -g app.d
./build/bin/lldb app
```

You can also use
[this](https://gist.github.com/ljmf00/a35da0e41c3a2074d74960e981f43ca6)
file, which is what I use to test the D plugin and what I used to
produce the example above.

### Issues

During plugin development and testing, I found that LLDB did not
properly display UTF-8 strings when a `char8_t` type has a different
name, so I wrote a patch to fix it:
https://reviews.llvm.org/D112564 . I also opened an issue to
cross-reference the fix: https://bugs.llvm.org/show_bug.cgi?id=52324 .
This matters for the D formatter in particular whenever the compiler
exports types with different type names, which it should. Debuggers
should read the DWARF encoding information and rely on that first,
instead of hardcoding formatters by type name. LLDB does that, but this
case was somehow missed in https://reviews.llvm.org/D66447 .
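The following Python sketch illustrates the principle of choosing a string decoder from the DWARF base type encoding and byte size instead of the type name. It is not LLDB's implementation. `pick_codec` is a hypothetical helper and the listed type descriptions are illustrative, not real compiler output. The `DW_ATE_*` values are the ones defined by the DWARF specification.

```python
DW_ATE_SIGNED_CHAR = 0x06  # base type encodings from the DWARF specification
DW_ATE_UTF = 0x10


def pick_codec(encoding: int, byte_size: int):
    """Return a codec name for UTF character types, or None otherwise.

    The type name is deliberately not an input: only the encoding
    attribute and the size of the type decide how it is decoded.
    """
    if encoding != DW_ATE_UTF:
        return None
    return {1: "utf-8", 2: "utf-16", 4: "utf-32"}.get(byte_size)


# Illustrative type descriptions: (name, encoding, byte size).
types = [
    ("char8_t", DW_ATE_UTF, 1),
    ("char", DW_ATE_UTF, 1),
    ("wchar", DW_ATE_UTF, 2),
    ("dchar", DW_ATE_UTF, 4),
    ("signed char", DW_ATE_SIGNED_CHAR, 1),
]
for name, encoding, size in types:
    print(f"{name}: {pick_codec(encoding, size)}")
```

With this approach, a type that carries the UTF encoding is decoded the same way whether it is called `char8_t` or anything else, as long as the compiler emits the encoding.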

While reading how plugins are built with their internal C++ interface, I
found very repetitive code and decided to patch it:
https://reviews.llvm.org/D112658 .

I also happened to reproduce
[this](https://bugs.llvm.org/show_bug.cgi?id=45856) issue that Mathias
reported a while ago and decided to investigate it, since it
indirectly affects the behaviour on the D side. I reached some
conclusions and I believe this is a regression introduced in 2015.
Please read the issue for more context.

I found other issues on the LDC and DMD sides that I have already added
to my task list, including:
- DMD should use the `wchar` and `dchar` type names instead of
`wchar_t`: the current name triggers the hardcoded formatters, which
format char pointers wrongly. Furthermore, it is the wrong type, since
`wchar_t` is not necessarily UTF-16 according to the C standard.
- DMD also reports other types with C-style names instead of D-style
names.
- LDC reports a hardcoded `const(char)` type instead of using a DWARF
type modifier.
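As a quick illustration of the `wchar_t` point, this small Python check uses `ctypes` to print the width of the C `wchar_t` type on the current platform and compares it with D's `wchar`, which is always a 16-bit UTF-16 code unit. The constant names are made up for this sketch.

```python
import ctypes

D_WCHAR_SIZE = 2  # D's wchar is a 16-bit UTF-16 code unit
C_WCHAR_T_SIZE = ctypes.sizeof(ctypes.c_wchar)  # platform dependent

print(f"C wchar_t: {C_WCHAR_T_SIZE} bytes, D wchar: {D_WCHAR_SIZE} bytes")
if C_WCHAR_T_SIZE != D_WCHAR_SIZE:
    print("a formatter keyed on the name wchar_t would assume the wrong width")
```

On platforms where the two sizes differ, a debugger that trusts the `wchar_t` name would read D's `wchar` data with the wrong element width.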

### Mailing list announcement

As discussed earlier in an LLDB bug, I decided to write to the `llvm-dev`
and `lldb-dev` mailing lists to discuss upstreaming the D language
plugin. You can follow the thread
[here](https://lists.llvm.org/pipermail/lldb-dev/2021-October/017101.html).

## What is next?

Next week, I'm going to try to fix the issues listed above in both the
DMD and LDC trees. I need to be careful with these changes so that
I don't break GDB's behaviour, in case it relies on the hardcoded
types. If it does, I'll try to patch GDB too. I'm also going to
finish my DWARF refactor of the backend to handle DWARF abbreviations
correctly. The objective of the second milestone is complete, but I'm
going to study more features to improve pretty printing.

You can also read this on my blog,
[here](https://lsferreira.net/posts/d-saoc-2021-06/).

-- 
Sincerely,
Luís Ferreira @ lsferreira.net
