Emulated TLS for Android

Joakim via digitalmars-d-ldc digitalmars-d-ldc at puremagic.com
Thu Jul 6 23:54:38 PDT 2017


On Thursday, 6 July 2017 at 12:10:40 UTC, Johannes Pfau wrote:
> Am Thu, 06 Jul 2017 11:13:20 +0000
> schrieb Joakim <dlang at joakim.fea.st>:
>
>> So I've finally spent some time looking at this, i.e. what work 
>> Google did to get a gcc-alike emulated TLS into llvm, since 
>> they've ditched the gcc compiler from their Native Development 
>> Kit (NDK):
>> 
>> [...]
>>
>> With this last change, ldc will have Android cross-compilation 
>> support from every platform that's part of the official 
>> release. Until we get some way to generate a cross-compiled 
>> stdlib from the compiler and accompanying source, I can put up 
>> a tarfile with the cross-compiled stdlib for Android, though 
>> users will still need the NDK for their platform for its native 
>> Android libraries and linker.
>
> Interesting, I've had the exact same problem with GDC which is 
> no surprise as it's using the same mechanism ;-)
>
> 1) Does not work if you want to support mixing C and D code. 
> You can intercept calls from D code, but if the variable is 
> only accessed through C code your custom function is not run. 
> (Unless you intercept the function at runtime, but I'm not sure 
> if this leads to a stable solution...)

I suppose it's possible that you have some extern(C) TLS variable 
in a D module that's accessed first or only from the C code, but 
that seems unlikely.

> 2) The GCC implementation has the advantage of working with 
> dynamically loaded shared libraries, static libraries, any 
> number of threads, and it's runtime-linker agnostic. You have 
> to sacrifice one of these features to know the per-thread 
> memory size. So the GCC solution is quite elegant, but it does 
> not work with the GC too well...

Only shared libraries have not been made to work with the other 
emulated TLS approaches on Android, largely because I have not 
looked into switching Android to the massive 
rt.sections_elf_shared, which ldc already uses for 
non-Android-linux/Darwin/BSD.

> 3) Then you lose C/D compatibility for thread local variables 
> and I'm not sure if the DMD approach fully supports dynamic 
> shared library loading? Do you have some more information about 
> this implementation? I'm wondering whether C compatibility is 
> that important. But TLS for shared library loading etc should 
> work.

C compatibility only breaks in the extreme scenario you alluded 
to, and I doubt there is much intermingling of TLS variables 
with C code, even when they are properly registered with the D GC 
first so that it works fine.  Yeah, no additional D shared libraries on 
Android working yet, as mentioned above, only a single D shared 
library that statically links against the D runtime.

As for more info, Walter wrote an article about it; dmd on OS X 
used it with Mach-O for years afterwards (it is still in the 
defunct x86 version), and I simply copied it over to Android with 
ELF:

http://www.drdobbs.com/architecture-and-design/implementing-thread-local-storage-on-os/228701185
https://github.com/dlang/druntime/commit/73cf2c150
https://github.com/dlang/druntime/pull/784
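
For readers who do not want to dig through those links, here is a rough C sketch of the general idea as I understand it: all TLS variables live in one contiguous image, each thread gets a private copy of that image, and an access is translated by offset into the calling thread's copy. The names (tls_image, tls_get_addr, tls_thread_range) are made up for illustration and are not the druntime symbols, and the struct stands in for what would really be a linker-delimited section.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the contiguous, linker-delimited TLS image. */
struct tls_image_layout { int counter; long total; };
struct tls_image_layout tls_image = { 42, 0 };  /* initial values */

static pthread_key_t block_key;
static pthread_once_t block_once = PTHREAD_ONCE_INIT;

static void make_block_key(void)
{
    /* free() releases a thread's copy when that thread exits. */
    if (pthread_key_create(&block_key, free) != 0)
        abort();
}

/* Returns the calling thread's private copy of the image, made lazily. */
static char *thread_block(void)
{
    pthread_once(&block_once, make_block_key);
    char *block = pthread_getspecific(block_key);
    if (block == NULL) {
        block = malloc(sizeof tls_image);
        if (block == NULL)
            abort();
        memcpy(block, &tls_image, sizeof tls_image);
        if (pthread_setspecific(block_key, block) != 0)
            abort();
    }
    return block;
}

/* Maps an address inside the image to the same offset in this thread's copy. */
void *tls_get_addr(void *p)
{
    size_t offset = (size_t)((char *)p - (char *)&tls_image);
    assert(offset < sizeof tls_image);
    return thread_block() + offset;
}

/* The single range a GC would have to scan for the calling thread. */
void tls_thread_range(void **start, size_t *size)
{
    *start = thread_block();
    *size = sizeof tls_image;
}
```

Because each thread's data is one block, registering it with a collector takes one range per thread, which is the contrast with the per-variable allocation discussed below.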

> The main problem with the GCC implementation is that the memory 
> for TLS is not contiguous. So even if you end up with a 
> solution, you'll have to add a GC range for every single 
> variable and thread. This is not exactly going to be fast...
>
> The solution we came up with for GDC was to generate a 
> __scan_emutls(cb) function per module. The function then calls 
> cb(&var, var.sizeof) for every TLS variable in the module. Add 
> a pointer to __scan_emutls to ModuleInfo and all modules can be 
> scanned. But the __scan_emutls functions have to be called for 
> every thread and as the GC runs only in one thread you'll have 
> to do this at thread startup (or whenever a thread loads a new 
> shared library) and store a list of every variable's location and 
> size... I never updated this code for the new rt.sections 
> mechanism though so this is currently broken.

Interesting, you initialize and GC-register every thread-local 
variable at every thread startup and add to the list when a 
shared library is loaded, rather than lazily allocating like 
other implementations.  I guess this is the cost of making sure 
the GC always knows what's going on.
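
To make the per-module scan scheme concrete, here is a minimal C sketch of it, under stated assumptions: the real GDC code is generated by the compiler for D modules, and every name below (scan_emutls_module, tls_range_list, register_thread_tls) is hypothetical. One module exposes a scan function that calls cb(&var, sizeof var) for each of its TLS variables, and each thread runs it once at startup to build its own list of ranges.

```c
#include <assert.h>
#include <stddef.h>

/* Callback type: receives the address and size of one TLS variable. */
typedef void (*tls_scan_cb)(void *addr, size_t size);

/* Two thread-local variables standing in for one module's TLS data. */
static _Thread_local int   module_counter;
static _Thread_local void *module_cache;

/* Hypothetical per-module scan function in the spirit of __scan_emutls. */
void scan_emutls_module(tls_scan_cb cb)
{
    cb(&module_counter, sizeof module_counter);
    cb(&module_cache, sizeof module_cache);
}

/* Per-thread list of ranges that a collector could walk later. */
enum { MAX_RANGES = 16 };
struct tls_range { void *addr; size_t size; };
struct tls_range_list { struct tls_range ranges[MAX_RANGES]; size_t count; };

/* The list currently being filled by the calling thread. */
static _Thread_local struct tls_range_list *current_list;

static void record_range(void *addr, size_t size)
{
    assert(current_list->count < MAX_RANGES);
    current_list->ranges[current_list->count].addr = addr;
    current_list->ranges[current_list->count].size = size;
    current_list->count++;
}

/* Meant to run in each new thread; returns how many ranges were recorded. */
size_t register_thread_tls(struct tls_range_list *out)
{
    out->count = 0;
    current_list = out;
    scan_emutls_module(record_range);  /* a runtime would loop over all modules */
    current_list = NULL;
    return out->count;
}
```

A runtime would presumably hang each tls_range_list off its thread descriptor so the collecting thread can reach it, and would rerun the scan for the modules of any shared library loaded later.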

> We could probably do better by patching the libgcc functions 
> but it'll take very long till these updated libgcc versions 
> have been upgraded on all interesting targets. Optimally libgcc 
> would just provide a callback __emutls_iterate_variables(cb) to 
> iterate all variables in all threads. We can't really do that 
> externally, as we can't access emutls_mutex and emutls_key, 
> and since __emutls_get_address updates the pthread_setspecific 
> value anyway, __emutls_get_address needs to be patched.

Yeah, I was initially thinking of a hook like 
__emutls_iterate_variables too, but after seeing that this 
implementation may extend the thread-local data at any time, I 
guess that would still be problematic.
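
The growth problem is easier to see in code. Below is a deliberately simplified C model of lazily allocated emulated TLS. It is not the libgcc or compiler-rt source, and all names (emu_var, emu_get_address, emu_iterate_current_thread) are hypothetical. Each variable gets its own heap block on first access in a thread, and the per-thread table is reallocated whenever a new variable appears, so whatever an iteration hook reports can be out of date after the next access.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical control block, one per emulated TLS variable. */
struct emu_var {
    size_t size;       /* bytes of storage each thread gets */
    size_t index;      /* 1-based slot number, 0 until first use */
    const void *init;  /* initial image, or NULL for zero fill */
};
struct emu_slot { void *ptr; size_t size; };
struct emu_thread { size_t count; struct emu_slot *slots; };
typedef void (*emu_scan_cb)(void *addr, size_t size);

static pthread_key_t emu_key;
static pthread_once_t emu_once = PTHREAD_ONCE_INIT;
static pthread_mutex_t emu_mutex = PTHREAD_MUTEX_INITIALIZER;
static size_t emu_next_index;

/* Runs at thread exit and frees every block the thread allocated. */
static void emu_destroy(void *p)
{
    struct emu_thread *t = p;
    for (size_t i = 0; i < t->count; i++)
        free(t->slots[i].ptr);
    free(t->slots);
    free(t);
}

static void emu_make_key(void)
{
    if (pthread_key_create(&emu_key, emu_destroy) != 0)
        abort();
}

/* Returns the calling thread's copy of var, allocating it on first access. */
void *emu_get_address(struct emu_var *var)
{
    assert(var->size > 0);
    pthread_once(&emu_once, emu_make_key);
    /* Hand out a process-wide slot number the first time any thread asks. */
    pthread_mutex_lock(&emu_mutex);
    if (var->index == 0)
        var->index = ++emu_next_index;
    size_t index = var->index;
    pthread_mutex_unlock(&emu_mutex);

    struct emu_thread *t = pthread_getspecific(emu_key);
    if (t == NULL) {
        t = calloc(1, sizeof *t);
        if (t == NULL || pthread_setspecific(emu_key, t) != 0)
            abort();
    }
    if (index > t->count) {
        /* The per-thread table grows whenever a new variable shows up. */
        struct emu_slot *grown = realloc(t->slots, index * sizeof *grown);
        if (grown == NULL)
            abort();
        memset(grown + t->count, 0, (index - t->count) * sizeof *grown);
        t->slots = grown;
        t->count = index;
    }
    struct emu_slot *slot = &t->slots[index - 1];
    if (slot->ptr == NULL) {
        /* Each variable gets its own heap block, so storage is scattered. */
        slot->ptr = malloc(var->size);
        if (slot->ptr == NULL)
            abort();
        if (var->init != NULL)
            memcpy(slot->ptr, var->init, var->size);
        else
            memset(slot->ptr, 0, var->size);
        slot->size = var->size;
    }
    return slot->ptr;
}

/* Hypothetical hook: reports the blocks the calling thread has right now. */
size_t emu_iterate_current_thread(emu_scan_cb cb)
{
    pthread_once(&emu_once, emu_make_key);
    struct emu_thread *t = pthread_getspecific(emu_key);
    size_t visited = 0;
    for (size_t i = 0; t != NULL && i < t->count; i++) {
        if (t->slots[i].ptr != NULL) {
            cb(t->slots[i].ptr, t->slots[i].size);
            visited++;
        }
    }
    return visited;
}

/* Example callback: adds up the bytes a scanner would have to cover. */
size_t emu_bytes_seen;
void emu_sum_bytes(void *addr, size_t size)
{
    (void)addr;
    emu_bytes_seen += size;
}
```

Note that this hook only sees the calling thread. Walking other threads' tables would need access to the implementation's own key and lock, and a collector would have to call the hook again before every scan, since any access in between can add a block.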

> The emutls source code is here (GPL3 with GCC Runtime Library 
> Exception!!!) 
> https://github.com/gcc-mirror/gcc/blob/master/libgcc/emutls.c

Yeah, I linked to it above.  I have since also found this llvm 
compiler-rt implementation under permissive licenses, written and 
merged by the same Google engineer who got the emulated TLS hooks 
into llvm, and which helpfully also has some doc comments (not to 
mention a Windows version):

https://github.com/llvm-mirror/compiler-rt/blob/master/lib/builtins/emutls.c
