Big Data Ecosystem
Laeeth Isharc
laeeth at kaleidic.io
Fri Jul 12 18:02:59 UTC 2019
On Tuesday, 9 July 2019 at 16:58:56 UTC, Eduard Staniloiu wrote:
> Cheers, everybody!
>
> I was wondering what is the current state of affairs of the D
> ecosystem with respect to Big Data: are there any libraries out
> there? If so, which?
>
> Thank you,
> Edi
Weka.io of course have the world's fastest file system, and I
understand ML at scale is one hot market for them. From what I
saw it's simple to get going, and it's not expensive in the scheme
of things. I don't really understand myself why you would use
the cloud in many cases, but it does work in the cloud if you want.

I guess you know Mir and Lubeck. There's LDA (latent Dirichlet
allocation) tucked away in there in case you need it.

James Thompson's lightning talk was quite interesting: sometimes
doing things efficiently can reduce the need for all the
complexity of some of the standard approaches.

I don't know if you consider Postgres part of big data solutions,
but with TimescaleDB it might be. You can quite easily write
Foreign Data Wrappers in D to integrate with other data sources,
and you can also write server-side functions. I have done maybe
half the work for that but haven't had time to finish it yet. DPP
more or less works on the Postgres headers.

Joyent have an interesting approach to working on big data the
UNIX way. They have an object store called Manta that allows you
to run code on the same node as the data (stored using ZFS).
One could do something similar in D. I wanted to get comfortable
with SmartOS but I don't think it's ready for us today. However,
one could do something similar home-rolled with ZFS and Linux
containers. I wrapped libzfscore and LXD; both wrappers are alpha
quality right now. Not sure if I have pushed the latest versions
to GitHub yet.

For syncing stuff across a WAN between regions, a single TCP
connection doesn't get great throughput on a high-latency link.
You can either strap together a bunch of connections or use
something on top of UDP and make it reliable. We found UDT-D gave
us 300x faster file transfers between London and HK. It's up on
GitHub, though the code is not very polished.
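
To make the WAN point concrete, here is a rough back-of-envelope
sketch in Python (not the D code we use). It relies on the usual
rule that a single TCP connection can have at most one window of
data in flight per round trip, so throughput <= window / RTT. The
window size, round-trip time and link speed below are assumed
figures for illustration, not measurements, and the function names
are made up for this example.

```python
import math


def tcp_ceiling_bytes_per_sec(window_bytes: int, rtt_seconds: float) -> float:
    """Upper bound on one connection: one window per round trip."""
    return window_bytes / rtt_seconds


def connections_to_fill(link_bytes_per_sec: float,
                        window_bytes: int,
                        rtt_seconds: float) -> int:
    """How many such connections you would need to saturate the link."""
    per_connection = tcp_ceiling_bytes_per_sec(window_bytes, rtt_seconds)
    return math.ceil(link_bytes_per_sec / per_connection)


# Assumed figures, for illustration only.
WINDOW = 64 * 1024        # 64 KiB, the largest window without window scaling
RTT = 0.2                 # 200 ms, an assumed intercontinental round trip
LINK = 1_000_000_000 / 8  # an assumed 1 Gbit/s link, in bytes per second

if __name__ == "__main__":
    ceiling = tcp_ceiling_bytes_per_sec(WINDOW, RTT)
    print(f"single-connection ceiling: {ceiling / 1024:.0f} KiB/s")
    print(f"connections needed: {connections_to_fill(LINK, WINDOW, RTT)}")
```

With these assumed numbers one connection tops out at roughly
320 KiB/s, and it would take a few hundred of them to fill the
link. Real stacks negotiate larger windows, and packet loss matters
as well, so treat this as an order-of-magnitude picture only.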