Big Data Ecosystem

Andre Pany andre at s-e-a-p.de
Tue Jul 9 21:16:03 UTC 2019


On Tuesday, 9 July 2019 at 16:58:56 UTC, Eduard Staniloiu wrote:
> Cheers, everybody!
>
> I was wondering what is the current state of affairs of the D 
> ecosystem with respect to Big Data: are there any libraries out 
> there? If so, which?
>
> Thank you,
> Edi

Big data is a broad topic :). You can approach it with
specialized software like Spark or Kafka, with cloud storage
services like AWS S3, or with well-known databases like Postgres.

For Kafka there is a Deimos binding for librdkafka available here:
https://github.com/DlangApache/librdkafka. There is also a native
D implementation, but it is unfortunately no longer maintained:
https://github.com/tamediadigital/kafka-d.
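
As an illustration, here is a rough producer sketch against such a
binding. The import name (deimos.rdkafka) and the D-side spellings
of the enums/constants are assumptions that depend on the binding;
the functions themselves mirror the librdkafka C API.

import deimos.rdkafka;   // assumed module name, adjust to the binding

void main()
{
    char[512] errstr;

    // Point the client at a local broker.
    auto conf = rd_kafka_conf_new();
    rd_kafka_conf_set(conf, "bootstrap.servers", "localhost:9092",
                      errstr.ptr, errstr.length);

    // Create the producer handle (it takes ownership of conf).
    auto producer = rd_kafka_new(rd_kafka_type_t.RD_KAFKA_PRODUCER,
                                 conf, errstr.ptr, errstr.length);
    auto topic = rd_kafka_topic_new(producer, "events", null);

    // Enqueue one message; librdkafka copies the payload.
    string payload = `{"event":"click"}`;
    rd_kafka_produce(topic, RD_KAFKA_PARTITION_UA, RD_KAFKA_MSG_F_COPY,
                     cast(void*) payload.ptr, payload.length,
                     null, 0, null);

    // Serve delivery reports and wait for outstanding messages.
    rd_kafka_poll(producer, 0);
    rd_kafka_flush(producer, 10_000);

    rd_kafka_topic_destroy(topic);
    rd_kafka_destroy(producer);
}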

For AWS services, I prefer the AWS CLI executable. It accepts
JSON input and also outputs JSON. From the official AWS service
metadata files (https://github.com/aws/aws-sdk-js/tree/master/apis)
you can easily generate D structs and classes, so it almost feels
like the real AWS SDK available e.g. for Python, Java, or C++.
For AWS S3 there is also a native D implementation based on
vibe.d.
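
As a small illustration of that workflow, shelling out to the aws
executable and parsing its JSON answer with std.json looks roughly
like this (assuming the CLI is installed and credentials are
configured):

import std.json : parseJSON;
import std.process : execute;
import std.stdio : writeln;

void main()
{
    // Run the CLI and capture its JSON output.
    auto result = execute(["aws", "s3api", "list-buckets",
                           "--output", "json"]);
    if (result.status != 0)
        throw new Exception("aws CLI failed: " ~ result.output);

    // Walk the parsed document like any other JSON value.
    auto json = parseJSON(result.output);
    foreach (bucket; json["Buckets"].array)
        writeln(bucket["Name"].str);
}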

For Postgres you can e.g. use this great library:
https://github.com/adamdruppe/arsd/blob/master/postgres.d.
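
A minimal sketch, assuming the arsd.database-style interface
(connection string in the constructor, rows addressable by column
name); the table is of course made up:

import arsd.postgres;
import std.stdio : writeln;

void main()
{
    auto db = new PostgreSql("dbname=bigdata user=postgres");

    // Results are iterable rows, addressable by column name.
    foreach (row; db.query("SELECT id, payload FROM events LIMIT 10"))
        writeln(row["id"], ": ", row["payload"]);
}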

In one way or another, Big Data scenarios need HTTP clients and
servers. Here too, the arsd library has some lightweight
components.
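
For example, arsd.cgi gives you a tiny HTTP endpoint in a few
lines (it can run as classic CGI or as an embedded HTTP server,
depending on how it is built); treat the details as a sketch
rather than a recipe:

import arsd.cgi;

// Respond to every request with a small JSON document.
void handler(Cgi cgi)
{
    cgi.setResponseContentType("application/json");
    cgi.write(`{"status":"ok"}`);
}

mixin GenericMain!handler;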

The current GSoC project on dataframes is also an important
building block for Big Data.

What I currently really miss is the ability to read and write
Parquet files.

Kind regards
Andre