Big Data Ecosystem
Andre Pany
andre at s-e-a-p.de
Tue Jul 9 21:16:03 UTC 2019
On Tuesday, 9 July 2019 at 16:58:56 UTC, Eduard Staniloiu wrote:
> Cheers, everybody!
>
> I was wondering what is the current state of affairs of the D
> ecosystem with respect to Big Data: are there any libraries out
> there? If so, which?
>
> Thank you,
> Edi
Big data is a broad topic :) You can approach it with dedicated
software like Spark or Kafka, with cloud storage services such as
AWS S3, or even with well-known databases like Postgres.
For Kafka there is a Deimos binding for librdkafka available here:
https://github.com/DlangApache/librdkafka. There is also a native D
implementation, but it is unfortunately no longer maintained:
https://github.com/tamediadigital/kafka-d.
For AWS services, I prefer the AWS CLI executable. It accepts JSON
input and also outputs JSON. From the official AWS service metadata
files (https://github.com/aws/aws-sdk-js/tree/master/apis) you can
easily generate D structs and classes. It almost feels like the real
AWS SDKs available e.g. for Python, Java and C++. For AWS S3 there is
also a native D implementation based on vibe.d.
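To illustrate how lightweight that approach is, here is a minimal
sketch (the bucket name is made up; it assumes the aws executable is
installed and configured) that shells out to the CLI with std.process
and parses the JSON it prints with std.json:

import std.process : execute;
import std.json : parseJSON;
import std.stdio : writeln;

void main()
{
    // `aws s3api list-objects-v2` prints its result as JSON on stdout
    auto result = execute(["aws", "s3api", "list-objects-v2",
                           "--bucket", "my-example-bucket"]);
    if (result.status != 0)
    {
        writeln("aws cli failed: ", result.output);
        return;
    }

    auto json = parseJSON(result.output);
    foreach (obj; json["Contents"].array)   // key is absent for an empty bucket
        writeln(obj["Key"].str, " (", obj["Size"].integer, " bytes)");
}

For more complex calls you can go the other way as well: fill a D
struct mirroring the service metadata, serialize it to JSON and pass
it to the CLI via --cli-input-json.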
For Postgres you can, for example, use this great library:
https://github.com/adamdruppe/arsd/blob/master/postgres.d.
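A minimal sketch of how that looks (connection string and table are
made up; it needs postgres.d plus database.d from the arsd repo and
libpq installed on the system):

import arsd.postgres;
import std.stdio : writeln;

void main()
{
    auto db = new PostgreSql("host=localhost dbname=bigdata user=postgres");

    // ? placeholders are filled from the extra arguments to query()
    foreach (row; db.query("SELECT id, payload FROM events WHERE id > ?", 100))
        writeln(row["id"], ": ", row["payload"]);
}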
In one way or another you need HTTP clients and servers in Big Data
scenarios. Here, too, the arsd library has some lightweight components.
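For example, arsd.cgi lets you expose a small HTTP endpoint in a few
lines (handler name and response are made up; compiled with
-version=embedded_httpd it runs as a standalone web server):

import arsd.cgi;

void handler(Cgi cgi)
{
    cgi.setResponseContentType("application/json");
    cgi.write(`{"status": "ok"}`);
}

mixin GenericMain!handler;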
The current GSoC project on dataframes is also an important building
block for Big Data.
What I currently really miss is the ability to read and write
Parquet files.
Kind regards
Andre