Best way to read CSV data file into Mir (2d array) ndslice?
jmh530
john.michael.hall at gmail.com
Wed Sep 21 19:14:30 UTC 2022
On Wednesday, 21 September 2022 at 13:08:14 UTC, jmh530 wrote:
> On Wednesday, 21 September 2022 at 05:31:48 UTC, mw wrote:
>> Hi,
>>
>> I'm just wondering what is the best way to read CSV data file
>> into Mir (2d array) ndslice? Esp. if it can parse date into
>> int/float.
>>
>> I searched a bit, but can't find any example.
>>
>>
>> Thanks.
>
> It probably can't hurt to try the simplest approach first.
> `std.csv` can return an input range that you can then use to
> create a ndslice. Offhand, I don't know what D tools are an
> alternative to `std.csv` for reading CSVs.
>
> ndslice assumes homogenous data, but you can put the Dates (as
> Date types) as part of the labels (as Data Frames). However,
> there's a bit to be desired in terms of getting that
> functionality integrated into the rest of the package [1].
>
> [1] https://github.com/libmir/mir-algorithm/issues/426
I just tried doing it with `std.csv`, but my version is a bit
awkward: it isn't straightforward to take the result of
`csvReader` and put it directly into an array, so I had to copy
the values in element by element. I also wanted to allocate the
array up front, but that requires knowing its size, so I ended up
making two passes over the data, which isn't ideal.
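```d
import std.csv;
import std.stdio: writeln;
import mir.ndslice.allocation: slice;

void main() {
    string text =
        "date,x1,x2\n1/31/2010,65,2.5\n2/28/2010,123,7.5";

    // Two independent readers over the same text: one to count the
    // rows, one to actually read the values.
    auto records_firstpass = text.csvReader!double(["x1","x2"]);
    auto records_secondpass = text.csvReader!double(["x1","x2"]);

    // First pass: count the records so the slice can be allocated up front.
    size_t len = 0;
    foreach (record; records_firstpass) {
        len++;
    }

    // Allocate a len x 2 matrix of doubles (columns x1 and x2).
    auto data = slice!double(len, 2);

    // Second pass: copy each field into the matrix.
    size_t i = 0;
    foreach (record; records_secondpass) {
        size_t j = 0;
        foreach (r; record) {
            data[i, j] = r;
            j++;
        }
        i++;
    }
    writeln(data);
}
```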
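A single-pass variant might also work, as a rough sketch rather than a tested recommendation: append every value to a flat buffer while reading, then view that buffer as a 2D slice with `sliced` from `mir.ndslice.slice`. The helper name `csvToSlice` is made up for illustration, and the sketch assumes every record has exactly one value per requested column.

```d
import std.array : appender;
import std.csv : csvReader;
import std.stdio : writeln;
import mir.ndslice.slice : sliced;

// Hypothetical helper (not part of Mir or Phobos): reads the requested
// numeric columns in one pass and returns them as a rows x columns slice.
auto csvToSlice(string text, string[] columns)
{
    // Growable flat buffer, so the row count is not needed in advance.
    auto buffer = appender!(double[]);
    foreach (record; text.csvReader!double(columns))
        foreach (value; record)
            buffer.put(value);

    // Assumption: each record contributed exactly columns.length values,
    // so the flat array divides evenly into rows.
    auto flat = buffer.data;
    return flat.sliced(flat.length / columns.length, columns.length);
}

void main()
{
    string text =
        "date,x1,x2\n1/31/2010,65,2.5\n2/28/2010,123,7.5";
    auto data = csvToSlice(text, ["x1", "x2"]);
    writeln(data);
}
```

The trade-off is that the appender may reallocate as it grows, so this swaps the second read for some extra copying; for large files it might still be worth reserving capacity if a rough row count is known.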