counting words

Brad Anderson eco at gnuk.net
Fri Jun 28 10:01:06 PDT 2013


On Friday, 28 June 2013 at 16:48:08 UTC, Benjamin Thaut wrote:
> On 28.06.2013 18:42, Brad Anderson wrote:
>> On Friday, 28 June 2013 at 16:25:25 UTC, Brad Anderson wrote:
>>> On Friday, 28 June 2013 at 16:04:35 UTC, Benjamin Thaut wrote:
>>>> I'm currently running a few tests with std.algorithm,
>>>> std.range, etc.
>>>>
>>>> I have an array of words. Is it possible to count how often
>>>> each word occurs in the array, and then sort the array by
>>>> those counts, purely by chaining ranges (i.e. without using a
>>>> foreach loop + hashmap)?
>>>
>>> If you don't mind sorting twice:
>>>
>>> words.sort()
>>>     .group()
>>>     .array()
>>>     .sort!((a, b)=> a[1] > b[1])
>>>     .map!(a => a[0])
>>>     .copy(words);
>>>
>>> You could also do it with a hashmap to keep the count.
>>
>> Like so:
>>
>>     size_t[string] dic;
>>     words.map!((w) { ++dic[w.idup]; return w; })
>>          .array // eager (so dic is filled first), sortable
>>          // higher count first; ties broken alphabetically so
>>          // duplicates end up adjacent for uniq
>>          .sort!((a, b) => dic[a] > dic[b] ||
>>                 (dic[a] == dic[b] && a < b))
>>          .uniq
>>          .copy(words);
>>
>> It's a bit ugly and abuses side effects with the hash map. The
>> order will differ from the other program when words have
>> identical counts.
>
> I figured something like this by now too. Thank you.
> But I don't quite understand what the copy is for at the end?

It just replaces your original word list with the sorted list
(which I just realized is wrong: the result is shorter than
words, so copy leaves a bunch of stale words on the end, oops).
You could call .array on the chain instead to get a new array, or
just store the range with auto and consume it where needed, with
no extra array allocation.
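
A rough sketch of the two fixes, for illustration only. The helper
names uniqueByCount and uniqueByCountInPlace and the sample input are
made up here, not from the thread. The first variant returns a new
array via .array; the second keeps the copy but relies on
std.algorithm's copy returning the unfilled remainder of the target,
which is then used to trim the stale tail.

```d
import std.algorithm : copy, group, map, sort;
import std.array : array;
import std.stdio : writeln;

/// Hypothetical helper: unique words, most frequent first, as a new array.
/// Note that it sorts `words` in place as a side effect.
string[] uniqueByCount(string[] words)
{
    words.sort();                      // equal words become adjacent
    return words
        .group()                       // (word, count) tuples
        .array()                       // eager, so it can be sorted again
        .sort!((a, b) => a[1] > b[1])  // highest count first
        .map!(a => a[0])               // keep only the word
        .array;                        // fresh array instead of copy(words)
}

/// Hypothetical helper: same ordering, written back into `words` and trimmed.
void uniqueByCountInPlace(ref string[] words)
{
    words.sort();
    // copy returns the part of `words` it did not overwrite
    auto rest = words
        .group()
        .array()
        .sort!((a, b) => a[1] > b[1])
        .map!(a => a[0])
        .copy(words);
    words = words[0 .. $ - rest.length]; // drop the leftover tail
}

void main()
{
    // Sample data with distinct counts, so the order is unambiguous.
    string[] words = ["b", "a", "c", "a", "b", "a"];
    writeln(uniqueByCount(words.dup));
    uniqueByCountInPlace(words);
    writeln(words);
}
```

Words with identical counts can still come out in any relative order
here, since the second sort is not stable.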

