A couple of questions about arrays and slices
Cecil Ward
cecil at cecilward.com
Sat Jun 24 14:43:00 UTC 2023
On Saturday, 24 June 2023 at 12:05:26 UTC, Jonathan M Davis wrote:
> On Saturday, June 24, 2023 1:43:53 AM MDT Cecil Ward via
> Digitalmars-d-learn wrote:
>> On Saturday, 24 June 2023 at 07:36:26 UTC, Cecil Ward wrote:
>> > [...]
>>
>> I just realised something, your point about altering the table
>> and having to rehash, is well taken. I hadn’t considered that.
>> The reason for my foolishness in failing to realise that I’m
>> asking the impractical is my pattern of usage. I add all the
>> entries into the mapping table and have no interest in any
>> lookups until it is fully built. Then a second function starts
>> to do lookups while the data remains unchanging and that usage
>> pattern can be guaranteed. I could even idup it if that would
>> help, as copying < 32 uints wouldn’t take forever. A typical
>> value would be a mere 5 or less. I only picked 32 to be
>> completely safely ott.
>
> Well, if the key were a struct or a class, the hashing function
> would be opHash. For built-in types, the runtime has hashing
> functions that it uses. Either way, with AAs, you really don't
> worry about managing the memory, because it's completely
> outside of your control. You just put the elements in there
> using their associated keys, and if you want to try to speed it
> up after you've populated it, you use rehash so that the
> runtime can try to move the elements around within the
> container so that lookup speeds will be closer to optimal.
>
> As such, for the most part, when dealing with AAs and worrying
> about efficiency, the question really becomes whether AAs are
> the correct solution rather than much of anything having to do
> with how you manage their memory.
>
> With so few elements, it's also possible that using
> std.algorithm.searching.find would be faster - e.g. having a
> dynamic array of strings where the matching int is at the same
> index in a dynamic array of ints - or you could use
> std.typecons.Tuple!(string, int)[] with something like
> arr.find!(a => a[0] == key)() to find the tuple with the int
> you want.
>
> Simply comparing a small number of strings like that might be
> faster than what goes on with hashing the string and then
> finding the corresponding element within the AA - or it might
> not be. You'd have to test that to know. The AA would
> definitely be faster with a large number of elements, but with
> a small number of elements, the algorithmic complexity doesn't
> really matter, and the extra overhad with the AA lookups could
> actually mean that the search through the dynamic array is
> faster even though it's O(n). But you can only know which is
> faster by testing it out with the actual data that you're
> dealing with.
>
> Regardless, you need to remember that associative arrays are
> not arrays in the C sense. Rather, they're hash tables, so they
> function very differently from dynamic arrays, and the rehash
> function is the closest that you're going to get to affecting
> how the elements are laid out internally or how much memory the
> AA is using.
>
> - Jonathan M Davis
I started out looking into a number of runtime library routines,
but in the end it seemed quicker to roll my own code for a crude
recursive descent parser/lexer that parses part of D’s grammar
for expressions, and (again partial grammar) parser for string
literal expressions and so on. I find certain special elements
and execute actions which involve doing the AA lookup and
replacing variable names with ordinal numbers in decimal in the
output stream. Admission: The parsing is the thing that has to be
fast, even though again the size of the D language text is not
likely to be huge at all. But 40 years ago, I came from a world
with 2k RAM and 0.9 MHz clock rates so I have developed a habit
of always thinking about speed before I do anything, needful or
not, to be honest. I once wrote a program that took 35 mins to
evaluate 2+2 and print out the answer, so I’m now ashamed of
writing slow code. Those were bad days, to be honest. 4 GHz+ and
ILP is nicer.
More information about the Digitalmars-d-learn
mailing list