Ternary Search Trees

Robert Fraser fraserofthenight at gmail.com
Thu Apr 16 19:27:47 PDT 2009


Jason House wrote:
> bearophile Wrote:
> 
>> Robert Fraser:
>>> Hey, could you please post your implementation (assuming it's 
>>> open-source?) I'd love to use them, but can't be bothered to implement 
>>> it. Thanks!
>> It's just a translation to D of this Py code:
>> Info:
>> http://jeethurao.com/blog/?p=146
>> Code:
>> http://bitbucket.org/woadwarrior/trie/
>>
>> Where necessary you can also follow the ideas from the original article here (but using an enum is MUCH better and it's also safer in C, to avoid aliasing):
>> http://www.ddj.com/windows/184410528
>>
>> My implementation is about 57 times faster than the Python version and much faster than the Psyco version too.
>>
>> You can also take a look here:
>> http://sourceforge.net/projects/ternary/
>>
>> You can implement it with a _single_ template struct, filled with all the methods you want:
>>
>> struct TST(T) {
>>     TST* left, right; 
>>     union {
>>       TST* mid;
>>       int index;
>>     }
>>     T item;
>>
>>     // methods here
>> }
>>
>> Note that inside there you don't need TST!(T)*, TST* is probably enough, but I don't know why yet.
>>
>> You can start with only the few basic methods, test them, and later when/if you want you can add more methods, taking ideas from here:
>> http://abc.se/~re/code/tst/tst_docs/index.html
>>
>> I'd like to not show my code because it's not nice, it has no comments, tests, etc. But it's quite close to that Py code.
>>
>> A problem with that template struct is that even if you add a opIn_r() method as I have done, then you have to do (note the star):
>> if (!(k in *ds))
>>
>> I don't know if in D2 for structs you can use the syntax you use with classes:
>> if (!(k in ds))
>>
>> Even better syntax is:
>> if (k !in ds)
>>
>> In Python you write something better still:
>> if k not in ds:
>>
>> Bye,
>> bearophile
> 
> Why use a struct for TST? it requires you to use pointers everywhere and passing by value hardly makes any sense. A class (optionally with final methods for speed) would work a lot better. It even fixes your issue with opIn

I assume it saves some space (classes have a monitor and vtbl pointer 
which are un-needed here). For maximum efficiency, rather than mallocing 
individual nodes, you'd want to be using some sort of array.



More information about the Digitalmars-d mailing list