Help optimize D solution to phone encoding problem: extremely slow performace.
Renato
renato at athaydes.com
Tue Jan 16 18:54:56 UTC 2024
On Tuesday, 16 January 2024 at 16:56:04 UTC, Siarhei Siamashka
wrote:
> On Tuesday, 16 January 2024 at 15:50:35 UTC, H. S. Teoh wrote:
>> Unfortunately there seems to be some discrepancy between the
>> output I got and the prescribed output in your repository. For
>> example, in your output the number 1556/0 does not have an
>> encoding, but isn't "1 Mai 0" a valid encoding according to
>> your dictionary and the original problem description?
>
> You are not allowed to emit "1" as the first token in the
> output as long as there are any dictionary word matches at that
> position. The relevant paragraph from the problem statement:
>
> ==snip==
> Encodings of phone numbers can consist of a single word or of
> multiple
> words separated by spaces. The encodings are built word by word
> from
> left to right. If and only if at a particular point no word at
> all from
> the dictionary can be inserted, a single digit from the phone
> number can
> be copied to the encoding instead. Two subsequent digits are
> never
> allowed, though. To put it differently: In a partial encoding
> that
> currently covers k digits, digit k+1 is encoded by itself if
> and only if,
> first, digit k was not encoded by a digit and, second, there is
> no word
> in the dictionary that can be used in the encoding starting at
> digit k+1.
> ==snip==
>
> I also spent a bit of time trying to figure out this nuance
> when implementing my solution. It doesn't make much sense
> visually (no back-to-back digits in the output either way), but
> that's how it is.
Exactly, this is one of the things that make this problem a bit
annoying to solve :)
@"H. S. Teoh" you implemented the solution as a Trie!! Nice,
that's also what I did when I "participated" in the study. Here's
[my Trie solution in
Java](https://github.com/renatoathaydes/prechelt-phone-number-encoding/blob/fastest-implementations-print-or-count/src/java/Main.java).
These are basically the two common approaches to the problem: a
Trie or a numeric-based table. According to the study, people who
use scripting languages almost always go with the numeric
approach, while people coming from lower level languages tend to
use a data structure like Trie (if they don't know Trie, they
come up with something similar which is fascinating), which is
harder to implement but more efficient in general.
Can I ask you why didn't you use the [D stdlib
Trie](https://dlang.org/phobos/std_uni.html#codepointTrie)? Not
sure that would've worked, but did you consider that?
More information about the Digitalmars-d-learn
mailing list