Natural language parsing (NLP) with D

Laeeth Isharc via Digitalmars-d digitalmars-d at puremagic.com
Tue Oct 20 11:43:53 PDT 2015


On Tuesday, 20 October 2015 at 16:01:41 UTC, Chris wrote:
> On Tuesday, 20 October 2015 at 15:49:18 UTC, bachmeier wrote:
>> It's not my area, but are you thinking of something like 
>> Freeling?
>>
>> http://nlp.lsi.upc.edu/freeling/
>>
>> Asking for a friend. I think a C++ expert could get it to work 
>> with D with little difficulty, at least by creating C 
>> bindings, but I'm not a C++ expert and I failed.
>
> Interesting, I heard of it a while ago. In D I have the 
> following:
>
> Text tokenization
>
> Yes.
>
> Sentence splitting
>
> Yes.
>
> Morphological analysis
>
> Yes.
>
> Suffix treatment [, retokenization of clitic pronouns]
>
> Yes.
>
> Flexible multiword recognition
>
> Yes.
>
> Contraction splitting
>
> Depends on what they mean. But I can handle contractions like 
> "l'ami".
>
> Probabilistic prediction of unknown word categories
>
> No.
>
> Phonetic encoding
>
> Transcription? If so, yes.
>
> SED-based search for similar words in dictionary
>
> No.
>
> Named entity detection
>
> No.
>
> Recognition of dates, numbers, ratios, currency, and physical 
> magnitudes (speed, weight, temperature, density, etc.)
>
> Partially implemented.
>
> PoS tagging
>
> Started.
>
> Chart-based shallow parsing
>
> No.
>
> Named entity classification
>
> No.
>
> WordNet-based sense annotation and disambiguation
>
> No.
>
> Rule-based dependency parsing
>
> No.
>
> Nominal coreference resolution
>
> No.
>
> If anyone is interested in starting something like FreeLing in 
> D, please share your thoughts.

Hi.

I am very interested in this topic (especially sentiment 
analysis), and I am slowly getting a bit more firepower. I 
started porting the Python version of the Stanford NLP API (the 
underlying code is Java) to D. It's not very complicated, but I 
have too much on my plate, so it is going slowly.

I would be interested in working on this together with others, 
and I don't mind open-sourcing the building blocks (which are 
really the time-consuming bit). I hope to have some others from 
the D world helping me, so it should go a bit faster, although 
the NLP stuff might not be the first project we work on.

Feel free to drop me an email: Laeeth at kaleidicassociates.com

Thanks.


Laeeth


