Matrix/Linear Algebra Library?
Benji Smith
dlanguage at benjismith.net
Fri Oct 3 09:10:05 PDT 2008
Fawzi Mohamed wrote:
> Probably if you want least square approximation (that is what you meant
> with LSA?) some kind of direct minimizer is the way to go (but be
> careful the problem might be ill conditioned).
> Fawzi
Sorry. I meant "Latent Semantic Analysis". I'm building a table of
synonyms from a large document repository.
There are about 100,000 documents in the repository, and the total
corpus has about 100,000 unique words (after stemming). The LSA
algorithm uses a matrix, where each row represents a single term, and
each column represents a single document. Each cell contains the TF-IDF
(term frequency-inverse document frequency) weight of that term in that
document.
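As an illustration only, here is a minimal sketch of how such a term-by-document matrix could be built. It is written in Python rather than D, and the function name tfidf_matrix, the toy documents, and the particular weighting (raw count times log(N / df)) are assumptions for the example, not details from the post. Other TF-IDF variants exist.

```python
import math
from collections import Counter

def tfidf_matrix(docs):
    """Build a term-by-document TF-IDF matrix from tokenized documents.

    Returns (terms, matrix), where matrix[i][j] is the weight of terms[i]
    in docs[j]. Assumed weighting: raw term count * log(N / df).
    """
    n_docs = len(docs)
    counts = [Counter(doc) for doc in docs]   # per-document term counts
    terms = sorted(set().union(*docs))        # one row per unique term
    # Document frequency: number of documents that contain each term.
    df = {t: sum(1 for c in counts if t in c) for t in terms}
    matrix = [
        [c[t] * math.log(n_docs / df[t]) for c in counts]
        for t in terms
    ]
    return terms, matrix

# Hypothetical, already-stemmed toy corpus.
docs = [
    ["car", "engine", "road"],
    ["automobile", "engine", "road"],
    ["banana", "fruit", "banana"],
]
terms, matrix = tfidf_matrix(docs)
for term, row in zip(terms, matrix):
    print(term, [round(w, 3) for w in row])
```

The dense list of lists is for readability only. At 100,000 terms by 100,000 documents, a dense matrix of 8-byte floats would need about 80 GB, so a sparse representation would be needed in practice.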
Performing an SVD on the matrix yields a correlation matrix,
representing the synonymy of all terms across all documents.
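One common way to get from the SVD to term-to-term scores is to keep only the k largest singular values and compare terms by cosine similarity in that reduced space. The sketch below assumes NumPy and a tiny hand-made count matrix. The function name term_similarity, the choice of k, and the data are hypothetical and are not taken from the post.

```python
import numpy as np

def term_similarity(matrix, k):
    """Cosine similarity between terms in a rank-k LSA space.

    matrix is term-by-document (rows are terms). Returns an
    (n_terms, n_terms) array of similarities.
    """
    a = np.asarray(matrix, dtype=float)
    # Thin SVD: a = u @ diag(s) @ vt, singular values in descending order.
    u, s, vt = np.linalg.svd(a, full_matrices=False)
    # Keep the k largest singular values and scale the term vectors by them.
    term_vecs = u[:, :k] * s[:k]
    norms = np.linalg.norm(term_vecs, axis=1, keepdims=True)
    norms[norms == 0] = 1.0  # leave all-zero term vectors as zeros
    unit = term_vecs / norms
    return unit @ unit.T

# Hypothetical toy data: "car" and "automobile" never share a document,
# but both co-occur with "engine".
terms = ["car", "automobile", "engine", "banana", "fruit"]
matrix = [
    [1, 0, 0, 0],  # car
    [0, 1, 0, 0],  # automobile
    [1, 1, 0, 0],  # engine
    [0, 0, 1, 1],  # banana
    [0, 0, 1, 1],  # fruit
]
sim = term_similarity(matrix, k=2)
print(sim[terms.index("car"), terms.index("automobile")])
print(sim[terms.index("car"), terms.index("banana")])
```

In this toy case the raw rows for "car" and "automobile" are orthogonal, yet their similarity in the rank-2 space comes out close to 1, while "car" against "banana" stays close to 0. A dense SVD like this would not scale to a 100,000 by 100,000 matrix. A sparse, truncated SVD routine would be the usual choice at that size.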
--benji