O(N) Garbage collection?

Sat Feb 19 19:21:42 PST 2011

"dsimcha" <dsimcha at yahoo.com> wrote in message 
news:ijp61d$1bu1$1 at digitalmars.com...
> On 2/19/2011 12:50 PM, Ulrik Mikaelsson wrote:
>> Just a thought; I guess the references to the non-GC-scanned strings
>> are held in GC-scanned memory, right? Is the number of such
>> references also increased linearly?
>
> Well, first of all, the benchmark I posted seems to indicate otherwise. 
> Second of all, I was running this program before on yeast DNA and it was 
> ridiculously fast.  Then I tried to do the same thing on human DNA and it 
> became slow as molasses.  Roughly speaking, w/o getting into the biology 
> much, I've got one string for each gene.  Yeast have about 1/3 as many 
> genes as humans, but the genes are on average about 100 times smaller. 
> Therefore, the difference should be at most a small constant factor and in 
> actuality it's a huge constant factor.
>

Out of curiosity, roughly how many, umm, "characters" (I forget the technical 
term for each T, G, etc.) are in each yeast gene, and how many genes do they 
have? (Humans have, umm, was it 26? My last biology class was ages ago.)

> Note:  I know I could make the program in question a lot more space 
> efficient, and that's what I ended up doing.  It works now.  It's just 
> that it was originally written for yeast, where space efficiency is 
> obviously not a concern, and I would have liked to just try a one-off 
> calculation on the human genome without having to rewrite portions of it.