[Pan-devel] musings on memory consumption
From: Anatoly Vorobey
Subject: [Pan-devel] musings on memory consumption
Date: Thu, 17 Jun 2004 17:28:31 +0300
User-agent: Mutt/1.5.4i

Judging by recent discussions on this mailing list, developers of Pan
have more or less decided to move to a DB backend. I've been working
towards a different goal for the last few days, trying to make Pan work
for me with very large groups in its current model of storing article
headers in memory. This wasn't motivated by any ideological opposition
to DB backends in general; I merely wanted to be able to use Pan for all
my Usenet needs as soon as possible. Pan is the only GUI newsreader I
can use without yearning for a return to slrn every minute or so.

I ended up with a patch that allows me to browse a 1-million-header
newsgroup comfortably on my machine, which is more or less what I
needed. Basically, I use refcounted strings and normalised subjects.
There's a new string type, RString, which stores unique strings only
once by using a global hash table and a refcount field to keep track of
how many times the string was referenced, allowing it to be freed when
the refcount drops to 0. RStrings can be used for many strings inside
Article that are currently stored as PStrings, separately for each article -
for example, author's name, author's email address, newsgroup names in
xref headers, etc. I may convert all of these to RStrings sometime later
to further reduce memory use. However, the biggest memory hog is the
subject. I wrote up a separate Subject type which is a kind of
normalised subject - it strips the "Re: " part at the beginning and the
part number, if those are present, stores them separately, and then
stores the rest as an RString, which means, in particular, that all
parts of a multipart article end up referencing the same subject
RString. Additionally, all of article-thread.c needed to be rewritten
(its normalisation of subjects when sorting or threading is no longer
needed, and in general it became smaller, faster, and much less
RAM-hungry), and all places in Pan which reference article subjects
needed small adjustments.
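
The RString and Subject ideas described above can be sketched roughly as follows. This is a minimal illustration in plain C, not the actual patch: the function names (rstring_ref, rstring_unref, subject_init), the struct layouts, the fixed-size hash table, and the assumption that part numbers appear as a trailing "(N/M)" marker are all hypothetical choices made for the example.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical layout: one shared, refcounted copy per distinct string. */
typedef struct RString {
    struct RString *next;   /* next entry in the same hash bucket */
    unsigned refcount;      /* number of live references */
    char str[];             /* NUL-terminated text */
} RString;

#define RSTRING_BUCKETS 4096
static RString *rstring_table[RSTRING_BUCKETS];  /* global hash table */

static unsigned rstring_hash(const char *s)
{
    unsigned h = 5381;  /* djb2 string hash */
    while (*s) h = h * 33u + (unsigned char)*s++;
    return h % RSTRING_BUCKETS;
}

/* Return the shared copy of `text`, creating it on first use. */
RString *rstring_ref(const char *text)
{
    unsigned b = rstring_hash(text);
    for (RString *r = rstring_table[b]; r; r = r->next)
        if (strcmp(r->str, text) == 0) { r->refcount++; return r; }
    size_t len = strlen(text);
    RString *r = malloc(sizeof *r + len + 1);
    if (!r) return NULL;
    memcpy(r->str, text, len + 1);
    r->refcount = 1;
    r->next = rstring_table[b];
    rstring_table[b] = r;
    return r;
}

/* Drop one reference; unlink and free the string when the count hits 0. */
void rstring_unref(RString *r)
{
    assert(r && r->refcount > 0);
    if (--r->refcount > 0) return;
    RString **p = &rstring_table[rstring_hash(r->str)];
    while (*p != r) p = &(*p)->next;
    *p = r->next;
    free(r);
}

/* Normalised subject: the shared base text plus what was stripped off. */
typedef struct Subject {
    RString *base;  /* subject without "Re: " and part marker */
    int is_reply;   /* 1 if a leading "Re:" was stripped */
    int part;       /* part number, 0 if none */
    int parts;      /* total number of parts, 0 if none */
} Subject;

/* Assumes part numbers look like a trailing "(N/M)". Returns 0 on success. */
int subject_init(Subject *s, const char *raw)
{
    char buf[1024];
    s->is_reply = s->part = s->parts = 0;
    if (tolower((unsigned char)raw[0]) == 'r' &&
        tolower((unsigned char)raw[1]) == 'e' && raw[2] == ':') {
        s->is_reply = 1;
        raw += 3;
        while (*raw == ' ') raw++;
    }
    size_t len = strlen(raw);
    if (len >= sizeof buf) len = sizeof buf - 1;  /* truncate long input */
    memcpy(buf, raw, len);
    buf[len] = '\0';

    char *open = strrchr(buf, '(');
    int n, m, used = 0;
    if (open && sscanf(open, "(%d/%d)%n", &n, &m, &used) == 2 &&
        used > 0 && open[used] == '\0') {
        s->part = n;
        s->parts = m;
        while (open > buf && open[-1] == ' ') open--;  /* trim spaces */
        *open = '\0';
    }
    s->base = rstring_ref(buf);
    return s->base ? 0 : -1;
}
```

With this sketch, "holiday pics (1/3)", "holiday pics (2/3)" and "Re: holiday pics" all end up pointing at a single stored "holiday pics" string, which is where the memory saving for large multipart groups would come from.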

After this, starting Pan and loading a 1-million-header newsgroup
results in about 340 MB of memory used, which is tolerable for me. The
slowest reactions now come from the GTK header pane, presumably because
it finds it hard to cope with such a large tree. The widget spends
around 10 seconds initialising or freeing the entire header pane (when
entering/leaving the group). I'm not sure whether dynamic feeding of
data into the widget, along the lines Evan Martin suggested in a recent
message, would speed that up.

A small nuisance is that Pan doesn't seem to be freeing the articles
when I leave the group and enter another one, even though, when I
re-enter the original group, it loads them from disk anyway. Why does it
behave this way? If there's no compelling reason for this, I may
spend some more time on improving this (perhaps something as simple as
an "Unload this group" action in the context menu of the group list?).

Anyway, I understand that the work I've done might not be useful to
official Pan development if it's been decided to focus on moving to a DB
backend. I did it primarily for myself, to scratch my own private itch.
If, however, there's interest from project maintainers, and they think
it might be considered for inclusion, or if other people want to try it
out, do let me know; I can find some time to clean it up, remove
debugging junk, test it some more and put it up for download.

With best wishes,
Anatoly.
--
avva