[Alpine-info] extreme slowness working with mbox files

Mike Miller mbmiller at taxa.epi.umn.edu
Tue Oct 14 23:07:38 PDT 2008


This is a great message. Thanks. My reply is interspersed below.
--Mike


On Tue, 14 Oct 2008, Mark Crispin wrote:


> I'm sorry that you are having this problem. Unfortunately, it is
> unavoidable unless you change the underlying way that you do things.
>
> First: you have disabled the "Save combined copies" option. By doing
> so, you effectively demanded that Alpine do an individual message save
> for each selected message instead of an aggregate save.
>
> The fix is to enable this option, so that Alpine does an aggregate save
> and incurs the save setup overhead only once instead of per message.
>
> That way, you will only have to wait 22 seconds instead of
> <# messages> * 22 seconds.


Very interesting. I was having the problem earlier that message sort
order wasn't being retained the way I wanted it to be, so I wrote to the
list and was told to uncheck that box. I didn't know the downside!

(By the way, it is "Save Combines Copies" [combines, not combined])
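
To put rough numbers on the per-message versus aggregate cost described above, here is a minimal sketch in Python. The function name and the message counts are made up for illustration; the 22-second setup figure is the one quoted above, and the model assumes the setup overhead dominates and ignores the time to actually copy each message.

```python
def total_save_seconds(num_messages: int, setup_seconds: float, combined: bool) -> float:
    """Estimate the total wait for saving a set of selected messages.

    Assumption: the save setup overhead dominates, so the time spent
    copying each message body is ignored.
    """
    if num_messages <= 0:
        return 0.0
    # An aggregate save pays the setup cost once; individual saves pay it
    # once per selected message.
    saves = 1 if combined else num_messages
    return saves * setup_seconds


if __name__ == "__main__":
    SETUP_SECONDS = 22.0  # figure quoted in the message above
    for n in (1, 10, 50):  # illustrative message counts
        aggregate = total_save_seconds(n, SETUP_SECONDS, combined=True)
        individual = total_save_seconds(n, SETUP_SECONDS, combined=False)
        print(f"{n} messages: aggregate {aggregate:.0f} s, individual {individual:.0f} s")
```

Under these assumptions the individual-save wait grows linearly with the number of selected messages, while the aggregate save stays at a single setup delay.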



> Second: you use the traditional UNIX mbox format for mailbox files
> greatly larger than that format was ever designed to support. That
> format was designed in the 1970s when a "large" mailbox was over 100K.
> That format has been obsolete for at least 20 years.
>
> The fix is to use a more modern mailbox format, preferably mix, but at
> least mbx. A more modern mailbox format does not have the save setup
> overhead that exists in traditional UNIX mbox format.
>
> That way, you won't have to wait at all.


I want to do that but I have to wait until I get more disk space to
convert old files to the new format. I am looking forward to it.



> Third: you use the SVR4 (e.g., Solaris, AIX, HP-UX) operating system.
> These systems are inferior garbage sold by sleazy vendors such as SUN,
> who knowingly sell this garbage to trick you into paying more for
> hardware than you need.


I caught onto that about 8 years ago. The nice thing is that the old
SPARC machine is still running after 12 years and it has only been
rebooted when the electricity went out or for one OS upgrade. It is
inferior for the money though. Back when I got it I kinda needed it
because we had software that would run only on Solaris and Linux (which
I used for the first time in 1995) wasn't quite mature enough.



> The fix is to get rid of your SUN crap and replace it with a system
> running Linux and/or BSD. The next time the SUN salesman shows up,
> throw him out of your office.


I've been using Linux on other machines for years. I'm planning to get
one more box, probably this month, and retire the Sun, so I'm with you on
that one too.



> That way, you don't have to go to the extra overhead of having to spawn
> an inefficient slave process in order to work around the defective SVR4
> operating system.


That's interesting. It didn't occur to me that the extra process was an
OS-specific problem.



> I recommend that you do all of the above, and certainly the first of
> these. You will never be fully happy with performance until you do all
> of these. I know that changing mailbox formats and trashing your
> expensive SUN equipment in favor of cheap Linux equipment sounds crazy
> and radical, but I assure you that when you eventually do it, the
> difference will be so substantial that you will wonder why you didn't
> listen to me and do it much earlier.


Well, the Sun equipment is probably (seriously) worth about $100 by now.
That's down from about $12,000 when it was new.



> Last, but not least -- yes, there is a larger save overhead in the
> traditional UNIX mailbox format. The 22 seconds sounds about right for
> your huge mailboxes. Unless you disable Save Combined Copies, that cost
> is incurred only once in a single save command. If you disable Save
> Combined Copies, that cost is paid per message.
>
> That cost must be paid. Certain other sleazy vendors (large companies)
> unilaterally decided that an OPTIONAL facility of IMAP was now
> mandatory, and began spreading statements that Pine/UW IMAP was
> defective because it did not implement this optional facility. That
> facility was known to be a problem with mbox format, which is why it had
> not been implemented before.
>
> If you feel that this is wrong, you should have backed me up years ago
> when I was being called 69 flavors of "idiot" for standing up for people
> who still use legacy mailbox formats, and being called another 105
> flavors of "idiot" for thinking that it was important to maintain
> halfway decent performance for mbox format. But nobody did, and I lost
> that fight.
>
> It's long too late to undo this now. That train left the station years
> ago.
>
> As I said above, the simple step of making sure that Save Combined
> Copies is enabled will cause you to have just one 22 second delay
> instead of the extended delay that you experienced. The additional step
> of using a better mailbox format -- one designed to support the
> above-named "optional" (now mandatory) IMAP facility -- will remove the
> delay. And the other additional step of using a good operating system
> (Linux, BSD) instead of SVR4 will remove the overhead of the additional
> process.


I can do all of those things, probably before the end of the year. One
question is why Alpine is so much slower than Pine in dealing with mbox
files. I assume it is because of the IMAP facility you refer to above.
Maybe it is too hard to make it an option in Alpine to have it do things
the old Pine way. I'm not using IMAP.

Mike

