Archive for November, 2008

RunCore SSDs for EeePC

As of a few days ago, a new brand of third-party SSDs is available for the EeePC. The RunCore SSDs are now available at MyDigitalDiscount. I’ve already ordered two and can’t wait for them to arrive. I’ll post benchmarks soon.

Fix Firefox and Evolution hangs on EeePC

A combination of several of the earlier ideas seems to do the trick. If you came to this post directly, the two earlier posts, “EeePC hangs with Firefox” and “More on the EeePC hangs”, provide some context.

What seems to work fairly well for me is to run the ionice script at boot time to speed up the overall behaviour of Ext3 (this matters most on the slow MLC drive of the EeePC; you can find the script in the second article mentioned above). In addition, move the .evolution and .mozilla directories from your home directory to a separate partition on the SLC drive and format that partition with Ext2 (and not Ext3). The same applies to any other directory that regularly receives fsync calls; use LatencyTOP to find out which ones those are.

Now this means that this workaround is going to be rather hard for people running the default Xandros installation on the EeePC – that one uses all of the fast SLC drive (/dev/sda) for the unionfs that allows it to easily restore its original settings. Since I don’t run Xandros I can’t really help with how you could modify it to make this work.

If you have installed a different version of Linux on your flash-based EeePC, then this should be fairly doable. Most likely /dev/sda contains your /boot (if separate) and your / filesystems. For data integrity reasons you may not want to switch either of them from Ext3 to Ext2 (as you lose the journal doing that). So instead I suggest shrinking the / filesystem, creating a new partition in the remaining space, formatting it with Ext2, mounting it under something like /disposabledata (remember, if your system crashes, there’s a higher risk that you lose the content of this filesystem) and then moving ~/.mozilla and ~/.evolution to that filesystem.

The risk you are taking isn’t really that big, if you think about it. You may lose your browsing history. Or you may lose local copies of email that are stored on the server anyway. What would be annoying to lose are the stored passwords and the overall settings of both Firefox and Evolution, but a little cron job that regularly copies that data back into different directories in your /home directory should do the trick.

Details will be different depending on your scenario, but here is the rough outline of what I did and which seems to be working well for me so far:

  • shrink the / partition on /dev/sda. There are many tutorials on how to do that; I used this one to guide me through the process. How much space you need to free up depends on how big your ~/.mozilla and ~/.evolution folders are. Some of the popular distributions by default don’t use partitions but LVM volumes to hold your filesystems. Again, I don’t use that, so I can’t provide you with detailed instructions on how to shrink the filesystem in there and create a separate Ext2 filesystem, but it can’t be that hard to figure out.
  • create a new partition in the now available space on /dev/sda and format it with Ext2 (usually that’s done with mkfs.ext2)
  • modify /etc/fstab to mount this to an appropriate spot; I like to make it clear from the name of the mount point that this is not a journalled filesystem, so the line in my /etc/fstab looks something like this:
    /dev/sda3 /disposabledata ext2 defaults,noatime 0 0
    Obviously you need to use the partition that you created earlier; it was /dev/sda3 in my case.
  • still as superuser, create a directory for your user on that filesystem and chown it to that user:
    mkdir /disposabledata/user; chown user /disposabledata/user
  • as your regular user quit firefox and evolution and then move their respective directories to that directory:
    mv ~/.evolution ~/.mozilla /disposabledata/user
  • create links from your home directory to these new locations:
    ln -s /disposabledata/user/.evolution ~
    ln -s /disposabledata/user/.mozilla ~
  • finally, create a little cron script that copies these directories back to your home directory at a regular interval, ideally keeping one generation of backups so that regardless of when your system crashes, you should always have one intact copy of this data. Here’s what I do:
    #!/bin/bash
    # Keep a current copy and one older generation (.bak) of each profile directory.
    SRCDIR=/disposabledata/user
    TARGETDIR=/home/user/backup
    SUBDIRS=".mozilla .evolution"
    # sanity checks
    if [ -e "$TARGETDIR" ] && [ ! -d "$TARGETDIR" ] ; then
       echo "$TARGETDIR is not a directory"
       exit 1
    fi
    [ -d "$TARGETDIR" ] || mkdir "$TARGETDIR"
    for i in $SUBDIRS; do
       # drop the oldest generation, rotate the current copy to .bak, then copy afresh
       [ -e "$TARGETDIR/$i.bak" ] && rm -rf "$TARGETDIR/$i.bak"
       [ -e "$TARGETDIR/$i" ] && mv "$TARGETDIR/$i" "$TARGETDIR/$i.bak"
       cp -r "$SRCDIR/$i" "$TARGETDIR"
    done
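
Once the links are in place, a quick check along the following lines can confirm that both profile directories really live on the new filesystem. This is only a minimal sketch: check_links is a hypothetical helper name (it is not part of my setup above), and it assumes GNU readlink for the -f option.

```shell
#!/bin/bash
# Hypothetical helper: check_links HOME_DIR DATA_DIR
# Verifies that .mozilla and .evolution in HOME_DIR are symlinks that
# resolve to the directories of the same name below DATA_DIR.
check_links() {
    local home_dir="$1"
    local data_dir="$2"
    local name
    local status=0
    for name in .mozilla .evolution; do
        if [ ! -L "$home_dir/$name" ]; then
            echo "$home_dir/$name is not a symlink"
            status=1
        elif [ ! -d "$home_dir/$name" ]; then
            # -d follows the link, so this catches dangling links
            echo "$home_dir/$name does not resolve to a directory"
            status=1
        elif [ "$(readlink -f "$home_dir/$name")" != "$(readlink -f "$data_dir/$name")" ]; then
            echo "$home_dir/$name does not point to $data_dir/$name"
            status=1
        fi
    done
    return "$status"
}
```

With the names used above, this would be called as check_links /home/user /disposabledata/user; it prints nothing and returns 0 when both links are fine.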

And in the unfortunate event that your system does crash and your /disposabledata filesystem gets corrupted, simply recreate that filesystem and copy the last version of the backup folders back to that spot. You may have lost some of the browsing history and potentially some recent changes that you made, but most of your settings and data should still be around.
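
The restore step can be sketched roughly as follows. This is an illustration under assumptions rather than something I actually run: restore_profiles is a hypothetical helper name, and it assumes the filesystem has already been recreated and mounted, the per-user directory on it created and chowned as in the steps above, and the backup layout produced by the cron script.

```shell
#!/bin/bash
# Hypothetical helper: restore_profiles BACKUP_DIR DATA_DIR
# Copies each profile directory from BACKUP_DIR to DATA_DIR, using the
# current backup if present and falling back to the .bak generation.
restore_profiles() {
    local backup_dir="$1"
    local data_dir="$2"
    local name
    for name in .mozilla .evolution; do
        if [ -e "$data_dir/$name" ]; then
            # never copy on top of (or into) an existing directory
            echo "$data_dir/$name already exists, skipping" >&2
        elif [ -d "$backup_dir/$name" ]; then
            cp -r "$backup_dir/$name" "$data_dir/$name"
        elif [ -d "$backup_dir/$name.bak" ]; then
            cp -r "$backup_dir/$name.bak" "$data_dir/$name"
        else
            echo "no backup found for $name" >&2
        fi
    done
}
```

With the paths from the cron script this would be restore_profiles /home/user/backup /disposabledata/user, run as the regular user. If the crash happened while the backup itself was running, the current copy may be incomplete; in that case copy the .bak directory by hand instead.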

But most importantly, the system will feel much more usable with significantly fewer stalls and hangs. It certainly does for me.

More on the EeePC hangs

As a commenter on my previous post on this topic has pointed out, the same problem happens with Evolution as well: due to Ext3’s bad design, an fsync causes a complete sync of the filesystem, which on MLC-based SSDs causes long write stalls, and the fsync blocks until those writes have finished. And of course when running both Evolution and Firefox (a very typical scenario) it gets even worse.

I’ve worked with Arjan on improving LatencyTOP some more in order to track down these stalls. LatencyTOP now has a special mode (together with a matching kernel patch) that allows you to monitor all fsyncs with the file name that they are waiting for. Extremely useful in this context. What becomes obvious is that while we can’t fix the underlying problem with the bad combination of fsync-sync-MLC-stall without switching to a different filesystem, we can solve some of the other performance problems that are caused by the i/o system (and especially the filesystem) getting overloaded.

Basically the situation on the EeePC exposes another flaw of the combination of Ext3 and the new i/o scheduler. The way the priority calculations are run, the kernel journal daemon is getting starved, especially if writes stall. Thankfully there’s a way to improve the situation a little:
for i in $(/sbin/pidof kjournald); do
  /usr/bin/ionice -c1 -p "$i"
done

This forces all instances of kjournald to run in the real-time i/o scheduling class, which gives them more access to the very limited i/o bandwidth to the flash. Ideally put this somewhere where it is executed at boot time, something like /etc/rc.d/rc.local.

This helps in normal use – the system doesn’t stall as often, for example when just browsing the web. It doesn’t help with the massive stalls that are caused for example when you quit Evolution and it fsyncs all of the local copies of IMAP mailboxes.

We’re still looking into ways to improve that behavior (hoping for BTRFS to get into 2.6.29 is one of those strategies…).

EeePC hangs with Firefox

I’ve done some more work on figuring out why my EeePC literally freezes for a second and more at a time when I work with Firefox. I used LatencyTOP to track this down and think that I have put together the pieces of this puzzle.

The reason this hits me especially hard is a combination of several independent issues:

  1. Firefox 3 uses sqlite which aggressively uses fsync to make sure that the on disk database of websites visited stays in sync. The corresponding bug is marked as resolved, fixed but I would consider that a serious overstatement.
  2. Ext 2 and 3 have a bug where an fsync on a file actually forces a sync on the disk. This, too, has been discussed before.
  3. Many MLC SSDs have a problem with many small writes at once (this was discussed in extensive detail in this great AnandTech article).
  4. Finally, as I mentioned before, there appears to be something wrong with the MLC drive in my EeePC 1000 (I’m still waiting for the 32GB SLC drive (and, just to do comparisons, a 64GB MLC drive) from MyDigitalSSD). And since I have my complete Fedora installation on the secondary drive, all writes to temp files and log files add to the write delays.

It gets so bad that LatencyTOP shows the system stalled for 1500ms just waiting for one Firefox fsync to complete.

So what can you do to fix this if you run into the same issue (which seriously hampers the usability of an affected system)? Well, you could put your /home directory on the SLC (/dev/sda) in your EeePC. But that’s not really a good solution as most of us want to use the large drive in order to have space for all of our files.

You could switch to a different filesystem. But Fedora (like many other current distributions) is very much built around the assumption that you are using Ext3. And it seems that ReiserFS isn’t any better in that respect. The filesystem that will fix the problem, BTRFS, is still not quite ready for real-life deployment.

Finally, you could put a small partition on your SLC drive, and move your .sqlite files (in the .mozilla/firefox/randomlettersprofile directory) onto that drive. I tried that and it didn’t help, either.

So I’m still searching for a solution.

Update: I think I have at least a partial solution.