Friday, June 24, 2005


Here's a download of source for all lab sessions. I'll try to keep it up to date as I post new source. I also want to update the distribution on the downloads page to contain the more robust, bug-fixed, and useful code (the code I actually use). The labs download is flakier, experimental code that is not updated with bug fixes and is often left sitting around for a long time.

lab 33 - profile lockfs


In the last lab I talked up the benefits of lockfs. I started using it in my program in place of channel locks and profiled the code with prof(1) to see its impact on performance. The results were a little disappointing, but not really surprising.

One of the test programs I'm using while developing the btree application is a simple indexer. It reads a set of text files, tokenizes each file into a set of Terms, then in the database creates a document id for each document and a termid for each distinct Term, and adds a docid-Hit-termid relation to the database. I then run this on a few megabytes of text data, such as all email messages for a year:

% prof -n folkfs/tok -i /n/mbox/2001/*/*

I've frequently used prof while developing the btree. Locking always presents a bottleneck. When I had locks based on channels, and a cache that also behaved as a lock, both took up a large percentage of samples when profiling lines of code.

The test I did most recently compared a single-threaded application using lockfs with the same program without lockfs. I obviously expected some performance loss when using lockfs, but the results showed that, even with no blocking from other processes, the time spent interacting with lockfs completely dominated the runtime of the program. 54% of samples occurred during the statement

lock = nil;

Another 33.3% of runtime was spent in the lockfs module. The total runtime with lockfs was 32 seconds; without lockfs it was 11 seconds.

Even with these results, I will not discard lockfs. The benefits of a distributed, garbage-collected lock still make it worth keeping. But I will selectively turn off locking when doing a bulk load. One danger I noticed: the cost of obtaining and releasing a lock started me thinking about holding locks longer to reduce interaction with lockfs, but that just creates a different kind of bottleneck.

The only lesson from this is: always prof your code. In Inferno it's easy.

Saturday, June 18, 2005

lab 32 - lockfs


I've struggled for a while trying to get multi-reader-single-writer locks (rwlocks) working the way I wanted for a database I've been working on (more in a later lab). I recently realized the answer was in Inferno all along.

The problem was not implementing an rwlock; I had done that for tickfs using channels and it worked fine. The problem came when I needed to kill a process that might be holding a lock.

A lock based on channels usually means that a spawned process is waiting for an event on a channel, either to grant or to free the lock. However, if the process holding a lock is killed, it never gets to send the unlock event to the lock manager process.

You might think: just don't kill processes that might hold locks. But killing processes is sometimes hard to avoid; for example, when a Flush message arrives, or to cancel a query that launched multiple processes. Programs that depend on purely channel-based locking, such as tickfs, are quite fragile.
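To make the fragility concrete, here's a minimal sketch of a channel-based exclusive lock (the names and structure are illustrative, not the actual tickfs code). A manager process circulates a token on a channel; if the holder is killed before sending the token back, the manager blocks forever and every later acquire deadlocks.

        # spawn lockmgr(c := chan of int);
        lockmgr(c: chan of int)
        {
                for(;;){
                        c <-= 1;   # offer the token; a client acquires with <-c
                        <-c;       # wait for release; never arrives if the holder was killed
                }
        }

        # in a client process:
        #       <-c;            # acquire
        #       # critical section
        #       c <-= 1;        # release -- skipped if this process is killed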

While implementing the wikifs (lab 30) I noted the lovely property of file locks: when the process died, the file descriptor was garbage collected, the file was clunked, and the lock was automatically freed. The file lock in wikifs was a simple exclusive lock, using the file system's DMEXCL mode bit (under kfs in this case) to manage the exclusion.

What I wanted was a file system that served a file representing the rwlock. It already existed: lockfs. I think what had originally confused me about this file server is that I believed it had to be layered on top of the actual data file I was going to work with. That wouldn't have worked for me, so I didn't experiment with lockfs. What I realized this week is that I could have lockfs control access to an empty file, and then I'd get all the benefits of file locks, and more. The rlock function became simply

lock := sys->open(LOCKFILE, Sys->OREAD);

and wlock became

lock := sys->open(LOCKFILE, Sys->ORDWR);

and freeing a lock was nothing more than,

lock = nil;

which would happen anyway when the function or process exited. Brilliant!
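Putting it together, here is a hedged sketch of how the rlock/wlock pair might be wrapped; the lock file path and the assumption that lockfs is already mounted there are mine, not from lockfs itself.

        LOCKFILE: con "/n/lock/db";     # assumed mount point served by lockfs

        rlock(): ref Sys->FD
        {
                return sys->open(LOCKFILE, Sys->OREAD);   # shared: many readers
        }

        wlock(): ref Sys->FD
        {
                return sys->open(LOCKFILE, Sys->ORDWR);   # exclusive: one writer
        }

        # lock := wlock();
        # ...
        # lock = nil;   # fd is clunked and the lock freed, even if the process dies first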

The extra benefit is that lockfs can serve its namespace and act as a lock for a cluster of machines.

Switching to lockfs has considerably improved my database application. The example of tickfs shows that Styx's Flush message isn't handled properly, because of the problems with killing processes that might hold locks; lockfs makes the application more robust. Also, more than one application can now access the data file, as long as each respects the locks served by lockfs, and these applications do not even need to be on the same machine. By combining lockfs with the kind of lockless cache I discussed in lab 30, I can completely avoid the channel-based locking that caused me headaches whenever I needed to kill groups of processes that held locks.

Tuesday, June 07, 2005

lab 31 - accumulator generator in limbo


In lab 25 I tried to tackle the problem of writing an accumulator generator in the languages available on Inferno. I blithely assumed it wasn't possible in Inferno. I'm happy to say I was wrong. Roger Peppe emailed me the answers, so I'm just reporting what he gave me. But it was very enlightening, a big Aha!, and made me view Limbo in a different way.

From rog: "funny thing is, it *is* possible in limbo, if you take a somewhat more liberal definition of the term "function". recall that a limbo channel can act like a function: e.g.

       c: chan of (int, chan of string);

"can represent a function that takes an int and returns a string (through the reply channel).

       c <-= (99, reply := chan of string);
       sys->print("result is %s\n", <-reply);

"i've attached a Limbo version of the accumulator-generator that uses this kind of thing. i'll leave it to you to decide if this fulfils Greenspun's rule or not!"

I said it doesn't strictly pass the test because you are passing an integer, not a number. Rog replied: "ahhh but it can if i want it to! (i really like inferno's parametric types... note the accgen and acc functions could be in an external module)."
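Here is a sketch of the channel-as-function accumulator along the lines Rog described; this is my reconstruction, not his attached code. accgen spawns a process that holds the accumulated state and returns the channel that plays the role of the function.

        accgen(n: int): chan of (int, chan of int)
        {
                c := chan of (int, chan of int);
                spawn acc(n, c);
                return c;
        }

        acc(n: int, c: chan of (int, chan of int))
        {
                for(;;){
                        (i, reply) := <-c;
                        n += i;        # the accumulated state lives in this process
                        reply <-= n;
                }
        }

        # f := accgen(0);
        # f <-= (10, r := chan of int);
        # sys->print("%d\n", <-r);      # prints 10
        # f <-= (5, r);
        # sys->print("%d\n", <-r);      # prints 15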

Around the same time as Rog sent me this, I was also reading Structure and Interpretation of Computer Programs, in particular the sections on delayed (or lazy) lists. The connection of lazy lists to functional programming and to channels in Inferno was made complete by Doug McIlroy's paper, Squinting the Power Series. I ported the code, originally in Newsqueak, to Inferno, to absorb its lessons. I recommend all Inferno programmers go read this paper and study the code.

Where to from here? I tried to apply what I'd learned. I created a tool for querying a little database. The query is made by chaining processes, in a similar manner to the power series code. I'll be posting this code at a later time, as I hope to incorporate it into the folkonomy fs.