Posts

Showing posts from March, 2008

lab 85 - stowage

NAME lab 85 - stowage NOTES In an earlier post I defined a venti-lite based on two shell scripts, getclump and putclump, that stored files in a content-addressed repository, which in that instance was just an append-only gzip tar archive with an index. After learning a little about the git SCM, this lab rewrites those scripts to use a repository layout more like git's. The key thing to know about the git repository is that it uses sha1sum(1) content addressing and that it stores the objects as regular files in a filesystem, using the hash as the directory and filename: objects/hh/hhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhh The objects directory contains 256 directories, one for every 2-character prefix of an object's SHA-1 hash. The filename is the remaining 38 characters of the hash. Putclump calculates the hash, slices it to make the prefix and filename, tests if the file already exists, and if not writes the compressed data to the new file. Here is the important part
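The putclump steps described in this excerpt (hash, slice, test, write) can be sketched in Python. This is not the lab's Inferno shell script; it is a minimal illustration under stated assumptions: the names put_clump, get_clump and repo_root are hypothetical, the raw bytes are hashed directly, and zlib stands in for whatever compressor the lab uses.

```python
import hashlib
import os
import zlib


def put_clump(repo_root, data):
    """Store data under objects/hh/<remaining 38 hex chars>; return the SHA-1 hex digest."""
    # Assumption: hash the raw bytes as-is (no header is prepended).
    digest = hashlib.sha1(data).hexdigest()
    prefix, name = digest[:2], digest[2:]  # 2-char directory, 38-char filename
    directory = os.path.join(repo_root, "objects", prefix)
    path = os.path.join(directory, name)
    # Content addressing: if the file exists, the same data is already stored.
    if not os.path.exists(path):
        os.makedirs(directory, exist_ok=True)
        with open(path, "wb") as f:
            f.write(zlib.compress(data))  # zlib is an illustrative choice
    return digest


def get_clump(repo_root, digest):
    """Read back and decompress the object named by a 40-char hex digest."""
    path = os.path.join(repo_root, "objects", digest[:2], digest[2:])
    with open(path, "rb") as f:
        return zlib.decompress(f.read())
```

Because the path is derived from the content, storing the same bytes twice is a no-op the second time, which is what makes the existence test sufficient for deduplication.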

lab 84 - gridfs pattern (mapreduce)

NAME lab 84 - gridfs pattern (mapreduce) NOTES I've mentioned mapreduce in previous posts. It makes a good example application for thinking about grid computing. This lab is also about mapreduce, although the point here is to illustrate an Inferno pattern for grid computing. I'll call it the gridfs pattern. Say you have a grid of compute nodes and you want to distribute and coordinate work among them. For the gridfs pattern you construct a synthetic file system that will get exported to all the nodes. The file system is the master process and all clients of the file system are workers. Both cpu(1) and rcmd(1) use the rstyxd(8) protocol, which exports the local namespace when running remote jobs. To implement the gridfs pattern we bind our master fs into our local namespace so it gets exported when we run multiple workers across our compute grid. A very simple example of this pattern is explained in the Simple Grid Tutorial Part 2. I export a named pipe with a pro

lab 83 - lcmd local cpu

NAME lab 83 - lcmd local cpu NOTES While thinking of the Simple Grid Tutorial Part 1 and Part 2, I wondered whether I could implement the equivalent of rcmd(1) but for a local emu launched using os(1). For example, lcmd math/linbench 100 would launch a new emu, export the local fs to it through a pipe rather than a network socket, and run the command in that namespace. The idea seemed simple, with no apparent obstacles, but it actually took me a couple of evenings to get it to work. So I'm posting it more because of the effort than its value. First let's look at what rcmd does, ignoring the networking. Given its arguments it builds a string, calculates its length + 1, and writes the length then the string to a file descriptor, then exports the local namespace to the same file descriptor. Well, that part is easy to do in sh(1). Here it is as a braced block, assuming all work is done on file descriptor 1. fn lcmd { load expr string args := $* s := sh -c ${quote $"
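The length-then-string framing that the post attributes to rcmd can be illustrated in Python. This sketch is not the lab's sh(1) code and makes assumptions the excerpt does not confirm: the length is written as decimal text on its own line, and the "+ 1" accounts for a newline terminating the command string. The names frame_command and read_command are hypothetical, arguments are joined with spaces rather than quoted, and the namespace export that follows on the same descriptor is not modeled.

```python
import io


def frame_command(args):
    """Build a length-prefixed command message (assumed format, see lead-in)."""
    payload = " ".join(args).encode("utf-8")
    body = payload + b"\n"  # assumed terminator, counted by the "+ 1"
    header = str(len(payload) + 1).encode("ascii") + b"\n"
    return header + body


def read_command(stream):
    """Read one framed command back from a binary stream."""
    length = int(stream.readline())  # decimal length line
    body = stream.read(length)       # command string plus its terminator
    return body[:-1].decode("utf-8")
```

Writing the length first lets the receiving side read exactly the command bytes and leave the rest of the stream untouched for whatever protocol follows.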