Tuesday, December 23, 2008

lab 91 - using freetype

NAME

lab 91 - using freetype

NOTES

While fiddling with Charon's fonts and wondering what work would be involved to replace the whole set I decided to take a quick look at the freetype module. This lab documents some of my progress.

A recent post to the acme-sac mail-list pointed me to the DejaVu fonts. They are derived from Bitstream Vera Fonts but with more characters. It includes various styles: Sans, Serif, Italic, Oblique, Bold, and Mono, making it a good choice for Charon. At first I considered converting the whole set over to Inferno format.

There is a program to convert TrueType fonts to the inferno format. But the program is designed to run on Plan 9 and I don't have a ready Plan 9 environment anymore. So the effort of setting up an environment, compiling and fixing problems I know exist in the conversion tool, creating font files for all the styles, and in a variety of sizes, and I was ready to look for an easier solution.

The Freetype library is compiled into the inferno-os emulator and exports a builtin limbo interface. There are no programs in inferno-os that use Freetype. And there is no documentation describing the limbo interface. There have, however, been a few posts to the inferno-list describing its use. Also, the Freetype documentation from its source website is good. The tutorial basically describes what needs to be done within inferno to use the freetype module.

The first example from inferno-list, testfreetype.b, shows the use of the library for rotation and scaling.

% cd lab 91
% limbo testfreetype.b
% testfreetype fonts/DejaVuSans.ttf 'Hello World'

In this screenshot, the Inferno logo image is the background, and the word 'freetype' is scaled and rotated above it, with some transparency.

108112312208-Inferno

The example dbft2.b is a simplification of the above that demonstrates writing a string to a window. I took the dbft2.b code an tried to adapt it to the frame module, a port a libframe from Plan 9, and used by Acme in inferno-os,

First of my own demos is an application called term that accepts keyboard input and uses frame to display the entered text inside a window. This uses inferno fonts. Using frame requires some setup code,

 framem = load Framem Framem->PATH;
 
 # setup a window client
 ...   
 win := wmclient->window(ctxt, "Term", Wmclient->Appl);
 ...

 font = Font.open(display, "/fonts/lucidasans/unicode.8.font");
 textcols = array[NCOL] of ref Draw->Image;
 textcols[BACK] = display.black;
 textcols[HIGH] = display.color(Draw->Darkyellow);
 textcols[BORD] = display.color(Draw->Yellowgreen);
 textcols[TEXT] = display.color(Draw->Medgreen);
 textcols[HTEXT] = display.black;
 framem->init(ctxt);
 frame = framem->newframe();
 win.image.draw(win.image.r, textcols[BACK], nil, ZP);
 framem->frclear(frame, 0);
 framem->frinit(frame, win.image.r,  font, win.image, textcols);

Its not documented in Inferno, but see the Plan 9 manual page.

On input from the keyboard we append it to a buffer and pass the buffer to frame:

 c := <-w.ctxt.kbd =>
  buf[len buf] = c;
  framem->frinsert(frame, buf[len buf - 1:], 1, frame.p0);

Once I got that baseline working, the next program is frame2.b and term2.b that uses freetype. In the init function I load the freetype module and load a new face.

 freetype = load Freetype Freetype->PATH;
 face = freetype->newface("./fonts/DejaVuSerif-BoldItalic.ttf", 0);
 face.setcharsize(20<<6, 72, 72);
 glyphsimg = ctxt.display.newimage(Rect((0,0), (20,20)), Draw->GREY8, 0, Draw->Black);

I rewrote the three functions used by frame to display strings, stringx, charwidth, and strwidth. Stringx does the work of loading the glyph and drawing it.

stringx(d : ref Image, p : Point, f : ref Font, s : string, c : ref Image)
{
 origin := Point(p.x<<6, (p.y+face.ascent)<<6);
 for (i := 0; i < len s; i++)
 {
  g := face.loadglyph(s[i]);
  if (g == nil){
   sys->print("No glyph for char [%c]\n", s[i]);
   continue;
  }
  drawpt := Point((origin.x>>6)+g.left, (origin.y>>6)-g.top);
  r := Rect((0,0), (g.width, g.height));
  r = r.addpt(drawpt);
  glyphsimg.writepixels(Rect((0,0), (g.width, g.height)), g.bitmap);
  d.draw(r, c, glyphsimg, (0,0));
  origin.x += g.advance.x;
 }
}

In this screenshot, the term application is running inside inferno-os, with some typed text. (Note, there is a error in my render of lowercase 'f', the top of the 'f' has been chopped off.)

108112314543-Inferno

Acme uses frame so we can quickly experiment with using a TTF file for the Acme font. The above three functions are in another module in the acme source called graph.b. I've included a graph2.b file in this lab that can be bound over the existing graph.dis and will load a hardcoded TTF file for the font. Below is a screenshot of acme running the DejaVuSans.ttf font.

108112392028-Inferno

Now back to my original aim: changing the charon fonts. I created a module that defined a new Font ADT to replace the one defined in the draw module. I changed the interface slightly, to include the freetype face and added the stringx function.

All the code for drawing text in charon is in layout.b. In this labs code I've included a replacement that uses the freetype Font adt. I changed all calls to Image.text() to call a local function that itself calls Font.stringx(). Layout.b locally defines a Fontinfo that specifies all the font files and sizes. I modified it to point to all the DejuVu fonts with sizes as appropriate. To run this yourself inside inferno-os you'll need to extract the DejuVu ttf files directly under /fonts, then,

% cd /appl/charon
% bind -bc '/n/local/inferno-lab/91' .
% limbo layout.b
% limbo ftfont.b
% bind layout.dis /dis/charon/layout.dis
% charon&

And here is an screenshot of the results.

1081123203852-Inferno

I haven't looked at converting Tk.

FILES

lab/91

Thursday, December 04, 2008

lab 89 - electroquongton

NAME

lab 89 - electroquongton

NOTES

The code for this lab was something I was playing with to display on a Nintendo DS. It uses the mux window manager and the prefab module builtin. Because of the dependency on the builtin, which isn't usually part of the inferno-os standard emu build, I've included a muxemu.exe for Windows in this labs code. To launch the code run the following,

% muxemu -r . -g256x384 /dis/mux/mux.dis

and on the screen you should see this,

10872522390-Inferno

Move up and down using keys 'i' and 'm'. Enter a selection using the 'Enter' key and return to the main menu by pressing the spacebar.

I'm not experienced with the draw(2) API so I started with screens for board games.

10861214276-Inferno

I was experimenting using transparency effects. The look I was going for was Electroplankton for Nintendo DS.

I tried to get shapes with edges that blended out into a graded background and some simple animations of shapes pulsing.

108725231623-Inferno

The one serious application I was trying to write was a QUONG keyboard for convenient touch input.

10872420332-Inferno

This application responds to the mouse to enter letters. The top half of the windows is an implementation of Plan 9's libframe. It was part of Inferno's acme implementation and I extracted it from acme's dependencies. If you take anything away from this lab it would probably be this one library, maybe to build an inferno based 9term.

A big disappointment is that these didn't actually work on the inferno-ds due to what I guess is a bug in the handling of graphics with masks. They did seem to work on the DS emulators, but slowly. Another difficulty is writing them so that they fit in the DS memory, the techniques of which I'm completely ignorant because I'm used to the great spaciousness of modern desktops.

FILES

inferno-lab/89

Monday, November 10, 2008

lab 90 - Multicast DNS and Zeroconf

NAME

lab 90 - Multicast DNS and Zeroconf (client portion)

NOTES

While using synthetic file systems to publish services works great, you still need to know where your server is. This information can be provided from a shared ndb/local or even DHCP, but there are plenty of scenarios I have run into where I don't control the DHCP server and distributing ndb/local is tedious. In working with Blue Gene the problem becomes a bit worse in that we don't know apriori which portion of the machine (and therefore which IP address) we will get. Further complicating this is the fact that our front-end node (where we run Inferno) is established by a load balancer, and there are potentially 5 of us running our own file servers. So not only do we need to know how to get to the front end node, but how to get to the right front-end node.

While some form of simple broadcast service discovery may have been sufficient, I decided to take the time to see what it would take to add multicast DNS and service-discovery to the Plan 9 and Inferno DNS services. Multicast DNS and Service Discovery (aka Bonjour, aka Rendezvous, aka Zeroconf) is documented in plenty of places, here's a few good starting points for more information:

Multicast DNS resolution is fairly straightforward, it involves just sending the DNS request to a multicast address (224.0.0.251) and using port 5353 instead of port 53. The first thing I did was modify /appl/cmd/ndb/dns.b to use this address and port when looking for any domain ending in .local per the zeroconf convention.

Service discovery is a bit more problematic. It involves sending a slightly different type of request. Typical DNS requests use an type A style request which retrieves an IP address for a hostname. Service discovery uses a PTR style request which returns three types of records -- the PTR record contains the instance of the service which will include a more specific name of the service location, a SRV record which will contain the port number of the service, a TXT record which contains some protocol specific information and an A response which contains the IP address.

So, for example, if I send a PTR request for _presence._tcp.local, I get four records in response:

  1. ptr record -> crazyjim@arl137._presence._tcp.local
  2. srv record -> port: 5298
  3. txt record -> last=Peterson 1st=James msg=Away status=dnd
  4. A record -> 9.3.61.137

I added a flag to ndb/dnsrequest (-z) which forces sending a PTR request to DNS. Using the Plan 9 DNS service as a model, I modified the Inferno dns to be able to parse SRV and TXT records. The one things I changed was that the TXT response can contain several key value pairs. The Plan 9 DNS service just strings these together (with no seperator mind you). So I print the number of key value pairs, and then have the key=value pairs one per line prefixed with dual tabs to make it look nice.

The final remaining problem is that the current DNS daemon only returns a single response for each request, but multicast DNS may have several responses for a single request. This involves a much larger set of changes to dns.b. DNS will now accumulate responses and return them in one big set.

I probably need to do some work to get cs.b to play nice with such information. The whole dnsquery, and even the cs and ndb front-ends to it seem decidedly anti-Plan 9 in their layout -- particularly for something with lots of rich attribute information like zeroconf. As such I'm likely going to create a synthetic file server with which to browse zeroconf data. You register which types of zeroconf entities you are interested in by creating directories in a two level hierarchy -- the file server will then use my modified DNS to query the local net and will create nodes under those directories for responses with attribute information broken out into individual files. The other major thing that needs to be done is adding multicast server support to dns.b. But I think I'll write these up in a different lab entries as this one is getting long in the tooth already.

While this is sufficient to solve my initial problems, there are several additional aspects of zeroconf which might be nice to integrate for Inferno including support for link-local IP addresses (which I guess would only be important on native), proper uniqueness handling for claiming your local names, NAT-PMP support, DNS-LLQ support, and dynamic DNS updates.

EXAMPLE

% ./dnsquery -z _presence._tcp.local
_presence._tcp.local ptr ericvh@ericvh-desktop._presence._tcp.local
ericvh@ericvh-desktop._presence._tcp.local txt 10
txtvers=1
1st=Eric
last=Van Hensbergen
port.p2pj=5298
status=away
node=libpurple
ver=2.5.2
vc=!
email=bergevan@us.ibm.com
phsh=943420112a8b192466a802bedfe547041a62ea90

ericvh@ericvh-desktop._presence._tcp.local srv 0 0 5298 ericvh-desktop.local
ericvh-desktop.local ipv6
ericvh-desktop.local ip 9.3.61.77
npe@macintosh-16._presence._tcp.local srv 0 0 5298 macintosh-16.local
npe@macintosh-16._presence._tcp.local txt 13
ext=
phsh=f308675309a23fa653c269c95f57eb7eb84efc44
last=
AIM=
nick=
1st=Noah
port.p2pj=5298
txtvers=1
version=1
node=
jid=
email=
status=avail

macintosh-16.local ipv6
macintosh-16.local ip 9.3.61.73
_presence._tcp.local ptr npe@macintosh-16._presence._tcp.local

DISCUSSION

dns.b seems entirely too big, and I just made it bigger. It would seem better served if it were split up into a bunch of component modules. It seems like the marshalling and unmarshalling of DNS messages is a legitimate module, the cache is a module, local database/config access is another, and then a proper module interface for performing DNS queries and/or servicing DNS requests. The file and network servers could then be provided relatively cleanly. All in all it would clean the code up signifigantly and make the whole thing a lot more readable/extensible. Things like the registry or even cs could easily be implemented as plug-in modules versus discreet file servers (although also allowing them to use file services is desirable in certain scenarios so this should definitely be parameterized).

Its funny, looking back at Virgild, it was essentially a broadcast form of multicast name resolution, just with its own more simple protocol instead of DNS.

FILES

  • lab90/appl/cmd/ndb/dns.b
  • lab90/appl/cmd/ndb/dnsquery.b

Thursday, May 01, 2008

lab 88 - degrees of freedom

NAME

lab 88 - degrees of freedom

NOTES

The Vitanuova downloads page has historical snapshots of Inferno source from 1996 to 2003 containing all three editions before Inferno went open source. I was curious to see how well Inferno has sustained a standard set of interfaces over the last ten years so I downloaded all of them and poked around.

The biggest overall change came with 4th edition, when many parts of the system were upgraded, including the Styx protocol, several builtin modules, dis format, the limbo language and VM. Also, significantly, Inferno adopted open source licenses granting developers the freedom to modify any part of the system: something that might impact sustainability for good or ill.

While every edition prior to 4th has had some interfaces changed, these changes did not break backwards compatibility. A 3rd edition emu can run dis code from the 1st edition archive.

The difference between 3rd and 4th was large enough that limbo code needed to be ported, or at the very least recompiled, to run on the new emu.

The Sys interface is evolving still with additions made since 4th edition was released. These types of changes, adding a function or a new constant, do not break backward compatibility, but in a network of emus where there is diversity of versions, link typechecks do fail when a module is expecting an interface newer than the one available. Where is the standard interface here? This problem seems to violate some of the core ideas of Inferno, and Inferno doesn't provide an easy way of working around compatibility issues with builtin modules.

Inferno's core idea is to provide standard interfaces that free content and service providers from concern of the details of diverse hardware, software, and networks over which their content is delivered. (/sys/doc/bltj.ms)

The BLTJ paper describing Inferno listed the several dimensions of portability and versatility provided by the OS,

  • Portability across processors: it currently runs on Intel, Sparc, MIPS, ARM, HP-PA, and PowerPC architectures and is readily portable to others.
  • Portability across environments: it runs as a stand-alone operating system on small terminals, and also as a user application under Windows NT, Windows 95, Unix (Irix, Solaris, FreeBSD, Linux, AIX, HP/UX) and Plan 9. In all of these environments, Inferno applications see an identical interface.
  • Distributed design: the identical environment is established at the user's terminal and at the server, and each may import the resources (for example, the attached I/O devices or networks) of the other. Aided by the communications facilities of the run-time system, applications may be split easily (and even dynamically) between client and server.
  • Minimal hardware requirements: it runs useful applications stand-alone on machines with as little as 1 MB of memory, and does not require memory-mapping hardware.
  • Portable applications: Inferno applications are written in the type-safe language Limbo, whose binary representation is identical over all platforms.
  • Dynamic adaptability: applications may, depending on the hardware or other resources available, load different program modules to perform a specific function. For example, a video player application might use any of several different decoder modules.

Now that we have a decade of Inferno history, how many of the above degrees of freedom still hold when the whole time span is considered as one network of interconnected emus?

Don't standard interfaces also imply standard across time? A network of emus can not be expected to upgrade all at the same time. A standard is also a constraint against change in an interface. One degree of freedom is expressly limited; keep the abstraction constant.

The dilemma faced is whether to freeze an interface to provide long term compatibility based on a standard, but risk the possibility of being held back from adopting new ideas and becoming irrelevant, or to keep changing interfaces to solve new problems but pay the cost of compatibility problems.

In general it seems filesystems, namespaces and textual interfaces all lend well to creating a sustainable software environment. However, the more complex limbo language and module interfaces have shown themselves to be not so well preserved. By comparison, maybe unfairly because of the different goals of the creators, see how the works of Knuth are intended to withstand time. He specifically structures his software so that incompatibilities do not creep in, such as leaving no undefined gaps in font tables so that no one is tempted to fill them (TeX Book), or defining instructions to fill all 256 possible slots in MMIX, and making his source readable but not permitting edits except through his CWEB change file system. Knuth's software is the only kind I know of that take seriously the problem of long term compatibility.

It would be nice to run 1st edition dis code in a current emu if for no other reason than to prove the sustainability of infernos standard interfaces, but a major barrier to that is the need to bind in old Sys and Draw modules. I assume that compatibility to older interfaces should be provided through limbo modules so that the emu doesn't bear the extra weight and complexity of carrying multiple builtin implementations. There is no way to override where a builtin module is loaded from, though this might be a nice feature. For example, if /dev/dis/draw.dis represented the builtin draw module, I might bind limbo implementation over it, so that a load Draw "$Draw", would take the compatibility simulation over the builtin. (This might also work nicely going the other way so that we could bind builtin modules over limbo modules for optimization.)

This is not possible for Sys however, because we can't simulate variadic args in limbo (e.g., sys->print). A solution to this would be nice! But an alternative to providing more mechanism is simply to freeze the various interfaces.

Sunday, April 06, 2008

lab 87 - mux for nintendo ds

NAME

lab 87 - mux for nintendo ds

NOTES

In an earlier post I talked about updating mux to 4th edition Inferno in the hope of one day running it on a Nintendo DS.

Well, Inferno is now booting on the DS so I got to try it for real.

I started with getting the mux window manager working in standard inferno. Then I changed the resolution down to 256x192 and tried to get everything to fit. The files in this lab include the version of mux I ended up putting in the nds file running on the DS.

Things to try if you download it. Rocker moves up and down selection. 'A' key enters, 'B' key backs out back up to the higher level. 'Start' key returns to the top level menu.

Try Today's Newspaper, and The Thisburgh has the only working graphic. Under news, click through to actually read an article. Under games, try connect4. Audio control would look cool if any of the graphics actually came in. The Financial Reports gives a ticker. It scrolls slowly only because of the sleep interval in the code is incorrect.

If you want to try this version of mux using hosted inferno just remember you need to compile prefab into your emu. Include prefab in the mod and lib sections of your emu config file, also uncomment prefab in the /libinterp/mkfile.

Mux uses irsim for key controls. I changed my local inferno-ds code to have the DS keys output the same characters as used by irsim.

The files in this lab include the movies and tvlist apps and their data. The data didn't fit on the 4MB .nds file. But they will fit when we get the GBA ROM or dldi interface working.

I think mux is a good path to follow for DS development. It's small, starts quickly, uses the keys effectively since it was designed for remote controls, the programs are easy to understand, and they hit most of the applications I'd like to start with, small games, news reader, email reader, simple database browser (movies, tvlist), and audio.

FILES

code for lab 87

lab 86 - srv

NAME

lab 86 - srv

NOTES

This from a post on 9fans, and also on Tip O' the Day

% dc >[0=1] | echo 0 > /srv/desk

Plan 9's srv(3) acts as a bulletin board for open file descriptors, other namespaces see all the files in srv, and so can read and write to /srv/desk.

Inferno has srv(3) which is a file2chan registry, but is also visible to all namespaces on the host. (see also srv9(3))

The current implementation of sh-file2chan(1) does not allow the above. The closest I got was,

% load file2chan
% calc >[0=1] | {file2chan /chan/desk {rblock; putrdata &} {fetchwdata > /fd/0}} &

% stream -b 1 /chan/desk
% echo 1+1 > /chan/desk

I tried implementing a command equivalent to srv(4) on Plan 9. It takes a command block or network address and post it in the srv registry.

% srv {calc >[1=0]}  /chan/desk

It using an existing '#s' instance if there is one, else binds a new one. Now we can open a console to /chan/desk from another window

% {cat & cat  >[1=0] < /dev/cons} <> /chan/desk

and other windows can write to /chan/desk, the output will be seen in the console.

Questions.

  1. Why isn't Plan 9 srv(3) in Inferno?
  2. File2chan(2) seems under used. Is that because of the shell interface sh-file2chan?
  3. Is there another interface that would make file2chan more usable?
  4. Mount(1) supports mounting from a file, as Plan 9's does. But the inferno srv(3) device must do extra copies of the read and write buffers to implement the interface. Is the file interface of Plan9 srv more elegant than the extra file2chan syscall in Inferno?
  5. Aren't there benefits to using channels in Inferno that make file2chan preferable?

Files in this lab are for the inferno srv(4) implementation.

FILES

lab 86 code

Monday, March 24, 2008

lab 85 - stowage

NAME

lab 85 - stowage

NOTES

In an earlier post I defined a venti-lite based on two shell scripts, getclump and putclump, that stored files in a content addressed repository, which in that instance was just an append-only gzip tar archive with an index.

After learning a little about the git SCM, this lab re-writes those scripts to use a repository layout more like git's. The key thing to know about the git repository is that it uses sha1sum(1) content addressing and that it stores the objects as regular files in a filesystem using the hash as the directory and filename,

  objects/hh/hhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhh

In the objects directory is 256 directories named for every 2 character prefix of the sha1hash of the object. The filename is the remaining 38 characters of the hash.

Putclump calculates the hash, slices it to make the prefix and filename, tests if the file already exists, and if not writes the compressed data to the new file. Here is the important part of putclump,

 (sha f) := `{sha1sum $file}
 (a b) := ${slice 0 2 $sha} ${slice 2 40 $sha}
 
 if {ftest -e $hold/objects/$a/$b} {} {
  mkdir -p $hold/objects/$a
  gzip < $file > $hold/objects/$a/$b
 }

Getclump just needs to look up the file given a hash

 sha := $1
 (a b) := ${slice 0 2 $sha} ${slice 2 40 $sha}
 files := `{ls $hold/objects/$a/$b^* >[2] /dev/null}
 if {~ $#files 1} {gunzip < $files } 

Because the git repository uses a regular file system to store objects, it makes it considerably easier to work with than the compacted file system like tar.gz, or an application specific binary format like venti. This is because instead of having to create new tools to read and write binary formats, we can re-use existing tools, like sh(1), tarfs(4), updatelog(8), and applylog(8).

For example, I wrote a script, stow, that takes a tarball and stores it in my repository, called the hold. The hold should be created first with the following directories,

 /n/hold/logs
 /n/hold/objects
 /n/hold/stowage

Then give stow the name of a .tar or .tgz file. Files not found in the hold and that were added are printed to stdout.

 % stow acme-0.11.tgz
 ...
 %

Stow uses updatelog(8) to create a stowage manifest file for the tarball I added. This manifest is saved under /n/stowage. The manifest records the pathname, perms, and sha1 hash of every file in the tarball.

Now that I've stowed all my tarballs I need a way of getting things out.

I built a holdfs, derived from tarfs(4), to read the stowage manifest and present files from the hold. By default the file system is mounted on /mnt/arch.

 % holdfs /n/hold/stowage/acme-0.11

The hold with its stowage is be a step up from a directory tarpit of tarballs. I can accumulate a version history based on tar.gz releases like that for acme-sac and inferno. The vitanuova downloads site contains inferno history going back to 1997. My downloads page contains snapshots of inferno from 2002 to 2006 and acme-sac after that.

My intended application for this was that I could encourage forks of a project and merge back many individuals releases into a single repository and still do useful comparisons.

Using a filled hold I should be able to do analysis of a file history based on the stowage manifests. Contained in this lab are a few experimental scripts to build out more of an SCM. For example, the script hold/diff attempts to use updatelog to compare a manifest with the current tree. And hold/difflog uses a modified applylog(8) to compare two manifests.

The nautical references suggest a distributed and loosely coupled network like that of shipping, and is also influenced by git's design. The unit of transfer is a tarball. It is stowed into the ships hold, along with the manifest. A file system interprets the manifests and gives an interface for searching the hold. There is also keep a ships log of what was stowed and when. I can extract patches, files, tarballs, or the complete stowage in my hold to share with someone else.

This is a simple system:

     28 putclump.sh
     16 getclump.sh
     54 stow.sh
    669 holdfs.b
    767 total

But then most of what was needed already existed in inferno.

FILES

lab 85 code

Sunday, March 23, 2008

lab 84 - gridfs pattern (mapreduce)

NAME

lab 84 - gridfs pattern (mapreduce)

NOTES

I've mentioned mapreduce in previous posts. It makes a good example application for thinking about grid computing. This lab is also about mapreduce although the point here is to illustrate an inferno pattern for grid computing. I'll call it here the gridfs pattern.

Say you have a grid of compute nodes and you want to distribute and coordinate work among them. For the gridfs pattern you construct a synthetic file system that will get exported to all the nodes. The file system is the master process and all clients to the file system are workers.

Both cpu(1) and rcmd(1) use the rstyxd(8) protocol that exports the local namespace when running remote jobs. To implement the gridfs pattern we bind our master fs into our local namespace so it gets exported when we run multiple workers across our compute grid.

A very simple example of this pattern is explained in the Simple Grid Tutorial Part 2. I export a named pipe with a provider process writing one line at a time to the pipe; then multiple worker processes running across the grid consume lines from the pipe as fast as they can do the work.

The mapreduce source files I've included in this lab are a concrete (rough and experimental) example of this pattern taken to the next level. The namespace it exports is the following,

  mapreduce/clone
  mapreduce/n
  mapreduce/n/ctl
  mapreduce/n/status

Each worker opens the clone file and gets a unique connection to the master process, represented by a numbered directory. The open clone file becomes an open file descriptor to the ctl file of the new connection. The worker reads messages from ctl describing new work, and it can write back messages about work completed. The master process will keep the status file up to date with the progress of the worker, analogous to prog(3).

An advantage of this approach over the simpler named pipe is that the master process knows exactly when the worker has closed the connection and knows how much work they have completed based on the messages written to the ctl file. It also provides a better interface to the user; The ps(1) command can easily be adapted to read the status files from the mapreduce namespace.

To try out some examples using mapreduce I need to provide a mapper and reducer function. I wrote a module interface for a mapper,

Mapper : module {
    map: fn(key, value: string, emit: chan of (string, string));
};

This takes a key and value and maps it to an intermediate key and value, which it emits on a channel; it may emit many intermediate key value pairs for a single input key value pair. Here's an implementation for a mapper that takes a string input, tokenizes it, and outputs the token and '1', which will be added later for a wordcount.

# the map function may not get the whole file in one go. maybe
# just a segment, or a line.
map(nil, value: string, emit: chan of (string, string))
{
 if(sys == nil)
  sys = load Sys Sys->PATH;
 if(str == nil)
  str = load String String->PATH;
 (nil, f) := sys->tokenize(value, 
               "[]{}()!@#$%^&*?><\":;.,|\\-_~`'+=/ \t\n\r");
 for ( ; f != nil; f = tl f) {
  ss := str->tolower(hd f);
  emit <-= (ss, "1");
 }
}

There is also an interface for a reducer,

Reducer : module {
    reduce: fn(key: string, input: chan of string, emit: chan of string);
};

This takes all the intermediate values for a key and emits a value. Here's the adder, used by the wordcount.

reduce(nil: string, v: chan of string, emit: chan of string)
{
 value := 0;
 while((s :=<- v) != nil)
  value += int s;
 emit <-= string value;
}

The mapper and reducer interfaces are known by a worker process that loads them on demand. An intermediate process that combines values of the same keys and sorts them is also implemented in the worker process (See the Google MapReduce paper for a good explanation.) This implementation of mapreduce knows only how to walk directory hierarchies and print the file names to all the worker processes. Here's an example of a mapreduce command line that counts words in all files below /lib/legal.

  % mkdir /mnt/mapreduce
  % mapreduce -M4 -R3 wordcount adder /lib/legal
  % ls /mnt/mapreduce
  /mnt/mapreduce/clone

Mapreduce should launch and manage all its own processes. However, for the code checked into this lab, to illustrate what is going on, I have it launching nothing. It just mounts the file system on /mnt/mapreduce. The arguments '-M4 -R3' say to expect 4 Mapper processes and 3 Reducer processes. As workers connect it will configure it to be a mapper or reducer depending on whether work remains. Therefore, after running the above command and doing a cat(1) on /mnt/mapreduce/clone we should see the config line then the pathnames for the first worker.

  % cat /mnt/mapreduce/clone
  worker -m -R 3 -d wordcount -i 1
  /lib/legal/GPL 0 17982
  ...

The pathnames are divided up among the workers as fast as they can process them. So in this implementation mapreduce functions almost the same as the named pipe in the simple grid tutorial. The cat of the first clone file will return all pathnames!

Mapreduce however is still expecting more workers. Cat the clone file three more times to see the input to the next 3 workers. The next cat after that you should see the config and input to the reducer. For example from a remote node,

  % rcmd ${nextcpu} cat /n/client/mnt/mapreduce/clone

Doing a listing on the /mnt/mapreduce path should show you the current workers connected (if any). After all reducers have disconnected, the mapreduce filesystem will report it's done and exit.

Lets run it again for real using the mapreduce worker processes.

  % mapreduce -m /mnt/mapreduce -M4 -R3 wordcount adder /lib
  % for i in 1 2 3 4 {mapreduce/worker /mnt/mapreduce/clone&}
  % for i in 1 2 3 {mapreduce/worker /mnt/mapreduce/clone&}

You should see the result files in /tmp/out.*

For the GSoC 2008 I suggested a project where the student implement a splitjoin file system. Create a coarse grained splitjoin service as defined loosely here (PDF) (see slides 18 on for fine grained task parallelism). This suggested implementation is really another concrete example of the gridfs pattern. It would allow control over how messages are passed round robin to all the workers. It would permit different configurations of how many to push to each node, how many to join from each node, how many commands to duplicate. E.g.,

filter | splitjoin -d10 -m5 -n3 {cmd} | filter 

creates 10 duplicates of the cmd, take input from a pipeline and distributes m=5 records at a time round robin to each node and join the output n=3 records at a time from each task back out to the pipeline.

Splitjoin would take care of launching the task, and monitoring the task for completion. (Ideally, it would interact with the registry to decide where to launch services.)

Because Plan 9/Inferno is not participating this year in GSoC I will probably have a crack at this.

FILES

lab 84 code

Tuesday, March 11, 2008

lab 83 - lcmd local cpu

NAME

lab 83 - lcmd local cpu

NOTES

While thinking of the Simple Grid Tutorial Part 1 and Part 2, I wondered whether I could implement the equivalent of rcmd(1) but for a local emu launched using os(1). For example,

 lcmd math/linbench 100

would launch a new emu, export the local fs to it through a pipe rather than a network socket, and run the command in that namespace. The idea seemed simple, no apparent obstacles, but it actually took me a couple of evenings to get it to work. So I'm posting it more because of the effort rather than its value.

First lets look at what rcmd does, ignoring the networking. Given its arguments it builds a string, calculates its length + 1, and writes the length then the string to a file descriptor, then exports the local namespace to the same file descriptor. Well that part is easy to do in sh(1). Here it is as a braced block assuming all work is done on file descriptor 1.

fn lcmd {
 load expr string
 args := $*
 s := sh -c ${quote $"args}
 echo ${expr ${len $"s} 1 + } 
 echo $s
 export / /fd/1
}

We can test that,

 lcmd {ls } | auxi/rstyxd

Now, instead of running rstyxd in the current VM, I want to run another instance and run in it that. This is where it gets complicated. You might think this might work,

 
 lcmd {ls} | os emu auxi/rstyxd

It doesn't because os treats stdin as read only, stdout as write only. Because export(1) needs to read and write on one file descriptor, and so does rstyxd(8), we need to setup extra pipes, both on the local end and the remote end.

Another problem presents itself in emu. At startup rstyxd will see /dev/cons as stdin. But I'd need to bypass the keyboard handling and get the direct stdin from the pipe. We see the answer to that in /dev,

% ls -l /dev/host*
--rw-r--r-- c 0 caerwyn caerwyn 0 Oct 30 22:43 /dev/hostowner
---w--w--w- c 0 caerwyn caerwyn 0 Oct 30 22:43 /dev/hoststderr
--r--r--r-- c 0 caerwyn caerwyn 0 Oct 30 22:43 /dev/hoststdin
---w--w--w- c 0 caerwyn caerwyn 0 Oct 30 22:43 /dev/hoststdout

This looks good but when I tried them they were not fully implemented in the current emu. The details are not interesting. I fixed that in the acme-sac tree and committed it.

Finally, we can build our full lcmd

fn lcmd {
 load std expr string
 pctl forkns
 args := $*
 s := sh -c ${quote $"args}
 bind '#|' /tmp/lpipe
 
 {
  echo ${expr ${len $"s} 1 + }  >/fd/0;  
  echo $s >/fd/0; 
  export / /fd/0
 } <>/tmp/lpipe/data    &
 
 os -d 'd:/acme-sac' d:/acme-sac/sys/Nt/386/bin/icell.exe -c1  sh -c '
  bind  ''#|'' /tmp/pipes; 
  cat /tmp/pipes/data > /dev/hoststdout& 
  cat /dev/hoststdin > /tmp/pipes/data& 
  auxi/rstyxd <>/tmp/pipes/data1 >[2] /dev/null;  
  echo halt > /dev/sysctl' < /tmp/lpipe/data1 >/tmp/lpipe/data1
}

Heh!

I'm using icell.exe built using the cell config from acme-sac. This is a really small emu configuration. The directories /tmp/pipes, /tmp/lpipe are assumed to exist.

From this definition we can replace rcmd with lcmd in the commands for rsplit and lk in the Grid Tutorial Part 2 and get emu tools for multicores without the setup required for the grid.

Tuesday, February 12, 2008

lab 82, again

NAME

lab 82, again - txt2iaf

NOTES

After thinking the matter a little, I finally wrote the txt2iaf app. This allows to use any text file, with one or two columns of data, to be converted to an iaf file that can be played using auplay(1). Now, my sound analysis goes something like this:

1. Record something using a microphone. For now, I only record stuff using some app in Windows (Sound Recorder) or Linux (arecord(1)). This is because Inferno has some issues when trying to record something from /dev/audio: an annoying tick every so often that wrecks my sound analysis intentions. Maybe, I can help to fix this problem, probably related with buffering.

2. Convert the wav file obtained to an iaf file, using wav2iaf(1).

3. Get the data from the iaf file to a text format, using iaf2txt(?).

4. Read data from the text file using any data analysis package.

5. Do whatever you want to with the data.

6. If you wish or need to, output the data in a text file.

7. Using txt2iaf(?), create an iaf file using the data in the text file. Enjoy your new song using auplay(1) ;-)

8. If you want to, convert your iaf file to a wav file using iaf2wav(?).

I also fixed some ugly code in iaf2txt(?) and iaf2wav(?) (really ugly, indeed). Please, update.

Cheers

FILES

lab 82 code

Wednesday, February 06, 2008

A couple of IAF utilities

NAME

lab 82 - iaf2txt and iaf2wav

NOTES

Currently, a couple of my friends and I are playing a little with human voices. For this purpose I wrote two applications to convert iaf files to plain text and to the wav audio format. Both apps support the standard iaf files as described in audio(6), except for the encoding: only PCM is supported for now.

Why in the world would one need a text file converted from an iaf? Well... text files are easier to handle with data analysis software like the R programming language. I know MATLAB supports working with wav files directly, but there are mainly two reasons I needed an iaf to txt converter:

1. I do no use MATLAB.

2. When I wrote the iaf2txt app there was not an iaf to wav converter.

Maybe R can handle wav files directly, but I do not know.

I am not really sure if I need a text to iaf converter, but I am thinking about this issue. So far I do not need one.

FILES

lab 82 code