lab 37 - Geryon's mapreduce
NAME
lab 37 - Geryon's mapreduce

NOTES
I have a registry of kfs disks and cpus in my cluster, and the registry is mounted on every node. On top of this infrastructure I've written some shell scripts to simulate a MapReduce function.

I want to quantize and bound the data and the computation. To that end, a process works on only one 64MB disk for input, and may write to another disk for output. The splitting of the data is also crucial to the parallelization: each input disk can be handled concurrently on any cpu node in the cluster.

The namespace for a process is built by finding resources in the registry. Because there are several cpus available to run a command block, I choose the cpu in round robin, using a shell function that populates a list of the available cpu nodes and takes the head of the list with each subsequent call until the list is empty. [1]

	cpulist=()
	subfn nextcpu {
		if {~ $#cpulist 0} {
			cpulist=`{ndb/regquery -n svc rstyx}
		}
		result = ${hd $cpulist}
		...