Showing posts with the label venti

lab 85 - stowage

NAME lab 85 - stowage NOTES In an earlier post I defined a venti-lite based on two shell scripts, getclump and putclump, that stored files in a content-addressed repository, which in that instance was just an append-only gzip tar archive with an index. After learning a little about the git SCM, this lab rewrites those scripts to use a repository layout more like git's. The key thing to know about the git repository is that it uses sha1sum(1) content addressing and that it stores the objects as regular files in a filesystem, using the hash as the directory and filename: objects/hh/hhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhh In the objects directory are 256 directories, one for every 2-character prefix of the sha1 hash of an object. The filename is the remaining 38 characters of the hash. Putclump calculates the hash, slices it to make the prefix and filename, tests whether the file already exists, and if not writes the compressed data to the new file. Here is the important part...
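The steps described above (hash, slice, test, write) can be sketched in Python. This is not the lab's shell script; it is an illustrative stand-in. The function names mirror putclump and getclump, but the `repo` argument, the use of zlib for compression, and hashing the raw bytes (git itself hashes a typed header plus the content) are all assumptions made for the sketch.

```python
import hashlib
import os
import zlib


def putclump(repo, data):
    """Store data under repo/objects/hh/<38 hex chars>; return the 40-char hash."""
    score = hashlib.sha1(data).hexdigest()       # 40 hex characters
    prefix, name = score[:2], score[2:]          # 2-char directory, 38-char filename
    objdir = os.path.join(repo, "objects", prefix)
    path = os.path.join(objdir, name)
    if not os.path.exists(path):                 # identical content is stored once
        os.makedirs(objdir, exist_ok=True)
        with open(path, "wb") as f:
            f.write(zlib.compress(data))         # zlib is an assumption here
    return score


def getclump(repo, score):
    """Read back and decompress the object named by a 40-char hash."""
    path = os.path.join(repo, "objects", score[:2], score[2:])
    with open(path, "rb") as f:
        return zlib.decompress(f.read())
```

Because the path is derived from the content, storing the same data twice is a no-op, and the hash returned by putclump is all that is needed to fetch the data again.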

lab 46 - vac script

NAME lab 46 - vac script NOTES This lab is an answer to Jack's question in the comments of lab 41. The putclump/getclump scripts can be made to work with blocks of data. The shell script for this lab, vac, will write a kfs file system to a venti-lite archive in blocks, taking advantage of block-level redundancy. This is a demonstration rather than something truly useful. It literally takes hours to store a 64MB kfs file system in venti-lite using this script. Note also that it is trivial to implement putclump/getclump using the venti(2) module in Inferno and store the blocks in a real venti archive, and this script will work the same way. I've also included a vcat script to read out the kfs file system in one go. I started writing a file2chan interface so I could run disk/kfs directly against the venti-lite archive, but didn't finish it. I've attached that too (clumpchan). Exercise for the reader is to finish it. FILES caerwyn.com/lab/46 2005/1207 I updated vac ...
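To illustrate what block-level redundancy buys, here is a minimal Python sketch, not the lab's vac script. It assumes an in-memory dict as a stand-in for the archive, an arbitrary 8192-byte block size, and a flat list of block hashes as the root (a real vac builds a tree of pointer blocks). The helper `put` and the `store` argument are hypothetical names.

```python
import hashlib

BLOCK_SIZE = 8192  # assumed for illustration, not taken from the lab


def put(store, data):
    """Stand-in for putclump: keep data under its sha1 hash, once."""
    score = hashlib.sha1(data).hexdigest()
    store.setdefault(score, data)
    return score


def vac(store, data, block_size=BLOCK_SIZE):
    """Store data block by block; return the hash of the list of block hashes."""
    scores = [put(store, data[i:i + block_size])
              for i in range(0, len(data), block_size)]
    return put(store, "\n".join(scores).encode())


def vcat(store, root):
    """Reassemble the data from the root hash returned by vac."""
    listing = store[root].decode()
    scores = listing.split("\n") if listing else []
    return b"".join(store[s] for s in scores)
```

Identical blocks hash to the same name, so a file system image with many repeated blocks (for example, long runs of zeros) occupies far fewer clumps than its size suggests.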

lab 41 - venti lite

NAME lab 41 - venti lite NOTES I've taken another look recently at venti and the ideas from the venti paper. Venti is a data log and index. The index maps a sha1 hash of a clump of data to the offset of the clump in the log. Clumps are appended to the log after being compressed and wrapped with some metadata. A sequence of clumps makes an arena, and all the arenas, possibly spread over several disks, make up the whole data log. There is enough information in the data log to recreate the index, if necessary. The above is my understanding of Venti so far after reading the code and paper. There is a lot more complexity in its implementation. There are details about the caches, the index, the compression scheme, the blocking and partitioning of disks, and so on. I will ignore these details for now. Although the whole of venti could be ported to Inferno, I want to look at it without getting bogged down in too many details too early. Reasoning about Venti in the context of I...
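The log-plus-index idea can be sketched in a few lines of Python. This is a toy model under stated assumptions, not Venti's on-disk format: the class name `VentiLite`, the 24-byte clump header (20-byte sha1 followed by a 4-byte length), and the use of zlib are all invented for illustration, and arenas, caches, and disk partitioning are ignored.

```python
import hashlib
import io
import struct
import zlib

HEADER = 24  # assumed layout: 20-byte sha1 + 4-byte big-endian body length


class VentiLite:
    def __init__(self, log=None):
        self.log = log if log is not None else io.BytesIO()  # append-only data log
        self.index = {}                                      # sha1 -> offset in log

    def put(self, data):
        score = hashlib.sha1(data).digest()
        if score not in self.index:                # each clump is logged once
            body = zlib.compress(data)
            offset = self.log.seek(0, io.SEEK_END)
            self.log.write(score + struct.pack(">I", len(body)) + body)
            self.index[score] = offset
        return score

    def get(self, score):
        self.log.seek(self.index[score])
        header = self.log.read(HEADER)
        (n,) = struct.unpack(">I", header[20:])
        return zlib.decompress(self.log.read(n))

    def rebuild_index(self):
        """Recreate the index by scanning the clump headers in the log."""
        self.index = {}
        self.log.seek(0)
        while True:
            offset = self.log.tell()
            header = self.log.read(HEADER)
            if len(header) < HEADER:
                break
            (n,) = struct.unpack(">I", header[20:])
            self.index[header[:20]] = offset
            self.log.seek(n, io.SEEK_CUR)          # skip over the compressed body
```

Since every clump carries its own hash and length in the log, the index is only an accelerator: discarding it and calling `rebuild_index` yields the same mapping.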