Monday, January 09, 2006

lab 50 - structural expressions




Sometimes I write some code and then let it sit for ages because I can't get the lab write-up done. I think it's probably better to just throw it out there and get on with the next thing. This is one of those labs. Maybe I'll come back and edit it more if I end up using the tool.

I've been rereading Rob Pike's Structural Regular Expressions paper. It suggests variations on existing tools (sort, grep, diff and awk) where applying structural expressions might make the tools more versatile or change their character entirely.

Inferno is missing awk, and though it can be run externally, I still feel such a tool ought to exist within Inferno. Text manipulation, software tools, filters, regular expressions, pattern-action languages: these make up the core of what I think Inferno should be about. If there's a specific user I have in mind for Inferno it's me, and my main use for it is as a programming environment: one that integrates all the tools I use but implements the core set I need most often.

I've thought about porting awk to Inferno, but the paper suggests an alternative: replacing it with new tools built around the idea of structural expressions. Inferno already has an implementation of the sam language described in the paper: the Edit builtin command in Acme. The aim of this lab was to extract the Edit command into a standalone command that works more like a filter.

The commands implemented are x, y, g, v and p, which behave as they do in sam, and a, c and i, which don't. Since the file being operated on is not being edited, the edit commands change the text represented by dot in memory and print it to standard output. For example, a appends its argument text to dot and then writes the resulting dot text to stdout.

% xp ',x a/ foo/p'
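To make the filter semantics concrete, here is a rough Python sketch of the idea. It is not the Limbo code from the lab, and all the function names are made up for illustration. It assumes that only the text selected as dot gets printed, that a, i and c return the modified dot rather than editing a file, and that Python's re syntax stands in for sam's regular expressions.

```python
import re

# Hypothetical model of the filter-style commands: each command maps the
# current dot (a string) to the text that would be written to stdout.

def p(dot):
    """Print dot unchanged."""
    return dot

def a(text):
    """Append text to dot, then print the result."""
    return lambda dot: dot + text

def i(text):
    """Insert text before dot, then print the result."""
    return lambda dot: text + dot

def c(text):
    """Replace dot with text, so c acts like a plain print of its argument."""
    return lambda dot: text

def g(pattern, command):
    """Guard: run command on dot only if dot contains a match."""
    return lambda dot: command(dot) if re.search(pattern, dot) else ""

def v(pattern, command):
    """Inverse guard: run command on dot only if dot contains no match."""
    return lambda dot: "" if re.search(pattern, dot) else command(dot)

def x(pattern, command):
    """Loop: set dot to each match of pattern in turn and run command on it."""
    def run(dot):
        return "".join(command(m.group(0)) for m in re.finditer(pattern, dot))
    return run

if __name__ == "__main__":
    text = "alpha\nbeta\n"
    # Append " foo" and a newline to every run of non-newline characters.
    print(x(r"[^\n]+", a(" foo\n"))(text), end="")
    # Print only the lines that start with "b".
    print(x(r"[^\n]*\n", g(r"^b", p))(text), end="")
```

Composing the commands as nested functions mirrors the way x and g take a command as their argument, which is most of what makes the language work as a pipeline stage.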

To try out the language I reimplemented the two awk scripts that are used when building emu, mkdevlist and mkdevc. You can compare the implementation with the awk scripts in the Inferno distribution under /emu/port/(mkdevc mkdevlist). Note that the change in the language semantics means that c behaves like a print command.

This tool is potentially a mistake because the same 'sam' language means something different in the new context. I think it would be better to go further and try to build the awk-like language from the paper.



LiteStar said...

Where did you find the 'Structural Regular Expressions' paper?

caerwyn said...

Here it is: structural expressions

LiteStar said...

Thank you very much. Searching Google for the paper only turned up dead links or references. Thanks again.

Darren Bane said...

Just saw this on 9fans, and figured it might be useful to future readers. Alistair G. Crooks wrote ssam, a stream-oriented version of sam, and you can download it at