Sunday, 14 March 2010

Unix Text

Why oh why do we all reinvent the wheel so often? See AutoMenu for an example of my struggles doing just that; better still, ignore that and read this instead.

I spent quite a while trying to get ESC (ECMAScript compressor) to work on Win2k, only to find that it fails if the WShell DLL gets unregistered. I have no idea why that happens, but I prefer my code to be less dependent on complicated libraries.

So, back to basics: do it with AWK instead, or rather GAWK.

HTML parsing and rewriting with GAWK

Goal: read an HTML file and replace certain text strings with other text strings.
Tools: GAWK.
Files: input HTML, dictionary of regular expressions to find and replace, output file.
What we actually need to do: split the file into tag and non-tag (text) pieces, spit out the tags unchanged, replace all found keys with their values in the text parts, and spit out the result.

How

  • Use < as the record separator.

For each record:

  • Split the record on >.
  • Print a <, then the first field, then a >.
  • For each key in the dictionary, replace any occurrences in the second field with the replacement expression.
  • Print the result.
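
The steps above can be sketched roughly as follows. This is only a minimal sketch, not a finished script: the file names (rewrite.awk, dict.txt, page.html, out.html), the variable name dict, and the tab-separated "regex, then replacement" dictionary format are all assumptions made for illustration. It uses index() and substr() instead of a literal split on >, so that a stray > inside the text is not lost.

```sh
# Assumed file names: rewrite.awk, dict.txt, page.html, out.html.
cat > rewrite.awk <<'EOF'
BEGIN {
    # Load the dictionary: one "regex<TAB>replacement" pair per line (assumed format).
    while ((getline line < dict) > 0) {
        tab = index(line, "\t")
        if (tab > 0)
            repl[substr(line, 1, tab - 1)] = substr(line, tab + 1)
    }
    close(dict)
    RS = "<"    # each record after the first is "tag>text"
}
# Record 1 is whatever precedes the first "<" (often nothing).
NR == 1 { printf "%s", $0; next }
{
    gt = index($0, ">")
    if (gt == 0) { printf "<%s", $0; next }   # a "<" that opens no tag
    tag  = substr($0, 1, gt - 1)              # the "first field"
    text = substr($0, gt + 1)                 # everything after the first ">"
    for (key in repl)                         # iteration order is unspecified in awk
        gsub(key, repl[key], text)
    printf "<%s>%s", tag, text
}
EOF

# Sample dictionary and input, made up for this example.
printf 'world\tplanet\n' > dict.txt
printf '<p>Hello world</p><a href="world.html">world</a>\n' > page.html

awk -v dict=dict.txt -f rewrite.awk page.html > out.html
cat out.html
```

With these sample files the output should be `<p>Hello planet</p><a href="world.html">planet</a>`: the word is replaced in the text, while the href inside the tag is left alone. Note that gsub() treats & in the replacement as "the matched text", so replacements containing a literal & would need escaping.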

This will not work very well, though: a replacement expression may itself contain HTML, and we do not want later substitutions to alter that HTML. So we need to do the substitution for one key at a time and treat each result as a new HTML document, or document fragment, to be split into tags and text again.

Luckily AWK can have functions.
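
One way the one-key-at-a-time idea might look with a function is sketched below. It is an illustration under assumptions, not the script I ended up with: the function name apply_one, the file names (fragment.awk, links.txt, sample.html) and the tab-separated dictionary format are all hypothetical. The whole input is read into one string, and for each key in turn the string is split into tags and text again, so HTML inserted by an earlier key counts as tags for the later ones.

```sh
# Assumed file names: fragment.awk, links.txt, sample.html.
cat > fragment.awk <<'EOF'
# Apply one pattern to the text parts of an HTML fragment; tags are copied as is.
# The names after the extra spaces are local variables (the usual awk idiom).
function apply_one(html, pat, rep,    out, lt, gt, text) {
    out = ""
    while ((lt = index(html, "<")) > 0) {
        text = substr(html, 1, lt - 1)    # text before the next tag
        gsub(pat, rep, text)
        out = out text
        html = substr(html, lt)           # now starts with "<"
        gt = index(html, ">")
        if (gt == 0) return out html      # unterminated tag: copy the rest unchanged
        out = out substr(html, 1, gt)     # copy the tag unchanged
        html = substr(html, gt + 1)
    }
    gsub(pat, rep, html)                  # trailing text after the last tag
    return out html
}
BEGIN {
    # Keep the dictionary in file order: pat[1..n], rep[1..n].
    while ((getline line < dict) > 0) {
        tab = index(line, "\t")
        if (tab > 0) {
            n++
            pat[n] = substr(line, 1, tab - 1)
            rep[n] = substr(line, tab + 1)
        }
    }
    close(dict)
}
{ doc = doc $0 "\n" }                     # slurp the whole input into one string
END {
    for (i = 1; i <= n; i++)              # one key at a time, re-splitting each time
        doc = apply_one(doc, pat[i], rep[i])
    printf "%s", doc
}
EOF

# Sample dictionary and input, made up for this example.
printf 'GAWK\t<a href="gawk.html">GAWK</a>\ngawk\tgnu-awk\n' > links.txt
printf '<p>Use GAWK, not gawk.</p>\n' > sample.html

awk -v dict=links.txt -f fragment.awk sample.html
```

With these sample files the output should be `<p>Use <a href="gawk.html">GAWK</a>, not gnu-awk.</p>`. The second key changes the plain word gawk but leaves the gawk.html that the first key put inside a tag untouched, which is the point of re-parsing between keys.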

I found a ghastly but usable piece of code on http://www.dynamicdrive.com that shows a menu. This lets me concentrate on the part that defines the menu; I'll deal with making the run-time part more elegant at a later date, perhaps.

The basic idea of the showmenu function in the dynamicdrive JavaScript is to alter the innerHTML of a div. Each menu consists of a div that contains an anchor; the mouseover event executes the showmenu function. One of the arguments to showmenu is the HTML fragment that is to go into another div, which will be positioned where the mouse is. To make this easier to use I created a helper function called makemenu, which takes a list of arguments that are pairs of URLs and menu items and returns a string containing the HTML for a list of divs containing href anchors made from the given URLs and text. That string is the HTML fragment that showmenu puts into a div to show as the menu.

See AutoMenu for more details.
