pragmatic parsing and emitting of HTML using SXML and SHTML
HtmlPrag provides permissive HTML parsing and emitting capability to Scheme programs. The parser is useful for software agent extraction of information from Web pages, for programmatically transforming HTML files, and for implementing interactive Web browsers.
HtmlPrag emits 'SHTML,' which is an encoding of HTML in SXML, so that conventional HTML may be processed with XML tools such as SXPath. Like Oleg Kiselyov's SSAX-based HTML parser, HtmlPrag provides a permissive tokenizer, but also attempts to recover structure. HtmlPrag also includes procedures for encoding SHTML in HTML syntax.
$ akku install wak-htmlprag
$ .akku/env
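
The round trip described above, from permissive parsing to SHTML and back to HTML syntax, might look like the following minimal sketch. It assumes the Akku package exposes HtmlPrag as the R6RS library (wak htmlprag), which this text does not state, and that it exports HtmlPrag's html->shtml and shtml->html procedures. The input string is a made-up example with unclosed tags.

```scheme
;; Assumption: the library name (wak htmlprag) is inferred from the
;; package name wak-htmlprag; adjust it if your installation differs.
(import (rnrs)
        (wak htmlprag))

;; Parse deliberately sloppy HTML (unclosed <p> and <b>) into SHTML.
;; html->shtml accepts a string or an input port.
(define doc
  (html->shtml "<html><body><p>Hello, <b>world</body></html>"))

;; SHTML is ordinary list structure, so it can be inspected with write.
(write doc)
(newline)

;; Encode the recovered tree back into HTML syntax as a string.
(display (shtml->html doc))
(newline)
```

Because the parsed result is plain SXML-style list data, it can be passed to other SXML tools such as SXPath before being emitted again.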