July 6, 2003 | Fredrik Lundh The TidyHTMLTreeBuilder parser can read (almost) arbitrary HTML files, and turn them into well-formed element trees. This parser uses a library version of Dave Raggett’s HTML Tidy utility to fix any problems with the HTML before converting it to XHTML (the XML version of HTML). Note: If you don’t want to (or cannot) install binary Python extensions, you can use the Tid