September 8, 2004 Uche Ogbuji Lately I've seen HTML parsing problems everywhere. One project needed a web crawler with specialized features provided through Python code that processed arbitrary HTML. There have also been several threads on mailing lists I frequent (including XML-SIG) featuring discussions of mechanisms for dealing with broken HTML by converting it to decent XHTML. This article foc