Since version 2.0, lxml comes with a dedicated Python package for dealing with HTML: lxml.html. It is based on lxml's HTML parser, but provides a special Element API for HTML elements, as well as a number of utilities for common HTML processing tasks.

Parsing HTML fragments

There are several functions available to parse HTML:

parse(filename_url_or_file): Parses the named file or url, or if the obj