python-readability

Given an HTML document, it pulls out the main body text and cleans it up. This is a Python port of a Ruby port of arc90’s Readability project.

Installation

It’s easy using pip; just run:

    $ pip install readability-lxml

Usage

    >>> import requests
    >>> from readability import Document
    >>> response = requests.get('http://example.com')
    >>> doc = Document(response.text)
    >>> doc.title()
    '