The simplest way to use extruct is to call extruct.extract(htmlstring, base_url=base_url) with an HTML string and an optional base URL. Let's try this on a webpage that uses all the supported syntaxes (RDFa with ogp). First, fetch the HTML using python-requests, then feed the response body to extruct:

>>> import extruct
>>> import requests
>>> import pprint
>>> from w3lib.html import g