extruct

extruct is a library for extracting embedded metadata from HTML markup.

Currently, extruct supports:

- W3C's HTML Microdata
- embedded JSON-LD
- Microformat via mf2py
- Facebook's Open Graph
- (experimental) RDFa via rdflib
- Dublin Core Metadata (DC-HTML-2003)

The microdata algorithm is a revisit of this Scrapinghub blog post showing how to use EXSLT extensions.

Installation

    pip install extruct

Usage