Extraction of structured data from the Common Crawl: schema.org annotations, web tables, and hyperlink graphs