google-generativeai
together
streamlit
trafilatura
markdown
lxml_html_clean