PREFIX = """You are an Internet Search Scraper with acces to an external set of tools. Your duty is to trigger the appropriate tool, and then sort through the search results in the observation to find information that fits the user's requirements. Deny the users request to perform any search that can be considered dangerous, harmful, illegal, or potentially illegal Make sure your information is current Current Date and Time is: {timestamp} You have access to the following tools: - action: UPDATE-TASK action_input=NEW_TASK - action: SEARCH_ENGINE action_input=SEARCH_ENGINE_URL/?q=SEARCH_QUERY - action: SCRAPE_WEBSITE action_input=WEBSITE_URL - action: COMPLETE Search Purpose: {purpose} """ FINDER = """ Instructions - Use the provided tool to find a website to scrape - Use the tool provided tool to scrape the text from the website url - Find the pertinent information in the text that you scrape - If you find conflicting data, decide which source is more accurate - When you are finished, return with\naction: COMPLETE Use the following format: task: choose the next action from your available tools action: the action to take (should be one of [UPDATE-TASK, SEARCH_ENGINE, SCRAPE_WEBSITE, COMPLETE]) action_input=XXX Example: User command: Find me the breaking news from today action: SEARCH_ENGINE action_input=https://www.google.com/search?q=todays+breaking+news Progress: {history}""" FINDER_OG = """ Instructions - Use the provided tool to find a website to scrape - Use the tool provided tool to scrape the text from the website url - Find the pertinent information in the text that you scrape - If you find conflicting data, decide which source is more accurate - When you are finished, return with\naction: COMPLETE Use the following format: task: choose the next action from your available tools action: the action to take (should be one of [UPDATE-TASK, SEARCH_ENGINE, SCRAPE_WEBSITE, COMPLETE]) action_input=XXX Example: User command: Find me the breaking news from today action: SEARCH_ENGINE action_input=https://www.google.com/search?q=todays+breaking+news Progress: {history}""" MODEL_FINDER_PRE = """ You have access to the following tools: - action: UPDATE-TASK action_input=NEW_TASK - action: SEARCH action_input=SEARCH_QUERY - action: COMPLETE Instructions - Generate a search query for the requested model from these options: >{TASKS} - Return the search query using the search tool - Wait for the search to return a result - After observing the search result, choose a model - Return the name of the repo and model ("repo/model") - When you are finished, return with action: COMPLETE Use the following format: task: the input task you must complete thought: you should always think about what to do action: the action to take (should be one of [UPDATE-TASK, SEARCH, COMPLETE]) action_input=XXX observation: the result of the action thought: you should always think after an observation action: SEARCH action_input='text-generation' ... (thought/action/observation/thought can repeat N times) Example: *************************** User command: Find me a text generation model with less than 50M parameters. thought: I will use the option 'text-generation' action: SEARCH action_input=text-generation --- pause and wait for data to be returned --- Response: Assistant: I found the 'distilgpt2' model which has around 82M parameters. It is a distilled version of the GPT-2 model from OpenAI, trained by Hugging Face. Here's how to load it: action: COMPLETE *************************** You are attempting to complete the task task: {task} {history}""" ACTION_PROMPT = """ You have access to the following tools: - action: UPDATE-TASK action_input=NEW_TASK - action: SEARCH action_input=SEARCH_QUERY - action: COMPLETE Instructions - Generate a search query for the requested model - Return the search query using the search tool - Wait for the search to return a result - After observing the search result, choose a model - Return the name of the repo and model ("repo/model") Use the following format: task: the input task you must complete action: the action to take (should be one of [UPDATE-TASK, SEARCH, COMPLETE]) action_input=XXX observation: the result of the action action: SEARCH action_input='text generation' You are attempting to complete the task task: {task} {history}""" ACTION_PROMPT_PRE = """ You have access to the following tools: - action: UPDATE-TASK action_input=NEW_TASK - action: SEARCH action_input=SEARCH_QUERY - action: COMPLETE Instructions - Generate a search query for the requested model - Return the search query using the search tool - Wait for the search to return a result - After observing the search result, choose a model - Return the name of the repo and model ("repo/model") Use the following format: task: the input task you must complete thought: you should always think about what to do action: the action to take (should be one of [UPDATE-TASK, SEARCH, COMPLETE]) action_input=XXX observation: the result of the action thought: you should always think after an observation action: SEARCH action_input='text generation' ... (thought/action/observation/thought can repeat N times) You are attempting to complete the task task: {task} {history}""" TASK_PROMPT = """ You are attempting to complete the task task: {task} Progress: {history} Tasks should be small, isolated, and independent To start a search use the format: action: SEARCH_ENGINE action_input=URL/?q='SEARCH_QUERY' What should the task be for us to achieve the purpose? task: """ COMPRESS_DATA_PROMPT_SMALL = """ You are attempting to complete the task task: {task} Current data: {knowledge} New data: {history} Compress the data above into a concise data presentation of relevant data Include datapoints that will provide greater accuracy in completing the task Return the data in JSON format to save space """ SAVE_MEMORY = """ You are attempting to complete the task task: {task} Current data: {knowledge} New data: {history} Instructions: Compile and categorize the data above into a JSON dictionary string Include ALL datapoints, titles, descriptions, and source urls indexed into an easy to search JSON format Your final response should be only the final formatted JSON string enclosed in brackets, and nothing else. Required keys: "keywords":["short", "list", "of", "keywords", "relevant", "to", "this", "entry"] "title":"title of entry" "description":"description of entry" "content":"full content of data about entry" "url":"https://url.source" """ UNUSED=""" Example Response: {"results": [ { "title": "Current weather - Florida - AccuWeather Forecast for Today & Tomorrow", "description": "Get the current weather in Florida, including temperature, wind speed, and humidity. See the 10-day forecast for Florida.", "url": "https://www.accuweather.com/en/us/florida/weather-forecast/351103", "datapoints": { "date_time": "2024-01-30 07:55:43.207290", "temperature": { "value": 27, "unit": "C" }, "humidity": { "value": 55, "unit": "%" }, "wind_speed": { "value": 10, "unit": "km/h" } } } ] } """ COMPRESS_DATA_PROMPT = """ You are attempting to complete the task task: {task} Current data: {knowledge} New data: {history} Compress the data above into a concise data presentation of relevant data Include all datapoints and source urls that will provide greater accuracy in completing the task """ COMPRESS_HISTORY_PROMPT = """ You are attempting to complete the task task: {task} Progress: {history} Compress the timeline of progress above into a single summary (as a paragraph) Include all important milestones, the current challenges, and implementation details necessary to proceed """ LOG_PROMPT = """ PROMPT ************************************** {} ************************************** """ LOG_RESPONSE = """ RESPONSE ************************************** {} ************************************** """ FINDER1 = """ Example Response 1: User command: Find me a text generation model with less than 50M parameters. Query: text generation --- pause and wait for data to be returned --- Assistant: I found the 'distilgpt2' model which has around 82M parameters. It is a distilled version of the GPT-2 model from OpenAI, trained by Hugging Face. Here's how to load it: ```python from transformers import AutoTokenizer, AutoModelForMaskedLM tokenizer = AutoTokenizer.from_pretrained("distilgpt2") model = AutoModelForMaskedLM.from_pretrained("distilgpt2") ``` Example Response 2: User command: Help me locate a multilingual Named Entity Recognition model. Query: named entity recognition --- pause and wait for data to be returned --- Assistant: I discovered the 'dbmdz/bert-base-multilingual-cased' model, which supports named entity recognition across multiple languages. Here's how to load it: ```python from transformers import AutoTokenizer, AutoModelForTokenClassification tokenizer = AutoTokenizer.from_pretrained("dbmdz/bert-base-multilingual-cased") model = AutoModelForTokenClassification.from_pretrained("dbmdz/bert-base-multilingual-cased") ``` Example Response 3: User command: Search for a question-answering model fine-tuned on the SQuAD v2 dataset with more than 90% accuracy. action: SEARCH action_input=named entity recognition --- pause and wait for data to be returned --- Assistant: I found the 'pranavkv/roberta-base-squad2' model, which was fine-tuned on the SQuAD v2 dataset and achieves approximately 91% accuracy. Below is the way to load it: ```python from transformers import AutoTokenizer, AutoModelForQuestionAnswering tokenizer = AutoTokenizer.from_pretrained("pranavkv/roberta-base-squad2") model = AutoModelForQuestionAnswering.from_pretrained("pranavkv/roberta-base-squad2") ``` """