[NEW] Internet access for Assistants

#385
by nsarrazin HF staff - opened
Hugging Chat org
edited Mar 25

Hey! We have just released an update to HuggingChat Assistants that allows you to connect them to the Internet to get more relevant and interactive answers. When you create or edit an Assistant, you will now see an option for Internet access. It can have four settings: enabled, domain search, specific links, and disabled.

  • Enabled is the same web search we currently use in HuggingChat. Use it for generic assistants that can have conversations about many domains.
  • Domains search allows you to specify a domain name or even part of a website that the web search will crawl to try to find relevant information. Useful if you want to have an Assistant that can search and use content from your website or a particular news outlet you like, for example.
  • Specific Links lets you specify a direct URL to a web page or plain text document that you want to pass to the Assistant. This is very useful for talking to your text documents, adding extra context to your role-playing game, or passing any arbitrary data from your web server to the Assistant.
  • Dynamic links allows you to use template variables such as {{url=https://example.com/path}} to insert dynamic content into your prompt. A GET request is made to each specified URL on every inference.
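
To make the Dynamic links idea concrete, here is a minimal sketch of how a {{url=...}} placeholder could be expanded before a prompt is sent to the model. This is not the HuggingChat implementation: the names `URL_TEMPLATE`, `fetch_text` and `expand_dynamic_links` are hypothetical, and the exact placeholder grammar is assumed from the single example above.

```python
import re
from typing import Callable
from urllib.request import urlopen

# Assumed placeholder form: {{url=https://example.com/path}}
URL_TEMPLATE = re.compile(r"\{\{\s*url=(\S+?)\s*\}\}")


def fetch_text(url: str, timeout: float = 10.0) -> str:
    """Fetch a URL with a plain GET request and decode the body as text."""
    with urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")


def expand_dynamic_links(prompt: str, fetch: Callable[[str], str] = fetch_text) -> str:
    """Replace every {{url=...}} placeholder with the text fetched from that URL."""
    return URL_TEMPLATE.sub(lambda match: fetch(match.group(1)), prompt)
```

Passing the fetch function as a parameter lets you try the substitution without network access, for example `expand_dynamic_links("Menu: {{url=https://example.com/path}}", lambda url: "soup")` returns `"Menu: soup"`.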

Enabled

This is equivalent to enabling the web search toggle in HuggingChat. The model generates a query from your question, uses it to search the web, and parses the results to improve the answer. You can't restrict which domains or links are used in this mode.

Domains search

In this mode you can restrict the domains used in the web search. This is equivalent to appending site:example.com to the web query. For example, you could make an assistant that only uses Wikipedia as a source.

You can also be more specific. For example, you could have an assistant that only knows about the docs for our library diffusers.
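
As a rough illustration of what "appending site:example.com to the web query" means, the sketch below builds a restricted query string. The function name `restrict_query` is hypothetical, the domain values in the usage note are illustrative, and combining several domains with OR is an assumption about common search-engine syntax rather than something the post confirms.

```python
def restrict_query(query: str, domains: list[str]) -> str:
    """Append site: filters so that results come only from the given domains."""
    cleaned = [d.strip() for d in domains if d.strip()]
    if not cleaned:
        # No restriction requested: leave the query unchanged.
        return query
    if len(cleaned) == 1:
        return f"{query} site:{cleaned[0]}"
    # Assumption: several domains are OR-ed together inside parentheses.
    sites = " OR ".join(f"site:{d}" for d in cleaned)
    return f"{query} ({sites})"
```

For instance, `restrict_query("history of Rome", ["wikipedia.org"])` returns `"history of Rome site:wikipedia.org"`. A path such as `huggingface.co/docs/diffusers` can be passed the same way to narrow the search to part of a website.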

Specific Links

This allows you to specify up to 10 URLs whose content will be added directly to the context for better results. We support both HTML and plain text content such as markdown! For example, if you want your assistant to know about the chat-ui repository README file, you can add a link to the markdown file.

Or, if you want an assistant that knows about the current top news headlines, you can add a link to a page that lists them.

This is a very flexible setting because you have complete control over what is passed to the context. For example, you could create your own web server in a Space that returns a different country name every day and use it to make a country guessing game where every user gets the same new country every day. Also note that the amount of content that fits in the context is limited, so long web pages will be cut off and adding fewer or shorter pages may perform better.
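
The country-of-the-day idea could look roughly like the sketch below, written with only the Python standard library. Everything here is an assumption for illustration: the `COUNTRIES` list, the names `country_for`, `CountryHandler` and `serve`, and port 7860 are not from the original post. The key point is that the country is derived from the date, so every request on the same day gets the same plain-text answer.

```python
import datetime
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical sample list; a real game would use a much longer one.
COUNTRIES = ["Brazil", "Canada", "Japan", "Kenya", "Norway", "Peru", "Vietnam"]


def country_for(day: datetime.date) -> str:
    """Pick a country deterministically from the date, so all users see the same one."""
    return COUNTRIES[day.toordinal() % len(COUNTRIES)]


class CountryHandler(BaseHTTPRequestHandler):
    def do_GET(self) -> None:
        # Use UTC so the day changes at the same moment for every user.
        today = datetime.datetime.now(datetime.timezone.utc).date()
        body = country_for(today).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port: int = 7860) -> None:
    """Serve the country of the day until interrupted (the port is an assumption)."""
    HTTPServer(("0.0.0.0", port), CountryHandler).serve_forever()
```

Calling `serve()` starts the server. The URL it is reachable at would then be added as a Specific Link, and the Assistant's system prompt would tell the model to keep the country secret while the user guesses.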

Safety & Trust

Because web-connected assistants rely on external sources of information, you should never trust their answers without verifying them yourself, and you should always manually check which sources they use; this information is publicly available on the Assistant Settings page. Web-connected assistants have a special icon so you can easily identify them.

Let us know what you think of the feature, and feel free to share your new assistants in this thread! 🤗

It doesn't look like it's been pushed to huggingface.co/chat, is this a bug?

Hugging Chat org

@EveryPizza Feel free to check again! Should be there now

Hugging Chat org

this is incredible 🔥

Love it! Great job team!

Can the specific domains option be used in some clever way with a CYOA game, where the user mostly inputs numbers?
I tried it here, but the web search fails: https://hf.co/chat/assistant/65c80b2288d76ebc0fc357ca

I'm afraid that the Domains search option is not working correctly.

I have just created an assistant for the records in the President John F. Kennedy Assassination Records Collection

https://hf.co/chat/assistant/65f4604731b538d22b432df7

I need it to answer based only on material found on archives.gov subdomains but, alas, it returns text from Reddit, Wikipedia and other sources.

Hugging Chat org
edited Mar 15

@emilios Can you share a conversation where this doesn't work for you? It works for me: https://hf.co/chat/r/DN9CIVy

This allows users to customize their search views.

Could this, however, indicate that queries are sent to an external server or are privacy limitations set in place?

Hey, the internet access for assistants is working very well, but I would like to know how I can instruct the assistant to search in a Git repo. When I use the Llama 3.1 70B model with the "Url Fetch" feature and ask something about the codebase like:

"Do you know that this library supports any HashMap structure? Please search in codeberg.org/user/repo."

It's smart enough to perform a search in the codebase like: https://codeberg.org/user/repo/search?q=HashMap&ref=master

And pull out the relevant information. But I'm not able to instruct the same way for the assistant I've created.
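
For reference, the search URL quoted above follows a simple pattern that can be rebuilt from its parts. The sketch below only reproduces that pattern; the function name `repo_search_url` is hypothetical, the default `ref` of "master" is taken from the example URL, and nothing here is a HuggingChat feature.

```python
from urllib.parse import urlencode


def repo_search_url(repo_url: str, term: str, ref: str = "master") -> str:
    """Build a code-search URL of the form <repo>/search?q=<term>&ref=<ref>."""
    # urlencode escapes spaces and special characters in the search term.
    query = urlencode({"q": term, "ref": ref})
    return f"{repo_url.rstrip('/')}/search?{query}"
```

With the values from the comment, `repo_search_url("https://codeberg.org/user/repo", "HashMap")` returns `"https://codeberg.org/user/repo/search?q=HashMap&ref=master"`.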

Hi folks, the "Default" setting for the assistant seems to be searching the web now. It usually only happens on my second example prompt. The UI says the default option is no web search. I'm confused as to why my assistant is now searching stuff.

The same is happening here with my assistant.
I managed to get around it by stating "Only use web search when asked to" in the system prompt, which seems to work for now.

It's happening to me too, more so with a custom assistant. Add "Do not search the web" at the end of prompts as a workaround for now.

It's happening with command-r-plus.

I hope the developers will give due consideration to integrating the 'Fetch URL' tool into the Custom Assistants creation process. This tool would be highly beneficial for users who, like myself, have created assistants designed to summarize content from specific URLs. At present, the 'Web Search' tool is the only option, but it often includes external sources in its summaries. This can be problematic as it may compromise the accuracy of the information, drawing from sources beyond the intended URL content.

Hugging Chat org

@LostSpirit @louay01 @pearsonkyle Should be fixed now! There was an issue in the backend code.

I just want to express my appreciation to the developers and to whoever's servers this is running on. You change my life for the better.
