Mert

Sengil

https://www.linkedin.com/in/mertsengil/

AI & ML interests

LLMs

Recent Activity

Reacted to gabrielchua's post with 👀 1 day ago

Sharing my first paper!

Large Language Models (LLMs) are powerful, but they're prone to off-topic misuse, where users push them beyond their intended scope. Think harmful prompts, jailbreaks, and misuse. So how do we build better guardrails?

Traditional guardrails rely on curated examples or classifiers. The problem?
⚠️ High false-positive rates
⚠️ Poor adaptability to new misuse types
⚠️ Require real-world data, which is often unavailable during pre-production

Our method skips the need for real-world misuse examples. Instead, we:
1️⃣ Define the problem space qualitatively
2️⃣ Use an LLM to generate synthetic misuse prompts
3️⃣ Train and test guardrails on this dataset

We apply this to the off-topic prompt detection problem, and fine-tune simple bi- and cross-encoder classifiers that outperform heuristics based on cosine similarity or prompt engineering.

Additionally, framing the problem as prompt relevance allows these fine-tuned classifiers to generalise to other risk categories (e.g., jailbreak, toxic prompts).

Through this work, we also open-source our dataset (2M examples, ~50M+ tokens) and models.

paper: https://huggingface.co/papers/2411.12946
artifacts: https://huggingface.co/collections/govtech/off-topic-guardrail-673838a62e4c661f248e81a4
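For readers unfamiliar with the cosine-similarity heuristic that the post uses as a baseline, here is a minimal sketch of the idea. It is not the authors' method or code. It assumes a crude bag-of-words vector in place of a real embedding model, and the function names (`bag_of_words`, `cosine_similarity`, `is_off_topic`) and the `threshold` value are hypothetical, chosen only for illustration.

```python
import math
import re
from collections import Counter


def bag_of_words(text: str) -> Counter:
    """Lowercase the text and count word tokens (a crude stand-in for an embedding)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty prompt shares nothing with anything
    return dot / (norm_a * norm_b)


def is_off_topic(system_prompt: str, user_prompt: str, threshold: float = 0.1) -> bool:
    """Flag the user prompt when it looks unrelated to the system prompt.

    The threshold is an arbitrary illustrative value, not one from the paper.
    """
    score = cosine_similarity(bag_of_words(system_prompt), bag_of_words(user_prompt))
    return score < threshold
```

A heuristic like this depends heavily on shared surface vocabulary and on the chosen threshold, which is one plausible reason a fine-tuned classifier could do better, as the post reports.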

Reacted to maxiw's post with 👍 1 day ago

You can now try out computer use models from the hub to automate your local machine with https://github.com/askui/vision-agent. 💻

```
import time
from askui import VisionAgent

with VisionAgent() as agent:
    agent.tools.webbrowser.open_new("http://www.google.com")
    time.sleep(0.5)
    agent.click("search field in the center of the screen", model_name="Qwen/Qwen2-VL-7B-Instruct")
    agent.type("cats")
    agent.keyboard("enter")
    time.sleep(0.5)
    agent.click("text 'Images'", model_name="AskUI/PTA-1")
    time.sleep(0.5)
    agent.click("second cat image", model_name="OS-Copilot/OS-Atlas-Base-7B")
```

Currently these models are integrated with Gradio Spaces API. Also planning to add local inference soon!

Currently supported:
- https://huggingface.co/Qwen/Qwen2-VL-7B-Instruct
- https://huggingface.co/Qwen/Qwen2-VL-2B-Instruct
- https://huggingface.co/AskUI/PTA-1
- https://huggingface.co/OS-Copilot/OS-Atlas-Base-7B

liked a Space 3 days ago

openfree/trending-board

Organizations

None yet

Sengil's activity

New activity in facebook/musicgen-small about 1 month ago

How to get best result?

#31 opened about 1 month ago by

Faster MusicGen Generation with Streaming

#23 opened about 1 year ago by

Question

#28 opened 8 months ago by

New activity in facebook/musicgen-large about 1 month ago

how to get best result

#22 opened about 1 month ago by

New activity in black-forest-labs/FLUX.1-schnell 2 months ago

GPU and memory requirements

#89 opened 2 months ago by

New activity in black-forest-labs/FLUX.1-schnell 3 months ago

How can I make this model run faster?

#78 opened 3 months ago by

New activity in IDEA-Research/grounding-dino-base 5 months ago

How to plot results using supervision

#5 opened 5 months ago by