M Veselovskiy

Yuuru

Yuuru's activity

New activity in TheDrummer/UnslopSmall-22B-v1-GGUF about 1 month ago
New activity in G-reen/gpt5o-reflexion-q-agi-llama-3.1-8b 2 months ago

How to pay

1
#17 opened 2 months ago by Yuuru
New activity in mattshumer/Reflection-Llama-3.1-70B 2 months ago

DLETE THIS MODEL

2
#76 opened 2 months ago by MaziyarPanahi
Reacted to m-ric's post with 👍 3 months ago
🤯 A new 70B open-weights LLM beats Claude-3.5-Sonnet and GPT-4o!

@mattshumer, CEO of Hyperwrite AI, had an idea he wanted to try out: why not fine-tune LLMs to always output their thoughts in specific parts, delineated by <thinking> tags?

Even better: inside of that, you could nest other sections, to reflect critically on previous output. Let’s name this part <reflection>. Planning is also put in a separate step.

He named the method “Reflection tuning” and set out to fine-tune a Llama-3.1-70B with it.

Well, it turns out it works mind-bogglingly well!

🤯 Reflection-70B beats GPT-4o, Sonnet-3.5, and even the much bigger Llama-3.1-405B!

๐—ง๐—Ÿ;๐——๐—ฅ
๐ŸฅŠ This new 70B open-weights model beats GPT-4o, Claude Sonnet, et al.
โฐ 405B in training, coming soon
๐Ÿ“š Report coming next week
โš™๏ธ Uses GlaiveAI synthetic data
๐Ÿค— Available on HF!

I’m starting an Inference Endpoint right now for this model to give it a spin!

Check it out 👉 mattshumer/Reflection-Llama-3.1-70B
New activity in yodayo-ai/kivotos-xl-2.0 6 months ago

Broken results

3
#1 opened 6 months ago by Yuuru
New activity in saltlux/luxia-21.4b-alignment-v1.0 9 months ago

Quantized GGUF available

3
#3 opened 9 months ago by MaziyarPanahi
New activity in chargoddard/mixtralnt-4x7b-test 12 months ago

It works!!!

7
#1 opened 12 months ago by HoangHa
New activity in TheBloke/Mixtral-8x7B-v0.1-GGUF 12 months ago

It works.

6
#3 opened 12 months ago by Yuuru
New activity in mistralai/Mistral-7B-Instruct-v0.2 12 months ago

How is this different from v1?

7
#2 opened 12 months ago by amgadhasan
New activity in TheBlokeAI/Mixtral-tiny-GPTQ 12 months ago

What is this model?

3
#1 opened 12 months ago by Yuuru
New activity in TheBloke/Yi-34B-GPTQ about 1 year ago

How do i run it?

4
#2 opened about 1 year ago by Yuuru