Maziyar Panahi PRO

MaziyarPanahi

AI & ML interests

Fine-Tuning, RLHF, Merging, Quantizations, Leaderboards

MaziyarPanahi's activity

upvoted a collection 1 day ago

OpenScholar_V1

The set of models, index, data associated with the paper "OpenScholar: Synthesizing Scientific Literature with Retrieval-Augmented LMs". • 8 items • Updated 2 days ago • 21

upvoted 2 collections 2 days ago

Tulu 3 Models

All models released with Tulu 3 -- state of the art open post-training recipes. • 7 items • Updated about 9 hours ago • 17

INTELLECT-1 Dataset

INTELLECT-1 Training dataset • 5 items • Updated Oct 8 • 14

upvoted an article 3 days ago

Uncensor any LLM with abliteration

Article • Jun 13 • 370

upvoted an article 4 days ago

Halo: Open Source Health Tracking with Wearables

Article • 4 days ago • 75

upvoted 2 papers 8 days ago

Thinking LLMs: General Instruction Following with Thought Generation

Paper • 2410.10630 • Published Oct 14 • 16

TPO: Aligning Large Language Models with Multi-branch & Multi-step Preference Trees

Paper • 2410.12854 • Published Oct 10 • 1

upvoted a collection 8 days ago

Nov 15 Releases 🍂

15 items • Updated 9 days ago • 6

upvoted an article 9 days ago

Synthetic dataset generation techniques: Self-Instruct

Article • May 15 • 12

upvoted an article 10 days ago

Releasing the largest multilingual open pretraining dataset

Article • 10 days ago • 94

upvoted a collection 16 days ago

🇫🇷 Calme-3

Here you can find all the new Calme-3 models • 26 items • Updated about 3 hours ago • 7

upvoted a paper 17 days ago

RLEF: Grounding Code LLMs in Execution Feedback with Reinforcement Learning

Paper • 2410.02089 • Published Oct 2 • 12

upvoted a paper 23 days ago

MobileLLM: Optimizing Sub-billion Parameter Language Models for On-Device Use Cases

Paper • 2402.14905 • Published Feb 22 • 126

upvoted a collection 23 days ago

MobileLLM

Optimizing Sub-billion Parameter Language Models for On-Device Use Cases (ICML 2024) https://arxiv.org/abs/2402.14905 • 8 items • Updated 17 days ago • 95

upvoted a collection 30 days ago

C4AI Aya Expanse

Aya Expanse is an open-weight research release of a model with highly advanced multilingual capabilities. • 3 items • Updated about 1 month ago • 26

upvoted 2 articles about 1 month ago

Deploying Speech-to-Speech on Hugging Face

Article • Oct 22 • 35

🧨 Diffusers welcomes Stable Diffusion 3.5 Large

Article • Oct 22 • 43

upvoted a paper about 1 month ago

AutoTrain: No-code training for state-of-the-art models

Paper • 2410.15735 • Published Oct 21 • 57

upvoted an article about 1 month ago

Scaling AI-based Data Processing with Hugging Face + Dask

Article • Oct 9 • 23

upvoted an article about 2 months ago

Introducing the Open FinLLM Leaderboard

Article • Oct 4 • 64