Trangle Heshvp

Trangle

AI & ML interests

None yet

Recent Activity

liked a model 2 days ago

Xkev/Llama-3.2V-11B-cot

liked a model 9 days ago

onnx-community/Qwen2.5-Coder-1.5B-Instruct

liked a model 9 days ago

onnx-community/Llama-3.2-3B-Instruct-ONNX

Trangle's activity

upvoted a collection about 2 months ago

Llama 3.2

This collection hosts the transformers and original repos of the Llama 3.2 and Llama Guard 3 • 15 items • Updated about 1 month ago • 490

upvoted an article 3 months ago

Article

SmolLM - blazingly fast and remarkably powerful

Jul 16

• 267

upvoted 4 collections 4 months ago

Gemma Scope Release

A comprehensive, open suite of sparse autoencoders for Gemma 2 2B and 9B. • 10 items • Updated Aug 11 • 13

Llama 3.1 Evals

This collection provides detailed information on how we derived the reported benchmark metrics for the Llama 3.1 models, including the configurations, • 6 items • Updated Sep 25 • 16

Minitron

A family of compressed models obtained via pruning and knowledge distillation • 9 items • Updated Oct 3 • 59

🪐 SmolLM

A series of smol LLMs: 135M, 360M and 1.7B. We release base and Instruct models as well as the training corpus and some WebGPU demos • 12 items • Updated Aug 18 • 200

upvoted 2 papers 4 months ago

MUSCLE: A Model Update Strategy for Compatible LLM Evolution

Paper • 2407.09435 • Published Jul 12 • 20

Qwen2 Technical Report

Paper • 2407.10671 • Published Jul 15 • 156

upvoted an article 4 months ago

Article

BM25 for Python: Achieving high performance while simplifying dependencies with BM25S⚡

Jul 9

• 40

upvoted a paper 5 months ago

AgentInstruct: Toward Generative Teaching with Agentic Flows

Paper • 2407.03502 • Published Jul 3 • 48

upvoted a collection 5 months ago

Step-DPO

Resources for "Step-DPO: Step-wise Preference Optimization for Long-chain Reasoning of LLMs" • 11 items • Updated Jul 1 • 5

upvoted an article 5 months ago

Article

Welcome Gemma 2 - Google's new open LLM

Jun 27

• 123

upvoted a paper 5 months ago

SpeechVerse: A Large-scale Generalizable Audio Language Model

Paper • 2405.08295 • Published May 14 • 14

upvoted a collection 5 months ago

TaskMeAnything

A collection of TaskMeAnything resources [https://github.com/JieyuZ2/TaskMeAnything] • 12 items • Updated Aug 4 • 3

upvoted 2 articles 5 months ago

Article

Unlocking Longer Generation with Key-Value Cache Quantization

May 16

• 32

Article

Fine-tuning Florence-2 - Microsoft's Cutting-edge Vision Language Models

Jun 24

• 177

upvoted a collection 5 months ago

WildBench

4 items • Updated 9 days ago • 5

upvoted a paper 5 months ago

Glyph-ByT5-v2: A Strong Aesthetic Baseline for Accurate Multilingual Visual Text Rendering

Paper • 2406.10208 • Published Jun 14 • 21

upvoted a collection 5 months ago

System Message Generalization

11 items • Updated Jun 7 • 3

upvoted a paper 6 months ago

Block Transformer: Global-to-Local Language Modeling for Fast Inference

Paper • 2406.02657 • Published Jun 4 • 37