Sunyoung Hwang PRO

sosoai

https://sosohajalab.com

AI & ML interests

llm, vision, transformers, megabytes

Recent Activity

liked a model 3 days ago

mistralai/Mistral-Large-Instruct-2411

liked a dataset 3 days ago

microsoft/orca-agentinstruct-1M-v1

liked a model 3 days ago

mistralai/Pixtral-Large-Instruct-2411

sosoai's activity

upvoted an article 4 months ago

Article

Google releases Gemma 2 2B, ShieldGemma and Gemma Scope

Jul 31

• 59

upvoted 2 collections 4 months ago

Llama 3.1

This collection hosts the transformers and original repos of the Llama 3.1, Llama Guard 3 and Prompt Guard models • 11 items • Updated Sep 25 • 623

🪐 SmolLM

A series of smol LLMs: 135M, 360M and 1.7B. We release base and Instruct models as well as the training corpus and some WebGPU demos • 12 items • Updated Aug 18 • 198

upvoted an article 4 months ago

Article

SmolLM - blazingly fast and remarkably powerful

Jul 16

• 265

upvoted a paper 4 months ago

GAVEL: Generating Games Via Evolution and Language Models

Paper • 2407.09388 • Published Jul 12 • 14

upvoted 2 articles 5 months ago

Article

How to run Gemini Nano locally in your browser

Jul 11

• 43

Article

Announcing New Dataset Search Features

Jul 8

• 22

upvoted a collection 5 months ago

SPPO

Self-Play Preference Optimization • 10 items • Updated Jun 29 • 12

upvoted an article 5 months ago

Article

Fine-tuning Florence-2 - Microsoft's Cutting-edge Vision Language Models

Jun 24

• 177

upvoted a paper 5 months ago

OpenVLA: An Open-Source Vision-Language-Action Model

Paper • 2406.09246 • Published Jun 13 • 36

upvoted a paper 6 months ago

AutoCoder: Enhancing Code Large Language Model with AIEV-Instruct

Paper • 2405.14906 • Published May 23 • 23

upvoted 2 papers 7 months ago

Scaling Down to Scale Up: A Guide to Parameter-Efficient Fine-Tuning

Paper • 2303.15647 • Published Mar 28, 2023 • 4

A Multimodal Automated Interpretability Agent

Paper • 2404.14394 • Published Apr 22 • 20

upvoted an article 7 months ago

Article

Expanding Model Context and Creating Chat Models with a Single Click

Apr 28

• 37

upvoted 2 collections 7 months ago

📀 Dataset comparison models

1.8B models trained on 350BT to compare different pretraining datasets • 8 items • Updated Jun 12 • 31

Idefics2 🐶

Idefics2-8B is a foundation vision-language model. In this collection, you will find the models, datasets and demo related to its creation. • 11 items • Updated May 6 • 88

upvoted a paper 7 months ago

PoSE: Efficient Context Window Extension of LLMs via Positional Skip-wise Training

Paper • 2309.10400 • Published Sep 19, 2023 • 26

upvoted 2 collections 9 months ago

Qwen1.5

Qwen1.5 is the improved version of Qwen, the large language model series developed by Alibaba Cloud. • 55 items • Updated Sep 18 • 206

SLIM Models

Structured Language Instruction Models (SLIMs) • 31 items • Updated 25 days ago • 30

upvoted a collection 10 months ago

zephyr-7b-sft-full-SPIN

Models fine-tuned with SPIN across iterations 0,1,2,3 • 4 items • Updated Feb 7 • 8