nayohan (Yohan Na)

upvoted a paper 4 days ago

Thinking LLMs: General Instruction Following with Thought Generation

Paper • 2410.10630 • Published Oct 14 • 16

upvoted 3 papers about 1 month ago

Conversation Chronicles: Towards Diverse Temporal and Relational Dynamics in Multi-Session Conversations

Paper • 2310.13420 • Published Oct 20, 2023 • 2

Personalized Visual Instruction Tuning

Paper • 2410.07113 • Published Oct 9 • 69

Aria: An Open Multimodal Native Mixture-of-Experts Model

Paper • 2410.05993 • Published Oct 8 • 107

upvoted a collection 4 months ago

Korean-English Parallel Datasets (한국어-영어 병렬 데이터셋)

Collection

6 items • Updated Jul 17 • 3

upvoted a paper 4 months ago

DoLa: Decoding by Contrasting Layers Improves Factuality in Large Language Models

Paper • 2309.03883 • Published Sep 7, 2023 • 33

upvoted a paper 5 months ago

Better & Faster Large Language Models via Multi-token Prediction

Paper • 2404.19737 • Published Apr 30 • 73

upvoted 3 collections 5 months ago

upvoted 3 collections 6 months ago

Korean Pretraining Dataset

Collection

15 items • Updated 3 days ago • 10

Standard-format-preference-dataset

Collection

We collect the open-source datasets and process them into the standard format. • 14 items • Updated May 8 • 22

Domain Specific (Math, Code, etc)

Collection

24 items • Updated 3 days ago • 1

upvoted a paper 7 months ago

LayerSkip: Enabling Early Exit Inference and Self-Speculative Decoding

Paper • 2404.16710 • Published Apr 25 • 74

upvoted a paper 9 months ago

Self-Play Fine-Tuning Converts Weak Language Models to Strong Language Models

Paper • 2401.01335 • Published Jan 2 • 64

upvoted 5 papers about 1 year ago

AlpaGasus: Training A Better Alpaca with Fewer Data

Paper • 2307.08701 • Published Jul 17, 2023 • 22

Large Language Models as Analogical Reasoners

Paper • 2310.01714 • Published Oct 3, 2023 • 15

Efficient Streaming Language Models with Attention Sinks

Paper • 2309.17453 • Published Sep 29, 2023 • 13

LongLoRA: Efficient Fine-tuning of Long-Context Large Language Models

Paper • 2309.12307 • Published Sep 21, 2023 • 87

Small-scale proxies for large-scale Transformer training instabilities

Paper • 2309.14322 • Published Sep 25, 2023 • 19

Titles of the 3 collections upvoted 5 months ago

Text datasets with missing language information

Awesome feedback datasets

Translated (En->Ko) dataset
