Xiao Yu

jasonyux

jasonyux's activity

updated a model 21 days ago

jasonyux/gpt2-alwaysno

Text Generation • Updated 21 days ago • 9

authored 4 papers 21 days ago

Prompt-Based Monte-Carlo Tree Search for Goal-Oriented Dialogue Policy Planning

Paper • 2305.13660 • Published May 23, 2023

ConFit: Improving Resume-Job Matching using Data Augmentation and Contrastive Learning

Paper • 2401.16349 • Published Jan 29

LIONs: An Empirically Optimized Approach to Align Language Models

Paper • 2407.06542 • Published Jul 9

Improving Autonomous AI Agents with Reflective Tree Search and Self-Learning

Paper • 2410.02052 • Published Oct 2 • 9

upvoted a paper about 2 months ago

Improving Autonomous AI Agents with Reflective Tree Search and Self-Learning

Paper • 2410.02052 • Published Oct 2 • 9

commented on a paper about 2 months ago

Improving Autonomous AI Agents with Reflective Tree Search and Self-Learning

Paper • 2410.02052 • Published Oct 2 • 9

updated 9 datasets 5 months ago

Columbia-NLP/DPO-hh-rlhf

Viewer • Updated Jul 10 • 169k • 107

Columbia-NLP/DPO-PKU-SafeRLHF

Viewer • Updated Jul 10 • 136k • 49

Columbia-NLP/DPO-HelpSteer

Viewer • Updated Jul 10 • 9.17k • 41

Columbia-NLP/DPO-tldr-summarisation-preferences

Viewer • Updated Jul 10 • 177k • 78

Columbia-NLP/DPO-py-dpo-v0.1

Viewer • Updated Jul 10 • 9.47k • 46

Columbia-NLP/DPO-UltraFeedback_binarized

Viewer • Updated Jul 10 • 62.7k • 44

Columbia-NLP/DPO-distilabel-intel-orca-dpo-pairs_cleaned

Viewer • Updated Jul 10 • 12.8k • 40

Columbia-NLP/DPO-distilabel-capybara-dpo-7k-binarized

Viewer • Updated Jul 10 • 7.56k • 42

Columbia-NLP/DPO-Nectar

Viewer • Updated Jul 10 • 183k • 47

updated 2 collections 5 months ago

LION-datasets

Datasets used to train the LION pipeline. Paper: https://arxiv.org/abs/2407.06542; Code: https://github.com/Columbia-NLP-Lab/LionAlignment • 9 items • Updated Jul 10

LION-series

Models trained using the LION pipeline. Paper: https://arxiv.org/abs/2407.06542; Code: https://github.com/Columbia-NLP-Lab/LionAlignment • 6 items • Updated Jul 10