weishen

fakerbaby

AI & ML interests

NLP, alignment, LLM

Recent Activity

liked a dataset 23 days ago
yingyingzhang/metamath-qwen2-math
liked a dataset 23 days ago
nvidia/OpenMathInstruct-2
liked a dataset 28 days ago
KbsdJames/Omni-MATH

Organizations

fakerbaby's activity

Reacted to onekq's post with 👍 2 months ago
Here is my latest study on OpenAI🍓o1🍓.
A Case Study of Web App Coding with OpenAI Reasoning Models (2409.13773)

I wrote an easy-to-read blog post to explain the findings.
https://huggingface.co/blog/onekq/daily-software-engineering-work-reasoning-models

INSTRUCTION FOLLOWING is the key.

100% instruction following + Reasoning = new SOTA

But if the model misses or misunderstands one instruction, it can perform far worse than non-reasoning models.