arxiv:2310.08164
Abdullah
amirabdullah19852020
AI & ML interests
Mechanistic interpretability, high dimensional geometry, persona role playing.
Papers: 1
Models: 16
amirabdullah19852020/interpreting_reward_models
amirabdullah19852020/test • Text Generation • 13
amirabdullah19852020/gpt-neo-125m_hh_reward • Text Generation • 16
amirabdullah19852020/gpt-neo-125m_utility_reward • Reinforcement Learning • 14
amirabdullah19852020/pythia-70m_sentiment_reward • Reinforcement Learning • 34
amirabdullah19852020/pythia-160m_sentiment_reward • Reinforcement Learning • 14
amirabdullah19852020/gpt-neo-125m_sentiment_reward • Reinforcement Learning • 11
amirabdullah19852020/pythia-160m_utility_reward • Reinforcement Learning • 12
amirabdullah19852020/pythia-70m_utility_reward • Reinforcement Learning • 15
amirabdullah19852020/gpt-j-6b-sharded-bf16_sentiment_reward • Reinforcement Learning