Yosef Worku Alemneh

rasyosef

AI & ML interests

Pretraining, Supervised Fine Tuning, Direct Preference Optimization, Retrieval Augmented Generation (RAG), Function Calling

Recent Activity

updated a model 1 day ago

rasyosef/Llama-3.2-180M-Amharic-Instruct

updated a dataset 2 days ago

rasyosef/amharic-openhermes

Organizations

None yet

rasyosef's activity

New activity in ContextualAI/ultrafeedback_clair_32k 10 days ago

Phi-2-Instruct-APO: aligned with Anchored Preference Optimization

#3 opened 2 months ago by

rasyosef

New activity in meta-llama/Llama-3.2-1B 18 days ago

[Query-ISSUE] tokenizer.vocab_size is 128000, however len(tokenizer) is 128256, which prevents me from using those other tokens.

#34 opened 23 days ago by

HV-Khurdula

What are the start and stop tokens of this model?

#40 opened 20 days ago by

aryaash

Is the BOS token id of 128000 hardcoded into the llama 3.2 tokenizer?

#17 opened about 2 months ago by

rasyosef

New activity in nvidia/Mistral-NeMo-Minitron-8B-Base about 2 months ago

Mistral-NeMo-Minitron-8B-Chat

#5 opened 3 months ago by

rasyosef

New activity in rasyosef/Phi-1_5-Instruct-v0.1 2 months ago

What is the context window size of this model, i.e. how many input and output tokens does it support?

#1 opened 2 months ago by

naveen237

New activity in ContextualAI/ultrafeedback_clair_32k 3 months ago

APO Trainer in TRL?

#2 opened 3 months ago by

rasyosef

New activity in rasyosef/Mistral-NeMo-Minitron-8B-Chat 3 months ago

ChatML template does not work properly

#2 opened 3 months ago by

WasamiKirua

New activity in rasyosef/bert-medium-amharic 3 months ago

Collaboration

#1 opened 3 months ago by deleted

New activity in rasyosef/Llama-3.1-Minitron-4B-Chat 3 months ago

Error when trying to run

#1 opened 3 months ago by

ctranslate2-4you

New activity in microsoft/Phi-3.5-mini-instruct 3 months ago

What changed for people using this model in English?

#3 opened 3 months ago by

migueltalka

New activity in microsoft/phi-2 3 months ago

Phi 2 Instruct: an instruction following Phi 2 SLM that has undergone SFT and DPO

#132 opened 3 months ago by

rasyosef

New activity in open-llm-leaderboard/open_llm_leaderboard 4 months ago

What should a finetuned model's license be if the model is MIT but the datasets are Apache 2.0 and cc-by-4.0

#866 opened 4 months ago by

rasyosef

New activity in microsoft/phi-1_5 4 months ago

Phi 1.5 Instruct: an instruction following Phi 1.5 model that has undergone SFT and DPO

#89 opened 4 months ago by

rasyosef

New activity in rasyosef/amharic-sentences-corpus 4 months ago

Update README.md

#2 opened 4 months ago by

seyyaw

New activity in rasyosef/amharic-news-category-classification 6 months ago

Duplicate?

#2 opened 6 months ago by

israel

New activity in mistral-community/Mixtral-8x22B-Instruct-v0.1-4bit 7 months ago

Model card is about Mixtral-8x7B instead of Mixtral-8x22B

#3 opened 7 months ago by

rasyosef

New activity in microsoft/phi-2 10 months ago

New tokens generated with FP16 inference are only exclamation marks "!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!"

#89 opened 10 months ago by

rasyosef
