54

JJ

J22

AI & ML interests

None yet

Recent Activity

New activity 20 days ago

ibm-granite/granite-3.0-3b-a800m-instruct

updated a model 20 days ago

ibm-granite/granite-3.0-3b-a800m-instruct

New activity 21 days ago

facebook/MobileLLM-1B

Organizations

None yet

J22's activity

New activity in ibm-granite/granite-3.0-3b-a800m-instruct 20 days ago

Upload tokenizer.json

#1 opened 20 days ago by

J22

New activity in facebook/MobileLLM-1B 21 days ago

a horrible function in `modeling_mobilellm.py`

#5 opened 21 days ago by

J22

New activity in allenai/OLMoE-1B-7B-0924-Instruct 2 months ago

Run this on CPU

#6 opened 2 months ago by

J22

New activity in openbmb/MiniCPM3-4B 2 months ago

Run on CPU

#13 opened 2 months ago by

J22

New activity in microsoft/Phi-3.5-MoE-instruct 3 months ago

need gguf

#4 opened 3 months ago by

New activity in meta-llama/Llama-3.1-8B-Instruct 4 months ago

Best practice for tool calling with meta-llama/Meta-Llama-3.1-8B-Instruct

#33 opened 4 months ago by

Run this on CPU and use tool calling

#38 opened 4 months ago by

J22

New activity in AI-MO/NuminaMath-7B-TIR 4 months ago

My alternative quantizations.

#5 opened 4 months ago by

New activity in mistralai/Mistral-7B-Instruct-v0.3 5 months ago

Tool calling is supported by ChatLLM.cpp

#36 opened 5 months ago by

J22

New activity in mistralai/Mistral-7B-Instruct-v0.3 6 months ago

can't say hello

#9 opened 6 months ago by

J22

no system message?

#14 opened 6 months ago by

New activity in microsoft/Phi-3-small-8k-instruct 6 months ago

"small" is so different from "mini" and "medium"

#8 opened 6 months ago by

J22

New activity in nvidia/Llama3-ChatQA-1.5-8B 7 months ago

how to set context in multi-turn QA?

#14 opened 7 months ago by

J22

New activity in microsoft/Phi-3-mini-128k-instruct 7 months ago

clarification on the usage of `short_factor` and `long_factor`?

#49 opened 7 months ago by

J22

Continue the discussion: `long_factor` and `short_factor`

#32 opened 7 months ago by

J22

New activity in microsoft/Phi-3-mini-4k-instruct 7 months ago

is the '\n' after `'<|end|>'`?

#43 opened 7 months ago by

J22

Is sliding window used or not?

#25 opened 7 months ago by

J22

New activity in microsoft/Phi-3-mini-128k-instruct 7 months ago

`long_factor` is never used?

#22 opened 7 months ago by

J22

generate +6 min, +20GB V-ram

#17 opened 7 months ago by

`sliding_window` is larger than `max_position_embeddings`

#21 opened 7 months ago by

J22