Use V1 tokenizer instead

#10
by Rocketknight1 - opened
No description provided.
Rocketknight1 changed pull request title from "Upload tokenizer" to "Use V1 tokenizer instead"

There was an issue with the last PR: we uploaded the V3 tokenizer, but this base model actually uses the V1 tokenizer. This should fix the issue!

@Rocketknight1 does it affect the vocab size? The model and tokenizer vocab sizes are not matching, so the model is failing to load.

@lbathen can you give me some code to reproduce that issue? From here it looks like the tokenizer and the model both have a vocab size of 32000

@Rocketknight1 I confirmed that both show the same vocab size of 32K now. I had pulled the wrong revision :)
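For anyone hitting a similar load failure, the mismatch check above boils down to comparing the model's configured vocabulary size against the tokenizer's. A minimal sketch of that comparison (the helper name is illustrative, not from this repo; the commented `transformers` lines show where the two numbers would come from in practice):

```python
# Sketch: verify that a model's embedding-table size matches the
# tokenizer's vocabulary before loading weights.

def vocab_sizes_match(model_vocab_size: int, tokenizer_vocab_size: int) -> bool:
    """Return True when the model config and tokenizer agree on vocab size."""
    return model_vocab_size == tokenizer_vocab_size

# With transformers, the two values would typically come from:
#   config = AutoConfig.from_pretrained("mistralai/Mixtral-8x22B-v0.1", revision="c356b81")
#   tokenizer = AutoTokenizer.from_pretrained("mistralai/Mixtral-8x22B-v0.1", revision="c356b81")
#   vocab_sizes_match(config.vocab_size, len(tokenizer))

# On the fixed revision both report 32000, so loading succeeds:
print(vocab_sizes_match(32000, 32000))
```

If the two numbers disagree (e.g. a V3 tokenizer's larger vocabulary against a 32K embedding table), the model will fail to load, which matches the error reported in this thread.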

Is this going to be merged soon?

@Rocketknight1 Could you merge this in? It's working on my end, and I'm thankful to have this model back.

This command should get it running for anyone who needs it:

python -m vllm.entrypoints.openai.api_server --model mistralai/Mixtral-8x22B-v0.1 --revision c356b81 --served-model-name mistralai/Mixtral-8x22B-v0.1 --max-logprobs 100 --gpu-memory-utilization=0.85 --disable-log-requests --disable-log-stats --port 5001 --tensor-parallel-size 8

My apologies! Merging this PR!

pandora-s changed pull request status to closed
pandora-s changed pull request status to open
pandora-s changed pull request status to merged
