Prompt format in `tokenizer_config.json`

#1
by justinthelaw - opened

The prompt format in `tokenizer_config.json` seems to differ from the other Phi-3 instruct variants in the wild and from the official source (Microsoft). The change appears to be the removal of the system prompt section and its conditional logic.

NeuralMagic: https://huggingface.co/neuralmagic/Phi-3-mini-128k-instruct-FP8/blob/402f133d5636dcb6e0cbb1209663b8956b36d0be/tokenizer_config.json#L119

Upstream: https://huggingface.co/microsoft/Phi-3-mini-128k-instruct/blob/d548c233192db00165d842bf8edff054bb3212f8/tokenizer_config.json#L119

Is this an intentional change? If so, why? I believe the system prompt is generally useful in most use cases, like when building agents or assistants.
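
To illustrate what is at stake, here is a minimal sketch (assuming the standard `transformers` chat-template API; the repo IDs are the two linked above) that renders the same conversation through both templates:

```python
from transformers import AutoTokenizer

# A conversation that includes a system turn.
messages = [
    {"role": "system", "content": "You are a terse assistant."},
    {"role": "user", "content": "Hello!"},
]

for repo in (
    "microsoft/Phi-3-mini-128k-instruct",
    "neuralmagic/Phi-3-mini-128k-instruct-FP8",
):
    tok = AutoTokenizer.from_pretrained(repo)
    rendered = tok.apply_chat_template(
        messages, tokenize=False, add_generation_prompt=True
    )
    # With the upstream template, the output should open with a <|system|>
    # block; with the system branch removed, the system turn may be
    # silently dropped from the prompt.
    print(repo, repr(rendered), sep="\n")
```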

Neural Magic org

Thanks @justinthelaw , we are taking a look at these today to rectify!

Additionally, I noticed the config definition code is out of date relative to upstream, which reproduces the old missing `transformers_modules` issue seen with earlier Phi-3 releases when used with vLLM (0.4.x - 0.5.x). I recommend either referencing the upstream `configuration_phi3.py` and `modeling_phi3.py` from `config.json`, or downloading them directly into this repo.
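
For reference, upstream wires its custom classes in through the `auto_map` field of `config.json`; a sketch of that stanza (class paths from my reading of the upstream repo, so treat the exact names as an assumption) would be:

```json
{
  "auto_map": {
    "AutoConfig": "configuration_phi3.Phi3Config",
    "AutoModelForCausalLM": "modeling_phi3.Phi3ForCausalLM"
  }
}
```

With `configuration_phi3.py` and `modeling_phi3.py` present alongside it, loading with `trust_remote_code=True` should import them as `transformers_modules.*` instead of failing as below: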

```
ModuleNotFoundError: No module named 'transformers_modules'

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/nonroot/.pyenv/versions/3.11.6/lib/python3.11/asyncio/events.py", line 80, in _run
    self._context.run(self._callback, *self._args)
  File "/home/leapfrogai/.venv/lib/python3.11/site-packages/vllm/engine/async_llm_engine.py", line 59, in _log_task_completion
    raise AsyncEngineDeadError(
vllm.engine.async_llm_engine.AsyncEngineDeadError: Task finished unexpectedly. This should never happen! Please open an issue on Github. See stack trace above for the actual cause.
```
Neural Magic org

Hi @justinthelaw , thanks! The new config files should handle this correctly now.
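
As a quick sanity check (a sketch; it just asserts that the system branch is back in the template):

```python
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("neuralmagic/Phi-3-mini-128k-instruct-FP8")
# The updated chat template should mention the system role again.
assert tok.chat_template and "system" in tok.chat_template
```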

Lin-K76 changed discussion status to closed
