Problem with tokens for fine-tuning

#4
by WasamiKirua - opened

Hi Occiglot team,

I am trying to fine-tune the model with axolotl, using the ChatML chat template. These are the tokens I configured in axolotl (I took them all from a config I had used for Mistral 7B):

special_tokens:
  bos_token: "<s>"
  eos_token: "<|im_end|>"
  unk_token: "<unk>"
tokens:
  - "<|im_start|>"
  - "<|im_end|>"

Sometimes the model starts to hallucinate and prints random text:

... duraturo tra coloro che l'avevano conosciuta. 0\nimport {\n Body,\n Controller,\n Delete,\n Get,\n HttpCode,\n HttpStatus,\n Param,\n Post,\n Put,\n} from '@nestjs/common';\nimport { CreateTodoDto } from './dto/create-todo.dto';\nimport { TodoService } from './todo.service';\nimport { UpdateTodoDto } from './dto/update-todo.dto';\nimport { TodoResponse, TodosResponse } from './interface/todos.response';\nimport { Todo } from './entities/todo.entity';\n\n@Controller('api/v1/todo')\nexport class TodoController {\n constructor(private readonly todoService: TodoService) {}\n\n @Get ()\n async getTodos(): Promise {\n return this.todoService.findAll();\n }\n\n @Post ('create')\n @HttpCode(HttpStatus.CREATED)\n create( @Body () body: CreateTodoDto): Promise {\n return this.todoService.create(body);\n }\n\n @Put ('update/:id')\n update(\n @Param ('id') id: string,\n @Body () body: UpdateTodoDto,\n ): Promise {\n return this.todoService.update(id, body);\n }\n\n @Delete ('delete/:id')\n delete( @Param ('id') id: string): Promise {\n return this.todoService.remove(id);\n }\n}\n

Perhaps the extra tokens are not needed, or is it something else? Can someone give me a hint?
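
To illustrate what I mean: at inference time I can hide the trailing text by forcing generation to stop at <|im_end|>, roughly like this (the checkpoint path and prompt are placeholders), but I would rather understand whether the token setup itself is wrong:

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

path = "./outputs/merged"  # placeholder: wherever the merged fine-tune was saved
tok = AutoTokenizer.from_pretrained(path)
model = AutoModelForCausalLM.from_pretrained(
    path, torch_dtype=torch.bfloat16, device_map="auto"
)

# ChatML-formatted prompt, matching the template used for training.
prompt = (
    "<|im_start|>system\nYou are a helpful assistant.<|im_end|>\n"
    "<|im_start|>user\nWho was Maria Montessori?<|im_end|>\n"
    "<|im_start|>assistant\n"
)
inputs = tok(prompt, return_tensors="pt").to(model.device)
out = model.generate(
    **inputs,
    max_new_tokens=256,
    eos_token_id=tok.convert_tokens_to_ids("<|im_end|>"),  # stop at end of turn
)
print(tok.decode(out[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))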

I have also tried quantizing Occiglot (no fine-tuning) and running it in ollama. If, as you state, the tokenizer is the same as Mistral v0.1, the Modelfile should be something like:

FROM ./occiglot-7b-it-en.Q8_0.gguf
TEMPLATE "[INST] {{ .System }} {{ .Prompt }} [/INST]"
PARAMETER stop [INST]
PARAMETER stop [/INST]

The model returns empty responses or hallucinates almost 100% of the time. Here is the relevant llama.cpp metadata from the ollama logs:

ollama  | llm_load_print_meta: model type       = 7B
ollama  | llm_load_print_meta: model ftype      = Q8_0
ollama  | llm_load_print_meta: model params     = 7.24 B
ollama  | llm_load_print_meta: model size       = 7.17 GiB (8.50 BPW) 
ollama  | llm_load_print_meta: general.name     = Mistral 7B v0.1
ollama  | llm_load_print_meta: BOS token        = 1 '<s>'
ollama  | llm_load_print_meta: EOS token        = 2 '</s>'
ollama  | llm_load_print_meta: UNK token        = 0 '<unk>'
ollama  | llm_load_print_meta: LF token         = 13 '<0x0A>'
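
To double-check the "same tokenizer as Mistral v0.1" point on my side, I was planning to compare the two tokenizers directly, along these lines (repo ids are my guess):

from transformers import AutoTokenizer

occ = AutoTokenizer.from_pretrained("occiglot/occiglot-7b-it-en")  # assumed repo id
mst = AutoTokenizer.from_pretrained("mistralai/Mistral-7B-v0.1")

print(len(occ), len(mst))
print(occ.bos_token, occ.eos_token, occ.unk_token)
print(mst.bos_token, mst.eos_token, mst.unk_token)

# In the base Mistral v0.1 tokenizer, [INST] / [/INST] are not special tokens;
# they are split into ordinary subword pieces. The two outputs should match
# if the tokenizers really are identical.
print(mst.tokenize("[INST] Hello [/INST]"))
print(occ.tokenize("[INST] Hello [/INST]"))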
