How to use the CoreML model?
Sorry if this is a noob question:
I was able to drag the mlpackage folder into my Xcode project and have it generate a class. I then do
let model = try! falcon_7b_64_float32()
and I noticed that the model has a `prediction` function, but it takes a falcon_7b_64_float32Input
type. It looks like the return type of that function is another special type as well. How do I convert a string to the input type, and the output back to a string?
I'm curious as well! It'd be great to have the code from the demo shown in the video, so we can tinker.
I may be overthinking this, but I suspect it involves passing the String to a tokenizer built for this particular model, similar to these Swift CoreML transformers.
You are right, @anomalus: you need to tokenize the text, and then process the outputs to create the output sequence. The model only returns the probabilities for the next token in the sequence, so you need to call it multiple times to generate the full output.
We intend to publish everything soon.
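In the meantime, here is a rough sketch of what that loop can look like. This is not the code from the demo: the feature names (`inputIds`, `logits`) and the `tokenizer` object (e.g. from the swift-transformers package mentioned above) are assumptions; check the initializer of the generated `falcon_7b_64_float32Input` class for the real input names.

```swift
import CoreML

// Hypothetical greedy decoding loop around the Xcode-generated class.
// Assumes a `tokenizer` with encode/decode methods (e.g. swift-transformers),
// and that the model takes an "inputIds" MLMultiArray and returns "logits".
func generate(prompt: String, maxNewTokens: Int = 20) throws -> String {
    let model = try falcon_7b_64_float32()
    var tokens = tokenizer.encode(text: prompt)   // [Int] token IDs

    for _ in 0..<maxNewTokens {
        // Pack the current token IDs into the model's input type.
        let inputIds = try MLMultiArray(
            shape: [1, NSNumber(value: tokens.count)], dataType: .int32)
        for (i, t) in tokens.enumerated() {
            inputIds[i] = NSNumber(value: t)
        }
        let input = falcon_7b_64_float32Input(inputIds: inputIds)
        let output = try model.prediction(input: input)

        // The model returns logits over the vocabulary for each position;
        // take the argmax at the last position (greedy sampling).
        let logits = output.logits                // assumed output feature name
        let vocabSize = logits.shape.last!.intValue
        let offset = (tokens.count - 1) * vocabSize
        var best = 0
        for v in 1..<vocabSize
        where logits[offset + v].floatValue > logits[offset + best].floatValue {
            best = v
        }
        tokens.append(best)                       // feed it back in next step
    }
    return tokenizer.decode(tokens: tokens)
}
```

A real implementation would also stop at the end-of-sequence token and could use sampling instead of argmax, but the shape of the loop is the same: encode once, then predict one token at a time.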
Would you be able to provide quick sample code to run the mlpackage?
Posting this here: https://huggingface.co/blog/swift-coreml-llm
Thanks @pcuenq! The only issue I'm running into is that Falcon 7b with Swift Chat is unusably slow: it takes maybe 5 minutes per word. I have a MacBook Pro M1 Max with 32GB of RAM, but SwiftChat uses 55GB+ of RAM on a simple run. Any advice on how to navigate this?