WikiChat-v0.2
A work-in-progress model being trained to hold conversations.
The uploaded GGUFs are in full FP32 precision.
Trained on OpenOrca GPT-4 data, plus Cosmopedia for additional data and Dolly-15k for instruct-style samples.
Model Details:
- 83.59M parameters (83,591,800)
- 8 attention heads
- 40 layers
- 384 embedding size
- 4096/8192/16384 context (use 2x/4x RoPE scaling for the longer lengths; a 16k fine-tuned version may be trained later; see the loading sketch after this list)
- Batch size 16
- llama.cpp (train-text-from-scratch)
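Since the GGUFs are plain FP32 files, they can be loaded with any llama.cpp-compatible runtime. Below is a minimal loading sketch using the llama-cpp-python bindings; the file name is a placeholder, and rope_freq_scale=0.5 corresponds to the 2x linear RoPE scaling mentioned above (0.25 would give 4x / 16384 context).

```python
# Minimal loading sketch (assumes llama-cpp-python is installed).
# The GGUF file name below is a placeholder; adjust it to the uploaded file.
from llama_cpp import Llama

llm = Llama(
    model_path="wikichat-v0.2-f32.gguf",  # placeholder file name
    n_ctx=8192,            # extended context (base is 4096)
    rope_freq_scale=0.5,   # 2x linear RoPE scaling; use 0.25 for 16384
    n_gpu_layers=-1,       # offload all 40 layers to the GPU if VRAM allows
)
```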
Prompt Format (Alpaca):
Instruction: {system}
Input: {prompt}
Response: {response}
Please structure your prompts in this instruct format for best performance; a usage sketch follows below.
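As an illustration (not an official script), the snippet below builds a prompt in that format and generates a reply with the `llm` object from the loading sketch above; the exact whitespace between the fields is an assumption.

```python
# Sketch: assemble an instruct-style prompt and generate a completion.
# Assumes the `llm` object created in the loading sketch above.
def build_prompt(system: str, prompt: str) -> str:
    # Newline layout between fields is an assumption.
    return f"Instruction: {system}\nInput: {prompt}\nResponse:"

text = build_prompt(
    system="You are a helpful assistant.",
    prompt="What is the square root of 4?",
)
out = llm(text, max_tokens=64, stop=["Instruction:", "Input:"])
print(out["choices"][0]["text"].strip())
```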
Training Details:
- 1x RTX 3070 8GB (inference speed: 80 tok/s with full GPU offload)
- 1x Ryzen 3 3700x
- 96 GB RAM
- 10 iterations
- Loss Target = 2.5 to 3.0
- Approx. 480 samples / ~1M training tokens (>0.0001 epochs)
- Training data = refer to the OpenOrca page (a reproduction sketch follows after this list)
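For reference, here is a hedged sketch of how a run with these hyperparameters could be launched through llama.cpp's train-text-from-scratch. The flag names and file paths are assumptions based on the llama.cpp example and may differ between versions; check the binary's --help before running.

```python
# Hypothetical reproduction sketch for the training run described above.
# Flag names and paths are assumptions (check ./train-text-from-scratch --help).
import subprocess

subprocess.run([
    "./train-text-from-scratch",
    "--vocab-model", "models/ggml-vocab-llama.gguf",  # tokenizer vocab (assumed path)
    "--ctx", "4096",        # base context length
    "--embd", "384",        # embedding size
    "--head", "8",          # attention heads
    "--layer", "40",        # layers
    "-b", "16",             # batch size
    "--adam-iter", "10",    # 10 iterations
    "--train-data", "openorca-flat.txt",      # hypothetical flattened text file
    "--model-out", "wikichat-v0.2-f32.gguf",  # placeholder output name
], check=True)
```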
Notes:
The model isn't ready yet; this release is meant to test OpenOrca tokenization and to find a balance between training speed and model size.
Example output:
User: What is the square root of 4?
Assistant: The square root of 4 is 2.