# Triptuner Model

This model generates itineraries for locations in Sri Lanka's Central Province. It is a custom transformer-based language model that operates on character-level sequences.
## Usage

Because Triptuner uses a custom architecture, it cannot be used directly with Hugging Face's built-in Inference API. The instructions below show how to load and run the model manually with PyTorch.

### Load and Use the Model with PyTorch
```python
import torch
import torch.nn as nn

# Define the custom model class. The skeleton below must be filled in to
# match the architecture used at training time (see Model Architecture below;
# a full reference sketch is given after this snippet).
class BigramLanguageModel(nn.Module):
    def __init__(self):
        super().__init__()
        # Define the model layers as in the training setup, e.g.:
        # self.token_embedding_table = nn.Embedding(vocab_size, n_embd)
        # self.position_embedding_table = nn.Embedding(block_size, n_embd)
        # self.blocks = nn.Sequential(*[Block(n_embd, n_head=n_head) for _ in range(n_layer)])
        # self.ln_f = nn.LayerNorm(n_embd)
        # self.lm_head = nn.Linear(n_embd, vocab_size)

    def forward(self, idx, targets=None):
        # Define the forward pass as in the training setup
        pass

    def generate(self, idx, max_new_tokens):
        # Implement autoregressive generation as in the training setup
        pass

# Load the model weights from Hugging Face
model = BigramLanguageModel()
model_url = "https://huggingface.co/yoonusajwardapiit/triptuner/resolve/main/pytorch_model.bin"
model_weights = torch.hub.load_state_dict_from_url(
    model_url, map_location=torch.device("cpu"), weights_only=True
)
model.load_state_dict(model_weights)
model.eval()

# Define the character mappings; replace the placeholder string with the
# exact character set used during training so ids match the checkpoint
chars = sorted(list(set("your_training_text_here")))
stoi = {ch: i for i, ch in enumerate(chars)}
itos = {i: ch for i, ch in enumerate(chars)}
encode = lambda s: [stoi[c] for c in s]
decode = lambda l: "".join([itos[i] for i in l])

# Test the model with a sample prompt
prompt = "Hanthana"  # Replace with any relevant location or prompt
context = torch.tensor([encode(prompt)], dtype=torch.long)

# Generate text using the model
with torch.no_grad():
    generated = model.generate(context, max_new_tokens=250)  # Adjust the number of new tokens as needed

# Decode and print the generated text
generated_text = decode(generated[0].tolist())
print(generated_text)
```
## Training Data
The model was trained on a dataset containing information about various locations in Sri Lanka's Central Province.
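Because the tokenizer is simply the sorted set of unique characters in the training text, the vocabulary can only be reproduced from the original corpus. A minimal sketch of the data preparation, assuming the corpus is available as a plain-text file (the filename below is hypothetical):

```python
# Hypothetical data preparation; "central_province.txt" is a placeholder name,
# as the original corpus file is not published with the model.
with open("central_province.txt", "r", encoding="utf-8") as f:
    text = f.read()

# Rebuild the character vocabulary exactly as at training time so that
# token ids line up with the checkpoint's embedding table.
chars = sorted(list(set(text)))
vocab_size = len(chars)
print(f"vocab_size={vocab_size}")
```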
## Model Architecture
- Number of Layers: 4
- Embedding Size: 64
- Number of Heads: 4
- Context Length: 32 tokens (characters, since the model is character-level)
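Assuming the sketched definition above, a quick sanity check is to count the parameters these hyperparameters imply (the exact total depends on `vocab_size`):

```python
# Parameter count for the sketched model; the total varies with vocab_size.
model = BigramLanguageModel()
n_params = sum(p.numel() for p in model.parameters() if p.requires_grad)
print(f"{n_params:,} trainable parameters")
```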
## License
MIT License