lafontaine-gpt / README.md
Alexandre D-Julin
metadata
license: apache-2.0
tags:
  - text-generation
  - transformers
  - language-model
  - bigram-model
  - lafontaine
model-index:
  - name: Lafontaine GPT Model
    results:
      - task:
          type: text-generation
        dataset:
          name: La Fontaine's Fables
          type: custom
        metrics:
          - type: Perplexity
            value: 15.2

Lafontaine GPT Model

This is a language model trained on La Fontaine's fables. It uses a transformer-based architecture to generate text in La Fontaine's style.

Using the Model with Gradio

To interact with the model, you can use the following Gradio script:

import gradio as gr
import torch

# Assumes 'BigramLanguageModel', 'encode', and 'decode' are defined as in your model code

class GradioInterface:
    def __init__(self, model_path="lafontaine_gpt_v1.pth"):
        self.device = 'cuda' if torch.cuda.is_available() else 'cpu'
        self.model = self.load_model(model_path)
        self.model.eval()

    def load_model(self, model_path):
        model = BigramLanguageModel().to(self.device)
        model.load_state_dict(torch.load(model_path, map_location=self.device))
        return model

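    def generate_text(self, input_text, max_tokens=100):
        # Gradio sliders can return floats; generate() expects an integer count
        max_tokens = int(max_tokens)
        context = torch.tensor([encode(input_text)], dtype=torch.long, device=self.device)
        # Inference only, so skip gradient tracking
        with torch.no_grad():
            output = self.model.generate(context, max_new_tokens=max_tokens)
        return decode(output[0].tolist())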

# Load the model
model_interface = GradioInterface()

# Define Gradio interface
gr_interface = gr.Interface(
    fn=model_interface.generate_text,
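    inputs=["text", gr.Slider(50, 500, value=100, step=1, label="Max new tokens")],
    outputs="text",
    description="Bigram Language Model text generation. Enter some text, and the model will continue it.",
    # Each example row needs one value per input component
    examples=[["Once upon a time", 100]]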
)

# Launch the interface
gr_interface.launch()
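
The script above relies on `encode` and `decode` without defining them. As a minimal sketch, assuming the model uses a character-level vocabulary built from the training text (an assumption; the actual tokenizer is not described here), they could look like the following. The helper names `build_vocab` and `make_codec` are hypothetical.

```python
def build_vocab(text):
    """Return (stoi, itos) lookup tables over the sorted unique characters of text."""
    chars = sorted(set(text))
    stoi = {ch: i for i, ch in enumerate(chars)}
    itos = {i: ch for ch, i in stoi.items()}
    return stoi, itos


def make_codec(text):
    """Build encode/decode functions from the training text (assumed character-level)."""
    stoi, itos = build_vocab(text)

    def encode(s):
        # Raises KeyError for any character that never appeared in the training text
        return [stoi[c] for c in s]

    def decode(ids):
        return "".join(itos[i] for i in ids)

    return encode, decode
```

With this scheme, a prompt containing characters that are absent from the fables raises a `KeyError`, so an interface may want to filter or reject such input before calling the model.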

Model Details

  • Architecture: Transformer-based language model (implemented by the `BigramLanguageModel` class used in the script above)
  • Dataset: La Fontaine's fables
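
The model card metadata reports a perplexity of 15.2 on the fables. How that figure was computed (per character or per token, and on which data split) is not stated. As a generic sketch, perplexity is the exponential of the mean negative log-likelihood per token; the function name and its input format below are assumptions for illustration.

```python
import math


def perplexity(token_log_probs):
    """Perplexity from per-token natural-log probabilities assigned by a model."""
    if not token_log_probs:
        raise ValueError("need at least one token log-probability")
    # Mean negative log-likelihood per token
    mean_nll = -sum(token_log_probs) / len(token_log_probs)
    return math.exp(mean_nll)
```

Under this definition, a model that assigns every token a probability of 1/15.2 has a perplexity of 15.2, so the reported value can be read as the model being, on average, about as uncertain as a uniform choice among roughly 15 options.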

How to Use

To use this model in your own project, instantiate `BigramLanguageModel`, load the weights from `lafontaine_gpt_v1.pth`, and call `generate` on your encoded input text.
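
For use outside Gradio, the loading and generation steps can be sketched as two small helpers. This is a minimal sketch, assuming the model object exposes `generate(context, max_new_tokens=...)` as in the Gradio script and that you supply your own `encode` and `decode` functions. The helper names `load_weights` and `continue_text` are hypothetical.

```python
import torch


def load_weights(model, weights_path, device="cpu"):
    """Load a saved state dict into an already constructed model."""
    state_dict = torch.load(weights_path, map_location=device)
    model.load_state_dict(state_dict)
    model.to(device)
    model.eval()  # switch off dropout and similar training-only behavior
    return model


def continue_text(model, encode, decode, prompt, max_new_tokens=100, device="cpu"):
    """Encode the prompt, generate new tokens, and decode the full sequence."""
    context = torch.tensor([encode(prompt)], dtype=torch.long, device=device)
    with torch.no_grad():  # inference only, no gradients needed
        output = model.generate(context, max_new_tokens=int(max_new_tokens))
    return decode(output[0].tolist())
```

A call might look like `model = load_weights(BigramLanguageModel(), "lafontaine_gpt_v1.pth")` followed by `continue_text(model, encode, decode, "Maître Corbeau")`, where the prompt is only an illustration.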