Gerson Fabian Buenahora Ormaza committed
Commit f912326
Parent: 609ed28

Update README.md

Files changed (1): README.md (+101 -3)
---
license: mit
datasets:
- neural-bridge/rag-dataset-12000
language:
- en
---

# RAGPT: Fine-tuned GPT-2 for Context-Based Question Answering

## Model Description

RAGPT is a fine-tuned version of GPT-2 small, specifically adapted for context-based question answering. The model has been trained to generate relevant answers from a given context and question, similar to the generator component of a Retrieval-Augmented Generation (RAG) system.

### Key Features

- Based on the GPT-2 small architecture (124M parameters)
- Fine-tuned on the "neural-bridge/rag-dataset-12000" dataset from Hugging Face
- Capable of generating answers based on a provided context and question
- Suitable for various question-answering applications

## Training Data

The model was fine-tuned using the "neural-bridge/rag-dataset-12000" dataset, which contains:
- Context passages
- Questions related to the context
- Corresponding answers (see the loading sketch below)

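For illustration, the dataset can be loaded and inspected with the `datasets` library. This is a minimal sketch, assuming the columns are named `context`, `question`, and `answer` as described above:

```python
from datasets import load_dataset

# Load the dataset used for fine-tuning (downloads from the Hugging Face Hub).
dataset = load_dataset("neural-bridge/rag-dataset-12000")

# Peek at one training example. The column names are assumed from the
# description above and may differ in the actual dataset.
example = dataset["train"][0]
print(example["context"][:200])
print(example["question"])
print(example["answer"])
```
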
## Fine-tuning Process

The fine-tuning process involved:
1. Loading the pre-trained GPT-2 small model
2. Preprocessing the dataset to combine context, question, and answer into a single text (sketched below)
3. Training the model to predict the next token given the context and question

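A minimal sketch of the preprocessing step, assuming each example is flattened into the same prompt template shown in the Usage section (the exact template used during training is not documented here, so treat the field labels as an assumption):

```python
from datasets import load_dataset
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
tokenizer.pad_token = tokenizer.eos_token  # GPT-2 has no pad token by default

def build_training_text(example):
    # Flatten one example into a single training string. The labels mirror
    # the prompt in the Usage section; the exact training template is assumed.
    return {
        "text": f"Contexto: {example['context']}\n"
                f"Pregunta: {example['question']}\n"
                f"Respuesta: {example['answer']}"
    }

def tokenize(example):
    # Truncate to the 512-token limit listed under Hyperparameters.
    return tokenizer(example["text"], truncation=True, max_length=512)

dataset = load_dataset("neural-bridge/rag-dataset-12000")
tokenized = dataset["train"].map(build_training_text).map(tokenize)
```
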
### Hyperparameters

- Base model: GPT-2 small
- Number of training epochs: 3
- Batch size: 4
- Learning rate: default AdamW optimizer settings
- Max sequence length: 512 tokens

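These settings map onto a standard `transformers` `Trainer` configuration. The following is a sketch under those assumptions, not the authors' actual training script; `output_dir` and the data collator choice are illustrative:

```python
from transformers import (AutoModelForCausalLM, DataCollatorForLanguageModeling,
                          Trainer, TrainingArguments)

model = AutoModelForCausalLM.from_pretrained("gpt2")  # GPT-2 small, 124M parameters

training_args = TrainingArguments(
    output_dir="ragpt-finetune",          # illustrative output path
    num_train_epochs=3,                   # epochs listed above
    per_device_train_batch_size=4,        # batch size listed above
    # Learning rate is left at the default AdamW settings, as stated above.
)

trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=tokenized,              # tokenized split from the preprocessing sketch
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```
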
## Usage

To use the model:

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

model_name = "BueormLLC/RAGPT"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# Prepare the input. The prompt keeps the Spanish field labels
# (Contexto/Pregunta/Respuesta) used in the original model card.
context = "Your context here"
question = "Your question here"
input_text = f"Contexto: {context}\nPregunta: {question}\nRespuesta:"

# Generate an answer (pad_token_id avoids a warning, since GPT-2 has no pad token).
input_ids = tokenizer.encode(input_text, return_tensors="pt")
output = model.generate(input_ids, max_length=150, num_return_sequences=1,
                        pad_token_id=tokenizer.eos_token_id)
answer = tokenizer.decode(output[0], skip_special_tokens=True)
```
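
Because the decoded string echoes the prompt, it can help to keep only the generated continuation. A small helper, assuming the prompt format above:

```python
# Split on the final "Respuesta:" label to keep only the generated answer
# (assumes the prompt template shown above).
answer_only = answer.split("Respuesta:")[-1].strip()
print(answer_only)
```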

## Limitations

- The model's knowledge is limited to its training data and the base GPT-2 model.
- It may sometimes generate irrelevant or incorrect answers, especially for topics outside its training domain.
- The model does not have access to external information or real-time data.

## Ethical Considerations

Users should be aware that this model, like all language models, may reflect biases present in its training data. It should not be used as a sole source of information for critical decisions.

## Future Improvements

- Fine-tuning on a larger and more diverse dataset
- Experimenting with larger base models (e.g., GPT-2 medium or large)
- Implementing techniques to improve factual accuracy and reduce hallucinations

## Support us

- [PayPal](https://paypal.me/bueorm)
- [Patreon](https://patreon.com/bueorm)

### We appreciate your support; without you, we could not do what we do.

## Citation

If you use this model in your research, please cite:

```bibtex
@misc{RAGPT,
  author = {BueormLLC},
  title = {RAGPT: Fine-tuned GPT-2 for Context-Based Question Answering},
  year = {2024},
  publisher = {Hugging Face},
  howpublished = {\url{https://huggingface.co/BueormLLC/RAGPT}}
}
```