---
license: apache-2.0
pipeline_tag: text-generation
tags:
- code
datasets:
- semiotic/SynQL-KaggleDBQA-Train
language:
- en
base_model:
- google-t5/t5-3b
---

# Model Card for T5-3B/SynQL-KaggleDBQA-Train-Run-02
- Developed by: Semiotic Labs
- Model type: Text to SQL
- License: Apache-2.0
- Finetuned from model: [google-t5/t5-3b](https://huggingface.co/google-t5/t5-3b)
- Dataset used for finetuning: [semiotic/SynQL-KaggleDBQA-Train](https://huggingface.co/datasets/semiotic/SynQL-KaggleDBQA-Train/blob/main/README.md)

## Model Context

The example below shows the metadata for a single sample. The `context` field is the prompt presented to the model: the question, the database ID, and the database schema, separated by ` | `. Database schemas follow the encoding method proposed by [Shaw et al. (2020)](https://arxiv.org/pdf/2010.12725).
```
"query": "SELECT count(*) FROM singer",
"question": "How many singers do we have?",
"context": "How many singers do we have? | concert_singer | stadium : stadium_id, location, name, capacity, highest, lowest, average | singer : singer_id, name, country, song_name, song_release_year, age, is_male | concert : concert_id, concert_name, theme, stadium_id, year | singer_in_concert : concert_id, singer_id",
"db_id": "concert_singer",
```
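
The `context` string in the example above can be assembled from its parts. The following is a minimal sketch, assuming the layout is exactly what the example shows (`question | db_id | table : col, col | ...`); the function name `build_context` and the dictionary representation of the schema are hypothetical and not part of the released code.

```python
def build_context(question: str, db_id: str, schema: dict[str, list[str]]) -> str:
    """Serialize a question and a database schema into a single prompt string.

    Assumed layout (inferred from the example in this card):
        question | db_id | table : col, col | table : col, col ...
    """
    # One "table : col, col, ..." segment per table, in insertion order.
    tables = [f"{table} : {', '.join(columns)}" for table, columns in schema.items()]
    return " | ".join([question, db_id, *tables])


# Hypothetical schema dictionary mirroring the concert_singer example.
concert_singer = {
    "stadium": ["stadium_id", "location", "name", "capacity",
                "highest", "lowest", "average"],
    "singer": ["singer_id", "name", "country", "song_name",
               "song_release_year", "age", "is_male"],
    "concert": ["concert_id", "concert_name", "theme", "stadium_id", "year"],
    "singer_in_concert": ["concert_id", "singer_id"],
}

context = build_context("How many singers do we have?", "concert_singer", concert_singer)
print(context)
```

The resulting string is what would be tokenized and passed to the model as input; the model is then expected to generate the SQL in the `query` field.
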
## Model Results

Evaluation set: [KaggleDBQA/test](https://github.com/Chia-Hsuan-Lee/KaggleDBQA)

Evaluation metric: Execution Accuracy

| Model | Data | Run | Execution Accuracy | 
|-------|------|-----|-------------------|
| T5-3B | semiotic/SynQL-KaggleDBQA | 00 | 0.3514 |
| T5-3B | semiotic/SynQL-KaggleDBQA | 01 | 0.3514 |
| T5-3B | semiotic/SynQL-KaggleDBQA | 02 | 0.3514 |
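
Execution accuracy counts a predicted query as correct when running it against the database returns the same result as the gold query. The sketch below illustrates the idea with Python's built-in `sqlite3` module. It is not the evaluation script used to produce the table above: the helper names are hypothetical, and comparing results as unordered multisets of rows is an assumption about how matches are judged.

```python
import sqlite3
from collections import Counter


def run_query(conn: sqlite3.Connection, sql: str):
    """Return the result rows as a multiset, or None if the query fails."""
    try:
        return Counter(conn.execute(sql).fetchall())
    except sqlite3.Error:
        return None


def execution_accuracy(conn: sqlite3.Connection, pairs) -> float:
    """Fraction of (predicted, gold) SQL pairs whose execution results match.

    Assumption: row order is ignored; a prediction that fails to execute
    counts as incorrect.
    """
    pairs = list(pairs)
    if not pairs:
        return 0.0
    correct = 0
    for predicted, gold in pairs:
        gold_rows = run_query(conn, gold)
        if gold_rows is not None and run_query(conn, predicted) == gold_rows:
            correct += 1
    return correct / len(pairs)


# Toy in-memory database for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE singer (singer_id INTEGER, name TEXT)")
conn.executemany("INSERT INTO singer VALUES (?, ?)", [(1, "A"), (2, "B")])

pairs = [
    ("SELECT count(*) FROM singer", "SELECT count(*) FROM singer"),  # match
    ("SELECT name FROM singer", "SELECT count(*) FROM singer"),      # mismatch
]
print(execution_accuracy(conn, pairs))
```

Because correctness is judged on results rather than on query text, two differently written queries that return the same rows both count as correct.
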