Question Answering
Transformers
PyTorch
Safetensors
French
camembert
Inference Endpoints
BorisAlbar committed on
Commit
7c9e0ff
1 Parent(s): b9be56c

Upload model to v2

Files changed (4)
  1. README.md +37 -11
  2. config.json +1 -1
  3. pytorch_model.bin +2 -2
  4. tokenizer.json +2 -16
README.md CHANGED
@@ -39,25 +39,51 @@ This represents a total of over **138 061 questions/answers pairs used to finet
39
  | [PIAFv1.2](https://www.data.gouv.fr/en/datasets/piaf-le-dataset-francophone-de-questions-reponses/)| SQuAD v1 | 9 225 Q & A | X | X |
40
  | [FQuADv1.0](https://fquad.illuin.tech/)| SQuAD v1 | 20 731 Q & A | 3 188 Q & A (not used in training because it serves as a test dataset) | 2 189 Q & A (not used in our work because not freely available)|
41
  | [lincoln/newsquadfr](https://huggingface.co/datasets/lincoln/newsquadfr) | SQuAD v1 | 1 650 Q & A | 455 Q & A (not used in our work) | 415 Q & A (not used in our work) |
42
- | [pragnakalp/squad_v2_french_translated](https://huggingface.co/datasets/pragnakalp/squad_v2_french_translated)| SQuAD v2 | 79 069 Q & A | X | X |
43
- | [Mfa]()♪ | SQuAD v2 | 27 386 Q & A | X | X |
44
-
45
- ♪ this fifth data set will be added soon.
46
 
47
  ## Evaluation results
48
- ### FQuAD v1.0 Evaluation
49
- ```shell
50
- {"f1": 80.75789384679857, "exact_match": 57.214554579673774}
51
- ```
52
 
53
 
54
- ### Benchmark
55
 
56
  | Model | Exact_match | F1-score |
57
  | ----------- | ----------- | ----------- |
58
- | [etalab-ia/camembert-base-squadFR-fquad-piaf](https://huggingface.co/etalab-ia/camembert-base-squadFR-fquad-piaf) | 55.14 | 79.81 |
59
- | QAmembert | **57.21** | **80.76** |
60
 
 
61
 
62
  ## Usage
63
  ### Example with answer in the context
 
39
  | [PIAFv1.2](https://www.data.gouv.fr/en/datasets/piaf-le-dataset-francophone-de-questions-reponses/)| SQuAD v1 | 9 225 Q & A | X | X |
40
  | [FQuADv1.0](https://fquad.illuin.tech/)| SQuAD v1 | 20 731 Q & A | 3 188 Q & A (not used in training because it serves as a test dataset) | 2 189 Q & A (not used in our work because not freely available)|
41
  | [lincoln/newsquadfr](https://huggingface.co/datasets/lincoln/newsquadfr) | SQuAD v1 | 1 650 Q & A | 455 Q & A (not used in our work) | 415 Q & A (not used in our work) |
42
+ | [pragnakalp/squad_v2_french_translated](https://huggingface.co/datasets/pragnakalp/squad_v2_french_translated)| SQuAD v2 | 79 069 Q & A | X | X |
43
 
44
  ## Evaluation results
45
 
46
+ The evaluation was carried out using the [**evaluate**](https://pypi.org/project/evaluate/) Python package.
47
+
48
+ ### FQuAD 1.0 (validation)
49
+
50
+ The metric used is SQuAD v1.
51
+
52
+ | Model | Exact_match | F1-score |
53
+ | ----------- | ----------- | ----------- |
54
+ | [etalab-ia/camembert-base-squadFR-fquad-piaf](https://huggingface.co/etalab-ia/camembert-base-squadFR-fquad-piaf) | 53.60 | 78.09 |
55
+ | QAmembert (previous version) | 54.26 | 77.87 |
56
+ | QAmembert (this version) | 53.98 | 78.00 |
57
+ | QAmembert-large ♪ | **55.95** | **81.05** |
58
+ | [fT0](https://huggingface.co/CATIE-AQ/frenchT0) | 41.15 | 65.79 |
59
+
60
+ ♪ this model is available on demand only.
61
+
62
+ ### qwant/squad_fr (validation)
63
 
64
+ The metric used is SQuAD v1.
65
 
66
  | Model | Exact_match | F1-score |
67
  | ----------- | ----------- | ----------- |
68
+ | [etalab-ia/camembert-base-squadFR-fquad-piaf](https://huggingface.co/etalab-ia/camembert-base-squadFR-fquad-piaf) | 60.17 | 78.27 |
69
+ | QAmembert (previous version) | 60.40 | 77.27 |
70
+ | QAmembert (this version) | 60.95 | 77.30 |
71
+ | QAmembert-large ♪ | **65.58** | **81.74** |
72
+
73
+ ♪ this model is available on demand only.
74
+
75
+ ### frenchQA
76
+
77
+ This dataset includes questions with no answer in the context. The metric used is SQuAD v2.
78
+
79
+ | Model | Exact_match | F1-score | Answer_f1 | NoAnswer_f1 |
80
+ | ----------- | ----------- | ----------- | ----------- | ----------- |
81
+ | [etalab-ia/camembert-base-squadFR-fquad-piaf](https://huggingface.co/etalab-ia/camembert-base-squadFR-fquad-piaf) | n/a | n/a | n/a | n/a |
82
+ | QAmembert (previous version) | 60.28 | 71.29 | 75.92 | 66.65 |
83
+ | QAmembert (this version) | **77.14** | 86.88 | 75.66 | 98.11 |
84
+ | QAmembert-large ♪ | **77.14** | **88.74** | **78.83** | **98.65** |
85
 
86
+ ♪ this model is available on demand only.
87
 
88
  ## Usage
89
  ### Example with answer in the context
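
Note on the metric: the new README tables report `Exact_match` and `F1-score` under the SQuAD v1 metric. The sketch below illustrates how these two scores are conventionally defined (normalized string equality and token-overlap F1, taking the best score over the gold answers). It is a simplified pure-Python illustration, not the `evaluate` package's code; the function names are hypothetical, and the article-stripping step targets English articles as in the original SQuAD script. Whether it reproduces the exact numbers in this commit is an assumption.

```python
import re
import string
from collections import Counter


def normalize_answer(text: str) -> str:
    """Lowercase, drop punctuation and English articles, collapse whitespace."""
    text = text.lower()
    punctuation = set(string.punctuation)
    text = "".join(ch for ch in text if ch not in punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction: str, reference: str) -> float:
    """1.0 if the normalized strings are identical, else 0.0."""
    return float(normalize_answer(prediction) == normalize_answer(reference))


def f1_score(prediction: str, reference: str) -> float:
    """Token-level F1 between the normalized prediction and reference."""
    pred_tokens = normalize_answer(prediction).split()
    ref_tokens = normalize_answer(reference).split()
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


def squad_v1_scores(predictions, references):
    """predictions: list of strings; references: list of lists of gold answers.

    Returns both scores as percentages, averaged over all questions.
    """
    em_total = 0.0
    f1_total = 0.0
    for pred, golds in zip(predictions, references):
        em_total += max(exact_match(pred, gold) for gold in golds)
        f1_total += max(f1_score(pred, gold) for gold in golds)
    count = len(predictions)
    return {"exact_match": 100.0 * em_total / count, "f1": 100.0 * f1_total / count}


# Toy data (made up for illustration, not taken from any benchmark).
scores = squad_v1_scores(["Paris", "en 1998"], [["Paris"], ["1998"]])
```

The gap between the two columns in the tables is consistent with this definition: a prediction that overlaps the gold span without matching it exactly earns partial F1 credit but no exact-match credit.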
config.json CHANGED
@@ -21,7 +21,7 @@
21
  "pad_token_id": 1,
22
  "position_embedding_type": "absolute",
23
  "torch_dtype": "float32",
24
- "transformers_version": "4.24.0",
25
  "type_vocab_size": 1,
26
  "use_cache": true,
27
  "vocab_size": 32005
 
21
  "pad_token_id": 1,
22
  "position_embedding_type": "absolute",
23
  "torch_dtype": "float32",
24
+ "transformers_version": "4.26.1",
25
  "type_vocab_size": 1,
26
  "use_cache": true,
27
  "vocab_size": 32005
pytorch_model.bin CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:6204c15c7cef356c5a2b4b4c254c71adf9564022176bf383168968f0b09e8115
3
- size 440202673
 
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:36796fd3145baf67e83b7878ce5793998e26115a4dac47d9a5a8fee831a214d7
3
+ size 440204333
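
Note on the pointer file: `pytorch_model.bin` is tracked with Git LFS, so the diff shows only a pointer whose `oid` is the SHA-256 of the file contents and whose `size` is the length in bytes. A minimal sketch of checking a downloaded file against such a pointer follows; the helper names and the local path in the usage comment are hypothetical.

```python
import hashlib
from pathlib import Path


def parse_lfs_pointer(pointer_text: str) -> dict:
    """Parse the 'key value' lines of a Git LFS pointer file."""
    fields = {}
    for line in pointer_text.strip().splitlines():
        key, _, value = line.strip().partition(" ")
        fields[key] = value
    algo, _, digest = fields["oid"].partition(":")
    return {"algo": algo, "digest": digest, "size": int(fields["size"])}


def matches_pointer(path, pointer: dict, chunk_size: int = 1 << 20) -> bool:
    """Return True if the file at `path` has the pointer's size and digest."""
    path = Path(path)
    if path.stat().st_size != pointer["size"]:
        return False
    hasher = hashlib.new(pointer["algo"])
    with path.open("rb") as handle:
        # Read in chunks so large weight files are not loaded into memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest() == pointer["digest"]


# Pointer contents as shown on the new side of this diff.
new_pointer = parse_lfs_pointer(
    "version https://git-lfs.github.com/spec/v1\n"
    "oid sha256:36796fd3145baf67e83b7878ce5793998e26115a4dac47d9a5a8fee831a214d7\n"
    "size 440204333\n"
)
# Usage (hypothetical local path): matches_pointer("pytorch_model.bin", new_pointer)
```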
tokenizer.json CHANGED
@@ -1,21 +1,7 @@
1
  {
2
  "version": "1.0",
3
- "truncation": {
4
- "direction": "Right",
5
- "max_length": 512,
6
- "strategy": "OnlySecond",
7
- "stride": 128
8
- },
9
- "padding": {
10
- "strategy": {
11
- "Fixed": 512
12
- },
13
- "direction": "Right",
14
- "pad_to_multiple_of": null,
15
- "pad_id": 1,
16
- "pad_type_id": 0,
17
- "pad_token": "<pad>"
18
- },
19
  "added_tokens": [
20
  {
21
  "id": 0,
 
1
  {
2
  "version": "1.0",
3
+ "truncation": null,
4
+ "padding": null,
 
5
  "added_tokens": [
6
  {
7
  "id": 0,
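
Note on the removed settings: the old `tokenizer.json` stored a truncation policy (`max_length` 512, `stride` 128, strategy `OnlySecond`) and fixed padding to 512, and the new file sets both to `null`. The sketch below is a simplified model of what a stride-based truncation of the second sequence (the context) produces: overlapping windows in which consecutive windows share `stride` tokens. It is an illustration only, not the tokenizers library's implementation. The function name is hypothetical, and it ignores special tokens and the share of the 512-token budget taken by the question.

```python
def sliding_windows(context_tokens, max_context_len, stride):
    """Split a token list into overlapping windows.

    Each window holds at most `max_context_len` tokens, and consecutive
    windows overlap by `stride` tokens.
    """
    if not 0 <= stride < max_context_len:
        raise ValueError("stride must satisfy 0 <= stride < max_context_len")
    step = max_context_len - stride
    windows = []
    start = 0
    while True:
        windows.append(context_tokens[start:start + max_context_len])
        if start + max_context_len >= len(context_tokens):
            break  # the current window already reaches the end of the context
        start += step
    return windows


# Toy example with integers standing in for token ids.
toy_windows = sliding_windows(list(range(10)), max_context_len=4, stride=2)
```

With the values from the old file, a long context would be cut into windows that advance by 512 minus 128 tokens at a time, so an answer falling near a window boundary still appears whole in a neighbouring window.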