Alexander Seifert committed • d19773e
Parent(s): d5ecc0d
typos
README.md
CHANGED
@@ -15,10 +15,10 @@ pinned: true
 Error Analysis is an important but often overlooked part of the data science project lifecycle, for which there is still very little tooling available. Practitioners tend to write throwaway code or, worse, skip this crucial step of understanding their models' errors altogether. This project tries to provide an extensive toolkit to probe any NER model/dataset combination, find labeling errors and understand the models' and datasets' limitations, leading the user on her way to further improvements.
 
 
-Some interesting
+Some interesting visualization techniques:
 
-* customizable visualization of neural network activation, based on the embedding and the feed-forward layers of
-* customizable similarity map of a 2d projection of
+* customizable visualization of neural network activation, based on the embedding layer and the feed-forward layers of the selected transformer model. (https://aclanthology.org/2021.acl-demo.30/)
+* customizable similarity map of a 2d projection of the model's final layer's hidden states, using various algorithms (a bit like the [Tensorflow Embedding Projector](https://projector.tensorflow.org/))
 * inline HTML representation of samples with token-level prediction + labels (my own; see 'Samples by loss' page for more info)
 * automatic selection of foreground-color (black/white) for a user-selected background-color
 * some fancy pandas styling here and there
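The first two bullets hinge on reading per-token activations out of the model. A minimal, self-contained sketch of how that generally looks with the Hugging Face `transformers` API; the checkpoint and example sentence are placeholders, not this Space's actual setup:

```python
# Sketch: pull per-token activations from a transformer.
# `distilbert-base-cased` and the sentence are placeholders.
import torch
from transformers import AutoModel, AutoTokenizer

model_name = "distilbert-base-cased"  # placeholder checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModel.from_pretrained(model_name, output_hidden_states=True)

inputs = tokenizer("John Smith lives in Berlin", return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# `hidden_states` is a tuple: the embedding-layer output first, then one
# (batch, seq_len, hidden_size) tensor per transformer layer. Inspecting the
# feed-forward sublayers themselves would need forward hooks on those modules.
embedding_activations = outputs.hidden_states[0][0]  # per-token embedding output
final_hidden_states = outputs.hidden_states[-1][0]   # per-token last-layer states
print(embedding_activations.shape, final_hidden_states.shape)
```

The last-layer states are also what the similarity-map bullet would project down to two dimensions.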
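For the foreground-color bullet, one common heuristic is WCAG relative luminance; whether the app uses exactly this rule is an assumption, and the function name is made up for illustration:

```python
# Assumed approach: pick black or white text via WCAG relative luminance.
def foreground_for(background_hex: str) -> str:
    """Return black or white, whichever reads better on the background."""
    r, g, b = (int(background_hex.lstrip("#")[i:i + 2], 16) / 255 for i in (0, 2, 4))

    def linearize(c: float) -> float:
        # sRGB channel -> linear-light value
        return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4

    luminance = 0.2126 * linearize(r) + 0.7152 * linearize(g) + 0.0722 * linearize(b)
    # ~0.179 is the luminance where black and white text have equal contrast
    return "#000000" if luminance > 0.179 else "#ffffff"

print(foreground_for("#1f77b4"))  # darkish blue background -> "#ffffff"
```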
@@ -44,7 +44,7 @@ A group of neurons tend to fire in response to commas and other punctuation. Oth
 
 ### Embeddings
 
-For every token in the dataset, we take its hidden state and project it onto a two-dimensional plane. Data points are colored by label/prediction, with mislabeled examples
+For every token in the dataset, we take its hidden state and project it onto a two-dimensional plane. Data points are colored by label/prediction, with mislabeled examples marked by a small black border.
 
 
 ### Probing
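The Embeddings paragraph describes the projection step. A rough sketch of what it might look like, with PCA standing in for whichever of the "various algorithms" the app actually offers, and synthetic data in place of real hidden states:

```python
# Sketch: project per-token hidden states to 2D and give tokens whose
# prediction disagrees with their label a black border.
import numpy as np
import matplotlib.pyplot as plt
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
hidden_states = rng.normal(size=(500, 768))  # stand-in for real hidden states
labels = rng.integers(0, 5, size=500)        # toy gold labels
predictions = labels.copy()
flip = rng.random(500) < 0.05                # simulate a few mispredictions
predictions[flip] = (predictions[flip] + 1) % 5

points = PCA(n_components=2).fit_transform(hidden_states)
mismatch = labels != predictions

# color by label; edge mismatching tokens in black so they stand out
plt.scatter(points[:, 0], points[:, 1], c=labels, s=14,
            edgecolors=np.where(mismatch, "black", "none"), linewidths=0.8)
plt.show()
```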