Commit c8d36ae • Alexander Seifert committed • Parent(s): 7a75a86
add stuff for vis2
- .gitignore +1 -0
- Makefile +8 -0
- README.md +0 -20
- html/index.md +102 -0
- html/screenshot.jpg +0 -0
- presentation.pdf +0 -0
- requirements.txt +1 -0
- src/app.py +7 -0
- src/data.py +4 -3
- src/subpages/attention.py +4 -1
- src/subpages/find_duplicates.py +1 -0
- src/subpages/hidden_states.py +3 -0
- src/subpages/inspect.py +1 -0
- src/subpages/losses.py +1 -0
- src/subpages/lossy_samples.py +1 -0
- src/subpages/metrics.py +3 -0
- src/subpages/misclassified.py +1 -0
- src/subpages/probing.py +3 -0
- src/subpages/random_samples.py +1 -0
- src/subpages/raw_data.py +1 -0
.gitignore
CHANGED
@@ -163,3 +163,4 @@ cache_dir/
 outputs/
 models/
 runs/
+.vscode/
Makefile
ADDED
@@ -0,0 +1,8 @@
+doc:
+	pdoc --docformat google src -o doc
+
+vis2: doc
+	pandoc html/index.md -s -o html/index.html
+
+run:
+	python -m streamlit run src/app.py
README.md
CHANGED
@@ -14,26 +14,6 @@ pinned: true
 
 Error Analysis is an important but often overlooked part of the data science project lifecycle, for which there is still very little tooling available. Practitioners tend to write throwaway code or, worse, skip this crucial step of understanding their models' errors altogether. This project tries to provide an extensive toolkit to probe any NER model/dataset combination, find labeling errors and understand the models' and datasets' limitations, leading the user on her way to further improvements.
 
-
-Some interesting visualization techniques:
-
-* customizable visualization of neural network activation, based on the embedding layer and the feed-forward layers of the selected transformer model. (https://aclanthology.org/2021.acl-demo.30/)
-* customizable similarity map of a 2d projection of the model's final layer's hidden states, using various algorithms (a bit like the [Tensorflow Embedding Projector](https://projector.tensorflow.org/))
-* inline HTML representation of samples with token-level prediction + labels (my own; see 'Samples by loss' page for more info)
-* automatic selection of foreground-color (black/white) for a user-selected background-color
-* some fancy pandas styling here and there
-
-
-Libraries important to this project:
-
-* `streamlit` for demoing (custom multi-page feature hacked in, also using session state)
-* `plotly` and `matplotlib` for charting
-* `transformers` for providing the models, and `datasets` for, well, the datasets
-* a forked, slightly modified version of [`ecco`](https://github.com/jalammar/ecco) for visualizing the neural net activations
-* `sentence_transformers` for finding potential duplicates
-* `scikit-learn` for TruncatedSVD & PCA, `umap-learn` for UMAP
-
-
 ## Sections
 
 
html/index.md
ADDED
@@ -0,0 +1,102 @@
+---
+title: "🏷️ ExplaiNER"
+subtitle: "Error Analysis for NER models & datasets"
+---
+
+Error Analysis is an important but often overlooked part of the data science project lifecycle, for which there is still very little tooling available. Practitioners tend to write throwaway code or, worse, skip this crucial step of understanding their models' errors altogether. This project tries to provide an extensive toolkit to probe any NER model/dataset combination, find labeling errors and understand the models' and datasets' limitations, leading the user on her way to further improvements.
+
+[Documentation](../doc/index.html) | [Slides](../presentation.pdf)
+
+## Getting started
+
+```bash
+# Install requirements
+pip install -r requirements.txt  # you'll need Python 3.9+
+
+# Run
+make run
+```
+
+## ExplaiNER's features
+
+![](./screenshot.jpg)
+
+
+Some interesting visualization techniques contained in this project:
+
+* customizable visualization of neural network activation, based on the embedding layer and the feed-forward layers of the selected transformer model (https://aclanthology.org/2021.acl-demo.30/)
+* customizable similarity map of a 2d projection of the model's final layer's hidden states, using various algorithms (a bit like the [Tensorflow Embedding Projector](https://projector.tensorflow.org/))
+* inline HTML representation of samples with token-level prediction + labels (my own; see 'Samples by loss' page for more info)
+
+
+Libraries important to this project:
+
+* `streamlit` for demoing (custom multi-page feature hacked in, also using session state)
+* `plotly` and `matplotlib` for charting
+* `transformers` for providing the models, and `datasets` for, well, the datasets
+* a forked, slightly modified version of [`ecco`](https://github.com/jalammar/ecco) for visualizing the neural net activations
+* `sentence_transformers` for finding potential duplicates
+* `scikit-learn` for TruncatedSVD & PCA, `umap-learn` for UMAP
+
+
+## Application Sections
+
+
+### Activations
+
+A group of neurons tend to fire in response to commas and other punctuation. Other groups of neurons tend to fire in response to pronouns. Use this visualization to factorize neuron activity in individual FFNN layers or in the entire model.
+
+
+### Hidden States
+
+For every token in the dataset, we take its hidden state and project it onto a two-dimensional plane. Data points are colored by label/prediction, with mislabeled examples marked by a small black border.
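To make the projection step concrete, here is a minimal editorial sketch (not part of this commit): it projects per-token hidden states to 2D with `umap-learn` and colors points by label with `plotly`, both of which are listed as project dependencies. The placeholder arrays and label set are illustrative.

```python
# Hypothetical sketch: project per-token hidden states to 2D and color by label.
import numpy as np
import plotly.express as px
from umap import UMAP

# Placeholders standing in for the real per-token final-layer states and gold labels.
hidden_states = np.random.rand(500, 768)
labels = np.random.choice(["O", "B-PER", "I-PER", "B-LOC"], size=500)

projection = UMAP(n_components=2).fit_transform(hidden_states)
fig = px.scatter(
    x=projection[:, 0],
    y=projection[:, 1],
    color=labels,
    title="2D projection of final-layer hidden states",
)
fig.show()
```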
+
+
+### Probing
+
+A very direct and interactive way to test your model is by providing it with a list of text inputs and then inspecting the model outputs. The application features a multiline text field so the user can input multiple texts separated by newlines. For each text, the app will show a data frame containing the tokenized string, token predictions, probabilities and a visual indicator for low probability predictions -- these are the ones you should inspect first for prediction errors.
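A rough sketch of that probing loop, assuming a standard `transformers` token-classification pipeline (the model name and column names are examples, not the app's defaults):

```python
# Hypothetical sketch: run newline-separated texts through a NER pipeline and
# tabulate tokens, predicted entities and probabilities.
import pandas as pd
from transformers import pipeline

ner = pipeline("token-classification", model="dslim/bert-base-NER")  # example model

user_input = "Angela Merkel visited Paris.\nApple opened a new store in Berlin."
for text in user_input.splitlines():
    preds = ner(text)
    df = pd.DataFrame(
        {
            "token": [p["word"] for p in preds],
            "prediction": [p["entity"] for p in preds],
            "probability": [round(float(p["score"]), 3) for p in preds],
        }
    )
    print(df)  # low-probability rows are the ones to inspect first
```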
+
+
+### Metrics
+
+The metrics page contains precision, recall and f-score metrics as well as a confusion matrix over all the classes. By default, the confusion matrix is normalized. There's an option to zero out the diagonal, leaving only prediction errors (here it makes sense to turn off normalization, so you get raw error counts).
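For illustration, a small sketch of a normalized confusion matrix with the diagonal zeroed out (the label set and tag sequences are placeholders, not the app's code):

```python
# Hypothetical sketch: normalized confusion matrix, diagonal zeroed to keep only errors.
import numpy as np
from sklearn.metrics import confusion_matrix

labels = ["O", "B-PER", "B-LOC"]
y_true = ["O", "B-PER", "B-LOC", "O", "B-PER", "B-LOC"]
y_pred = ["O", "B-PER", "O", "O", "B-LOC", "B-LOC"]

cm = confusion_matrix(y_true, y_pred, labels=labels, normalize="true")
np.fill_diagonal(cm, 0)  # keep only prediction errors
print(cm)
```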
+
+
+### Misclassified
+
+This page contains all misclassified examples and allows filtering by specific error types.
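The filtering itself could look something like this sketch; the `label`/`pred` column names and the error type are illustrative, not necessarily the app's own:

```python
# Hypothetical sketch: keep misclassified tokens and filter one specific error type.
import pandas as pd

df = pd.DataFrame(
    {
        "token": ["Paris", "Merkel", "Apple"],
        "label": ["B-LOC", "B-PER", "B-ORG"],
        "pred": ["B-LOC", "O", "B-LOC"],
    }
)
misclassified = df[df["label"] != df["pred"]]
per_missed_as_o = misclassified[
    (misclassified["label"] == "B-PER") & (misclassified["pred"] == "O")
]
print(per_missed_as_o)
```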
+
+
+### Loss by Token/Label
+
+Show count, mean and median loss per token and label.
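A minimal sketch of that aggregation, assuming a per-token DataFrame with `token`, `label` and `loss` columns (names are illustrative):

```python
# Hypothetical sketch: count, mean and median loss per token and per label.
import pandas as pd

df = pd.DataFrame(
    {
        "token": [",", ",", "Paris", "Merkel"],
        "label": ["O", "O", "B-LOC", "B-PER"],
        "loss": [0.01, 0.02, 1.30, 0.85],
    }
)
agg = ["count", "mean", "median"]
print(df.groupby("token")["loss"].agg(agg).sort_values("mean", ascending=False))
print(df.groupby("label")["loss"].agg(agg).sort_values("mean", ascending=False))
```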
+
+
+### Samples by Loss
+
+Show every example sorted by loss (descending) for close inspection.
+
+
+### Random Samples
+
+Show random samples. Simple method, but it often turns up interesting things.
+
+
+### Find Duplicates
+
+Find potential duplicates in the data using cosine similarity.
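Conceptually this amounts to pairwise cosine similarity over sentence embeddings; a hedged sketch using `sentence_transformers` (model name and threshold are examples, not the app's defaults):

```python
# Hypothetical sketch: flag sentence pairs whose embedding similarity exceeds a threshold.
from sentence_transformers import SentenceTransformer
from sentence_transformers.util import cos_sim

sentences = [
    "EU rejects German call to boycott British lamb.",
    "EU rejects German call to boycott British lamb .",
    "Peter Blackburn",
]
model = SentenceTransformer("all-MiniLM-L6-v2")  # example model
embeddings = model.encode(sentences, convert_to_tensor=True)
similarities = cos_sim(embeddings, embeddings)

threshold = 0.95
for i in range(len(sentences)):
    for j in range(i + 1, len(sentences)):
        score = float(similarities[i][j])
        if score > threshold:
            print(f"{score:.3f}  {sentences[i]!r}  ~  {sentences[j]!r}")
```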
+
+
+### Inspect
+
+Inspect your whole dataset, either unfiltered or by id.
+
+
+### Raw data
+
+See the data as seen by your model.
+
+
+### Debug
+
+Debug info.
html/screenshot.jpg
ADDED
presentation.pdf
ADDED
Binary file (693 kB)
requirements.txt
CHANGED
@@ -12,4 +12,5 @@ matplotlib
 seqeval
 streamlit-aggrid
 streamlit_option_menu
+pdoc
 git+https://github.com/aseifert/ecco.git@streamlit
src/app.py
CHANGED
@@ -1,3 +1,9 @@
+"""The App module is the main entry point for the application.
+
+Run `streamlit run app.py` to start the app.
+
+"""
+
 import pandas as pd
 import streamlit as st
 from streamlit_option_menu import option_menu
@@ -69,6 +75,7 @@ def _write_color_legend(context):
 
 
 def main():
+    """The main entry point for the application."""
     pages: list[Page] = [
         HomePage(),
         AttentionPage(),
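The new module docstring points at the hand-rolled multi-page setup mentioned in the README ("custom multi-page feature hacked in"); roughly, that pattern looks like the following sketch built on `streamlit_option_menu`. Page names and structure are illustrative, not the app's actual `Page` classes.

```python
# Hypothetical sketch of an option_menu-based multi-page layout in Streamlit.
import streamlit as st
from streamlit_option_menu import option_menu

PAGES = {
    "Home": lambda: st.write("Welcome"),
    "Metrics": lambda: st.write("Precision/recall/f-score and confusion matrix"),
    "Probing": lambda: st.write("Free-text probing"),
}

with st.sidebar:
    selected = option_menu("ExplaiNER", list(PAGES.keys()), default_index=0)
PAGES[selected]()  # render the selected page
```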
src/data.py
CHANGED
@@ -12,8 +12,9 @@ from src.utils import device, tokenizer_hash_funcs
 
 @st.cache(allow_output_mutation=True)
 def get_data(ds_name: str, config_name: str, split_name: str, split_sample_size: int) -> Dataset:
-    """Loads
-
+    """Loads a Dataset from the HuggingFace hub (if not already loaded).
+
+    Uses `datasets.load_dataset` to load the dataset (see its documentation for additional details).
 
     Args:
         ds_name (str): Path or name of the dataset.
@@ -34,7 +35,7 @@ def get_data(ds_name: str, config_name: str, split_name: str, split_sample_size:
     hash_funcs=tokenizer_hash_funcs,
 )
 def get_collator(tokenizer) -> DataCollatorForTokenClassification:
-    """
+    """Returns a DataCollator that will dynamically pad the inputs received, as well as the labels.
 
     Args:
         tokenizer ([PreTrainedTokenizer] or [PreTrainedTokenizerFast]): The tokenizer used for encoding the data.
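As the new `get_data` docstring says, the heavy lifting is done by `datasets.load_dataset`; a minimal sketch of loading and sampling a split (the dataset name and sample size are examples only):

```python
# Hypothetical sketch: load a split from the HuggingFace hub and truncate it to a sample.
from datasets import load_dataset

split_sample_size = 100
split = load_dataset("conll2003", split="validation")
split = split.select(range(min(split_sample_size, len(split))))
print(split)
```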
src/subpages/attention.py
CHANGED
@@ -1,3 +1,6 @@
+"""
+A group of neurons tend to fire in response to commas and other punctuation. Other groups of neurons tend to fire in response to pronouns. Use this visualization to factorize neuron activity in individual FFNN layers or in the entire model.
+"""
 import ecco
 import streamlit as st
 from streamlit.components.v1 import html
@@ -151,7 +154,7 @@ class AttentionPage(Page):
         )
         with col2:
             st.subheader("–")
-            text = st.text_area("Text", key="act_default_text")
+            text = st.text_area("Text", key="act_default_text", height=240)
 
             inputs = lm.tokenizer([text], return_tensors="pt")
             output = lm(inputs)
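The new module docstring describes factorizing neuron activity; with the upstream `ecco` API this is roughly the sketch below. The forked version pinned in requirements.txt may differ in details, and the model name is an example.

```python
# Hypothetical sketch: capture FFNN activations with ecco and factorize them with NMF.
import ecco

lm = ecco.from_pretrained("bert-base-cased", activations=True)  # example model
inputs = lm.tokenizer(["She said that, however, it was raining."], return_tensors="pt")
output = lm(inputs)

nmf = output.run_nmf(n_components=8)  # group neurons into a handful of firing patterns
nmf.explore()  # interactive visualization of the factors
```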
src/subpages/find_duplicates.py
CHANGED
@@ -1,3 +1,4 @@
+"""Find potential duplicates in the data using cosine similarity."""
 import streamlit as st
 from sentence_transformers.util import cos_sim
 
src/subpages/hidden_states.py
CHANGED
@@ -1,3 +1,6 @@
+"""
+For every token in the dataset, we take its hidden state and project it onto a two-dimensional plane. Data points are colored by label/prediction, with mislabeled examples marked by a small black border.
+"""
 import numpy as np
 import plotly.express as px
 import plotly.graph_objects as go
src/subpages/inspect.py
CHANGED
@@ -1,3 +1,4 @@
+"""Inspect your whole dataset, either unfiltered or by id."""
 import streamlit as st
 
 from src.subpages.page import Context, Page
src/subpages/losses.py
CHANGED
@@ -1,3 +1,4 @@
+"""Show count, mean and median loss per token and label."""
 import streamlit as st
 
 from src.subpages.page import Context, Page
src/subpages/lossy_samples.py
CHANGED
@@ -1,3 +1,4 @@
+"""Show every example sorted by loss (descending) for close inspection."""
 import pandas as pd
 import streamlit as st
 
src/subpages/metrics.py
CHANGED
@@ -1,3 +1,6 @@
+"""
+The metrics page contains precision, recall and f-score metrics as well as a confusion matrix over all the classes. By default, the confusion matrix is normalized. There's an option to zero out the diagonal, leaving only prediction errors (here it makes sense to turn off normalization, so you get raw error counts).
+"""
 import re
 
 import matplotlib.pyplot as plt
src/subpages/misclassified.py
CHANGED
@@ -1,3 +1,4 @@
+"""This page contains all misclassified examples and allows filtering by specific error types."""
 from collections import defaultdict
 
 import pandas as pd
src/subpages/probing.py
CHANGED
@@ -1,3 +1,6 @@
+"""
+A very direct and interactive way to test your model is by providing it with a list of text inputs and then inspecting the model outputs. The application features a multiline text field so the user can input multiple texts separated by newlines. For each text, the app will show a data frame containing the tokenized string, token predictions, probabilities and a visual indicator for low probability predictions -- these are the ones you should inspect first for prediction errors.
+"""
 import streamlit as st
 
 from src.subpages.page import Context, Page
src/subpages/random_samples.py
CHANGED
@@ -1,3 +1,4 @@
+"""Show random samples. Simple method, but it often turns up interesting things."""
 import pandas as pd
 import streamlit as st
 
src/subpages/raw_data.py
CHANGED
@@ -1,3 +1,4 @@
+"""See the data as seen by your model."""
 import pandas as pd
 import streamlit as st
 