charlieoneill committed on
Commit
4c2dac4
1 Parent(s): 74e2e27

Update app.py

Files changed (1)
  1. app.py +13 -8
app.py CHANGED
@@ -298,22 +298,27 @@ def create_interface():
  gr.Markdown("""
  # SAErch: Sparse Autoencoder-enhanced Semantic Search

- Welcome to SAErch, an innovative approach to semantic search using Sparse Autoencoders (SAEs) trained on dense text embeddings.
+ Welcome to SAErch, an innovative approach to semantic search using Sparse Autoencoders (SAEs) trained on dense text embeddings. This tool builds on recent advances in applying SAEs to language models and embeddings.

  ## Key Concepts:

- 1. **Sparse Autoencoders (SAEs)**: Neural networks that learn to reconstruct input data using a sparse set of features, helping to disentangle complex representations.
+ 1. **Sparse Autoencoders (SAEs)**: Neural networks that learn to reconstruct input data using a sparse set of features, helping to disentangle complex representations. SAEs have shown promising results in uncovering interpretable features in language models.

- 2. **Feature Families**: Groups of related SAE features that represent concepts at varying levels of abstraction.
+ 2. **Feature Families**: Groups of related SAE features that represent concepts at varying levels of abstraction, allowing for multi-scale semantic analysis and manipulation.

- 3. **Embedding Interventions**: Technique to modify search queries by manipulating specific semantic features identified by the SAE.
+ 3. **Embedding Interventions**: A technique for modifying search queries by manipulating specific semantic features identified by the SAE, enabling fine-grained control over query semantics.

  ## How It Works:

- 1. SAEs are trained on embeddings from scientific paper abstracts.
- 2. The SAE learns interpretable features that capture various semantic concepts.
- 3. Users can interact with these features to fine-tune search queries.
- 4. The system performs semantic search using the modified embeddings.
+ 1. SAEs are trained on embeddings from scientific paper abstracts, learning interpretable features that capture various semantic concepts.
+ 2. Users can interact with these features to fine-tune search queries.
+ 3. The system performs semantic search using the modified embeddings, allowing for more precise and controllable results.
+
+ ## Key References:
+
+ - [Towards Monosemanticity: Decomposing Language Models With Dictionary Learning](https://transformer-circuits.pub/2023/monosemantic-features) - Anthropic's pioneering work on applying SAEs to language models.
+ - [Prism: Mapping Interpretable Concepts and Features in a Latent Space of Language](https://thesephist.com/posts/prism/#caveats-and-limitations) - An early application of SAEs to embeddings, demonstrating their potential for interpretable concept mapping.
+ - [Scaling and Evaluating Sparse Autoencoders](https://arxiv.org/html/2406.04093v1) - OpenAI's research on scaling SAEs, showcasing the effectiveness of top-k SAEs.

  Explore the "SAErch" tab to try out the semantic search capabilities, or dive into the "Feature Visualisation" tab to examine the learned features in more detail.
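
The concepts the updated text introduces are easier to see in code. Below is a minimal sketch of a top-k sparse autoencoder of the kind the "Key Concepts" section describes, written in PyTorch; the dimensions, class name, and training snippet are illustrative assumptions, not the actual SAErch implementation.

```python
# Illustrative top-k sparse autoencoder; sizes and names are assumptions,
# not taken from app.py.
import torch
import torch.nn as nn


class TopKSAE(nn.Module):
    def __init__(self, d_embed: int = 1536, d_hidden: int = 9216, k: int = 64):
        super().__init__()
        self.k = k
        self.encoder = nn.Linear(d_embed, d_hidden)
        self.decoder = nn.Linear(d_hidden, d_embed)

    def encode(self, x: torch.Tensor) -> torch.Tensor:
        # Keep only the k largest activations per embedding; zero the rest.
        acts = torch.relu(self.encoder(x))
        values, indices = torch.topk(acts, self.k, dim=-1)
        return torch.zeros_like(acts).scatter_(-1, indices, values)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Reconstruct the dense embedding from the sparse feature vector.
        return self.decoder(self.encode(x))


# Training minimises reconstruction error over a corpus of abstract embeddings.
sae = TopKSAE()
batch = torch.randn(32, 1536)  # stand-in for real abstract embeddings
loss = nn.functional.mse_loss(sae(batch), batch)
```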
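
An "embedding intervention" (concept 3 above) then amounts to: encode the query, pin one feature's activation, decode, and renormalise. Again a hedged sketch reusing the `TopKSAE` above; `feature_idx` and `strength` are hypothetical parameters rather than the app's real interface.

```python
import torch
import torch.nn.functional as F


def intervene(sae: TopKSAE, query_emb: torch.Tensor,
              feature_idx: int, strength: float) -> torch.Tensor:
    # Encode the query into sparse feature activations.
    acts = sae.encode(query_emb)
    # Dial one semantic feature up or down by pinning its activation.
    acts[..., feature_idx] = strength
    # Decode back to a dense embedding; keep unit norm for cosine search.
    return F.normalize(sae.decoder(acts), dim=-1)
```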
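
The final step of the "How It Works" pipeline is ordinary embedding search: with unit-normalised corpus embeddings, cosine similarity reduces to a matrix-vector product. A sketch under that assumption, not the app's actual retrieval code:

```python
import torch


def search(query_emb: torch.Tensor, corpus_embs: torch.Tensor,
           top_n: int = 10) -> torch.Tensor:
    # Rows of corpus_embs are assumed unit-normalised, so the dot product
    # is cosine similarity.
    sims = corpus_embs @ query_emb
    return torch.topk(sims, top_n).indices  # indices of best-matching abstracts
```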