derek-thomas HF staff committed on
Commit
52ba5a5
1 Parent(s): 87e30f1

Adding more clarification

  1. notebooks/03_get_embeddings.ipynb +19 -0
notebooks/03_get_embeddings.ipynb CHANGED
@@ -8,6 +8,24 @@
8
  "# Pre-requisites"
9
  ]
10
  },
11
  {
12
  "cell_type": "markdown",
13
  "id": "3102abce-ea42-4da6-8c98-c6dd4edf7f0b",
@@ -297,6 +315,7 @@
297
  }
298
  ],
299
  "source": [
 
300
  "# Create a list of async tasks\n",
301
  "tasks = [main(documents[i:i+MAX_WORKERS]) for i in range(0, len(documents), MAX_WORKERS)]\n",
302
  "\n",
 
8
  "# Pre-requisites"
9
  ]
10
  },
11
+ {
12
+ "cell_type": "markdown",
13
+ "id": "5f625807-0707-4e2f-a0e0-8fbcdf08c865",
14
+ "metadata": {},
15
+ "source": [
16
+ "## Why TEI\n",
17
+ "There are 2 **unsung** challenges with RAG at scale:\n",
18
+ "1. Getting the embeddings efficiently\n",
19
+ "2. Efficient ingestion into the vector DB\n",
20
+ "\n",
21
+ "The issue with `1.` is that techniques for it exist, but they are not widely *applied*. TEI handles several of them for you:\n",
22
+ "- Token Based Dynamic Batching\n",
23
+ "- Using the latest optimizations (Flash Attention, Candle and cuBLASLt)\n",
24
+ "- Fast loading with safetensors\n",
25
+ "\n",
26
+ "The issue with `2.` is that it takes a bit of planning. We won't go much into that side of things here though."
27
+ ]
28
+ },
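To make the "Token Based Dynamic Batching" item in the added cell more concrete, here is a minimal sketch of the general idea: batches are closed when a token budget would be exceeded, rather than after a fixed number of documents. This is not TEI's implementation. The helper `batch_by_token_budget`, the `max_batch_tokens` parameter, and the whitespace-based token count are all assumptions made for illustration.

```python
from typing import Callable, Iterable, Iterator, List


def batch_by_token_budget(
    texts: Iterable[str],
    max_batch_tokens: int,
    count_tokens: Callable[[str], int] = lambda t: len(t.split()),
) -> Iterator[List[str]]:
    """Yield batches whose total token count stays within max_batch_tokens.

    Hypothetical helper for illustration only. Whitespace splitting stands in
    for a real tokenizer. A single text larger than the budget is emitted as
    its own batch instead of being dropped or truncated.
    """
    batch: List[str] = []
    batch_tokens = 0
    for text in texts:
        n = count_tokens(text)
        # Close the current batch if adding this text would exceed the budget.
        if batch and batch_tokens + n > max_batch_tokens:
            yield batch
            batch, batch_tokens = [], 0
        batch.append(text)
        batch_tokens += n
    # Emit whatever is left over after the loop.
    if batch:
        yield batch


docs = ["a b c", "d e", "f g h i", "j"]
batches = list(batch_by_token_budget(docs, max_batch_tokens=5))
print(batches)
```

With a token budget, many short texts share a batch while long texts get smaller batches, so the work per batch stays roughly constant.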
29
  {
30
  "cell_type": "markdown",
31
  "id": "3102abce-ea42-4da6-8c98-c6dd4edf7f0b",
 
315
  }
316
  ],
317
  "source": [
318
+ "%%time\n",
319
  "# Create a list of async tasks\n",
320
  "tasks = [main(documents[i:i+MAX_WORKERS]) for i in range(0, len(documents), MAX_WORKERS)]\n",
321
  "\n",
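The `tasks = [...]` line in this cell slices `documents` into chunks of `MAX_WORKERS` items and creates one coroutine per chunk. The sketch below shows that pattern in isolation, under stated assumptions: `embed_batch` is a hypothetical stand-in for the notebook's `main` (which is not shown in this diff), it returns text lengths instead of real embeddings, and the value of `MAX_WORKERS` is made up.

```python
import asyncio
from typing import List

MAX_WORKERS = 4  # assumed value; the notebook defines its own


async def embed_batch(batch: List[str]) -> List[int]:
    """Hypothetical stand-in for the notebook's `main`.

    Returns the length of each text instead of a real embedding.
    """
    await asyncio.sleep(0)  # placeholder for the network call
    return [len(text) for text in batch]


async def embed_all(documents: List[str]) -> List[int]:
    # Same slicing as in the diff: one coroutine per MAX_WORKERS-sized chunk.
    tasks = [
        embed_batch(documents[i:i + MAX_WORKERS])
        for i in range(0, len(documents), MAX_WORKERS)
    ]
    # gather returns results in task order, so they line up with the input.
    nested = await asyncio.gather(*tasks)
    return [item for batch in nested for item in batch]


documents = [f"doc {i}" for i in range(10)]
results = asyncio.run(embed_all(documents))
print(len(results))
```

Inside a notebook, where an event loop is already running, you would write `results = await embed_all(documents)` instead of calling `asyncio.run`.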