kevin510 committed on
Commit 1461f24
1 parent: c68cdaa

Update README.md

Files changed (1): README.md (+1 −7)

README.md CHANGED
@@ -12,7 +12,7 @@ This [Github repository](https://github.com/ConiferLabsWA/flan-ul2-alpaca) conta
 
 ### Resource Considerations
 
-A goal of this project was to produce this model with a limited budget demonstrating the ability train a robust, commercially viable LLM using systems available to even small businesses and individuals. This had the added benefit of personally saving me money as well :). To achieve this a server was rented on [vultr.com](vultr.com) with the following pricing/specs:
+A goal of this project was to produce this model with a limited budget demonstrating the ability train a robust LLM using systems available to even small businesses and individuals. This had the added benefit of personally saving me money as well :). To achieve this a server was rented on [vultr.com](vultr.com) with the following pricing/specs:
 - Pricing: $1.302/hour
 - OS: Ubuntu 22.10 x64
 - 6 vCPUs
@@ -27,12 +27,6 @@ To dramatically reduce memory footprint and compute requirements [Low Rank Adapt
 - 8 Bit Mode: Yes
 
 
-### Why?
-
-Rapid recent advancements in the natural language processing (NLP) space have been extraordinary. Large Language Models (LLMs) like Meta's LLaMA are getting a lot of attention with their remarkable generative abilities however, many people are looking at the implications of these projects and looking for ways to leverage the technology in a commercial setting. Unfortunately, many LLMs (ie LLaMA, Vicuna) are limited by their licensing, restricting opportunities for usage within businesses and products.
-
-To address this issue, the entirely open-source [Flan-UL2 model](https://huggingface.co/google/flan-ul2), built by Google on the [Flan-T5](https://arxiv.org/abs/2210.11416) encoder-decoder framework, is an excellent alternative to LLMs with more restrictive licensing. Flan-UL2 is accessible for commercial applications and fine-tuned on academic NLP tasks, providing exceptional performance in comparison to models of similar size across various benchmarks. Additionally, with a receptive field of 2048 token is suitable for a number of LLM tasks including [Retrieval Augmented Generation (RAG)](https://arxiv.org/abs/2005.11401).
-
 ### Usage
 
 ```