|
### Code to consider including: |
|
[flan-alpaca](https://github.com/declare-lab/flan-alpaca)<br /> |
|
[text-generation-webui](https://github.com/oobabooga/text-generation-webui)<br /> |
|
[minimal-llama](https://github.com/zphang/minimal-llama/)<br /> |
|
[finetune GPT-NeoX](https://nn.labml.ai/neox/samples/finetune.html)<br /> |
|
[GPTQ-for-LLaMa](https://github.com/qwopqwop200/GPTQ-for-LLaMa/compare/cuda...Digitous:GPTQ-for-GPT-NeoX:main)<br />
|
[OpenChatKit on multi-GPU](https://github.com/togethercomputer/OpenChatKit/issues/20)<br /> |
|
[Non-Causal LLM](https://huggingface.co/docs/transformers/main/en/model_doc/gptj#transformers.GPTJForSequenceClassification)<br /> |
|
[OpenChatKit_Offload](https://github.com/togethercomputer/OpenChatKit/commit/148b5745a57a6059231178c41859ecb09164c157)<br /> |
|
[Flan-alpaca](https://github.com/declare-lab/flan-alpaca/blob/main/training.py)<br /> |
|
|
|
### Some open source models: |
|
[GPT-NeoXT-Chat-Base-20B](https://huggingface.co/togethercomputer/GPT-NeoXT-Chat-Base-20B/tree/main)<br /> |
|
[GPT-NeoX](https://huggingface.co/docs/transformers/model_doc/gpt_neox)<br /> |
|
[GPT-NeoX-20B](https://huggingface.co/EleutherAI/gpt-neox-20b)<br /> |
|
[Pythia-6.9B](https://huggingface.co/EleutherAI/pythia-6.9b)<br /> |
|
[Pythia-12B](https://huggingface.co/EleutherAI/neox-ckpt-pythia-12b)<br /> |
|
[Flan-T5-XXL](https://huggingface.co/google/flan-t5-xxl)<br /> |
|
[GPT-JT-Moderation-6B](https://huggingface.co/togethercomputer/GPT-JT-Moderation-6B)<br />
|
[OIG safety models](https://laion.ai/blog/oig-dataset/#safety-models)<br /> |
|
[BigScience-mT0](https://huggingface.co/bigscience/mt0-xxl)<br />
|
[BigScience-XP3](https://huggingface.co/datasets/bigscience/xP3)<br /> |
|
[BigScience-Bloomz](https://huggingface.co/bigscience/bloomz)<br /> |
|
|
|
### Some Creative Commons models that would be interesting to use:
|
[Galactica-120B](https://huggingface.co/facebook/galactica-120b)<br /> |
|
[LLaMa-small-pt](https://huggingface.co/decapoda-research/llama-smallint-pt)<br /> |
|
[LLaMa-65b-4bit](https://huggingface.co/maderix/llama-65b-4bit/tree/main)<br />
|
|
|
### Papers/Repos |
|
[Self-improve](https://arxiv.org/abs/2210.11610)<br /> |
|
[Coding](https://arxiv.org/abs/2303.17491)<br /> |
|
[self-reflection](https://arxiv.org/abs/2303.11366)<br /> |
|
[RLHF](https://arxiv.org/abs/2204.05862)<br /> |
|
[DERA](https://arxiv.org/abs/2303.17071)<br /> |
|
[HAI Index Report 2023](https://aiindex.stanford.edu/report/)<br /> |
|
[LLaMa](https://arxiv.org/abs/2302.13971)<br /> |
|
[GLM-130B](https://github.com/THUDM/GLM-130B)<br /> |
|
[RWKV RNN](https://github.com/BlinkDL/RWKV-LM)<br /> |
|
[Toolformer](https://arxiv.org/abs/2302.04761)<br /> |
|
[GPTQ](https://github.com/qwopqwop200/GPTQ-for-LLaMa)<br /> |
|
[Retro](https://www.deepmind.com/publications/improving-language-models-by-retrieving-from-trillions-of-tokens)<br /> |
|
[Clinical_outperforms](https://arxiv.org/abs/2302.08091)<br /> |
|
[Chain-Of-Thought](https://github.com/amazon-science/mm-cot)<br /> |
|
[scaling law1](https://arxiv.org/abs/2203.15556)<br /> |
|
[Big-bench](https://github.com/google/BIG-bench)<br /> |
|
[Natural-Instructions](https://github.com/allenai/natural-instructions)<br /> |
|
|
|
### Other projects: |
|
[StackLLaMa](https://huggingface.co/blog/stackllama)<br /> |
|
[Alpaca-CoT](https://github.com/PhoebusSi/alpaca-CoT)<br /> |
|
[ColossalAIChat](https://github.com/hpcaitech/ColossalAI/tree/main/applications/Chat)<br /> |
|
[EasyLM](https://github.com/young-geng/EasyLM.git)<br /> |
|
[Koala](https://bair.berkeley.edu/blog/2023/04/03/koala/)<br /> |
|
[Vicuna](https://vicuna.lmsys.org/)<br /> |
|
[Flan-Alpaca](https://github.com/declare-lab/flan-alpaca)<br /> |
|
[FastChat](https://chat.lmsys.org/)<br /> |
|
[alpaca-lora](https://github.com/h2oai/alpaca-lora)<br /> |
|
[alpaca.http](https://github.com/Nuked88/alpaca.http)<br /> |
|
[chatgpt-retrieval-plugin](https://github.com/openai/chatgpt-retrieval-plugin)<br />
|
[subtl.ai docs search on private docs](https://www.subtl.ai/)<br /> |
|
[Gretel](https://gretel.ai/)<br />
|
[alpaca_lora_4bit](https://github.com/johnsmith0031/alpaca_lora_4bit)<br /> |
|
[alpaca_lora_4bit_readme](https://github.com/s4rduk4r/alpaca_lora_4bit_readme)<br /> |
|
[code alpaca](https://github.com/sahil280114/codealpaca)<br /> |
|
[serge](https://github.com/nsarrazin/serge)<br /> |
|
[BlinkDL](https://huggingface.co/spaces/BlinkDL/ChatRWKV-gradio)<br /> |
|
[RWKV-LM](https://github.com/BlinkDL/RWKV-LM)<br /> |
|
[MosaicML](https://github.com/mosaicml/examples#large-language-models-llms)<br />
|
[OpenAI Plugins](https://openai.com/blog/chatgpt-plugins)<br /> |
|
[GPT3.5-Turbo-PGVector](https://github.com/gannonh/gpt3.5-turbo-pgvector)<br /> |
|
[LLaMa-Adapter](https://github.com/ZrrSkywalker/LLaMA-Adapter)<br /> |
|
[llama-index](https://github.com/jerryjliu/llama_index)<br /> |
|
[minimal-llama](https://github.com/zphang/minimal-llama/)<br /> |
|
[llama.cpp](https://github.com/ggerganov/llama.cpp)<br /> |
|
[ggml](https://github.com/ggerganov/ggml)<br /> |
|
[mmap](https://justine.lol/mmap/)<br /> |
|
[llama.cpp more](https://til.simonwillison.net/llms/llama-7b-m2)<br />
|
[TargetedSummarization](https://github.com/helliun/targetedSummarization)<br /> |
|
[OpenFlamingo](https://laion.ai/blog/open-flamingo/)<br /> |
|
[Auto-GPT](https://github.com/Torantulino/Auto-GPT)<br /> |
|
|
|
### Apache2/etc. Data |
|
[OIG 43M instructions](https://laion.ai/blog/oig-dataset/) [direct HF link](https://huggingface.co/datasets/laion/OIG)<br /> |
|
[More on OIG](https://laion.ai/blog/oig-dataset/)<br /> |
|
[DataSet Viewer](https://huggingface.co/datasets/viewer/?dataset=squad)<br /> |
|
[Anthropic RLHF](https://huggingface.co/datasets/Anthropic/hh-rlhf)<br /> |
|
[WebGPT_Comparisons](https://huggingface.co/datasets/openai/webgpt_comparisons)<br /> |
|
[Self_instruct](https://github.com/yizhongw/self-instruct)<br /> |
|
[20BChatModelData](https://github.com/togethercomputer/OpenDataHub)<br /> |
|
|
|
### Apache2/MIT/BSD-3 Summarization Data |
|
[xsum for Summarization](https://huggingface.co/datasets/xsum)<br /> |
|
[Apache2 Summarization](https://huggingface.co/datasets?task_categories=task_categories:summarization&license=license:apache-2.0&sort=downloads)<br /> |
|
[MIT summarization](https://huggingface.co/datasets?task_categories=task_categories:summarization&license=license:mit&sort=downloads)<br /> |
|
[BSD-3 summarization](https://huggingface.co/datasets?task_categories=task_categories:summarization&license=license:bsd-3-clause&sort=downloads)<br /> |
|
[OpenRail](https://huggingface.co/datasets?task_categories=task_categories:summarization&license=license:openrail&sort=downloads)<br /> |
|
[Summarize_from_feedback](https://huggingface.co/datasets/openai/summarize_from_feedback)<br /> |
|
|
|
### Ambiguous License Data |
|
[GPT-4-LLM](https://github.com/Instruction-Tuning-with-GPT-4/GPT-4-LLM)<br /> |
|
[GPT4All](https://huggingface.co/datasets/nomic-ai/gpt4all_prompt_generations)<br /> |
|
[LinkGPT4](https://github.com/lm-sys/FastChat/issues/90#issuecomment-1493250773)<br /> |
|
[ShareGPT52K](https://huggingface.co/datasets/RyokoAI/ShareGPT52K)<br /> |
|
[ShareGPT_Vicuna](https://huggingface.co/datasets/anon8231489123/ShareGPT_Vicuna_unfiltered)<br /> |
|
[ChatLogs](https://chatlogs.net/)<br /> |
|
[Alpaca-CoT](https://github.com/PhoebusSi/alpaca-CoT)<br /> |
|
[LaMini-LM](https://github.com/mbzuai-nlp/LaMini-LM)<br /> |
|
|
|
### Non-commercial Data |
|
[GPT-3 based Alpaca Cleaned](https://github.com/gururise/AlpacaDataCleaned)<br /> |
|
|
|
### Prompt Engineering
|
[Prompt/P-tuning](https://github.com/huggingface/peft)<br /> |
|
[Prompt/P-tuning NeMo/NVIDIA](https://docs.nvidia.com/deeplearning/nemo/user-guide/docs/en/main/nlp/nemo_megatron/prompt_learning.html)<br />
|
[Info](https://lilianweng.github.io/posts/2023-03-15-prompt-engineering/)<br /> |
|
[Info2](https://github.com/dair-ai/Prompt-Engineering-Guide)<br /> |
|
[Prompt-Tuning](https://arxiv.org/abs/2104.08691)<br /> |
|
[P-tuning v2](https://arxiv.org/abs/2110.07602)<br /> |
|
[babyagi](https://github.com/yoheinakajima/babyagi/blob/main/babyagi.py#L97-L134)<br /> |
|
[APE](https://www.promptingguide.ai/techniques/ape)<br /> |
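The guides above cover techniques like few-shot prompting; as a rough illustration (the function name and format are hypothetical, not taken from any linked library), assembling a few-shot prompt from worked examples can be sketched as:

```python
def build_few_shot_prompt(instruction, examples, query):
    """Assemble a simple few-shot prompt: an instruction, a list of
    (input, output) worked examples, then the new query left open
    for the model to complete."""
    parts = [instruction.strip(), ""]
    for inp, out in examples:
        parts.append(f"Input: {inp}\nOutput: {out}\n")
    # the final line ends at "Output:" so the model fills in the answer
    parts.append(f"Input: {query}\nOutput:")
    return "\n".join(parts)
```

Real-world templates vary by model (chat markup, system prompts), but the structure — instruction, demonstrations, open query — is the common core.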
|
|
|
### Validation |
|
[Bleu/Rouge/Meteor/Bert-Score](https://arize.com/blog-course/generative-ai-metrics-bleu-score/)<br /> |
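The metrics surveyed above mostly reduce to n-gram overlap between a candidate and a reference. As a minimal sketch (stdlib only; real evaluation should use an established metrics library), BLEU-style clipped n-gram precision looks like:

```python
from collections import Counter

def ngram_precision(candidate, reference, n=2):
    """Fraction of candidate n-grams that also appear in the reference,
    with counts clipped as in BLEU's modified precision."""
    cand_tokens = candidate.split()
    ref_tokens = reference.split()
    cand_ngrams = Counter(tuple(cand_tokens[i:i + n])
                          for i in range(len(cand_tokens) - n + 1))
    ref_ngrams = Counter(tuple(ref_tokens[i:i + n])
                         for i in range(len(ref_tokens) - n + 1))
    # clip each candidate n-gram count by its count in the reference
    overlap = sum(min(count, ref_ngrams[ng])
                  for ng, count in cand_ngrams.items())
    total = sum(cand_ngrams.values())
    return overlap / total if total else 0.0
```

Full BLEU combines several n-gram orders with a brevity penalty; ROUGE flips the direction (recall against the reference), and BERTScore replaces exact n-gram match with embedding similarity.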
|
|
|
### Generate Hyperparameters |
|
[how-to-generate](https://huggingface.co/blog/how-to-generate)<br />
|
[Notes_on_Transformers Chpt5](https://christianjmills.com/posts/transformers-book-notes/chapter-5/index.html)<br /> |
|
[Notes_on_Transformers_Chpt10](https://christianjmills.com/posts/transformers-book-notes/chapter-10/index.html)<br /> |
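The posts above explain sampling hyperparameters such as temperature and top-k. As a rough stdlib-only illustration of what those knobs do (the function is hypothetical, not an API from any linked library), one decoding step can be sketched as:

```python
import math
import random

def sample_next_token(logits, temperature=0.8, top_k=3, rng=None):
    """Pick a token index from raw logits using temperature scaling and
    top-k filtering, mirroring common `generate()` hyperparameters."""
    rng = rng or random.Random(0)
    # keep only the top_k highest-scoring token indices
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:top_k]
    # temperature-scaled softmax over the surviving logits
    scaled = [logits[i] / temperature for i in top]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    # sample an index in proportion to its probability
    r = rng.random()
    acc = 0.0
    for idx, p in zip(top, probs):
        acc += p
        if r <= acc:
            return idx
    return top[-1]
```

Lower temperature sharpens the distribution toward the argmax; `top_k=1` reduces to greedy decoding, and top-p (nucleus) sampling replaces the fixed-k cutoff with a cumulative-probability one.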
|
|
|
### Embeddings |
|
[OpenAI Expensive?](https://medium.com/@nils_reimers/openai-gpt-3-text-embeddings-really-a-new-state-of-the-art-in-dense-text-embeddings-6571fe3ec9d9)<br /> |
|
[Leaderboard](https://huggingface.co/spaces/mteb/leaderboard)<br /> |
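The comparison above comes down to how well an embedding model's vectors rank relevant text; retrieval over those vectors is usually just cosine similarity. A minimal sketch (stdlib only, assuming pre-computed embedding vectors):

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two embedding vectors:
    dot product divided by the product of their norms."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)
```

In practice the vectors come from an embedding model and live in a vector store; the leaderboard linked above ranks models by how useful those vectors are across retrieval and classification tasks.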
|
|
|
### Commercial products |
|
[OpenAI](https://platform.openai.com/docs/guides/fine-tuning/advanced-usage)<br /> |
|
[OpenAI Tokenizer](https://platform.openai.com/tokenizer)<br /> |
|
[OpenAI Playground](https://platform.openai.com/playground)<br /> |
|
[OpenAI Chat](https://chat.openai.com/chat?)<br /> |
|
[OpenAI GPT-4 Chat](https://chat.openai.com/chat?model=gpt-4)<br /> |
|
[cohere](https://cohere.io/)<br /> |
|
[coherefinetune](https://docs.cohere.ai/reference/finetune)<br /> |
|
[DocsBotAI](https://docsbot.ai/)<br /> |
|
[Perplexity](https://www.perplexity.ai/)<br /> |
|
[VoiceFlow](https://www.voiceflow.com/)<br /> |
|
[NLPCloud](https://nlpcloud.com/effectively-using-gpt-j-gpt-neo-gpt-3-alternatives-few-shot-learning.html)<br /> |
|
|
|
### Multinode inference |
|
[FasterTransformer](https://github.com/triton-inference-server/fastertransformer_backend#multi-node-inference)<br /> |
|
[Kubernetes Triton](https://developer.nvidia.com/blog/deploying-nvidia-triton-at-scale-with-mig-and-kubernetes/)<br /> |
|
|
|
### Faster inference |
|
[text-generation-inference](https://github.com/huggingface/text-generation-inference)<br /> |
|
[Optimum](https://github.com/huggingface/optimum)<br /> |
|
|
|
### Semi-Open source Semi-Commercial products |
|
[OpenAssistant](https://open-assistant.io/)<br /> |
|
[OpenAssistant Repo](https://github.com/LAION-AI/Open-Assistant)<br /> |
|
[OpenChatKit](https://github.com/togethercomputer/OpenChatKit)<br /> |
|
[OpenChatKit2](https://github.com/togethercomputer/OpenDataHub)<br /> |
|
[OpenChatKit3](https://www.together.xyz/blog/openchatkit)<br /> |
|
[OpenChatKit4](https://github.com/togethercomputer/OpenChatKit/blob/main/training/README.md#arguments)<br /> |
|
[OpenChatKitPreview](https://api.together.xyz/open-chat?preview=1)<br /> |
|
[langchain](https://python.langchain.com/en/latest/)<br /> |
|
[langchain+pinecone](https://www.youtube.com/watch?v=nMniwlGyX-c)<br /> |
|
|
|
### Q/A docs |
|
[HUMATA](https://www.humata.ai/)<br /> |
|
[OSSCHat](https://osschat.io/)<br /> |
|
[NeuralSearchCohere](https://txt.cohere.com/embedding-archives-wikipedia/)<br /> |
|
[ue5](https://github.com/bublint/ue5-llama-lora)<br /> |
|
|
|
### AutoGPT type projects |
|
[AgentGPT](https://github.com/reworkd/AgentGPT)<br /> |
|
[Self-DEBUG](https://arxiv.org/abs/2304.05128)<br /> |
|
[BabyAGI](https://github.com/yoheinakajima/babyagi/)<br /> |
|
[AutoPR](https://github.com/irgolic/AutoPR)<br /> |
|
|
|
### Cloud fine-tune |
|
[AWS](https://docs.aws.amazon.com/sagemaker/latest/dg/jumpstart-fine-tune.html)<br /> |
|
[AWS2](https://aws.amazon.com/blogs/machine-learning/training-large-language-models-on-amazon-sagemaker-best-practices/)<br /> |
|
|
|
### Chatbots: |
|
[GPT4ALL Chat](https://github.com/nomic-ai/gpt4all-chat)<br /> |
|
[GPT4All](https://github.com/nomic-ai/gpt4all)<br />
|
[OASST](https://open-assistant.io/chat)<br />
|
[FastChat](https://github.com/lm-sys/FastChat)<br /> |
|
[Dolly](https://huggingface.co/spaces/HuggingFaceH4/databricks-dolly)<br /> |
|
[HF Instructions](https://huggingface.co/spaces/HuggingFaceH4/instruction-model-outputs-filtered)<br /> |
|
[DeepSpeed Chat](https://github.com/microsoft/DeepSpeedExamples/tree/master/applications/DeepSpeed-Chat)<br /> |
|
[LoraChat](https://github.com/bupticybee/FastLoRAChat)<br /> |
|
[Tabby](https://github.com/TabbyML/tabby)<br /> |
|
[TalkToModel](https://github.com/dylan-slack/TalkToModel)<br /> |
|
[You.com](https://you.com/)<br /> |
|
|
|
### LangChain or Agent related |
|
[Gradio Tools](https://github.com/freddyaboulton/gradio-tools)<br /> |
|
[LLM Agents](https://blog.langchain.dev/gradio-llm-agents/)<br /> |
|
[Meta Prompt](https://github.com/mbchang/meta-prompt)<br /> |
|
[HF Agents](https://huggingface.co/docs/transformers/transformers_agents) |
|
[HF Agents Collab](https://colab.research.google.com/drive/1c7MHD-T1forUPGcC_jlwsIptOzpG3hSj) |
|
[Einstein GPT](https://www.salesforce.com/products/einstein/overview/?d=cta-body-promo-8) |
|
[SMOL-AI](https://github.com/smol-ai/developer) |
|
[Pandas-AI](https://github.com/gventuri/pandas-ai/) |
|
|
|
### Summaries |
|
[LLMs](https://github.com/Mooler0410/LLMsPracticalGuide)<br /> |
|
|
|
### Deployment |
|
[MLC-LLM](https://github.com/mlc-ai/mlc-llm)<br /> |
|
|
|
### Evaluations |
|
[LMSYS (check for latest blog)](https://lmsys.org/blog/2023-05-25-leaderboard/)<br />
|
[LMSYS Chatbot Arena](https://chat.lmsys.org/?arena)<br /> |
|
[LMSYS Add model](https://github.com/lm-sys/FastChat/blob/main/docs/arena.md#how-to-add-a-new-model)<br /> |
|
[NLL](https://blog.gopenai.com/lmflow-benchmark-an-automatic-evaluation-framework-for-open-source-llms-ef5c6f142418)<br /> |
|
[HackAPrompt](https://www.aicrowd.com/challenges/hackaprompt-2023/leaderboards)<br /> |
|
|