### Code to consider including:
[flan-alpaca](https://github.com/declare-lab/flan-alpaca)<br />
[text-generation-webui](https://github.com/oobabooga/text-generation-webui)<br />
[minimal-llama](https://github.com/zphang/minimal-llama/)<br />
[finetune GPT-NeoX](https://nn.labml.ai/neox/samples/finetune.html)<br />
[GPTQ-for-LLaMa](https://github.com/qwopqwop200/GPTQ-for-LLaMa/compare/cuda...Digitous:GPTQ-for-GPT-NeoX:main)<br />
[OpenChatKit on multi-GPU](https://github.com/togethercomputer/OpenChatKit/issues/20)<br />
[Non-Causal LLM](https://huggingface.co/docs/transformers/main/en/model_doc/gptj#transformers.GPTJForSequenceClassification)<br />
[OpenChatKit_Offload](https://github.com/togethercomputer/OpenChatKit/commit/148b5745a57a6059231178c41859ecb09164c157)<br />
[Flan-alpaca](https://github.com/declare-lab/flan-alpaca/blob/main/training.py)<br />
### Some open source models:
[GPT-NeoXT-Chat-Base-20B](https://huggingface.co/togethercomputer/GPT-NeoXT-Chat-Base-20B/tree/main)<br />
[GPT-NeoX](https://huggingface.co/docs/transformers/model_doc/gpt_neox)<br />
[GPT-NeoX-20B](https://huggingface.co/EleutherAI/gpt-neox-20b)<br />
[Pythia-6.9B](https://huggingface.co/EleutherAI/pythia-6.9b)<br />
[Pythia-12B](https://huggingface.co/EleutherAI/neox-ckpt-pythia-12b)<br />
[Flan-T5-XXL](https://huggingface.co/google/flan-t5-xxl)<br />
[GPT-JT-Moderation-6B](https://huggingface.co/togethercomputer/GPT-JT-Moderation-6B)<br />
[OIG safety models](https://laion.ai/blog/oig-dataset/#safety-models)<br />
[BigScience-mT0](https://huggingface.co/mT0)<br />
[BigScience-XP3](https://huggingface.co/datasets/bigscience/xP3)<br />
[BigScience-Bloomz](https://huggingface.co/bigscience/bloomz)<br />
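
A minimal sketch of loading one of the models above with Hugging Face transformers; `google/flan-t5-xxl` (~11B parameters) is used for illustration, `device_map="auto"` needs `accelerate`, and `load_in_8bit` additionally needs `bitsandbytes`:

```python
# Minimal sketch: load Flan-T5-XXL (one of the models above) with transformers.
# Assumes transformers + accelerate are installed; load_in_8bit needs bitsandbytes.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

tokenizer = AutoTokenizer.from_pretrained("google/flan-t5-xxl")
model = AutoModelForSeq2SeqLM.from_pretrained(
    "google/flan-t5-xxl",
    device_map="auto",   # shard layers across available GPUs/CPU via accelerate
    load_in_8bit=True,   # roughly halves memory at some quality cost
)

inputs = tokenizer("Translate to German: How are you?", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```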
### Some Creative Commons models that would be interesting to use:
[Galactica-120B](https://huggingface.co/facebook/galactica-120b)<br />
[LLaMa-small-pt](https://huggingface.co/decapoda-research/llama-smallint-pt)<br />
[LLaMa-65b-4bit](https://huggingface.co/maderix/llama-65b-4bit/tree/main)<br />
### Papers/Repos
[Self-improve](https://arxiv.org/abs/2210.11610)<br />
[Coding](https://arxiv.org/abs/2303.17491)<br />
[self-reflection](https://arxiv.org/abs/2303.11366)<br />
[RLHF](https://arxiv.org/abs/2204.05862)<br />
[DERA](https://arxiv.org/abs/2303.17071)<br />
[HAI Index Report 2023](https://aiindex.stanford.edu/report/)<br />
[LLaMa](https://arxiv.org/abs/2302.13971)<br />
[GLM-130B](https://github.com/THUDM/GLM-130B)<br />
[RWKV RNN](https://github.com/BlinkDL/RWKV-LM)<br />
[Toolformer](https://arxiv.org/abs/2302.04761)<br />
[GPTQ](https://github.com/qwopqwop200/GPTQ-for-LLaMa)<br />
[Retro](https://www.deepmind.com/publications/improving-language-models-by-retrieving-from-trillions-of-tokens)<br />
[Clinical_outperforms](https://arxiv.org/abs/2302.08091)<br />
[Chain-Of-Thought](https://github.com/amazon-science/mm-cot)<br />
[Scaling laws (Chinchilla)](https://arxiv.org/abs/2203.15556)<br />
[Big-bench](https://github.com/google/BIG-bench)<br />
[Natural-Instructions](https://github.com/allenai/natural-instructions)<br />
### Other projects:
[StackLLaMa](https://huggingface.co/blog/stackllama)<br />
[Alpaca-CoT](https://github.com/PhoebusSi/alpaca-CoT)<br />
[ColossalAIChat](https://github.com/hpcaitech/ColossalAI/tree/main/applications/Chat)<br />
[EasyLM](https://github.com/young-geng/EasyLM.git)<br />
[Koala](https://bair.berkeley.edu/blog/2023/04/03/koala/)<br />
[Vicuna](https://vicuna.lmsys.org/)<br />
[Flan-Alpaca](https://github.com/declare-lab/flan-alpaca)<br />
[FastChat](https://chat.lmsys.org/)<br />
[alpaca-lora](https://github.com/h2oai/alpaca-lora)<br />
[alpaca.http](https://github.com/Nuked88/alpaca.http)<br />
[chatgpt-retrieval-plugin](https://github.com/openai/chatgpt-retrieval-plugin)<br />
[subtl.ai docs search on private docs](https://www.subtl.ai/)<br />
[gretel](https://gretel.ai/)<br />
[alpaca_lora_4bit](https://github.com/johnsmith0031/alpaca_lora_4bit)<br />
[alpaca_lora_4bit_readme](https://github.com/s4rduk4r/alpaca_lora_4bit_readme)<br />
[code alpaca](https://github.com/sahil280114/codealpaca)<br />
[serge](https://github.com/nsarrazin/serge)<br />
[BlinkDL](https://huggingface.co/spaces/BlinkDL/ChatRWKV-gradio)<br />
[RWKV-LM](https://github.com/BlinkDL/RWKV-LM)<br />
[MosaicML](https://github.com/mosaicml/examples#large-language-models-llms)<br />
[OpenAI Plugins](https://openai.com/blog/chatgpt-plugins)<br />
[GPT3.5-Turbo-PGVector](https://github.com/gannonh/gpt3.5-turbo-pgvector)<br />
[LLaMa-Adapter](https://github.com/ZrrSkywalker/LLaMA-Adapter)<br />
[llama-index](https://github.com/jerryjliu/llama_index)<br />
[minimal-llama](https://github.com/zphang/minimal-llama/)<br />
[llama.cpp](https://github.com/ggerganov/llama.cpp)<br />
[ggml](https://github.com/ggerganov/ggml)<br />
[mmap](https://justine.lol/mmap/)<br />
[llama.cpp more](https://til.simonwillison.net/llms/llama-7b-m2)<br />
[TargetedSummarization](https://github.com/helliun/targetedSummarization)<br />
[OpenFlamingo](https://laion.ai/blog/open-flamingo/)<br />
[Auto-GPT](https://github.com/Torantulino/Auto-GPT)<br />
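
Several of the projects above (alpaca-lora, alpaca_lora_4bit, LLaMa-Adapter) revolve around low-rank adapter fine-tuning. A minimal sketch with Hugging Face `peft`; the base model and hyperparameters are illustrative assumptions, not values taken from those repos:

```python
# Minimal LoRA sketch with peft: freeze the base model, train small adapters.
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model, TaskType

base = AutoModelForCausalLM.from_pretrained("EleutherAI/pythia-6.9b")
config = LoraConfig(
    task_type=TaskType.CAUSAL_LM,
    r=8,                 # low-rank dimension
    lora_alpha=16,       # scaling factor for the adapter updates
    lora_dropout=0.05,
    target_modules=["query_key_value"],  # attention projections in GPT-NeoX-style models
)
model = get_peft_model(base, config)
model.print_trainable_parameters()  # typically <1% of the base model's parameters
```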
### Apache2/etc. Data
[OIG 43M instructions](https://laion.ai/blog/oig-dataset/) [direct HF link](https://huggingface.co/datasets/laion/OIG)<br />
[More on OIG](https://laion.ai/blog/oig-dataset/)<br />
[DataSet Viewer](https://huggingface.co/datasets/viewer/?dataset=squad)<br />
[Anthropic RLHF](https://huggingface.co/datasets/Anthropic/hh-rlhf)<br />
[WebGPT_Comparisons](https://huggingface.co/datasets/openai/webgpt_comparisons)<br />
[Self_instruct](https://github.com/yizhongw/self-instruct)<br />
[20BChatModelData](https://github.com/togethercomputer/OpenDataHub)<br />
### Apache2/MIT/BSD-3 Summarization Data
[xsum for Summarization](https://huggingface.co/datasets/xsum)<br />
[Apache2 Summarization](https://huggingface.co/datasets?task_categories=task_categories:summarization&license=license:apache-2.0&sort=downloads)<br />
[MIT summarization](https://huggingface.co/datasets?task_categories=task_categories:summarization&license=license:mit&sort=downloads)<br />
[BSD-3 summarization](https://huggingface.co/datasets?task_categories=task_categories:summarization&license=license:bsd-3-clause&sort=downloads)<br />
[OpenRail](https://huggingface.co/datasets?task_categories=task_categories:summarization&license=license:openrail&sort=downloads)<br />
[Summarize_from_feedback](https://huggingface.co/datasets/openai/summarize_from_feedback)<br />
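
A minimal sketch of pulling one of the datasets above with Hugging Face `datasets`; xsum pairs BBC articles with single-sentence summaries:

```python
# Minimal sketch: load the xsum summarization dataset linked above.
from datasets import load_dataset

ds = load_dataset("xsum")          # splits: train / validation / test
example = ds["train"][0]
print(example["document"][:200])   # source article
print(example["summary"])          # one-sentence reference summary
```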
### Ambiguous License Data
[GPT-4-LLM](https://github.com/Instruction-Tuning-with-GPT-4/GPT-4-LLM)<br />
[GPT4All](https://huggingface.co/datasets/nomic-ai/gpt4all_prompt_generations)<br />
[LinkGPT4](https://github.com/lm-sys/FastChat/issues/90#issuecomment-1493250773)<br />
[ShareGPT52K](https://huggingface.co/datasets/RyokoAI/ShareGPT52K)<br />
[ShareGPT_Vicuna](https://huggingface.co/datasets/anon8231489123/ShareGPT_Vicuna_unfiltered)<br />
[ChatLogs](https://chatlogs.net/)<br />
[Alpaca-CoT](https://github.com/PhoebusSi/alpaca-CoT)<br />
[LaMini-LM](https://github.com/mbzuai-nlp/LaMini-LM)<br />
### Non-commercial Data
[GPT-3 based Alpaca Cleaned](https://github.com/gururise/AlpacaDataCleaned)<br />
### Prompt Engineering
[Prompt/P-tuning](https://github.com/huggingface/peft)<br />
[Prompt/P-tuning NeMo/NVIDIA](https://docs.nvidia.com/deeplearning/nemo/user-guide/docs/en/main/nlp/nemo_megatron/prompt_learning.html)<br />
[Prompt engineering overview (Lilian Weng)](https://lilianweng.github.io/posts/2023-03-15-prompt-engineering/)<br />
[Prompt-Engineering-Guide (dair-ai)](https://github.com/dair-ai/Prompt-Engineering-Guide)<br />
[Prompt-Tuning](https://arxiv.org/abs/2104.08691)<br />
[P-tuning v2](https://arxiv.org/abs/2110.07602)<br />
[babyagi](https://github.com/yoheinakajima/babyagi/blob/main/babyagi.py#L97-L134)<br />
[APE](https://www.promptingguide.ai/techniques/ape)<br />
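
A minimal prompt-tuning sketch using the PEFT repo linked above: the base model stays frozen and only a short learned "soft prompt" is trained. The model choice and virtual-token count are illustrative assumptions:

```python
# Minimal prompt-tuning sketch with peft: train only soft-prompt embeddings.
from transformers import AutoModelForCausalLM
from peft import PromptTuningConfig, get_peft_model, TaskType

base = AutoModelForCausalLM.from_pretrained("EleutherAI/pythia-6.9b")
config = PromptTuningConfig(
    task_type=TaskType.CAUSAL_LM,
    num_virtual_tokens=20,  # length of the learned soft prompt
)
model = get_peft_model(base, config)
model.print_trainable_parameters()  # only the virtual-token embeddings are trainable
```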
### Validation
[Bleu/Rouge/Meteor/Bert-Score](https://arize.com/blog-course/generative-ai-metrics-bleu-score/)<br />
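
A minimal sketch of computing the reference-based metrics discussed above with Hugging Face `evaluate` (ROUGE additionally needs the `rouge_score` package):

```python
# Minimal sketch: score generated text against references with BLEU and ROUGE.
import evaluate

rouge = evaluate.load("rouge")
bleu = evaluate.load("bleu")

predictions = ["the cat sat on the mat"]
references = ["a cat was sitting on the mat"]

print(rouge.compute(predictions=predictions, references=references))
print(bleu.compute(predictions=predictions, references=references))
```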
### Generation Hyperparameters
[how-to-generate](https://huggingface.co/blog/how-to-generate)<br />
[Notes_on_Transformers_Chpt5](https://christianjmills.com/posts/transformers-book-notes/chapter-5/index.html)<br />
[Notes_on_Transformers_Chpt10](https://christianjmills.com/posts/transformers-book-notes/chapter-10/index.html)<br />
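
A minimal sketch of the decoding knobs those posts cover, passed to `generate()` in transformers; gpt2 and the specific values are illustrative:

```python
# Minimal sketch: common sampling hyperparameters for transformers' generate().
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

inputs = tokenizer("The key to fine-tuning is", return_tensors="pt")
outputs = model.generate(
    **inputs,
    do_sample=True,          # sample instead of greedy/beam decoding
    temperature=0.8,         # <1.0 sharpens the token distribution
    top_k=50,                # keep only the 50 most likely tokens
    top_p=0.95,              # nucleus sampling: smallest set with 95% mass
    repetition_penalty=1.2,  # discourage verbatim repetition
    max_new_tokens=64,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```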
### Embeddings
[OpenAI Expensive?](https://medium.com/@nils_reimers/openai-gpt-3-text-embeddings-really-a-new-state-of-the-art-in-dense-text-embeddings-6571fe3ec9d9)<br />
[Leaderboard](https://huggingface.co/spaces/mteb/leaderboard)<br />
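
A minimal sketch of a self-hosted alternative to paid embedding APIs using `sentence-transformers` (many of its models place well on the MTEB leaderboard above); the model choice is an illustrative assumption:

```python
# Minimal sketch: local embeddings + cosine similarity with sentence-transformers.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")
docs = [
    "LLaMA is a family of language models.",
    "The mitochondria is the powerhouse of the cell.",
]
query_emb = model.encode("open source LLMs", convert_to_tensor=True)
doc_embs = model.encode(docs, convert_to_tensor=True)

scores = util.cos_sim(query_emb, doc_embs)  # shape (1, len(docs))
print(scores)
```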
### Commercial products
[OpenAI](https://platform.openai.com/docs/guides/fine-tuning/advanced-usage)<br />
[OpenAI Tokenizer](https://platform.openai.com/tokenizer)<br />
[OpenAI Playground](https://platform.openai.com/playground)<br />
[OpenAI Chat](https://chat.openai.com/chat?)<br />
[OpenAI GPT-4 Chat](https://chat.openai.com/chat?model=gpt-4)<br />
[cohere](https://cohere.io/)<br />
[coherefinetune](https://docs.cohere.ai/reference/finetune)<br />
[DocsBotAI](https://docsbot.ai/)<br />
[Perplexity](https://www.perplexity.ai/)<br />
[VoiceFlow](https://www.voiceflow.com/)<br />
[NLPCloud](https://nlpcloud.com/effectively-using-gpt-j-gpt-neo-gpt-3-alternatives-few-shot-learning.html)<br />
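
A minimal sketch of calling the OpenAI chat endpoint listed above, using the 0.x-series `openai` Python client (expects `OPENAI_API_KEY` in the environment):

```python
# Minimal sketch: one chat completion via the openai 0.x Python client.
import openai

resp = openai.ChatCompletion.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": "Summarize LoRA in one sentence."}],
    temperature=0.7,
)
print(resp["choices"][0]["message"]["content"])
```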
### Multinode inference
[FasterTransformer](https://github.com/triton-inference-server/fastertransformer_backend#multi-node-inference)<br />
[Kubernetes Triton](https://developer.nvidia.com/blog/deploying-nvidia-triton-at-scale-with-mig-and-kubernetes/)<br />
### Faster inference
[text-generation-inference](https://github.com/huggingface/text-generation-inference)<br />
[Optimum](https://github.com/huggingface/optimum)<br />
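
A minimal client sketch for text-generation-inference, assuming a server is already running (e.g. via the repo's Docker image) and mapped to local port 8080; the payload shape follows the repo's README:

```python
# Minimal sketch: query a running text-generation-inference server over HTTP.
import requests

resp = requests.post(
    "http://127.0.0.1:8080/generate",
    json={
        "inputs": "What is quantization?",
        "parameters": {"max_new_tokens": 64, "temperature": 0.7},
    },
)
print(resp.json()["generated_text"])
```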
### Semi-Open source Semi-Commercial products
[OpenAssistant](https://open-assistant.io/)<br />
[OpenAssistant Repo](https://github.com/LAION-AI/Open-Assistant)<br />
[OpenChatKit](https://github.com/togethercomputer/OpenChatKit)<br />
[OpenChatKit2](https://github.com/togethercomputer/OpenDataHub)<br />
[OpenChatKit3](https://www.together.xyz/blog/openchatkit)<br />
[OpenChatKit4](https://github.com/togethercomputer/OpenChatKit/blob/main/training/README.md#arguments)<br />
[OpenChatKitPreview](https://api.together.xyz/open-chat?preview=1)<br />
[langchain](https://python.langchain.com/en/latest/)<br />
[langchain+pinecone](https://www.youtube.com/watch?v=nMniwlGyX-c)<br />
### Q/A docs
[HUMATA](https://www.humata.ai/)<br />
[OSSCHat](https://osschat.io/)<br />
[NeuralSearchCohere](https://txt.cohere.com/embedding-archives-wikipedia/)<br />
[ue5](https://github.com/bublint/ue5-llama-lora)<br />
### AutoGPT type projects
[AgentGPT](https://github.com/reworkd/AgentGPT)<br />
[Self-DEBUG](https://arxiv.org/abs/2304.05128)<br />
[BabyAGI](https://github.com/yoheinakajima/babyagi/)<br />
[AutoPR](https://github.com/irgolic/AutoPR)<br />
### Cloud fine-tune
[AWS](https://docs.aws.amazon.com/sagemaker/latest/dg/jumpstart-fine-tune.html)<br />
[AWS2](https://aws.amazon.com/blogs/machine-learning/training-large-language-models-on-amazon-sagemaker-best-practices/)<br />
### Chatbots:
[GPT4All Chat](https://github.com/nomic-ai/gpt4all-chat)<br />
[GPT4All](https://github.com/nomic-ai/gpt4all)<br />
[OASST](https://open-assistant.io/chat)<br />
[FastChat](https://github.com/lm-sys/FastChat)<br />
[Dolly](https://huggingface.co/spaces/HuggingFaceH4/databricks-dolly)<br />
[HF Instructions](https://huggingface.co/spaces/HuggingFaceH4/instruction-model-outputs-filtered)<br />
[DeepSpeed Chat](https://github.com/microsoft/DeepSpeedExamples/tree/master/applications/DeepSpeed-Chat)<br />
[LoraChat](https://github.com/bupticybee/FastLoRAChat)<br />
[Tabby](https://github.com/TabbyML/tabby)<br />
[TalkToModel](https://github.com/dylan-slack/TalkToModel)<br />
[You.com](https://you.com/)<br />
### LangChain or Agent related
[Gradio Tools](https://github.com/freddyaboulton/gradio-tools)<br />
[LLM Agents](https://blog.langchain.dev/gradio-llm-agents/)<br />
[Meta Prompt](https://github.com/mbchang/meta-prompt)<br />
[HF Agents](https://huggingface.co/docs/transformers/transformers_agents)<br />
[HF Agents Colab](https://colab.research.google.com/drive/1c7MHD-T1forUPGcC_jlwsIptOzpG3hSj)<br />
[Einstein GPT](https://www.salesforce.com/products/einstein/overview/?d=cta-body-promo-8)<br />
[SMOL-AI](https://github.com/smol-ai/developer)<br />
[Pandas-AI](https://github.com/gventuri/pandas-ai/)<br />
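
A minimal sketch of the Transformers Agents API linked above (transformers >= 4.29); the remote StarCoder endpoint is the one shown in HF's own docs, and its availability is an assumption:

```python
# Minimal sketch: run a natural-language task through transformers' HfAgent.
from transformers import HfAgent

agent = HfAgent("https://api-inference.huggingface.co/models/bigcode/starcoder")
result = agent.run("Translate this sentence to French: 'The weather is nice.'")
print(result)
```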
### Summaries
[LLMs](https://github.com/Mooler0410/LLMsPracticalGuide)<br />
### Deployment
[MLC-LLM](https://github.com/mlc-ai/mlc-llm)<br />
### Evaluations
[LMSYS (check for latest blog)](https://lmsys.org/blog/2023-05-25-leaderboard/)<br />
[LMSYS Chatbot Arena](https://chat.lmsys.org/?arena)<br />
[LMSYS Add model](https://github.com/lm-sys/FastChat/blob/main/docs/arena.md#how-to-add-a-new-model)<br />
[LMFlow benchmark (NLL)](https://blog.gopenai.com/lmflow-benchmark-an-automatic-evaluation-framework-for-open-source-llms-ef5c6f142418)<br />
[HackAPrompt](https://www.aicrowd.com/challenges/hackaprompt-2023/leaderboards)<br />