Models
Datasets
Spaces
Posts
Docs
Pricing
Log In
Sign Up

Collections

Discover the best community collections!

Collections including paper arxiv:1810.04805

This is information about BERT

BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding

Paper • 1810.04805 • Published Oct 11, 2018 • 14

BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding

Paper • 1810.04805 • Published Oct 11, 2018 • 14

Language model papers

RoFormer: Enhanced Transformer with Rotary Position Embedding

Paper • 2104.09864 • Published Apr 20, 2021 • 9
Attention Is All You Need

Paper • 1706.03762 • Published Jun 12, 2017 • 41
LoRA: Low-Rank Adaptation of Large Language Models

Paper • 2106.09685 • Published Jun 17, 2021 • 29
FlashAttention: Fast and Memory-Efficient Exact Attention with IO-Awareness

Paper • 2205.14135 • Published May 27, 2022 • 10

Must-read Papers

Some of the most important and insightful (in my opinion) AI papers of today, with a focus on NLP and LLMs.

ReAct: Synergizing Reasoning and Acting in Language Models

Paper • 2210.03629 • Published Oct 6, 2022 • 14
Attention Is All You Need

Paper • 1706.03762 • Published Jun 12, 2017 • 41
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding

Paper • 1810.04805 • Published Oct 11, 2018 • 14
Jamba: A Hybrid Transformer-Mamba Language Model

Paper • 2403.19887 • Published Mar 28 • 103

Attention Is All You Need

Paper • 1706.03762 • Published Jun 12, 2017 • 41
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding

Paper • 1810.04805 • Published Oct 11, 2018 • 14
DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter

Paper • 1910.01108 • Published Oct 2, 2019 • 14
Language Models are Few-Shot Learners

Paper • 2005.14165 • Published May 28, 2020 • 11

Fundamentals NLP

Distributed Representations of Sentences and Documents

Paper • 1405.4053 • Published May 16, 2014
Sequence to Sequence Learning with Neural Networks

Paper • 1409.3215 • Published Sep 10, 2014 • 3
PaLM: Scaling Language Modeling with Pathways

Paper • 2204.02311 • Published Apr 5, 2022 • 1
Recent Trends in Deep Learning Based Natural Language Processing

Paper • 1708.02709 • Published Aug 9, 2017

Must Reads On Language Model

Dive into the world of generative AI with some prominent papers of Language Model, unlocking the secrets of natural language processing.

BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding

Paper • 1810.04805 • Published Oct 11, 2018 • 14
RoBERTa: A Robustly Optimized BERT Pretraining Approach

Paper • 1907.11692 • Published Jul 26, 2019 • 7
Language Models are Few-Shot Learners

Paper • 2005.14165 • Published May 28, 2020 • 11
OPT: Open Pre-trained Transformer Language Models

Paper • 2205.01068 • Published May 2, 2022 • 2

Research Papers

Research papers related to NLP.

Attention Is All You Need

Paper • 1706.03762 • Published Jun 12, 2017 • 41
Self-Attention with Relative Position Representations

Paper • 1803.02155 • Published Mar 6, 2018
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding

Paper • 1810.04805 • Published Oct 11, 2018 • 14
Meta-Prompting: Enhancing Language Models with Task-Agnostic Scaffolding

Paper • 2401.12954 • Published Jan 23 • 28

Papers - Text - Decoders

BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding

Paper • 1810.04805 • Published Oct 11, 2018 • 14
Transformers Can Achieve Length Generalization But Not Robustly

Paper • 2402.09371 • Published Feb 14 • 12
A Thorough Examination of Decoding Methods in the Era of LLMs

Paper • 2402.06925 • Published Feb 10 • 1

Papers - Text - Encoders

BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding

Paper • 1810.04805 • Published Oct 11, 2018 • 14
Transformers Can Achieve Length Generalization But Not Robustly

Paper • 2402.09371 • Published Feb 14 • 12
Triple-Encoders: Representations That Fire Together, Wire Together

Paper • 2402.12332 • Published Feb 19 • 2

Previous
1
2
3
Next

Company

© Hugging Face

TOS Privacy About Jobs

Website

Models Datasets Spaces Pricing Docs