AxAI (Shikhar Singh)

upvoted an article 8 days ago

Article

Llama can now see and run on your device - welcome Llama 3.2

11 days ago

• 137

upvoted a paper 24 days ago

General OCR Theory: Towards OCR-2.0 via a Unified End-to-end Model

Paper • 2409.01704 • Published Sep 3 • 78

upvoted a collection about 2 months ago

Qwen2-Math

Collection

Math-specific model series based on Qwen2 • 8 items • Updated 17 days ago • 44

upvoted 3 articles 3 months ago

Article

How NuminaMath Won the 1st AIMO Progress Prize

Jul 11

• 93

Article

Docmatix - a huge dataset for Document Visual Question Answering

Jul 18

• 65

Article

ColPali: Efficient Document Retrieval with Vision Language Models 👀

Jul 5

• 110

upvoted a collection 3 months ago

Gemma 2 Release

Collection

15 items • Updated 26 days ago • 176

upvoted an article 3 months ago

Article

Fine-tuning Florence-2 - Microsoft's Cutting-edge Vision Language Models

Jun 24

• 170

upvoted a collection 4 months ago

Qwen2

Collection

Qwen2 language models, including pretrained and instruction-tuned models of 5 sizes, including 0.5B, 1.5B, 7B, 57B-A14B, and 72B. • 39 items • Updated 17 days ago • 340

upvoted a paper 4 months ago

DeepSeek-Prover: Advancing Theorem Proving in LLMs through Large-Scale Synthetic Data

Paper • 2405.14333 • Published May 23 • 32

upvoted 2 articles 5 months ago

Article

PaliGemma – Google's Cutting-Edge Open Vision Language Model

May 14

• 201

Article

Fine-tuning Llama 2 70B using PyTorch FSDP

Sep 13, 2023

• 13

upvoted a collection 5 months ago

Yi-1.5 (2024/05)

Collection

10 items • Updated May 20 • 88

upvoted 5 articles 5 months ago

Article

Optimizing your LLM in production

Sep 15, 2023

• 14

Article

Introducing Idefics2: A Powerful 8B Vision-Language Model for the community

Apr 15

• 161

Article

Accelerating Document AI

Nov 21, 2022

• 31

Article

A Dive into Pretraining Strategies for Vision-Language Models

Feb 3, 2023

• 36

Article

Design choices for Vision Language Models in 2024

Apr 16

• 24

upvoted a paper 5 months ago

LoRA Land: 310 Fine-tuned LLMs that Rival GPT-4, A Technical Report

Paper • 2405.00732 • Published Apr 29 • 118

upvoted 2 articles 5 months ago

Article

SeeMoE: Implementing a MoE Vision Language Model from Scratch

Jun 23

• 33

Article

Vision Language Models Explained

Apr 11

• 185

upvoted a collection 5 months ago

VILA: On Pre-training for Visual Language Models

Collection

10 items • Updated Aug 21 • 44

upvoted a paper 6 months ago

MobileSAMv2: Faster Segment Anything to Everything

Paper • 2312.09579 • Published Dec 15, 2023 • 20

upvoted a collection 6 months ago

Meta Llama 3

Collection

This collection hosts the transformers and original repos of the Meta Llama 3 and Llama Guard 2 releases • 5 items • Updated 10 days ago • 676

upvoted a paper 6 months ago

LMDX: Language Model-based Document Information Extraction and Localization

Paper • 2309.10952 • Published Sep 19, 2023 • 65

upvoted 3 papers 7 months ago

PERL: Parameter Efficient Reinforcement Learning from Human Feedback

Paper • 2403.10704 • Published Mar 15 • 56

GaLore: Memory-Efficient LLM Training by Gradient Low-Rank Projection

Paper • 2403.03507 • Published Mar 6 • 182

mPLUG-DocOwl: Modularized Multimodal Large Language Model for Document Understanding

Paper • 2307.02499 • Published Jul 4, 2023 • 15

upvoted 2 papers 10 months ago

TinyGSM: achieving >80% on GSM8k with small language models

Paper • 2312.09241 • Published Dec 14, 2023 • 36

CogVLM: Visual Expert for Pretrained Language Models

Paper • 2311.03079 • Published Nov 6, 2023 • 23

upvoted 2 collections 11 months ago

OpenChat

Collection

OpenChat: Advancing Open-source Language Models with Mixed-Quality Data • 7 items • Updated Jul 31 • 33

Zephyr 7B

Collection

Models, datasets, and demos associated with Zephyr 7B. For code to train the models, see: https://github.com/huggingface/alignment-handbook • 9 items • Updated Apr 12 • 144

upvoted 2 papers 12 months ago

Solving Challenging Math Word Problems Using GPT-4 Code Interpreter with Code-based Self-Verification

Paper • 2308.07921 • Published Aug 15, 2023 • 22

Prometheus: Inducing Fine-grained Evaluation Capability in Language Models

Paper • 2310.08491 • Published Oct 12, 2023 • 53

upvoted 10 papers about 1 year ago

A Paradigm Shift in Machine Translation: Boosting Translation Performance of Large Language Models

Paper • 2309.11674 • Published Sep 20, 2023 • 31

Investigating Answerability of LLMs for Long-Form Question Answering

Paper • 2309.08210 • Published Sep 15, 2023 • 12

Multimodal Foundation Models: From Specialists to General-Purpose Assistants

Paper • 2309.10020 • Published Sep 18, 2023 • 40

CulturaX: A Cleaned, Enormous, and Multilingual Dataset for Large Language Models in 167 Languages

Paper • 2309.09400 • Published Sep 17, 2023 • 82

YaRN: Efficient Context Window Extension of Large Language Models

Paper • 2309.00071 • Published Aug 31, 2023 • 65

OBELICS: An Open Web-Scale Filtered Dataset of Interleaved Image-Text Documents

Paper • 2306.16527 • Published Jun 21, 2023 • 47

UniversalNER: Targeted Distillation from Large Language Models for Open Named Entity Recognition

Paper • 2308.03279 • Published Aug 7, 2023 • 21

upvoted a paper over 1 year ago

Faster Segment Anything: Towards Lightweight SAM for Mobile Applications

Paper • 2306.14289 • Published Jun 25, 2023 • 15
