BLIP3-KALE: Knowledge Augmented Large-Scale Dense Captions • Paper • 2411.07461 • Published Nov 2024
To Code, or Not To Code? Exploring Impact of Code in Pre-training • Paper • 2408.10914 • Published Aug 20, 2024
Transfusion: Predict the Next Token and Diffuse Images with One Multi-Modal Model • Paper • 2408.11039 • Published Aug 20, 2024
xGen-MM (BLIP-3): A Family of Open Large Multimodal Models • Paper • 2408.08872 • Published Aug 16, 2024
JPEG-LM: LLMs as Image Generators with Canonical Codec Representations • Paper • 2408.08459 • Published Aug 15, 2024
XGen-MM-1 models and datasets • Collection • A collection of all XGen-MM (Foundation LMM) models! • 15 items • Updated 17 days ago
🍃 MINT-1T • Collection • Data for "MINT-1T: Scaling Open-Source Multimodal Data by 10x: A Multimodal Dataset with One Trillion Tokens" • 13 items • Updated Jul 24, 2024
4M Tokenizers • Collection • Multimodal tokenizers from https://4m.epfl.ch/ • 12 items • Updated Jun 14, 2024
MINT-1T: Scaling Open-Source Multimodal Data by 10x: A Multimodal Dataset with One Trillion Tokens • Paper • 2406.11271 • Published Jun 17, 2024
WorldDreamer: Towards General World Models for Video Generation via Predicting Masked Tokens • Paper • 2401.09985 • Published Jan 18, 2024
Tiny Series • Collection • Tiny datasets that empower the foundations of small language models! • 11 items • Updated Jan 26, 2024
OpenFlamingo: An Open-Source Framework for Training Large Autoregressive Vision-Language Models • Paper • 2308.01390 • Published Aug 2, 2023
SPAE: Semantic Pyramid AutoEncoder for Multimodal Generation with Frozen LLMs • Paper • 2306.17842 • Published Jun 30, 2023