Erwin Weiß

Rexschwert

AI & ML interests

AI, Big Data, Data Science, Machine Learning, Computer Vision, Natural Language Processing

Rexschwert's activity

upvoted 7 papers 6 days ago

Predicting Emergent Capabilities by Finetuning

Paper • 2411.16035 • Published 8 days ago • 6

LLMs Do Not Think Step-by-step In Implicit Reasoning

Paper • 2411.15862 • Published 9 days ago • 8

Best of Both Worlds: Advantages of Hybrid Graph Sequence Models

Paper • 2411.15671 • Published 10 days ago • 7

All Languages Matter: Evaluating LMMs on Culturally Diverse 100 Languages

Paper • 2411.16508 • Published 8 days ago • 7

Knowledge Transfer Across Modalities with Natural Language Supervision

Paper • 2411.15611 • Published 10 days ago • 15

O1 Replication Journey -- Part 2: Surpassing O1-preview through Simple Distillation, Big Progress or Bitter Lesson?

Paper • 2411.16489 • Published 8 days ago • 35

From Generation to Judgment: Opportunities and Challenges of LLM-as-a-judge

Paper • 2411.16594 • Published 8 days ago • 35

liked a model 12 days ago

empower-dev/llama3-empower-functions-small-gguf-v1.1

Updated Sep 11 • 155 • 2

upvoted 7 papers about 1 month ago

Unleashing Reasoning Capability of LLMs via Scalable Question Synthesis from Scratch

Paper • 2410.18693 • Published Oct 24 • 40

Can Knowledge Editing Really Correct Hallucinations?

Paper • 2410.16251 • Published Oct 21 • 54

Breaking the Memory Barrier: Near Infinite Batch Size Scaling for Contrastive Loss

Paper • 2410.17243 • Published Oct 22 • 88

On Memorization of Large Language Models in Logical Reasoning

Paper • 2410.23123 • Published Oct 30 • 17

Stealing User Prompts from Mixture of Experts

Paper • 2410.22884 • Published Oct 30 • 13

A Large Recurrent Action Model: xLSTM enables Fast Inference for Robotics Tasks

Paper • 2410.22391 • Published Oct 29 • 21

Task Vectors are Cross-Modal

Paper • 2410.22330 • Published Oct 29 • 11

liked 3 models about 1 month ago

unsloth/Llama-3.2-90B-Vision-Instruct-bnb-4bit

Image-Text-to-Text • Updated 11 days ago • 3.35k • 16

AetherArchitectural/GGUF-Quantization-Script

Text Generation • Updated 14 days ago • 62

empower-dev/llama3-empower-functions-large-v1.1

Text Generation • Updated Sep 10 • 53 • 1

upvoted 2 collections about 2 months ago

Llama 3.1

This collection hosts the transformers and original repos of the Llama 3.1, Llama Guard 3 and Prompt Guard models • 11 items • Updated Sep 25 • 630

Llama 3.2

This collection hosts the transformers and original repos of the Llama 3.2 and Llama Guard 3 • 15 items • Updated Oct 24 • 517