James PRO

jtatman

AI & ML interests

Improving domain-specific models and re-sampling data, refining datasets for use in different modalities, small-scale micro-LLM clusters using quantized and smoothed models, and emerging technologies that connect the LLM stack. Small models rock.

Recent Activity

updated a model about 8 hours ago

jtatman/grounding-dino-finetuned-license-plates

updated a model about 8 hours ago

jtatman/grounding-dino-license-plates-finetuned

jtatman's activity

upvoted a paper about 14 hours ago

Hymba: A Hybrid-head Architecture for Small Language Models

Paper • 2411.13676 • Published 4 days ago • 33

upvoted a paper 2 months ago

Paper Copilot: A Self-Evolving and Efficient LLM System for Personalized Academic Assistance

Paper • 2409.04593 • Published Sep 6 • 22

upvoted a collection 4 months ago

Llama 3.1

Collection

This collection hosts the transformers and original repos of the Llama 3.1, Llama Guard 3 and Prompt Guard models • 11 items • Updated Sep 25 • 624

upvoted an article 4 months ago

Llama 3.1 - 405B, 70B & 8B with multilinguality and long context

Article • Jul 23 • 215

upvoted a paper 4 months ago

Retrieval-Enhanced Machine Learning: Synthesis and Opportunities

Paper • 2407.12982 • Published Jul 17 • 5

upvoted a paper 5 months ago

Model Merging and Safety Alignment: One Bad Model Spoils the Bunch

Paper • 2406.14563 • Published Jun 20 • 29

upvoted an article 5 months ago

Welcome Gemma - Google's new open LLM

Article • Feb 21 • 16

upvoted a collection 5 months ago

abliterated-v3

Collection

Latest gen of the abliterated models I've produced • 17 items • Updated Jun 3 • 97

upvoted an article 7 months ago

SeeMoE: Implementing a MoE Vision Language Model from Scratch

Article • Jun 23 • 34

upvoted a paper 9 months ago

ShortGPT: Layers in Large Language Models are More Redundant Than You Expect

Paper • 2403.03853 • Published Mar 6 • 62

upvoted a paper 10 months ago

Meta-Prompting: Enhancing Language Models with Task-Agnostic Scaffolding

Paper • 2401.12954 • Published Jan 23 • 29

upvoted 2 papers 11 months ago

Transformers are Multi-State RNNs

Paper • 2401.06104 • Published Jan 11 • 36

LLM in a flash: Efficient Large Language Model Inference with Limited Memory

Paper • 2312.11514 • Published Dec 12, 2023 • 258

upvoted a paper 12 months ago

Rethinking Attention: Exploring Shallow Feed-Forward Neural Networks as an Alternative to Attention Layers in Transformers

Paper • 2311.10642 • Published Nov 17, 2023 • 23