33 323

jiakai

real-jiakai

https://blog.gujiakai.top

AI & ML interests

LLM && Smart QA

Recent Activity

liked a dataset about 2 hours ago

tblard/allocine

liked a model about 11 hours ago

bartowski/Marco-o1-GGUF

liked a model about 11 hours ago

AIDC-AI/Marco-o1

real-jiakai's activity

upvoted a paper 4 days ago

LLaVA-o1: Let Vision Language Models Reason Step-by-Step

Paper • 2411.10440 • Published 8 days ago • 94

upvoted an article 9 days ago

Article

Releasing the largest multilingual open pretraining dataset

10 days ago • 94

upvoted a paper 11 days ago

TableGPT2: A Large Multimodal Model with Tabular Data Integration

Paper • 2411.02059 • Published 20 days ago • 5

upvoted a collection 12 days ago

Qwen2.5-Coder

Code-specific model series based on Qwen2.5 • 40 items • Updated 6 days ago • 229

upvoted a collection 13 days ago

OpenCoder

OpenCoder is an open and reproducible code LLM family which matches the performance of top-tier code LLMs. • 8 items • Updated about 14 hours ago • 71

upvoted a paper 14 days ago

Large Language Models Orchestrating Structured Reasoning Achieve Kaggle Grandmaster Level

Paper • 2411.03562 • Published 18 days ago • 60

upvoted a paper 24 days ago

AgentStore: Scalable Integration of Heterogeneous Agents As Specialized Generalist Computer Assistant

Paper • 2410.18603 • Published about 1 month ago • 30

upvoted a paper 26 days ago

FasterCache: Training-Free Video Diffusion Model Acceleration with High Quality

Paper • 2410.19355 • Published 30 days ago • 23

upvoted a paper 29 days ago

Why Does the Effective Context Length of LLMs Fall Short?

Paper • 2410.18745 • Published about 1 month ago • 16

upvoted a paper about 1 month ago

LlamaFactory: Unified Efficient Fine-Tuning of 100+ Language Models

Paper • 2403.13372 • Published Mar 20 • 62

upvoted an article about 1 month ago

Article

GaLore: Advancing Large Model Training on Consumer-grade Hardware

Mar 20 • 25

upvoted a paper about 2 months ago

HelloBench: Evaluating Long Text Generation Capabilities of Large Language Models

Paper • 2409.16191 • Published Sep 24 • 41

upvoted a collection about 2 months ago

Llama 3.2

This collection hosts the transformers and original repos of the Llama 3.2 and Llama Guard 3 • 15 items • Updated about 1 month ago • 489

upvoted 2 collections 3 months ago

Enhance Your Images

Some trending Gradio apps on Spaces that you can use to enhance/upscale your images for free. This collection will be kept up to date with new releases. • 7 items • Updated Aug 22 • 17

Gradio Spaces for Background Removal

Enhance your images by removing the background. Will ensure these Spaces are up and maintained for the community. • 5 items • Updated Aug 20 • 23

upvoted a collection 5 months ago

LLM Compiler

Meta LLM Compiler is a state-of-the-art LLM that builds upon Code Llama with improved performance for code optimization and compiler reasoning. • 4 items • Updated Jun 27 • 148

upvoted 2 papers 5 months ago

DeepSeek-Coder-V2: Breaking the Barrier of Closed-Source Models in Code Intelligence

Paper • 2406.11931 • Published Jun 17 • 57

ChatGLM: A Family of Large Language Models from GLM-130B to GLM-4 All Tools

Paper • 2406.12793 • Published Jun 18 • 31

upvoted a collection 5 months ago

Nemotron 4 340B

Nemotron-4: open models for Synthetic Data Generation (SDG). Includes Base, Instruct, and Reward models. • 4 items • Updated 21 days ago • 158

upvoted an article 6 months ago

Article

Training and Finetuning Embedding Models with Sentence Transformers v3

May 28 • 159