alkinun

AtAndDev

AI & ML interests

LLMs, Alignment, Merging, Unsloth, DPO, SFT, ORPO, SPIN...

Recent Activity

Reacted to burtenshaw's post with ❤️ 34 minutes ago

For anyone looking to boost their LLM fine-tuning and alignment skills this December: we're running a free and open course called smol course. It’s not big like Li Yin and @mlabonne, it’s just smol. 👷 It focuses on practical use cases, so if you’re working on something, bring it along. 👯‍♀️ It’s peer reviewed and open so you can discuss and get feedback. 🤘 If you’re already a smol pro, feel free to drop a star or issue. Part 1 starts now, and it’s on instruction tuning! https://github.com/huggingface/smol-course

Reacted to vincentg64's post with 👀 about 3 hours ago

LLM 2.0, the New Generation of Large Language Models https://mltblog.com/49ksOLL

I get many questions about the radically different LLM technology that I started to develop 2 years ago. Initially designed to retrieve information that I could no longer find on the Internet, not with search, OpenAI, Gemini, Perplexity or any other platform, it evolved to become the ideal solution for professional enterprise users. It is now agentic and multimodal, automating business tasks at scale with lightning speed, consistently delivering real ROI, and bypassing the costs associated with training and GPUs thanks to zero weights and explainable AI. It was tested and developed for a Fortune 100 company.

So, what is behind the scenes, how different is it compared to LLM 1.0 (GPT and the like), how can it be hallucination-free, what makes it a game changer, how did it eliminate prompt engineering, how does it handle knowledge graphs without neural networks, and what are the other benefits?

In a nutshell, the performance is due to building a robust architecture from the ground up and at every step, offering far more than a prompt box, relying on home-made technology rather than faulty Python libraries, and being designed by enterprise and tech visionaries for enterprise users.

Contextual smart crawling to retrieve underlying taxonomies, augmented taxonomies, long contextual multi-tokens, real-time fine-tuning, increased security, an LLM router with specialized sub-LLMs, an in-memory database architecture of its own to efficiently handle sparsity in keyword associations, contextual backend tables, agents built on the backend, mapping between prompt and corpus keywords, customized PMI rather than cosine similarity, variable-length embeddings, and the scoring engine (the new “PageRank” of LLMs) returning results along with the relevancy scores, are but a few of the differentiators.

➡️ Read the full article at https://mltblog.com/49ksOLL

Reacted to vansin's post with 👀 about 3 hours ago

Try InternThinker~ https://internlm-chat.intern-ai.org.cn/internthinker

models 6

AtAndDev/Ogno-Monarch-Neurotic-9B-Passthrough

Text Generation • Updated Mar 1 • 73

AtAndDev/Ogno-Monarch-Neurotic-7B-Dare-Ties

Text Generation • Updated Mar 1 • 83

AtAndDev/Marcoro14-7B-Slerp

Text Generation • Updated Mar 1 • 14

AtAndDev/CapybaraMarcoroni-7B

Text Generation • Updated Jan 7 • 787

AtAndDev/ShortKing-3b-v0.2

Text Generation • Updated Oct 2, 2023 • 76 • 2

AtAndDev/ShortKing-1.4b-v0.1

Text Generation • Updated Sep 29, 2023 • 5k • 2

datasets 7

AtAndDev/sedir-clean

Viewer • Updated about 5 hours ago • 6.4k

AtAndDev/sedir-unclean

Viewer • Updated about 5 hours ago • 10.5k

AtAndDev/ultrachat_200k_formatted

Viewer • Updated Oct 10 • 208k • 41

AtAndDev/MedInstruct

Viewer • Updated Jul 20 • 216 • 43

AtAndDev/MedRag-textbooks-stella_en_400M_v5

Viewer • Updated Jul 14 • 126k • 43

AtAndDev/MedRag-textbooks-gte-large-en-v1.5

Viewer • Updated Jul 14 • 126k • 42

AtAndDev/MedRag-textbooks-mxbai-embed-large-v1

Viewer • Updated Jul 14 • 126k • 51