
Seth L

splevine

AI & ML interests

NLP, Sentiment, Dialogue, Conversational AI

Organizations

None yet

splevine's activity

Reacted to victor's post with 🔥 3 days ago
Qwen2.5-72B is now the default HuggingChat model.
This model is so good that you must try it! I often get better results on rephrasing with it than with Sonnet or GPT-4!
upvoted an article 3 days ago
upvoted an article 11 days ago

How to build a custom text classifier without days of human labeling

By sdiazlor
Reacted to tomaarsen's post with 🔥 12 days ago
I just released Sentence Transformers v3.3.0 & it's huge! 4.5x speedup for CPU with OpenVINO int8 static quantization, training with prompts for a free perf. boost, PEFT integration, evaluation on NanoBEIR, and more! Details:

1. We integrate Post-Training Static Quantization using OpenVINO, a very efficient solution for CPUs that processes 4.78x as many texts per second on average, while only hurting performance by 0.36% on average. There's a new export_static_quantized_openvino_model method to quantize a model.
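A minimal sketch of the export step, assuming the helper is importable from the package top level and that the quantization config comes from optimum.intel (import paths, defaults, and the model name here are assumptions, not from the post):

```python
from sentence_transformers import SentenceTransformer, export_static_quantized_openvino_model
from optimum.intel import OVQuantizationConfig

# Load a model with the OpenVINO backend (model name is only an example)
model = SentenceTransformer("all-MiniLM-L6-v2", backend="openvino")

# Post-training static int8 quantization; the exported model is written to the given path
quantization_config = OVQuantizationConfig()
export_static_quantized_openvino_model(
    model,
    quantization_config,
    "all-MiniLM-L6-v2-int8-static",
)
```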

2. We add the option to train with prompts, e.g. strings like "query: ", "search_document: " or "Represent this sentence for searching relevant passages: ". It's as simple as using the prompts argument in SentenceTransformerTrainingArguments. Our experiments show that you can easily reach 0.66% to 0.90% relative performance improvement on NDCG@10 at no extra cost by adding "query: " before each training query and "document: " before each training answer.
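A sketch of training with prompts, assuming a training dataset with "query" and "answer" columns (the column names and output directory are placeholders):

```python
from sentence_transformers import SentenceTransformerTrainingArguments

args = SentenceTransformerTrainingArguments(
    output_dir="models/my-model-with-prompts",
    # Map dataset column names to the prompt prepended to that column's texts
    prompts={
        "query": "query: ",
        "answer": "document: ",
    },
)
```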

3. Sentence Transformers now supports training PEFT adapters via 7 new methods for adding new adapters or loading pre-trained ones. You can also directly load a trained adapter with SentenceTransformer as if it's a normal model. Very useful for e.g. 1) training multiple adapters on 1 base model, 2) training bigger models than otherwise possible, or 3) cheaply hosting multiple models by switching multiple adapters on 1 base model.
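A sketch of attaching a fresh LoRA adapter to a base model, assuming one of the new methods is add_adapter and using peft's LoraConfig (the hyperparameters are illustrative):

```python
from peft import LoraConfig, TaskType
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")

# Attach a new LoRA adapter; only the adapter weights are trained
peft_config = LoraConfig(
    task_type=TaskType.FEATURE_EXTRACTION,
    r=8,
    lora_alpha=32,
    lora_dropout=0.1,
)
model.add_adapter(peft_config)
```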

4. We added easy evaluation on NanoBEIR, a subset of BEIR a.k.a. the MTEB Retrieval benchmark. It contains 13 datasets with 50 queries and up to 10k documents each. Evaluation is fast, and can easily be done during training to track your model's performance on general-purpose information retrieval tasks.
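A sketch of the NanoBEIR evaluation, assuming the evaluator is named NanoBEIREvaluator and lives in sentence_transformers.evaluation:

```python
from sentence_transformers import SentenceTransformer
from sentence_transformers.evaluation import NanoBEIREvaluator

model = SentenceTransformer("all-MiniLM-L6-v2")

# Runs retrieval evaluation over the NanoBEIR datasets and reports metrics such as NDCG@10
evaluator = NanoBEIREvaluator()
results = evaluator(model)
print(results)
```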

Additionally, we also deprecate Python 3.8, add better compatibility with Transformers v4.46.0, and more. Read the full release notes here: https://github.com/UKPLab/sentence-transformers/releases/tag/v3.3.0
Reacted to m-ric's post with 🔥 12 days ago
Qwen2.5-Coder-32B: new best-in-class open coding model, beats GPT-4o on most coding benchmarks! 💥

💪 It's the first time an open-source coding model of this size class clearly matches GPT-4o's coding capabilities!

✨ Completes the previous two Qwen 2.5 Coder releases with 4 new sizes: 0.5B, 3B, 14B, 32B
📚 Supports long context up to 128K (for the 14B and 32B models)
✅ Drop-in replacement for GPT-4o as a coding assistant on Cursor or for Artifacts!
🤗 Models available right now on the Hub, under Apache 2.0 license!
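A minimal sketch of loading the 32B instruct variant from the Hub with transformers (the model ID, prompt, and generation settings are assumptions, not from the post):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Qwen/Qwen2.5-Coder-32B-Instruct"  # assumed Hub ID for the instruct variant
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto", device_map="auto")

# Simple chat-style coding request
messages = [{"role": "user", "content": "Write a Python function that checks whether a number is prime."}]
inputs = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt").to(model.device)
outputs = model.generate(inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```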

They have set up a crazy Artifacts demo, you should go have a look!
👉 Qwen/Qwen2.5-Coder-Artifacts
upvoted an article 4 months ago

🪆 Introduction to Matryoshka Embedding Models
