
Guille Pérez-Torró

guishe

https://www.linkedin.com/in/guipetor/

GuishePerez

AI & ML interests

Information Retrieval, Few-Shot Learning, Named Entity Recognition, Named Entity Disambiguation, Semantic Search, Aspect-based Sentiment Analysis

Recent Activity

liked a model 2 days ago

NousResearch/Genstruct-7B

liked a Space 7 days ago

merve/uncertainty-calibration

updated a collection 10 days ago

Embedding Encoder-based Models

Organizations

None yet

guishe's activity

upvoted a collection 17 days ago

SmolLM2

Collection

State-of-the-art compact LLMs for on-device applications: 1.7B, 360M, 135M • 10 items • Updated about 8 hours ago • 172

upvoted a paper 18 days ago

HelpSteer2-Preference: Complementing Ratings with Preferences

Paper • 2410.01257 • Published Oct 2 • 21

upvoted an article about 1 month ago

Article

How to build a custom text classifier without days of human labeling

Oct 17 • 55

upvoted an article about 2 months ago

Article

🪆 Introduction to Matryoshka Embedding Models

Feb 23 • 55

upvoted 2 articles 2 months ago

Article

Taxonomy Completion with Embedding Quantization and an LLM-based Pipeline: A Case Study in Computational Linguistics

Jul 22 • 4

Article

Binary and Scalar Embedding Quantization for Significantly Faster & Cheaper Retrieval

Mar 22 • 64

upvoted a collection 3 months ago

4bit Instruct Models

Collection

18 items • Updated about 17 hours ago • 25

upvoted 2 articles 4 months ago

Article

Llama 3.1 - 405B, 70B & 8B with multilinguality and long context

Jul 23 • 215

Article

Fine-tune Llama 3.1 Ultra-Efficiently with Unsloth

Jul 29 • 244

upvoted 2 collections 4 months ago

Zeroshot Classifiers

Collection

These are my current best zeroshot classifiers. Some of my older models are downloaded more often, but the models in this collection are newer/better. • 11 items • Updated Apr 3 • 112

ReLiK: Retrieve, Read and LinK

Collection

A blazing fast and lightweight Information Extraction model for Entity Linking and Relation Extraction. • 20 items • Updated Aug 8 • 22

upvoted a paper 5 months ago

Nomic Embed: Training a Reproducible Long Context Text Embedder

Paper • 2402.01613 • Published Feb 2 • 14

upvoted a collection 5 months ago

Instruction Pre-Training

Collection

8 items • Updated Jun 21 • 26

upvoted a paper 5 months ago

Mixture-of-Agents Enhances Large Language Model Capabilities

Paper • 2406.04692 • Published Jun 7 • 55

upvoted 6 papers 6 months ago

Demonstrate-Search-Predict: Composing retrieval and language models for knowledge-intensive NLP

Paper • 2212.14024 • Published Dec 28, 2022 • 3

DSPy: Compiling Declarative Language Model Calls into Self-Improving Pipelines

Paper • 2310.03714 • Published Oct 5, 2023 • 30

DSPy Assertions: Computational Constraints for Self-Refining Language Model Pipelines

Paper • 2312.13382 • Published Dec 20, 2023 • 3

Judging LLM-as-a-judge with MT-Bench and Chatbot Arena

Paper • 2306.05685 • Published Jun 9, 2023 • 29

MMLU-Pro: A More Robust and Challenging Multi-Task Language Understanding Benchmark

Paper • 2406.01574 • Published Jun 3 • 43

Improving Text Embeddings with Large Language Models

Paper • 2401.00368 • Published Dec 31, 2023 • 79