Mert's picture

7 16

Mert

Sengil

·

https://www.linkedin.com/in/mertsengil/

AI & ML interests

LLM's

Recent Activity

Reacted to csabakecskemeti's post with 👍 5 days ago

Some time ago, I built a predictive LLM router that routes chat requests between small and large LLM models based on prompt classification. It dynamically selects the most suitable model depending on the complexity of the user input, ensuring optimal performance while maintaining conversation context. I also fine-tuned a RoBERTa model to use with the package, but you can plug and play any classifier of your choice. Project's homepage: https://devquasar.com/llm-predictive-router/ Pypi: https://pypi.org/project/llm-predictive-router/ Model: https://huggingface.co/DevQuasar/roberta-prompt_classifier-v0.1 Training data: https://huggingface.co/datasets/DevQuasar/llm_router_dataset-synth Git: https://github.com/csabakecskemeti/llm_predictive_router_package Feel free to check it out, and/or contribute.

liked a model 8 days ago

yeniguno/absa-turkish-bert-dbmdz

Reacted to ImranzamanML's post with 🔥 about 1 month ago

LoRA with code 🚀 using PEFT (parameter efficient fine-tuning) LoRA (Low-Rank Adaptation) LoRA adds low-rank matrices to specific layers and reduce the number of trainable parameters for efficient fine-tuning. Code: Please install these libraries first: pip install peft pip install datasets pip install transformers ``` from transformers import AutoModelForSequenceClassification, Trainer, TrainingArguments from peft import LoraConfig, get_peft_model from datasets import load_dataset # Loading the pre-trained BERT model model = AutoModelForSequenceClassification.from_pretrained('bert-base-uncased', num_labels=2) # Configuring the LoRA parameters lora_config = LoraConfig( r=8, lora_alpha=16, lora_dropout=0.1, bias="none" ) # Applying LoRA to the model model = get_peft_model(model, lora_config) # Loading dataset for classification dataset = load_dataset("glue", "sst2") train_dataset = dataset["train"] # Setting the training arguments training_args = TrainingArguments( output_dir="./results", per_device_train_batch_size=16, num_train_epochs=3, logging_dir="./logs", ) # Creating a Trainer instance for fine-tuning trainer = Trainer( model=model, args=training_args, train_dataset=train_dataset, ) # Finally we can fine-tune the model trainer.train() ``` LoRA adds low-rank matrices to fine-tune only a small portion of the model and reduces training overhead by training fewer parameters. We can perform efficient fine-tuning with minimal impact on accuracy and its suitable for large models where full-precision training is still feasible.

View all activity

Organizations

None yet

Sengil's activity

Reacted to csabakecskemeti's post with 👍 5 days ago

Post

1198

Some time ago, I built a predictive LLM router that routes chat requests between small and large LLM models based on prompt classification. It dynamically selects the most suitable model depending on the complexity of the user input, ensuring optimal performance while maintaining conversation context. I also fine-tuned a RoBERTa model to use with the package, but you can plug and play any classifier of your choice.

Project's homepage:
https://devquasar.com/llm-predictive-router/
Pypi:
https://pypi.org/project/llm-predictive-router/
Model:
DevQuasar/roberta-prompt_classifier-v0.1
Training data:
DevQuasar/llm_router_dataset-synth
Git:
https://github.com/csabakecskemeti/llm_predictive_router_package

Feel free to check it out, and/or contribute.

liked a model 8 days ago

yeniguno/absa-turkish-bert-dbmdz

Text Classification • Updated Sep 22 • 45 • 4

Reacted to ImranzamanML's post with 🔥 about 1 month ago

Post

1355

LoRA with code 🚀 using PEFT (parameter efficient fine-tuning)

LoRA (Low-Rank Adaptation)
LoRA adds low-rank matrices to specific layers and reduce the number of trainable parameters for efficient fine-tuning.

Code:
Please install these libraries first:
pip install peft
pip install datasets
pip install transformers

from transformers import AutoModelForSequenceClassification, Trainer, TrainingArguments
from peft import LoraConfig, get_peft_model
from datasets import load_dataset

# Loading the pre-trained BERT model
model = AutoModelForSequenceClassification.from_pretrained('bert-base-uncased', num_labels=2)

# Configuring the LoRA parameters
lora_config = LoraConfig(
    r=8,
    lora_alpha=16, 
    lora_dropout=0.1, 
    bias="none" 
)

# Applying LoRA to the model
model = get_peft_model(model, lora_config)

# Loading dataset for classification
dataset = load_dataset("glue", "sst2")
train_dataset = dataset["train"]

# Setting the training arguments
training_args = TrainingArguments(
    output_dir="./results",
    per_device_train_batch_size=16,
    num_train_epochs=3,
    logging_dir="./logs",
)

# Creating a Trainer instance for fine-tuning
trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=train_dataset,
)

# Finally we can fine-tune the model
trainer.train()

LoRA adds low-rank matrices to fine-tune only a small portion of the model and reduces training overhead by training fewer parameters.
We can perform efficient fine-tuning with minimal impact on accuracy and its suitable for large models where full-precision training is still feasible.

liked a model about 1 month ago

BAAI/bge-base-en-v1.5

Feature Extraction • Updated Feb 21 • 3.05M • 251

New activity in facebook/musicgen-small about 1 month ago

How to get best result?

#31 opened about 1 month ago by

Faster MusicGen Generation with Streaming

#23 opened about 1 year ago by

Question

#28 opened 8 months ago by

New activity in facebook/musicgen-large about 1 month ago

how to get best result

#22 opened about 1 month ago by

New activity in black-forest-labs/FLUX.1-schnell 2 months ago

GPU and memory requirements

#89 opened 2 months ago by

liked a dataset 2 months ago

jondurbin/gutenberg-dpo-v0.1

Viewer • Updated Jan 12 • 918 • 1.65k • 125

New activity in black-forest-labs/FLUX.1-schnell 2 months ago

How can I make this model run faster?

#78 opened 3 months ago by

liked a model 2 months ago

aleksa-codes/flux-ghibsky-illustration

Text-to-Image • Updated 9 days ago • 48.1k • • 177

liked a model 3 months ago

black-forest-labs/FLUX.1-schnell

Text-to-Image • Updated Aug 16 • 2.03M • • 2.9k

liked a model 4 months ago

sentence-transformers/all-MiniLM-L6-v2

Sentence Similarity • Updated 23 days ago • 96.6M • • 2.57k

liked 3 datasets 5 months ago

fka/awesome-chatgpt-prompts

Viewer • Updated Sep 3 • 170 • 10.2k • 6.34k

Salesforce/xlam-function-calling-60k

Viewer • Updated Jul 19 • 60k • 2.78k • 387

MrOvkill/svg-stack-tmp-alpha-chunk

Viewer • Updated Jul 15 • 93.8k • 48 • 3

liked 2 models 5 months ago

NeuML/pubmedbert-base-embeddings

Sentence Similarity • Updated Oct 18, 2023 • 129k • 103

google-bert/bert-base-uncased

Fill-Mask • Updated Feb 19 • 70.5M • 1.92k

updated a model 5 months ago

Sengil/gemma-StableDiffusion-prompt-generator-v1

Text Generation • Updated Jul 11 • 1