The paper shows an adversarial attack strategy in which a user sends malicious queries that can affect the responses to other users' queries in the same batch.
So if the same batch contains:
- User A: benign query
- User B: malicious query

the response for A might be altered! 😱
How is this possible? One approach is to fill the per-expert token buffers with adversarial data, thereby forcing the gating to route benign tokens to non-ideal experts or to drop them entirely (when expert capacity is finite) - see the toy sketch below.
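To make the mechanism concrete, here is a minimal toy sketch (my own illustration, not the paper's code) of top-1 routing with a hard per-expert capacity. `W_gate`, `route`, and all the sizes are made up for the example:

```python
import numpy as np

rng = np.random.default_rng(0)

num_experts, capacity, d = 4, 2, 8
W_gate = rng.normal(size=(d, num_experts))        # toy gating weights

def route(batch):
    """Greedy top-1 routing with a finite per-expert buffer (first come, first served)."""
    logits = batch @ W_gate
    preferred = logits.argmax(axis=-1)
    load = np.zeros(num_experts, dtype=int)
    assignment = []
    for tok, e in enumerate(preferred):
        if load[e] < capacity:
            load[e] += 1
            assignment.append((tok, int(e)))
        else:
            assignment.append((tok, None))        # buffer full: token dropped
    return assignment

benign = rng.normal(size=(2, d))                  # user A's tokens
# The adversary crafts tokens aimed at the same expert the benign tokens prefer
# (in the black-box setting this target would be inferred by probing).
target_expert = int((benign @ W_gate).argmax(axis=-1)[0])
adversarial = rng.normal(size=(6, d))
adversarial += 5 * W_gate[:, target_expert]       # push gate logits toward that expert

# Adversarial tokens come first in the batch and exhaust the expert's buffer,
# so benign tokens that prefer the same expert are dropped.
batch = np.concatenate([adversarial, benign])
print(route(batch))
```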
This assumes the adversary has only black-box access to the model, but can observe the output logits and ensure their data is always grouped in the same batch as the victim's.
How to mitigate this?
- Randomize the batch order (and even run very sensitive queries twice)
- Use a large capacity slack
- Sample from the gate weights instead of taking the top-k (not great IMO, as that requires more memory for inference)

A rough sketch of the first and last mitigations follows.
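This is again my own toy code (the `mitigated_route` helper and its parameters are made up), showing how batch-order randomization and sampling from the gate distribution could look on top of the same kind of capacity-limited gating:

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def mitigated_route(batch, W_gate, capacity):
    """Capacity-limited routing with two mitigations applied."""
    num_experts = W_gate.shape[1]
    order = rng.permutation(len(batch))            # (1) randomize batch order
    probs = softmax(batch @ W_gate)
    load = np.zeros(num_experts, dtype=int)
    assignment = [None] * len(batch)               # None = dropped token
    for tok in order:
        e = rng.choice(num_experts, p=probs[tok])  # (2) sample expert from gate probs
        if load[e] < capacity:
            load[e] += 1
            assignment[tok] = int(e)
    return assignment
```

With the shuffle, the adversary can no longer guarantee their tokens reach the buffers before the victim's, and the sampling makes the routing of any individual token non-deterministic, so the same probing attack gives noisier feedback.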