Xa9aX

https://digantamisra98.github.io/

AI & ML interests

Sparsity, Deep learning theory, MoE, Continual learning, NSL

Xa9aX's activity

upvoted a paper 12 days ago

GitChameleon: Unmasking the Version-Switching Capabilities of Code Generation Models

Paper • 2411.05830 • Published 18 days ago • 20

commented a paper 3 months ago

Spectrum: Targeted Training on Signal to Noise Ratio

Paper • 2406.06623 • Published Jun 7 • 7

authored a paper 7 months ago

Just Say the Name: Online Continual Learning with Category Names Only via Data Generation

Paper • 2403.10853 • Published Mar 16

Reacted to akhaliq's post with ❤️ 8 months ago

Post

2194

Aurora-M

The First Open Source Multilingual Language Model Red-teamed according to the U.S. Executive Order

Aurora-M: The First Open Source Multilingual Language Model Red-teamed according to the U.S. Executive Order (2404.00399)

Pretrained language models underpin several AI applications, but their high computational cost for training limits accessibility. Initiatives such as BLOOM and StarCoder aim to democratize access to pretrained models for collaborative community development. However, such existing models face challenges: limited multilingual capabilities, continual pretraining causing catastrophic forgetting, whereas pretraining from scratch is computationally expensive, and compliance with AI safety and development laws. This paper presents Aurora-M, a 15B parameter multilingual open-source model trained on English, Finnish, Hindi, Japanese, Vietnamese, and code. Continually pretrained from StarCoderPlus on 435 billion additional tokens, Aurora-M surpasses 2 trillion tokens in total training token count. It is the first open-source multilingual model fine-tuned on human-reviewed safety instructions, thus aligning its development not only with conventional red-teaming considerations, but also with the specific concerns articulated in the Biden-Harris Executive Order on the Safe, Secure, and Trustworthy Development and Use of Artificial Intelligence. Aurora-M is rigorously evaluated across various tasks and languages, demonstrating robustness against catastrophic forgetting and outperforming alternatives in multilingual settings, particularly in safety evaluations.

Reacted to huu-ontocord's post with 🔥❤️ 8 months ago

Post

1615

We would like to announce our Aurora-M multilingual models, which are based on StarCoderPlus.
Twitter: https://twitter.com/ontocord/status/1772778544051155029
LinkedIn: https://www.linkedin.com/feed/update/urn:li:activity:7178521998845759488/
Blog post: https://huggingface.co/blog/mayank-mishra/aurora
Arxiv: Aurora-M: The First Open Source Multilingual Language Model Red-teamed according to the U.S. Executive Order (2404.00399)

Current LLMs are very susceptible to generating toxic, harmful and even dangerous content. They can also generate outputs with gender or racial biases. The Biden-Harris Executive Order (https://www.federalregister.gov/documents/2023/11/01/2023-24283/safe-secure-and-trustworthy-development-and-use-of-artificial-intelligence) sets forth guidelines on what is considered a safe AI system.
Following up on these guidelines, we present Aurora-M, the world's first open-source multilingual language model red-teamed according to the Biden-Harris Executive Order. Inspired by BigScience, the model is trained on 5 languages: English, Hindi, Japanese, Vietnamese and Finnish.

* Red-teamed model: aurora-m/aurora-m-biden-harris-redteamed (tuned according to the order mentioned above)
* Base model: aurora-m/aurora-m-base (not safety tuned)
* Instruct model: aurora-m/aurora-m-instruct (not safety tuned)

@mayank-mishra @cabbage972 @sted97 @Xa9aX @Taishi-N324 @Muennighoff @vumichien @prateeky2806 @felfri @spyysalo and many many others!

authored a paper 8 months ago

Aurora-M: The First Open Source Multilingual Language Model Red-teamed according to the U.S. Executive Order

Paper • 2404.00399 • Published Mar 30 • 41

upvoted a paper 8 months ago

Aurora-M: The First Open Source Multilingual Language Model Red-teamed according to the U.S. Executive Order

Paper • 2404.00399 • Published Mar 30 • 41

liked a model 8 months ago

aurora-m/aurora-m-biden-harris-redteamed

Text Generation • Updated 13 days ago • 581 • 19

authored 6 papers about 1 year ago

Beyond the Imitation Game: Quantifying and extrapolating the capabilities of language models

Paper • 2206.04615 • Published Jun 9, 2022 • 5

Reprogramming under constraints: Revisiting efficient and reliable transferability of lottery tickets

Paper • 2308.14969 • Published Aug 29, 2023

Mish: A Self Regularized Non-Monotonic Activation Function

Paper • 1908.08681 • Published Aug 23, 2019 • 1