stereoplegic's Collections
Winner-Take-All Column Row Sampling for Memory Efficient Adaptation of Language Model
Paper • 2305.15265
Mesa: A Memory-saving Training Framework for Transformers
Paper • 2111.11124
Full Parameter Fine-tuning for Large Language Models with Limited Resources
Paper • 2306.09782
Layered gradient accumulation and modular pipeline parallelism: fast and efficient training of large language models
Paper • 2106.02679
Outliers with Opposing Signals Have an Outsized Effect on Neural Network Optimization
Paper • 2311.04163
LoopTune: Optimizing Tensor Computations with Reinforcement Learning
Paper • 2309.01825
Utility-based Perturbed Gradient Descent: An Optimizer for Continual Learning
Paper • 2302.03281
Fine-Tuning Language Models with Just Forward Passes
Paper • 2305.17333
Accurate Neural Network Pruning Requires Rethinking Sparse Optimization
Paper • 2308.02060
Global Sparse Momentum SGD for Pruning Very Deep Neural Networks
Paper • 1909.12778
Lottery Tickets in Evolutionary Optimization: On Sparse Backpropagation-Free Trainability
Paper • 2306.00045
Multiplication-Free Transformer Training via Piecewise Affine Operations
Paper • 2305.17190
XGrad: Boosting Gradient-Based Optimizers With Weight Prediction
Paper • 2305.18240
Gradients without Backpropagation
Paper • 2202.08587
Learning with Local Gradients at the Edge
Paper • 2208.08503
HyperTuning: Toward Adapting Large Language Models without Back-propagation
Paper • 2211.12485
DoG is SGD's Best Friend: A Parameter-Free Dynamic Step Size Schedule
Paper • 2302.12022
ZO-AdaMU Optimizer: Adapting Perturbation by the Momentum and Uncertainty in Zeroth-order Optimization
Paper • 2312.15184
Versatile Black-Box Optimization
Paper • 2004.14014
PyPop7: A Pure-Python Library for Population-Based Black-Box Optimization
Paper • 2212.05652
B2Opt: Learning to Optimize Black-box Optimization with Little Budget
Paper • 2304.11787
MKOR: Momentum-Enabled Kronecker-Factor-Based Optimizer Using Rank-1 Updates
Paper • 2306.01685
CoRe Optimizer: An All-in-One Solution for Machine Learning
Paper • 2307.15663