\section{Related Works}
\paragraph{Adversarial Training and Generalization}
Adversarial training has been widely studied as a way to improve the robustness and generalization of neural networks. For time series analysis, adaptively scaled adversarial training (ASAT) improves both generalization and adversarial robustness by rescaling data at different time slots with adaptive scales \cite{2108.08976}. ASAT has been shown to generalize better than traditional adversarial training algorithms while achieving similar adversarial robustness.
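For reference, standard adversarial training is usually written as a min-max problem. The notation below ($f_\theta$ for the network, $\mathcal{L}$ for the loss, $\delta$ for the perturbation, and $\epsilon$ for its budget) is assumed for illustration and is not taken from \cite{2108.08976}:
\[
\min_{\theta} \; \mathrm{E}_{(x,y)} \Big[ \max_{\|\delta\| \le \epsilon} \mathcal{L}\big(f_\theta(x+\delta),\, y\big) \Big].
\]
ASAT departs from this template by rescaling the data at different time slots with adaptive scales; the precise rule is given in the cited work.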
\paragraph{Dropout Techniques}
Dropout has been a popular technique for mitigating overfitting and improving the performance of deep neural networks (DNNs). Advanced dropout is a model-free methodology that applies a parametric prior distribution and adaptively adjusts the dropout rate \cite{2010.05244}. This technique has been shown to outperform other dropout methods on various computer vision datasets. Moreover, continuous dropout has been proposed as an extension to traditional binary dropout, inspired by the random and continuous firing rates of neurons in the human brain \cite{1911.12675}. Continuous dropout has demonstrated better performance in preventing the co-adaptation of feature detectors and improving test performance compared to binary dropout, adaptive dropout, and DropConnect.
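As a rough sketch of the distinction, with the symbols $h$, $m$, and $p$ assumed here for illustration: binary dropout multiplies a hidden activation vector $h$ elementwise by a mask of independent Bernoulli variables,
\[
\tilde{h} = m \odot h, \qquad m_i \sim \mathrm{Bernoulli}(1-p),
\]
where $p$ is the dropout rate. Continuous dropout keeps the same multiplicative form but draws each $m_i$ from a continuous distribution, for example a uniform or Gaussian one, while adaptive schemes treat the rate $p$ as a quantity to be adjusted during training rather than fixed in advance.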
\paragraph{Adaptive Variational Dropout}
Adaptive variational dropout addresses the limitations of input-independent dropout by letting each neuron evolve to be generic, to specialize for certain inputs, or to be dropped altogether \cite{1805.10896}. Because this input-adaptive, sparsity-inducing dropout removes redundancies among features, the resulting network tolerates a larger degree of sparsity without losing expressive power. The method has been validated on multiple public datasets, yielding significantly more compact networks than baseline methods with consistent accuracy improvements over the base networks.
\paragraph{DropHead for Multi-head Attention}
In natural language processing, DropHead is a structured dropout method designed specifically to regularize the multi-head attention mechanism in transformer models \cite{2004.13342}. DropHead prevents the model from being dominated by a small subset of attention heads and reduces the risk of overfitting the training data, so that the multi-head attention mechanism is used more efficiently. A dedicated dropout rate schedule has also been proposed to adjust the DropHead rate during training and achieve a better regularization effect.
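One way to write this down, using notation assumed here rather than taken from \cite{2004.13342}: with $H$ attention heads $\mathrm{head}_1,\dots,\mathrm{head}_H$, an output projection $W^{O}$, and a head-level mask $\xi_j \sim \mathrm{Bernoulli}(1-p)$,
\[
\mathrm{MultiHead}(Q,K,V) = \mathrm{Concat}\big(\xi_1\,\mathrm{head}_1,\dots,\xi_H\,\mathrm{head}_H\big)\,W^{O},
\]
so that an entire head is either kept or zeroed during training. Any rescaling of the surviving heads is omitted from this sketch.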
\paragraph{Generative Adversarial Networks (GANs)}
Generative Adversarial Networks (GANs) have been widely used to generate realistic images and other forms of data. Unbalanced GANs pre-train the generator with a variational autoencoder (VAE) to guarantee stable training and reduce mode collapse \cite{2002.02112}. They have been shown to outperform ordinary GANs in training stability, convergence speed, and image quality at early epochs. Wasserstein GAN, on the other hand, aims to improve GAN training by adopting a smooth metric for measuring the distance between two probability distributions \cite{1904.08994}.
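The smooth metric in question is the Wasserstein-1 (earth mover's) distance. In its Kantorovich-Rubinstein dual form, with $P_r$ denoting the data distribution and $P_g$ the generator distribution (symbols assumed here for illustration),
\[
W(P_r, P_g) = \sup_{\|f\|_{L} \le 1} \; \mathrm{E}_{x \sim P_r}[f(x)] - \mathrm{E}_{x \sim P_g}[f(x)],
\]
where the supremum ranges over 1-Lipschitz functions $f$, which are approximated in practice by a critic network.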

In summary, the techniques surveyed here fall into three groups: adversarial training, dropout variants, and GAN training strategies. Each aims to improve the performance or robustness of neural networks, and the reported gains depend on the specific application and dataset on which the method was evaluated.