# Autoencoders: Sparse Autoencoders

## Sparse Autoencoders

Autoencoders normally discover useful structure by having a small number of hidden units, but they can also be useful with a large number of hidden units, in which case the autoencoder enlarges the representation of the given input. This is made possible by introducing a sparsity constraint. The aim of the constraint is to give each of the many hidden neurons a low average output, so that each neuron is inactive most of the time. With a Sigmoid activation function, an inactive neuron has an output close to 0; with a Tanh activation function, it has an output close to -1.

If $a_j(x)$ denotes the activation of hidden neuron $j$ for an input $x$, we can calculate the average activation of that neuron over the $m$ training examples, $p_j$, with the formula:

$p_j=\frac{1}{m}\sum_{i=1}^{m}\left[a_j\left(x^{(i)}\right)\right]$
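As a rough illustration of the averaging step, the sketch below computes $p_j$ for every hidden neuron of a single Sigmoid layer. The function names, the toy inputs `X`, and the weights `W` and biases `b` are all hypothetical values chosen for the example; they do not come from the source.

```python
import math


def sigmoid(z):
    """Logistic function, giving activations in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def average_activations(X, W, b):
    """Return [p_0, p_1, ...]: the mean activation of each hidden neuron.

    X is a list of m input vectors, W[j] is the (assumed) weight vector of
    hidden neuron j, and b[j] is its bias.
    """
    m = len(X)
    totals = [0.0] * len(W)
    for x in X:
        for j, (w_j, b_j) in enumerate(zip(W, b)):
            z = sum(w * x_i for w, x_i in zip(w_j, x)) + b_j
            totals[j] += sigmoid(z)  # a_j(x) for this example
    return [total / m for total in totals]


# Toy data: three 2-dimensional inputs and three hidden neurons (illustrative only).
X = [[0.0, 1.0], [1.0, 0.0], [1.0, 1.0]]
W = [[1.0, -1.0], [-2.0, -2.0], [0.5, 0.5]]
b = [0.0, -1.0, 0.0]

p = average_activations(X, W, b)
```

In this toy setup the second neuron has strongly negative pre-activations on every input, so its average activation is already small, which is the behaviour a sparsity constraint tries to encourage for all hidden neurons.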

The aim of the sparsity constraint is to keep $p_j$ close to a target value $p_c$, where $p_c$ is a small number close to 0 (for the Sigmoid activation function), such as 0.05. We can do so by adding a penalty term to the mean squared error cost function that we normally try to minimise for classical autoencoders. The penalty term, as with variational autoencoders, is a KL divergence, here between Bernoulli random variables with means $p_c$ and $p_j$, and can be calculated for each hidden neuron with the formula:

$KL(p_c||p_j)=p_c\log\frac{p_c}{p_j}+(1-p_c)\log\frac{1-p_c}{1-p_j}$

Figure 1: KL divergence penalty term for $p_c=0.2$ [ 1 ].
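The penalty can be sketched in a few lines of Python. This is a minimal illustration rather than a reference implementation: the weighting factor `beta`, the clamping constant `eps`, and the function names are assumptions introduced here, and the per-neuron penalties are summed over the hidden layer before being added to the reconstruction error.

```python
import math


def kl_bernoulli(p_c, p_j, eps=1e-12):
    """KL divergence between Bernoulli variables with means p_c and p_j."""
    # Clamp p_j away from 0 and 1 so the logarithms stay finite (assumed safeguard).
    p_j = min(max(p_j, eps), 1.0 - eps)
    return (p_c * math.log(p_c / p_j)
            + (1.0 - p_c) * math.log((1.0 - p_c) / (1.0 - p_j)))


def sparsity_penalty(p_c, avg_activations):
    """Sum the KL penalty over all hidden neurons."""
    return sum(kl_bernoulli(p_c, p_j) for p_j in avg_activations)


def sparse_loss(mse, p_c, avg_activations, beta=3.0):
    """Reconstruction error plus the weighted sparsity penalty.

    beta is a hypothetical hyperparameter controlling the penalty strength.
    """
    return mse + beta * sparsity_penalty(p_c, avg_activations)


# The penalty is zero when p_j equals the target and grows as p_j moves away.
at_target = kl_bernoulli(0.2, 0.2)
off_target = kl_bernoulli(0.2, 0.5)
far_off_target = kl_bernoulli(0.2, 0.99)
```

Evaluating `kl_bernoulli` over a range of `p_j` values with `p_c = 0.2` traces out a curve of the kind shown in Figure 1: a minimum of zero at `p_j = p_c` and a steep rise as `p_j` approaches 0 or 1.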

An advancement on sparse autoencoders is the k-sparse autoencoder. Here we keep the k hidden neurons with the highest activations and ignore the others, either by sorting the activations or by using ReLU activation functions and adaptively adjusting the thresholds until the k largest activations are identified. This lets us tune the value of k to obtain the sparsity level most suitable for our dataset. A high sparsity level (a small k) learns very local features, which may not be useful for identifying handwritten digits but are useful for pretraining neural nets [ 2 ].

Figure 2: Different sparsity levels k, learned from MNIST with 1000 hidden units [ 2 ].
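The sorting variant of the selection step can be sketched as follows. This is only an illustration of keeping the k largest activations of one hidden layer and zeroing the rest; the function name `k_sparse` and the example activations are hypothetical, and the adaptive-threshold ReLU variant is not shown.

```python
def k_sparse(activations, k):
    """Keep the k largest activations and set all others to zero."""
    if not 0 <= k <= len(activations):
        raise ValueError("k must be between 0 and the number of hidden units")
    # Indices of the hidden units, ordered from largest to smallest activation.
    ranked = sorted(range(len(activations)),
                    key=lambda j: activations[j],
                    reverse=True)
    keep = set(ranked[:k])
    return [a if j in keep else 0.0 for j, a in enumerate(activations)]


# Hypothetical hidden-layer activations for a single input.
hidden = [0.1, 0.9, 0.3, 0.7, 0.05]
sparse_hidden = k_sparse(hidden, 2)  # [0.0, 0.9, 0.0, 0.7, 0.0]
```

Lowering `k` in this sketch leaves fewer non-zero units per input, which corresponds to a higher sparsity level.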


### References

[1] Ng A. Sparse autoencoder. CS294A Lecture notes.

[2] Makhzani A, Frey B. k-Sparse Autoencoders. arXiv. 2013.


# Group sparse autoencoder

Authors: Anush Sankaran, Mayank Vatsa, Richa Singh, Angshul Majumdar (all at Indraprastha Institute of Information Technology (IIIT) Delhi, India)

Published in: Image and Vision Computing, Volume 60, Issue C, April 2017, Pages 64-74. Butterworth-Heinemann, Newton, MA, USA. doi: 10.1016/j.imavis.2017.01.005
