Deep neural networks have become the model of choice for many prediction tasks. However, they require careful regularization to avoid overfitting and optimization pathologies. One popular regularization strategy is ‘dropout’: the hidden units of the network are randomly set to zero during training. In this talk, I will discuss equivalences between dropout (and some of its extensions) and Bayesian regularization via shrinkage priors. This perspective then allows us to derive structured shrinkage priors for other neural network architectures.
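To make the mechanism referenced above concrete, here is a minimal sketch of standard (inverted) dropout, where hidden units are randomly zeroed during training; this is a generic illustration, not the speaker's specific formulation, and the array shapes, `p=0.5`, and the `dropout` helper are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def dropout(h, p=0.5, training=True):
    """Randomly set hidden units to zero with probability p during training.

    Kept units are rescaled by 1/(1-p) ("inverted dropout") so that no
    rescaling is needed at test time.
    """
    if not training:
        return h
    mask = rng.binomial(1, 1.0 - p, size=h.shape)  # Bernoulli keep mask
    return h * mask / (1.0 - p)                    # rescale surviving units

# Toy forward pass: dropout applied to the activations of one hidden layer.
x = rng.normal(size=(4, 10))           # batch of 4 inputs, 10 features
W1 = rng.normal(size=(10, 32)) * 0.1   # hidden-layer weights
h = np.maximum(x @ W1, 0.0)            # ReLU hidden units
h_dropped = dropout(h, p=0.5, training=True)
```

Viewing the Bernoulli mask as multiplicative noise on the hidden units is one way to see the link the abstract draws between dropout and Bayesian shrinkage priors.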