Summary: | How symmetries in the data constrain the learned weights of modern deep networks is still an open problem. In this work we study the simple case of fully connected, shallow, non-linear neural networks and consider two types of symmetries: full dataset symmetries, where the dataset <inline-formula> <tex-math notation="LaTeX">$X$ </tex-math></inline-formula> is mapped into itself by a transformation <inline-formula> <tex-math notation="LaTeX">$g$ </tex-math></inline-formula>, i.e. <inline-formula> <tex-math notation="LaTeX">$gX=X$ </tex-math></inline-formula>, and single data point symmetries, where <inline-formula> <tex-math notation="LaTeX">$gx=x$ </tex-math></inline-formula> for a data point <inline-formula> <tex-math notation="LaTeX">$x\in X$ </tex-math></inline-formula>. We prove and experimentally confirm that symmetries in the data are directly inherited at the level of the network’s learned weights, and we relate these findings to the common practice of data augmentation in modern machine learning. Finally, we show that symmetry constraints have a profound impact on the spectrum of the learned weights, an aspect of the network’s so-called implicit bias.
|