Data parallelism in training sparse neural networks