Data parallelism in training sparse neural networks