Beyond Memorization: Exploring the Dynamics of Grokking in Sparse Neural Networks

In machine learning, "grokking" is a phenomenon in which neural network models show a sudden improvement in generalization long after initial training appears complete, distinct from the traditional learning phases. This behavior was first identified by Power et al. (202...

Bibliographic Details
Main Author: Fuangkawinsombut, Siwakorn
Other Authors: Raghuraman, Srinivasan
Format: Thesis
Published: Massachusetts Institute of Technology 2024
Online Access: https://hdl.handle.net/1721.1/156751