Implementing Robust and Efficient Pseudo-transient Methods for Solving Neural Complementarity Problems in Julia
Traditional deep learning models typically consist of explicitly defined layers, such as fully connected and self-attention layers found in Transformers, which have been pivotal in recent advancements in computer vision and large language models. Selecting an appropriate architecture is critical for...
| Main Author: | |
|---|---|
| Other Authors: | |
| Format: | Thesis |
| Published: | Massachusetts Institute of Technology, 2024 |
| Online Access: | https://hdl.handle.net/1721.1/153881 |