Implementing Robust and Efficient Pseudo-transient Methods for Solving Neural Complementarity Problems in Julia

Traditional deep learning models typically consist of explicitly defined layers, such as the fully connected and self-attention layers found in Transformers, which have been pivotal in recent advances in computer vision and large language models. Selecting an appropriate architecture is critical for these models. However, even with an optimal architecture, these models may fail to capture intricate relationships and dependencies within hidden states due to the inherent limitations of the chosen layers. Furthermore, in several scientific applications, particularly those simulating physical systems, there is a pressing need to integrate domain-specific knowledge into the modeling process, a task for which explicit neural networks may not be ideally suited. Recent studies, such as [2] and [4], have highlighted the potential of implicit layers to capture more complex relationships and learn more stringent constraints than traditional neural networks. Beyond capturing intricate relationships, implicit layers offer the advantage of decoupling the solution process from the layer definition, thus facilitating faster training and the seamless integration of domain-specific knowledge. To enable implicit models to rival state-of-the-art performance, robust and efficient solvers are required for the forward pass. In this project, we focus on exploring stable and efficient solvers, specifically pseudo-transient methods, for solving neural complementarity problems. We aim to derive the sensitivity analysis of these problems, implement it in Julia, and explore applications of differentiable complementarity problems in fields such as economics, game theory, and optimization.
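
The abstract names pseudo-transient methods but not their mechanics, so the following is a minimal Julia sketch, not the thesis implementation: a nonlinear complementarity problem (find u ≥ 0 with F(u) ≥ 0 and u ⋅ F(u) = 0) is recast as root-finding through the Fischer–Burmeister function, and pseudo-transient continuation takes damped Newton steps (I/Δτ + J)δ = −Φ(u), growing the pseudo-timestep Δτ as the residual shrinks. The names ptc_solve and fb, the ForwardDiff Jacobian, and the switched-evolution-relaxation timestep rule are all illustrative assumptions here.

```julia
using LinearAlgebra, ForwardDiff

# Fischer–Burmeister function: fb(a, b) == 0  ⟺  a ≥ 0, b ≥ 0, a*b == 0,
# so the NCP "u ≥ 0, F(u) ≥ 0, u ⋅ F(u) = 0" becomes the root problem Φ(u) = 0.
fb(a, b) = sqrt(a^2 + b^2) - a - b
Φ(F, u) = fb.(u, F(u))

function ptc_solve(F, u0; Δτ = 1e-2, tol = 1e-10, maxiters = 10_000)
    u, r = copy(u0), Φ(F, u0)
    for _ in 1:maxiters
        J = ForwardDiff.jacobian(v -> Φ(F, v), u)
        u -= (I / Δτ + J) \ r                    # damped (pseudo-transient) Newton step
        rnew = Φ(F, u)
        Δτ *= norm(r) / max(norm(rnew), eps())   # switched evolution relaxation
        r = rnew
        norm(r) < tol && return u
    end
    error("pseudo-transient iteration did not converge")
end

# Example: a 2×2 linear complementarity problem with F(u) = M*u + q;
# the solution here is interior, u = [1/3, 1/3], where M*u + q = 0.
M, q = [2.0 1.0; 1.0 2.0], [-1.0, -1.0]
u = ptc_solve(v -> M * v + q, ones(2))
```

The damping term I/Δτ makes early iterations behave like implicit Euler steps on du/dτ = −Φ(u), which is what gives pseudo-transient continuation its robustness far from the solution; as Δτ grows, the step approaches a plain Newton step and convergence accelerates.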

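For the sensitivity analysis mentioned in the abstract, one standard route (assumed here; the thesis may derive it differently) is the implicit function theorem: if the converged solution usol(θ) satisfies Φ(usol(θ), θ) = 0, then dusol/dθ = −(∂Φ/∂u)⁻¹(∂Φ/∂θ), so gradients pass through the solver with one linear solve instead of unrolling the iteration. The helper solution_sensitivity below is hypothetical.

```julia
using LinearAlgebra, ForwardDiff

# Implicit-function-theorem sensitivity at a converged solution usol of
# Φθ(usol, θ) = 0:  dusol/dθ = -(∂Φ/∂u)⁻¹ (∂Φ/∂θ), evaluated at (usol, θ).
function solution_sensitivity(Φθ, usol, θ)
    Ju = ForwardDiff.jacobian(u -> Φθ(u, θ), usol)  # ∂Φ/∂u at the solution
    Jθ = ForwardDiff.jacobian(t -> Φθ(usol, t), θ)  # ∂Φ/∂θ at the solution
    return -(Ju \ Jθ)                               # one linear solve, all parameters
end
```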

Bibliographic Details
Main Author: Delelegn, Yonatan
Other Authors: Edelman, Alan; Rackauckas, Christopher
Department: Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
Format: Thesis (M.Eng.)
Published: Massachusetts Institute of Technology, 2024
Online Access: https://hdl.handle.net/1721.1/153881