Implementing Robust and Efficient Pseudo-transient Methods for Solving Neural Complementarity Problems in Julia
Traditional deep learning models typically consist of explicitly defined layers, such as the fully connected and self-attention layers found in Transformers, which have been pivotal in recent advancements in computer vision and large language models. Selecting an appropriate architecture is critical for these models.
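As a concrete illustration of the pseudo-transient approach named in the title, the sketch below solves a small nonlinear complementarity problem (NCP) via the Fischer-Burmeister reformulation. This is a minimal Python sketch under stated assumptions, not the thesis code (which targets Julia); the names `ptc_solve_ncp` and `fd_jacobian`, the step-control constants, and the finite-difference Jacobian are all illustrative choices.

```python
import numpy as np

def fischer_burmeister(a, b):
    # phi(a, b) = a + b - sqrt(a^2 + b^2); phi = 0  <=>  a >= 0, b >= 0, a*b = 0,
    # so the NCP  0 <= x  perp  F(x) >= 0  becomes the root problem Phi(x) = 0.
    return a + b - np.sqrt(a * a + b * b)

def fd_jacobian(g, x, eps=1e-7):
    # Forward-difference Jacobian (illustrative; a production solver would
    # use an analytic or automatically differentiated Jacobian).
    g0 = g(x)
    J = np.zeros((len(g0), len(x)))
    for j in range(len(x)):
        xp = x.copy()
        xp[j] += eps
        J[:, j] = (g(xp) - g0) / eps
    return J

def ptc_solve_ncp(F, x0, dt0=0.1, tol=1e-10, max_iter=1000):
    """Pseudo-transient continuation (PTC) for the NCP
        x >= 0,  F(x) >= 0,  x . F(x) = 0.

    Integrates dx/dtau = -Phi(x), Phi_i(x) = phi(x_i, F_i(x)), with
    implicit Euler steps  (I/dt + J) delta = -Phi  and switched evolution
    relaxation (SER) step control: small steps early give damped,
    globalized iterations; large steps late recover Newton's method.
    """
    x = np.asarray(x0, dtype=float)
    Phi = lambda z: fischer_burmeister(z, F(z))
    dt, res_prev = dt0, None
    for _ in range(max_iter):
        r = Phi(x)
        res = np.linalg.norm(r)
        if res < tol:
            return x
        if res_prev is not None:
            dt *= res_prev / res   # SER: dt grows as the residual falls
        res_prev = res
        J = fd_jacobian(Phi, x)
        delta = np.linalg.solve(np.eye(len(x)) / dt + J, -r)
        x = x + delta
    raise RuntimeError("PTC did not converge")
```

The SER rule is what gives pseudo-transient methods their robustness relative to plain Newton: far from a solution the small pseudo-time step keeps iterates on the stable transient, and near a solution the step blows up and the iteration becomes Newton's method with its fast local convergence.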
Main author: | Delelegn, Yonatan |
---|---|
Other authors: | Edelman, Alan |
Format: | Thesis |
Published: | Massachusetts Institute of Technology, 2024 |
Online access: | https://hdl.handle.net/1721.1/153881 |
---|---|
author | Delelegn, Yonatan |
author2 | Edelman, Alan |
collection | MIT |
description | Traditional deep learning models typically consist of explicitly defined layers, such as fully connected and self-attention layers found in Transformers, which have been pivotal in recent advancements in computer vision and large language models. Selecting an appropriate architecture is critical for these models. However, even with optimal architecture, these models may fail to capture intricate relationships and dependencies within hidden states due to the inherent limitations of the chosen layers. Furthermore, in several scientific applications, particularly those simulating physical systems, there is a pressing need to integrate domain-specific knowledge into the modeling process, a task for which explicit neural networks may not be ideally suited.
Recent studies, such as [2] and [4], have highlighted the potential of implicit layers to capture more complex relationships and learn more stringent constraints than traditional neural networks. Beyond capturing intricate relationships, implicit layers decouple the solution process from the layer definition, facilitating faster training and the seamless integration of domain-specific knowledge. For implicit models to rival state-of-the-art performance, robust and efficient solvers are required for the forward pass. In this project, we focus on stable and efficient solvers, specifically pseudo-transient methods, for solving neural complementarity problems. We aim to derive the sensitivity analysis of these problems, implement it in Julia, and explore applications of differentiable complementarity problems in fields such as economics, game theory, and optimization. |
format | Thesis |
id | mit-1721.1/153881 |
institution | Massachusetts Institute of Technology |
publishDate | 2024 |
publisher | Massachusetts Institute of Technology |
record_format | dspace |
spelling | mit-1721.1/153881 (record updated 2024-03-22T04:00:58Z). Implementing Robust and Efficient Pseudo-transient Methods for Solving Neural Complementarity Problems in Julia. Delelegn, Yonatan; Edelman, Alan; Rackauckas, Christopher. Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science. M.Eng. thesis, issued 2024-02; accessioned 2024-03-21T19:13:17Z. https://hdl.handle.net/1721.1/153881. In Copyright - Educational Use Permitted; copyright retained by author(s); https://rightsstatements.org/page/InC-EDU/1.0/. application/pdf. Massachusetts Institute of Technology. |
title | Implementing Robust and Efficient Pseudo-transient Methods for Solving Neural Complementarity Problems in Julia |
url | https://hdl.handle.net/1721.1/153881 |
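The sensitivity analysis mentioned in the abstract can be sketched with the implicit function theorem: if x*(θ) solves Φ(x, θ) = 0 and ∂Φ/∂x is nonsingular at the solution, then dx*/dθ = −(∂Φ/∂x)⁻¹ ∂Φ/∂θ. Below is a minimal Python sketch under those smoothness and nonsingularity assumptions (the thesis implements this machinery in Julia); `ncp_sensitivity` and its finite-difference Jacobians are illustrative stand-ins, not the thesis code.

```python
import numpy as np

def fischer_burmeister(a, b):
    # phi(a, b) = a + b - sqrt(a^2 + b^2); zero iff a >= 0, b >= 0, a*b = 0.
    return a + b - np.sqrt(a * a + b * b)

def ncp_sensitivity(Phi, x_star, theta, d=1e-6):
    """dx*/dtheta at a solution Phi(x*, theta) = 0, via the implicit
    function theorem:  dx*/dtheta = -(dPhi/dx)^{-1} (dPhi/dtheta).
    Both Jacobians are forward-difference approximations here.
    """
    x_star = np.asarray(x_star, dtype=float)
    theta = np.asarray(theta, dtype=float)
    Phi0 = Phi(x_star, theta)
    n, m = len(x_star), len(theta)
    Jx = np.zeros((n, n))          # dPhi/dx, column by column
    for j in range(n):
        xp = x_star.copy()
        xp[j] += d
        Jx[:, j] = (Phi(xp, theta) - Phi0) / d
    Jt = np.zeros((n, m))          # dPhi/dtheta, column by column
    for j in range(m):
        tp = theta.copy()
        tp[j] += d
        Jt[:, j] = (Phi(x_star, tp) - Phi0) / d
    return -np.linalg.solve(Jx, Jt)
```

For the LCP F(x) = Mx + q with interior solution x* = −M⁻¹q, this recovers dx*/dq = −M⁻¹. In a learning setting, the same linear solve supplies the vector-Jacobian products that backpropagation through the complementarity layer needs, without differentiating through the solver's iterations.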