Implementing Robust and Efficient Pseudo-transient Methods for Solving Neural Complementarity Problems in Julia
Traditional deep learning models typically consist of explicitly defined layers, such as the fully connected and self-attention layers found in Transformers, which have been pivotal in recent advancements in computer vision and large language models. Selecting an appropriate architecture is critical for these models.
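As a concrete illustration of the pseudo-transient approach named in the title, the sketch below solves a small nonlinear complementarity problem (NCP) via the Fischer-Burmeister reformulation. This is a minimal Python sketch under stated assumptions, not the thesis code (which targets Julia); the names `ptc_solve_ncp` and `fd_jacobian`, the step-control constants, and the finite-difference Jacobian are all illustrative choices.

```python
import numpy as np

def fischer_burmeister(a, b):
    # phi(a, b) = a + b - sqrt(a^2 + b^2); phi = 0  <=>  a >= 0, b >= 0, a*b = 0,
    # so the NCP  0 <= x  perp  F(x) >= 0  becomes the root problem Phi(x) = 0.
    return a + b - np.sqrt(a * a + b * b)

def fd_jacobian(g, x, eps=1e-7):
    # Forward-difference Jacobian (illustrative; a production solver would
    # use an analytic or automatically differentiated Jacobian).
    g0 = g(x)
    J = np.zeros((len(g0), len(x)))
    for j in range(len(x)):
        xp = x.copy()
        xp[j] += eps
        J[:, j] = (g(xp) - g0) / eps
    return J

def ptc_solve_ncp(F, x0, dt0=0.1, tol=1e-10, max_iter=1000):
    """Pseudo-transient continuation (PTC) for the NCP
        x >= 0,  F(x) >= 0,  x . F(x) = 0.

    Integrates dx/dtau = -Phi(x), Phi_i(x) = phi(x_i, F_i(x)), with
    implicit Euler steps  (I/dt + J) delta = -Phi  and switched evolution
    relaxation (SER) step control: small steps early give damped,
    globalized iterations; large steps late recover Newton's method.
    """
    x = np.asarray(x0, dtype=float)
    Phi = lambda z: fischer_burmeister(z, F(z))
    dt, res_prev = dt0, None
    for _ in range(max_iter):
        r = Phi(x)
        res = np.linalg.norm(r)
        if res < tol:
            return x
        if res_prev is not None:
            dt *= res_prev / res   # SER: dt grows as the residual falls
        res_prev = res
        J = fd_jacobian(Phi, x)
        delta = np.linalg.solve(np.eye(len(x)) / dt + J, -r)
        x = x + delta
    raise RuntimeError("PTC did not converge")
```

The SER rule is what gives pseudo-transient methods their robustness relative to plain Newton: far from a solution the small pseudo-time step keeps iterates on the stable transient, and near a solution the step blows up and the iteration becomes Newton's method with its fast local convergence.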
Main author: | Delelegn, Yonatan |
---|---|
Other authors: | Edelman, Alan |
Format: | Thesis |
Published: | Massachusetts Institute of Technology, 2024 |
Online access: | https://hdl.handle.net/1721.1/153881 |
---|---|
author | Delelegn, Yonatan |
author2 | Edelman, Alan |
collection | MIT |
description | Traditional deep learning models typically consist of explicitly defined layers, such as fully connected and self-attention layers found in Transformers, which have been pivotal in recent advancements in computer vision and large language models. Selecting an appropriate architecture is critical for these models. However, even with optimal architecture, these models may fail to capture intricate relationships and dependencies within hidden states due to the inherent limitations of the chosen layers. Furthermore, in several scientific applications, particularly those simulating physical systems, there is a pressing need to integrate domain-specific knowledge into the modeling process, a task for which explicit neural networks may not be ideally suited.
Recent studies, such as [2] and [4], have highlighted the potential of implicit layers to capture more complex relationships and learn more stringent constraints than traditional neural networks. Beyond capturing intricate relationships, implicit layers decouple the solution process from the layer definition, facilitating faster training and the seamless integration of domain-specific knowledge. For implicit models to rival state-of-the-art performance, robust and efficient solvers are required for the forward pass. In this project, we focus on stable and efficient solvers, specifically pseudo-transient methods, for solving neural complementarity problems. We aim to derive the sensitivity analysis of these problems, implement it in Julia, and explore applications of differentiable complementarity problems in fields such as economics, game theory, and optimization. |
format | Thesis |
id | mit-1721.1/153881 |
institution | Massachusetts Institute of Technology |
publishDate | 2024 |
publisher | Massachusetts Institute of Technology |
record_format | dspace |
spelling | mit-1721.1/153881 (record updated 2024-03-22T04:00:58Z). Implementing Robust and Efficient Pseudo-transient Methods for Solving Neural Complementarity Problems in Julia. Delelegn, Yonatan; Edelman, Alan; Rackauckas, Christopher. Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science. M.Eng. thesis, issued 2024-02; accessioned 2024-03-21T19:13:17Z. https://hdl.handle.net/1721.1/153881. In Copyright - Educational Use Permitted; copyright retained by author(s); https://rightsstatements.org/page/InC-EDU/1.0/. application/pdf. Massachusetts Institute of Technology. |
title | Implementing Robust and Efficient Pseudo-transient Methods for Solving Neural Complementarity Problems in Julia |
url | https://hdl.handle.net/1721.1/153881 |
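The sensitivity analysis mentioned in the abstract can be sketched with the implicit function theorem: if x*(θ) solves Φ(x, θ) = 0 and ∂Φ/∂x is nonsingular at the solution, then dx*/dθ = −(∂Φ/∂x)⁻¹ ∂Φ/∂θ. Below is a minimal Python sketch under those smoothness and nonsingularity assumptions (the thesis implements this machinery in Julia); `ncp_sensitivity` and its finite-difference Jacobians are illustrative stand-ins, not the thesis code.

```python
import numpy as np

def fischer_burmeister(a, b):
    # phi(a, b) = a + b - sqrt(a^2 + b^2); zero iff a >= 0, b >= 0, a*b = 0.
    return a + b - np.sqrt(a * a + b * b)

def ncp_sensitivity(Phi, x_star, theta, d=1e-6):
    """dx*/dtheta at a solution Phi(x*, theta) = 0, via the implicit
    function theorem:  dx*/dtheta = -(dPhi/dx)^{-1} (dPhi/dtheta).
    Both Jacobians are forward-difference approximations here.
    """
    x_star = np.asarray(x_star, dtype=float)
    theta = np.asarray(theta, dtype=float)
    Phi0 = Phi(x_star, theta)
    n, m = len(x_star), len(theta)
    Jx = np.zeros((n, n))          # dPhi/dx, column by column
    for j in range(n):
        xp = x_star.copy()
        xp[j] += d
        Jx[:, j] = (Phi(xp, theta) - Phi0) / d
    Jt = np.zeros((n, m))          # dPhi/dtheta, column by column
    for j in range(m):
        tp = theta.copy()
        tp[j] += d
        Jt[:, j] = (Phi(x_star, tp) - Phi0) / d
    return -np.linalg.solve(Jx, Jt)
```

For the LCP F(x) = Mx + q with interior solution x* = −M⁻¹q, this recovers dx*/dq = −M⁻¹. In a learning setting, the same linear solve supplies the vector-Jacobian products that backpropagation through the complementarity layer needs, without differentiating through the solver's iterations.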