Auto-parametrized Kernel Methods for Biomolecular Modeling
Author | : Timothy Andrew Szocinski |
Total Pages | : 171 |
Release | : 2021 |
ISBN-13 | : 9798759993155 |
Book excerpt: Being able to predict various physical quantities of biomolecules is of great importance to biologists, chemists, and pharmaceutical companies. Applying machine learning techniques to develop these predictive models has proven very successful. Advanced mathematical techniques involving graph theory, algebraic topology, differential geometry, etc. have been very effective in generating first-rate biomolecular representations that are used to train a variety of machine learning models. Some of these representations depend on a choice of kernel function along with parameters that determine its shape. These kernel-based methods of producing features require careful tuning of the kernel parameters, and the tuning cost increases exponentially as more kernels are involved. This limitation largely restricts us to machine learning models with fewer hyperparameters, such as random forest (RF) and gradient-boosting trees (GBT), thus precluding the use of neural networks for kernel-based representations.

To alleviate these concerns, we have developed the auto-parametrized weighted element-specific graph neural network (AweGNN), which uses kernel-based geometric graph features in which the kernel parameters are automatically updated throughout training to reach an optimal combination of kernel parameters. The AweGNN models have proven particularly successful in toxicity and solvation predictions, especially when a multi-task approach is taken. Although the AweGNN introduced hundreds of parameters that were automatically tuned, the ability to include multiple kernel types simultaneously was hindered by the computational expense.

In response, the GPU-enhanced AweGNN was developed to tackle the issue. Working with GPU architecture, the AweGNN's computation speed was greatly enhanced. To achieve a more comprehensive representation, we suggested a network that adds fixed topological and spectral auxiliary features to bolster the original AweGNN's success. The proposed network was tested on new hydration and solubility datasets, with excellent results. To extend the auto-parametrized kernel technique to features of a different type, we introduced the theoretical foundation for building an auto-parametrized spectral layer, which uses kernel-based spectral features to represent biomolecular structures.

In this dissertation, we explore some underlying notions of mathematics useful in our models, review important topics in machine learning, discuss techniques and models used in molecular biology, detail the AweGNN architecture and results, and test and expand new concepts pertaining to these auto-parametrized kernel methods.
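The idea of updating kernel parameters during training, rather than tuning them by grid search, can be illustrated with a small sketch. Everything in it is an assumption made for illustration and is not taken from the dissertation: the generalized exponential kernel exp(-(r/eta)^kappa), the function names, the toy distances, and the squared-error objective. It shows only that a kernel shape parameter (here `eta`) can be fitted by gradient descent because the kernel-based feature is differentiable in that parameter.

```python
import math

def exp_kernel(r, eta, kappa=2.0):
    """Hypothetical generalized exponential kernel exp(-(r/eta)^kappa)."""
    return math.exp(-((r / eta) ** kappa))

def d_exp_kernel_d_eta(r, eta, kappa=2.0):
    """Analytic derivative of the kernel with respect to eta."""
    u = (r / eta) ** kappa
    return math.exp(-u) * kappa * u / eta

def feature(distances, eta, kappa=2.0):
    """Kernel-based feature: sum of kernel values over pairwise distances."""
    return sum(exp_kernel(r, eta, kappa) for r in distances)

def fit_eta(distances, target, eta=1.0, lr=0.05, steps=500, kappa=2.0):
    """Gradient descent on eta, minimizing (feature - target)^2."""
    for _ in range(steps):
        err = feature(distances, eta, kappa) - target
        grad = 2.0 * err * sum(
            d_exp_kernel_d_eta(r, eta, kappa) for r in distances
        )
        eta -= lr * grad
    return eta

# Toy usage: recover the eta that produced a target feature value.
distances = [1.0, 1.5, 2.0]          # made-up interatomic distances
target = feature(distances, 2.0)     # feature generated with eta = 2.0
learned_eta = fit_eta(distances, target, eta=1.0)
```

In a full model of the kind the excerpt describes, the same gradient would presumably be obtained by backpropagation through the network, with many kernel parameters updated together with the network weights. This sketch isolates a single parameter to keep the mechanism visible.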