Research | Marius Merkle

Quantitative Research Projects

Throughout my studies, I have worked on a number of projects, including final projects for classes, theses, and seminar work. They cover technical areas that I am excited about, such as mathematical optimization, uncertainty quantification, data science, deep learning, and differential equations.

Optimization of Precision and Recall in Classification Using Mixed Integer Programming

For classification, we are often interested in specific metrics such as precision and recall. However, gradient-based learning methods can only optimize a differentiable surrogate (e.g., binary cross-entropy), and minimizing the surrogate does not necessarily optimize the metric of interest. In this project, we formulate classification via logistic regression as a mixed-integer optimization (MIO) program. Because this kind of optimization does not require a differentiable loss function, we can optimize the desired metric directly. We focus on the precision metric and demonstrate that our formulation is able to optimize precision on the training data. Because the number of variables and constraints grows quadratically with the number of data points, the formulation does not scale well to larger data sets. To deal with this issue, we also introduce a relaxed proxy version of the MIO program that grows linearly with the number of data points.
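
As a rough illustration of why precision is awkward for gradient-based training, the sketch below evaluates precision, recall, and binary cross-entropy for a linear classifier before and after a tiny change in the weight. It is not the MIO formulation from the report; the toy data and the helper names (`predict`, `precision_recall`, `binary_cross_entropy`) are assumptions made for this example.

```python
import numpy as np

def predict(X, w, b):
    """Hard 0/1 predictions of a linear classifier (threshold at zero)."""
    return (X @ w + b >= 0).astype(int)

def precision_recall(y_true, y_pred):
    """Precision and recall computed from hard 0/1 predictions."""
    tp = np.sum((y_pred == 1) & (y_true == 1))
    fp = np.sum((y_pred == 1) & (y_true == 0))
    fn = np.sum((y_pred == 0) & (y_true == 1))
    precision = tp / (tp + fp) if tp + fp > 0 else 0.0
    recall = tp / (tp + fn) if tp + fn > 0 else 0.0
    return float(precision), float(recall)

def binary_cross_entropy(X, y_true, w, b):
    """Differentiable surrogate: mean binary cross-entropy of the logistic model."""
    p = 1.0 / (1.0 + np.exp(-(X @ w + b)))
    return float(-np.mean(y_true * np.log(p) + (1 - y_true) * np.log(1 - p)))

# Hypothetical toy data: one feature, four points.
X = np.array([[-2.0], [-1.0], [1.0], [2.0]])
y = np.array([0, 1, 0, 1])

for w in (np.array([1.0]), np.array([1.001])):
    print(precision_recall(y, predict(X, w, 0.0)), binary_cross_entropy(X, y, w, 0.0))
```

Precision and recall stay the same under the small perturbation while the cross-entropy changes, so the metrics are piecewise constant in the weights and provide no useful gradient. An integer-programming formulation does not need that gradient.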

[Final project MIT 15.095 - Machine Learning under a Modern Optimization Lens]

Download Report

Local Counterfactuals for Chest X-Ray Images Using Saliency Map Guidance

Machine learning for medical imaging diagnosis promises to improve patient care and make clinicians more efficient. However, due to the high-risk nature of medical diagnosis, these models need to produce explanations that are easily interpretable by healthcare professionals. Counterfactual explanations are very natural to the medical image domain but are difficult to produce correctly due to the fine anatomical details inherent in a medical image and the often localized nature of an ailment. To solve this, we introduce a localized optimization objective that finds a point in the learned latent image space such that the image modifications are concentrated in a small, localized region. This region is defined by the saliency map produced by the image classifier. We show that this procedure is able to produce localized counterfactuals in chest X-ray images taken from the CheXpert data set for multiple conditions.
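
One way to picture a localization term is a penalty on pixel changes that fall outside the salient region. The sketch below is only an assumed illustration of that idea, not the objective from the report: the function name `outside_mask_penalty`, the thresholding of the saliency map, and the squared-error form are all hypothetical choices.

```python
import numpy as np

def outside_mask_penalty(original, counterfactual, saliency, threshold=0.5):
    """Sum of squared pixel changes outside the salient region.

    The salient region is assumed to be where saliency >= threshold;
    edits inside it are not penalized by this term.
    """
    allowed = saliency >= threshold
    squared_change = (counterfactual - original) ** 2
    return float(squared_change[~allowed].sum())

# Hypothetical 4x4 "image": one edit inside the salient region, one outside.
original = np.zeros((4, 4))
counterfactual = original.copy()
counterfactual[0, 0] = 1.0   # inside the salient region, not penalized
counterfactual[3, 3] = 2.0   # outside the salient region, penalized
saliency = np.zeros((4, 4))
saliency[0, 0] = 1.0

print(outside_mask_penalty(original, counterfactual, saliency))  # 4.0
```

In a full pipeline, a term like this would be combined with a classification loss and minimized over the latent code of a generative model rather than over raw pixels.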

[AC 297R - Capstone Project]

Download Report

Supply Chain Management

For the final project of the MBA class in Supply Chain Management at Harvard Business School (HBS), we stepped into the shoes of a supply chain manager. We had full control over factories, transportation, and warehouses, with the goal of maximizing profit in the simulation. We built demand prediction models based on historical data, and their forecasts served as input to the logistics optimization problem. In the latter, we devised a production and shipping plan that satisfies demand while being as inventory-efficient as possible.

[Final project HBS MBA 2108 - Supply Chain Management]

Download Report

Review on Uncertainty Quantification for Infinite Neural Networks

To better understand the theoretical behavior of large neural networks, several works have analyzed the case where the network’s width tends to infinity. In this regime, the effect of random initialization and the process of training a neural network can be formally expressed with analytical tools like Gaussian processes and neural tangent kernels. In this paper, we review methods for uncertainty quantification in infinite-width neural networks and examine their relationship to Gaussian processes in the Bayesian inference framework. We make use of several equivalence results along the way to obtain exact closed-form solutions for predictive uncertainty.
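
The closed-form predictive uncertainty mentioned above comes from standard Gaussian process regression. The sketch below computes the posterior mean and covariance for a given kernel. The RBF kernel is only a stand-in assumption here; in the infinite-width setting it would be replaced by the kernel induced by the network architecture. The function names and toy data are hypothetical.

```python
import numpy as np

def rbf_kernel(A, B, length_scale=1.0):
    """Squared-exponential kernel (placeholder for a network-induced kernel)."""
    sq_dist = (np.sum(A**2, axis=1)[:, None]
               + np.sum(B**2, axis=1)[None, :]
               - 2.0 * A @ B.T)
    return np.exp(-0.5 * np.maximum(sq_dist, 0.0) / length_scale**2)

def gp_posterior(X_train, y_train, X_test, noise=1e-6, kernel=rbf_kernel):
    """Closed-form GP regression posterior mean and covariance at X_test."""
    K = kernel(X_train, X_train) + noise * np.eye(len(X_train))
    K_s = kernel(X_train, X_test)
    K_ss = kernel(X_test, X_test)
    L = np.linalg.cholesky(K)                      # K = L L^T
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))
    mean = K_s.T @ alpha                           # K_s^T K^{-1} y
    v = np.linalg.solve(L, K_s)
    cov = K_ss - v.T @ v                           # K_ss - K_s^T K^{-1} K_s
    return mean, cov

X_train = np.array([[-1.0], [0.0], [1.0]])
y_train = np.sin(X_train[:, 0])
X_test = np.array([[0.0], [10.0]])

mean, cov = gp_posterior(X_train, y_train, X_test)
print(mean, np.diag(cov))
```

The predictive variance is close to zero at a training input and returns to the prior variance far away from the data, which is the behavior one expects from an exact Bayesian posterior.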

[Final project AM 207 - Probabilistic Machine Learning]

Download Report

Mortality Prediction and Interpretation

In this project, we use a subset of the data used in the NHANES I Epidemiologic Follow-up Study (NHEFS). NHEFS is a national longitudinal study, which was conducted by the National Center for Health Statistics and the National Institute on Aging between 1971 and 1975. The main research question that we seek to explore is: what factors from our data set are predictive of an individual’s longevity? In other words, what factors from our data set are predictive of living at least x years beyond an individual’s initial examination?

[Final project CS 209A - Data Science]

Download Report

Case Studies on Physics-Informed Neural Networks

We empirically investigated how physics-informed neural networks (PINNs) can be used to solve differential equations. We studied three problems and applied several modeling tricks to accelerate convergence, including the reparametrization trick, a weighted loss function, and curriculum learning.

Solving differential equations with PINNs can bring great benefits. The solution framework is generalizable to any differential equation, and it gives closed-form, continuous solutions that can be evaluated at arbitrary points without the need for explicit interpolation. Furthermore, the solution is infinitely differentiable, and partial derivatives can easily be obtained with automatic differentiation. Finally, if we train a PINN on solution bundles, a single function represents the solution of a whole family of differential equations, so the solution for any parameter value is available at test time with a single forward pass.
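
To make the idea concrete, the sketch below builds the physics-informed loss for an assumed toy problem, du/dt = -u with u(0) = 1, which is not necessarily one of the three problems from the report. The trial solution u(t) = u0 + t N(t) hard-codes the initial condition, which is one common form of reparametrization and may differ from the one used in the report. The derivative of the small tanh network is written out by hand to stay NumPy-only; in practice it would come from automatic differentiation. All function names are hypothetical.

```python
import numpy as np

def init_params(hidden=16, seed=0):
    """Random parameters of a one-hidden-layer tanh network N(t)."""
    rng = np.random.default_rng(seed)
    return {
        "w1": rng.normal(size=hidden),
        "b1": rng.normal(size=hidden),
        "w2": rng.normal(size=hidden) / np.sqrt(hidden),
    }

def net_and_derivative(params, t):
    """Return N(t) and dN/dt for every time in the 1-D array t."""
    z = np.outer(t, params["w1"]) + params["b1"]   # shape (n_points, hidden)
    h = np.tanh(z)
    n = h @ params["w2"]
    dn_dt = ((1.0 - h**2) * params["w1"]) @ params["w2"]
    return n, dn_dt

def trial_solution(params, t, u0=1.0):
    """u(t) = u0 + t * N(t), which satisfies u(0) = u0 by construction."""
    n, dn_dt = net_and_derivative(params, t)
    u = u0 + t * n
    du_dt = n + t * dn_dt                          # product rule
    return u, du_dt

def residual_loss(params, t, decay=1.0):
    """Mean squared residual of du/dt + decay * u = 0 at collocation points."""
    u, du_dt = trial_solution(params, t)
    return float(np.mean((du_dt + decay * u) ** 2))

params = init_params()
t = np.linspace(0.0, 2.0, 21)                      # collocation points
print(residual_loss(params, t))
```

Training a PINN amounts to minimizing `residual_loss` over the network parameters with a gradient-based optimizer. No labeled solution values are needed, only the differential equation itself.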

[Final project AM 205 - Numerical Methods]

Download Report

Boosting the Training of Physics-Informed Neural Networks with Transfer Learning

This bachelor’s thesis introduces a new paradigm for training physics-informed neural networks. Instead of random initialization, which is applied in almost all previous works in scientific machine learning, databases of converged physics-informed neural networks are exploited. After matching a new target problem of interest with its most similar problem from a database, the trained network parameters of that source model are taken as the initialization for the target model. This provides a principled framework for initialization. During training, both the speed of convergence and the final accuracy were superior to those obtained with random initialization.
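
The matching-and-initialization step can be sketched as a nearest-neighbor lookup. Everything below is an assumption for illustration: the thesis does not state here how similarity is measured, so the sketch uses the Euclidean distance between hypothetical problem descriptors, and the database layout and function names are invented.

```python
import numpy as np

def nearest_source(database, target_descriptor):
    """Return the database entry whose descriptor is closest to the target."""
    target = np.asarray(target_descriptor, dtype=float)
    distances = [np.linalg.norm(np.asarray(entry["descriptor"], dtype=float) - target)
                 for entry in database]
    return database[int(np.argmin(distances))]

def warm_start(database, target_descriptor):
    """Copy the parameters of the most similar source model as initialization."""
    source = nearest_source(database, target_descriptor)
    return {name: np.copy(value) for name, value in source["params"].items()}

# Hypothetical database: each descriptor holds a single problem parameter.
database = [
    {"descriptor": [1.0], "params": {"w": np.array([0.1, 0.2])}},
    {"descriptor": [5.0], "params": {"w": np.array([0.7, 0.9])}},
]

init = warm_start(database, target_descriptor=[4.2])
print(init["w"])  # parameters of the source problem with descriptor 5.0
```

The copied parameters replace the random initialization of the target network, and training then proceeds as usual.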

[Bachelor thesis]

Download Report

Reinventing Engineering Simulations

As a whitepaper accompanying my bachelor thesis, I elaborated on how to reinvent engineering simulations in order to reduce computational effort significantly and make them accessible to students at a low cost. If you are interested in developing a next-generation tool that exploits synergies between physics and artificial intelligence, feel free to contact me via e-mail.

Download Exposé

Linear Stability Analysis of Rayleigh-Bénard Convection

In one of my final projects at École Polytechnique, we worked on a special type of natural convection. Rayleigh-Bénard convection occurs when a liquid enclosed in a rectangular box is heated from below. Depending on the temperature difference between the top and bottom walls, the liquid shows different flow patterns, which can result in beautiful pictures.

[Final project MAP 551 - Dynamic Systems]

Download Report

Hamiltonian Neural Networks

Neural networks that model systems without any prior physical knowledge usually struggle to learn the basic laws of physics and hence generate physically unrealistic solutions. This motivates physics-informed neural networks, which feed physical information to the neural network. This can be done by training the neural network to minimize the error on a given physical law, i.e., by incorporating a physical law in the network’s loss function. In the paper presented here [Greydanus et al. 2019], the authors integrate the fundamental equations of well-established Hamiltonian mechanics into neural networks. In particular, the resulting Hamiltonian neural network (HNN) learns a parametrized version of the underlying scalar Hamiltonian and shows remarkable results in comparison to a baseline neural network (BNN) without any physics priors.
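
The core mechanism can be sketched in a few lines: given a scalar function H(q, p), the time derivatives follow from Hamilton's equations, dq/dt = ∂H/∂p and dp/dt = -∂H/∂q, and the loss compares them with observed derivatives. In the sketch below, an analytic Hamiltonian of an ideal spring stands in for the learned network, and central finite differences stand in for automatic differentiation. Both are simplifying assumptions, and the function names are hypothetical.

```python
import numpy as np

def hamiltonian_vector_field(H, q, p, eps=1e-5):
    """Time derivatives (dq/dt, dp/dt) implied by a scalar Hamiltonian H(q, p)."""
    dH_dq = (H(q + eps, p) - H(q - eps, p)) / (2 * eps)
    dH_dp = (H(q, p + eps) - H(q, p - eps)) / (2 * eps)
    return dH_dp, -dH_dq

def hnn_loss(H, q, p, dq_dt, dp_dt):
    """Mean squared mismatch between implied and observed time derivatives."""
    pred_dq, pred_dp = hamiltonian_vector_field(H, q, p)
    return float(np.mean((pred_dq - dq_dt) ** 2 + (pred_dp - dp_dt) ** 2))

def spring(q, p):
    """Ideal mass-spring Hamiltonian with unit mass and unit stiffness."""
    return 0.5 * (q**2 + p**2)

q = np.array([1.0, 0.5])
p = np.array([0.0, -2.0])

dq_dt, dp_dt = hamiltonian_vector_field(spring, q, p)
print(dq_dt, dp_dt)                       # approximately p and -q
print(hnn_loss(spring, q, p, p, -q))      # close to zero for the true dynamics
```

Because the dynamics are derived from a single scalar H rather than predicted directly, the learned H is conserved along the predicted trajectories, which is the physics prior the baseline network lacks.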

[IN 0014 - Deep Learning in Physics]

Download Report