PhD theses
Alex Moriarty (2025)

Improving formulations efficiency with molecularly-enhanced machine learning
Herbicide retention on leaf surfaces is a critical challenge in agricultural chemistry, with significant economic and environmental implications. This thesis integrates molecular simulation, coarse-grained (CG) model development, and machine learning to address the design of surfactants that enhance droplet adhesion and reduce dynamic surface tension (DST). A series of CG models for Aerosol-OT (AOT) is developed and validated against experimental observables, enabling the first computational observation of micelle-to-vesicle transitions for this surfactant. The physical processes underlying surfactant adsorption are discussed, and CG simulations of AOT are used to estimate the kinetics of different competing pathways to adsorption. A model of a leaf surface based on the alkane octacosane is developed, and the adsorption of AOT vesicles on this surface is investigated. Evidence is presented for the importance of vesicle hemifusion to rapid DST reduction. Machine learning approaches, including graph neural networks (GNNs) and Gaussian processes (GPs), are applied to surfactant datasets for critical micelle concentration prediction, outperforming traditional methods and providing principled uncertainty quantification. It is shown that dimensionality reduction of the GNN latent space facilitates chemical space analysis and interpretation of surfactant similarity. The combined simulation and data-driven framework described here establishes a robust methodology for surfactant design, with direct applications to agricultural formulation development. These findings contribute to a deeper understanding of molecular mechanisms underlying droplet retention and provide a foundation for future empirical and computational research in surfactant science.
Xinrui Cai (2025)

Clathrate Hydrates: Effects of Kinetic and Thermodynamic Promoters probed via Molecular Simulations
Gas hydrates, crystalline compounds made from water molecules that form under high-pressure and low-temperature conditions, have attracted significant interest for their potential in energy storage and carbon capture. However, their slow formation kinetics and narrow thermodynamic stability range pose significant challenges for practical applications. This thesis investigates, at the molecular level, how chemical additives known as promoters influence gas hydrate growth. Using molecular dynamics (MD) simulations, thermodynamic and kinetic promoters were studied, both individually and in combination. Specifically, the roles of L-tryptophan (TRP) as a kinetic promoter and 1,3-dioxane (DIO) as a thermodynamic promoter were examined individually. TRP was found to enhance interfacial structuring and facilitate the adsorption of CO2 molecules into hydrate cavities, thus promoting hydrate growth. DIO, on the other hand, was observed to induce the formation of structure II (sII) hydrates rather than the typical structure I (sI) hydrates observed in the pure CH4 system, leading to the emergence of a novel methane uptake pathway driven by preferential cage occupancy. Combining the kinetic promoter sodium dodecyl sulfate (SDS) with the thermodynamic promoter tetrahydrofuran (THF) revealed a complex interplay: while synergistic behaviour was observed at certain concentrations, SDS-induced micelle formation at elevated temperatures sequestered THF, reducing its thermodynamic stabilisation and producing an antagonistic effect. Moreover, a novel machine learning framework was developed to classify the local structure more accurately and to identify hydrate cages, interfacial regions and bulk liquids. The findings in this thesis offer important insights into how hydrate promoters behave and serve as a useful foundation for designing more effective additive systems in hydrate-based applications.
Antoniu Bjola (2025)

Mean Force Integration: A Unified Framework for Merging Independent Simulations subject to Various Bias Potentials
Molecular dynamics (MD) simulation has become a powerful tool for studying and predicting molecular properties due to significant algorithmic advances and the explosive growth of computational capabilities. Moreover, molecular dynamics enables the direct evaluation of free energy surfaces (FESs), offering atomistic insight that complements experimental studies and enables the prediction of numerous thermodynamic properties. Yet MD remains constrained by system size and accessible time scales. Furthermore, many processes of interest, such as nucleation or protein folding, are characterised by rare events, where considerable energy barriers impede transitions between stable states. Because transitions must be sampled multiple times for statistically significant predictions of free energies, brute-force simulations become infeasible. To address this issue, numerous methods have been proposed to enhance the sampling of rare events. Two widely used methods are umbrella sampling and metadynamics. The former makes use of parallel simulations that sample local predefined regions of configuration space, while the latter continually constructs a bias potential that facilitates the sampling of high-energy configurations. A new method called mean force integration (MFI), which builds on metadynamics, computes the mean force rather than the FES directly, thereby simplifying reweighting and accelerating convergence. Additionally, it can be used to combine independent simulations, turning a serial problem into a parallel one, which increases computational efficiency. This thesis advances MFI to a versatile framework: a general formulation is presented, accommodating the combination of arbitrary static and history-dependent biases. This is complemented by an on-the-fly uncertainty metric that estimates the convergence of the mean force, and a bootstrap analysis that provides a quantitative assessment of the error of the FES.
These advances are validated with complex chemical systems, including the nucleation of supersaturated argon vapour, the two-step crystallisation of a colloidal system, and the beta-scission reaction of butyl acrylate. It is shown how the computational cost of excessively expensive simulations can be reduced by employing several shorter simulations subject to diverse biasing parameters. The resulting under-converged trajectories were analysed and combined with MFI, resulting in converged FESs. For the beta-scission reaction, the FES was used to predict reaction rates, which agreed with experimental rates. Additionally, novel reinitialisation protocols are introduced, dividing simulations into diverse biasing stages and recycling interim FES estimates as starting static biases, thereby consistently enhancing convergence of the FES. This was further developed into a framework where simulations are analysed in real time, terminated and reinitialised automatically, as biasing parameters are optimised iteratively. To encourage a wider adoption of MFI, all the Python code used in this work is made openly accessible at github.com/mme-ucl/MFI. By unifying data from independent biased trajectories, enabling an iterative improvement of biasing parameters, and providing reliable convergence metrics, MFI broadens the range of phenomena that researchers can tackle.
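The idea of combining independently biased simulations through their mean forces can be illustrated with a minimal one-dimensional sketch. It is not the code from github.com/mme-ucl/MFI: the double-well surface `F_true`, the harmonic window parameters and the function name `window_mean_force` are all assumptions made for illustration, and analytic biased densities stand in for histograms that would normally come from sampling. Each static-bias window yields an estimate of dF/ds as -kT d ln p_b/ds - dV/ds; the estimates are averaged with the biased densities as weights and then integrated.

```python
import numpy as np

kT = 1.0
s = np.linspace(-1.5, 1.5, 301)        # collective-variable grid
ds = s[1] - s[0]
F_true = 5.0 * (s**2 - 1.0) ** 2        # hypothetical double well, in units of kT


def window_mean_force(centre, k_spring=50.0):
    """Biased density and dF/ds estimate for one harmonic (static) bias window.

    The biased density is computed analytically here as a stand-in for a
    converged histogram from a biased simulation.
    """
    V = 0.5 * k_spring * (s - centre) ** 2
    dV_ds = k_spring * (s - centre)
    p_b = np.exp(-(F_true + V) / kT)
    p_b /= p_b.sum() * ds
    # Gradient of the unbiased free energy recovered from this window alone
    dF_ds = -kT * np.gradient(np.log(p_b), ds, edge_order=2) - dV_ds
    return p_b, dF_ds


# Combine the windows: average of dF/ds weighted by each biased density
numerator = np.zeros_like(s)
denominator = np.zeros_like(s)
for centre in np.linspace(-1.5, 1.5, 7):
    p_b, dF_ds = window_mean_force(centre)
    numerator += p_b * dF_ds
    denominator += p_b
dF_ds_combined = numerator / denominator

# Integrate the combined mean force (trapezoid rule) to obtain the FES
F_mfi = np.concatenate(
    ([0.0], np.cumsum(0.5 * (dF_ds_combined[1:] + dF_ds_combined[:-1]) * ds))
)
F_mfi -= F_mfi.min()
```

Because each window contributes only where its biased density is appreciable, the weighting lets poorly sampled regions of one simulation be filled in by another, which is the property that makes short, independently biased runs combinable.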
Alex Ferreira (2025)

Sampling and Classifying High-Dimensional Conformational Free Energy Landscapes of Active Pharmaceutical Ingredients
We present a new method for calculating the high-dimensional conformational free energy landscapes of flexible drug-like molecules. Using Density Peaks Advanced’s density estimator, the free energy associated with individual configurations sampled from an enhanced sampling simulation can be calculated in a gridless manner, thus enabling the mapping of conformational ensembles in dimensionalities computationally inaccessible to grid-based methods. Due to the physics-based configurational sampling, conformers can be characterized by the configurations corresponding to the density peaks. The gridless nature of this method enables this characterization in the full dimensionality of a flexible molecule’s conformation space. This method can produce per-point free energy maps, which enable the study of conformational interchanges at a level of detail previously inaccessible. This method is initially demonstrated on molecules with 2-, 4-, and 11-dimensional conformational spaces and is presented alongside a set of consistency checks which enable the quality of the high-dimensional results to be assessed. Finally, to further demonstrate the utility of the method, a study of the conformational landscapes of four different molecules is presented. Each molecule is subjected to two distinct solvent environments, which have been linked experimentally to conformational changes in these systems. The subsequent impact on the high-dimensional free energy landscapes is explored through the use of Sketch-map projections and visual inspections of the conformers generated by the method.
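The notion of a gridless, per-point free energy can be sketched as follows. This example does not use the Density Peaks Advanced estimator named in the abstract; a plain k-nearest-neighbour density estimate is swapped in as a simple stand-in, and the function name `knn_free_energy`, the choice `k=20` and the Gaussian test data are assumptions for illustration only. Each sampled point i receives a density estimate and a free energy F_i = -kT ln(rho_i), with no grid in the conformational space.

```python
import math

import numpy as np


def knn_free_energy(points, k=20, kT=1.0):
    """Per-point free energy from a k-nearest-neighbour density estimate.

    points: array of shape (n_samples, n_dims). Returns F shifted so min(F) = 0.
    """
    n, d = points.shape
    diff = points[:, None, :] - points[None, :, :]
    dist = np.sqrt((diff**2).sum(axis=-1))
    # Column 0 after partitioning is the zero self-distance, so index k
    # is the distance to the k-th nearest neighbour.
    r_k = np.partition(dist, k, axis=1)[:, k]
    unit_ball_volume = math.pi ** (d / 2) / math.gamma(d / 2 + 1)
    density = k / (n * unit_ball_volume * r_k**d)
    F = -kT * np.log(density)
    return F - F.min()


# Hypothetical test data: samples from a 2D unit Gaussian, for which the
# exact free energy is |x|^2 / 2 (in units of kT) up to a constant.
rng = np.random.default_rng(0)
samples = rng.normal(size=(1500, 2))
F_point = knn_free_energy(samples)
```

The cost of the estimate depends on the number of samples rather than on the number of grid cells, which is why a per-point approach remains tractable in dimensionalities where a grid would not.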
Florian Dietrich (2025)

Machine Learning Nucleation Collective Variables using Graph Neural Networks
One of the main obstacles limiting the use of enhanced sampling techniques to study nucleation events, in particular those of molecular systems, in realistic environments, is the computational cost associated with evaluating and differentiating nucleation collective variables. This work addresses this bottleneck by developing a framework for training approximations of nucleation collective variables based on graph neural networks. These approximations are more computationally efficient and scale more favourably with system size than their analytical counterparts. This work further demonstrates how to recover the free energy surfaces corresponding to the analytical variables from model sampling, which enables the construction of free energy profiles in non-differentiable, previously inaccessible, descriptor spaces. The work on constructing, training and deploying graph neural network-based machine-learned collective variables led to studying the theoretical implications of constructing free energy surfaces in the space of variables that are the result of stochastic processes. This has led to the development of guidelines for the interpretation of free energy surfaces in machine-learned collective variable spaces, a novel gradient rescaling method and a novel regularisation technique. The generality and transferability of this framework are finally demonstrated by enhancing the sampling of guanine nucleation from solution, a system for which the nucleation mechanism was previously elucidated in a joint experimental and computational study.
Nicholas Francia (2022)

Reducing crystal structure overprediction: from small rigid molecules to conformationally complex drugs
In the pharmaceutical industry, the control of a new drug’s crystal form is key to optimising its formulation and mode of action. Computational Crystal Structure Prediction (CSP) methods for organic crystalline materials are becoming increasingly accurate at predicting the relative stability between packings, even if they usually grossly overestimate the number of polymorphs. The purpose of this work is to develop a systematic and scalable method to reduce CSP sets to a small number of putative polymorphs by including temperature effects. In fact, not all hypothetical structures corresponding to local minima in the lattice energy landscape are expected to be stable at finite temperature, with many of them merging into a smaller set of persistent states. In order to identify persistent structures, classical molecular dynamics simulations at finite temperature are performed on CSP-generated crystal structures. Unstable structures are thus automatically removed by checking if molecules exhibit a random inter-molecular orientation, typical of the melted state. On the other hand, to identify those structures that convert to the same geometry, I devised a clustering analysis based on probabilistic fingerprints that provide information on the relative position, relative orientation and conformation of molecules within a dynamic crystal supercell. These molecule-specific fingerprints are able to efficiently distinguish different structures of large supercells and can robustly handle the displacement of atomic positions from equilibrium typical of finite-temperature simulations. These are used to quantitatively assess the similarity between pairs of structures and cluster analogous geometries. Finally, I used Well-Tempered Metadynamics on the cluster centres to overcome MD limits and sample possible slow transitions. I applied this method to molecules of increasing conformational complexity and to datasets ranging from a few dozen to thousands of structures.
Instrumental in achieving scalability over a large set of crystal structures has been the development of a Python library that handles the setup of MD simulations and automatically analyses the resulting trajectories, enabling us to manage the large sets of structures typical of real-world CSP applications.
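One way a check for random inter-molecular orientation could look is sketched below. This is an illustrative assumption and not the criterion implemented in the thesis or its Python library: the function name `orientation_entropy`, the use of a single axis vector per molecule, the 20-bin histogram and the 0.9 threshold are all hypothetical. The sketch computes the normalised Shannon entropy of the pairwise |cos(angle)| distribution between molecular axes, which approaches 1 for isotropically random orientations and stays low when only a few relative orientations occur, as in a crystal.

```python
import numpy as np


def orientation_entropy(axes, bins=20):
    """Normalised Shannon entropy of pairwise |cos(angle)| between molecular axes.

    axes: array of shape (n_molecules, 3), one orientation vector per molecule.
    Returns a value in [0, 1]; values near 1 indicate random relative orientations.
    """
    unit = axes / np.linalg.norm(axes, axis=1, keepdims=True)
    cosines = np.clip(np.abs(unit @ unit.T), 0.0, 1.0)
    upper = np.triu_indices(len(unit), k=1)      # each pair counted once
    counts, _ = np.histogram(cosines[upper], bins=bins, range=(0.0, 1.0))
    p = counts / counts.sum()
    p = p[p > 0]
    return float(-(p * np.log(p)).sum() / np.log(bins))


rng = np.random.default_rng(1)

# Hypothetical "crystal": two alternating molecular orientations plus thermal noise
crystal_axes = np.tile([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]], (50, 1))
crystal_axes = crystal_axes + rng.normal(scale=0.02, size=crystal_axes.shape)

# Hypothetical "melt": isotropically random orientations
melt_axes = rng.normal(size=(100, 3))

S_crystal = orientation_entropy(crystal_axes)
S_melt = orientation_entropy(melt_axes)
is_melted = S_melt > 0.9     # assumed threshold, for illustration only
```

A scalar of this kind can be evaluated on every trajectory without manual inspection, which is the sort of automation that screening large CSP sets requires.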
Loukas Kollias (2020)

Molecular modelling of the early stages of Metal-Organic Framework self-assembly
Metal-Organic Frameworks (MOFs) constitute a class of novel hierarchical materials. MOFs are strong candidates in several applications including catalysis, carbon capture and storage, and drug delivery. Past and current research on MOFs has utilised several experimental techniques. Nevertheless, a thorough investigation of MOF synthesis requires molecular simulations in order to provide information at length scales unreachable by experimental techniques and thus understand the mechanisms of assembly. The purpose of this research project is to study the formation of the MIL-101(Cr) structural building units (SBU) using Molecular Dynamics. A bottom-up approach to assembly is followed, starting from the evaluation of half-SBU conformational flexibility in solution. SBU association-dissociation and rearrangement are then assessed, leading to a connection between synthesis conditions and the configuration of small-scale adducts at an early stage of assembly. In particular, the effect of ions (Na+, F-) and solvent (water, DMF) on promoting crystal-like configurations of SBUs is investigated. The enthalpic and entropic contributions are also calculated under various conditions, leading to a better understanding of the thermostructural behaviour of the conformers. Finally, the collective behaviour of SBUs during assembly is analysed through simulations in which numerous half-SBUs interact and form clusters. In summary, this work provides a molecular-level understanding of the experimental finding that ions favour crystallinity in MOFs. Ultimately, the conformational complexity in early stages of MOF self-assembly leads to the conclusion that guest molecules that affect the entropic landscape of MOF precursors are key to regulating the extent of defects in a MOF cluster.
Veselina Marinova (2020)

Molecular-Level Characterisation of Crystal-Solution Interfaces
The shape of solution-grown crystal particles is largely dependent on the relative growth rate of the morphologically dominant crystal faces, which is known to be affected by the solvent. Developing accurate models for predicting crystal morphologies requires a molecular-level understanding of the solid-liquid interface. Using a combination of molecular dynamics simulations and enhanced sampling methods, this work carries out a comprehensive study of the dynamics and thermodynamics of crystal-solution interfaces for the case of ibuprofen, focusing on aspects often neglected in mesoscopic models for crystal growth. An investigation into the conformational isomerism of ibuprofen shows that conformational rearrangements at the crystal-solution interface are governed by specific surface-solvent interactions and can have a non-negligible impact on the surface growth/dissolution kinetics. An unsupervised clustering algorithm is proposed to extend the study of conformational isomerism to systems with a large number of conformationally relevant degrees of freedom. By assessing thermodynamic and kinetic information on the solvent in contact with crystal surfaces, surface-solvent interactions are found to be solvent- and face-specific. Following this analysis, a computational screening procedure is proposed for identifying solvents which can significantly affect the relative growth rate of the crystal facets and, hence, the growth morphology of the crystal. To gain an in-depth understanding of the role of the solvent in the ease of association/dissociation of solute molecules at the crystal surface, a study of the formation of a vacancy on the morphologically dominant crystal faces of ibuprofen is carried out. The thermodynamics of the process reveals a distinct solvent dependency for several faces, indicating in such cases desolvation-dominated defect formation.
The research presented in this dissertation contributes to the development of general and computationally affordable workflows for obtaining a comprehensive and quantitative understanding of the molecular processes that affect the solid-liquid interface, which will support the formulation of detailed mesoscopic growth and dissolution models.
Ilaria Gimondi (2020)

A molecular modelling journey from packing to conformational polymorphism
The efficient and reproducible crystallisation of a polymorph showing the desired properties and functionalities is crucial in a variety of fields, such as the pharmaceutical sector. Characterising the thermodynamics and mechanisms of polymorphic transitions at the molecular level is thus a key step towards developing a rational design of crystallisation processes and products. Despite its relevance, a systematic computational analysis of polymorphism and polymorphic transitions still represents a major challenge. In this thesis, metadynamics is employed in combination with state-of-the-art techniques, such as committor analysis and Markov State Models, to provide insight into polymorphism in molecular systems. The first part of the work focuses on packing polymorphism. The investigation of the transition between phases I and III in bulk carbon dioxide aims at testing a set of computational tools able to characterise in detail the thermodynamics and mechanism of polymorphic transitions. This set-up is then applied and further developed for the study of CO2 confined in cylindrical nanopores, unveiling a complex landscape of ordered structures, inaccessible in unconfined conditions. Next, the serendipitous and irreproducible discovery of a new polymorph of succinic acid, γ, provides a challenging context to tackle the study of conformational polymorphism. Form γ presents folded conformers in its unit cell, while the other known polymorphs show planar molecules. From molecular dynamics and metadynamics, γ appears labile and metastable, a characteristic that might hinder its crystallisation. The study of the conformational behaviour of succinic acid in water reveals fast interconversions within a network of nine conformers, both folded and planar, among which the folded conformation observed in γ is the most thermodynamically stable. The high flexibility of this molecule is relevant in determining the nucleation mechanism.
Simulations of supersaturated solutions and of crystal seed dissolution suggest that nucleation cannot be classical, but is more likely to be a multi-step process.