
Data-driven turbulence modelling using Gene Expression Programming

Renzhi Tian, Richard Dwight, and Stefan Hickel, Aerodynamics Group, Faculty of Aerospace, Delft University of Technology

Gene Expression Programming (GEP) is an established Evolutionary Algorithm (EA) that searches for algebraic expressions regressing a data set [1]. GEP has developed rapidly over recent decades and has been applied successfully to classification, regression, and time-series prediction problems. Its main benefit is that it produces explicit algebraic expressions that are concise and interpretable, in contrast to neural networks and other highly parameterized models.
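As an illustration of how a GEP gene encodes an explicit expression, the following minimal sketch decodes a gene written in Karva notation (the breadth-first encoding described in [1]) into an expression tree and evaluates it. The function set, the terminal names `a` and `b`, and the example gene are assumptions chosen for illustration; they are not taken from the platform described here.

```python
import operator

# Illustrative function set: symbol -> (arity, implementation).
# Any symbol not listed here is treated as a terminal (arity 0).
FUNCTIONS = {
    "+": (2, operator.add),
    "-": (2, operator.sub),
    "*": (2, operator.mul),
}


def decode_karva(gene):
    """Decode a K-expression breadth-first into a [symbol, children] tree."""
    symbols = list(gene)
    root = [symbols[0], []]
    queue = [root]
    pos = 1  # index of the next unread symbol
    while queue:
        node = queue.pop(0)
        arity = FUNCTIONS[node[0]][0] if node[0] in FUNCTIONS else 0
        for _ in range(arity):
            child = [symbols[pos], []]
            pos += 1
            node[1].append(child)
            queue.append(child)
    return root  # symbols beyond `pos` are non-coding and simply ignored


def to_infix(node):
    """Render the tree as a readable infix string."""
    symbol, children = node
    if not children:
        return symbol
    left, right = (to_infix(child) for child in children)
    return f"({left} {symbol} {right})"


def evaluate(node, env):
    """Evaluate the tree given a dict mapping terminal names to values."""
    symbol, children = node
    if symbol in FUNCTIONS:
        func = FUNCTIONS[symbol][1]
        return func(*(evaluate(child, env) for child in children))
    return env[symbol]


# Head "+*b" (length 3) and tail "aaba" (length 3 * (2 - 1) + 1 = 4).
tree = decode_karva("+*baaba")
expression = to_infix(tree)
value = evaluate(tree, {"a": 3.0, "b": 2.0})
```

Because the tail contains only terminals, every gene of this fixed length decodes to a valid expression, which is what allows genetic operators to act on the string freely.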
Turbulent flows are ubiquitous in aerospace engineering applications, wind energy, and many other fields. The numerical prediction of turbulent motion is highly challenging because of the wide range of spatial and temporal scales involved. Efficiently predicting turbulence with Computational Fluid Dynamics (CFD) therefore requires models for the small scales. Traditional turbulence modelling consists of physically motivated models, with a few parameters, calibrated on a small number of flow cases. Increasingly, Machine Learning (ML) methods are used to enrich the model parameterization and fit large amounts of data. Since these models run inside a CFD code, alongside physical models for mass, momentum and energy transport, they must be efficient to evaluate, well-behaved in extrapolation, and interpretable, and they must not destabilize the simulation.
The potential of GEP for turbulence modelling was highlighted by Weatheritt et al. [2]. In the present study, we develop an efficient GEP platform for turbulence modelling. We first aim to reproduce the capability of Weatheritt et al., and then to improve the algorithms, in particular by reducing the number of code evaluations required to fit a model.
Currently, our algorithm is dimension-aware and can optimize model constants with a gradient-based method running inside GEP. We demonstrate it on a simple test case, the flow over periodic hills, in two settings: (i) an a priori setting, where the required model-corrective fields are known in advance [3] and must only be regressed in terms of the flow quantities; and (ii) an a posteriori setting, where the best model running in situ in the CFD code is found. In future work, we aim to use a multi-fidelity training approach, in which known corrective fields serve as a proxy for in situ model fitness, so that the full solver need not be run at every step.
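The idea of tuning model constants by a gradient-based method, while the evolutionary search fixes the expression structure, can be sketched as follows. The abstract does not state which gradient-based method is used, so this sketch assumes plain gradient descent on a mean-squared error; the candidate expression `c0*x + c1*x*x`, the synthetic data, the learning rate, and all function names are hypothetical.

```python
def model(consts, x):
    """Candidate expression with a fixed structure and two free constants."""
    c0, c1 = consts
    return c0 * x + c1 * x * x


def mse_and_gradient(consts, xs, ys):
    """Mean-squared error and its analytic gradient w.r.t. (c0, c1)."""
    n = len(xs)
    loss, g0, g1 = 0.0, 0.0, 0.0
    for x, y in zip(xs, ys):
        residual = model(consts, x) - y
        loss += residual * residual / n
        g0 += 2.0 * residual * x / n       # d(loss)/d(c0)
        g1 += 2.0 * residual * x * x / n   # d(loss)/d(c1)
    return loss, (g0, g1)


def fit_constants(xs, ys, start=(1.0, 1.0), lr=0.5, steps=5000):
    """Plain gradient descent on the constants; the structure stays fixed."""
    consts = start
    for _ in range(steps):
        _, (g0, g1) = mse_and_gradient(consts, xs, ys)
        consts = (consts[0] - lr * g0, consts[1] - lr * g1)
    return consts


# Synthetic target generated with c0 = 2.0 and c1 = -0.5 (assumed values).
xs = [i / 10 for i in range(11)]
ys = [2.0 * x - 0.5 * x * x for x in xs]

fitted = fit_constants(xs, ys)
final_loss, _ = mse_and_gradient(fitted, xs, ys)
```

In this arrangement the evolutionary search only has to discover the structure of the expression, and each candidate is scored after its constants have been tuned, so fitness reflects the best version of that structure rather than a randomly drawn set of constants.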

[1] C. Ferreira. Gene Expression Programming: Mathematical modelling by an artificial intelligence. Springer, 2006.

[2] J. Weatheritt and R. Sandberg. A novel evolutionary algorithm applied to algebraic modifications of the RANS stress–strain relationship. Journal of Computational Physics, 325 (2016): pp. 22-37.

[3] M. Schmelzer, R. P. Dwight, and P. Cinnella. Discovery of algebraic Reynolds-stress models using sparse symbolic regression. Flow, Turbulence and Combustion, 104 (2020): pp. 579-603.