Non-Linear Models – Biology Is Never Straight

Biological systems are beautifully complex — and this complexity often shows up in data that doesn’t follow a straight line. While linear models are simple and powerful, they can’t always capture the richness of real biological processes. That’s where non-linear models, including non-linear regression, come into play.

This article will guide you through what non-linear modeling means, how non-linear regression works, and how such models are used in biostatistics, genetics, and machine learning, with pointers to tools in R and Python. By the end, you’ll understand how to analyze curves, not just lines.

Why Biology Needs Non-Linear Models

Imagine you’re studying how a plant’s height changes with light intensity. At first, more light means more growth. But eventually, the plant reaches a maximum height, no matter how much light you add. That’s not a straight-line relationship — it’s a curve that levels off.

This is exactly the kind of pattern non-linear models are designed to handle. These models allow us to describe:

Saturation effects (e.g., enzyme kinetics)
Sigmoidal growth (e.g., tumor growth, population expansion)
Hormetic responses (e.g., low-dose stimulation, high-dose inhibition)
Threshold effects (e.g., gene expression activation)

What is Non-Linear Regression?

Non-linear regression is a type of regression analysis in which the relationship between the independent variables and the dependent variable is modeled by an equation that is non-linear in its parameters.

Unlike linear regression, which assumes the response changes at a constant rate, non-linear regression allows rates of change to vary.

One classic example in biology is enzyme kinetics, modeled by the Michaelis-Menten equation:

\[ V = \frac{V_{\max} \cdot [S]}{K_m + [S]} \]

V: Reaction rate (what we’re predicting)
[S]: Substrate concentration (input variable)
Vₘₐₓ: Maximum rate the enzyme can achieve
Kₘ: The substrate concentration at which the reaction rate is half of Vₘₐₓ (a key biological constant)

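This equation is non-linear because the parameter Kₘ sits in the denominator, so V cannot be written as a sum of parameters multiplied by fixed functions of [S]. The resulting curve rises steeply at low [S] and levels off toward Vₘₐₓ instead of following a straight line.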
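
To make the role of Kₘ concrete, here is a minimal Python sketch of the Michaelis-Menten equation. The function name `michaelis_menten` and the parameter values are illustrative assumptions, not measurements from a real enzyme.

```python
def michaelis_menten(s, v_max, k_m):
    """Reaction rate V at substrate concentration s."""
    return v_max * s / (k_m + s)


# Hypothetical parameter values chosen only for illustration.
V_MAX = 10.0  # maximum rate (arbitrary units)
K_M = 2.0     # substrate concentration giving half of V_MAX

for s in [0.5, 2.0, 8.0, 32.0, 128.0]:
    rate = michaelis_menten(s, V_MAX, K_M)
    print(f"[S] = {s:6.1f}  V = {rate:5.2f}")
```

At [S] = Kₘ the rate is exactly half of Vₘₐₓ, and each further increase in [S] adds less and less, which is the saturation pattern described earlier.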

How Is Non-Linear Regression Fitted?

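Non-linear regression fits a curve to your data using a mathematical model that is not a straight line. The parameters are estimated iteratively, usually by least squares, starting from initial guesses. In R or Python, you can estimate them from experimental data using tools like:

In R: nls() from the built-in stats package, or nlsLM() from minpack.lm, which is often less sensitive to starting values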
In Python: curve_fit from scipy.optimize

It helps biologists:

Estimate enzyme efficiency
Model population growth
Predict drug responses
Fit logistic curves to infection rates
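
As a rough sketch of what such a fit looks like in Python, the example below applies curve_fit from scipy.optimize to the Michaelis-Menten model. The data are synthetic: the "true" parameter values, the noise level, and the starting guesses are all made up for illustration, and with real measurements you would replace the simulated arrays with your own.

```python
import numpy as np
from scipy.optimize import curve_fit


def michaelis_menten(s, v_max, k_m):
    """Michaelis-Menten rate for substrate concentration s."""
    return v_max * s / (k_m + s)


# Synthetic data (assumption): simulate rates from known parameters plus noise.
rng = np.random.default_rng(seed=1)
substrate = np.array([0.25, 0.5, 1.0, 2.0, 4.0, 8.0, 16.0, 32.0])
true_v_max, true_k_m = 10.0, 2.0
rate = michaelis_menten(substrate, true_v_max, true_k_m)
rate = rate + rng.normal(0.0, 0.2, substrate.size)

# Starting guesses: the largest observed rate and a mid-range concentration.
p0 = [rate.max(), np.median(substrate)]

# curve_fit returns the estimates and their covariance matrix.
params, cov = curve_fit(michaelis_menten, substrate, rate, p0=p0)
v_max_hat, k_m_hat = params
std_err = np.sqrt(np.diag(cov))

print(f"V_max = {v_max_hat:.2f} (SE {std_err[0]:.2f})")
print(f"K_m   = {k_m_hat:.2f} (SE {std_err[1]:.2f})")
```

If the fit behaves well, the estimates should land close to the values used to simulate the data. Poor starting guesses are a common reason for a non-linear fit to fail, so it is worth plotting the data and choosing them by eye.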

Where Are Non-Linear Models Used in Biology?

You may be surprised to learn that non-linear models are everywhere in genetics and phenotype prediction:

1. Gene-Environment Interactions

Sometimes, the effect of a gene isn’t linear — it interacts in complex ways with environmental conditions like diet or stress.

2. Growth Curves

Plants, animals, and bacteria don’t grow at a constant rate. The logistic and Gompertz curves both describe S-shaped growth patterns; the logistic curve has the form:

\[ P(t) = \frac{K}{1 + e^{-r(t - t_0)}} \]

Where:

P(t) is population at time t
K is carrying capacity
r is growth rate
t_0 is the time of inflection
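
The logistic curve defined above can be evaluated directly. This is a minimal sketch with hypothetical values for K, r, and t_0, meant only to show how the three parameters shape the curve.

```python
import math


def logistic_growth(t, k, r, t0):
    """Population size at time t under logistic growth."""
    return k / (1.0 + math.exp(-r * (t - t0)))


# Hypothetical parameters: carrying capacity, growth rate, inflection time.
K, R, T0 = 1000.0, 0.8, 10.0

for t in range(0, 21, 5):
    print(f"t = {t:2d}  P(t) = {logistic_growth(t, K, R, T0):7.1f}")
```

At t = t_0 the population is exactly K / 2, growth is slow at both ends, and the curve approaches K without ever exceeding it.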

3. Neural Networks and Machine Learning

In machine learning, non-linear activation functions like sigmoid or ReLU are crucial for capturing complexity in biological data. These models allow for detecting patterns in:

Gene expression profiles
Phenotypic traits
Imaging data (e.g., tumors)
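
Why the activation function matters can be shown with a toy example. In the sketch below, the weights are hypothetical and the "network" has a single input: stacking two linear layers still produces a straight line, while placing a ReLU between them lets the output bend.

```python
import math


def sigmoid(x):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def relu(x):
    """Pass positive values through and clip negatives to zero."""
    return max(0.0, x)


# Hypothetical weights for a tiny two-layer model with one input.
W1, B1 = 2.0, 0.0
W2, B2 = 3.0, 1.0


def linear_stack(x):
    """Two linear layers with no activation: still a straight line."""
    return W2 * (W1 * x + B1) + B2


def relu_stack(x):
    """The same layers with a ReLU in between: the output now bends."""
    return W2 * relu(W1 * x + B1) + B2


for x in [-1.0, 0.0, 1.0]:
    print(x, linear_stack(x), relu_stack(x), round(sigmoid(x), 3))
```

For the linear stack, the output at 0 is the average of the outputs at -1 and 1, as it must be for any straight line. The ReLU version breaks that rule, which is what allows deeper networks to represent curved relationships.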

4. Quantitative Trait Loci (QTL) Mapping

In genetics, traits are influenced by many genes and the relationship is often non-linear. Models like Gaussian processes or Bayesian networks help map complex genotype-to-phenotype relationships.

In QTL mapping, we try to relate genotype information (like SNPs) to quantitative traits (like plant height or animal weight). The effect may be non-linear, for example, when the effect saturates or follows a sigmoid curve due to biological thresholds.

Gaussian and Bayesian Models in Biology

When analyzing complex biological traits like height, growth rate, or disease resistance, data is rarely perfect or predictable. Biological systems are influenced by many small factors, random variation, and uncertainty. To understand such systems, scientists often use Gaussian and Bayesian models. These approaches help model variation, estimate unknowns, and make informed predictions even with incomplete data.

What Is a Gaussian Model?

A Gaussian model, also known as a normal distribution model, assumes that data follows a bell-shaped curve. This curve appears often in nature. For example, if you measure the height of hundreds of plants or animals, the results usually cluster around a central average, with fewer individuals being extremely tall or short.

This type of model is defined by two key parameters:

μ (mu), the mean or average
σ (sigma), the standard deviation or spread

The formula for the Gaussian distribution is:

\[ f(x) = \frac{1}{\sigma \sqrt{2\pi}} \, e^{ -\frac{(x - \mu)^2}{2\sigma^2} } \]
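
The density formula can be checked numerically. The sketch below codes it by hand and compares it with NormalDist from Python's standard statistics module; the mean and standard deviation are hypothetical plant heights in centimeters.

```python
import math
from statistics import NormalDist


def gaussian_pdf(x, mu, sigma):
    """Gaussian density written out term by term from the formula."""
    coefficient = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    exponent = -((x - mu) ** 2) / (2.0 * sigma ** 2)
    return coefficient * math.exp(exponent)


# Hypothetical trait distribution: mean 150 cm, standard deviation 10 cm.
MU, SIGMA = 150.0, 10.0
heights = NormalDist(MU, SIGMA)

for x in [130.0, 150.0, 160.0]:
    print(x, gaussian_pdf(x, MU, SIGMA), heights.pdf(x))

# Share of individuals expected within one standard deviation of the mean.
within_one_sd = heights.cdf(MU + SIGMA) - heights.cdf(MU - SIGMA)
print(f"Share within one sigma of the mean: {within_one_sd:.3f}")
```

The hand-written function and the library agree, and roughly 68% of values fall within one standard deviation of the mean, which is the clustering around the average described above.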

In genetics, Gaussian models are foundational. The infinitesimal model, which assumes that a trait is influenced by many genes, each with a small additive effect, implies that trait values are approximately normally distributed. This leads to methods like:

G-BLUP (Genomic Best Linear Unbiased Prediction)
RR-BLUP (Ridge Regression BLUP)

Both are widely used in animal and plant breeding programs to predict traits such as milk production, weight gain, or disease resistance.
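
As its name suggests, RR-BLUP can be viewed as ridge regression on marker effects. The sketch below is a simplified illustration under several assumptions: genotypes are coded 0/1/2 and centered, the data are synthetic, and the variance components are treated as known. Real analyses estimate those variances (for example by REML) and usually rely on dedicated breeding software.

```python
import numpy as np

rng = np.random.default_rng(seed=42)
n_individuals, n_markers = 200, 50

# Synthetic genotypes coded 0/1/2, centered per marker (assumption).
genotypes = rng.integers(0, 3, size=(n_individuals, n_markers)).astype(float)
Z = genotypes - genotypes.mean(axis=0)

# Simulate many small additive marker effects plus environmental noise.
sigma_u, sigma_e = 0.5, 1.0
true_effects = rng.normal(0.0, sigma_u, n_markers)
phenotype = Z @ true_effects + rng.normal(0.0, sigma_e, n_individuals)
y = phenotype - phenotype.mean()

# Ridge penalty: ratio of residual variance to marker-effect variance.
lam = sigma_e**2 / sigma_u**2

# Solve (Z'Z + lambda * I) u = Z'y for the shrunken marker effects.
effects_hat = np.linalg.solve(Z.T @ Z + lam * np.eye(n_markers), Z.T @ y)

# Predicted genetic values and their in-sample correlation with the trait.
predicted = Z @ effects_hat
corr = np.corrcoef(predicted, y)[0, 1]
print(f"Correlation between predicted and observed: {corr:.2f}")
```

The penalty shrinks every marker effect toward zero, which mirrors the assumption of many genes with small effects. The correlation shown is computed on the same individuals used for fitting, so it overstates how well the model would predict new ones.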

What Is a Bayesian Model?

While Gaussian models assume data follows a specific shape, Bayesian models are built on the idea of updating your beliefs using both prior knowledge and new data. This is incredibly useful in biology, where we often have previous experiments, historical findings, or expert opinions before collecting new observations.

Bayesian modeling is based on Bayes’ theorem:

\[ \text{Posterior} = \frac{\text{Likelihood} \times \text{Prior}}{\text{Evidence}} \]

Prior: What we believe before seeing new data
Likelihood: How likely the observed data is under different assumptions
Posterior: The updated belief after considering the new evidence

For example, imagine you’re studying whether a new drug improves recovery in infected animals. If earlier studies suggested a slight effect and your new trial shows a larger one, the Bayesian approach weighs both sources by how precise they are, giving an estimate that typically lies between the two.
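
The drug example can be sketched with the simplest possible Bayesian update. The code below assumes a normal prior and a normally distributed trial estimate with a known standard error, which gives a closed-form posterior. All numbers are hypothetical, and `normal_posterior` is an illustrative helper, not a library function.

```python
def normal_posterior(prior_mean, prior_sd, data_mean, data_se):
    """Combine a normal prior with a normal estimate (known standard error)."""
    prior_precision = 1.0 / prior_sd ** 2
    data_precision = 1.0 / data_se ** 2
    post_precision = prior_precision + data_precision
    # The posterior mean is a precision-weighted average of prior and data.
    post_mean = (
        prior_precision * prior_mean + data_precision * data_mean
    ) / post_precision
    post_sd = post_precision ** -0.5
    return post_mean, post_sd


# Hypothetical numbers: days of recovery time saved by the drug.
# Earlier studies: about 1 day (SD 1). New trial: 3 days (standard error 1).
post_mean, post_sd = normal_posterior(1.0, 1.0, 3.0, 1.0)
print(f"Posterior effect: {post_mean:.2f} days (SD {post_sd:.2f})")
```

Because the prior and the trial are equally precise here, the posterior mean sits halfway between them at 2 days, and the posterior is narrower than either source alone. A more precise trial would pull the estimate closer to its own value.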

Non-Linear Models in Biology

Model Type	Use Case	R	Python
Logistic Growth	Population growth with limited resources (e.g., bacterial growth curve)	`nls`, `brms`	`scipy.optimize`, `PyMC`
Exponential Growth	Unlimited growth assumption; early stages of epidemics or cell division	`nls`, `drc`	`scipy.optimize`, `lmfit`
Michaelis-Menten	Enzyme kinetics (reaction rate vs. substrate concentration)	`nls`, `drc`	`lmfit`, `PyMC`
Hill Equation	Cooperative binding in biochemical systems (e.g., hemoglobin-oxygen binding)	`nls`, `brms`	`scipy.optimize`, `PyMC`
Gompertz Curve	Tumor growth, organ development, mortality modeling	`nls`, `brms`	`lmfit`, `scipy.optimize`
Sigmoid Function	Threshold effects in traits, gene expression, QTL effects	`brms`, `nls`, `mgcv`	`PyMC`, `scipy.optimize`, `tensorflow-probability`
Dose-Response Curve	Pharmacology, toxicology (e.g., LD50, EC50 analysis)	`drc`, `brms`, `drfit`, `bmd`	`lmfit`, `scipy.optimize`
Nonlinear Mixed Models	Repeated measures or grouped data with a nonlinear trend (e.g., animal weight growth)	`nlme`, `brms`, `lme4` (`nlmer`)	`PyMC`, Stan (via `cmdstanpy`)
Generalized Additive Models (GAMs)	Flexible non-linear relationships between variables (e.g., time series in ecology)	`mgcv`	`pyGAM`, `statsmodels`
Richards Curve	Generalized sigmoid; growth curves with varying shapes	`nls`, `drc`	Custom model in `lmfit` or `scipy.optimize`
Nonlinear Bayesian Regression	Any situation with prior knowledge and non-linear relationships	`brms`, `rstan`	`PyMC`, Stan (via `cmdstanpy`), `tensorflow-probability`
Saturating Hyperbola	Photosynthesis rate vs. light intensity; receptor-ligand binding	`nls`, `drc`	`lmfit`, `PyMC`
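
Several rows in this table are close relatives. As a small illustration, the Hill equation with a Hill coefficient of 1 has the same form as the Michaelis-Menten saturating hyperbola, while larger coefficients produce a sharper, more switch-like curve. The function name `hill` and the parameter values below are hypothetical.

```python
def hill(x, top, k_half, n):
    """Hill equation: response at concentration x.

    top    -- maximum response
    k_half -- concentration giving half of the maximum response
    n      -- Hill coefficient (steepness of the curve)
    """
    return top * x ** n / (k_half ** n + x ** n)


# Hypothetical values: maximum response 10, half-maximal concentration 2.
TOP, K_HALF = 10.0, 2.0

for x in [0.5, 1.0, 2.0, 4.0, 8.0]:
    hyperbola = hill(x, TOP, K_HALF, 1)    # same form as Michaelis-Menten
    cooperative = hill(x, TOP, K_HALF, 4)  # steeper, threshold-like curve
    print(f"x = {x:4.1f}  n=1: {hyperbola:5.2f}  n=4: {cooperative:5.2f}")
```

Both curves pass through half of the maximum at x = 2, but the n = 4 curve stays lower below that point and climbs faster above it, which is the threshold behavior associated with cooperative binding.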