Lyophilization Mechanistic Model
First-principles ODE simulator for freeze-drying lipid nanoparticle (LNP)/mRNA formulations. Nine coupled state variables across four process phases, with a batch driver that generates synthetic datasets for multivariate statistical process control (MSPC) and ML pipelines.
Context
Lyophilization is the process that stabilizes mRNA vaccines for storage and transport. Each experimental run occupies expensive equipment for hours, the parameter space is large, and failed runs destroy drug product. Synthetic data addresses this: it supports process understanding and the training of downstream ML models without consuming bench time.
This project is a mechanistic simulator grounded in heat transfer, mass transfer, and LNP degradation kinetics. It was developed at the University of Massachusetts Lowell, Department of Chemical Engineering.
Model
The core is a 9-state ODE system: product temperature (Tp), ice front position (z), chamber pressure (Pc), residual moisture (X), potency (N), cake resistance (Rp), encapsulation efficiency (EE), RNA integrity (RIN), and particle size (Dh). The first six states follow standard freeze-drying physics. The last three capture LNP-specific dynamics:
- Encapsulation efficiency: Arrhenius leakage kinetics modulated by moisture exposure, pH, and glass transition proximity
- RNA integrity: Coupled thermal and hydrolytic degradation pathways with moisture dependence
- Particle size: Aggregation driven by temperature history, encapsulation loss, and ionic environment
These are mechanistic relationships derived from the underlying chemistry, not empirical curve fits.
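
As an illustration of the encapsulation-efficiency bullet above, here is a minimal sketch of an Arrhenius rate law scaled by a moisture factor. It is not the simulator's actual equation: the linear moisture factor, the pre-exponential `a_leak`, the coefficient `k_moist`, and the function names are assumptions chosen for illustration, and the pH and glass-transition terms are omitted. Only `e_leak` (15 kJ/mol) comes from the Specifications table.

```python
import math

R_GAS = 8.314  # J/(mol*K), universal gas constant


def leak_rate(temp_k, moisture, a_leak=1.0e-2, e_leak=15_000.0, k_moist=5.0):
    """Hypothetical first-order leakage rate constant (1/s).

    Arrhenius temperature dependence multiplied by a linear moisture factor.
    a_leak and k_moist are illustrative placeholders, not fitted values.
    """
    arrhenius = a_leak * math.exp(-e_leak / (R_GAS * temp_k))
    moisture_factor = 1.0 + k_moist * moisture  # assumed functional form
    return arrhenius * moisture_factor


def dEE_dt(ee, temp_k, moisture):
    """Right-hand side for encapsulation efficiency: first-order loss."""
    return -leak_rate(temp_k, moisture) * ee
```

Under these assumptions the leakage rate rises with both temperature and residual moisture, so EE loss is fastest while the product is still warm and wet.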
Specifications
| Parameter | Value |
|---|---|
| State vector | [Tp, z, Pc, X, N, Rp, EE, RIN, Dh] — 9 coupled ODEs |
| Solver | SciPy solve_ivp, BDF method, rtol = 1e-6 |
| Process phases | Freeze, Anneal, Primary Dry, Secondary Dry (automatic transition detection) |
| LNP kinetics | Arrhenius leakage (E_leak = 15 kJ/mol), thermal + hydrolytic RNA degradation (E_deg = 15 kJ/mol), ionic-coupled aggregation |
| Configuration | Single config.yaml, strict runtime validation, no default substitution |
| Output formats | CSV, Parquet (MSPC and ML pipeline ready) |
| CQA predictions | Residual moisture, EE, RIN, particle size (Z50), PDI, reconstitution time, visual cake |
| Noise model | Configurable sensor noise, vial-to-vial parameter jittering |
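
The solver settings and automatic transition detection listed in the table can be sketched with `solve_ivp` event functions. This is a toy two-state example, not the 9-state model: the right-hand side, the rate constants, and the choice of "ice fraction reaches zero" as the end of primary drying are assumptions made for illustration. Only the BDF method and `rtol = 1e-6` are taken from the table.

```python
from scipy.integrate import solve_ivp


def primary_dry_rhs(t, y, k_sub=1.0e-3, k_heat=5.0e-3, t_shelf=263.15):
    """Toy right-hand side with y = [Tp (K), ice fraction (-)]."""
    tp, ice = y
    dtp = k_heat * (t_shelf - tp)  # product relaxes toward shelf temperature
    dice = -k_sub                  # constant sublimation rate (placeholder)
    return [dtp, dice]


def ice_depleted(t, y):
    """Event function: crosses zero when the ice fraction reaches zero."""
    return y[1]


ice_depleted.terminal = True   # stop integrating at the transition
ice_depleted.direction = -1    # trigger only on a downward crossing

sol = solve_ivp(
    primary_dry_rhs,
    (0.0, 5000.0),
    [233.15, 1.0],
    method="BDF",
    rtol=1e-6,
    events=ice_depleted,
)

# Time at which the hypothetical primary-to-secondary transition fires.
t_transition = sol.t_events[0][0]
```

In a multi-phase driver, the state at `t_transition` would become the initial condition for the next phase's right-hand side.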
Synthetic Data Generation
A batch driver wraps the ODE kernel with recipe parsing from Excel templates (matching the format labs already use), vial-to-vial parameter jittering, and configurable sensor noise. Output is structured CSV/Parquet, ready for MSPC or ML pipelines.
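
The jittering and sensor-noise steps described above might look like the following sketch. The Gaussian distributions, the parameter names `Kv` and `Rp0`, and all numeric values are hypothetical; the project's actual noise model is set through its configuration and is not reproduced here.

```python
import random


def jitter_parameters(nominal, rel_sd, rng):
    """Draw one vial's parameters by scaling each nominal value
    with Gaussian relative noise of standard deviation rel_sd."""
    return {
        name: value * (1.0 + rng.gauss(0.0, rel_sd))
        for name, value in nominal.items()
    }


def add_sensor_noise(trace, sd, rng):
    """Add zero-mean Gaussian noise (absolute sd) to a simulated trace."""
    return [x + rng.gauss(0.0, sd) for x in trace]


rng = random.Random(42)              # seeded so a batch is reproducible
nominal = {"Kv": 20.0, "Rp0": 1.5}   # hypothetical parameter names and values
vials = [jitter_parameters(nominal, 0.05, rng) for _ in range(3)]

clean_trace = [233.15, 240.0, 250.0]  # placeholder product temperatures (K)
noisy_trace = add_sensor_noise(clean_trace, 0.2, rng)
```

Passing an explicit seeded generator keeps every synthetic batch repeatable, which matters when the output feeds model training.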
All parameters resolve from a single config.yaml. Missing parameters raise explicit errors at startup instead of falling back to defaults. This was a deliberate choice after earlier versions had inconsistent defaults across modules (e.g., E_leak defaulting to 40,000 J/mol in one file and 45,000 J/mol in another).
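
A minimal sketch of this fail-fast behavior follows, assuming the parsed configuration is available as a plain mapping (for example, from a YAML parser). The key names in `REQUIRED_KEYS`, the exception class, and `validate_config` are hypothetical and do not reflect the project's actual schema.

```python
REQUIRED_KEYS = ("E_leak", "E_deg", "rtol")  # hypothetical key names


class MissingParameterError(KeyError):
    """Raised when a required configuration key is absent."""


def validate_config(config):
    """Return a copy of config, or raise if any required key is missing.

    No defaults are substituted: an absent key is always an error.
    """
    missing = [key for key in REQUIRED_KEYS if key not in config]
    if missing:
        raise MissingParameterError(
            "config.yaml is missing required keys: " + ", ".join(missing)
        )
    return dict(config)
```

Calling `validate_config({"E_leak": 15000.0})` raises an error that names `E_deg` and `rtol`, so a silent fallback to module-level defaults cannot occur.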
CQA Correlation Analysis
The system maps process-quality relationships across the full parameter space: EE loss vs thermal exposure, RIN degradation vs cumulative moisture damage, particle aggregation drivers (pH, ionic strength, lipid ratio), and per-phase contribution to final CQA outcomes.