Summary: | Most stochastic gene expression models in the literature do not explicitly describe the age of a cell within a generation, and hence they cannot capture events such as cell division and DNA replication. Instead, many models incorporate the cell cycle implicitly by assuming that dilution due to cell division can be described by an effective decay reaction with first-order kinetics. If it is further assumed that protein production occurs in bursts, then the stationary protein distribution is negative binomial. Here we seek to understand how accurate these implicit models are when compared with more detailed models of stochastic gene expression. We derive the exact stationary solution of the chemical master equation describing bursty protein dynamics, binomial partitioning at mitosis, age-dependent transcription dynamics including replication, and random interdivision times sampled from Erlang or more general distributions; the solution differs between single-lineage and population-snapshot settings. We show that protein distributions are well approximated by the solution of implicit models (a negative binomial) when the mean number of mRNAs produced per cycle is low and the cell cycle length variability is large. When these conditions are not met, the distributions are either almost bimodal or display very flat regions near the mode, and they cannot be described by implicit models. We also show that for genes with low transcription rates, the size of protein noise depends strongly on the replication time, is almost independent of cell cycle variability for lineage measurements, and increases with cell cycle variability for population snapshot measurements. In contrast, for large transcription rates, the size of protein noise is independent of replication time and increases with cell cycle variability for both lineage and population measurements.
|
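As a rough illustration of the implicit model mentioned in the summary, the sketch below evaluates the negative binomial distribution that arises from bursty production combined with an effective first-order decay. It is not the exact solution derived in the paper. The parameter names (`burst_freq`, `mean_burst`, `dilution_rate`) and the function names are assumptions introduced here for illustration, and the sketch assumes the standard textbook setting of geometrically distributed burst sizes.

```python
import math


def implicit_model_pmf(n, burst_freq, mean_burst, dilution_rate):
    """Negative binomial P(n) for the implicit (effective decay) model.

    Assumed, hypothetical parameterisation:
      burst_freq    -- bursts per unit time
      mean_burst    -- mean proteins per burst (geometric burst sizes), > 0
      dilution_rate -- effective first-order decay rate standing in for division
    """
    r = burst_freq / dilution_rate        # shape: bursts per protein lifetime
    q = mean_burst / (1.0 + mean_burst)   # "success" probability per protein
    # Work in log space so that large n does not overflow the gamma function.
    log_p = (
        math.lgamma(n + r) - math.lgamma(r) - math.lgamma(n + 1)
        + n * math.log(q) + r * math.log(1.0 - q)
    )
    return math.exp(log_p)


def moments(burst_freq, mean_burst, dilution_rate, n_max=2000):
    """Return (total probability, mean, variance) from a truncated sum."""
    probs = [
        implicit_model_pmf(n, burst_freq, mean_burst, dilution_rate)
        for n in range(n_max + 1)
    ]
    total = sum(probs)
    mean = sum(n * p for n, p in enumerate(probs))
    var = sum((n - mean) ** 2 * p for n, p in enumerate(probs))
    return total, mean, var
```

Under these assumptions the mean is `burst_freq * mean_burst / dilution_rate` and the Fano factor (variance over mean) is `1 + mean_burst`. This distribution is always unimodal, which is consistent with the summary's point that implicit models cannot describe the almost bimodal or flat-topped distributions.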