Genetically identical cells from the same clonal population may exhibit variations in their gene expression patterns, even when cultivated simultaneously under similar conditions. This phenomenon can be observed as variability in protein expression within an immunofluorescence (IF) image and is termed single-cell variability (SCV). The scale and significance of this variation remain poorly understood, and it can originate from either stochastic fluctuations or deterministic influences
(Snijder B et al, 2011;
Kilfoil ML et al, 2009;
Ansel J et al, 2008;
Colman-Lerner A et al, 2005). Interestingly, as many as 16% (n=1896) of all human proteins localized in the Cell Atlas show single-cell variation in their expression. Functional enrichment analysis of these proteins reveals an enrichment for terms related to cell cycle processes and the response to extracellular stimuli.
So far, our ongoing research has classified the variation as cell cycle dependent for 217 proteins.
1896 proteins show cell-to-cell variation in their expression patterns. Of these, 1671 proteins show variation in expression level (intensity), and 222 proteins show single-cell spatial variation.
217 proteins show a variation correlated to cell cycle progression.
Figure 1. Examples of proteins showing single-cell variation. FAM63B is a hydrolase that plays a role in ubiquitination (detected in U-2 OS cells). KLHDC8B aids protein-protein interactions (detected in U-2 OS cells). CNN1 is implicated in the regulation and modulation of smooth muscle contraction (detected in BJ). PPM1K regulates the mitochondrial permeability transition pore (detected in U-2 OS cells). MKI67 is required to maintain individual mitotic chromosomes dispersed in the cytoplasm following nuclear envelope disassembly (detected in A-431 cells). LGALS1 plays a role in regulating apoptosis, cell proliferation and cell differentiation (detected in U-2 OS cells). IFIT1 may inhibit viral replication and translational initiation (detected in HeLa cells). PDX1 is a transcriptional activator (detected in HeLa cells). KRT17 encodes the type I intermediate filament chain keratin 17 (detected in U-2 OS cells).
Single-cell variation in the Cell Atlas
In the IF confocal images, single-cell variation can easily be observed, either as different expression levels (staining intensity) or as different spatial distributions, as exemplified in Figure 1. Out of the 1896 proteins displaying single-cell variation, 1671 proteins show variation in expression level and 222 show variation in spatial distribution. Single-cell variation is most commonly observed for proteins localized to the nucleus, cytosol, nucleoli and mitochondria (Figure 2).
Figure 2. Distribution of the genes encoding proteins showing single-cell variation across the different organelles, grouped by meta-compartments.
It is hypothesized that single-cell variation has an underlying functional importance; however, the effect of this variation remains largely uncharacterized
(Dueck H et al, 2016). Factors such as environmental changes, stochasticity, cell cycle progression, DNA damage response, post-translational modifications and activation/suppression signals are all known to cause changes in protein expression within a cellular population
(Alberts B et al, 2002a;
Liberali P et al, 2015;
Elowitz MB et al, 2002;
Kaern M et al, 2005). Proteins with single-cell varying expression alter the characteristics of isogenic cells and provide them with a specific fingerprint. Identification of all human proteins that display single-cell variation provides a starting point for studies that aim to investigate the driving forces of the expression dynamics and to provide a functional understanding.
Gene Ontology (GO)-based enrichment analysis of genes encoding proteins with single-cell variable expression patterns reveals several functions associated with cell cycle progression and cellular response to various extracellular stimuli (Figure 3). The enriched terms for the GO domain Biological Process are related to post-translational modifications, processes involved in cell cycle progression and cellular response to various extracellular stimuli (Figure 3a). Enrichment analysis of the GO domain Molecular Function provides top hits for terms related to snoRNA binding, Cdks, and transcriptional activators and factors (Figure 3b). The enrichment of cell cycle related terms further supports the hypothesis that a large share of the observed single-cell variation may be correlated to cell cycle progression. This underlines the importance of distinguishing between variation due to the cell cycle and variation caused by other factors.
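The text does not state which statistical test underlies the enrichment analysis. As a minimal sketch, assuming the commonly used one-sided hypergeometric test, the p-value for a single GO term can be computed as below. The function name `hypergeom_enrichment_p` and all numbers in the example are hypothetical toy values, not results from the Cell Atlas.

```python
from math import comb


def hypergeom_enrichment_p(population, annotated, sample, hits):
    """One-sided hypergeometric p-value for over-representation of a GO term.

    population: number of genes in the background set
    annotated:  number of background genes carrying the GO term
    sample:     number of genes in the list of interest
    hits:       number of genes in the list that carry the GO term

    Returns P(X >= hits) when `sample` genes are drawn without replacement.
    """
    upper = min(annotated, sample)
    total = comb(population, sample)
    # Sum the probability mass of observing `hits` or more annotated genes.
    tail = sum(
        comb(annotated, k) * comb(population - annotated, sample - k)
        for k in range(hits, upper + 1)
    )
    return tail / total


# Toy example (assumed numbers): 10 background genes, 4 of them annotated,
# a list of 3 genes containing 2 annotated ones.
p_toy = hypergeom_enrichment_p(population=10, annotated=4, sample=3, hits=2)
# (36 + 4) / 120 = 1/3
```

In a real analysis the p-values for all tested terms would additionally need a multiple-testing correction, which this sketch leaves out.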
Figure 3a. Gene Ontology-based enrichment analysis for the cell cycle dependent proteome showing the significantly enriched terms for the GO domain Biological Process.
Figure 3b. Gene Ontology-based enrichment analysis for the cell cycle dependent proteome showing the significantly enriched terms for the GO domain Molecular Function.
Towards characterizing the cell cycle proteomes
The human body is estimated to contain approximately 37 trillion cells
(Bianconi E et al, 2013). Every second, it needs to reproduce many millions of cells to replace the cells that die. Consequently, cells constantly undergo duplication via the cell cycle, a highly conserved series of events that ultimately leads to division into two daughter cells. A complex network of regulatory proteins called the "cell cycle control system" keeps the cell cycle tightly controlled and responsive to various intracellular and extracellular signals. Proteins called cyclins, such as
CCNA2, CCND2, CCNE1, constitute the core machinery for controlling and driving cell cycle progression together with their catalytic partners cyclin-dependent kinases, such as
CDK1, CDK2, CDK4
(Malumbres M, 2014;
Alberts B et al, 2002b). Aberrations in this control system can lead to proliferative diseases such as cancer
(Collins K et al, 1997;
Zhivotovsky B et al, 2010).
The cell cycle consists of four main phases: gap 1 (G1), synthesis (S), gap 2 (G2), and mitosis (M). G1, S, and G2 are together called interphase, during which the cell grows, and are followed by the M phase, when cell division occurs. Depending on extracellular signals, the cell may enter a rest phase called G0 instead of proceeding through G1. The cell can remain in G0 phase for years or even permanently until the organism dies. During the G1 phase, the cell increases its mass of proteins and initiates the synthesis of D-type cyclins, such as
CCND1 and CCND2, which bind to
CDK4 or CDK6. This activated complex drives the G1/S phase transition. Thereafter, S phase is initiated, in which DNA replication occurs. Activation of S-Cdks, such as
CDK2, triggers the assembly of proteins needed to unwind the DNA helix and recruit DNA polymerases and other replication enzymes onto the DNA strands. The G2 phase follows the successful completion of S phase; here the cell continues to grow and many proteins are synthesized in preparation for mitosis. After checking for and repairing DNA damage, the cell enters mitosis, the shortest yet most crucial phase of cell division. An increase in mitotic cyclins, such as
CCNB1, activates the mitotic Cdks, such as
CDK1. Its activation triggers various cell rearrangements including chromosome condensation, nuclear envelope breakdown, and mitotic spindle assembly. Thereafter, the chromosomes align midway between the mitotic spindle poles before segregating and finally, the cell divides into two separate daughter cells
(Alberts B et al, 2007).
The cell cycle control system is well conserved through evolution, which makes it possible to study cell cycle regulation in a variety of organisms and model systems. The most commonly studied organisms are yeast, animal embryos and mammalian cell cultures. Genome-wide studies using DNA microarray technology have revealed between 400 and 800 periodically expressed genes in S. cerevisiae
(Cho RJ et al, 1998;
Spellman PT et al, 1998;
Orlando DA et al, 2008;
Rustici G et al, 2004). Recent investigations in mammalian cells show that approximately 700 genes display transcriptional fluctuations with a periodicity consistent with the cell cycle in primary human fibroblasts
(Cho RJ et al, 2001), and that >850 genes are periodically expressed during the cell cycle in synchronized HeLa cells, as measured with cDNA microarrays
(Whitfield ML et al, 2002).
In our research, two approaches are used to investigate whether the observed single-cell variation is correlated with the cell cycle. In the first approach, selected proteins with an observed single-cell variation are stained in the U-2 OS FUCCI (Fluorescence Ubiquitination Cell Cycle Indicator) cell line. FUCCI cells express two different fluorescent proteins, each fused to a different cell cycle regulator, which allows cell cycle monitoring:
CDT1, expressed in G1 phase, and Geminin
(GMNN), expressed in S and G2 phases
(Sakaue-Sawano A et al, 2008). Using this approach we confirmed a cell cycle-dependent expression of 64 proteins as exemplified in Figure 4.
Figure 4. Staining of proteins in U-2 OS FUCCI cells to characterize the cell cycle dependency of the protein expression pattern. The FUCCI cells express the cell cycle regulators Cdt1 (red, G1 phase) and Geminin (green, S and G2 phases). When both proteins are present, the overlay of the images appears yellow, marking the G1/S transition (top row). The bottom row shows staining of
ANLN, which is required for cytokinesis,
ADA, which is an enzyme that catalyzes the hydrolysis of adenosine to inosine, and
TPX2, which is required for normal assembly of mitotic spindles.
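The FUCCI readout described above can be turned into a coarse phase label per cell. The following is only an illustrative sketch: it assumes that the red (Cdt1) and green (Geminin) intensities have already been measured per cell and normalized to the range 0 to 1, and the function name `fucci_phase` and the threshold of 0.5 are hypothetical choices, not values used in the Cell Atlas.

```python
def fucci_phase(cdt1_red, geminin_green, threshold=0.5):
    """Assign a coarse cell cycle label from two normalized FUCCI intensities.

    cdt1_red:      normalized Cdt1 (red) intensity, assumed to lie in [0, 1]
    geminin_green: normalized Geminin (green) intensity, assumed to lie in [0, 1]
    threshold:     hypothetical cut-off above which a marker counts as present
    """
    red_on = cdt1_red >= threshold
    green_on = geminin_green >= threshold
    if red_on and green_on:
        # Both markers present: the overlay appears yellow.
        return "G1/S transition"
    if red_on:
        return "G1"
    if green_on:
        return "S/G2"
    # Neither marker is above the threshold; no label is assigned.
    return "unassigned"


# Toy intensities for three imaginary cells.
labels = [fucci_phase(0.9, 0.1), fucci_phase(0.8, 0.7), fucci_phase(0.1, 0.9)]
```

A hard threshold is the simplest possible rule; the continuous model described in the second approach avoids such discrete bins.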
In the second approach, the cell cycle position was inferred from the DAPI and microtubule features of each cell using a continuous time computational regression model trained on the expression of Cyclin B1
(CCNB1). Protein expression over the resulting pseudo time series for each protein was then fit with a periodic regressive function to capture the periodic nature of protein expression over the cell cycle. The significance of this periodic fit between single-cell variation and cell cycle position was assessed using a permutation test for each assay and fits with significance p<0.05 were selected for publication on the Cell Atlas (Figure 5). This approach allowed us to confirm and characterize the continuous cell cycle-dependent expression of 18 proteins to date.
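The exact periodic function and test statistic are not given in the text. As a hedged sketch of the general idea, the example below swaps in a simple stand-in: the amplitude of the first Fourier harmonic (one sinusoidal cycle per cell cycle) as the measure of periodicity, with a permutation test that shuffles expression values across cells. Pseudotime is assumed to be scaled to the interval [0, 1), and the function names and synthetic data are hypothetical.

```python
import math
import random


def first_harmonic_amplitude(pseudotime, expression):
    """Amplitude of the one-cycle sinusoidal component of expression.

    pseudotime: per-cell position in the cell cycle, assumed scaled to [0, 1)
    expression: per-cell expression values, same length as pseudotime
    """
    n = len(expression)
    mean = sum(expression) / n
    cos_sum = sum((y - mean) * math.cos(2 * math.pi * t)
                  for t, y in zip(pseudotime, expression))
    sin_sum = sum((y - mean) * math.sin(2 * math.pi * t)
                  for t, y in zip(pseudotime, expression))
    return 2.0 * math.hypot(cos_sum, sin_sum) / n


def permutation_p_value(pseudotime, expression, n_permutations=1000, seed=0):
    """Fraction of shuffles whose amplitude is at least the observed one."""
    rng = random.Random(seed)
    observed = first_harmonic_amplitude(pseudotime, expression)
    shuffled = list(expression)
    at_least_as_large = 0
    for _ in range(n_permutations):
        # Shuffling breaks any link between expression and cell cycle position.
        rng.shuffle(shuffled)
        if first_harmonic_amplitude(pseudotime, shuffled) >= observed:
            at_least_as_large += 1
    # Add-one correction so the estimated p-value is never exactly zero.
    return (at_least_as_large + 1) / (n_permutations + 1)


# Synthetic, noise-free example: 100 cells evenly spread over the cycle.
t_demo = [i / 100 for i in range(100)]
y_demo = [0.5 + 0.4 * math.sin(2 * math.pi * t) for t in t_demo]
amplitude_demo = first_harmonic_amplitude(t_demo, y_demo)
p_demo = permutation_p_value(t_demo, y_demo, n_permutations=200)
```

Under this scheme a protein would be called cell cycle dependent when the permutation p-value falls below the chosen cut-off, which the text gives as p<0.05.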
Figure 5. Expression profile of ANLN throughout the cell cycle (see Figure 4 for example image). The time axis was generated based on time-series tracing of U-2 OS cells under standard growth conditions and the y-axis represents the expression relative to the maximal per-cell expression for an experiment.
In addition to these metrics, 144 proteins were categorized as having a cell cycle dependent expression as they localize to structures only present at certain time points of the cell cycle. In the Cell Atlas we define these to be: the
cytokinetic bridge, midbody, midbody ring and
Investigating the extent of cell cycle dependency for the proteins exhibiting single-cell variation in their expression patterns is a work in progress. To date, our analysis confirms that the expression of 217 of the single-cell variable proteins correlates with cell cycle position (see examples, Figure 6).
Figure 6. Example images of cell cycle dependent protein expression validated with at least one of the approaches described or by biological definition.
CCNB1 is essential for the control of the cell cycle at the G2/M transition (detected in U-2 OS cells).
TOP2A is involved in processes such as chromosome condensation or chromatid separation (detected in U-2 OS cells).
CDCA2 is a regulator of chromosome structure during mitosis (detected in U-2 OS cells).
CCNB2 is a member of the cyclin B family (detected in U-2 OS cells).
FAM64A may play a role in the control of metaphase-to-anaphase transition (detected in U-2 OS cells).
NCOA1 acts as a transcriptional coactivator (detected in U-2 OS cells).
CCSAP is involved in the maintenance of
NUMA1 at the spindle poles (detected in U-2 OS cells).
CKAP2L is required for mitotic spindle formation (detected in A-431 cells).
CTTNBP2 regulates the dendritic spine distribution of cortactin (detected in HeLa cells).
Cho RJ et al, 1998. A genome-wide transcriptional analysis of the mitotic cell cycle. Mol Cell.
Cho RJ et al, 2001. Transcriptional regulation and function during the human cell cycle. Nat Genet.
PubMed: 11137997 DOI: 10.1038/83751
Collins K et al, 1997. The cell cycle and cancer. Proc Natl Acad Sci U S A.
Colman-Lerner A et al, 2005. Regulated cell-to-cell variation in a cell-fate decision system. Nature.
PubMed: 16170311 DOI: 10.1038/nature03998
Dueck H et al, 2016. Variation is function: Are single cell differences functionally important?: Testing the hypothesis that single cell variation is required for aggregate function. Bioessays.
PubMed: 26625861 DOI: 10.1002/bies.201500124