THE HUMAN CELL


Genetically identical cells from the same clonal population may exhibit variations in their gene expression pattern, even when cultivated simultaneously under similar conditions. This phenomenon can be observed as variability in protein expression within an immunofluorescence (IF) image, denoted cell-to-cell variability (CCV). The scale and significance of cell-to-cell variation at the single cell level remain poorly understood, and its origin can be either stochastic fluctuations or deterministic influences (Snijder B et al, 2011; Kilfoil ML et al, 2009; Ansel J et al, 2008; Colman-Lerner A et al, 2005). Interestingly, as many as 16% (n=1903) of all human proteins localized in the Cell Atlas show cell-to-cell variation in their expression. Functional enrichment analysis of these proteins reveals an enrichment for terms related to cell cycle processes and response to extracellular stimuli. So far, our ongoing research has classified the variation as cell cycle dependent for 219 proteins.

  • 1903 proteins show cell-to-cell variation in their expression patterns. Of these, 1678 proteins show variation in expression level (intensity), and 222 proteins show cell-to-cell spatial variation.
  • 219 proteins show a variation correlated to cell cycle progression.

Figure 1 panels: FAM63B - U-2 OS; KLHDC8B - U-2 OS; CNN1 - BJ; PPM1K - U-2 OS; MKI67 - A-431; LGALS1 - U-2 OS; IFIT1 - HeLa; PDX1 - HeLa; KRT17 - U-2 OS.

Figure 1. Examples of proteins showing cell-to-cell variation. FAM63B is a hydrolase that plays a role in ubiquitination (detected in U-2 OS cells). KLHDC8B aids protein-protein interactions (detected in U-2 OS cells). CNN1 is implicated in the regulation and modulation of smooth muscle contraction (detected in BJ). PPM1K regulates the mitochondrial permeability transition pore (detected in U-2 OS cells). MKI67 is required to maintain individual mitotic chromosomes dispersed in the cytoplasm following nuclear envelope disassembly (detected in A-431 cells). LGALS1 plays a role in regulating apoptosis, cell proliferation and cell differentiation (detected in U-2 OS cells). IFIT1 may inhibit viral replication and translational initiation (detected in HeLa cells). PDX1 is a transcriptional activator (detected in HeLa cells). KRT17 encodes the type I intermediate filament chain keratin 17 (detected in U-2 OS cells).

Cell-to-cell variation in the Cell Atlas


In the IF confocal images, cell-to-cell variation can easily be observed, either as differences in expression level (staining intensity) or as differences in spatial distribution, as exemplified in Figure 1. Out of the 1903 proteins displaying cell-to-cell variation, 1678 proteins show variation in expression level and 222 show variation in spatial distribution. Cell-to-cell variation is most commonly observed for proteins localized to the nucleus, cytosol, nucleoli and mitochondria (Figure 2).

Figure 2. Distribution of the genes encoding proteins showing cell-to-cell variation across the different organelles, grouped by meta-compartments.
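As a rough illustration of variation in expression level, the sketch below computes a coefficient of variation (standard deviation divided by mean) over per-cell mean staining intensities. This is only one simple way to put a number on such variation. The text does not state how the Cell Atlas annotation is made, and the function name and the example intensities are hypothetical.

```python
from statistics import mean, pstdev


def intensity_cv(per_cell_intensities):
    """Coefficient of variation of per-cell mean staining intensities."""
    avg = mean(per_cell_intensities)
    if avg == 0:
        raise ValueError("mean intensity is zero; CV is undefined")
    return pstdev(per_cell_intensities) / avg


# Hypothetical per-cell mean intensities (arbitrary units), not Atlas data
uniform_field = [100, 102, 98, 101, 99]
variable_field = [20, 180, 35, 150, 15]

# A field with visible cell-to-cell variation should give the larger CV
print(intensity_cv(uniform_field), intensity_cv(variable_field))
```

A measure like this only captures intensity differences; variation in spatial distribution would need a different kind of descriptor.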

It is hypothesized that cell-to-cell variation has an underlying functional importance; however, the effects of this variation remain largely uncharacterized (Dueck H et al, 2016). Factors such as environmental changes, stochasticity, cell cycle progression, DNA damage response, post-translational modifications and activation/suppression signals are all known to cause changes in protein expression within a cellular population (Alberts B et al, 2002a; Liberali P et al, 2015; Elowitz MB et al, 2002; Kaern M et al, 2005). Proteins with cell-to-cell varying expression alter the characteristics of isogenic cells and provide them with a specific fingerprint. Identifying all human proteins that display cell-to-cell variation provides a starting point for studies that aim to investigate the driving forces of the expression dynamics and to provide a functional understanding.

Gene Ontology (GO)-based enrichment analysis of genes encoding proteins with cell-to-cell variable expression patterns reveals several functions associated with cell cycle progression and cellular response to various extracellular stimuli (Figure 3). The enriched terms for the GO domain Biological Process are related to post-translational modifications, processes involved in cell cycle progression and cellular response to various extracellular stimuli (Figure 3a). Enrichment analysis for the GO domain Molecular Function gives top hits for terms related to snoRNA binding, Cdks, and transcriptional activators and factors (Figure 3b). The enrichment of cell cycle related terms further supports the hypothesis that a large part of the observed cell-to-cell variation may be correlated with cell cycle progression. This underlines the importance of distinguishing between variation due to the cell cycle and variation caused by other factors.

Figure 3a. Gene Ontology-based enrichment analysis for the cell cycle dependent proteome showing the significantly enriched terms for the GO domain Biological Process.

Figure 3b. Gene Ontology-based enrichment analysis for the cell cycle dependent proteome showing the significantly enriched terms for the GO domain Molecular Function.
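The text does not say which statistical test underlies the enrichment analysis. A common choice for GO term over-representation is a one-sided hypergeometric test, sketched below under that assumption. The function name and the toy gene counts are hypothetical and are not taken from the Cell Atlas.

```python
from math import comb


def hypergeom_enrichment_p(study_hits, study_size, population_hits, population_size):
    """One-sided p-value P(X >= study_hits).

    X is the number of term-annotated genes seen when study_size genes are
    drawn without replacement from population_size genes, of which
    population_hits carry the GO term.
    """
    max_hits = min(study_size, population_hits)
    total = comb(population_size, study_size)
    tail = sum(
        comb(population_hits, k)
        * comb(population_size - population_hits, study_size - k)
        for k in range(study_hits, max_hits + 1)
    )
    return tail / total


# Toy numbers (assumed): 20 genes in total, 5 annotated with the term,
# 4 genes in the study set, 3 of which carry the term.
p = hypergeom_enrichment_p(study_hits=3, study_size=4,
                           population_hits=5, population_size=20)
print(p)
```

When many GO terms are tested at once, the resulting p-values would normally also be corrected for multiple testing before terms are called significantly enriched.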

Towards characterizing the cell cycle proteomes


The human body is estimated to contain approximately 37 trillion cells (Bianconi E et al, 2013). Every second, it needs to produce many millions of cells to replace the cells that die. Consequently, cells constantly undergo duplication via the cell cycle, a highly conserved series of events that ultimately leads to division into two daughter cells. A complex network of regulatory proteins called the "cell cycle control system" keeps the cell cycle tightly controlled and responsive to various intracellular and extracellular signals. Proteins called cyclins, such as CCNA2, CCND2 and CCNE1, constitute the core machinery for controlling and driving cell cycle progression together with their catalytic partners, the cyclin-dependent kinases, such as CDK1, CDK2 and CDK4 (Malumbres M, 2014; Alberts B et al, 2002b). Aberrations in this control system can lead to proliferative diseases such as cancer (Collins K et al, 1997; Zhivotovsky B et al, 2010).

The cell cycle consists of four main phases: gap 1 (G1), synthesis (S), gap 2 (G2), and mitosis (M). G1, S, and G2 together make up interphase, during which the cell grows; it is followed by M phase, when cell division occurs. Depending on extracellular signals, the cell may enter a rest phase called G0 instead of proceeding through G1. The cell can remain in G0 phase for years or even permanently until the organism dies. During the G1 phase, the cell increases its mass of proteins and initiates the synthesis of D-type cyclins, such as CCND1 and CCND2, which bind to CDK4 or CDK6. This activated complex drives the G1/S phase transition. Thereafter, S phase is initiated, in which DNA replication occurs. Activation of S-Cdks, such as CDK2, triggers the assembly of proteins needed to unwind the DNA helix and recruit DNA polymerases and other replication enzymes onto the DNA strands. The G2 phase follows the successful completion of S phase; here the cell continues to grow and many proteins are synthesized in preparation for mitosis. After checking for and repairing DNA damage, the cell enters mitosis - the shortest, yet most crucial phase of cell division. An increase of mitotic cyclins, such as CCNB1, activates the mitotic Cdks, such as CDK1. This activation triggers various cell rearrangements, including chromosome condensation, nuclear envelope breakdown, and mitotic spindle assembly. Thereafter, the chromosomes align midway between the mitotic spindle poles before segregating, and finally the cell divides into two separate daughter cells (Alberts B et al, 2007).

The cell cycle control system is well conserved through evolution, which makes it possible to study cell cycle regulation in a variety of organisms and model systems. The most commonly studied are yeast, animal embryos and mammalian cell cultures. Genome-wide studies using DNA microarray technology have revealed between 400 and 800 periodically expressed genes in S. cerevisiae (Cho RJ et al, 1998; Spellman PT et al, 1998; Orlando DA et al, 2008; Rustici G et al, 2004). Investigations in mammalian cells show that approximately 700 genes display transcriptional fluctuations with a periodicity consistent with the cell cycle in primary human fibroblasts (Cho RJ et al, 2001), and that >850 genes are periodically expressed during the cell cycle in synchronized HeLa cells, as measured using cDNA microarrays (Whitfield ML et al, 2002).

In our research, two approaches are used to investigate whether the observed cell-to-cell variation is correlated with the cell cycle. In the first approach, selected proteins with an observed cell-to-cell variation are stained in the U-2 OS FUCCI (Fluorescence Ubiquitination Cell Cycle Indicator) cell line. FUCCI cells express two different fluorescent proteins, each fused to a different cell cycle regulator, which allows cell cycle monitoring: CDT1, expressed in G1 phase, and Geminin (GMNN), expressed in S and G2 phases (Sakaue-Sawano A et al, 2008). Using this approach, we confirmed cell cycle-dependent expression of 64 proteins, as exemplified in Figure 4.

Figure 4 panels. Top row: U-2 OS FUCCI (three images). Bottom row: FUCCI with ANLN stained; FUCCI with ADA stained; FUCCI with TPX2 stained.

Figure 4. Staining of proteins in U-2 OS FUCCI cells to characterize the cell cycle dependency of the protein expression pattern. Top row: the FUCCI cells express the cell cycle regulators Cdt1 (red, G1 phase) and Geminin (green, S and G2 phases). When both proteins are present, the overlay of the images appears yellow, marking the G1/S transition. Bottom row: staining of ANLN, which is required for cytokinesis; ADA, an enzyme that catalyzes the hydrolysis of adenosine to inosine; and TPX2, which is required for normal assembly of mitotic spindles.
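The color logic of the FUCCI reporters can be sketched as a simple rule on the two reporter intensities. The threshold value, the assumption that intensities are normalized to the range 0 to 1, and the function name are all hypothetical; real image analysis would need calibrated thresholds.

```python
def fucci_phase(cdt1_red, geminin_green, threshold=0.5):
    """Assign a coarse cell cycle label from two normalized reporter intensities.

    cdt1_red and geminin_green are assumed to lie between 0 and 1.
    The threshold is an illustrative value, not a calibrated one.
    """
    red_on = cdt1_red >= threshold
    green_on = geminin_green >= threshold
    if red_on and green_on:
        return "G1/S transition"  # both reporters present: yellow overlay
    if red_on:
        return "G1"               # Cdt1 only: red
    if green_on:
        return "S/G2"             # Geminin only: green
    return "unclassified"         # neither reporter above threshold


# Hypothetical cells given as (red, green) intensity pairs
for cell in [(0.9, 0.1), (0.8, 0.7), (0.1, 0.9), (0.1, 0.2)]:
    print(cell, fucci_phase(*cell))
```

A protein whose staining intensity differs consistently between the labels returned here would be a candidate for cell cycle-dependent expression.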

In the second approach, the cell cycle position was inferred from the DAPI and microtubule features of each cell using a continuous-time computational regression model trained on the expression of Cyclin B1 (CCNB1). Protein expression over the resulting pseudo-time series for each protein was then fitted with a periodic regressive function to capture the periodic nature of protein expression over the cell cycle. The significance of this periodic fit between cell-to-cell variation and cell cycle position was assessed using a permutation test for each assay, and fits with significance p<0.05 were selected for publication on the Cell Atlas (Figure 5). This approach allowed us to confirm and characterize the continuous cell cycle-dependent expression of 18 proteins to date.

Figure 5. Expression profile of ANLN throughout the cell cycle (see Figure 4 for example image). The time axis was generated based on time-series tracing of U-2 OS cells under standard growth conditions and the y-axis represents the expression relative to the maximal per-cell expression for an experiment.
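A minimal sketch of the second approach is given below, assuming the cell cycle position of each cell has already been inferred and scaled to a pseudo-time between 0 and 1. The text does not specify the periodic function; here a single sine/cosine harmonic is fitted by least squares, and significance is assessed by permuting expression values across cells. All function names, the synthetic data and the choice of R-squared as fit statistic are assumptions for illustration.

```python
import math
import random


def fit_first_harmonic(pseudotime, expression):
    """Least-squares fit of y = a + b*cos(2*pi*t) + c*sin(2*pi*t).

    Returns ([a, b, c], r_squared). Assumes at least three distinct
    pseudo-time values, otherwise the normal equations are singular.
    """
    rows = [(1.0, math.cos(2 * math.pi * t), math.sin(2 * math.pi * t))
            for t in pseudotime]
    # Normal equations (X^T X) beta = X^T y as an augmented 3x4 matrix
    aug = [
        [sum(r[i] * r[j] for r in rows) for j in range(3)]
        + [sum(r[i] * y for r, y in zip(rows, expression))]
        for i in range(3)
    ]
    # Gauss-Jordan elimination with partial pivoting
    for col in range(3):
        pivot = max(range(col, 3), key=lambda k: abs(aug[k][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for k in range(3):
            if k != col:
                factor = aug[k][col] / aug[col][col]
                aug[k] = [x - factor * p for x, p in zip(aug[k], aug[col])]
    beta = [aug[i][3] / aug[i][i] for i in range(3)]
    mean_y = sum(expression) / len(expression)
    ss_tot = sum((y - mean_y) ** 2 for y in expression)
    ss_res = sum((y - sum(b * x for b, x in zip(beta, r))) ** 2
                 for r, y in zip(rows, expression))
    r_squared = 1.0 - ss_res / ss_tot if ss_tot > 0 else 0.0
    return beta, r_squared


def permutation_p_value(pseudotime, expression, n_permutations=1000, seed=0):
    """Fraction of shuffled data sets whose fit is at least as good as the real one."""
    rng = random.Random(seed)
    _, observed = fit_first_harmonic(pseudotime, expression)
    shuffled = list(expression)
    at_least_as_good = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)  # break the link between expression and pseudo-time
        _, r2 = fit_first_harmonic(pseudotime, shuffled)
        if r2 >= observed:
            at_least_as_good += 1
    # Add-one correction so the p-value is never exactly zero
    return (at_least_as_good + 1) / (n_permutations + 1)


# Synthetic example (not Atlas data): expression peaking late in the cycle
noise = random.Random(1)
pseudotime = [i / 40 for i in range(40)]
expression = [0.5 + 0.4 * math.cos(2 * math.pi * (t - 0.8)) + noise.gauss(0, 0.05)
              for t in pseudotime]
(offset, b, c), r2 = fit_first_harmonic(pseudotime, expression)
peak_time = (math.atan2(c, b) / (2 * math.pi)) % 1.0
p_value = permutation_p_value(pseudotime, expression, n_permutations=199)
print(f"peak at pseudo-time {peak_time:.2f}, R^2 = {r2:.2f}, p = {p_value:.3f}")
```

With a threshold of p<0.05, as used for the Cell Atlas, a protein with a fit like the synthetic one would be retained, while a protein whose expression is unrelated to pseudo-time would usually not pass.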

In addition to these metrics, 145 proteins were categorized as having a cell cycle dependent expression as they localize to structures only present at certain time points of the cell cycle. In the Cell Atlas we define these to be: the cytokinetic bridge, midbody, midbody ring and mitotic spindle.

The investigation of the extent of cell cycle dependency for the proteins exhibiting cell-to-cell variation in their expression patterns is a work in progress. To date, our analysis confirms that expression correlates with cell cycle position for 219 of the cell-to-cell variable proteins (see examples in Figure 6).

Figure 6 panels: CCNB1 - U-2 OS; TOP2A - U-2 OS; CDCA2 - U-2 OS; CCNB2 - U-2 OS; FAM64A - U-2 OS; NCOA1 - U-2 OS; CCSAP - U-2 OS; CKAP2L - A-431; CTTNBP2 - HeLa.

Figure 6. Example images of cell cycle dependent protein expression validated with at least one of the approaches described or by biological definition. CCNB1 is essential for the control of the cell cycle at the G2/M transition (detected in U-2 OS cells). TOP2A is involved in processes such as chromosome condensation or chromatid separation (detected in U-2 OS cells). CDCA2 is a regulator of chromosome structure during mitosis (detected in U-2 OS cells). CCNB2 is a member of the cyclin B family (detected in U-2 OS cells). FAM64A may play a role in the control of metaphase-to-anaphase transition (detected in U-2 OS cells). NCOA1 acts as a transcriptional coactivator (detected in U-2 OS cells). CCSAP is involved in the maintenance of NUMA1 at the spindle poles (detected in U-2 OS cells). CKAP2L is required for mitotic spindle formation (detected in A-431 cells). CTTNBP2 regulates the dendritic spine distribution of cortactin (detected in HeLa cells).

Relevant links and publications


Alberts B et al, 2002a. Molecular Biology of the Cell. 4th edition. General Principles of Cell Communication. New York: Garland Science.

Alberts B et al, 2002b. Molecular Biology of the Cell. 4th edition. Components of the Cell-Cycle Control System. New York: Garland Science.

Alberts B et al, 2007. Molecular Biology of the Cell. 5th edition. Chapter 17. New York: Garland Science.
http://www.garlandscience.com/product/isbn/0815341059

Ansel J et al, 2008. Cell-to-cell stochastic variation in gene expression is a complex genetic trait. PLoS Genet.
PubMed: 18404214 DOI: 10.1371/journal.pgen.1000049

Bianconi E et al, 2013. An estimation of the number of cells in the human body. Ann Hum Biol.
PubMed: 23829164 DOI: 10.3109/03014460.2013.807878

Cho RJ et al, 1998. A genome-wide transcriptional analysis of the mitotic cell cycle. Mol Cell.
PubMed: 9702192 

Cho RJ et al, 2001. Transcriptional regulation and function during the human cell cycle. Nat Genet.
PubMed: 11137997 DOI: 10.1038/83751

Collins K et al, 1997. The cell cycle and cancer. Proc Natl Acad Sci U S A.
PubMed: 9096291 

Colman-Lerner A et al, 2005. Regulated cell-to-cell variation in a cell-fate decision system. Nature.
PubMed: 16170311 DOI: 10.1038/nature03998

Dueck H et al, 2016. Variation is function: Are single cell differences functionally important?: Testing the hypothesis that single cell variation is required for aggregate function. Bioessays.
PubMed: 26625861 DOI: 10.1002/bies.201500124

Elowitz MB et al, 2002. Stochastic gene expression in a single cell. Science.
PubMed: 12183631 DOI: 10.1126/science.1070919

Kaern M et al, 2005. Stochasticity in gene expression: from theories to phenotypes. Nat Rev Genet.
PubMed: 15883588 DOI: 10.1038/nrg1615

Kilfoil ML et al, 2009. Stochastic variation: from single cells to superorganisms. HFSP J.
PubMed: 20514130 DOI: 10.2976/1.3223356

Liberali P et al, 2015. Single-cell and multivariate approaches in genetic perturbation screens. Nat Rev Genet.
PubMed: 25446316 DOI: 10.1038/nrg3768

Malumbres M, 2014. Cyclin-dependent kinases. Genome Biol.
PubMed: 25180339 DOI: 10.1186/gb4184

Orlando DA et al, 2008. Global control of cell-cycle transcription by coupled CDK and network oscillators. Nature.
PubMed: 18463633 DOI: 10.1038/nature06955

Rustici G et al, 2004. Periodic gene expression program of the fission yeast cell cycle. Nat Genet.
PubMed: 15195092 DOI: 10.1038/ng1377

Sakaue-Sawano A et al, 2008. Visualizing spatiotemporal dynamics of multicellular cell-cycle progression. Cell.
PubMed: 18267078 DOI: 10.1016/j.cell.2007.12.033

Snijder B et al, 2011. Origins of regulated cell-to-cell variability. Nat Rev Mol Cell Biol.
PubMed: 21224886 DOI: 10.1038/nrm3044

Spellman PT et al, 1998. Comprehensive identification of cell cycle-regulated genes of the yeast Saccharomyces cerevisiae by microarray hybridization. Mol Biol Cell.
PubMed: 9843569 

Whitfield ML et al, 2002. Identification of genes periodically expressed in the human cell cycle and their expression in tumors. Mol Biol Cell.
PubMed: 12058064 DOI: 10.1091/mbc.02-02-0030

Zhivotovsky B et al, 2010. Cell cycle and cell death in disease: past, present and future. J Intern Med.
PubMed: 20964732 DOI: 10.1111/j.1365-2796.2010.02282.x