THE HUMAN CELL


One of the most prominent features of a eukaryotic cell is the nucleus, a complex and highly dynamic organelle. Discovered by Robert Brown in 1833, the nucleus was the first cell compartment to be described and is the largest organelle in the human cell. It consists of several non-membrane-bound substructures, and its main function is to store DNA and provide an isolated environment in which transcription and gene regulation can be controlled. Example images of proteins localized to the nucleus can be seen in Figure 1.

Of all human proteins, 6263 (32%) have been experimentally shown to localize to the nucleus (Figure 2). A Gene Ontology (GO)-based functional enrichment analysis of the nuclear proteins shows highly enriched terms for biological processes related to RNA processing, transcription and cell cycle control. Approximately 62% (n=3899) of the nuclear proteins are also detected in additional cellular compartments; for 12% of these (n=487), the additional compartment is another nuclear structure. Apart from the nucleoli, the most common additional localizations are the cytosol and vesicles.

Image panels: PDS5A (A-431), TP53BP1 (A-431), SRRM2 (A-431)

Figure 1. Examples of proteins localized to the nucleoplasm and its substructures. PDS5A is thought to keep the sister chromatids in place during mitosis and also plays a role in DNA repair. PDS5A has been localized to the nucleoplasm (detected in A-431 cells). TP53BP1 is involved in DNA damage response and is localized to nuclear bodies (detected in A-431 cells). SRRM2 is known to be involved in pre-mRNA splicing and is localized to nuclear speckles (detected in A-431 cells).

  • 32% (6263 proteins) of all human proteins have been experimentally detected in the nucleoplasm by the Human Protein Atlas.
  • 2598 proteins in the nucleoplasm are supported by experimental evidence, and of these, 797 are validated by the Human Protein Atlas.
  • 3899 proteins in the nucleoplasm have multiple locations.
  • 758 proteins in the nucleoplasm show cell-to-cell variation. Of these, 620 show variation in intensity and 154 show spatial variation.
  • Nucleoplasmic proteins are mainly involved in RNA processing, transcription and cell cycle control.

Figure 2. 32% of all human protein-coding genes encode proteins localized to the nucleoplasm. Each bar is clickable and gives a search result of proteins that belong to the selected category.
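
The shares quoted above follow directly from the protein counts. As a quick arithmetic check, the sketch below recomputes them in Python. The variable names are illustrative, and the implied total number of proteins is only an estimate, because it assumes the rounded 32% figure is exact.

```python
# Counts quoted in the text above.
nuclear_proteins = 6263   # proteins detected in the nucleus/nucleoplasm
multi_localizing = 3899   # of these, also detected in additional compartments
other_nuclear = 487       # of the multilocalizing ones, in other nuclear structures


def percent(part: int, whole: int) -> float:
    """Return part/whole expressed as a percentage."""
    return 100.0 * part / whole


share_multi = percent(multi_localizing, nuclear_proteins)
share_other_nuclear = percent(other_nuclear, multi_localizing)

# Assumption: 32% is a rounded value, so this total is approximate.
implied_total = nuclear_proteins / 0.32

print(f"multilocalizing share of nuclear proteins: {share_multi:.1f}%")
print(f"share with another nuclear structure: {share_other_nuclear:.1f}%")
print(f"implied total number of proteins: ~{implied_total:.0f}")
```

Rounded to whole percentages, the two shares agree with the 62% and 12% given in the text.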

The structure of the nucleoplasm


Substructures

  • Nucleus: 1928
  • Nucleoplasm: 3749
  • Nuclear speckles: 446
  • Nuclear bodies: 483

    The size of the human nucleus varies depending on cell type and cell cycle phase, but it is usually around 10 μm in diameter. The nucleus mainly contains DNA and proteins that interact with DNA. To fit inside the nucleus, the DNA is wound around histones, forming a complex called chromatin. The most densely packed chromatin, heterochromatin, is usually located at the nuclear periphery, while the less densely packed euchromatin is dispersed throughout the nucleus (Spector DL. 1993). A majority of the nuclear proteins are localized to the entire nucleoplasm, where they give rise to a smooth or punctate staining pattern. A selection of proteins localized to the nucleus that would be suitable as nuclear markers can be found in Table 1. Highly expressed nuclear proteins are summarized in Table 2. The nucleus is, however, a complex organelle that contains several non-membrane-bound subcompartments collectively called nuclear bodies. Apart from the nucleolus, the most prominent ones are nuclear speckles, Cajal bodies (CB), Gemini of Cajal bodies (gems) and promyelocytic leukemia bodies (PML bodies) (Lamond AI et al, 1998). Images showing the different nuclear substructures can be seen in Figure 3.

    Table 1. Selection of proteins suitable as markers for the nucleus or its substructures.

    Gene | Description | Substructure
    PARP1 | Poly (ADP-ribose) polymerase 1 | Nucleus
    SRRM2 | Serine/arginine repetitive matrix 2 | Nuclear speckles
    RBM25 | RNA binding motif protein 25 | Nuclear speckles
    XRCC6 | X-ray repair complementing defective repair in Chinese hamster cells 6 | Nucleoplasm
    HNRNPC | Heterogeneous nuclear ribonucleoprotein C (C1/C2) | Nucleoplasm
    TAF15 | TAF15 RNA polymerase II, TATA box binding protein (TBP)-associated factor, 68kDa | Nucleoplasm
    SMARCAD1 | SWI/SNF-related, matrix-associated actin-dependent regulator of chromatin, subfamily a, containing DEAD/H box 1 | Nucleoplasm
    CTBP1 | C-terminal binding protein 1 | Nucleoplasm
    SMARCC2 | SWI/SNF related, matrix associated, actin dependent regulator of chromatin, subfamily c, member 2 | Nucleoplasm
    PDS5A | PDS5 cohesin associated factor A | Nucleoplasm

    Table 2. Highly expressed single localizing nuclear proteins across different cell lines.

    Gene | Description | Average TPM
    RPS19 | Ribosomal protein S19 | 2759
    HNRNPA2B1 | Heterogeneous nuclear ribonucleoprotein A2/B1 | 1352
    HNRNPA1 | Heterogeneous nuclear ribonucleoprotein A1 | 1344
    TPI1 | Triosephosphate isomerase 1 | 1005
    HMGB1 | High mobility group box 1 | 957
    HMGN2 | High mobility group nucleosomal binding domain 2 | 878
    H2AFZ | H2A histone family, member Z | 847
    HNRNPC | Heterogeneous nuclear ribonucleoprotein C (C1/C2) | 840
    H3F3A | H3 histone, family 3A | 794
    H3F3B | H3 histone, family 3B (H3.3B) | 745

    Nuclear speckles
    Nuclear speckles are formed in interchromatin granule clusters (IGCs) and contain pre-messenger RNA (pre-mRNA) splicing factors such as small nuclear ribonucleoprotein particles (snRNPs) (Swift H, 1959; Lamond AI et al, 2003). These granules are connected into clusters by fibrils and can be seen directly by electron microscopy (Thiry M, 1995). The appearance of nuclear speckles varies between cell lines, but they all share an irregular, mottled pattern that may change dynamically in both size and shape over time. A selection of proteins localized to nuclear speckles that are suitable as markers for the structure can be found in Table 1.

    Nuclear bodies
    CB and gems are usually found in close proximity to each other and are therefore difficult to tell apart. Like nuclear speckles, CB contain snRNPs, while gems mainly contain an snRNP-interacting protein called survival of motor neuron (SMN) (Sleeman JE et al, 1999; Darzacq X et al, 2002; Jády BE et al, 2003; Liu Q et al, 1996; Lefebvre S et al, 1995; Fischer U et al, 1997). PML bodies consist of an outer shell formed by the PML protein, whereas the interior is rather dynamic and can contain a variety of different proteins (Lallemand-Breitenbach V et al, 2010). All of the above are visible as distinct spots of approximately 1 μm in size, scattered throughout the nucleoplasm. The bodies vary in size and number depending on both cell line and type of nuclear body, but the different types are difficult to distinguish without the use of co-localizing protein markers.

    Image panels: LSM2 (SK-MEL-30), CTBP1 (A-431), NOSIP (U-2 OS), RBM25 (HaCaT), NPAT (CACO-2), DAXX (A-431)

    Figure 3. Examples showing the different nuclear substructures and staining patterns. LSM2 is a protein that might be involved in pre-mRNA splicing and shows a nucleoplasmic punctate staining pattern (detected in SK-MEL-30 cells). CTBP1 is a corepressor targeting various transcription factors and shows a smooth nucleoplasmic staining pattern (detected in A-431 cells). NOSIP is an E3 ubiquitin-protein ligase regulating several catalytic processes and is localized to the nucleus (detected in U-2 OS cells). RBM25 is involved in pre-mRNA splicing activities and has been shown to localize to nuclear speckles (detected in HaCaT cells). NPAT is a known Cajal body protein and is required for proper G1/S transition. In the Cell Atlas, NPAT localizes to nuclear bodies (detected in CACO-2 cells). DAXX is a transcription corepressor involved in a number of different nuclear activities and is known to localize to several nuclear substructures such as PML bodies and centromeres. In the Cell Atlas, DAXX localizes to nuclear bodies (detected in A-431 cells).

    The function of the nucleoplasm


    The main function of the nucleus is to store the cell's genetic material and to regulate gene expression at the transcriptional level in order to control cellular functions such as cell growth and division. Since the nucleus is membrane-enclosed and isolated from the rest of the cell, DNA replication and transcription can be controlled without interfering with the translation that occurs in the cytoplasm (Swift H, 1959). Although the nuclear substructures are not membrane-bound, highly specific tasks are carried out in these regions, as described in the subsections below.

    Nuclear speckles
    Because of their high content of pre-mRNA splicing proteins, nuclear speckles are thought to function as a storage site for these splicing factors (Lamond AI et al, 2003; Melcák I et al, 2000) as well as a regulatory site for transcription and pre-mRNA processing, even though transcription does not occur within the speckles but rather in close proximity to them (Spector DL et al, 1991; Misteli T et al, 1997; Cmarko D et al, 1999).

    Nuclear bodies
    CB probably function as sites where snRNPs are modified into fully functional splicing factors before they enter other parts of the cell (Sleeman JE et al, 1999; Darzacq X et al, 2002; Jády BE et al, 2003). The closely related gems play an important role in the synthesis of cytoplasmic snRNP (Liu Q et al, 1996; Lefebvre S et al, 1995; Fischer U et al, 1997). As previously mentioned, gems contain the SMN protein, which has been linked to the onset of spinal muscular atrophy (SMA). SMA is one of the most lethal autosomal recessive disorders, and genetic defects in the SMN gene can cause progressive muscle and mobility impairments (Lefebvre S et al, 1995). The interior of PML bodies has been found to be highly diverse, and PML bodies have been suggested to perform an ever-growing number of tasks in the cell, ranging from apoptosis regulation to anti-viral protection, but much about their function remains to be unraveled (Lallemand-Breitenbach V et al, 2010).

    Gene Ontology (GO) analysis of the proteins mainly localized to the nucleus shows functions that are well in line with the known functions of the structure. The enriched terms for the GO domain Biological Process are related to RNA splicing, transcriptional processes and chromatin modification (Figure 4a). Enrichment analysis of the GO domain Molecular Function gives hits for terms related to DNA binding and transcriptional regulation, such as mismatched DNA binding and pre-mRNA binding (Figure 4b).

    Figure 4a. Gene Ontology-based enrichment analysis for the nucleoplasm proteome showing the significantly enriched terms for the GO domain Biological Process. Each bar is clickable and gives a search result of proteins that belong to the selected category.

    Figure 4b. Gene Ontology-based enrichment analysis for the nucleoplasm proteome showing the significantly enriched terms for the GO domain Molecular Function. Each bar is clickable and gives a search result of proteins that belong to the selected category.

    Nucleoplasmic proteins with multiple locations


    Of the nuclear proteins identified in the Cell Atlas, approximately 62% (n=3899) also localize to other cell compartments (Figure 5). For 12% of these (n=487), the additional location is another nuclear structure. The network plot shows that the most common locations shared with the nucleus are the cytosol, nucleoli and vesicles. Given that proteins are both imported into the nucleus and exported from it to the cytoplasm and other compartments of the cell, these dual locations could highlight proteins that function in nuclear trafficking as well as proteins that take part in various signaling cascades. Combinations with compartments known to interact with the nucleus, for example the nucleoli and the cytosol, are significantly overrepresented, while combinations with compartments such as the plasma membrane are significantly underrepresented. Examples of multilocalizing proteins within the nucleoplasmic proteome can be seen in Figure 6.

    Figure 5. Interactive network plot of nuclear proteins with multiple localizations. The numbers in the connecting nodes show the number of proteins that are localized to the nucleus and to one or more additional locations. Only connecting nodes containing more than one protein and at least 0.5% of proteins in the nuclear proteome are shown. The circle sizes are related to the number of proteins. The cyan-colored nodes show combinations that are significantly overrepresented, while magenta-colored nodes show combinations that are significantly underrepresented as compared to the probability of observing that combination based on the frequency of each annotation and a hypergeometric test (p≤0.05). Note that this calculation is only done for proteins with dual localizations. Each node is clickable and results in a list of all proteins that are found in the connected organelles.
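
The caption names a hypergeometric test but does not spell out the calculation. The sketch below shows one plausible way to test whether a pair of locations co-occurs more or less often than expected by chance, using only the Python standard library. The function names and all counts in the example are hypothetical, and the exact procedure used for the plot may differ.

```python
from math import comb


def hypergeom_pmf(k: int, population: int, successes: int, draws: int) -> float:
    """P(X = k) when `draws` items are taken without replacement from a
    population of `population` items, `successes` of which are marked."""
    lowest = max(0, draws - (population - successes))
    highest = min(draws, successes)
    if k < lowest or k > highest:
        return 0.0
    return (comb(successes, k) * comb(population - successes, draws - k)
            / comb(population, draws))


def hypergeom_tails(k: int, population: int, successes: int, draws: int):
    """Return the pair (P(X <= k), P(X >= k))."""
    lowest = max(0, draws - (population - successes))
    highest = min(draws, successes)
    lower = sum(hypergeom_pmf(i, population, successes, draws)
                for i in range(lowest, k + 1))
    upper = sum(hypergeom_pmf(i, population, successes, draws)
                for i in range(k, highest + 1))
    return lower, upper


def classify(k: int, population: int, successes: int, draws: int,
             alpha: float = 0.05) -> str:
    """Label an observed overlap using one-sided tails at level alpha."""
    lower, upper = hypergeom_tails(k, population, successes, draws)
    if upper <= alpha:
        return "overrepresented"
    if lower <= alpha:
        return "underrepresented"
    return "not significant"


# Hypothetical counts, NOT taken from the Cell Atlas:
# 1000 dual-localized proteins, 300 annotated to the nucleus,
# 100 annotated to some other compartment, 45 annotated to both.
print(classify(45, population=1000, successes=300, draws=100))
```

With these made-up counts, about 30 shared proteins would be expected by chance, so an overlap of 45 falls in the upper tail. A node colored cyan in the plot would correspond to the "overrepresented" label here, and a magenta node to "underrepresented".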

    Image panels: IPO7 (A-431), RRAGC (U-2 OS), SENP3 (MCF7)

    Figure 6. Examples of multilocalizing proteins in the nuclear proteome. The examples show common or overrepresented combinations for multilocalizing proteins in the nuclear proteome. IPO7 functions in the nuclear import of proteins and is known to be located on both the nucleoplasmic and the cytoplasmic side of the nuclear pore complex (detected in A-431 cells). RRAGC shuttles between the nucleus and the cytoplasm. It plays a crucial role in the initiation of the TOR signaling cascade, where it is required for the amino acid-induced relocalization of mTORC1 to the lysosomes (detected in U-2 OS cells). SENP3 is located in both the nucleoli and the nucleoplasm. It is known to interact with sumoylated proteins, regulating the transcriptional capacity of the cell, and is also required for rRNA processing (detected in MCF7 cells).

    Expression levels of nucleoplasm proteins in tissue


    The transcriptome analysis (Figure 7) shows that nuclear proteins are more likely to be expressed in all tissues and less likely to be tissue enhanced or enriched, compared to all other genes with protein data in the Cell Atlas.

    Figure 7. Bar plot showing the distribution of expression categories, based on the gene expression in tissues, for nucleoplasm-associated protein-coding genes compared to all genes in the Cell Atlas. Asterisk marks statistically significant deviation(s) (p≤0.05) from all other organelles based on a binomial statistical test. Each bar is clickable and gives a search result of proteins that belong to the selected category.
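
The caption refers to a binomial test for each expression category. As a rough illustration, the sketch below computes an exact two-sided binomial p-value with the Python standard library. The function names and the example counts are hypothetical, the background fraction is assumed to be known, and the actual analysis behind the figure may be set up differently.

```python
from math import exp, lgamma, log


def binom_pmf(k: int, n: int, p: float) -> float:
    """P(X = k) for X ~ Binomial(n, p), with 0 < p < 1.
    Computed in log space so that large n does not overflow."""
    log_coeff = lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
    return exp(log_coeff + k * log(p) + (n - k) * log(1 - p))


def binom_two_sided_p(k: int, n: int, p: float) -> float:
    """Exact two-sided p-value: the total probability of all outcomes
    that are no more likely than the observed count k."""
    observed = binom_pmf(k, n, p)
    tolerance = 1 + 1e-7  # guards against floating-point ties
    total = sum(prob
                for prob in (binom_pmf(i, n, p) for i in range(n + 1))
                if prob <= observed * tolerance)
    return min(1.0, total)


# Hypothetical numbers, NOT taken from the Cell Atlas:
# 200 nucleoplasm-associated genes, of which 110 fall in one expression
# category, compared with a background fraction of 0.40 for that category.
p_value = binom_two_sided_p(110, n=200, p=0.40)
print(f"p = {p_value:.2e}, significant at 0.05: {p_value <= 0.05}")
```

In this made-up case, 80 genes would be expected in the category, so observing 110 gives a p-value well below 0.05, and the corresponding bar would carry an asterisk.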

    Relevant links and publications


    Cmarko D et al, 1999. Ultrastructural analysis of transcription and splicing in the cell nucleus after bromo-UTP microinjection. Mol Biol Cell.
    PubMed: 9880337 

    Darzacq X et al, 2002. Cajal body-specific small nuclear RNAs: a novel class of 2'-O-methylation and pseudouridylation guide RNAs. EMBO J.
    PubMed: 12032087 DOI: 10.1093/emboj/21.11.2746

    Fischer U et al, 1997. The SMN-SIP1 complex has an essential role in spliceosomal snRNP biogenesis. Cell.
    PubMed: 9323130 

    Jády BE et al, 2003. Modification of Sm small nuclear RNAs occurs in the nucleoplasmic Cajal body following import from the cytoplasm. EMBO J.
    PubMed: 12682020 DOI: 10.1093/emboj/cdg187

    Lallemand-Breitenbach V et al, 2010. PML nuclear bodies. Cold Spring Harb Perspect Biol.
    PubMed: 20452955 DOI: 10.1101/cshperspect.a000661

    Lamond AI et al, 1998. Structure and function in the nucleus. Science.
    PubMed: 9554838 

    Lamond AI et al, 2003. Nuclear speckles: a model for nuclear organelles. Nat Rev Mol Cell Biol.
    PubMed: 12923522 DOI: 10.1038/nrm1172

    Lefebvre S et al, 1995. Identification and characterization of a spinal muscular atrophy-determining gene. Cell.
    PubMed: 7813012 

    Liu Q et al, 1996. A novel nuclear structure containing the survival of motor neurons protein. EMBO J.
    PubMed: 8670859 

    Melcák I et al, 2000. Nuclear pre-mRNA compartmentalization: trafficking of released transcripts to splicing factor reservoirs. Mol Biol Cell.
    PubMed: 10679009 

    Misteli T et al, 1997. Protein phosphorylation and the nuclear organization of pre-mRNA splicing. Trends Cell Biol.
    PubMed: 17708924 DOI: 10.1016/S0962-8924(96)20043-1

    Sleeman JE et al, 1999. Newly assembled snRNPs associate with coiled bodies before speckles, suggesting a nuclear snRNP maturation pathway. Curr Biol.
    PubMed: 10531003 

    Spector DL et al, 1991. Associations between distinct pre-mRNA splicing components and the cell nucleus. EMBO J.
    PubMed: 1833187 

    Spector DL. 1993. Macromolecular domains within the cell nucleus. Annu Rev Cell Biol.
    PubMed: 8280462 DOI: 10.1146/annurev.cb.09.110193.001405

    Swift H. 1959. Studies on nuclear fine structure. Brookhaven Symp Biol.
    PubMed: 13836127 

    Thiry M. 1995. The interchromatin granules. Histol Histopathol.
    PubMed: 8573995