Wagner and Valentin were the first to describe the nucleolus in two independent publications in the 1830s. The nucleolus is a nuclear sub-compartment that varies in size and number depending on cell type. The main function of the nucleolus is to synthesize and assemble ribosomes for later transport to the cytoplasm, where translation takes place. The nucleolus is also involved in cell cycle regulation and cellular stress responses. Example images of proteins localized to the nucleoli can be seen in Figure 1.
In the Cell Atlas, 1361 genes (7% of all protein-coding human genes) have been shown to encode proteins that localize to nucleoli (Figure 2). A Gene Ontology (GO)-based functional enrichment analysis of the nucleolar proteins shows enrichment of terms for biological processes related to rRNA processing. Approximately 89% (n=1211) of the nucleolar proteins localize to other cellular compartments in addition to nucleoli, with 34% (n=465) only localizing to other nuclear compartments. The most common additional localization outside the nuclear meta compartment is mitochondria.
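The percentages quoted above follow directly from the stated counts. A minimal sketch of that arithmetic is given here; the variable and function names are illustrative, and the count of single-localized proteins is derived by subtraction rather than stated in the text.

```python
# Counts taken from the text above.
NUCLEOLAR_TOTAL = 1361      # genes encoding nucleolar proteins
MULTILOCALIZING = 1211      # also found in other compartments
NUCLEAR_ONLY_EXTRA = 465    # additional locations are nuclear only


def pct(part: int, whole: int) -> int:
    """Return part/whole as a percentage rounded to the nearest integer."""
    return round(100 * part / whole)


# Derived value (not stated in the text): proteins seen only in nucleoli.
single_localized = NUCLEOLAR_TOTAL - MULTILOCALIZING

print(pct(MULTILOCALIZING, NUCLEOLAR_TOTAL))     # share with other locations
print(pct(NUCLEAR_ONLY_EXTRA, NUCLEOLAR_TOTAL))  # share with nuclear-only extras
print(single_localized, pct(single_localized, NUCLEOLAR_TOTAL))
```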
Figure 1. Examples of proteins localized to the nucleoli. UTP6 is suggested to be involved in processing of pre-rRNA (detected in A-431 cells). RPF1 is a protein believed to be required for ribosome biogenesis (detected in SK-MEL-30 cells). NIFK is known to localize to the nucleoli, but its function is still unclear (detected in U-2 OS cells).
Figure 2. 7% of all human protein-coding genes encode proteins localized to the nucleoli. Each bar is clickable and gives a search result of proteins that belong to the selected category.
The structure of nucleoli
The nucleoli are highly conserved sub-organelles within the nucleus that are not enclosed by a membrane. They are formed around nucleolus organizer regions (NORs) consisting of ribosomal DNA (rDNA) and are structurally organized into three subregions: the fibrillar center (FC), the dense fibrillar component (DFC) and the granular component (GC) (Boisvert FM et al. (2007); Scheer U et al. (1999)). A selection of proteins localized to the nucleoli that are suitable as nucleolar markers can be found in Table 1.
Table 1. Selection of proteins suitable as markers for the nucleoli or its substructures.
A majority of the nucleolar proteins show staining throughout the whole nucleolar area, while roughly 20% display a more refined staining pattern. The staining of fibrillar centers and/or dense fibrillar component appears as clusters of spots in most cell lines, while in others, for example MCF-7 and U-251, only one larger spot is seen. Some proteins localize to the rim of the nucleolus, which is visible as a thin circle around the nucleolus and could be associated with either the GC or the perinucleolar heterochromatin surrounding the nucleolus (Németh A et al. (2011)). A recent study suggests that the protein MKI67, which is localized to the nucleolar rim, functions like a surfactant to create non-membranous barriers in the cell. Therefore, proteins with similar staining patterns could have a similar function (Cuylen S et al. (2016); Stenström L et al. (2020)). Immunofluorescent images of MKI67 and of different nucleolar substructures can be seen in Figure 3. The size of the nucleolus has also been suggested to correlate with the proliferative ability of cells (Derenzini M et al. (2000)). Upon entry into mitosis, rRNA transcription and RNP processing shut down and the nucleoli are disassembled. In telophase and early G1, the nucleolar organization is re-established.
Figure 3. Examples of the morphology of the nucleoli in different cell lines as well as the nucleolar substructures and staining patterns. Immunofluorescent staining of KRI1 in HEK 293, MCF-7 and U-2 OS cells. NOLC1 might play a role in maintaining the structure of the fibrillar center and the dense fibrillar component in the nucleoli. NOLC1 is localized to the fibrillar center (detected in HEK 293 cells). UBTF is involved in the activation of RNA polymerase I and is localized to the fibrillar center (detected in U-2 OS cells). MKI67 has been found to maintain mitotic chromosome integrity and is a well-known cellular proliferation marker.
The function of nucleoli
The nucleolus is responsible for the synthesis, processing and assembly of ribosomes, a complex process controlled in the nucleolar subregions (Boisvert FM et al. (2007); Scheer U et al. (1999); Németh A et al. (2011)). The border between the FC and the DFC contains proteins from the RNA polymerase I complex and is the region where pre-ribosomal RNA (pre-rRNA) is transcribed from rDNA. The pre-rRNA is later modified by proteins in the DFC, followed by assembly of the ribosome subunits in the GC (Scheer U et al. (1999)). As is the case for the majority of organelles, the proteome of the nucleolus is dynamic and has been shown to consist of multiple overlapping sets of proteins that are exchanged depending on the cell state. The need for translational capacity varies with different cell cycle phases, and translational capacity is heavily dependent on the number of ribosomes available. In addition to being responsible for ribosome assembly, the nucleolus has also been found to comprise proteins involved in cell cycle regulation and cellular stress responses (Boisvert FM et al. (2007); Visintin R et al. (2000)).
Several genetic disorders such as Werner syndrome, fragile X syndrome and Treacher Collins syndrome have been linked to nucleolar proteins (Marciniak RA et al. (1998); Tamanini F et al. (2000); Willemsen R et al. (1996); Isaac C et al. (2000)). Moreover, the nucleolar size increases with the proliferative ability of cells, suggesting that the nucleoli play an important role in development of cancer and could therefore be a potential target for cancer therapy (Drygin D et al. (2010)).
Gene Ontology (GO) analysis of the proteins mainly localized to the nucleoli shows functions that are well in line with the already known functions of the structure. The enriched terms for the GO domain Biological Process are related to rRNA processing and ribosome assembly (Figure 5a), while enrichment analysis of the GO domain Molecular Function gives enrichment for RNA binding activities (Figure 5b). A list of highly expressed nucleolar proteins is given in Table 2.
Figure 5a. Gene Ontology-based enrichment analysis for the nucleolar proteome showing the significantly enriched terms for the GO domain Biological Process. Each bar is clickable and gives a search result of proteins that belong to the selected category.
Figure 5b. Gene Ontology-based enrichment analysis for the nucleolar proteome showing the significantly enriched terms for the GO domain Molecular Function. Each bar is clickable and gives a search result of proteins that belong to the selected category.
Table 2. Highly expressed single localized nucleolar proteins across different cell lines.
Nucleolar proteins with multiple locations
Of the nucleolar proteins identified in the Cell Atlas, approximately 89% (n=1211) also localize to other cell compartments (Figure 6). About 34% (n=465) of the nucleolar proteins only localize to other nuclear structures. The network plot shows that the most common locations shared with nucleoli are nucleoplasm, cytosol and mitochondria. Given that the nucleoli are responsible for synthesis and assembly of ribosomes that are later exported to the cytoplasm, many of the proteins localized to both the nucleoli and the cytoplasmic structures are most likely involved in translation. Proteins localized to both the nucleoli and the nucleoplasm, as well as to both the nucleoli and mitochondria, are seen more often than expected given the current distribution of multilocalizing proteins, while nucleolar proteins that additionally localize to vesicles, the Golgi apparatus, the cytosol or the centrosomes are significantly underrepresented. Examples of multilocalizing proteins within the nucleolar proteome can be seen in Figure 7.
Figure 6. Interactive network plot of nucleolar proteins with multiple localizations. The numbers in the connecting nodes show the proteins that are localized to the nucleoli and to one or more additional locations. Only connecting nodes containing more than one protein and at least 0.5% of proteins in the nucleolar proteome are shown. The circle sizes are related to the number of proteins. The cyan colored nodes show combinations that are significantly overrepresented, while magenta colored nodes show combinations that are significantly underrepresented as compared to the probability of observing that combination based on the frequency of each annotation and a hypergeometric test (p≤0.05). Note that this calculation is only done for proteins with dual localizations. Each node is clickable and results in a list of all proteins that are found in the connected organelles.
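The over- and underrepresentation calls described for Figure 6 can be illustrated with a minimal sketch of a hypergeometric test. The exact parameterization used for the figure is not given in the text, so the setup below is an assumption: among N dual-localized proteins, K carry annotation B, n carry the nucleoli annotation, and k carry both. The function names and the example counts are hypothetical.

```python
from math import comb


def hypergeom_pmf(k: int, N: int, K: int, n: int) -> float:
    """P(X = k) when n items are drawn without replacement from N, K of which are 'hits'."""
    return comb(K, k) * comb(N - K, n - k) / comb(N, n)


def hypergeom_tails(k: int, N: int, K: int, n: int) -> tuple[float, float]:
    """Return (P(X <= k), P(X >= k)) over the valid support of the distribution."""
    lo = max(0, n - (N - K))
    hi = min(K, n)
    lower = sum(hypergeom_pmf(i, N, K, n) for i in range(lo, k + 1))
    upper = sum(hypergeom_pmf(i, N, K, n) for i in range(k, hi + 1))
    return lower, upper


def classify(k: int, N: int, K: int, n: int, alpha: float = 0.05) -> str:
    """Label an observed overlap k using one-sided tail probabilities."""
    lower, upper = hypergeom_tails(k, N, K, n)
    if upper <= alpha:
        return "overrepresented"
    if lower <= alpha:
        return "underrepresented"
    return "as expected"


# Hypothetical counts, for illustration only (not Cell Atlas data):
# 1000 dual-localized proteins, 200 with annotation B, 300 nucleolar.
for observed in (30, 60, 90):
    print(observed, classify(observed, N=1000, K=200, n=300))
```

With these made-up counts the expected overlap is 60 proteins (300 × 200 / 1000), so an observed overlap far above or below that value falls into one of the two tails.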
Figure 7. Examples of multilocalizing proteins in the nucleolar proteome. The examples show common or overrepresented combinations for multilocalizing proteins in the nucleolar proteome. EXOSC10 is known to be involved in multiple RNA processing pathways in the nucleolus, the nucleus and the cytoplasm (detected in U-2 OS cells). APTX is known to be involved in DNA repair and is localized to the nucleoplasm and the nucleoli (detected in PC-3 cells). SPATS2L is known to localize to the nucleoli and to cytoplasmic stress granules during oxidative stress, but its function is unknown (detected in U-2 OS cells).
Expression levels of nucleoli proteins in tissue
Transcriptome analysis and classification of genes into tissue distribution categories (Figure 8) show that genes encoding proteins that localize to nucleoli are more likely to be detected either in a single tissue or in all tissues, compared to all genes presented in the Cell Atlas. Significantly lower proportions of nucleoli-associated genes are detected in some or many of the tissues. Thus, these genes tend either to be ubiquitously expressed or to show strict tissue-specific expression.
Figure 8. Bar plot showing the percentage of genes in different tissue distribution categories for nucleoli-associated protein-coding genes compared to all genes in the Cell Atlas. Asterisk marks a statistically significant deviation (p≤0.05) in the number of genes in a category based on a binomial statistical test. Each bar is clickable and gives a search result of proteins that belong to the selected category.
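The binomial test mentioned for Figure 8 can be sketched as follows. This is an assumption about the setup rather than the documented procedure: the number of nucleoli-associated genes in one category is compared with the proportion of all Cell Atlas genes in that category, using an exact two-sided test. The function names, the background proportion and the observed counts are hypothetical; only the total of 1361 nucleolar genes comes from the text.

```python
from math import exp, lgamma, log


def binom_pmf(k: int, n: int, p: float) -> float:
    """P(X = k) for X ~ Binomial(n, p), computed in log space to avoid overflow."""
    log_coeff = lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
    return exp(log_coeff + k * log(p) + (n - k) * log(1 - p))


def binom_test_two_sided(k: int, n: int, p: float) -> float:
    """Exact two-sided p-value: total probability of outcomes no more likely than k."""
    observed = binom_pmf(k, n, p)
    threshold = observed * (1 + 1e-7)  # small tolerance for floating-point ties
    total = sum(q for q in (binom_pmf(i, n, p) for i in range(n + 1)) if q <= threshold)
    return min(1.0, total)


# Hypothetical example: 30% of all genes fall in a category (assumed value),
# and 500 of the 1361 nucleolar genes are observed in it (assumed value).
p_value = binom_test_two_sided(500, 1361, 0.30)
print("significant" if p_value <= 0.05 else "not significant")
```

The log-space form is used because the binomial coefficient for n=1361 exceeds the range of a double-precision float when evaluated directly.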