The multilocalizing proteome
The immunofluorescence (IF)-based approach used in the Cell Atlas allows simultaneous analysis of the protein distribution in all organelles. This makes it possible to study the spatial distribution of proteins in their cellular context and to identify all proteins that are located in more than one organelle, which can be called "multilocalizing proteins" (MLPs).
Figure 1 shows example images of MLPs representing common combinations of locations and gives an idea of the cellular roles of MLPs. In the most common case, MLPs are located at multiple sites at the same time; however, single-cell variation in the spatial distribution may occur. The localization of MLPs can depend on intrinsic factors such as cell cycle position, on extrinsic factors such as growth factors, or on cell line-specific variations. For example, ZNF554 is a solely nuclear protein in RT4 and SH-SY5Y cells, but becomes an MLP in U-2 OS cells due to its prominent location in the nucleoli.
Figure 1. Examples of MLPs identified in the Cell Atlas.
IPO7 mediates the import of proteins from the cytosol to the nucleus and can cross the nuclear membrane rapidly in both directions (detected in A-431 cells).
RPL19 is a component of the ribosomal 60S subunit and was identified in nucleoli, where ribosomes are assembled, and in the cytosol and endoplasmic reticulum, where protein synthesis takes place (detected in A-431 cells).
CCDC51 encodes an uncharacterized protein located in the mitochondria and nucleoplasm (detected in U-2 OS).
KIAA1522 encodes an uncharacterized protein identified in the plasma membrane and nucleoplasm (detected in HaCaT cells).
ITM2B is a transmembrane protein processed in the Golgi apparatus and vesicles. The resulting small peptide is secreted (detected in RT4 cells).
ENO1 is a well-described moonlighting protein. It has several functions in different compartments, including a role in glycolysis in the cytosol and a role as a surface protein in the plasma membrane (detected in U-2 OS cells).
MLPs in the Cell Atlas
Approximately half of the genes in the Cell Atlas (52%, n=6282) encode MLPs (Figure 2). Of these MLPs, around 28% (n=1742) are found at three or more locations. The distribution of single and multilocalizing proteins for each organelle is shown in Figure 3 and Table 1. Although around half of the human proteome consists of MLPs, the percentage of MLPs in the individual organelle proteomes is often much higher, because each MLP is counted once for every organelle in which it is found. It is therefore noteworthy that the proteomes of the mitochondria and the endoplasmic reticulum contain mainly single localizing proteins. In contrast, organelles such as the plasma membrane, cytosol, nucleus, and nucleoli share the majority of their proteins with other organelles. This might be related to the known biological functions of these organelles, which require proteins that operate across organelle borders in order to regulate metabolic reactions or gene expression, or to transmit information from the surrounding environment. The endoplasmic reticulum and mitochondria, by comparison, are more self-contained in their biological function.
- 6282 proteins are multilocalizing
Figure 2. Bar plot showing the number of protein-coding genes for single or multilocalizing proteins.
Figure 3. Bar plot showing the distribution of proteins localized to one or multiple organelles. Note that proteins localized to different substructures of organelles (e.g. nuclear bodies and nucleoplasm) are considered multilocalizing.
Table 1. Detailed information about single and multilocalizing proteins in the proteome of organelles and substructures.
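The double-counting effect described above can be illustrated with a minimal Python sketch. The gene names and annotations below are hypothetical toy data, not Cell Atlas results, and the function names are chosen only for this illustration. The sketch shows how the overall share of MLPs can be one half while the share of MLPs within an individual organelle proteome is higher.

```python
from collections import defaultdict

# Hypothetical toy annotations: gene -> set of organelles (not real Cell Atlas data).
annotations = {
    "GENE_A": {"cytosol"},
    "GENE_B": {"mitochondria"},
    "GENE_C": {"cytosol", "nucleoplasm"},
    "GENE_D": {"cytosol", "plasma membrane"},
}


def mlp_fraction_overall(annotations):
    """Fraction of genes annotated to more than one organelle."""
    n_mlp = sum(1 for locs in annotations.values() if len(locs) > 1)
    return n_mlp / len(annotations)


def mlp_fraction_per_organelle(annotations):
    """For each organelle, the fraction of its proteins that are MLPs.

    An MLP is counted once in every organelle where it is found,
    which is the double counting mentioned in the text.
    """
    total = defaultdict(int)
    multi = defaultdict(int)
    for locs in annotations.values():
        for organelle in locs:
            total[organelle] += 1
            if len(locs) > 1:
                multi[organelle] += 1
    return {organelle: multi[organelle] / total[organelle] for organelle in total}


overall = mlp_fraction_overall(annotations)
per_organelle = mlp_fraction_per_organelle(annotations)
```

In this toy set, half of the genes encode MLPs, yet two of the three cytosolic proteins are MLPs, while the mitochondrial proteome contains none.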
The number of MLPs is large. To get a better overview of the proteome of MLPs, organelles can be grouped into three meta-compartments, and genes encoding MLPs can be arranged on a circular plot (Figure 4). The meta-compartments are "Nucleus" (nuclear and nucleolar structures), "Cytoplasm" (cytosol, mitochondria, and the different types of cytoskeleton), and the "Secretory Pathway" (endoplasmic reticulum, Golgi apparatus, vesicles, plasma membrane). This reveals subordinate organization patterns of the MLPs. For instance, within the meta-compartments Cytoplasm and Nucleus, the most common combinations are between the predominant organelle (cytosol and nucleoplasm, respectively) and the fine structures within it. The MLPs in the secretory pathway exhibit a more sequential pattern, likely reflecting directional protein trafficking. Across meta-compartments, the secretory pathway and the nucleus share a strikingly high number of MLPs, despite not being in direct physical contact with each other. Individual Cytoscape plots of each organelle (Figure 5, at the end of the page) show that dual locations to the nucleoplasm and the Golgi apparatus or vesicles are overrepresented. This indicates that the proteomes of organelles in the secretory pathway are more versatile and should not be reduced to their role in protein secretion.
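The grouping into meta-compartments can be sketched as a simple lookup table. This is a minimal illustration in Python: the organelle labels are simplified assumptions rather than the exact Cell Atlas vocabulary, only two cytoskeleton types are listed as examples, and the function names are hypothetical.

```python
# Organelle-to-meta-compartment mapping following the grouping in the text.
# Labels are simplified for illustration, not the exact Cell Atlas terms.
META_COMPARTMENT = {
    "nucleoplasm": "Nucleus",
    "nucleoli": "Nucleus",
    "cytosol": "Cytoplasm",
    "mitochondria": "Cytoplasm",
    "microtubules": "Cytoplasm",      # example cytoskeleton type
    "actin filaments": "Cytoplasm",   # example cytoskeleton type
    "endoplasmic reticulum": "Secretory Pathway",
    "golgi apparatus": "Secretory Pathway",
    "vesicles": "Secretory Pathway",
    "plasma membrane": "Secretory Pathway",
}


def meta_compartments(locations):
    """Return the set of meta-compartments covered by a protein's locations."""
    return {META_COMPARTMENT[loc] for loc in locations}


def spans_meta_compartments(locations):
    """True if the locations fall into more than one meta-compartment."""
    return len(meta_compartments(locations)) > 1


# Location pairs taken from the Figure 1 examples.
ccdc51 = meta_compartments(["mitochondria", "nucleoplasm"])
itm2b = meta_compartments(["golgi apparatus", "vesicles"])
kiaa1522_spans = spans_meta_compartments(["plasma membrane", "nucleoplasm"])
```

With this mapping, ITM2B (Golgi apparatus and vesicles) stays within the secretory pathway, whereas KIAA1522 (plasma membrane and nucleoplasm) is one of the MLPs shared between the secretory pathway and the nucleus.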
Why does the cell have MLPs?
MLPs present several advantages for the cell, some of which are crucial for cellular survival. Shuttle proteins, for instance, constantly switch their location in order to transport other proteins between organelles, so their multilocalization is inseparably tied to their function. For example, members of the importin family transport proteins from the cytosol to the nucleus and hence are found in both organelles (Lange A et al, 2007, see also Figure 1). Another advantage of MLPs is that proteins required for reactions that are not limited to a single discrete subcellular compartment only have to evolve once, e.g. mitochondria and peroxisomes share some enzymes in their lipid metabolism (Ashmarina LI et al, 1999). A switch of the subcellular location is also essential for quick cellular responses to a changing environment. For example, receptors such as ERBB2 located in the plasma membrane move to the nucleus after stimulation and change the expression pattern, a translocation that has been correlated with a worse prognosis in cancer (Wang SC et al, 2009).
Some of the MLPs are more than just multilocalizing: they are also multifunctional proteins. These proteins do not fit the paradigm of "one gene - one protein - one function", as they have more than one function, which might correlate with their presence at different locations. The existence of multifunctional proteins adds another dimension to cellular complexity and offers new starting points in systems biology, because they are involved in multiple pathways or serve as regulators of transcription (Jeffery CJ. 2015). A special class of multifunctional proteins are moonlighting proteins. The term refers to people who work in different jobs during daylight and moonlight, and like their human counterparts, moonlighting proteins have completely different jobs in the cell. However, in comparison to other multifunctional proteins, a protein is only defined as moonlighting if its multifunctionality is not due to any of the following reasons: gene fusions, RNA splice variants, post-translational modifications or pleiotropic effects (Jeffery CJ. 1999). An example of a moonlighting protein is ENO1 (Figure 1), which acts in the cytosol as well as in the plasma membrane, fulfilling different functions.
The Human Protein Atlas does not provide functional studies of proteins and therefore cannot determine whether an MLP is multifunctional. However, the description of proteins at multiple locations is an important step in the discovery of multifunctional and moonlighting proteins, and the spatial information provided by the Cell Atlas could be integrated into existing prediction models (Chapple CE et al, 2015).
Figure 5. Cytoscape plots of MLPs.
The interactive and clickable plots show the number of all shared MLPs between the individual organelles, including MLPs with additional locations. Only connecting nodes containing more than one protein and at least 0.5% of all human proteins are shown. The circle sizes are related to the number of proteins. Cyan-colored nodes show combinations that are significantly overrepresented, while magenta-colored nodes show combinations that are significantly underrepresented, relative to the probability of observing that combination based on the frequency of each annotation, as assessed by a hypergeometric test (p≤0.05).
Each node is clickable and results in a list of all proteins that are found in the connected organelles.
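The over- and underrepresentation test mentioned in the caption can be sketched with a hypergeometric tail probability. The exact setup used for Figure 5 is not described here, so the following Python sketch makes an assumption: the overlap between two organelle annotations is modeled as drawing n proteins (those at organelle B) from N proteins in total, of which K are annotated to organelle A, and observing k in both. The function names and the numbers in the example are hypothetical.

```python
from math import comb


def hypergeom_pmf(k, N, K, n):
    """P(X = k): exactly k of the n drawn proteins are among the K marked ones."""
    return comb(K, k) * comb(N - K, n - k) / comb(N, n)


def overlap_p_values(k, N, K, n):
    """Return (p_over, p_under) for an observed overlap of k proteins.

    p_over  = P(X >= k), small when the combination is overrepresented.
    p_under = P(X <= k), small when the combination is underrepresented.
    """
    lowest = max(0, n - (N - K))   # smallest possible overlap
    highest = min(K, n)            # largest possible overlap
    p_over = sum(hypergeom_pmf(i, N, K, n) for i in range(max(k, lowest), highest + 1))
    p_under = sum(hypergeom_pmf(i, N, K, n) for i in range(lowest, min(k, highest) + 1))
    return p_over, p_under


# Toy numbers (assumed for illustration): 100 proteins in total, 20 at organelle A,
# 10 at organelle B, 6 found at both. The expected overlap by chance is 2.
p_over, p_under = overlap_p_values(k=6, N=100, K=20, n=10)
```

Under these assumptions, an overlap of 6 where 2 is expected gives a p_over below 0.05, so such a combination would be flagged as overrepresented.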
Relevant links and publications
Ashmarina LI et al, 1999. 3-Hydroxy-3-methylglutaryl coenzyme A lyase: targeting and processing in peroxisomes and mitochondria. J Lipid Res.
Chapple CE et al, 2015. Extreme multifunctional proteins identified from a human protein interaction network. Nat Commun.
PubMed: 26054620 DOI: 10.1038/ncomms8412
Jeffery CJ. 2015. Why study moonlighting proteins? Front Genet.
PubMed: 26150826 DOI: 10.3389/fgene.2015.00211
Jeffery CJ. 1999. Moonlighting proteins. Trends Biochem Sci.
Lange A et al, 2007. Classical nuclear localization signals: definition, function, and interaction with importin alpha. J Biol Chem.
PubMed: 17170104 DOI: 10.1074/jbc.R600026200
Wang SC et al, 2009. Nuclear translocation of the epidermal growth factor receptor family membrane tyrosine kinase receptors. Clin Cancer Res.
PubMed: 19861462 DOI: 10.1158/1078-0432.CCR-08-2813