The multilocalizing proteome
The immunofluorescence (IF)-based approach used in the Cell Atlas allows analysis of protein distribution in all organelles and cellular substructures simultaneously. This allows us to study spatial distribution of proteins in their cellular context and identify proteins that localize to more than one compartment, which can be called "multilocalizing proteins" (MLPs).
Figure 1 shows example images of MLPs representing common combinations of locations and gives an idea of the cellular roles of MLPs. In most cases, MLPs are located at multiple sites at the same time. However, single-cell variation in the spatial distribution may also occur. The localization of an MLP can depend on intrinsic factors such as cell cycle position, on extrinsic factors such as growth factors, or on cell line-specific variation. For example, ZNF554 is a solely nuclear protein in RT4 and SH-SY5Y cells, but becomes an MLP in U-2 OS cells due to its additional prominent location in the nucleoli.
Figure 1. Examples of MLPs identified in the Cell Atlas. IPO7 mediates the import of proteins from the cytosol to the nucleus and can cross the nuclear membrane rapidly in both directions (detected in A-431 cells). RPL19 is a component of the ribosomal 60S subunit and was identified in nucleoli, where ribosomes are assembled, and in the cytosol and endoplasmic reticulum, where protein synthesis takes place (detected in A-431 cells). CCDC51 encodes an uncharacterized protein located in the mitochondria and nucleoplasm (detected in U-2 OS cells). KIAA1522 encodes an uncharacterized protein identified in the plasma membrane and nucleoplasm (detected in HaCaT cells). ITM2B is a transmembrane protein processed in the Golgi apparatus and vesicles; the resulting small peptide is secreted (detected in RT4 cells). ENO1 is a well-described moonlighting protein with several functions in different compartments, including a role in glycolysis in the cytosol and a role as a surface protein in the plasma membrane (detected in U-2 OS cells).
MLPs in the Cell Atlas
Approximately half of the genes in the Cell Atlas (54%, n=6647) encode MLPs (Figure 2). Of these MLPs, around 29% (n=1960) can be found at three or more locations. The distribution of single and multilocalizing proteins for each organelle is shown in Figure 3 and Table 1. Although around half of the human proteome consists of MLPs, the percentage of MLPs in the individual organelle proteomes is often much higher, because each MLP is counted once for every organelle in which it is found. Against this background, it is noteworthy that the proteomes of mitochondria and the endoplasmic reticulum contain mainly single localizing proteins. In contrast, organelles such as the plasma membrane, cytosol, nucleus, and nucleoli share the majority of their proteins with other organelles. This might be related to the known biological functions of these organelles, which require proteins that operate across organelle borders in order to regulate metabolic reactions or gene expression, or to transmit information from the surrounding environment. The endoplasmic reticulum and mitochondria, by comparison, are more self-contained in their biological function.
Figure 2. Bar plot showing the number of protein-coding genes for single or multilocalizing proteins.
Figure 3. Bar plot showing the distribution of proteins localized to one or multiple organelles. Note that proteins localized to different substructures of organelles (e.g. nuclear bodies and nucleoplasm) are considered multilocalizing.
Table 1. Detailed information about single and multilocalizing proteins in the proteome of organelles and substructures.
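The double-counting effect described above can be illustrated with a minimal Python sketch. The gene names and annotations below are hypothetical toy data, not Cell Atlas results, and the function names are chosen only for this example. The sketch assumes that each gene is annotated with a set of locations and that a protein counts as an MLP when it has more than one.

```python
from collections import Counter

# Hypothetical toy annotations: gene -> set of locations (not real Cell Atlas data).
annotations = {
    "GENE_A": {"Mitochondria"},
    "GENE_B": {"Cytosol", "Nucleoplasm"},
    "GENE_C": {"Cytosol", "Plasma membrane", "Nucleoplasm"},
    "GENE_D": {"Endoplasmic reticulum"},
    "GENE_E": {"Cytosol"},
    "GENE_F": {"Nucleoplasm", "Nucleoli"},
}


def mlp_fraction(annotations):
    """Fraction of all genes annotated to more than one location."""
    multi = sum(1 for locs in annotations.values() if len(locs) > 1)
    return multi / len(annotations)


def per_organelle_mlp_fraction(annotations):
    """For each organelle, the fraction of its proteins that are MLPs.

    An MLP contributes to every organelle it is found in, which is why
    these fractions can exceed the proteome-wide fraction.
    """
    total = Counter()
    multi = Counter()
    for locs in annotations.values():
        for organelle in locs:
            total[organelle] += 1
            if len(locs) > 1:
                multi[organelle] += 1
    return {organelle: multi[organelle] / total[organelle] for organelle in total}


overall = mlp_fraction(annotations)
by_organelle = per_organelle_mlp_fraction(annotations)
```

In this toy set, half of the genes encode MLPs, yet every protein in the nucleoplasm is an MLP, while the mitochondrial and endoplasmic reticulum entries are purely single localizing.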
Because the number of MLPs is large, an overview of the multilocalizing proteome is easier to obtain when organelles are grouped into three meta-compartments and the genes encoding MLPs are aligned on a circular plot (Figure 4). The meta-compartments are "Nucleus" (nuclear and nucleolar structures), "Cytoplasm" (cytosol, mitochondria, and the different types of cytoskeleton), and "Secretory Pathway" (endoplasmic reticulum, Golgi apparatus, vesicles, plasma membrane). This grouping reveals underlying organization patterns of the MLPs. For instance, within the meta-compartments cytoplasm and nucleus, the most common combination is between the predominant organelle (cytosol and nucleoplasm, respectively) and the fine structures within it. The MLPs in the secretory pathway exhibit a more sequential pattern, likely reflecting directional protein trafficking. Across meta-compartments, the secretory pathway and nucleus share a strikingly high number of MLPs, despite not being in direct physical contact with each other. Individual Cytoscape plots of each organelle (Figure 5, at the end of the page) show that dual locations to the nucleoplasm and the Golgi apparatus or vesicles are overrepresented. This indicates that the proteomes of organelles in the secretory pathway are more versatile than often assumed and should not be reduced to their role in protein secretion.
Figure 4. Circular plot with the identified proteins of each compartment presented and sorted by meta-compartments (red: Nucleus, blue: Cytoplasm, yellow: Secretory Pathway). Multilocalizing proteins appearing more than once in the plot are connected by a line.
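The grouping into meta-compartments can be sketched in Python as follows. The mapping follows the grouping given in the text, but the simplified location labels (for example a single "Cytoskeleton" entry) and the function name are assumptions made for this illustration. The four example proteins use the locations stated in the Figure 1 caption.

```python
from collections import Counter
from itertools import combinations

# Grouping as described in the text; location labels are simplified for this sketch.
META_COMPARTMENT = {
    "Nucleoplasm": "Nucleus",
    "Nucleoli": "Nucleus",
    "Cytosol": "Cytoplasm",
    "Mitochondria": "Cytoplasm",
    "Cytoskeleton": "Cytoplasm",
    "Endoplasmic reticulum": "Secretory Pathway",
    "Golgi apparatus": "Secretory Pathway",
    "Vesicles": "Secretory Pathway",
    "Plasma membrane": "Secretory Pathway",
}


def shared_between_meta_compartments(annotations):
    """Count, for each pair of meta-compartments, the MLPs that connect them."""
    counts = Counter()
    for locs in annotations.values():
        # Collapse locations to meta-compartments; sort for a stable pair order.
        metas = sorted({META_COMPARTMENT[loc] for loc in locs})
        for pair in combinations(metas, 2):
            counts[pair] += 1
    return counts


# Example proteins with the locations given in the Figure 1 caption.
examples = {
    "RPL19": {"Nucleoli", "Cytosol", "Endoplasmic reticulum"},
    "CCDC51": {"Mitochondria", "Nucleoplasm"},
    "KIAA1522": {"Plasma membrane", "Nucleoplasm"},
    "ITM2B": {"Golgi apparatus", "Vesicles"},
}

shared = shared_between_meta_compartments(examples)
```

ITM2B contributes to no pair here because both of its locations fall within the secretory pathway, whereas RPL19 spans all three meta-compartments and is counted once for each pair.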
Why does the cell have MLPs?
MLPs present several advantages for the cell, some of which are crucial for cellular survival. Shuttle proteins, for instance, constantly switch their location in order to transport other proteins between organelles, so their multilocalization is inseparably tied to their function. For example, members of the importin family transport proteins from the cytosol to the nucleus and hence are found in both organelles (Lange A et al, 2007, see also Figure 1). Another advantage of MLPs is that proteins required for reactions that are not limited to a single subcellular compartment need to evolve only once, e.g. mitochondria and peroxisomes share some enzymes in their lipid metabolism (Ashmarina LI et al, 1999). A switch of subcellular location is also essential for quick cellular responses to a changing environment. For example, receptors such as ERBB2, located in the plasma membrane, move to the nucleus after stimulation and change the expression pattern, a translocation that has been correlated with a worse prognosis in cancer (Wang SC et al, 2009).
Some MLPs are more than just multilocalizing: they are also multifunctional proteins. These proteins do not fit the paradigm of "one gene - one protein - one function", as they have more than one function, which might correlate with their presence at different locations. The existence of multifunctional proteins adds another dimension to cellular complexity and offers new starting points in systems biology, because they are involved in multiple pathways or serve as regulators of transcription (Jeffery CJ, 2015). A special class of multifunctional proteins are moonlighting proteins. The term has been used for people who work in different jobs during daylight and moonlight, and like their human counterparts, moonlighting proteins have two or more completely different jobs. However, in contrast to other multifunctional proteins, a protein is only defined as moonlighting if its multiple functions are not due to any of the following: gene fusions, RNA splice variants, post-translational modifications or pleiotropic effects (Jeffery CJ, 1999). An example of a moonlighting and multilocalizing protein is ENO1 (Figure 1), which fulfills different functions in the cytosol and in the plasma membrane.
The Human Protein Atlas does not provide functional studies of proteins and therefore cannot determine whether an MLP is multifunctional. However, the description of proteins at multiple locations is an important step in the discovery of multifunctional and moonlighting proteins, and the spatial information provided by the Cell Atlas could be integrated into existing prediction models (Chapple CE et al, 2015).
Figure 5. Cytoscape plots of MLPs. The interactive and clickable plots show the number of all shared MLPs between the individual organelles, including MLPs with additional locations. Only connecting nodes containing more than one protein and at least 0.5% of all human proteins are shown. The circle sizes are related to the number of proteins. The cyan colored nodes show combinations that are significantly overrepresented, while magenta colored nodes show combinations that are significantly underrepresented as compared to the probability of observing that combination based on the frequency of each annotation and a hypergeometric test (p≤0.05). Each node is clickable and results in a list of all proteins that are found in the connected organelles.
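The kind of hypergeometric test mentioned in the Figure 5 caption can be sketched as follows. This is a minimal illustration under stated assumptions, not the exact procedure used for the plots: it assumes a total of `total` annotated proteins, of which `n_a` carry annotation A and `n_b` carry annotation B, and asks how probable an overlap of at least (or at most) `n_both` proteins would be if the two annotations were independent. The function names and the toy numbers are hypothetical.

```python
from math import comb


def hypergeom_upper_tail(total, n_a, n_b, n_both):
    """P(overlap >= n_both): small values suggest overrepresentation."""
    max_overlap = min(n_a, n_b)
    favourable = sum(
        comb(n_a, k) * comb(total - n_a, n_b - k)
        for k in range(n_both, max_overlap + 1)
    )
    return favourable / comb(total, n_b)


def hypergeom_lower_tail(total, n_a, n_b, n_both):
    """P(overlap <= n_both): small values suggest underrepresentation."""
    favourable = sum(
        comb(n_a, k) * comb(total - n_a, n_b - k)
        for k in range(0, n_both + 1)
    )
    return favourable / comb(total, n_b)


# Hypothetical toy numbers: 20 proteins, 5 in organelle A, 5 in organelle B.
p_over = hypergeom_upper_tail(total=20, n_a=5, n_b=5, n_both=4)
p_under = hypergeom_lower_tail(total=20, n_a=5, n_b=5, n_both=0)

# Threshold taken from the caption (p <= 0.05).
is_overrepresented = p_over <= 0.05
is_underrepresented = p_under <= 0.05
```

With these toy numbers, an overlap of four shared proteins would pass the overrepresentation threshold, while an overlap of zero would not be flagged as underrepresented, because such a small sample makes an empty overlap fairly likely by chance.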
Relevant links and publications
Lange A et al, 2007. Classical nuclear localization signals: definition, function, and interaction with importin alpha. J Biol Chem.