All transcripts of all genes have been analyzed to determine the location(s) of the corresponding protein, based on prediction methods for signal peptides and transmembrane regions.
Genes with at least one transcript predicted to encode a secreted protein, according to prediction methods or to UniProt location data, have been further annotated and classified with the aim to determine if the corresponding protein(s) are secreted or actually retained in intracellular locations or membrane-attached.
The remaining genes, with no transcript predicted to encode a secreted protein, are assigned the prediction-based location(s).
The annotated location overrules the predicted location, so that a gene encoding a predicted secreted protein that has been annotated as intracellular will have intracellular as the final location.
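The precedence rule above can be sketched in a few lines. This is only a minimal illustration and not the Atlas's actual pipeline; the function name `final_location` and the location labels are hypothetical.

```python
from typing import Optional


def final_location(predicted: str, annotated: Optional[str] = None) -> str:
    """Return the final location for a gene.

    Hypothetical helper: an annotated location, when present,
    overrules the prediction-based location.
    """
    return annotated if annotated is not None else predicted


# A predicted secreted protein annotated as intracellular ends up intracellular.
print(final_location("secreted", "intracellular"))
# A gene without an annotation keeps its prediction-based location.
print(final_location("membrane"))
```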
The Tissue Atlas contains information regarding the expression profiles of human genes both on the mRNA and protein level. The protein expression data from 44 normal human tissue types is derived from antibody-based protein profiling using immunohistochemistry.
The RNA specificity category is based on mRNA expression levels in the analyzed samples, using a combination of data from HPA, GTEx and FANTOM5. The categories include: tissue enriched, group enriched, tissue enhanced, low tissue specificity and not detected.
The RNA distribution category is based on mRNA expression levels in the analyzed samples, using a combination of data from HPA, GTEx and FANTOM5. The categories include: detected in all, detected in many, detected in some, detected in single and not detected.
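As a rough illustration of how a distribution category of this kind could be assigned, the sketch below counts the samples in which a gene is detected. The detection cutoff and the one-third boundary between "some" and "many" are assumptions made for the example and are not stated in this section; the function name is hypothetical.

```python
def distribution_category(expression, cutoff=1.0):
    """Classify a gene by how many samples express it at or above `cutoff`.

    `expression` maps sample name -> normalized expression value.
    The cutoff and the one-third boundary are illustrative assumptions.
    """
    total = len(expression)
    detected = sum(1 for value in expression.values() if value >= cutoff)
    if detected == 0:
        return "not detected"
    if detected == total:
        return "detected in all"
    if detected == 1:
        return "detected in single"
    if detected * 3 < total:
        return "detected in some"
    return "detected in many"


# Made-up values for six samples; only one is at or above the cutoff.
toy = {"liver": 0.2, "kidney": 0.0, "uterus": 12.5,
       "lung": 0.4, "skin": 0.1, "colon": 0.3}
print(distribution_category(toy))
```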
A summary of the overall protein expression pattern across the analyzed normal tissues. The summary is based on knowledge-based annotation.
"Estimation of protein expression could not be performed. View primary data." is shown for genes analyzed with a knowledge-based approach where available RNA-seq and gene/protein characterization data has been evaluated as not sufficient in combination with immunohistochemistry data to yield a reliable estimation of the protein expression profile.
Distinct nuclear expression in the uterine tract and endometrial glands.
Summary of data presented in the Cell Atlas and a representative image of subcellular localization. The Cell Atlas provides RNA expression data derived from RNA sequencing of a large panel of cell lines and protein localization data derived from antibody-based profiling by immunofluorescence confocal microscopy, using a subset of cell lines selected based on RNA expression.
The main location is characterized by presence in all tested cell lines and/or increased intensity compared to other locations. It is highlighted in the illustration to the right. If available, links to overrepresentation analyses in Reactome, a free, open-source, curated and peer reviewed biological pathway database, are provided. An analysis is done for the corresponding gene set of the proteome localizing to the main and additional locations of the protein on this page, respectively.
Localized to the Nucleoplasm
The location(s) are highlighted in the illustration on the right.
Localized to the nucleoplasm.
RNA cell specificity
The cell lines in the Human Protein Atlas have been analyzed by RNA-seq to estimate the transcript abundance of each protein-coding gene. The RNA-seq data was then used to classify all genes according to their cell line-specific expression into one of six different categories, defined based on the total set of all NX values in all analyzed cell lines.
Classification of genes according to distribution of their RNA expression among the cell lines within the HPA. The categories include: detected in all, detected in many, detected in some, detected in single and not detected.
Summary of data presented in the Pathology Atlas, including mRNA and protein expression data from 17 different forms of human cancer, as well as correlation analysis of mRNA expression and patient survival. To the far left, a representative image of protein expression, based on immunohistochemistry, is shown. Next to it, a representative image of a Kaplan-Meier plot, based on correlation analysis. Images are clickable and redirect to pages with more Pathology Atlas data.
The regional specificity category is based on mRNA expression levels in the analyzed brain samples, grouped into 10 main brain regions and calculated for the three different species. The human brain expression profile is based on a combination of data from GTEx and FANTOM5. The specificity categories include: regionally enriched, group enriched, regionally enhanced, low regional specificity and not detected. The classification rules are the same as those used for the tissue specificity category.
The regional distribution category is based on whether mRNA expression is detected above the cutoff in the analyzed brain samples, grouped into 10 main brain regions and calculated for the three different species. The human brain expression is based on a combination of data from GTEx and FANTOM5. The distribution categories include: detected in all, detected in many, detected in some, detected in single and not detected. The classification rules are the same as those used for the tissue distribution category.
The RNA specificity category is based on mRNA expression levels in the analyzed samples, using data from HPA. The categories include: cell type enriched, group enriched, cell type enhanced, low cell type specificity and not detected.
The RNA distribution category is based on mRNA expression levels in the analyzed samples, using data from HPA. The categories include: detected in all, detected in many, detected in some, detected in single and not detected.
The blood-based immunoassay category applies to actively secreted proteins and is based on plasma or serum protein concentrations established with enzyme-linked immunosorbent assays, compiled from a literature search. The categories include: detected and not detected, where detection refers to a concentration found in the literature search.
Not analysed, since only proteins predicted to be actively secreted to blood are analysed here.
Detection or not of the gene in blood, based on spectral count estimations from a publicly available mass spectrometry-based plasma proteomics data set obtained from the PeptideAtlas.
Gene information from Ensembl and Entrez, as well as links to available gene identifiers are displayed here. Information was retrieved from Ensembl if not indicated otherwise.
MSX1 (HGNC Symbol)
HOX7, HYD1, OFC5
Msh homeobox 1 (HGNC Symbol)
Entrez gene summary
This gene encodes a member of the muscle segment homeobox gene family. The encoded protein functions as a transcriptional repressor during embryogenesis through interactions with components of the core transcription complex and other homeoproteins. It may also have roles in limb-pattern formation, craniofacial development, particularly odontogenesis, and tumor growth inhibition. Mutations in this gene, which was once known as homeobox 7, have been associated with nonsyndromic cleft lip with or without cleft palate 5, Witkop syndrome, Wolf-Hirschhorn syndrome, and autosomal dominant hypodontia. [provided by RefSeq, Jul 2008]
The protein browser displays the antigen location on the target protein(s) and the features of the target protein. The tabs at the top of the protein view section can be used to switch between the different splice variants to which an antigen has been mapped.
At the top of the view, the position of the antigen (identified by the corresponding HPA identifier) is shown as a green bar. A yellow triangle on the bar indicates a <100% sequence identity to the protein target.
Below the antigens, the maximum percent sequence identity of the protein to all other proteins from other human genes is displayed, using a sliding window of 10 aa residues (HsID 10) or 50 aa residues (HsID 50). The region with the lowest possible identity is always selected for antigen design, with a maximum identity of 60% allowed for designing a single-target antigen.
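One way to picture the region selection described above is the sketch below. It assumes a precomputed per-position identity profile (percent identity to other human proteins) and interprets "lowest possible identity" as the region whose highest identity value is smallest; both the interpretation and the function name `pick_antigen_region` are assumptions for illustration, not the Atlas's documented procedure.

```python
def pick_antigen_region(identity_profile, region_length, max_identity=60.0):
    """Pick the region whose highest identity value is lowest.

    Returns (start_index, worst_identity_in_region, is_single_target).
    `identity_profile` holds percent identities per position (assumed input).
    """
    if region_length <= 0 or region_length > len(identity_profile):
        raise ValueError("region_length must fit inside the profile")
    best_start = 0
    best_worst = max(identity_profile[:region_length])
    for start in range(1, len(identity_profile) - region_length + 1):
        worst = max(identity_profile[start:start + region_length])
        if worst < best_worst:
            best_start, best_worst = start, worst
    # A region qualifies as single-target only if it stays within the limit.
    return best_start, best_worst, best_worst <= max_identity


# Made-up identity values for a six-position profile.
print(pick_antigen_region([90, 80, 40, 30, 50, 95], region_length=3))
```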
The curve in blue displays the predicted antigenicity, i.e. the tendency for different regions of the protein to generate an immune response, with peak regions being predicted to be more antigenic. The curve shows average values based on a sliding window approach using an in-house propensity scale.
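The sliding-window averaging behind such a curve can be sketched as follows. The in-house propensity scale is not given here, so `TOY_SCALE` below contains made-up values, and the window size of 3 is likewise an assumption chosen only to keep the example small.

```python
def sliding_window_average(values, window):
    """Return the mean of each full window of `window` consecutive values."""
    if window <= 0 or window > len(values):
        return []
    running = sum(values[:window])
    averages = [running / window]
    for i in range(window, len(values)):
        # Slide the window by one: add the new value, drop the oldest.
        running += values[i] - values[i - window]
        averages.append(running / window)
    return averages


# Hypothetical per-residue propensity scores (NOT the in-house scale).
TOY_SCALE = {"A": 0.1, "K": 0.9, "E": 0.8, "L": 0.2, "P": 0.6}

sequence = "AKELPKEA"
scores = [TOY_SCALE[residue] for residue in sequence]
profile = sliding_window_average(scores, window=3)

# The window with the highest average is the predicted peak region.
peak_start = max(range(len(profile)), key=profile.__getitem__)
print(peak_start, sequence[peak_start:peak_start + 3])
```

A running sum is used so that each step costs constant time regardless of the window size.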
If a signal peptide is predicted by a majority of the signal peptide predictors SPOCTOPUS, SignalP 4.0, and Phobius (turquoise) and/or transmembrane regions (orange) are predicted by MDM, these are displayed.
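The majority rule for the three signal peptide predictors can be sketched as below. The function takes the predictors' yes/no calls as plain booleans; it does not run SPOCTOPUS, SignalP 4.0 or Phobius, and the function name and example calls are hypothetical.

```python
def majority_predicts_signal_peptide(calls):
    """Return True if more than half of the predictors report a signal peptide.

    `calls` maps predictor name -> bool (assumed to be computed elsewhere).
    """
    positives = sum(1 for called in calls.values() if called)
    return positives * 2 > len(calls)


# Made-up calls: two of three predictors agree, so the peptide is displayed.
example_calls = {"SPOCTOPUS": True, "SignalP 4.0": True, "Phobius": False}
print(majority_predicts_signal_peptide(example_calls))
```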
Low complexity regions are shown in yellow and InterPro regions in green. Common (purple) and unique (grey) regions between different splice variants of the gene are also displayed, and at the bottom of the protein view is the protein scale.
The protein information section displays alternative protein-coding transcripts (splice variants) encoded by this gene according to the Ensembl database.
The ENSP identifier links to the Ensembl website protein summary, while the ENST identifier links to the Ensembl website transcript summary for the selected splice variant. The data in the UniProt column can be expanded to show links to all matching UniProt identifiers for this protein.
The protein classes assigned to this protein are shown if expanding the data in the protein class column. Parent protein classes are in bold font and subclasses are listed under the parent class.
The Gene Ontology terms assigned to this protein are listed if expanding the Gene ontology column. The length of the protein (amino acid residues according to Ensembl), molecular mass (kDalton), predicted signal peptide (according to a majority of the signal peptide predictors SPOCTOPUS, SignalP 4.0, and Phobius) and the number of predicted transmembrane region(s) (according to MDM) are also reported.
Predicted intracellular proteins; Intracellular proteins predicted by MDM and MDSEC; Transcription factors; Helix-turn-helix domains; Disease related genes; Mapped to neXtProt; neXtProt - Evidence at protein level; Protein evidence (PeptideAtlas)
GO:0000122 [negative regulation of transcription from RNA polymerase II promoter]
GO:0000902 [cell morphogenesis]
GO:0000977 [RNA polymerase II regulatory region sequence-specific DNA binding]
GO:0000981 [RNA polymerase II transcription factor activity, sequence-specific DNA binding]
GO:0000982 [transcription factor activity, RNA polymerase II proximal promoter sequence-specific DNA binding]
GO:0001227 [transcriptional repressor activity, RNA polymerase II transcription regulatory region sequence-specific DNA binding]
GO:0001228 [transcriptional activator activity, RNA polymerase II transcription regulatory region sequence-specific DNA binding]
GO:0001701 [in utero embryonic development]
GO:0001837 [epithelial to mesenchymal transition]
GO:0002039 [p53 binding]
GO:0003007 [heart morphogenesis]
GO:0003198 [epithelial to mesenchymal transition involved in endocardial cushion formation]
GO:0003677 [DNA binding]
GO:0005634 [nucleus]
GO:0005654 [nucleoplasm]
GO:0006351 [transcription, DNA-templated]
GO:0006355 [regulation of transcription, DNA-templated]
GO:0006366 [transcription from RNA polymerase II promoter]
GO:0007275 [multicellular organism development]
GO:0007507 [heart development]
GO:0007517 [muscle organ development]
GO:0008285 [negative regulation of cell proliferation]
GO:0009952 [anterior/posterior pattern specification]
GO:0010463 [mesenchymal cell proliferation]
GO:0021983 [pituitary gland development]
GO:0023019 [signal transduction involved in regulation of gene expression]
GO:0030308 [negative regulation of cell growth]
GO:0030326 [embryonic limb morphogenesis]
GO:0030509 [BMP signaling pathway]
GO:0030513 [positive regulation of BMP signaling pathway]
GO:0030900 [forebrain development]
GO:0030901 [midbrain development]
GO:0034504 [protein localization to nucleus]
GO:0035115 [embryonic forelimb morphogenesis]
GO:0035116 [embryonic hindlimb morphogenesis]
GO:0035326 [enhancer binding]
GO:0035880 [embryonic nail plate morphogenesis]
GO:0042474 [middle ear morphogenesis]
GO:0042475 [odontogenesis of dentin-containing tooth]
GO:0042476 [odontogenesis]
GO:0042481 [regulation of odontogenesis]
GO:0042733 [embryonic digit morphogenesis]
GO:0043066 [negative regulation of apoptotic process]
GO:0043392 [negative regulation of DNA binding]
GO:0043517 [positive regulation of DNA damage response, signal transduction by p53 class mediator]
GO:0043565 [sequence-specific DNA binding]
GO:0045892 [negative regulation of transcription, DNA-templated]
GO:0045944 [positive regulation of transcription from RNA polymerase II promoter]
GO:0048863 [stem cell differentiation]
GO:0050821 [protein stabilization]
GO:0051154 [negative regulation of striated muscle cell differentiation]
GO:0051216 [cartilage development]
GO:0060021 [palate development]
GO:0060325 [face morphogenesis]
GO:0060349 [bone morphogenesis]
GO:0060536 [cartilage morphogenesis]
GO:0061180 [mammary gland epithelium development]
GO:0061312 [BMP signaling pathway involved in heart development]
GO:0071316 [cellular response to nicotine]
GO:0090427 [activation of meiosis]
GO:1902255 [positive regulation of intrinsic apoptotic signaling pathway by p53 class mediator]
GO:2000678 [negative regulation of transcription regulatory region DNA binding]
GO:2001055 [positive regulation of mesenchymal cell apoptotic process]