The Human Tissue Specific Proteome

All human genes, approximately 20000, are classified according to their expression across all major organs and tissue types in the human body. Few genes are strictly tissue specific; however, genes with elevated expression in particular tissues are an interesting starting point for understanding their biology and function, as well as the underlying mechanisms of disease.

A total of 10992 genes are elevated in at least one of the analyzed tissues, of which:

  • 3107 are tissue enriched genes
  • 1691 are group enriched genes
  • 6194 are tissue enhanced genes


Transcriptome analysis of all major organs and tissue types in the human body can be visualized with regard to the specificity and distribution of transcribed mRNA molecules across all 20090 putative protein coding genes (Figure 1). Specificity illustrates the number of genes with elevated or non-elevated expression in a particular tissue compared to other tissues. The analysis identifies 10992 genes with elevated expression and 8233 genes with low tissue specificity (read more in The housekeeping proteome). Elevated expression is divided into three subcategories:

  • Tissue enriched: At least four-fold higher mRNA level in a particular tissue compared to any other tissue.
  • Group enriched: At least four-fold higher average mRNA level in a group of 2-5 tissues compared to any other tissue.
  • Tissue enhanced: At least four-fold higher mRNA level in a particular tissue compared to the average level in all other tissues.
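The three definitions above can be read as a small decision procedure. The following is a minimal Python sketch under stated assumptions, not the Human Protein Atlas implementation: the function name `classify_specificity`, the precedence order (tissue enriched, then group enriched, then tissue enhanced), the choice of candidate groups as the top 2-5 tissues ranked by nTPM, and the two fallback labels are assumptions made for illustration. Any detection cutoff is ignored here.

```python
from statistics import mean

FOLD = 4.0  # four-fold threshold taken from the definitions above


def classify_specificity(ntpm):
    """Classify one gene from a dict mapping tissue name -> nTPM.

    Returns (category, tissues). The precedence order and the way groups
    are chosen are assumptions of this sketch.
    """
    if len(ntpm) < 2:
        raise ValueError("at least two tissues are needed for a comparison")

    # Rank tissues from highest to lowest expression.
    ranked = sorted(ntpm.items(), key=lambda kv: kv[1], reverse=True)
    values = [v for _, v in ranked]
    if values[0] <= 0:
        return "not detected", []  # assumed label for genes with no signal

    # Tissue enriched: top tissue is at least 4x any other tissue.
    if values[0] >= FOLD * values[1]:
        return "tissue enriched", [ranked[0][0]]

    # Group enriched: mean of the top k tissues (k = 2..5) is at least
    # 4x the highest tissue outside the group.
    for k in range(2, min(5, len(values) - 1) + 1):
        if mean(values[:k]) >= FOLD * values[k]:
            return "group enriched", [t for t, _ in ranked[:k]]

    # Tissue enhanced: a tissue is at least 4x the mean of all other tissues.
    total = sum(values)
    n_other = len(values) - 1
    enhanced = [t for t, v in ranked if v >= FOLD * (total - v) / n_other]
    if enhanced:
        return "tissue enhanced", enhanced

    return "low tissue specificity", []
```

For example, a profile such as `{"liver": 100, "kidney": 5, "lung": 2}` falls in the tissue enriched branch, because the liver value is more than four times the next highest value.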

Distribution, on the other hand, visualizes how many genes have, or do not have, detectable levels (nTPM≥1) of transcribed mRNA molecules. As evident in Table 1, all elevated genes are categorized as:

  • Detected in single: Detected in a single tissue
  • Detected in some: Detected in more than one but less than one third of tissues
  • Detected in many: Detected in at least a third but not all tissues
  • Detected in all: Detected in all tissues
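The distribution categories depend only on how many tissues reach the detection cutoff (nTPM≥1). A minimal Python sketch of that rule follows. The function name `classify_distribution` and the "Not detected" label for genes below the cutoff in every tissue are assumptions for illustration and do not appear in the list above.

```python
def classify_distribution(ntpm, cutoff=1.0):
    """Assign a distribution category from a dict of tissue name -> nTPM."""
    n_tissues = len(ntpm)
    # A gene counts as detected in a tissue when nTPM >= cutoff.
    n_detected = sum(1 for value in ntpm.values() if value >= cutoff)

    if n_detected == 0:
        return "Not detected"  # assumed label, not part of the list above
    if n_detected == 1:
        return "Detected in single"
    if n_detected == n_tissues:
        return "Detected in all"
    # "Less than one third" written with integers to avoid rounding issues.
    if 3 * n_detected < n_tissues:
        return "Detected in some"
    return "Detected in many"
```

With 36 tissues, a gene detected in 11 tissues would be "Detected in some", while a gene detected in 12 tissues (exactly one third) would be "Detected in many".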

A. Specificity

B. Distribution

Figure 1. (A) The distribution of all genes across the five categories based on transcript specificity in all 37 analyzed tissues. (B) The distribution of all genes across the six categories based on transcript detection (nTPM≥1) in all 37 analyzed tissues.


Table 1. The number of genes in the subdivided categories of elevated expression in all 37 analyzed tissues.

Distribution in the 36 tissues

Specificity       Detected in single   Detected in some   Detected in many   Detected in all   Total
Tissue enriched                  867               1331                733               176    3107
Group enriched                     0                895                662               134    1691
Tissue enhanced                  185               1080               3122              1807    6194
Total                           1052               3306               4517              2117   10992
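The counts in Table 1 can be cross-checked arithmetically: each specificity row should add up to its Total, and each distribution column should add up to the Total row. The short Python sketch below performs that check. The variable names are illustrative only, and the cell values are the Table 1 entries read as four counts per row.

```python
# Columns: Detected in single, some, many, all (order as in Table 1).
counts = {
    "Tissue enriched": [867, 1331, 733, 176],
    "Group enriched": [0, 895, 662, 134],
    "Tissue enhanced": [185, 1080, 3122, 1807],
}
stated_row_totals = {
    "Tissue enriched": 3107,
    "Group enriched": 1691,
    "Tissue enhanced": 6194,
}
stated_column_totals = [1052, 3306, 4517, 2117]
stated_grand_total = 10992

# Recompute the margins from the individual cells.
row_sums = {name: sum(row) for name, row in counts.items()}
column_sums = [sum(col) for col in zip(*counts.values())]
grand_total = sum(row_sums.values())

print("rows consistent:   ", row_sums == stated_row_totals)
print("columns consistent:", column_sums == stated_column_totals)
print("grand total matches:", grand_total == stated_grand_total)
```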

The number of tissue elevated genes varies considerably between the analyzed tissue types (see Table 2 below). Testis shows the highest number of tissue enriched genes (n=911), followed by the brain (n=517) and liver (n=273). When all tissue elevated genes are taken into consideration, however, the brain has a higher number (n=2709) than the testis (n=1987). The large number of enriched genes in testis is considered to be due to the highly specialized processes occurring during spermatogenesis. Many of these genes likely have a shared expression with oocytes in the female ovaries. Oocytes are, however, difficult to analyze because of the complex kinetics of female germ cell development, including the first rounds of meiosis, which in females occur at the embryonic stage. As expected, tissues that have similar functions and morphology often have higher numbers of shared group enriched genes.

In addition to previously known proteins, the analysis also identified a large number of genes with tissue elevated expression patterns that were previously poorly characterized and with no or only scarce evidence of existence at protein level. The combined RNA and antibody-based profiling can thus be used to confirm the physiological functions of such protein coding genes lacking previous annotation. These proteins are interesting starting points for further in-depth studies to gain a better understanding of the molecular mechanisms of the various cellular phenotypes that define the function of each respective tissue and organ.


Table 2. The tissue elevated genes.

Tissue    Tissue enriched    Group enriched    Tissue enhanced    Total elevated
Choroid plexus 16 188 232 436
Brain 517 637 1555 2709
Retina 124 236 399 759
Pituitary gland 19 119 160 298
Thyroid gland 14 31 131 176
Parathyroid gland 28 44 134 206
Adrenal gland 25 66 137 228
Lung 18 47 132 197
Salivary gland 43 89 200 332
Esophagus 23 101 320 444
Tongue 3 234 259 496
Stomach 36 77 209 322
Intestine 120 237 530 887
Liver 273 174 534 981
Gallbladder 3 15 68 86
Pancreas 61 78 182 321
Kidney 56 136 226 418
Urinary bladder 4 36 159 199
Testis 911 303 773 1987
Epididymis 93 72 149 314
Prostate 14 31 84 129
Seminal vesicle 6 14 55 75
Breast 19 39 83 141
Vagina 0 68 93 161
Cervix 0 28 103 131
Endometrium 3 13 82 98
Fallopian tube 28 100 187 315
Ovary 4 28 121 153
Placenta 66 52 170 288
Heart muscle 34 121 199 354
Skeletal muscle 56 276 601 933
Smooth muscle 0 7 44 51
Adipose tissue 3 30 156 189
Skin 179 117 297 593
Bone marrow 106 165 637 908
Lymphoid tissue 202 288 974 1464
Total 3107 1691 6194 10992
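The rows of Table 2 can be parsed and checked in a few lines. The Python sketch below, whose names (`TABLE2`, `parse_row`) are illustrative assumptions, splits each row from the right so that multi-word tissue names stay intact, verifies that the three categories add up to the stated total for every tissue, and ranks the tissues.

```python
TABLE2 = """\
Choroid plexus 16 188 232 436
Brain 517 637 1555 2709
Retina 124 236 399 759
Pituitary gland 19 119 160 298
Thyroid gland 14 31 131 176
Parathyroid gland 28 44 134 206
Adrenal gland 25 66 137 228
Lung 18 47 132 197
Salivary gland 43 89 200 332
Esophagus 23 101 320 444
Tongue 3 234 259 496
Stomach 36 77 209 322
Intestine 120 237 530 887
Liver 273 174 534 981
Gallbladder 3 15 68 86
Pancreas 61 78 182 321
Kidney 56 136 226 418
Urinary bladder 4 36 159 199
Testis 911 303 773 1987
Epididymis 93 72 149 314
Prostate 14 31 84 129
Seminal vesicle 6 14 55 75
Breast 19 39 83 141
Vagina 0 68 93 161
Cervix 0 28 103 131
Endometrium 3 13 82 98
Fallopian tube 28 100 187 315
Ovary 4 28 121 153
Placenta 66 52 170 288
Heart muscle 34 121 199 354
Skeletal muscle 56 276 601 933
Smooth muscle 0 7 44 51
Adipose tissue 3 30 156 189
Skin 179 117 297 593
Bone marrow 106 165 637 908
Lymphoid tissue 202 288 974 1464
"""


def parse_row(line):
    """Split off the last four integers; the remainder is the tissue name."""
    name, *numbers = line.rsplit(maxsplit=4)
    enriched, group, enhanced, total = map(int, numbers)
    return name, enriched, group, enhanced, total


rows = [parse_row(line) for line in TABLE2.splitlines() if line.strip()]

# Every row: tissue enriched + group enriched + tissue enhanced == total.
for name, enriched, group, enhanced, total in rows:
    assert enriched + group + enhanced == total, name

# Rank tissues by tissue enriched genes and by total elevated genes.
top_enriched = [r[0] for r in sorted(rows, key=lambda r: r[1], reverse=True)[:3]]
top_total = [r[0] for r in sorted(rows, key=lambda r: r[4], reverse=True)[:3]]
enriched_sum = sum(r[1] for r in rows)

print(top_enriched, top_total, enriched_sum)
```

The tissue enriched column adds up to the 3107 in the Total row, which is consistent with each tissue enriched gene being assigned to a single tissue. The group enriched and tissue enhanced columns are not expected to add up in the same way, presumably because such genes can be counted under more than one tissue.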


Tissue elevated genes

The comprehensive analysis presented here has identified 10992 human genes that display a tissue elevated expression pattern across the human body. By combining the analysis with antibody-based protein profiling using immunohistochemistry, the location of the corresponding proteins can be determined at a cellular and subcellular level. Examples of protein expression patterns of tissue elevated genes are presented below.

Brain

  • GFAP (Glial fibrillary acidic protein) - astrocyte intermediate filament protein
  • MBP (Myelin basic protein) - a major constituent of the myelin sheath
  • ELAVL3 (ELAV like RNA binding protein 3) - neural-specific RNA-binding protein


GFAP - cerebral cortex

MBP - hippocampus

ELAVL3 - cerebral cortex

Retina

  • RHO (Rhodopsin) - involved in phototransduction in rod photoreceptors
  • ARR3 (Arrestin 3) - involved in phototransduction in cone photoreceptors


RHO - retina

ARR3 - retina

Endocrine tissues

  • FSHB (Follicle stimulating hormone beta subunit) - hormone inducing egg and sperm production
  • TG (Thyroglobulin) - substrate for the synthesis of thyroid hormones
  • HSD3B2 (Hydroxy-delta-5-steroid dehydrogenase, 3 beta- and steroid delta-isomerase 2) - involved in the biosynthesis of hormonal steroids


FSHB - pituitary gland

TG - thyroid gland

HSD3B2 - adrenal gland

Lung

  • SFTPA1 (Surfactant protein A1) - involved in surfactant homeostasis and the defense against respiratory pathogens
  • SFTPB (Surfactant protein B) - involved in surfactant homeostasis and the defense against respiratory pathogens


SFTPA1 - lung

SFTPB - lung

Proximal digestive tract

  • STATH (Statherin) - inhibits precipitation of calcium phosphate salts in the saliva
  • KRT4 (Keratin 4) - expressed in differentiated layers of mucosal and esophageal epithelia


STATH - salivary gland

KRT4 - esophagus

Gastrointestinal tract

  • PGA4 (Pepsinogen 4, group I (pepsinogen A)) - enzyme for digestion of dietary proteins
  • DEFA5 (Defensin alpha 5) - antimicrobial and cytotoxic peptide involved in host defense
  • KRT20 (Keratin 20) - maintains keratin filament organization in intestinal epithelia


PGA4 - stomach

DEFA5 - duodenum

KRT20 - colon

Liver & gallbladder

  • ALB (Albumin) - plasma protein
  • CYP2A13 (Cytochrome P450 family 2 subfamily A member 13) - involved in drug metabolism, cholesterol and steroid synthesis
  • CHST4 (Carbohydrate sulfotransferase 4) - an enzyme involved in the modification of glycan structures


ALB - liver

CYP2A13 - liver

CHST4 - gallbladder

Pancreas

  • AMY2A (Amylase, alpha 2A) - an enzyme that digests carbohydrates, secreted by exocrine cells
  • INS (Insulin) - involved in lowering of blood glucose, secreted by beta cells
  • GCG (Glucagon) - involved in the elevation of blood glucose, secreted by alpha cells


AMY2A - pancreas

INS - pancreas

GCG - pancreas

Kidney & urinary bladder

  • SLC22A13 (Solute carrier family 22 member 13) - membrane-bound organic anion transporter
  • NPHS2 (Podocin) - involved in the regulation of glomerular permeability
  • UPK2 (Uroplakin 2) - membrane protein preventing cell rupture during bladder distention


SLC22A13 - kidney

NPHS2 - kidney

UPK2 - urinary bladder

Male tissues

  • DMRT1 (Doublesex- and mab-3-related transcription factor 1) - involved in meiosis
  • SEMG1 (Semenogelin I) - a predominant protein in semen
  • KLK3 (Kallikrein related peptidase 3) - also called PSA, is used clinically to diagnose prostate cancer


DMRT1 - testis

SEMG1 - seminal vesicle

KLK3 - prostate

Female tissues

  • CSH1 (Chorionic somatomammotropin hormone 1) - hormone important for growth control during pregnancy
  • OVGP1 (Oviductal glycoprotein 1) - mucus protein important in mucociliary transport of the fertilized ovum
  • PWWP3B (PWWP domain containing 3B) - a protein with a mutated melanoma-associated antigen 1 domain, associated with cancer


CSH1 - placenta

OVGP1 - fallopian tube

PWWP3B - ovary

Muscle tissues

  • TNNI3 (Troponin I3, cardiac type) - mediates muscle relaxation
  • TNNT2 (Troponin T2, cardiac type) - mediates muscle contraction
  • MYH7 (Myosin heavy chain 7) - expressed in slow type I muscle fibers


TNNI3 - heart muscle

TNNT2 - heart muscle

MYH7 - skeletal muscle

Connective & soft tissue

  • FABP4 (Fatty acid binding protein 4) - involved in fatty acid uptake, transport, and metabolism
  • PLIN1 (Perilipin 1) - coats lipid storage droplets in adipocytes


FABP4 - adipose tissue (soft tissue)

PLIN1 - adipose tissue (breast)

Skin

  • KRT1 (Keratin 1) - involved in squamous differentiation and skin barrier function
  • KRT27 (Keratin 27) - plays a role in hair formation
  • CASP14 (Caspase 14) - involved in keratinocyte differentiation and cornification


KRT1 - skin

KRT27 - hair

CASP14 - skin

Bone marrow & lymphoid tissues

  • MPO (Myeloperoxidase) - major component of neutrophil azurophilic granules
  • CD8B (CD8b molecule) - plays a critical role in thymic selection of CD8+ T-cells
  • CD22 (CD22 molecule) - mediates interactions between B-cells


MPO - bone marrow

CD8B - thymus

CD22 - lymph node


Group enriched proteins

The 1691 genes identified as group enriched reflect genes with shared expression in 2-5 tissues. Many of these genes encode proteins that are expressed in cell types with similar functions across several tissues, such as proteins expressed in immune cells (present in many organs but especially lymphoid tissues and the gastrointestinal tract), proteins involved in squamous cell differentiation (e.g. cervix, esophagus and skin), glandular cell function in the gastrointestinal tract (duodenum, small intestine and colon) or cilia movement (testis and fallopian tube). The schematic network plot below shows how group enriched genes are distributed among the different tissues.

Figure 2. An interactive network plot of the tissue enriched and group enriched genes connected to their respective enriched tissues (grey circles). Red nodes represent the number of tissue enriched genes and orange nodes represent the number of genes that are group enriched. The sizes of the red and orange nodes are related to the number of genes displayed within the node. Each node is clickable and results in a list of all enriched genes connected to the highlighted edges. The network is limited to group enriched genes in combinations of up to 3 tissues, but the resulting lists show the complete set of group enriched genes in the particular tissue.


Immune cells can be found in both lymphoid organs and organs infiltrated by immune cells, such as the intestine. Consequently, genes important for immune cell function are often enriched in both lymphoid tissues and the intestine. One such gene is CD19, which encodes a co-receptor for the B-cell antigen receptor complex on B lymphocytes. The co-receptor is essential for their differentiation and proliferation, including antibody production, in response to antigens.


CD19 - tonsil

CD19 - appendix

CD19 - colon

Squamous epithelia are found in many parts of the body as dry skin or wet mucosa, acting as a robust barrier against various chemical and mechanical stresses. DSC3 (Desmocollin 3), which encodes a protein important in cell-cell junctions and cellular adhesion, is group enriched in squamous epithelia, such as the esophagus and skin, exemplified below.


DSC3 - esophagus

DSC3 - skin

Mucus has several functions in the body related to transportation and barrier functions. The function of the mucus in the salivary gland is related to food and pathogens, while the mucus in the cervix is involved in, for example, transportation and blockage of sperm during sexual reproduction. MUC16 is a mucus component and is group enriched in both the mucus-producing salivary gland and cervix.


MUC16 - salivary gland

MUC16 - cervix

The fallopian tube shares many elevated genes with testis. The common denominator is the utilization of cilia, or the structurally similar flagellum, for essential organ functions. DNAI2, a dynein protein, constitutes a motor protein component of motile cilia of multiciliated cells as well as the flagellum (tail) of the sperm. By pulling on the microtubule structure of the cilium/flagellum, the motor protein creates motion and in the case of the sperm, sperm motility. In the immunohistochemistry images below, expression of DNAI2 can be seen in a subset of cilia in the fallopian tube (left and middle image), as well as in the flagellum of spermatids and cytoplasm of differentiating spermatocytes (right image).


DNAI2 - fallopian tube

DNAI2 - fallopian tube ciliated cells

DNAI2 - testis

