The lung-specific proteome

The lung is a respiratory organ essential for breathing and responsible for the gaseous exchange between air and blood. The branching airways end in alveoli, where the gaseous exchange occurs. The cell types in lung tissue are dominated by alveolar cells, bronchial epithelium, alveolar macrophages, endothelial cells and interstitial cells. Transcriptome analysis shows that 71% (n=14381) of all human proteins (n=20162) are expressed in the lung, and 195 of these genes show elevated expression in the lung compared to other tissue types.

  • 195 elevated genes
  • 17 enriched genes
  • 42 group enriched genes
  • Lung has most group enriched gene expression in common with lymphoid tissue

The lung transcriptome

Transcriptome analysis of the lung can be visualized with regard to the specificity and distribution of transcribed mRNA molecules (Figure 1). Specificity illustrates the number of genes with elevated or non-elevated expression in the lung compared to other tissues. Elevated expression comprises three subcategories:

  • Tissue enriched: At least four-fold higher mRNA level in lung compared to any other tissue.
  • Group enriched: At least four-fold higher average mRNA level in a group of 2-5 tissues compared to any other tissue.
  • Tissue enhanced: At least four-fold higher mRNA level in lung compared to the average level in all other tissues.
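
The three definitions above can be read as a simple decision rule over per-tissue expression values. The following is a minimal sketch of that rule, not the Human Protein Atlas implementation: the function name, the input format (a dictionary mapping tissue name to nTPM) and the order in which the categories are tested are assumptions made for illustration, and any additional criteria the Atlas may apply are not modelled.

```python
from statistics import mean

FOLD = 4.0       # "at least four-fold" threshold from the definitions above
MAX_GROUP = 5    # group enriched covers groups of 2-5 tissues


def classify_specificity(ntpm, tissue="lung"):
    """Classify one gene's expression in `tissue` from a {tissue: nTPM} dict.

    Hypothetical helper: a simplified reading of the three definitions.
    """
    level = ntpm[tissue]
    # Expression in all other tissues, sorted from highest to lowest.
    others = sorted((v for t, v in ntpm.items() if t != tissue), reverse=True)
    if not others or level <= 0:
        return "not elevated"

    # Tissue enriched: at least four-fold higher than any other tissue.
    if level >= FOLD * others[0]:
        return "tissue enriched"

    # Group enriched: the target tissue plus its 1-4 highest-expressing
    # partners, whose average is at least four-fold higher than the
    # highest tissue left outside the group.
    for partners in range(1, MAX_GROUP):
        outside = others[partners:]
        if not outside:
            break
        group_mean = mean([level] + others[:partners])
        if group_mean >= FOLD * outside[0]:
            return "group enriched"

    # Tissue enhanced: at least four-fold higher than the average of
    # all other tissues.
    if level >= FOLD * mean(others):
        return "tissue enhanced"
    return "not elevated"
```

Testing the categories from strictest to loosest means each gene receives the most specific label it qualifies for; this ordering is a design choice of the sketch.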

Distribution, on the other hand, visualizes how many genes have, or do not have, detectable levels (nTPM≥1) of transcribed mRNA molecules in the lung compared to other tissues. As evident in Table 1, all genes elevated in lung are categorized as:

  • Detected in single: Detected in a single tissue
  • Detected in some: Detected in more than one but less than one-third of tissues
  • Detected in many: Detected in at least a third but not all tissues
  • Detected in all: Detected in all tissues
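
The distribution categories depend only on how many tissues reach the detection cutoff. A minimal sketch under stated assumptions: the function name and the dictionary input are hypothetical, and the "not detected" label for genes below the cutoff in every tissue is an assumption added here to make the function total; it is not one of the four categories listed above.

```python
DETECTION_CUTOFF = 1.0  # nTPM >= 1 counts as detected, as stated in the text


def classify_distribution(ntpm, cutoff=DETECTION_CUTOFF):
    """Assign a distribution category from a {tissue: nTPM} dict."""
    total = len(ntpm)
    detected = sum(1 for value in ntpm.values() if value >= cutoff)

    if detected == 0:
        return "not detected"  # assumed label, not listed in the text
    if detected == total:
        return "detected in all"
    if detected == 1:
        return "detected in single"
    # Integer comparison avoids floating-point issues at the one-third boundary.
    if 3 * detected < total:
        return "detected in some"
    return "detected in many"
```

With 36 tissues, for example, a gene detected in 11 tissues falls under "detected in some", while one detected in 12 tissues (exactly one-third) falls under "detected in many".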

Figure 1. (A) Specificity: the distribution of all genes across the five categories based on transcript specificity in lung as well as in all other tissues. (B) Distribution: the distribution of all genes across the six categories, based on transcript detection (nTPM≥1) in lung as well as in all other tissues.

As shown in Figure 1, 195 genes show some level of elevated expression in the lung compared to other tissues. The three categories of genes with elevated expression in lung compared to other organs are shown in Table 1. Table 2 lists the 12 genes with the highest enrichment in lung.

Table 1. The number of genes in the subdivided categories of elevated expression in lung.

Distribution in the 36 tissues
                 Detected in single  Detected in some  Detected in many  Detected in all  Total
Tissue enriched                   1                 3                12                1     17
Group enriched                    0                16                24                2     42
Tissue enhanced                   1                29                93               13    136
Total                             2                48               129               16    195
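
The counts in Table 1 can be cross-checked against the stated row and column totals. A minimal sketch, with the cell values transcribed by hand from the table and the variable names chosen for illustration:

```python
# Columns: detected in single, some, many, all (transcribed from Table 1).
table1 = {
    "Tissue enriched": [1, 3, 12, 1],
    "Group enriched": [0, 16, 24, 2],
    "Tissue enhanced": [1, 29, 93, 13],
}
stated_row_totals = {"Tissue enriched": 17, "Group enriched": 42, "Tissue enhanced": 136}
stated_column_totals = [2, 48, 129, 16]
stated_grand_total = 195

# Recompute the totals from the individual cells.
row_totals = {name: sum(cells) for name, cells in table1.items()}
column_totals = [sum(column) for column in zip(*table1.values())]
grand_total = sum(row_totals.values())

print(row_totals == stated_row_totals)
print(column_totals == stated_column_totals)
print(grand_total == stated_grand_total)
```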

Table 2. The 12 genes with the highest level of enriched expression in lung. "Tissue distribution" describes the transcript detection (nTPM≥1) in lung as well as in all other tissues. "mRNA (tissue)" shows the transcript level in lung as nTPM values. "Tissue specificity score (TS)" corresponds to the fold-change between the expression level in lung and the tissue with the second-highest expression level.

Gene     Description                                           Tissue distribution  mRNA (tissue)  Tissue specificity score
SFTPA1   surfactant protein A1                                 Detected in some     6866.0         1049
SFTPC    surfactant protein C                                  Detected in many     17877.0        795
SFTPA2   surfactant protein A2                                 Detected in many     10463.2        405
SCGB3A2  secretoglobin family 3A member 2                      Detected in many     1229.1         123
SFTPB    surfactant protein B                                  Detected in many     3453.9         31
SFTPD    surfactant protein D                                  Detected in many     492.1          25
AGER     advanced glycosylation end-product specific receptor  Detected in many     704.7          19
NAPSA    napsin A aspartic peptidase                           Detected in many     1011.1         11
MS4A15   membrane spanning 4-domains A15                       Detected in some     42.5           11
RTKN2    rhotekin 2                                            Detected in many     75.2           10
SCGB1A1  secretoglobin family 1A member 1                      Detected in many     2979.8         9
SLC34A2  solute carrier family 34 member 2                     Detected in many     729.3          8
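
The tissue specificity score described in the Table 2 caption is a ratio of two expression levels. A minimal sketch, assuming a dictionary of per-tissue nTPM values; the function name and the example values are hypothetical and are not taken from the Atlas data:

```python
import math


def tissue_specificity_score(ntpm, tissue="lung"):
    """Fold-change between `tissue` and the highest-expressing other tissue."""
    level = ntpm[tissue]
    # When `tissue` has the highest level, the maximum over the remaining
    # tissues is the second-highest expression level overall.
    runner_up = max(value for name, value in ntpm.items() if name != tissue)
    if runner_up <= 0:
        return math.inf  # no expression elsewhere: the ratio is unbounded
    return level / runner_up


# Hypothetical expression profile for illustration only.
example = {"lung": 800.0, "thyroid gland": 100.0, "liver": 4.0}
score = tissue_specificity_score(example)
```

The score is only meaningful as a specificity measure when the tissue of interest has the highest expression; for the example profile the lung level is eight times that of the runner-up tissue.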

Protein expression of genes elevated in lung

In-depth analysis of the genes elevated in lung, using antibody-based protein profiling, allowed us to visualize the expression patterns of the corresponding proteins within the lung. The analysis showed expression in alveolar cells, ciliated and mucus-secreting cells in the respiratory mucosa, as well as in endothelial cells and macrophages.

Proteins specifically expressed in alveolar cells of the lung

Alveolar cells make up the alveolar structure and are essential for normal respiration. Alveolar cells produce surfactant, a lipoprotein complex crucial for the gaseous exchange between air and blood and for lowering surface tension which prevents alveolar collapse. Surfactant is also important for protecting the lungs from infection. Examples of proteins associated with the production and maintenance of pulmonary surfactant include SFTPA1, SFTPC and NAPSA.




Proteins specifically expressed in macrophages of the lung

Airborne microorganisms entering the lungs are digested and destroyed by macrophages, which play an important role in host defense. One example of a protein expressed in macrophages is MRC1, which mediates endocytosis of pathogenic viruses, bacteria and fungi. Other examples include MARCO, a scavenger receptor of the innate antimicrobial immune system that may bind both Gram-negative and Gram-positive bacteria, and MCEMP1, a protein of unknown function.




Proteins specifically expressed in ciliated cells of the lung

Ciliated cells are found along the bronchi, where they help free the airways from inhaled contaminants. One example of a protein expressed in ciliated cells is cytochrome P450 4B1 (CYP4B1). In animals, CYP4B1 can activate a number of protoxicants and procarcinogens; however, the function of human CYP4B1 is still unclear.


Proteins specifically expressed in mucus-secreting cells of the lung

Mucus-secreting cells are present in both bronchial epithelium and peribronchial glands. The secreted mucus is important for maintaining a suitable environment for ciliary function and protection against airborne infectious agents and solid particles. One example of a protein expressed in mucus-secreting cells is SCGB1A1, implicated in anti-inflammation and epithelial regeneration after oxidant-induced injury. Defects in SCGB1A1 are associated with asthma.


Proteins specifically expressed in endothelial cells of the lung

Up to 30% of the cells in the lung are endothelial cells, outlining the alveoli and participating in the gaseous exchange. One example of a protein expressed in lung endothelial cells is PRX. PRX encodes a protein suggested to be required for maintenance of the peripheral nerve myelin sheath and to play a role in axon–glial interaction. Distinct expression in lung endothelial cells has not previously been described.


Gene expression shared between lung and other tissues

There are 42 group enriched genes expressed in lung. Group enriched genes are defined as genes showing an at least four-fold higher average level of mRNA expression in a group of 2-5 tissues, including lung, compared to any other tissue.

To illustrate the relation of lung tissue to other tissue types, a network plot was generated, displaying the number of genes with a shared expression between different tissue types.

Figure 2. An interactive network plot of the lung enriched and group enriched genes connected to their respective enriched tissues (grey circles). Red nodes represent the number of lung enriched genes and orange nodes represent the number of genes that are group enriched. The sizes of the red and orange nodes are related to the number of genes displayed within the node. Each node is clickable and results in a list of all enriched genes connected to the highlighted edges. The network is limited to group enriched genes in combinations of up to 4 tissues, but the resulting lists show the complete set of group enriched genes in the particular tissue.

Lung shares most group enriched gene expression with lymphoid tissue, as well as with many other tissues. One example of a gene that is group enriched in the lung and thyroid gland is NK2 homeobox 1 (NKX2-1). NKX2-1 is a transcription factor suggested to regulate lung surfactant homeostasis and early development of lung structures, while in the thyroid it is involved in regulating the expression of genes important for thyroid hormone production. Mutations in the NKX2-1 gene have been associated with breathing difficulty and reduced thyroid gland function (hypothyroidism). Immunohistochemistry shows staining in thyroid glandular cells and respiratory epithelial cells.

NKX2-1 - bronchus

NKX2-1 - thyroid gland

Lung function

The lungs are among the largest organs in the human body. They are responsible for supplying the circulatory system with oxygen, which is then transported to all other organs in the body. Inhaled air passes through the nose or mouth via the trachea to the bronchi, and further through the bronchioli, before finally reaching the alveoli of the lungs. This is where the gaseous exchange occurs: oxygen is exchanged for carbon dioxide, which is transported in the opposite direction and exhaled.

The physiological function of the lung is regulated by a complex molecular concert of specialized cell types, such as alveolar cells, macrophages, ciliated cells, mucus-secreting cells and endothelial cells.

Lung histology

The pulmonary alveolus, where the gaseous exchange takes place, is composed of a continuous layer of epithelial cells overlying a thin interstitium. Two morphologically distinct cell types, alveolar cells type I and type II, line the alveoli. Alveolar macrophages are also present on the epithelial surface. The interstitium contains capillaries involved in the exchange of gas, as well as connective tissue and a variety of cells involved in alveolar shape and defense. The trachea, bronchi and bronchioli are air-filled branching tubes that include basal cells, neuroendocrine cells, ciliated cells, serous cells, Clara cells and goblet cells.

The histology of human lung including detailed images and information about the different cell types can be viewed in the Protein Atlas Histology Dictionary.


Here, the protein-coding genes expressed in lung are described and characterized, together with examples of immunohistochemically stained tissue sections that visualize corresponding protein expression patterns of genes with elevated expression in lung.

Transcript profiling was based on a combination of two transcriptomics datasets (HPA and GTEx), corresponding to a total of 14590 samples from 50 different human normal tissue types. The final consensus normalized expression (nTPM) value for each tissue type was used for the classification of all genes according to the tissue-specific expression into two different categories, based on specificity or distribution.

Relevant links and publications

Uhlén M et al., Tissue-based map of the human proteome. Science (2015)
PubMed: 25613900 DOI: 10.1126/science.1260419

Fagerberg L et al., Analysis of the human tissue-specific expression by genome-wide integration of transcriptomics and antibody-based proteomics. Mol Cell Proteomics. (2014)
PubMed: 24309898 DOI: 10.1074/mcp.M113.035600

Lindskog C et al., The lung-specific proteome defined by integration of transcriptomics and antibody-based profiling. FASEB J. (2014)
PubMed: 25169055 DOI: 10.1096/fj.14-254862

Histology dictionary - the lung