The lung-specific proteome

The lung is a respiratory organ essential for breathing and responsible for the gaseous exchange between air and blood. The branching airways end in alveoli, where the gaseous exchange occurs. The cell types in lung tissue are dominated by alveolar cells, bronchial epithelium, alveolar macrophages, endothelial cells and interstitial cells. Transcriptome analysis shows that 76% (n=15021) of all human proteins (n=19670) are expressed in the lung, and 239 of these genes show elevated expression in the lung compared to other tissue types.

  • 239 elevated genes
  • 13 enriched genes
  • 61 group enriched genes
  • Lung has most group enriched gene expression in common with blood

The lung transcriptome

Transcriptome analysis of the lung can be visualized with regard to specificity and distribution of transcribed mRNA molecules (Figure 1). Specificity illustrates the number of genes with elevated or non-elevated expression in the lung compared to other tissues. Elevated expression comprises three subcategories:

  • Tissue enriched: At least four-fold higher mRNA level in lung compared to any other tissue.
  • Group enriched: At least four-fold higher average mRNA level in a group of 2-5 tissues compared to any other tissue.
  • Tissue enhanced: At least four-fold higher mRNA level in lung compared to the average level in all other tissues.
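The three definitions above can be sketched as a small classifier. This is a minimal illustration, not the Human Protein Atlas pipeline: the function name `classify_specificity`, the dictionary input, and the choice to test only the top-ranked 2-5 tissues as candidate groups are assumptions made for the example, and any additional rules the real classification applies (such as detection cutoffs) are left out.

```python
from statistics import mean

FOLD = 4.0  # fold-change threshold stated in the definitions above


def classify_specificity(nx, tissue, fold=FOLD):
    """Return the specificity category of one gene for `tissue`.

    `nx` maps tissue name -> expression level for a single gene
    (hypothetical input format, chosen for this sketch).
    """
    target = nx[tissue]
    others = {t: v for t, v in nx.items() if t != tissue}
    if target <= 0 or not others:
        return "Not elevated"

    # Tissue enriched: at least `fold` times the level in every other tissue.
    if target >= fold * max(others.values()):
        return "Tissue enriched"

    # Group enriched: the top 2-5 tissues (which must include `tissue`) have an
    # average level at least `fold` times the highest level outside the group.
    ranked = sorted(nx, key=nx.get, reverse=True)
    for size in range(2, 6):
        group, rest = ranked[:size], ranked[size:]
        if tissue in group and rest:
            if mean(nx[t] for t in group) >= fold * max(nx[t] for t in rest):
                return "Group enriched"

    # Tissue enhanced: at least `fold` times the average of all other tissues.
    if target >= fold * mean(others.values()):
        return "Tissue enhanced"

    return "Not elevated"


# Made-up expression values, for illustration only.
print(classify_specificity({"lung": 100, "liver": 10, "kidney": 5, "brain": 1}, "lung"))
print(classify_specificity({"lung": 80, "thyroid gland": 60, "liver": 5, "brain": 2}, "lung"))
```

The categories are checked from most to least specific, so a gene that meets several criteria is reported under the strictest one.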

Distribution, on the other hand, visualizes how many genes have, or do not have, detectable levels (NX≥1) of transcribed mRNA molecules in the lung compared to other tissues. As evident in Table 1, all genes elevated in lung are categorized as:

  • Detected in single: Detected in a single tissue
  • Detected in some: Detected in more than one but less than one third of tissues
  • Detected in many: Detected in at least a third but not all tissues
  • Detected in all: Detected in all tissues
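The distribution categories listed above amount to counting the tissues in which a gene reaches the detection cutoff. The following is a minimal sketch under stated assumptions: the function name `classify_distribution` and the list input are illustrative, and the "Not detected" label for genes below the cutoff everywhere is an assumption, since the list above covers only detected genes.

```python
def classify_distribution(nx_values, cutoff=1.0):
    """Return the distribution category for one gene.

    `nx_values` holds one NX value per tissue; a tissue counts as
    detected when its value is at least `cutoff` (NX >= 1 in the text).
    """
    total = len(nx_values)
    detected = sum(1 for v in nx_values if v >= cutoff)
    if detected == 0:
        return "Not detected"        # assumed label, not in the list above
    if detected == 1:
        return "Detected in single"
    if detected == total:
        return "Detected in all"
    if 3 * detected < total:         # fewer than one third of the tissues
        return "Detected in some"
    return "Detected in many"        # at least a third, but not all


# With 37 tissues, one third is 12.33, so 12 detected tissues fall in "some"
# and 13 detected tissues fall in "many".
print(classify_distribution([5.0] * 12 + [0.0] * 25))
print(classify_distribution([5.0] * 13 + [0.0] * 24))
```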

Figure 1. (A) Specificity: the distribution of all genes across the five categories based on transcript specificity in lung as well as in all other tissues. (B) Distribution: the distribution of all genes across the six categories, based on transcript detection (NX≥1) in lung as well as in all other tissues.

As shown in Figure 1, 239 genes show some level of elevated expression in the lung compared to other tissues. The three categories of genes with elevated expression in lung compared to other organs are shown in Table 1. Table 2 lists the 12 genes with the highest enrichment in lung.

Table 1. Number of genes in the subdivided categories of elevated expression in lung.

Distribution in the 37 tissues
Category | Detected in single | Detected in some | Detected in many | Detected in all | Total
Tissue enriched | 1 | 9 | 3 | 0 | 13
Group enriched | 0 | 28 | 30 | 3 | 61
Tissue enhanced | 0 | 42 | 114 | 9 | 165
Total | 1 | 79 | 147 | 12 | 239
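As a quick consistency check, the counts in Table 1 can be summed by row and by column. The sketch below reads each row as four per-category counts followed by a total (for example 1, 9, 3 and 0 for tissue enriched genes). That reading is an interpretation of the table, and the variable names are illustrative.

```python
columns = ["Detected in single", "Detected in some",
           "Detected in many", "Detected in all"]

# Counts per specificity category, in the column order given above.
table1 = {
    "Tissue enriched": [1, 9, 3, 0],
    "Group enriched": [0, 28, 30, 3],
    "Tissue enhanced": [0, 42, 114, 9],
}

row_totals = {name: sum(counts) for name, counts in table1.items()}
column_totals = [sum(col) for col in zip(*table1.values())]
grand_total = sum(row_totals.values())

print(row_totals)     # {'Tissue enriched': 13, 'Group enriched': 61, 'Tissue enhanced': 165}
print(column_totals)  # [1, 79, 147, 12]
print(grand_total)    # 239
```

The row totals match the 13 enriched, 61 group enriched and 165 enhanced genes, and the grand total matches the 239 elevated genes reported in the text.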

Table 2. The 12 genes with the highest level of enriched expression in lung. "Tissue distribution" describes the transcript detection (NX≥1) in lung as well as in all other tissues. "mRNA (tissue)" shows the transcript level in lung as NX values. "Tissue specificity score (TS)" corresponds to the fold-change between the expression level in lung and the tissue with second highest expression level.

Gene | Description | Tissue distribution | mRNA (tissue) | Tissue specificity score
SFTPC | surfactant protein C | Detected in single | 664.5 | 718
SFTPA2 | surfactant protein A2 | Detected in some | 501.0 | 225
SFTPA1 | surfactant protein A1 | Detected in some | 411.8 | 66
SCGB1A1 | secretoglobin family 1A member 1 | Detected in some | 508.4 | 18
SFTPB | surfactant protein B | Detected in some | 757.8 | 17
SCGB3A2 | secretoglobin family 3A member 2 | Detected in some | 248.0 | 8
AGER | advanced glycosylation end-product specific receptor | Detected in some | 178.6 | 8
SFTA2 | surfactant associated 2 | Detected in many | 104.5 | 8
SFTPD | surfactant protein D | Detected in some | 127.0 | 6
LAMP3 | lysosomal associated membrane protein 3 | Detected in many | 123.9 | 6
CACNA2D2 | calcium voltage-gated channel auxiliary subunit alpha2delta 2 | Detected in many | 86.1 | 6
RTKN2 | rhotekin 2 | Detected in some | 67.6 | 4
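The tissue specificity score described in the caption of Table 2 is a ratio between the level in lung and the highest level in any other tissue. A minimal sketch, assuming one NX value per tissue; the function name `tissue_specificity_score` and the example values are hypothetical and not taken from the atlas data.

```python
import math


def tissue_specificity_score(nx, tissue):
    """Fold-change between `tissue` and the highest-expressing other tissue.

    `nx` maps tissue name -> NX value for one gene. The score matches the
    table's definition when `tissue` has the highest expression.
    """
    runner_up = max(v for t, v in nx.items() if t != tissue)
    if runner_up == 0:
        return math.inf  # no expression anywhere else
    return nx[tissue] / runner_up


# Made-up values for illustration only.
example = {"lung": 120.0, "thyroid gland": 30.0, "liver": 2.0}
print(tissue_specificity_score(example, "lung"))  # 4.0
```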

Protein expression of genes elevated in lung

In-depth analysis of the genes elevated in lung, using antibody-based protein profiling, allowed us to visualize the expression patterns of the corresponding proteins within the lung. The analysis showed expression in alveolar cells, ciliated and mucus-secreting cells in the respiratory mucosa, as well as in endothelial cells and macrophages.

Proteins specifically expressed in alveolar cells of the lung

Alveolar cells make up the alveolar structure and are essential for normal respiration. Alveolar cells produce surfactant, a lipoprotein complex crucial for the gaseous exchange between air and blood and for lowering surface tension, which prevents alveolar collapse. Surfactant is also important for protecting the lungs from infection. Examples of proteins associated with the production and maintenance of pulmonary surfactant include SFTPA1, SFTPC and NAPSA.




Proteins specifically expressed in macrophages of the lung

Airborne microorganisms entering the lungs are digested and destroyed by macrophages, which play an important role in host defense. One example of a protein expressed in macrophages is MRC1, which mediates endocytosis of pathogenic viruses, bacteria and fungi. Other examples include MARCO, a scavenger receptor that is part of the innate antimicrobial immune system and may bind both Gram-negative and Gram-positive bacteria, and MCEMP1, a protein with unknown function.




Proteins specifically expressed in ciliated cells of the lung

Ciliated cells are found along the bronchi, where they help free the airways from inhaled contaminants. One example of a protein expressed in ciliated cells is SNTN, which is suggested to be a molecular component of the ciliary tip structures, making the distal portion of the cilia stiffer, and thus allowing for better airway clearance.


Proteins specifically expressed in mucus-secreting cells of the lung

Mucus-secreting cells are present in both bronchial epithelium and peribronchial glands. The secreted mucus is important for maintaining a suitable environment for ciliary function and protection against airborne infectious agents and solid particles. One example of a protein expressed in mucus-secreting cells is SCGB1A1, implicated in anti-inflammation and epithelial regeneration after oxidant-induced injury. Defects in SCGB1A1 are associated with asthma.


Proteins specifically expressed in endothelial cells of the lung

Up to 30% of the cells in the lung are endothelial cells, outlining the alveoli and participating in the gaseous exchange. One example of a protein expressed in lung endothelial cells is PRX. PRX encodes a protein suggested to be required for maintenance of peripheral nerve myelin sheath, also playing a role in axon–glial interaction. Distinct expression in lung endothelial cells has previously not been described.


Gene expression shared between lung and other tissues

There are 61 group enriched genes expressed in lung. Group enriched genes are defined as genes showing at least four-fold higher average level of mRNA expression in a group of 2-5 tissues, including lung, compared to all other tissues.

In order to illustrate the relation of lung tissue to other tissue types, a network plot was generated, displaying the number of genes with shared expression between different tissue types.

Figure 2. An interactive network plot of the lung enriched and group enriched genes connected to their respective enriched tissues (grey circles). Red nodes represent the number of lung enriched genes and orange nodes represent the number of genes that are group enriched. The sizes of the red and orange nodes are related to the number of genes displayed within the node. Each node is clickable and results in a list of all enriched genes connected to the highlighted edges. The network is limited to group enriched genes in combinations of up to 3 tissues, but the resulting lists show the complete set of group enriched genes in the particular tissue.
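The node sizes in such a network plot come from counting how many genes share each combination of tissues. The sketch below shows one way to produce those counts. The function name `count_shared_groups`, the input mapping and the gene names are hypothetical, and this is not the code behind the interactive figure.

```python
from collections import Counter


def count_shared_groups(enriched_in, focus="lung", max_tissues=3):
    """Count genes per tissue combination that includes `focus`.

    `enriched_in` maps gene -> set of tissues in which the gene is enriched
    (one tissue) or group enriched (two or more tissues). Combinations larger
    than `max_tissues` are skipped, mirroring the limit in the figure caption.
    """
    counts = Counter()
    for gene, tissues in enriched_in.items():
        combo = frozenset(tissues)
        if focus in combo and len(combo) <= max_tissues:
            counts[combo] += 1
    return counts


# Placeholder genes and tissue sets, for illustration only.
example = {
    "GENE_A": {"lung"},
    "GENE_B": {"lung", "thyroid gland"},
    "GENE_C": {"lung", "thyroid gland"},
    "GENE_D": {"lung", "blood", "spleen"},
    "GENE_E": {"lung", "blood", "spleen", "bone marrow"},  # more than 3 tissues
    "GENE_F": {"liver"},                                   # does not include lung
}
for combo, n in count_shared_groups(example).items():
    print(sorted(combo), n)
```

Each key with a single tissue corresponds to a tissue enriched node, and each key with two or three tissues corresponds to a group enriched node connected to those tissues.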

One example of a group enriched gene in the lung and thyroid gland is NK2 homeobox 1 (NKX2-1). NKX2-1 is a transcription factor suggested to regulate lung surfactant homeostasis and early development of lung structures, while in the thyroid it is involved in regulating the expression of genes important for thyroid hormone production. Mutations in the NKX2-1 gene have been associated with breathing difficulty and reduced thyroid gland function (hypothyroidism). Immunohistochemistry shows staining in thyroid glandular cells and respiratory epithelial cells.

NKX2-1 - bronchus

NKX2-1 - thyroid gland

Lung function

The lungs are one of the largest organs in the human body. They are responsible for supplying the circulatory system with oxygen, which is then transported to all other organs in the body. Inhaled air passes through the nose or mouth, via the trachea, to the bronchi and further through the bronchioli before finally reaching the alveoli of the lungs. This is where the gaseous exchange occurs: oxygen is exchanged for carbon dioxide, which is transported in the opposite direction and exhaled.

The physiological function of the lung is regulated by a complex molecular concert of specialized cell types, such as alveolar cells, macrophages, ciliated cells, mucus-secreting cells and endothelial cells.

Lung histology

The pulmonary alveolus, where the gaseous exchange takes place, is composed of a continuous layer of epithelial cells overlying a thin interstitium. Two morphologically distinct cell types, alveolar cells type I and type II, line the alveoli. Alveolar macrophages are also present on the epithelial surface. The interstitium contains capillaries involved in the exchange of gas, as well as connective tissue and a variety of cells involved in alveolar shape and defense. The trachea, bronchi and bronchioli are air-filled branching tubes that include basal cells, neuroendocrine cells, ciliated cells, serous cells, Clara cells and goblet cells.

The histology of human lung including detailed images and information about the different cell types can be viewed in the Protein Atlas Histology Dictionary.


Here, the protein-coding genes expressed in lung are described and characterized, together with examples of immunohistochemically stained tissue sections that visualize corresponding protein expression patterns of genes with elevated expression in lung.

Transcript profiling was based on a combination of three transcriptomics datasets (HPA, GTEx and FANTOM5), corresponding to a total of 483 samples from 37 different human normal tissue types. The final consensus normalized expression (NX) value for each tissue type was used to classify all genes by tissue-specific expression in two different ways, based on specificity or distribution.

Relevant links and publications

Uhlén M et al., Tissue-based map of the human proteome. Science (2015)
PubMed: 25613900 DOI: 10.1126/science.1260419

Yu NY et al., Complementing tissue characterization by integrating transcriptome profiling from the Human Protein Atlas and from the FANTOM5 consortium. Nucleic Acids Res. (2015)
PubMed: 26117540 DOI: 10.1093/nar/gkv608

Fagerberg L et al., Analysis of the human tissue-specific expression by genome-wide integration of transcriptomics and antibody-based proteomics. Mol Cell Proteomics. (2014)
PubMed: 24309898 DOI: 10.1074/mcp.M113.035600

Lindskog C et al., The lung-specific proteome defined by integration of transcriptomics and antibody-based profiling. FASEB J. (2014)
PubMed: 25169055 DOI: 10.1096/fj.14-254862

Histology dictionary - the lung