The specialized epithelial cell-specific proteome
Epithelial cells form sheets of cells, epithelia, that line the outer and inner surfaces of the body and constitute the building blocks for glandular tissues. In addition to glandular and squamous epithelial cells, there are several other types of epithelial cells, specialized to the purpose of their environment.
Transcriptome analysis shows that 81% (n=16313) of all human proteins (n=20162) are detected in specialized epithelial cells and that 2994 of the corresponding genes show elevated expression in at least one specialized epithelial cell type compared to other cell type groups. In-depth analysis of the elevated genes using scRNA-seq and antibody-based protein profiling allowed us to visualize the expression patterns of these proteins in specialized epithelial cell types of the following tissues: lung, salivary gland, pancreas, liver, kidney, thymus, testis and ovary.
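The proportions quoted above can be checked with a few lines of arithmetic. This is only a minimal sketch: the three counts come from the text, while the variable and function names are illustrative.

```python
# Counts quoted in the text above.
TOTAL_PROTEINS = 20162   # all human proteins
DETECTED = 16313         # detected in specialized epithelial cells
ELEVATED = 2994          # detected genes with elevated expression


def percent(part, whole):
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


detected_pct = percent(DETECTED, TOTAL_PROTEINS)
elevated_pct = percent(ELEVATED, DETECTED)

print(f"detected: {detected_pct:.1f}% of all proteins")
print(f"elevated: {elevated_pct:.1f}% of detected genes")
```

The first value rounds to the 81% stated in the text; the second expresses the elevated genes as a share of the detected ones.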
The specialized epithelial cell transcriptome
The scRNA-seq-based specialized epithelial cell transcriptome can be analyzed with regard to specificity, illustrating the number of genes with elevated expression in each specific specialized epithelial cell type compared to other cell types (Table 1). Genes with elevated expression are divided into three subcategories: cell type enriched, group enriched and cell type enhanced.
Table 1. Number of genes in the subdivided specificity categories of elevated expression in the analyzed specialized epithelial cell types.
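To make the idea of a specificity classification concrete, the following sketch assigns a single gene to a category from its expression across cell types. It is not the atlas pipeline: the category names (cell type enriched, group enriched, cell type enhanced), the four-fold cutoff, the detection limit of 1 nTPM, the maximum group size of 10 and the function name `classify_specificity` are all assumptions made for illustration.

```python
from statistics import mean


def classify_specificity(expr, fold=4.0, min_ntpm=1.0, max_group=10):
    """Classify one gene from a dict mapping cell type -> nTPM.

    Thresholds are illustrative assumptions, not confirmed atlas settings.
    Returns (category, list_of_cell_types).
    """
    ranked = sorted(expr.items(), key=lambda kv: kv[1], reverse=True)
    names = [name for name, _ in ranked]
    values = [value for _, value in ranked]

    if not values or values[0] < min_ntpm:
        return "not detected", []

    # Enriched: the top cell type is at least `fold` times the runner-up.
    if len(values) == 1 or values[0] >= fold * values[1]:
        return "cell type enriched", names[:1]

    # Group enriched: a small group whose mean is at least `fold` times
    # the highest value outside the group.
    for k in range(2, min(max_group, len(values) - 1) + 1):
        group = values[:k]
        if min(group) >= min_ntpm and mean(group) >= fold * values[k]:
            return "group enriched", names[:k]

    # Enhanced: at least `fold` times the mean of all other cell types.
    total = sum(values)
    enhanced = [
        name for name, value in ranked
        if value >= min_ntpm
        and value >= fold * (total - value) / (len(values) - 1)
    ]
    if enhanced:
        return "cell type enhanced", enhanced
    return "low specificity", []


# Made-up expression values, for illustration only.
example = {"hepatocytes": 400, "cholangiocytes": 20, "Sertoli cells": 5}
category, cell_types = classify_specificity(example)
print(category, cell_types)
```

Counting, for each cell type, the genes that fall into any of the three elevated categories would give the kind of per-cell-type totals that Table 1 reports.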
Alveolar cells type 1
As shown in Table 1, 335 genes are elevated in alveolar cells type 1 compared to other cell types. Gas exchange between the air in the lung alveoli and blood takes place via the alveolar cells type 1, which line the alveolar walls. Examples of proteins elevated in alveolar cells type 1 are aquaporin 4 (AQP4), a cell membrane-bound channel that regulates water homeostasis of the fluid lining the lung, and advanced glycosylation end-product specific receptor (AGER), a transmembrane receptor with a broad repertoire of ligands associated with inflammation, infection and aging.
Alveolar cells type 2
As shown in Table 1, 245 genes are elevated in alveolar cells type 2 compared to other cell types. Alveolar cells type 2 are located in lung alveoli and produce surfactants which are crucial for the gaseous exchange between air and blood and for lowering surface tension which prevents alveolar collapse. A gene with enriched expression in alveolar cells type 2 is surfactant protein C (SFTPC), which encodes a surfactant protein. Another example is aspartic peptidase napsin A (NAPSA), a protease that may play a role in the proteolytic processing of surfactant protein B.
Salivary duct cells
As shown in Table 1, 126 genes are elevated in salivary duct cells compared to other cell types. Ductal epithelial cells in the salivary gland can be found throughout the exocrine tissue, forming the structure of the gland and guiding the saliva from the acini to the oral cavity. An example of a protein found in salivary duct cells with elevated expression compared to other cell types is the tight junction complex protein claudin 1 (CLDN1). Another gene with enhanced expression in salivary ducts is the nitric oxide-generating enzyme nitric oxide synthase 1 (NOS1), which has, among other functions, antimicrobial activity.
Pancreatic ductal cells
As shown in Table 1, 186 genes are elevated in ductal epithelial cells compared to other cell types. Ductal epithelial cells in the pancreas can be found throughout the exocrine tissue, guiding the secretions from the acini to the duodenum. Cystic fibrosis transmembrane conductance regulator (CFTR) is an example of a gene with elevated expression in the ductal epithelium of the pancreas. CFTR functions as an ion channel transporting Cl- and HCO3- ions out of the cell in an ATP-dependent manner. Another example is gamma-glutamyltransferase 1 (GGT1), which is expressed in cells that are involved in regulating secretion and absorption.
Hepatocytes
As shown in Table 1, 849 genes are elevated in hepatocytes compared to other cell types. Hepatocytes are the main cell type in the liver and are responsible for many of the body's metabolic processes as well as the breakdown of toxic substances. Examples of hepatocyte-enhanced genes are retinol dehydrogenase 16 (RDH16) and hydroxyacid oxidase 1 (HAO1), which are involved in lipid metabolism.
Cholangiocytes
As shown in Table 1, 210 genes are elevated in cholangiocytes compared to other cell types. Cholangiocytes are the epithelial cells of the bile duct system in the liver. Genes with enhanced expression in cholangiocytes include, for example, the transcription factor hepatocyte nuclear factor 1-beta (HNF1B) and the transport protein aquaporin 1 (AQP1), which forms a channel through which water moves along the osmotic gradient.
Proximal tubular cells
As shown in Table 1, 721 genes are elevated in proximal tubular cells compared to other cell types. Approximately 60% of the filtered Na+, Cl-, K+, Ca2+ and H2O, and more than 90% of the filtered HCO3-, are absorbed along the proximal tubule. This is also the segment that normally reabsorbs virtually all the filtered glucose and amino acids. An additional function is the secretion of numerous organic anions and cations. Examples of proteins that are elevated in the proximal part of the renal tubules are agmatinase (AGMAT), an enzyme involved in the processing of urea and amino acids, and N-acetyltransferase 8 (NAT8), an enzyme that catalyzes the acetylation of cysteine S-conjugates to form mercapturic acids as part of the detoxification of a wide variety of reactive electrophiles.
Distal tubular cells
As shown in Table 1, 366 genes are elevated in distal tubular cells compared to other cell types. The distal tubule and the collecting duct are the sites where critical regulatory hormones such as aldosterone and vasopressin regulate acid and potassium excretion and determine the final urinary concentration of K+, Na+, and Cl-. Proteins elevated in distal tubules include solute carrier family 12 member 1 (SLC12A1), one of several potassium, sodium, and calcium transporters essential for regulating the contents and volume of urine. Another example is transmembrane protein 52B (TMEM52B), which is highly elevated in distal tubular cells although its function is not completely characterized.
Collecting duct cells
As shown in Table 1, 226 genes are elevated in collecting duct cells compared to other cell types. Collecting duct cells are the main site of salt and water transport, as well as acid-base regulation. Aquaporin 2 (AQP2) and aquaporin 3 (AQP3) are members of the aquaporin gene family with elevated expression in collecting ducts. These two genes encode water-specific channel proteins that facilitate the reabsorption of water molecules from the urine. Another example of a protein elevated in collecting ducts is FXYD domain containing ion transport regulator 4 (FXYD4), which regulates the transport of ions across the cell membrane.
Mesothelial cells
As shown in Table 1, 249 genes are elevated in mesothelial cells compared to other cell types. Mesothelial cells are specialized epithelial cells that form a thin sheet covering the inner cavities and organs of the body, including the pleural, peritoneal and pericardial cavities. The thymus is positioned next to the heart and the pericardial cavity, covered by a mesothelium. Two proteins with elevated expression in mesothelial cells are the enzyme prostaglandin I2 synthase (PTGIS), involved in the metabolism of eicosanoids, and mesothelin (MSLN), which plays a role in cellular adhesion.
Sertoli cells
As shown in Table 1, 297 genes are elevated in Sertoli cells compared to other cell types. Sertoli cells constitute the seminiferous epithelium in testis, interspersed between the germ cells. They play an important role in spermatogenesis, where they are often referred to as nursing cells since their function is to nourish the developing sperm cells. Sertoli cells also play a central role in the control of spermatogenesis by transducing hormonal signals, e.g. activation and stimulation by follicle stimulating hormone (FSH). Examples of elevated genes in Sertoli cells are cannabinoid receptor 1 (CNR1), which is a G-protein coupled receptor for endogenous cannabinoids, and epididymal peptidase inhibitor (EPPIN), which has an essential role in male reproduction and fertility by providing antimicrobial protection.
Granulosa cells
As shown in Table 1, 254 genes are elevated in granulosa cells compared to other cell types. Granulosa cells are follicle cells surrounding the oocytes in the ovaries and are believed to originate from ovarian surface epithelium. Their main function is to support the growth and maturation of the oocyte and to support eventual pregnancy following ovulation through the production of hormones and growth factors. ELK1 and FOXL2 encode transcription factors that show enriched mRNA expression in granulosa cells as well as clear nuclear expression in granulosa cells of ovarian follicles.
Specialized epithelial cell function
Epithelial cells form sheets of cells, epithelia, that line the outer and inner surfaces of the body and constitute the building blocks for glandular tissues. Hence, epithelial cells are found in many parts of the body, including skin, airways, the digestive tract, glandular tissues and organs, as well as the urinary and reproductive systems. The wide range of functions of epithelial cells can be broadly divided into two main categories: transferring compounds into or out of the body, and forming a protective barrier against invading pathogens and physical, chemical or biological abrasion.
The histology of organs that contain specialized epithelial cells, including interactive images, is described in the Protein Atlas Histology Dictionary.
Here, the protein-coding genes expressed in specialized epithelial cells are described and characterized, together with examples of immunohistochemically stained tissue sections that visualize corresponding protein expression patterns of genes with elevated expression in different specialized epithelial cell types.
The transcript profiling was based on publicly available genome-wide expression data from scRNA-seq experiments covering 29 tissues and peripheral blood mononuclear cells (PBMCs). All datasets (unfiltered read counts of cells) were clustered separately using Louvain clustering, resulting in a total of 557 different cell type clusters. The clusters were then manually annotated based on a survey of known tissue and cell type-specific markers. The scRNA-seq data from each cluster of cells were aggregated to mean normalized protein-coding transcripts per million, giving a normalized expression value (nTPM) for each protein-coding gene. A specificity and distribution classification was performed to determine the number of genes elevated in these single cell types, and the number of genes detected in one, several or all cell types, respectively.
It should be noted that since the analysis was limited to datasets from 29 tissues and PBMCs only, not all human cell types are represented. Furthermore, some cell types are present only in low amounts, or identified only in mixed cell clusters, which may affect the results and bias the cell type specificity.
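The aggregation of per-cell read counts into one expression profile per cluster can be sketched as follows. This is a simplified illustration under stated assumptions, not the atlas pipeline: the function name `cluster_tpm` and the dictionary-based data layout are hypothetical, cluster labels are taken as given (the Louvain clustering itself is not shown), all genes are assumed to be protein-coding, and no between-dataset normalization is applied, so the output is only a TPM-like value rather than the nTPM used in the atlas.

```python
from collections import defaultdict


def cluster_tpm(cells, cluster_of):
    """Average counts per gene within each cluster, then scale to 1e6.

    cells:      dict cell_id -> dict gene -> read count
    cluster_of: dict cell_id -> cluster label (assumed precomputed)
    Returns dict cluster -> dict gene -> TPM-like value.
    """
    sums = defaultdict(lambda: defaultdict(float))
    n_cells = defaultdict(int)

    # Sum counts per gene over all cells that belong to a cluster.
    for cell_id, counts in cells.items():
        cluster = cluster_of[cell_id]
        n_cells[cluster] += 1
        for gene, count in counts.items():
            sums[cluster][gene] += count

    result = {}
    for cluster, gene_sums in sums.items():
        # Mean count per gene across the cells of the cluster.
        means = {g: s / n_cells[cluster] for g, s in gene_sums.items()}
        total = sum(means.values())
        # Rescale so that the genes of a cluster sum to one million.
        result[cluster] = {
            g: (m / total * 1e6 if total else 0.0) for g, m in means.items()
        }
    return result


# Made-up counts for three cells, for illustration only.
cells = {
    "cell1": {"SFTPC": 8, "AGER": 2},
    "cell2": {"SFTPC": 6, "AGER": 4},
    "cell3": {"SFTPC": 0, "AGER": 10},
}
cluster_of = {"cell1": "AT2", "cell2": "AT2", "cell3": "AT1"}
profiles = cluster_tpm(cells, cluster_of)
print(profiles["AT2"])
```

Working on cluster-level profiles rather than single cells reduces the effect of dropout in individual cells, at the cost of hiding heterogeneity within a cluster.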
Relevant links and publications
Uhlén M et al., Tissue-based map of the human proteome. Science (2015)