The epithelial cell-specific proteome
Epithelial cells line the outer and inner surfaces of the body, forming a barrier between self and non-self, and are involved in a range of functions, including protection against invasion and abrasion, secretion and absorption. Epithelial cells are traditionally classified according to their cellular and multicellular structure as well as their function, and are herein divided into three broad groups: glandular, squamous and other types of epithelial cells. Transcriptome analysis shows that 85% (n=16774) of all human proteins (n=19670) are detected in epithelial cells and 6081 of these genes show an elevated expression in at least one epithelial cell type compared to other cell type groups.
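The proportions quoted above can be checked with a few lines of arithmetic. The sketch below uses only the three counts given in the text; the helper name `percent` is introduced here for illustration and is not part of any Protein Atlas tooling.

```python
# Counts taken from the text above.
TOTAL_HUMAN_PROTEINS = 19670   # all human proteins (protein-coding genes)
DETECTED_IN_EPITHELIAL = 16774 # detected in epithelial cells
ELEVATED_IN_EPITHELIAL = 6081  # elevated in at least one epithelial cell type


def percent(part: int, whole: int) -> float:
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


# Share of all human proteins detected in epithelial cells.
detected_share = percent(DETECTED_IN_EPITHELIAL, TOTAL_HUMAN_PROTEINS)

# Share of the detected genes that are also elevated.
elevated_share_of_detected = percent(ELEVATED_IN_EPITHELIAL, DETECTED_IN_EPITHELIAL)

print(f"detected: {detected_share:.1f}% of all human proteins")
print(f"elevated: {elevated_share_of_detected:.1f}% of detected genes")
```

The first figure rounds to the 85% stated in the text; the second is a derived value that the text itself does not report.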
The epithelial cell transcriptome
The scRNA-seq-based epithelial cell transcriptome can be analyzed with regard to specificity, illustrating the number of genes with elevated expression in each specific epithelial cell type compared to other cell types (Table 1). Genes with an elevated expression are divided into three subcategories:
Table 1. Number of genes in the subdivided specificity categories of elevated expression in the analyzed epithelial cell types.
Protein expression of genes elevated in epithelial cells
In-depth analysis of the elevated genes in epithelial cells using scRNA-seq and antibody-based protein profiling allowed us to visualize the expression patterns of these proteins in different types of epithelial cells: glandular, squamous and other types of epithelial cells.
Proteins elevated in glandular epithelial cells
Glandular epithelial cells are multifunctional cells with the main functions of secretion and absorption. Glandular epithelia are found at the surface of inner cavities of the gastrointestinal tract, lung, kidney, uterus and glandular organs such as prostate and pancreas. Several types of glandular epithelial cells are analyzed below:
Enterocytes - intestine
As shown in Table 1, 604 genes are elevated in enterocytes compared to other cell types. Enterocytes are simple columnar epithelial cells in the small and large intestine. These cells facilitate the uptake of nutrients from the intestinal lumen. An example of a protein with elevated expression in enterocytes is intestinal alkaline phosphatase (ALPI), which has an important role in maintaining a healthy gut microbiome. Another example is cell surface A33 antigen (GPA33), a glycoprotein involved in cell-cell signaling that is observed in 95% of colon cancers.
Mucus-secreting cells - intestine
As shown in Table 1, 336 genes are elevated in mucus-secreting cells compared to other cell types. Mucus-secreting cells, also known as goblet cells, lie interspersed between the enterocytes in the small intestine and are present in larger numbers in the large intestine. These cells secrete glycoprotein-rich mucin to create a protective mucus layer over the intestinal epithelia. Examples of glycoproteins that have elevated expression in mucus-secreting cells include the Ca2+-independent lectin regenerating islet-derived protein 4 (REG4) and the Ca2+-dependent lectin intelectin-1 (ITLN1). Both of these proteins are involved in inflammatory responses by binding to microbial carbohydrate chains.
Paneth cells - intestine
As shown in Table 1, 412 genes are elevated in Paneth cells compared to other cell types. Paneth cells are epithelial cells located in the intestinal crypts that secrete antimicrobial peptides and immunomodulating proteins to regulate the intestinal microbiome. One example is the protein encoded by C10orf99, which is suggested to function as a chemokine ligand for G-protein coupled receptor 15 (GPR15). Another example of a protein elevated in Paneth cells is solute carrier family 22 member 18 antisense (SLC22A18AS), which is a potential transporter protein associated with Beckwith-Wiedemann syndrome.
Ciliated cells - lung
As shown in Table 1, 1158 genes are elevated in ciliated cells compared to other cell types. Ciliated cells with motile cilia are found in many parts of the body, including the respiratory epithelium lining the bronchi and bronchioles, where they help free the airways from inhaled contaminants. A protein expressed in ciliated cells, specifically in ciliary rootlets, is family with sequence similarity 92 member B (FAM92B). It is suggested to have a possible role in the biogenesis of cilia. Another example is cilia and flagella associated protein 157 (CFAP157), which is mainly known for its importance for flagella function. In ciliated cells, it is expressed in both ciliary rootlets and tips of cilia.
Club cells - lung
As shown in Table 1, 575 genes are elevated in club cells compared to other cell types. Club cells (also called Clara cells) are found in the respiratory bronchiole epithelium in the lung. They are proposed to have several roles, including secretion of the extracellular substance lining the respiratory bronchioles. A protein with elevated expression in club cells is secretoglobin family 1A member 1 (SCGB1A1), a small secreted molecule that is suggested to have numerous functions, including anti-inflammatory activity. Defects in the SCGB1A1 gene are related to susceptibility to asthma. Another example is secretory leukocyte peptidase inhibitor (SLPI), which is found in various bodily secretions including seminal plasma, cervical mucus, and bronchial secretions, where it protects epithelial tissues from serine proteases by inhibiting their activity.
Exocrine glandular cells - pancreas
As shown in Table 1, 347 genes are elevated in exocrine glandular cells compared to other cell types. Exocrine glandular cells are the major cell type in the pancreas. These cells secrete digestive enzymes and NaHCO3 into ducts leading to the duodenum. Examples of exocrine glandular cell-specific genes are phospholipase A2 precursor (PLA2G1B), a Ca2+-dependent phospholipase, and carboxypeptidase A1 precursor (CPA1), a protease that cleaves C-terminal branched-chain and aromatic amino acids.
Basal glandular cells - prostate
As shown in Table 1, 273 genes are elevated in basal glandular cells compared to other cell types. Basal cells in the prostate play an important part in the structural and luminal integrity of the prostate glands, and disruptions in these cells are associated with cancer. Keratin 15 (KRT15) is involved in maintaining the structural integrity of the basal cells, and is in some cases associated with disease development in prostate cancer. Another gene that is expressed in the basal glandular cells of the prostate is the nerve growth factor receptor (NGFR), which is associated with several different functions throughout the body. The receptor is known to interact with BDNF in controlling nerve cell growth and apoptosis. However, its exact function in prostate basal glandular cells is yet to be investigated.
Glandular cells - prostate
As shown in Table 1, 233 genes are elevated in glandular cells compared to other cell types. The prostate is composed of prostatic glands and a non-glandular stroma. Within the glandular structures, there are secretory cells, which are separated from the basement membrane and stroma by a layer of basal cells. One example of a protein with elevated expression in the glandular cells is kallikrein related peptidase 3 (KLK3), generally referred to as prostate-specific antigen (PSA), a serine protease that is synthesized by glandular cells of the prostate. Under normal conditions, PSA is secreted into the extracellular fluid in small quantities, and its function is believed to be important for liquefying the seminal coagulum, allowing sperm to swim freely. Another example is acid phosphatase, prostate (ACPP), an enzyme that catalyzes the conversion of orthophosphoric monoester to alcohol and orthophosphate. It is synthesized under androgen regulation and secreted by the epithelial cells of the prostate gland.
Other glandular epithelial cells
Glandular epithelial cells are also found in other secreting organs and tissues of the body, including salivary glands in the mouth, the inner lining of the stomach and female tissues such as the breast and uterus. The glandular epithelial cells of salivary glands and the stomach secrete compounds necessary for food digestion. These include mucin 7 (MUC7), a protein found in the protective and lubricating mucus produced by salivary glands, and hydrochloric acid, produced by parietal cells of the stomach with the help of ion transport proteins such as ATPase H+/K+ transporting beta subunit (ATP4B). Other glandular epithelial cells play important roles in female tissues, including the milk-secreting glandular cells of the lactating breast that express proteins such as lactalbumin alpha (LALBA), a key protein in milk production, and the glandular cells of the uterine endocervix that secrete proteins such as secretory leukocyte peptidase inhibitor (SLPI), a protease inhibitor with antimicrobial function that protects epithelial surfaces from endogenous proteolytic enzymes.
Proteins elevated in squamous epithelial cells
Squamous epithelia consist of multiple layers of cells, with superficial layers of squamous (flat) cells and underlying replenishing cells. The innermost layer of epithelial cells, in contact with the underlying basement membrane, consists of cuboidal multipotent stem cells, called basal cells. Basal cells divide to renew the entire epithelial lining, which is under repeated stress and abrasion from the environment, causing the superficial layers to slough off. The daughter cells of basal cells slowly transition into squamous cornified (rigid) dead cells with a high content of glycogen as they become increasingly superficially located. There are two types of squamous epithelia: keratinized and non-keratinized. The keratinized type is dry at the surface and forms the surface of the skin, while the non-keratinized type is kept moist at the surface and is found in the digestive system tissues and female tissues. Basal and suprabasal keratinocytes of the skin as well as other squamous epithelial cells are analyzed below:
Basal keratinocytes - skin
As shown in Table 1, 423 genes are elevated in basal keratinocytes compared to other cell types. Basal keratinocytes are stem cells found in the basal layer of epidermis. They are highly proliferative and responsible for the renewal of keratinocytes. One protein expressed in the basal layer is collagen type XVII alpha 1 chain (COL17A1), which may play a role in hemidesmosome integrity and the attachment of basal keratinocytes to the underlying basement membrane. Another example is keratin 15 (KRT15), which belongs to the keratin family of proteins and is involved in keeping the structural integrity of the basal cells. Mutations in KRT15 are associated with the Kindler syndrome, an autosomal recessive skin disease that leads to very fragile skin that blisters easily.
Suprabasal keratinocytes - skin
As shown in Table 1, 505 genes are elevated in suprabasal keratinocytes compared to other cell types. Suprabasal keratinocytes are post-mitotic keratinocytes that reside in the stratum spinosum layer of the epidermis, where they keep differentiating, developing a larger cytoplasm and well-formed bundles of keratin intermediate filaments as they are pushed towards the stratum corneum. Examples of proteins expressed in the stratum spinosum include keratin 10 (KRT10), which plays a role in the establishment of the epidermal barrier on plantar skin, and caspase 14 (CASP14), which is thought to play a role in keratinocyte differentiation and to be required for cornification.
Other squamous epithelial cells
Squamous epithelia are also present in additional parts of the body where there is a need for a barrier that is capable of handling frequent mechanical stress, including digestive tissues and the vaginal tract. A variety of keratin intermediate filament proteins are expressed in squamous epithelia to provide structural integrity, including keratin 6A (KRT6A) and keratin 13 (KRT13).
Proteins elevated in other epithelial cells
In addition to glandular and squamous epithelial cells, there are several other types of epithelial cells, specialized to the purpose of their environment. The specialization of epithelial cell types in liver, lung, kidney, pancreas, testis and urinary bladder, reflected by their divergent expression profiles, is analyzed below:
Cholangiocytes - liver
As shown in Table 1, 310 genes are elevated in cholangiocytes compared to other cell types. Cholangiocytes are the epithelial cells of the bile duct system in the liver. Examples of genes with enhanced expression in cholangiocytes are the transcription factor hepatocyte nuclear factor 1-beta (HNF1B) and the transport protein aquaporin 1 (AQP1), which forms a channel through which water moves along the osmotic gradient.
Hepatocytes - liver
As shown in Table 1, 881 genes are elevated in hepatocytes compared to other cell types. Hepatocytes are the main cell type in the liver and are responsible for many of the body's metabolic processes as well as the breakdown of toxic substances. Examples of hepatocyte enhanced genes are retinol dehydrogenase 16 (RDH16) and hydroxyacid oxidase 1 (HAO1), both involved in lipid metabolism.
Alveolar cells type 1 - lung
As shown in Table 1, 652 genes are elevated in alveolar cells type 1 compared to other cell types. Gas exchange between the air in the lung alveoli and blood takes place via the alveolar cells type 1, which line the alveolar walls. Examples of proteins elevated in alveolar cells type 1 are aquaporin 4 (AQP4), a cell membrane-bound channel that regulates water homeostasis, and claudin 18 (CLDN18), a tight junction protein which prevents the passage of solutes and other molecules through the paracellular space in the epithelium.
Alveolar cells type 2 - lung
As shown in Table 1, 644 genes are elevated in alveolar cells type 2 compared to other cell types. Alveolar cells type 2 are located in lung alveoli and produce surfactants which are crucial for the gaseous exchange between air and blood and for lowering surface tension which prevents alveolar collapse. A gene with enriched expression in alveolar cells type 2 is surfactant protein C (SFTPC), which encodes a surfactant protein. Another example is napsin A aspartic peptidase (NAPSA), a protease that may play a role in the proteolytic processing of surfactant protein B.
Collecting duct cells - kidney
As shown in Table 1, 504 genes are elevated in collecting duct cells compared to other cell types. Collecting duct cells are the main site of salt and water transport, as well as acid-base regulation. Aquaporin 2 (AQP2) and aquaporin 3 (AQP3) are members of the aquaporin gene family with elevated expression in collecting ducts. These two genes encode water-specific channel proteins that facilitate the reabsorption of water molecules from the urine. Another example of a gene elevated in collecting ducts is FXYD domain containing ion transport regulator 4 (FXYD4), which encodes a protein that regulates the transport of ions across the cell membrane.
Distal tubular cells - kidney
As shown in Table 1, 645 genes are elevated in distal tubular cells compared to other cell types. Both the distal tubule and collecting duct are the sites where critical regulatory hormones such as aldosterone and vasopressin regulate acid and potassium excretion and determine the final urinary concentration of K+, Na+, and Cl-. Proteins elevated in distal tubules include solute carrier family 12 member 1 (SLC12A1), one of several potassium, sodium, and calcium transporters essential for regulating the contents and volume of urine. Another example is transmembrane protein 52B (TMEM52B), which is highly elevated in distal tubular cells although its function is not completely characterized.
Proximal tubular cells - kidney
As shown in Table 1, 744 genes are elevated in proximal tubular cells compared to other cell types. Approximately 60% of the filtered Na+, Cl-, K+, Ca2+ and H2O, and more than 90% of the filtered HCO3-, are absorbed along the proximal tubule. This is also the segment that normally reabsorbs virtually all the filtered glucose and amino acids. An additional function is the secretion of numerous organic anions and cations. Examples of proteins that are elevated in the proximal part of the renal tubules are pyruvate kinase L/R (PKLR), a protein that catalyzes the transphosphorylation of phosphoenolpyruvate into pyruvate and ATP, and dipeptidase 1 (DPEP1), a kidney membrane enzyme that hydrolyses a wide range of dipeptides.
Ductal cells - pancreas
As shown in Table 1, 229 genes are elevated in ductal epithelial cells compared to other cell types. Ductal epithelial cells in the pancreas can be found throughout the exocrine tissue transporting the secretions from the acini to the duodenum. Cystic fibrosis transmembrane conductance regulator (CFTR) is an example of an elevated gene in the ductal epithelium of the pancreas. CFTR functions as an ion channel transporting Cl- and HCO3- ions out from the cell in an ATP-dependent manner.
Sertoli cells - testis
As shown in Table 1, 524 genes are elevated in Sertoli cells compared to other cell types. Sertoli cells constitute the seminiferous epithelium in testis, interspersed between the germ cells. They play an important role in spermatogenesis, where they are often referred to as nursing cells since their function is to nourish the developing sperm cells. Sertoli cells also play a central role in the control of spermatogenesis by transducing hormonal signals, e.g. activation and stimulation by follicle stimulating hormone (FSH). Examples of elevated genes in Sertoli cells are cannabinoid receptor 1 (CNR1), which is a G-protein coupled receptor for endogenous cannabinoids, and epididymal peptidase inhibitor (EPPIN), which has an essential role in male reproduction and fertility by providing antimicrobial protection.
Urothelial cells - urinary bladder
As shown in Table 1, 443 genes are elevated in urothelial cells compared to other cell types. The urothelium, also known as transitional epithelium, is one of the slowest cycling epithelia, with a turnover rate of approximately 200 days. It consists of three zones (basal, intermediate and superficial) and is three to seven cell layers thick, depending on bladder distension. The superficial layer is the only layer that consists of fully differentiated cells, called umbrella cells, which line the lumen as a protective barrier. These cells express transmembrane proteins called uroplakins, such as UPK2 and UPK1A, which are essential structural components on the apical surface that enhance the permeability barrier.
Epithelial cell function
Epithelial cells form sheets of cells, epithelia, that line the outer and inner surfaces of the body and constitute the building blocks for glandular tissues. Hence, epithelial cells are found in many parts of the body, including skin, airways, the digestive tract, glandular tissues and organs, as well as the urinary and reproductive systems. The wide range of functions of epithelial cells can be broadly divided into two main categories: the transfer of compounds into or out of the body, and protection as a barrier against invading pathogens and physical, chemical or biological abrasion.
Transfer of compounds is a key process for glandular epithelial cells involved in absorption and secretion. The epithelial cells in the digestive system form vast surfaces to enable efficient absorption of the ingested food particles. The same food particles must be predigested into smaller constituents before they can be taken up by the absorptive epithelial cells, a process that is made possible by the secretion of compounds such as enzymes and acid. Mucus secretion is another important function of glandular epithelial cells. It protects the epithelia and enables efficient transport of microorganisms, gametes, particles and smaller compounds in different areas of the body, such as the airways, the digestive system and the reproductive system.
To withstand the wear and tear from the environment, the epithelial cells form multilayered (stratified) squamous epithelia that are able to handle constant abrasion. There are two types of stratified squamous epithelia: the dry keratinized type found in the top layer (epidermis) of skin and the moist non-keratinized type found lining the surface of inner cavities such as the mouth, esophagus and vagina.
The histology of organs that contain epithelial cells, including interactive images, is described in the Protein Atlas Histology Dictionary.
Here, the protein-coding genes expressed in epithelial cells are described and characterized, together with examples of immunohistochemically stained tissue sections that visualize corresponding protein expression patterns of genes with elevated expression in different epithelial cell types.
The transcript profiling was based on publicly available genome-wide expression data from scRNA-seq experiments covering 13 different normal tissues, as well as analysis of human peripheral blood mononuclear cells (PBMCs). All datasets (unfiltered read counts of cells) were clustered separately using Louvain clustering, and the clusters obtained were then pooled, resulting in a total of 192 different cell type clusters. The clusters were then manually annotated based on a survey of known tissue and cell type-specific markers. The scRNA-seq data from each cluster of cells were aggregated to average normalized protein-coding transcripts per million (pTPM) and the normalized expression value (nTPM) across all protein-coding genes. A specificity and distribution classification was performed to determine the number of genes elevated in these single cell types, and the number of genes detected in one, several or all cell types, respectively.
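The aggregation and classification steps described above can be illustrated with a minimal Python sketch. It is not the Protein Atlas pipeline: clustering and the nTPM normalization are omitted, the pTPM step is simplified to pooling counts per cluster and rescaling to one million, and the function names, the four-fold threshold (`fold`) and the detection cutoff (`cutoff`, `min_level`) are assumptions chosen for illustration only.

```python
from collections import defaultdict


def aggregate_ptpm(counts, cluster_of, protein_coding):
    """Pool raw counts per cluster, then rescale each cluster so that its
    protein-coding genes sum to one million (a simplified pTPM)."""
    pooled = defaultdict(lambda: defaultdict(float))
    for cell, gene_counts in counts.items():
        cluster = cluster_of[cell]
        for gene, n in gene_counts.items():
            if gene in protein_coding:  # ignore non-coding transcripts
                pooled[cluster][gene] += n
    ptpm = {}
    for cluster, genes in pooled.items():
        total = sum(genes.values())
        ptpm[cluster] = {
            gene: (value / total * 1e6 if total else 0.0)
            for gene, value in genes.items()
        }
    return ptpm


def elevated_genes(expr, cell_type, fold=4.0, min_level=1.0):
    """Hypothetical rule: a gene counts as elevated in cell_type when its
    level is at least min_level and at least `fold` times the highest
    level seen in any other cell type."""
    result = []
    for gene, level in expr[cell_type].items():
        top_other = max(
            (expr[c].get(gene, 0.0) for c in expr if c != cell_type),
            default=0.0,
        )
        if level >= min_level and level >= fold * top_other:
            result.append(gene)
    return sorted(result)


def detection_distribution(expr, cutoff=1.0):
    """Label each gene as detected in one, several or all cell types,
    using an assumed detection cutoff."""
    n_types = len(expr)
    genes = {gene for levels in expr.values() for gene in levels}
    labels = {}
    for gene in genes:
        hits = sum(1 for levels in expr.values() if levels.get(gene, 0.0) >= cutoff)
        if hits == 0:
            labels[gene] = "not detected"
        elif hits == 1:
            labels[gene] = "one"
        elif hits == n_types:
            labels[gene] = "all"
        else:
            labels[gene] = "several"
    return labels


if __name__ == "__main__":
    # Toy data, invented for illustration; not real measurements.
    counts = {
        "cell1": {"ALPI": 90, "ACTB": 10},
        "cell2": {"ALPI": 70, "ACTB": 30},
        "cell3": {"SFTPC": 50, "ACTB": 50},
    }
    cluster_of = {
        "cell1": "enterocytes",
        "cell2": "enterocytes",
        "cell3": "alveolar cells type 2",
    }
    expr = aggregate_ptpm(counts, cluster_of, {"ALPI", "ACTB", "SFTPC"})
    print(elevated_genes(expr, "enterocytes"))
    print(detection_distribution(expr))
```

Pooling counts before rescaling keeps the sketch short; an analysis that averages per-cell normalized values, as the text describes, would weight cells differently.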
It should be noted that since the analysis was limited to datasets from 13 organs only, not all human cell types are represented. Furthermore, some cell types are present only in low amounts, or identified only in mixed cell clusters, which may affect the results and bias the cell type specificity.