Immunohistochemistry - tissues

The Human Protein Atlas contains images of histological sections from normal and cancer tissues obtained by immunohistochemistry. Antibodies are labeled with DAB (3,3'-diaminobenzidine) and the resulting brown staining indicates where an antibody has bound to its corresponding antigen. The section is furthermore counterstained with hematoxylin to enable visualization of microscopical features.

Data generation

Tissue microarrays are used to show antibody staining in samples from 144 individuals corresponding to 44 different normal tissue types, and samples from 216 cancer patients corresponding to 20 different types of cancer (movie about tissue microarray production and immunohistochemical staining). Each sample is represented by 1 mm tissue cores, resulting in a total of 576 images for each antibody. Normal tissues are represented by samples from three individuals each, one core per individual, except for endometrium, skin, soft tissue and stomach, which are represented by samples from six individuals each, and parathyroid gland, which is represented by one sample. Protein expression is annotated in 76 different normal cell types present in these tissue samples. For cancer tissues, two cores are sampled from each individual and protein expression is annotated in tumor cells. For most antibodies, a small fraction of the 576 images is missing due to technical issues. Specimens containing normal and cancer tissue have been collected and sampled from anonymized paraffin-embedded material of surgical specimens, in accordance with approval from the local ethics committee. For selected proteins, extended tissue profiling is performed in addition to standard tissue microarrays. The tissues examined in this extended profiling include mouse brain, human lactating breast, eye, thymus and extended samples of adrenal gland, skin and brain.
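The total of 576 images follows from the sampling scheme described above. As a quick check of the arithmetic, here is a minimal sketch that assumes one image per core; the constant and function names are illustrative and not part of the Human Protein Atlas.

```python
# Sampling scheme as stated in the text; all names are illustrative only.
NORMAL_INDIVIDUALS = 144          # normal tissue donors
CANCER_PATIENTS = 216             # cancer tissue donors
CORES_PER_NORMAL_INDIVIDUAL = 1   # one core per individual
CORES_PER_CANCER_PATIENT = 2      # two cores per patient


def images_per_antibody() -> int:
    """Total core images per antibody, assuming one image per core."""
    normal = NORMAL_INDIVIDUALS * CORES_PER_NORMAL_INDIVIDUAL
    cancer = CANCER_PATIENTS * CORES_PER_CANCER_PATIENT
    return normal + cancer


print(images_per_antibody())  # 144 + 432 = 576
```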
Since specimens are derived from surgical material, normal is here defined as non-neoplastic and morphologically normal. It is not always possible to obtain fully normal tissues and thus several of the tissues denoted as normal will include alterations due to inflammation, degeneration and tissue remodeling. In rare tissues, hyperplasia or benign proliferations are included as exceptions. It should also be noted that within normal morphology there may exist interindividual differences and variations due to primary diseases, age, sex etc. Such differences may also affect protein expression and thereby immunohistochemical staining patterns. Samples from cancer are also derived from surgical material. Due to subgroups and heterogeneity of tumors within each cancer type, included cases represent a typical mix of specimens from surgical pathology. The inclusion of tumors is based on availability and representativity; however, an effort has been made to include high and low grade malignancies where applicable. In certain tumor groups, subtypes have been included: for example, breast cancer includes both ductal and lobular cancer, lung cancer includes both squamous cell carcinoma and adenocarcinoma, and liver cancer includes both hepatocellular and cholangiocellular carcinoma. Tumor heterogeneity and interindividual differences may be reflected in diverse expression of proteins resulting in variable immunohistochemical staining patterns.

Annotation

In order to provide an overview of protein expression patterns, all images of tissues stained by immunohistochemistry are manually annotated by a specialist, followed by verification by a second specialist. Annotation of each normal and cancer tissue is performed using fixed guidelines for classification of immunohistochemical results. Each tissue is examined for representativity, and immunoreactivity in the different cell types present in normal or cancer tissues is subsequently annotated. Basic annotation parameters include an evaluation of i) staining intensity (negative, weak, moderate or strong), ii) fraction of stained cells (<25%, 25-75% or >75%) and iii) subcellular localization (nuclear and/or cytoplasmic/membranous). The manual annotation also provides two summarizing texts describing the staining pattern for each antibody in normal tissues and in cancer tissues.
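The three basic annotation parameters can be thought of as a small record with a fixed vocabulary per field. The sketch below is one hypothetical way to represent such a record; the class name, field names and example values are assumptions made for illustration and do not describe the Human Protein Atlas data format.

```python
from dataclasses import dataclass
from typing import Tuple

# Allowed values taken from the annotation parameters described above.
INTENSITIES = ("negative", "weak", "moderate", "strong")
FRACTIONS = ("<25%", "25-75%", ">75%")
LOCATIONS = ("nuclear", "cytoplasmic/membranous")


@dataclass(frozen=True)
class StainingAnnotation:
    """One annotated cell type in one tissue (hypothetical record layout)."""
    tissue: str
    cell_type: str
    intensity: str
    fraction: str
    locations: Tuple[str, ...]

    def __post_init__(self) -> None:
        # Reject values outside the fixed annotation vocabulary.
        if self.intensity not in INTENSITIES:
            raise ValueError(f"unknown intensity: {self.intensity!r}")
        if self.fraction not in FRACTIONS:
            raise ValueError(f"unknown fraction: {self.fraction!r}")
        unknown = set(self.locations) - set(LOCATIONS)
        if unknown:
            raise ValueError(f"unknown localization(s): {sorted(unknown)}")


# Example values are invented for illustration.
example = StainingAnnotation(
    tissue="kidney",
    cell_type="cells in tubules",
    intensity="moderate",
    fraction=">75%",
    locations=("cytoplasmic/membranous",),
)
```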
The terminology and ontology used is compliant with standards used in pathology and medical science. SNOMED classification is used for assignment of topography and morphology. SNOMED classification also underlies the given original diagnosis from which normal as well as cancer samples were collected.
A histological dictionary used in the annotation is available as a PDF-document, containing images stained by immunohistochemistry using antibodies included in the Human Protein Atlas. The dictionary displays subtypes of cells distinguishable from each other and also shows specific expression patterns in different intracellular structures. Annotation dictionary: screen usage (15 MB), printing (95 MB).

Knowledge-based annotation

Knowledge-based annotation aims to create a comprehensive overview of protein expression patterns in normal human tissues. This is achieved by stringent evaluation of the immunohistochemical staining pattern, RNA-seq data from internal and external sources, and available protein/gene characterization data, with special emphasis on RNA-seq. Protein expression profiles are annotated using single antibodies as well as independent antibodies (two or more antibodies directed against different, non-overlapping epitopes on the same protein). For independent antibodies, the immunohistochemical data from all the different antibodies are taken into consideration. The immunohistochemical staining pattern in normal tissues is subjectively annotated according to strict guidelines, based on expert evaluation of positive immunohistochemical signals in the 76 normal cell types analyzed. The review also takes suboptimal experimental procedures and interindividual variations into consideration.
The final annotated protein expression is considered a best estimate and as such reflects the most probable histological distribution and relative expression level for each protein. To enable a protein expression profile, one or several of the following additional data sources are necessary: i) an independent antibody targeting another epitope of the same protein, ii) RNA-seq data, and iii) available protein/gene characterization data. The result of the knowledge-based annotation is considered inconclusive when the information available at the time of analysis is judged insufficient to verify the staining pattern and estimate the expected protein expression. The knowledge-based protein expression profiles are produced using fixed guidelines on evaluation and presentation of the resulting expression profiles. Standardized explanatory sentences are used when necessary to provide additional information required for full understanding of the expression profile. A reliability score of Enhanced, Supported, Approved, or Uncertain is set for each annotated protein expression profile based on evaluation of all available data.

Reliability score

A reliability score is manually set for all genes and indicates the level of reliability of the analyzed protein expression pattern based on knowledge-based evaluation of available RNA-seq data, protein/gene characterization data and immunohistochemical data from one or several antibodies designed towards non-overlapping sequences of the same gene. The reliability score is based on the 44 normal tissues analyzed, and is displayed on both the Tissue Atlas and the Pathology Atlas.

The reliability score is divided into Enhanced, Supported, Approved, or Uncertain. If there is available data from more than one antibody, the staining patterns of all antibodies are taken into consideration during the evaluation of the reliability score.

Enhanced
One or several antibodies targeting non-overlapping sequences of the same gene have obtained enhanced validation based on either orthogonal or independent antibody validation methods.

Supported
If one of the following criteria is fulfilled:

  • At least one antibody shows high or medium consistency between RNA levels and staining pattern, but the antibody does not qualify for Orthogonal validation and staining pattern is consistent with valid literature, or there is no valid literature available
  • At least one antibody has RNA consistency defined as “Cannot be evaluated” and staining pattern is consistent with valid literature
  • Paired antibodies (several antibodies targeting non-overlapping sequences) show similar staining pattern, but the antibodies do not qualify for Independent antibody validation and staining pattern is consistent with valid literature, or there is no valid literature available

Approved
If one of the following criteria is fulfilled:

  • At least one antibody shows high or medium consistency between RNA levels and staining pattern and staining pattern is inconsistent with valid literature
  • At least one antibody shows low consistency between RNA levels and staining pattern and staining pattern is consistent with valid literature
  • At least one antibody has RNA consistency defined as “Cannot be evaluated” and staining pattern is partly consistent with valid literature, or consistent with limited literature
  • Paired antibodies show partly similar expression patterns

Uncertain
If one of the following criteria is fulfilled:

  • Only multi-targeting antibodies are available. Multi-targeting antibodies are used for genes where it was not possible to generate single-targeting antibodies due to high sequence identity among proteins belonging to different genes. These genes are in many cases closely related and belong to known gene families, and in these cases a multi-targeting antibody was produced that has >80% sequence identity to transcripts of the genes belonging to the family and low sequence identity to the transcripts of all other human genes.
  • At least one antibody shows low or very low consistency between RNA and staining pattern, or RNA consistency is defined as “Cannot be evaluated” and staining pattern is inconsistent with valid literature, or there is no valid literature available
  • Paired antibodies show dissimilar expression patterns
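
The reliability score is set manually, so no code can reproduce it. Still, the criteria above that depend only on RNA consistency and literature agreement for a single antibody can be read as a lookup. The sketch below is a simplified illustration under stated assumptions: the function name, argument names and value strings are hypothetical, paired-antibody criteria are left out, and where the listed criteria overlap (for example low RNA consistency with consistent literature) the Approved criterion is assumed to take precedence over Uncertain.

```python
from typing import Optional

RNA_LEVELS = {"high", "medium", "low", "very low", "cannot be evaluated"}
LITERATURE = {"consistent", "partly consistent", "limited", "inconsistent", "none"}


def sketch_reliability(
    rna: str,
    literature: str,
    enhanced_validation: bool = False,
    only_multi_targeting: bool = False,
) -> Optional[str]:
    """Simplified reading of the single-antibody criteria (not the HPA procedure).

    Returns None when the combination is not covered by the listed criteria.
    """
    if rna not in RNA_LEVELS:
        raise ValueError(f"unknown RNA consistency: {rna!r}")
    if literature not in LITERATURE:
        raise ValueError(f"unknown literature status: {literature!r}")

    # Assumption: these two flags are checked before anything else.
    if only_multi_targeting:
        return "Uncertain"
    if enhanced_validation:  # orthogonal or independent antibody validation
        return "Enhanced"

    if rna in ("high", "medium"):
        if literature in ("consistent", "none"):
            return "Supported"
        if literature == "inconsistent":
            return "Approved"
        return None  # combination not listed in the criteria

    if rna == "cannot be evaluated":
        if literature == "consistent":
            return "Supported"
        if literature in ("partly consistent", "limited"):
            return "Approved"
        return "Uncertain"  # inconsistent or no valid literature

    # Assumption: Approved takes precedence over Uncertain for this overlap.
    if rna == "low" and literature == "consistent":
        return "Approved"
    return "Uncertain"  # remaining low / very low cases
```

In practice all antibodies and data sources for a gene are weighed together by an annotator, so a function like this can at most serve as a reading aid for the criteria.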

Multiplex immunohistochemistry/IF - tissues

As part of the Tissue Atlas resource, multiplex immunohistochemistry (mIHC)/IF data are generated by staining tissue microarrays of histological sections from normal tissues. The mIHC/IF tissue data comprise high-resolution, 6-plex images of proteins labeled by indirect mIHC. In addition to conventional IHC, these images provide spatial information on protein expression patterns related to distinct single cells and cell types, or even cellular states and histological and biological structures embedded in the tissue.

Similarly to conventional IHC, in mIHC/IF, primary antibodies are first detected with secondary antibodies coupled to horseradish peroxidase (HRP) (or similar). The method then uses tyramide signal amplification (TSA), in which HRP catalyzes the deposition of fluorescent tyramide molecules, creating a fluorescent precipitate on and proximal to the binding site. The ability to run several staining-stripping cycles allows up to 6 proteins to be labeled per slide. Lastly, the slides are counterstained with DAPI (4′,6-diamidino-2-phenylindole). In this setup, tissue microarrays consisting of doublet 1 mm cores from three patients are used to profile each protein.

Annotation

The protein localization is manually annotated by estimating the fraction of cells in which the target of interest overlaps with the panel antibodies and, when applicable, by annotating its subcellular localization. For each slide, the tissue cores are also examined for representativity. The annotation parameters include an evaluation of i) fraction of cells with expression of the unknown protein that overlap with panel markers (<25%, 25-75% or >75%), and ii) subcellular localization (nuclear and/or cytoplasmic/plasma membrane/membrane) of the staining. The manual annotation also provides two summarizing texts describing the staining pattern for each antibody. The marker proteins, targeted by the panel antibodies, may be limited in their ability to label all cells of the intended cell type/structure, as defined in the literature.


Cilia panel

The panel for ciliated cells was developed with the aim to study the spatial protein expression of cilia proteins. For each unknown protein, the antibody targeting the protein is labeled with the available TSA-fluorophore (OPAL 520) not occupied by the marker proteins.

Cilia panel

Cell type             | Marker protein              | Antibody  | Fluorescent label | Pseudo-color
Cilia cell body       | AGR3                        | HPA053942 | OPAL570           | Cyan
Cilia cell nucleus    | FOXJ1                       | HPA005714 | OPAL780           | Magenta
Basal body            | CROCC                       | HPA021191 | OPAL690           | Red
Cilia transition zone | NPHP4                       | HPA065526 | OPAL480           | Yellow
Cilia axoneme         | DNAH9                       | HPA052641 | OPAL620           | White
Empty slot            | Unknown protein of interest | -         | OPAL520           | Green
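
The idea of an "empty slot" can be illustrated by treating the panel as a mapping from marker proteins to fluorophores and looking for the channel that no marker occupies. The following is a minimal sketch using the cilia panel values from the table above; the data layout and the function name are assumptions made for illustration.

```python
# The six fluorophores that appear in the panel table.
OPAL_CHANNELS = {"OPAL480", "OPAL520", "OPAL570", "OPAL620", "OPAL690", "OPAL780"}

# Cilia panel markers: marker protein -> (antibody, fluorophore).
CILIA_PANEL = {
    "AGR3": ("HPA053942", "OPAL570"),
    "FOXJ1": ("HPA005714", "OPAL780"),
    "CROCC": ("HPA021191", "OPAL690"),
    "NPHP4": ("HPA065526", "OPAL480"),
    "DNAH9": ("HPA052641", "OPAL620"),
}


def free_channels(panel: dict) -> set:
    """Return the fluorophores not occupied by any marker protein."""
    used = {fluorophore for _antibody, fluorophore in panel.values()}
    if len(used) != len(panel):
        raise ValueError("two markers share the same fluorophore")
    return OPAL_CHANNELS - used


print(free_channels(CILIA_PANEL))  # {'OPAL520'}
```

The same check applies to the other panels, since each assigns five of the six fluorophores to marker proteins and leaves OPAL520 for the protein of interest.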


Kidney panel

For kidney, an antibody panel was developed to characterize the spatial localization of kidney proteins, mainly in renal tubules but also in podocytes. An endothelial cell marker was also added to distinguish non-podocytes in the glomerular compartment. For each unknown protein, the antibody targeting the protein is labeled with the available TSA-fluorophore (OPAL 520) not occupied by the marker proteins.

Kidney panel

Cell type                | Marker protein              | Antibody  | Fluorescent label | Pseudo-color
Collecting ducts         | AQP2                        | HPA046834 | OPAL690           | Cyan
Subset of distal tubules | CASR                        | HPA039686 | OPAL570           | Red
Proximal tubules         | ACSM2A/B                    | HPA057699 | OPAL620           | White
Podocytes                | PTPRO                       | HPA034525 | OPAL480           | Yellow
Endothelial cells        | CD34                        | HPA036722 | OPAL780           | Magenta
Empty slot               | Unknown protein of interest | -         | OPAL520           | Green


Pancreas panel

For pancreas, an antibody panel was developed to characterize the endocrine cells in the islets of Langerhans in more detail. For each unknown protein, the antibody targeting the protein is labeled with the available TSA-fluorophore (OPAL 520) not occupied by the marker proteins.

Pancreas panel

Cell type     | Marker protein              | Antibody  | Fluorescent label | Pseudo-color
Alpha cells   | GCG                         | CAB000040 | OPAL690           | Cyan
PP cells      | PPY                         | HPA032122 | OPAL570           | Red
Epsilon cells | GHRL                        | HPA014246 | OPAL620           | White
Delta cells   | SST                         | CAB034105 | OPAL480           | Yellow
Beta cells    | INS                         | HPA004932 | OPAL780           | Magenta
Empty slot    | Unknown protein of interest | -         | OPAL520           | Green


Salivary gland panel

The antibody panel for salivary gland was generated to profile the different glandular tissues (serous and mucus glands) and ductal structures (small ducts, large ducts and ionocytes). For each unknown protein, the antibody targeting the protein is labeled with the available TSA-fluorophore (OPAL 520) not occupied by the marker proteins.

Salivary gland panel

Cell type    | Marker protein              | Antibody  | Fluorescent label | Pseudo-color
Serous acini | LPO                         | HPA028688 | OPAL480           | Cyan
Mucus acini  | MUC5B                       | CAB009396 | OPAL780           | White
Small ducts  | SLC13A2                     | HPA014963 | OPAL690           | Magenta
Large ducts  | ATP6V1B1                    | HPA031847 | OPAL570           | Red
Ionocytes    | FOXI1                       | HPA071469 | OPAL620           | Yellow
Empty slot   | Unknown protein of interest | -         | OPAL520           | Green


Testis panels

For testis, four panels have been developed with the aims of i) capturing the transition of spermatogonial stem cells to preleptotene spermatocytes (Spermatogonia panel), ii) identifying the expression of proteins during spermatocyte differentiation and meiosis (Spermatocytes panel), iii) characterizing the proteins during sperm transformation, a process called spermiogenesis (Spermatids panel), and iv) mapping Sertoli-specific proteins (Sertoli cells panel). For each unknown protein, the antibody targeting the protein is labeled with the available TSA-fluorophore (OPAL 520) not occupied by the marker proteins.

Spermatogonia panel

Cell type         | Marker protein              | Antibody  | Fluorescent label | Pseudo-color
Spermatogonia 0   | UTF1                        | CAB022384 | OPAL480           | Yellow
Spermatogonia 1   | IRF2BPL                     | HPA050862 | OPAL620           | White
Spermatogonia 2-3 | DMRT1                       | HPA027850 | OPAL690           | Cyan
Spermatogonia 4   | CTCFL                       | HPA001472 | OPAL780           | Magenta
Spermatocytes 1   | BEND2                       | HPA013142 | OPAL570           | Red
Empty slot        | Unknown protein of interest | -         | OPAL520           | Green


Spermatocytes panel

Cell type        | Marker protein              | Antibody  | Fluorescent label | Pseudo-color
Spermatocytes 1  | HELLS                       | HPA063242 | OPAL480           | Yellow
Spermatocytes 2  | SCML1                       | HPA035270 | OPAL690           | Cyan
Spermatocytes 3  | TCFL5                       | HPA076419 | OPAL780           | Magenta
Spermatids early | SUN5                        | HPA048529 | OPAL620           | White
Spermatids late  | PRM1                        | HPA055150 | OPAL570           | Red
Empty slot       | Unknown protein of interest | -         | OPAL520           | Green


Spermatids panel

Cell type          | Marker protein              | Antibody  | Fluorescent label | Pseudo-color
Spermatids early 1 | LYAR                        | HPA035881 | OPAL780           | Magenta
Spermatids early 2 | OLAH                        | HPA037948 | OPAL690           | Cyan
Spermatids late 1  | C3                          | HPA020432 | OPAL480           | Yellow
Spermatids late 2  | SPATA24                     | HPA044000 | OPAL570           | Red
Spermatids late 3  | TPPP2                       | HPA004120 | OPAL620           | White
Empty slot         | Unknown protein of interest | -         | OPAL520           | Green


Sertoli cells panel

Cell type                       | Marker protein              | Antibody  | Fluorescent label | Pseudo-color
Sertoli cytoplasm               | DIAPH2                      | CAB015461 | OPAL570           | Red
Sertoli membrane                | CD99                        | CAB000020 | OPAL690           | White
Sertoli nuclei                  | HMGN5                       | HPA000511 | OPAL780           | Magenta
Spermatogonia and spermatocytes | DDX4                        | HPA037764 | OPAL620           | Cyan
Spermatids                      | SPACA1                      | HPA043297 | OPAL480           | Yellow
Empty slot                      | Unknown protein of interest | -         | OPAL520           | Green


Data reliability

For each antibody and protein, an internal reliability assessment is performed to ensure high quality data before release. The antibody staining pattern of the unknown protein is always reviewed against its corresponding conventional IHC staining pattern for reproducibility, and against available tissue and single-cell RNA-seq data, and protein/gene characterization data. This assessment should not be confused with the Reliability scoring performed for the tissue-wide analysis. The reproducibility of the panel marker proteins is also assessed to ensure high quality of the annotation.