Brain - Methods summary
The Brain section of the Atlas gives an overview of protein expression and distribution in the mammalian brain. Externally and in-house generated data are integrated to explore regional protein expression in the human, pig and mouse brain. Protein expression data are based on quantification of messenger RNA using RNA sequencing and in situ hybridization. Protein distribution data are generated using antibody-based immunohistochemistry and immunofluorescence. The Brain section can be used to create an overview of regional and cross-species expression of proteins of interest, or to identify regionally or functionally clustered genes based on expression levels across regions of the brain.
Key publication: Sjöstedt E et al. (2020) “An atlas of the protein-coding genes in the human, pig, and mouse brain.” Science 367(6482):eaay5947
How has the data been generated?
Figure 1. Brain regions, areas and nuclei were micro-dissected from human, pig and mouse brains. RNA was extracted using the QIAGEN RNeasy Lipid Tissue Mini Kit, RIN values were determined using QIAxcel, and Qubit was used to measure mRNA concentrations. Samples were enriched for mRNA using poly(A) purification or ribosomal RNA depletion. After library preparation, the quality of the libraries was assessed using Qubit and TapeStation. Samples were sequenced using the Illumina and MGI sequencing platforms (PE100 or PE150).
RNA expression data
Transcriptomics data are based on micro-dissected areas and regions of the human (n=217), mouse (n=17) and pig (n=30) brain. Human samples were provided by the Human Brain Tissue Bank of the Semmelweis University in Hungary. Following RNA extraction, messenger RNA was enriched using either a poly(A) purification step (human and mouse) or a ribosomal RNA clean-up step (human and pig). Samples were sequenced on the Illumina or MGI RNA-seq platforms, and reads were mapped to the corresponding genes in the Ensembl version used in the Human Protein Atlas.
Immunofluorescence on mouse brain sections
Protein targets were selected based on elevated expression in the brain, in a brain region or in a brain cell type. Antibodies against these targets were applied to a series of coronal mouse brain sections that cover all major brain regions and cell types of the mammalian brain. Sections were scanned on a fluorescence slide scanning microscope to generate complete overview images with microscopic resolution.
Immunohistochemistry on tissue microarrays
The protein expression tissue microarray data include cerebral cortex, hippocampal formation, caudate nucleus and cerebellum, and were derived from antibody-based protein profiling using immunohistochemistry (the Tissue section contains data on 44 human tissue types). Tissue microarrays of 1 mm diameter samples were stained with primary antibodies, visualized with DAB (3,3'-diaminobenzidine) and counterstained with hematoxylin. Each brain region is represented by samples from three individuals. For selected proteins, additional tissues were stained, such as eye, cerebral cortex, hypothalamus, cerebellum and substantia nigra. Immunohistochemically stained sections from tissue microarrays were scanned to allow for subsequent analysis and presentation at the HPA web portal.
How has the data been analyzed?
Figure 2. All reads of all samples were mapped to the selected Ensembl version using the Kallisto pseudoalignment algorithm. Transcripts Per Million (TPM) values were calculated for all protein-coding transcripts. Technical variation (cohort, platform) and individual (donor) variation were removed using a data normalization approach. Each protein-coding gene is classified based on expression in brain vs. peripheral tissues and between regions of the brain. Expression (normalized TPM) is calculated for all regions and subregions. Brain and regional elevated genes for human, pig and mouse are listed on the brain region pages.
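To make the TPM unit concrete, the following is a minimal Python sketch of the standard TPM definition: each count is divided by transcript length, and the resulting rates are rescaled so that they sum to one million per sample. Kallisto reports TPM values itself, so this is only an illustration of the unit and not the pipeline used for the Atlas. The function name `tpm`, the transcript identifiers and the numbers are hypothetical.

```python
def tpm(counts, lengths_kb):
    """Convert estimated read counts to Transcripts Per Million.

    counts:     dict mapping transcript id -> estimated read count
    lengths_kb: dict mapping transcript id -> effective length in kilobases
    """
    # Step 1: reads per kilobase, which corrects for transcript length.
    rpk = {t: counts[t] / lengths_kb[t] for t in counts}

    # Step 2: rescale so that all values in the sample sum to one million.
    scale = sum(rpk.values())
    if scale == 0:
        return {t: 0.0 for t in counts}
    return {t: 1_000_000 * value / scale for t, value in rpk.items()}


# Hypothetical sample with two transcripts: the longer transcript collects
# three times as many reads but has the same abundance after length correction.
example = tpm({"tx_a": 100, "tx_b": 300}, {"tx_a": 1.0, "tx_b": 3.0})
print(example)
```

Because the values always sum to one million within a sample, TPM is comparable across transcripts in one sample, while comparison between samples still requires the between-sample normalization described in the caption.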
RNA expression data
Transcript expression levels are determined by counting the reads that match each protein-coding sequence. Normalization between experiments, platforms, species and individuals is performed based on the assumption that the normal distribution of gene expression for all protein-coding genes is similar between samples. Normalized data are presented on the gene summary pages as normalized Transcripts Per Million (nTPM). For the non-human species, only data on proteins with one-to-one human orthologues are presented. Pig data mapped to the pig genome are available in the Pig RNAatlas. Based on the anatomical and developmental organization of the brain, samples are grouped into 13 main regions of the central nervous system (olfactory bulb, cerebral cortex, hippocampal formation, amygdala, basal ganglia, thalamus, hypothalamus, midbrain, pons, cerebellum, medulla oblongata, spinal cord and white matter structures). For each region, expression is calculated as the maximum expression of any of the areas and subregions included in the group.
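The regional summary described above (the maximum over all areas and subregions in a group) can be sketched as follows. This is an illustrative Python example under stated assumptions: the function name `region_expression`, the subregion labels and the nTPM values are hypothetical and are not taken from the Atlas.

```python
def region_expression(subregion_ntpm, region_of):
    """Summarize one gene's subregion nTPM values per main brain region.

    subregion_ntpm: dict mapping subregion name -> nTPM for a single gene
    region_of:      dict mapping subregion name -> main region name
    Returns a dict mapping main region -> maximum nTPM of its subregions.
    """
    region_max = {}
    for subregion, value in subregion_ntpm.items():
        region = region_of[subregion]
        # Keep the highest value seen so far for this main region.
        region_max[region] = max(value, region_max.get(region, float("-inf")))
    return region_max


# Hypothetical values for one gene in four subregions.
ntpm = {"CA1": 12.0, "dentate gyrus": 30.5,
        "motor cortex": 8.0, "visual cortex": 9.5}
mapping = {"CA1": "hippocampal formation",
           "dentate gyrus": "hippocampal formation",
           "motor cortex": "cerebral cortex",
           "visual cortex": "cerebral cortex"}

summary = region_expression(ntpm, mapping)
print(summary)
# {'hippocampal formation': 30.5, 'cerebral cortex': 9.5}
```

Taking the maximum rather than the mean means that a gene expressed strongly in a single small nucleus is still visible at the level of the main region.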
Mouse brain virtual microscope
Protein distribution maps for 303 proteins have been generated. These maps provide an overview of protein distribution in the many regions of the mouse brain, and also allow inspection of cells and cellular compartments.
Knowledge-based annotation of protein expression
Human brain tissue microarray images have been annotated for positivity in glial cells, neuronal cells, endothelial cells and neuropil. For cerebellum, Purkinje cells and cells in the granular and molecular layers have been annotated. Mouse IF data, based on a series of coronal sections, have been annotated for cell type (astrocytes, microglia, oligodendrocytes, neurons, ependymal cells or endothelia) and subcellular distribution (soma, nucleus, endfeet, myelin sheath, dendrites, axons or synapses). Fluorescence intensity is measured in 129 brain regions and subfields and summarized for the 13 main brain regions as the maximum mean fluorescence intensity of any of their subregions and subfields.
What can you learn from the Brain section?
Figure 3. The gene summary pages provide an overview of protein expression in the brain. The brain is divided into 13 main regions, each represented by a bar. For each main brain region, expression data for individual (sub)regions, areas and nuclei can be explored. For all one-to-one orthologues, gene expression in the mouse and pig brain can also be explored.
Mouse brain virtual microscope
A genome-wide classification of the protein-coding genes with regard to tissue distribution and specificity has been performed using between-sample normalized data (Tissue section). In this comparison, brain was represented by the maximum expression level in any of the 13 main regions of the brain. In the Brain section, regional distribution is classified by comparing gene expression across the 13 main regions of the central nervous system. The genes were classified according to specificity into (i) enriched genes, with at least four-fold higher expression levels in one tissue type or brain region as compared with any other analysed tissue or brain region; (ii) group-enriched genes, with enriched expression in a small number of tissues or brain regions (2 to 5); and (iii) tissue-enhanced genes, with only moderately elevated expression in brain or a brain region. In the figure, the numbers of tissue-enriched and group-enriched genes are shown.
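The three specificity categories can be illustrated with a short Python sketch. Only the four-fold rule for enriched genes and the group size of 2 to 5 come from the text above. The other criteria are assumptions made for illustration: group enrichment is tested as the group mean being at least four-fold above the highest value outside the group, and "moderately elevated" is tested as at least four-fold above the mean of all other regions. The function name and the label "low specificity" are likewise hypothetical.

```python
def classify_regional_specificity(expr, fold=4.0, max_group=5):
    """Classify one gene from a dict mapping region -> nTPM.

    Returns (category, regions). Criteria (ii) and (iii) are assumed
    for illustration; see the accompanying text.
    """
    ranked = sorted(expr.items(), key=lambda kv: kv[1], reverse=True)
    names = [region for region, _ in ranked]
    values = [value for _, value in ranked]
    if len(values) < 2 or values[0] <= 0:
        return "not classified", []

    # (i) Enriched: top region at least `fold` times any other region.
    if values[0] >= fold * values[1]:
        return "enriched", names[:1]

    # (ii) Group enriched (assumed test): mean of the top k regions,
    # k = 2..max_group, at least `fold` times the best region outside.
    for k in range(2, min(max_group, len(values) - 1) + 1):
        group_mean = sum(values[:k]) / k
        if group_mean >= fold * values[k]:
            return "group enriched", names[:k]

    # (iii) Enhanced (assumed test): a region at least `fold` times
    # the mean of all other regions.
    total = sum(values)
    n_other = len(values) - 1
    enhanced = [r for r, v in ranked if v >= fold * (total - v) / n_other]
    if enhanced:
        return "enhanced", enhanced
    return "low specificity", []


# Hypothetical nTPM profiles.
print(classify_regional_specificity(
    {"cerebellum": 80, "cerebral cortex": 10, "thalamus": 5}))
# ('enriched', ['cerebellum'])
print(classify_regional_specificity(
    {"pons": 40, "medulla oblongata": 36, "amygdala": 5, "thalamus": 4}))
# ('group enriched', ['pons', 'medulla oblongata'])
```

The checks run from the strictest category to the most permissive, so each gene receives the most specific label it qualifies for.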
Assays and Annotations