The druggable proteome

Almost all pharmaceutical drugs in use today act by targeting proteins in the human body and altering their activity. Drugs that inactivate the protein target are called antagonists, while drugs that activate it are called agonists. The target proteins belong mainly to four protein families: enzymes, transporters, ion channels and receptors. According to DrugBank (www.drugbank.ca), the currently FDA-approved drugs are directed against 754 distinct human proteins that are directly related to the drug's mechanism of action. In Table 1 these proteins are categorized by function into some major protein families. Figure 1 shows the distribution of the targets across cellular compartments, based on a combination of several prediction methods for transmembrane regions and signal peptides and a comprehensive annotation of genes predicted to encode secreted proteins; it indicates that 67% of the targets are membrane-bound or secreted.

Table 1. Classification of targets for FDA-approved drugs. The numbers do not add up to the total number of targets, since a protein target can belong to more than one of the selected classes, or to none of them.

Protein class                  Number of genes
Enzymes                        304
Transporters                   182
Voltage-gated ion channels     55
G-protein coupled receptors    103
Nuclear receptors              21
CD markers                     79
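
The caveat in the caption of Table 1 can be checked with a little arithmetic. The sketch below sums the class counts from the table and compares the result with the 754 targets stated in the text. It then uses a small set of made-up gene-to-class annotations (the gene names GENE_A to GENE_D and their classes are hypothetical, not DrugBank data) to show how multi-class and unclassified targets make the per-class totals differ from the number of distinct targets.

```python
# Class counts copied from Table 1.
class_counts = {
    "Enzymes": 304,
    "Transporters": 182,
    "Voltage-gated ion channels": 55,
    "G-protein coupled receptors": 103,
    "Nuclear receptors": 21,
    "CD markers": 79,
}
TOTAL_TARGETS = 754  # number of distinct targets stated in the text

# Sum of the per-class counts; it need not equal TOTAL_TARGETS.
table_sum = sum(class_counts.values())

# Hypothetical annotations, for illustration only:
# one gene in two classes twice over, and one gene in no class.
toy_annotations = {
    "GENE_A": {"Enzymes"},
    "GENE_B": {"Enzymes", "CD markers"},
    "GENE_C": {"G-protein coupled receptors", "CD markers"},
    "GENE_D": set(),
}


def per_class_totals(annotations):
    """Count how many genes fall into each class."""
    totals = {}
    for classes in annotations.values():
        for protein_class in classes:
            totals[protein_class] = totals.get(protein_class, 0) + 1
    return totals


toy_totals = per_class_totals(toy_annotations)
toy_sum = sum(toy_totals.values())

print("Sum of Table 1 class counts:", table_sum)
print("Distinct targets in the text:", TOTAL_TARGETS)
print("Toy example: per-class sum", toy_sum, "vs distinct genes", len(toy_annotations))
```

In the toy data, two genes are counted twice and one is not counted at all, so the per-class sum and the number of distinct genes disagree, which is the same effect the caption describes for the real table.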

Figure 1. Cellular localization of targets for FDA approved drugs based on a variety of transmembrane and signal peptide prediction methods.

Defining the druggable proteome

A drug exerts its effect by interacting with one of the four types of macromolecules in the human body: proteins, polysaccharides, lipids and nucleic acids. Almost all approved drugs on the market today are directed against protein targets, since problems such as toxicity and low specificity are more often associated with the other three types. There are approximately 20,000 human protein-coding genes, but not all proteins are suitable for drug interactions and even fewer are appropriate drug targets. The druggable proteome can be defined as the fraction of proteins that are able to bind a small molecule or antibody with the required affinity and adequate chemical properties, and that are at the same time potential drug targets, i.e. linked to a disease.

A suitable drug target should play a critical role in the disease process while being less involved in other important processes, which limits potential side effects; have an expression pattern that allows for drug efficacy, for example tissue-specific expression; and have structural and functional properties that allow for drug specificity.

Most drugs act on proteins involved in signal transduction, since almost all known diseases are linked to some dysfunction in these pathways. Signal transduction is the process of converting external signals at the cell membrane into specific responses inside the cell, which may result in, for example, gene expression, cell division or cell death. Antibody-based drugs, which usually cannot penetrate the plasma membrane, are mostly directed against targets on the cell surface such as receptors, while small-molecule drugs, which can diffuse into cells, can act on targets inside the cell. Among the FDA-approved drugs directed against the 754 proteins mentioned above, the vast majority are small-molecule drugs, as can be seen in the Venn diagram in Figure 2.


Figure 2. Venn diagram of the type of drugs directed against the 754 protein targets for FDA approved drugs.

Examples of drug targets

TSHR

The thyroid stimulating hormone receptor protein (TSHR) is the target for the synthetic agonist Thyrotropin Alfa (brand name Thyrogen), which is a thyroid stimulating hormone used for detection of residual or recurrent thyroid cancer. TSHR is a G-protein coupled receptor localized in the cell membrane of glandular cells in the thyroid gland, as shown by immunohistochemical staining using the antibody CAB000473.

LIPF

The protein gastric triacylglycerol lipase (LIPF) is the target for the small-molecule antagonist Orlistat (brand names Alli and Xenical), which is used to treat obesity by preventing the absorption of fats from the diet. LIPF is an enzyme that hydrolyzes triglycerides into absorbable free fatty acids in the intestine, here shown in the glandular cells of the stomach by immunohistochemical staining with the antibody HPA045930.

CACNA1S

The protein CACNA1S is one of the targets for a number of calcium channel blockers that act as vasodilators and are used as antihypertensive agents. CACNA1S is a voltage-sensitive calcium channel protein, and immunohistochemical staining with the antibody HPA048892 shows high expression in skeletal muscle, where the protein plays an important role in excitation-contraction coupling.

FOLH1

The enzyme FOLH1 is the target for the antibody-based drug Capromab, which is used for diagnosis of prostate cancer and detection of intra-pelvic metastases. FOLH1 has both folate hydrolase and N-acetylated-alpha-linked-acidic dipeptidase (NAALADase) activity and is involved in prostate tumor progression. Immunohistochemical staining with the antibody HPA010593 shows strong staining of glandular cells in the prostate.

Potential drug targets

As stated earlier, dysfunction in signal transduction networks is present in most diseases, so knowledge of key signal transduction components and their links to disease could form a basis for identifying novel drug targets. By analyzing, for example, the sequence properties, protein families, structural folds, biochemical aspects, similarity to other proteins and associated pathways of known targets, it may be possible to make predictions that can be used to screen the genome for druggable proteins.

Currently, there are 4009 genes in the UniProt database with experimental evidence of involvement in various disease conditions, including cancer and neurologic, systemic and cardiovascular disease. Around 1326 of these may be interesting to investigate as potential drug targets, since they belong to known drug target protein classes (enzymes, transporters, receptors and ion channels) and are not yet targets for FDA-approved or experimental drugs in the DrugBank database.
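
The selection described above amounts to a simple filter: keep disease-linked genes that belong to a known target class and are not already drug targets. The following is a minimal sketch of that logic, assuming the three inputs are already available as plain Python collections. The function name `candidate_targets`, the gene names and the class labels are all hypothetical and do not reflect the actual UniProt or DrugBank data formats.

```python
# Protein classes named in the text as known drug target classes.
KNOWN_TARGET_CLASSES = {"enzyme", "transporter", "receptor", "ion channel"}


def candidate_targets(disease_genes, gene_classes, existing_targets):
    """Return disease-linked genes that are in a known target class
    and are not yet targets of approved or experimental drugs."""
    candidates = set()
    for gene in disease_genes:
        classes = gene_classes.get(gene, set())
        in_known_class = bool(classes & KNOWN_TARGET_CLASSES)
        if in_known_class and gene not in existing_targets:
            candidates.add(gene)
    return candidates


# Hypothetical toy inputs, for illustration only.
disease_genes = {"GENE_A", "GENE_B", "GENE_C", "GENE_D"}
gene_classes = {
    "GENE_A": {"enzyme"},
    "GENE_B": {"receptor"},
    "GENE_C": {"structural protein"},
    "GENE_D": {"ion channel", "transporter"},
}
existing_targets = {"GENE_B"}  # already a drug target, so excluded

result = candidate_targets(disease_genes, gene_classes, existing_targets)
print(sorted(result))
```

In this toy example GENE_B is dropped because it is already a target and GENE_C because it is not in a known target class, leaving GENE_A and GENE_D as candidates.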

Relevant links and publications

Uhlén M et al., Tissue-based map of the human proteome. Science (2015)
PubMed: 25613900 DOI: 10.1126/science.1260419

Wishart DS et al., DrugBank: a comprehensive resource for in silico drug discovery and exploration. Nucleic Acids Res. (2006)
PubMed: 16381955 DOI: 10.1093/nar/gkj067

Database with drug data linked to drug target information - DrugBank