Proteins secreted in the extracellular matrix

There are 234 proteins that constitute or have their function in the extracellular matrix (ECM). The ECM is a non-cellular network consisting of a complex arrangement of collagens, proteoglycans/glycosaminoglycans, elastin, fibronectin, laminins, and several other glycoproteins, which surrounds all cells, tissues and organs. The two major types of ECM are the pericellular and the interstitial matrices: pericellular matrices are in close contact with cells, whereas interstitial matrices surround cells. Cells in contact with the ECM interact with it through specific cell surface receptors, e.g. integrins, CD44, DDRs (discoidin domain receptors) and cell surface proteoglycans such as syndecans and glypicans.

Several different cell types including epithelial cells, fibroblasts, immune cells and endothelial cells may synthesize and secrete matrix components, which makes the ECM a highly dynamic and complex structure that is continuously remodeled under controlled processes.

Functions of proteins secreted in extracellular matrix

All proteins that are secreted to the extracellular matrix were classified according to function based on UniProt molecular function and biological process keywords. The annotations were prioritized in the following hierarchy: Blood coagulation, Complement pathway, Acute phase, Cytokine, Hormone, Neuropeptide, Growth factor, Receptor, Lectin, Transport, Developmental protein, Defence, Enzyme, Enzyme inhibitor, Transcription, Immunity, Cell adhesion. Each gene was assigned a single function: the highest-ranked category among its annotations.
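The prioritization can be illustrated with a minimal Python sketch. It assumes that each gene comes with a list of keywords that already match the category names exactly; the actual mapping from UniProt keywords to these categories is not described in the text, and the function name `assign_function` and the fallback label are hypothetical.

```python
# Priority order copied from the hierarchy in the text; earlier entries win.
FUNCTION_HIERARCHY = [
    "Blood coagulation", "Complement pathway", "Acute phase", "Cytokine",
    "Hormone", "Neuropeptide", "Growth factor", "Receptor", "Lectin",
    "Transport", "Developmental protein", "Defence", "Enzyme",
    "Enzyme inhibitor", "Transcription", "Immunity", "Cell adhesion",
]


def assign_function(keywords):
    """Return the highest-priority category found among a gene's keywords.

    Assumption: keywords are compared to category names by exact,
    case-insensitive match. Real annotations would need a mapping step.
    """
    present = {k.strip().lower() for k in keywords}
    for category in FUNCTION_HIERARCHY:
        if category.lower() in present:
            return category
    # Hypothetical label for genes without any matching keyword.
    return "No annotated function"


# Hypothetical keyword sets, not taken from real UniProt entries.
print(assign_function(["Cell adhesion", "Enzyme"]))
print(assign_function(["Receptor", "Growth factor"]))
print(assign_function([]))
```

Because the list is scanned in priority order, a gene annotated with both "Cell adhesion" and "Enzyme" is counted once, under "Enzyme".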

The results of the analysis are presented in Figure 1. Two thirds of the proteins have a known function, while one third (n=71) lack a distinct annotated function. The ECM not only provides scaffolding, but also regulates cellular processes such as growth, migration, differentiation, survival, homeostasis and morphogenesis, which is reflected in the range of functional categories in Figure 1.

Figure 1. Number of proteins that are locally secreted to the extracellular matrix, categorized according to function. Annotation was based on UniProt molecular function and biological process keywords.

Tissue specificity and tissue distribution classification

The genes encoding ECM proteins were further analyzed with regard to mRNA expression and categorized according to tissue specificity and tissue distribution. Most genes were either tissue enhanced (n=130) or of low tissue specificity (n=66) (Figure 2). Most of the genes were detected in many (n=179) or in some tissues (n=36) (Figure 3).

Figure 2. Number of genes encoding proteins that are locally secreted to the extracellular matrix, categorized according to tissue specificity. Categories include: tissue enriched, defined as mRNA level in one tissue at least five-fold higher than in all other tissues; group enriched, defined as at least five-fold higher average mRNA level in a group of two to five tissues compared to all other tissues; tissue enhanced, defined as at least five-fold higher average mRNA level in one or more tissues compared to the mean mRNA level of all tissues; expressed in all, defined as ≥ 1 NX in all tissues; and not detected, defined as < 1 NX in all tissues.
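The category definitions in the caption can be written as a small decision procedure. The Python sketch below is one possible reading, not the pipeline actually used: it assumes that the categories are tested in the order listed, that "compared to all other tissues" means compared to the highest remaining tissue, and that "tissue enhanced" is tested on the single highest tissue. The function name `classify_specificity` and the "unclassified" fallback are hypothetical.

```python
from statistics import fmean


def classify_specificity(nx, fold=5.0, cutoff=1.0):
    """Assign one tissue-specificity category to a gene.

    nx maps tissue name -> normalized expression (NX).
    fold and cutoff follow the caption (five-fold, 1 NX).
    """
    values = sorted(nx.values(), reverse=True)
    if not values or values[0] < cutoff:
        return "not detected"

    # Tissue enriched: top tissue at least `fold` times every other tissue.
    if len(values) > 1 and values[0] >= fold * values[1]:
        return "tissue enriched"

    # Group enriched: mean of the top 2-5 tissues at least `fold` times
    # the highest remaining tissue (assumed reading of the caption).
    for k in range(2, 6):
        if k >= len(values):
            break
        if fmean(values[:k]) >= fold * values[k]:
            return "group enriched"

    # Tissue enhanced: top tissue at least `fold` times the overall mean.
    if values[0] >= fold * fmean(values):
        return "tissue enhanced"

    if values[-1] >= cutoff:
        return "expressed in all"
    # Placeholder: the caption defines no category for this case.
    return "unclassified"


# Hypothetical NX values for illustration only.
print(classify_specificity({"placenta": 50, "brain": 2, "skin": 1}))
```

Testing the stricter categories first means that a gene meeting several definitions receives the most specific label.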

Figure 3. Number of genes encoding proteins that are locally secreted to the extracellular matrix, categorized according to tissue distribution, where n is the number of tissues in which the gene is detected. Categories include: detected in all, defined as n = 100% of tissues; detected in many, defined as 31% ≤ n < 100%; detected in some, defined as more than one tissue but n < 31%; detected in single, defined as n = 1 tissue; and not detected, defined as n = 0.
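The distribution categories reduce to a few threshold checks on the number of tissues in which a gene is detected. The following Python sketch assumes that detection has already been decided per tissue; the function name `classify_distribution` and its parameters are hypothetical, and the total number of tissues is passed in rather than assumed.

```python
def classify_distribution(n_detected, n_total, many_fraction=0.31):
    """Map the number of tissues with detected expression to a category.

    many_fraction is the 31% boundary given in the caption.
    """
    if n_total < 1 or not 0 <= n_detected <= n_total:
        raise ValueError("need 0 <= n_detected <= n_total and n_total >= 1")
    if n_detected == 0:
        return "not detected"
    if n_detected == n_total:
        return "detected in all"
    if n_detected == 1:
        return "detected in single"
    if n_detected / n_total >= many_fraction:
        return "detected in many"
    return "detected in some"


# Hypothetical panel of 50 tissues, for illustration only.
for n in (0, 1, 2, 15, 16, 50):
    print(n, classify_distribution(n, 50))
```

The single-tissue check is placed before the percentage check so that a gene detected in exactly one tissue is never labeled "many" in a small tissue panel; this ordering is an assumption.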

Origin of proteins secreted in the extracellular matrix

The analysis of gene expression showed that the tissue enriched genes encoding ECM proteins were mainly enriched in placenta (n=8), followed by brain (n=3), endometrium (n=2), epididymis (n=2), bone marrow (n=1) and pituitary gland (n=1).

Figure 4. Number of tissue enriched genes encoding proteins that are locally secreted to the extracellular matrix, according to the tissue with highest mRNA level.

Matrix metalloproteinases

The ECM is constantly remodeled under tight control to maintain homeostasis. Matrix metalloproteinases (MMPs) are classified into six groups and are all synthesized as zymogens, inactive enzyme precursors that are either secreted or bound to the cell membrane. MMPs are the main enzymes involved in degradation of the ECM, but they are also known to cleave cell surface receptors, release apoptotic ligands and inactivate chemokines and cytokines. Hence, their activity increases during physiological or pathological processes such as tissue repair, morphogenesis, angiogenesis and remodeling in e.g. diseased or inflamed tissues. MMP8 and MMP13 are two examples of collagenases, which degrade interstitial collagens.

MMP8 - spleen

MMP13 - cartilage

Proteoglycans

Proteoglycans consist of a core protein with one or more covalently bound glycosaminoglycan chains (long unbranched polysaccharides), which give them their characteristic ability to bind water and various biomolecules, and thus to provide hydration and resistance to compression in the ECM. Proteoglycans are found both in the cell membrane and in the ECM, and they may interact with e.g. growth factors, cytokines, chemokines and cell surface receptors through the core protein or the glycosaminoglycan chains. Two examples from the lectican family of proteoglycans are aggrecan (ACAN), a major proteoglycan in cartilage, and versican (VCAN), which is expressed in multiple tissues, where it limits the degree of cell adhesion to the ECM. Small leucine-rich proteoglycans (SLRPs), another family of proteoglycans, play important roles both in creating proper ECM structure and in signaling processes within the ECM. Biglycan (BGN) is an SLRP and a damage-associated molecular pattern (DAMP) molecule that lies dormant in the ECM as a structural component until the ECM is broken down, upon which biglycan binds and activates inflammatory receptors and induces sterile inflammation, a process commonly seen in tissues with active remodeling of the ECM, such as the placenta and cancer tissues.

ACAN - cartilage

VCAN - placenta

BGN - placenta

Collagens

Collagens are the most abundant proteins in the ECM, with 28 different types, and constitute almost 30 percent of all protein in humans. Collagens are secreted into the ECM mainly by fibroblasts, and the collagen family can be divided into seven categories: fibrillar collagens, network-forming collagens, FACITs (fibril-associated collagens with interrupted triple helices), MACITs (membrane-associated collagens with interrupted triple helices), anchoring fibrils, beaded-filament-forming collagens and MULTIPLEXINs (multiple triple-helix domains with interruptions). The most common type of collagen is type I, a fibrillar collagen. Fibrillar collagens are widely expressed, are mainly found in the connective tissue of skin, tendons, blood vessels, organs and bone, and give tissues their tensile strength; an example is COL2A1, which is expressed in cartilage. Examples of other types of collagens include COL4A2, a major structural component of basement membranes, and COL17A1, a MACIT type of collagen involved in keratinocyte adhesion in the skin.

COL2A1 - cartilage

COL4A2 - kidney

COL17A1 - skin

Laminins

Laminins are large glycoproteins composed of three polypeptide chains, an α-chain, a β-chain and a γ-chain, and they are the major component of one of the basement membrane layers, the basal lamina. Fifteen laminins have been identified, and every basement membrane contains at least one member of the laminin family. Laminins may form networks through association with other matrix components, such as the network-forming type IV collagens, fibronectin and entactin, and they bind to receptors in the plasma membrane of cells adjacent to the basement membrane, but they can also self-associate to form sheets. Thereby, laminins may regulate cellular activities such as cell differentiation, signaling, migration, adhesion and survival. LAMB2 and LAMB1 are examples of laminin β-chains expressed in basement membranes.

LAMB2 - duodenum

LAMB1 - placenta

Fibronectin

Fibronectin is a glycoprotein composed of two almost identical subunits, produced from a single gene that generates more than 20 variants through alternative splicing. Fibronectin exists in two forms: soluble plasma fibronectin, a major protein component of blood plasma produced mainly by hepatocytes in the liver, and insoluble cellular fibronectin, a major component of the extracellular matrix secreted primarily by fibroblasts. Fibronectin is a very large molecule and mediates a variety of cellular interactions such as cell adhesion, growth and differentiation. As mentioned, fibronectin can be found in both plasma and the ECM; FN1 is shown below in the placenta and rectum.

FN1 - placenta

FN1 - rectum

Elastin

Elastin is a highly elastic protein and the main component of elastic networks, in which it is linked to fibrillar proteins that act as a scaffold. It is derived from tropoelastin, a small soluble precursor that crosslinks to itself to form a larger complex. This elastic network gives organs and tissues such as blood vessels, skin, heart, bladder and lung their ability to stretch and recoil. The same elasticity allows nerves to withstand deformation without breaking. Elastic fibers are stable, with little to no turnover, and are assembled early in human development. Thus, when elastic fibers are damaged in adults, the affected tissues may lose some of their normal function or be repaired improperly. ELN is a non-collagenous protein which, in contrast to collagens, gives elasticity to tissues. A network of elastic fibers is shown below in the gallbladder and skin.

ELN - gallbladder

ELN - skin