I-type lectins (Siglecs)

Structure of Siglec-7 CRD


I-type lectins are proteins that bind carbohydrate ligands through immunoglobulin superfamily domains (I-type CRDs).  A number of disparate immunoglobulin superfamily members may bind carbohydrate ligands, but the best characterized of the I-type lectins are members of the siglec family of sialic acid-binding cell surface adhesion receptors (Crocker et al., 1998).  Siglecs are type I transmembrane proteins in which a single N-terminal V-set immunoglobulin-like domain functions as a sialic acid-binding CRD and is projected from the cell surface by a string of C-set immunoglobulin-like domains, the number of which varies between members of the siglec family.  An unusual inter-domain disulphide bond, characteristic of siglecs, connects the V-set domain to the adjacent C-set domain.  Although all siglecs bind sialic acid, individual family members vary in linkage specificity and in the extent to which they recognize other residues in a glycan ligand: some bind very restricted sets of ligands, while others are less discriminating.  In some cases variations in ligand specificity have been rationalized by structural studies of the ligand binding site.  Siglecs have a conserved arginine residue that makes an essential electrostatic interaction with sialic acid.  A small number of siglec-like proteins that lack this conserved arginine have been documented, including human siglec 12.  Two hydrophobic aromatic residues also contribute to ligand binding.  In some siglecs (including siglec 7, whose CRD structure is shown) residues in the loop between beta-strands C and C' are involved in fine-tuning binding specificity.

CD33 and related siglecs

The siglecs can be classified into two subgroups.  The CD33-related subgroup appears to be rapidly evolving, and has undergone complex expansion in different mammalian lineages.  Different nomenclature systems have been applied to primate and rodent siglecs, so that the CD33-related group includes CD33 (siglec 3) in both human and mouse, siglecs 5-12 in humans, and siglecs E, F, G and H in mice.  Siglecs in other mammals have also followed different evolutionary paths, making it hard to identify orthologues across species.  Pairs of functional orthologues, proposed on the basis of ligand specificity and/or expression pattern, may not be the most closely related in evolutionary terms.  The CD33-related siglecs have between 1 and 4 C-set domains and feature cytoplasmic tyrosine-based motifs involved in signalling and endocytosis.  Each is expressed on a specific combination of cells of the immune system.  B cells and monocytes each express a number of siglecs, and some members of the siglec family are also expressed on NK cells, neutrophils, basophils, eosinophils, dendritic cells and macrophages.  Expression on T cells is rare.  Many of the CD33-related siglecs have been discovered only recently, and their functions are not well defined, but the activities of these proteins are likely to involve inhibitory regulation of leukocyte function as well as other roles relating to immunity and inflammation.

The conserved siglecs: sialoadhesin, CD22 and MAG

The second subgroup of siglecs consists of siglecs 1, 2 and 4 (more frequently known as sialoadhesin, CD22 and myelin-associated glycoprotein/MAG, respectively) in both human and mouse.  The proteins in this subgroup are evolutionarily conserved: MAG is conserved in vertebrates, while sialoadhesin and CD22 are conserved at least in mammals.  Sialoadhesin is the largest member of the siglec family, with 16 C-set domains.  It is expressed on macrophages, and its long neck may allow it to interact with carbohydrate ligands on a variety of extracellular matrix and cell surface molecules.  CD22 is shorter, with 6 C-set domains, the N-terminal 4 of which are homologous to the equivalent domains in sialoadhesin.  It is found on B cells and, in addition to acting as an adhesion receptor, has a number of tyrosine-based motifs in its cytoplasmic tail that mediate signalling processes regulating B cell activity.  MAG has 4 C-set domains and is produced in two alternatively spliced forms, one with an intracellular tyrosine-based motif and one without.  It is involved in the development and maintenance of the nervous system and is the only siglec not found on cells of the immune system.
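The domain organization described above can be summarized in a small data structure. The following is only an illustrative sketch: the C-set domain counts are those given in the text, while the names `C_SET_COUNTS` and `domain_layout` are hypothetical and are not taken from any database or library.

```python
# Illustrative sketch only. Domain counts come from the text above;
# C_SET_COUNTS and domain_layout are hypothetical names.
C_SET_COUNTS = {
    "sialoadhesin (siglec 1)": 16,
    "CD22 (siglec 2)": 6,
    "MAG (siglec 4)": 4,
}


def domain_layout(n_c_set):
    """Return the extracellular Ig-like domains, N-terminus first."""
    # One N-terminal V-set domain (the sialic acid-binding CRD),
    # followed by a string of C-set domains.
    return ["V-set"] + ["C-set"] * n_c_set


for name, n in C_SET_COUNTS.items():
    layout = domain_layout(n)
    print(f"{name}: {len(layout)} Ig-like domains ({layout[0]} + {n} x C-set)")
```

The same function could be applied to the CD33-related siglecs, which according to the text have between 1 and 4 C-set domains.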


Structure of Sialoadhesin ligand binding site

Sugar binding site

Structure adapted from May AP et al. (1998) Mol. Cell 1:719



This page last updated:
Wednesday, 01 January 2014
Contact information:
Kurt Drickamer
Division of Molecular Biosciences
Faculty of Natural Sciences
Imperial College London
Email: k.drickamer@imperial.ac.uk