Ficolins - sequence alignment

Alignment of the fibrinogen-like domains of ficolins and horseshoe crab tachylectins. The extreme N-termini of the domains are omitted because of low similarity. Regions of secondary structure are indicated above the alignment and coloured to correspond with the structure of TL-5A (left). Beta-strands are labelled b2-b12 and highlighted in blue; alpha-helices are labelled a3-a9 and highlighted in red. Residues that are identical in over 75% of the aligned sequences are highlighted in green, and residues that are similar in over 75% of the sequences are highlighted in cyan. Residues that are less well conserved but are implicated in sugar binding by the structure of TL-5A are highlighted in light green. Conserved cysteine residues that form disulphide bonds are highlighted in yellow. Residues in the TL-5A crystal structure that make contacts with GlcNAc are indicated by S, and residues that coordinate the calcium ion are indicated by C. At the non-conserved positions marked C, the calcium ion is coordinated by the main chain rather than by the side chain. The sequence of human fibrinogen beta is aligned beneath the ficolin sequences; [....] indicates insertions in fibrinogen that are not present in ficolins.
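The 75% identity criterion in the legend can be illustrated with a short sketch. This is not the procedure used to prepare the figure. The function name `conserved_columns`, the treatment of gaps (a gapped sequence still counts in the denominator) and the five-sequence excerpt are assumptions made for illustration only. The "similar residue" criterion is not sketched because the legend does not say which residue groupings were used.

```python
from collections import Counter


def conserved_columns(seqs, threshold=0.75, gap_chars="-."):
    """Return (column index, residue) pairs for columns in which a single
    residue occurs in more than `threshold` of the aligned sequences.

    Assumption: gap characters never count as a residue, but sequences with
    a gap still count towards the total number of sequences.
    """
    if not seqs:
        return []
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("all aligned sequences must have the same length")

    hits = []
    for i in range(length):
        # Count each residue in this column, ignoring gap characters.
        counts = Counter(s[i] for s in seqs if s[i] not in gap_chars)
        if not counts:
            continue
        residue, n = counts.most_common(1)[0]
        if n / len(seqs) > threshold:
            hits.append((i, residue))
    return hits


# First 12 columns of five rows from the alignment on this page
# (HS M, HS H, MM A, HR 1, TT A); a subset chosen only for illustration.
example = [
    "VLCDMDTDGGGW",
    "VFCDMDTEGGGW",
    "VLCDMDVDGGGW",
    "VYCDLTSDGGGW",
    "VYCDMETDGGGW",
]

for index, residue in conserved_columns(example):
    print(index, residue)
```

With only five sequences, a column passes the threshold when at least four of them share a residue, so results from this excerpt need not match the highlighting derived from the full alignment.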
 
Accession numbers
Human (HS) Ficolin M O00602 L Q15485 H O75636
Mouse (MM) Ficolin A O70165 B O70497
Xenopus laevis (XL) Ficolin 1 Q7ZT75 2 Q7ZT74 3 Q7ZT73 4 Q7ZT72
Gallus gallus (GG) Ficolin 2 XP_415571
Halocynthia roretzi (HR) Ficolin 1 Q95PA0 2 Q95P99 3 Q95P98 4 Q966W1
Tachypleus tridentatus (TT) TL-5A Q9U8W8 TL-5B Q9U8W7
 
NB: The numbering of proteins as ficolin 1, ficolin 2, etc. in different species does not imply an orthologous relationship between these proteins.
 

Structure of TL-5A CRD with bound GlcNAc

GlcNAc and Ca2+ are shown in dark blue.  Protein Data Bank structure ID: 1JC9.

       b2    a3    b3                  a4   b4     b5      a5          b6            b7       a6   b8
      
HS M  VLCDMDTDGGGWTVFQRRM---DGSVDFYRDWAAYKQGFGSQLGEFWLGNDNIHALTAQGSSELRVDLVDFEGNHQFAKYKSFKVADEAEKYKLVLG
HS L  VLCDMDTDGGGWTVFQRRV---DGSVDFYRDWATYKQGFGSRLGEFWLGNDNIHALTAQGTSELRVDLVDFEDNYQFAKYRSFKVADEAEKYNLVLG
HS H  VFCDMDTEGGGWLVFQRRQ---DGSVDFFRSWSSYRAGFGNQESEFWLGNENLHQLTLQGNWELRVELEDFNGNRTFAHYATFRLLGEVDHYQLALG
MM A  VLCDMDVDGGGWTVFQRRV---DGSIDFFRDWDSYKRGFGNLGTEFWLGNDYLHLLTANGNQELRVDLQDFQGKGSYAKYSSFQVSEEQEKYKLTLG
MM B  VLCDMDTDGGGWTVFQRRL---DGSVDFFRDWTSYKRGFGSQLGEFWLGNDNIHALTTQGTSELRVDLSDFEGKHDFAKYSSFQIQGEAEKYKLILG
XL 1  VLCDMETDGGGWTVFQRRS---DGSVDFFRDWDSYKRGFGLQQSEFWLGNENIHLLTSTGYFQLRIDLTDFEKKHTYAAYSGFSITGDSNNYALRLG
XL 2  VLCDMETDGGGWIVFQRRA---DGSVDFNRDWNSYKRGFGRKDSEFWLGNDNLHLLTATGNFQLRVDLTDFSDKSTYASYSNFSIAEESQSYTLSLR
XL 3  VLCDMETDGGGWIVFQRRM---DGSVDFFRDWNSYKKGFGRQDSEFWLGNDNLHLLTATGNFQLRVDLTDFDKNHTSASYSNFRIAGESRNYTLSLG
XL 4  VLCDMETDGGGWIVFQRRM---DGSVDFFQDWISYKRGFGRQDSEFWLGNNNLHLLTVTGSFQLRVDLTDFGNNRTSASYSDFRIAAEAQNYTLSLG
GG 2  VFCDMDTDGGGWIVFQRRL---DGSVNFLRDWNSYKRGFGNQLTEFWLGNDNLHFLTSLGTCELRVDLRDFDNNYYFAKYASFRVLGESEKYKLVLG
HR 1  VYCDLTSDGGGWTVFQRRM---DGSVDFYRGWNEYVNGFGEKNKEFWLGLETIHQLTKNGNYELRVDIGNWEGERRYAQYGTFSIAGSNDNYRLTVG
HR 2  VYCDLTSGGGGWTVFQRRM---DGSVDFYRGWDEYVNGFGEKDKEFWLGLETIHQLTKNGNYELRVDIGNWEGERRYAQYGTFSIAGSNDNYRLTVG
HR 3  VYCDLTSDGGGWTVFQRRM---DGSVDFYRGWNEYVNGFGEKDKEFWLGLETIHQLTKNGSYELRVDIGDWEGERRYAQYGSFSIAGSNDNYRLTVG
HR 4  VYCDLTSDGGGWIVFQRRM---DGSVDFYRGWNEYVNGFGENDKEFWLGLETIHQLTKNGNYELRVDIGDWEGERRYAQYGTFSISGSNDNYRLTVG
TT A  VYCDMETDGGGWTVIQRRGNYGNPSDYFYKPWKNYKLGFGNIEKDFWLGNDRIFALTNQRNYMIRFDLKDKENDTRYAIYQDFWIENEDYLYCLHIG
TT B  VFCDMETAGGGWTVIQRRGDFGQPIQNFYQTWESYKNGFGNLTKEFWLGNDIIFVLTNQDSVVLRVDLEDFEGGRRYAEAVEFLVRSEIELYKMSFK
FIB   VYCDMNTENGGWTVIQNRQ---DGSVDFGRKWDPYKQGFGN...EYWLGNDKISQLTRMGPTELLIEMEDWKGDKVKAHYGGFTVQNEANKYQISVN

        b9        a7                          a8                     b10           a9    b11    b12   
                  S            C C C      C     S         S               SS           S           
HS M  AFVGGSAGNSLTGHNNNF-FSTKDQDNDVS-----SSNCAEKFQGAWWYADCHASNLNGLYLMGPHESYANGINWSA-AKGYKYSYKVSEMKVRPA
HS L  AFVEGSAGDSLTFHNNQS-FSTKDQDNDLN-----TGNCAVMFQGAWWYKNCHVSNLNGRYLRGTHGSFANGINWKS-GKGYNYSYKVSEMKVRPA
HS H  KFSEGTAGDSLSLHSGRP-FTTYDADHDSS-----NSNCAVIVHGAWWYASCYRSNLNGRYAVSDAAAHKYGIDWAS-GRGVGHPYRRVRMMLR
MM A  QFLEGTAGDSLTKHNNMS-FTTHDQDNDAN-----SMNCAALFHGAWWYHNCHQSNLNGRYLSGSHESYADGINWGT-GQGHHYSYKVAEMKIRAS
MM B  NFLGGGAGDSLTPHNNRL-FSTKDQDNDGS-----TSSCAMGYHGAWWYSQCHTSNLNGLYLRGPHKSYANGVNWKS-WRGYNYSCKVSE
XL 1  TFIGGDAGDSLSIHNNMA-FSTKDRDNDAH----MAGNCAQNYKGAWWYESCHSSNLNGLYQQGEHSSSINGINWRT-GRG--YSTLTRCQK
XL 2  SFMGGDAGDSLSGHKNFS-FSTKDRDN--------KSNCAHTFKGGWWYETCHYSNLNGLYLHGNHTSYANGVNWST-GRG--YITHTRCPK
XL 3  TFTGGDAGDSLSGHKNKG-FSTKDRDNDSS----PSS-CAERYKGAWWYTSCHVSHLNGLYLGGKHSSSANGVNWRS-GRGFNYSYKVSEMKFRPQS
XL 4  TFTGGDAGDSLYGHKNKG-FSTKDRDNDSS----PAS-CAERYRGAWWYTSCHSSNLNGLYLRGNHSSFANGVNWKS-GRGYKYSYEVSEIKFRPQP
GG 2  DFLGGNAGDSLSYHKDMS-FSTADQDNDMS-----SFNCATAYKGAWWYNDCHYSNLNGMYWLGAHGSYADGINWKT-GKEYHYSHKRTEMKFRPI
HR 1  DY-SGTAGDSMTPRSNGQQFTTKDRDNDGS-----GGNCAVEWSGAWWYEKCHVSNLNGIYLVGGTGATSKNVAWYHWGNNHVYSFKFTEIKFRRKQN
HR 2  EY-SGTAGDSLIANHNGKQFSTKDRDNDEY-----GSNCAVQWSGAWWYKSCHYSNLNGIYLVRGTGATAKNVAWYHWGNNYVYSFKFTEIKFRKKQN
HR 3  EY-SGTAGDSMTPRSNGQQFSTKDRDNDGWA----AGHCAIDWSGAWWYGICHYSNLNGIYLVGGTGATPKNVAWYHWGNNHVYSFKFTEIKFRKKQK
HR 4  DY-SGTAGDSLIGHHNGQQFSTKDQDNDGN-----SGNCAVSYTGAWWYQSCYNSNLNGVYHVGGTGANDKNIAWWQWKNTHNYSYKFTEIKFRKKQN
TT A  NY-SGDAGNSFGRHNGHN-FSTIDKDHDTHE-----THCAQTYKGGWWYDRCHESNLNGLYLNGEHNSYADGIEWRAW-KGYHYSLPQVEMKIRPVEFNIIGN
TT B  TY-KGDAGDSLSQHNNMP-FTTKDRDNDKWE-----KNCAEAYKGGWWYNACHHSNLNGMYLRGPHEESAVGVNWYQW-RGHNYSLKVSEMKIRPIIFVPGEGLPK
FIB   KY-RGTAGNAL........FSTYDRDNDGWLTSDPRKQCSKEDGGGWWYNRCHAANPNGRYY......TDDGVVWMNW-KGSWYSMRKMSMKIRPFFPQQ

________________________________________________________________________________________________________

This page last updated:
Tuesday, 17 October 2006
Contact information:
 
Kurt Drickamer
Division of Molecular Biosciences
Faculty of Natural Sciences
Imperial College London
 
Email: k.drickamer@imperial.ac.uk