Pfam is a large collection of multiple sequence alignments and hidden
Markov models covering many common protein domains and families.
For each family in Pfam you can:
Look at multiple alignments
View protein domain architectures
Examine species distribution
Follow links to other databases
View known protein structures
For more information on Pfam, on using this site, or on the changes
between Pfam releases 17.0 and 18.0, click here.
Pfam can be used to view the domain organisation
of proteins. A typical example is shown below. Notice that a single
protein can belong to several Pfam families.
75% of protein sequences have at least one match to Pfam. This number
is called the sequence coverage and is shown in the pie chart on the right.
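As a rough illustration of how a sequence coverage figure is computed, the sketch below counts the fraction of sequences that have at least one Pfam match. The sequence identifiers, the family names, and the sequence_coverage helper are invented for this example and are not taken from Pfam.

```python
def sequence_coverage(matches_by_sequence):
    """Return the fraction of sequences that have at least one Pfam match.

    matches_by_sequence maps a sequence identifier to the list of Pfam
    families matched on that sequence (an empty list means no match).
    """
    if not matches_by_sequence:
        raise ValueError("need at least one sequence")
    covered = sum(1 for families in matches_by_sequence.values() if families)
    return covered / len(matches_by_sequence)


# Hypothetical toy data: four sequences, three with at least one match.
toy_matches = {
    "seq1": ["family_A", "family_B"],  # one protein can belong to several families
    "seq2": ["family_A"],
    "seq3": [],                        # no match: not covered
    "seq4": ["family_C"],
}

coverage = sequence_coverage(toy_matches)
print(f"sequence coverage: {coverage:.0%}")
```

A sequence counts once however many families it matches, so multi-domain proteins do not inflate the figure.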
Pfam is a database in two parts. The first, Pfam-A, is the curated
part of Pfam, containing over 7973 protein families. To give
Pfam more comprehensive coverage of known proteins, we automatically
generate a supplement called Pfam-B. This contains a large number of
small families taken from the PRODOM database that do not overlap with
Pfam-A. Although of lower quality, Pfam-B families can be useful when
no Pfam-A families are found.
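One way to picture how the two parts might be used together is to prefer curated matches and consult the automatic supplement only when there are none. This is a minimal sketch under that assumption. The families_for helper and all identifiers in the toy data are hypothetical and do not reflect how the Pfam site itself is implemented.

```python
def families_for(sequence_id, pfam_a_hits, pfam_b_hits):
    """Prefer curated Pfam-A matches; fall back to Pfam-B only if none exist.

    Both *_hits arguments map a sequence identifier to a list of family names.
    Returns a (source, families) pair; source is None when nothing matches.
    """
    curated = pfam_a_hits.get(sequence_id, [])
    if curated:
        return ("Pfam-A", curated)
    automatic = pfam_b_hits.get(sequence_id, [])
    if automatic:
        return ("Pfam-B", automatic)
    return (None, [])


# Hypothetical toy data, invented for illustration.
pfam_a_hits = {"seq1": ["curated_family_1"]}
pfam_b_hits = {"seq1": ["auto_family_9"], "seq2": ["auto_family_3"]}

for seq in ("seq1", "seq2", "seq3"):
    print(seq, families_for(seq, pfam_a_hits, pfam_b_hits))
```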
You can read the Pfam paper:
The Pfam Protein Families Database
Alex Bateman, Lachlan Coin, Richard Durbin, Robert D. Finn, Volker Hollich, Sam Griffiths-Jones, Ajay Khanna, Mhairi Marshall, Simon Moxon, Erik L. L. Sonnhammer, David J. Studholme, Corin Yeats and Sean R. Eddy
Nucleic Acids Research (2004) Database Issue 32:D138-D141 (Reproduced with permission from NAR Online)
You can also download the Pfam database and, for instance, search it locally using the HMMER hidden Markov model software. The files are available from the ftp site.
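A local search could be scripted along the following lines. This sketch assumes a HMMER 3 installation, where the search program is hmmscan and the downloaded HMM file must first be indexed with hmmpress; older HMMER 2 releases used a program called hmmpfam instead. The file names and the build_hmmscan_command helper are placeholders, and the sketch only builds the command line without running it.

```python
import shlex


def build_hmmscan_command(hmm_db, fasta_file, table_out):
    """Build an hmmscan command that writes a per-sequence hit table.

    Assumes HMMER 3 and that hmm_db has already been indexed with hmmpress.
    """
    return ["hmmscan", "--tblout", table_out, hmm_db, fasta_file]


# Placeholder file names, chosen for illustration only.
cmd = build_hmmscan_command("Pfam-A.hmm", "proteins.fasta", "hits.tbl")
print(shlex.join(cmd))
```

To actually run the search, the list can be passed to subprocess.run(cmd, check=True) once HMMER is installed and the database files are in place.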