MLST was developed for Streptococcus dysgalactiae subspecies equisimilis (SDSE) by the laboratories of David McMillan (Queensland Institute of Medical Research), Mario Ramirez (Universidade de Lisboa) and Debra Bessen (New York Medical College), in cooperation with the lab of Bernard Beall (Centers for Disease Control and Prevention). The internet-accessible database, funded by the Wellcome Trust and hosted at Imperial College London, allows unambiguous comparison of data between different laboratories.

The SDSE MLST database currently contains information on over 150 isolates obtained from cases of both serious invasive disease and non-invasive disease. Many of the more than 50 recognised emm-types are represented in this set. Isolates carrying the Lancefield group C, G and L carbohydrates are included.

Investigators carrying out MLST on this species are encouraged to submit their data to the curator so that allelic profiles and strain details can be added to the database.  In this way the MLST database becomes an increasingly useful resource for the SDSE community.

McMillan, D.J., D.E. Bessen, M. Pinho, C. Ford, G.S. Hall, J. Melo-Cristino, and M. Ramirez, 2010. Population genetics of Streptococcus dysgalactiae subspecies equisimilis reveals widely dispersed clones and extensive recombination. PLoS ONE, in press.

Ahmad Y, Gertz RE, Jr., Li Z, Sakota V, Broyles LN, et al. (2009) Genetic relationships deduced from emm and multilocus sequence typing of invasive Streptococcus dysgalactiae subsp. equisimilis and S. canis recovered from isolates collected in the United States. J Clin Microbiol 47: 2046-2054.

Acknowledging the use of the MLST databases in your publications

Please acknowledge the use of this site in your publications as follows: 'We acknowledge the use of the Streptococcus dysgalactiae subspecies equisimilis MLST database which is located at Imperial College London and is funded by the Wellcome Trust'.

The seven loci and the primers and conditions used for PCR

The SDSE MLST scheme uses internal fragments of seven housekeeping genes amplified by PCR using the primer pairs listed below. Alternative sets of primers were developed for two loci (gtr, murI).

Gene product function (locus) | Size of amplicon used for assigning alleles (bp) | Forward primers | Sequence (5’ to 3’) of forward primers | Reverse primers | Sequence (5’ to 3’) of reverse primers
glucose kinase (gki) | | | | |
glutamine transport protein (gtr) | | | | |
glutamine transport protein (gtr) | | | | |
glutamate racemase (murI) | | | | |
glutamate racemase (murI) | | | | |
DNA mismatch repair protein (mutS) | | | | |
transketolase (recP) | | | | |
xanthine phosphoribosyl transferase (xpt) | | | | |
acetoacetyl-CoA thiolase (atoB) | | | | |

PCR conditions

The PCR reactions are performed in volumes of 50 µL, with an initial denaturation at 95°C for 5 min, followed by 28 cycles of 95°C for 1 min, 55°C for 1 min and 72°C for 1 min. The amplified DNA fragments are purified either by precipitation with polyethylene glycol or using a commercial PCR purification kit. The sequence of each fragment is obtained on both strands using the same primers as those in the initial PCR amplifications.

As the same primers are used for amplification and sequencing, it is important that only a single DNA fragment is amplified in the initial PCR. This may involve some optimisation of the annealing temperature and other PCR conditions in individual laboratories.
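For laboratories that keep their cycling programme in a script or notebook, the conditions above can be written down as data so that the annealing temperature is a single value to adjust during optimisation. This is only an illustrative sketch: the names Step, CYCLE and total_hold_minutes are hypothetical and are not part of any thermocycler software. It also adds up the programmed hold times, ignoring ramping between temperatures.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    """One hold in the programme (hypothetical helper type)."""
    name: str
    temp_c: int
    minutes: int


# Values taken from the PCR conditions described above.
INITIAL_DENATURATION = Step("initial denaturation", 95, 5)
CYCLE = (
    Step("denaturation", 95, 1),
    Step("annealing", 55, 1),   # the step most likely to need optimisation
    Step("extension", 72, 1),
)
N_CYCLES = 28


def total_hold_minutes(initial, cycle, n_cycles):
    """Sum of programmed hold times; ramp times are not included."""
    return initial.minutes + n_cycles * sum(step.minutes for step in cycle)


print(total_hold_minutes(INITIAL_DENATURATION, CYCLE, N_CYCLES))  # prints 89
```

The run therefore holds for 89 minutes in total (5 min plus 28 cycles of 3 min) before any ramping time is counted.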

Obtaining an allelic profile and comparing your strains with those in our database

The allelic profile of a strain is based on the sequences of internal fragments of the seven housekeeping genes. The sequences need to be trimmed so that they correspond exactly to the region that we use to define the alleles. The sequences of the seven loci from a typical SDSE isolate can be obtained below and can be used to check that your sequences have been trimmed correctly. The sequences must be obtained on both strands and must be 100% accurate, since even a single error may convert a known allele into a novel allele.
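The requirement for agreement between both strands can be checked with a few lines of code before a sequence is submitted. The sketch below is a minimal illustration, assuming both reads have already been base-called and trimmed to the allele-defining region; the function names and the toy sequences are hypothetical, and real reads would normally be assembled and inspected in trace-viewing software first.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an unambiguous DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def consensus_from_both_strands(forward_read, reverse_read):
    """Return the sequence if the two strands agree exactly, else raise."""
    fwd = forward_read.upper()
    rev = reverse_complement(reverse_read)
    if len(fwd) != len(rev):
        raise ValueError(f"strand lengths differ: {len(fwd)} vs {len(rev)}")
    mismatches = [i for i, (a, b) in enumerate(zip(fwd, rev)) if a != b]
    if mismatches:
        raise ValueError(f"strands disagree at 0-based positions {mismatches}")
    return fwd


def has_reference_length(query, reference):
    """Cheap trimming check: same length as the correctly trimmed reference."""
    return len(query) == len(reference)


# Toy example only; these are not real SDSE sequences.
print(consensus_from_both_strands("ATGCA", "TGCAT"))  # prints ATGCA
```

A length match against the reference sequence does not prove that the trimming is correct, but a mismatch shows immediately that it is not.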

Click a locus name below to obtain a correctly trimmed sequence for that locus: gki | gtr | murI | mutS | recP | xpt | atoB

The SDSE database can be interrogated in a variety of ways:

The locus query options allow you to obtain an allele number for each of your sequences. You can assign your alleles one locus at a time by selecting the single locus option or, by using the multiple locus option, you can cut and paste the correctly trimmed sequences for all seven loci of a query strain into the corresponding boxes.

The software will check that the sequences are the correct length and that they do not contain any unrecognised characters. A check is also made to see if the submitted sequence is at least 70% similar to another allele at that locus (in case you have cut and pasted a sequence into the wrong box, or selected the wrong locus from the drop-down menu). If the sequence corresponds to a known allele, the allele number will be returned. If the sequence appears to be a new allele, it should be compared with the sequence of the most similar allele for that locus to check that any nucleotide differences are real. If you are convinced you have a new allele, you should submit the sequence traces to the database curator, who will check your data, provide you with a new allele number, and add your new allele to the database.
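The checks described above can be mimicked locally before pasting sequences into the website. The following is a rough sketch and not the database's actual code: it assumes that all alleles at a locus have the same length, that similarity is measured position by position without alignment, and that known alleles are held in a plain dictionary. The function names and the toy alleles are hypothetical.

```python
VALID_BASES = set("ACGT")


def percent_identity(a, b):
    """Position-by-position identity between two equal-length sequences."""
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


def query_locus(sequence, known_alleles, min_identity=70.0):
    """Classify a trimmed sequence against one locus.

    known_alleles maps allele number -> trimmed sequence (assumed equal length).
    Returns ("known", n), ("new", closest_n) or ("error", message).
    """
    seq = sequence.strip().upper()
    expected_len = len(next(iter(known_alleles.values())))
    if len(seq) != expected_len:
        return ("error", f"expected {expected_len} bp, got {len(seq)}")
    bad = set(seq) - VALID_BASES
    if bad:
        return ("error", f"unrecognised characters: {sorted(bad)}")
    for number, allele in known_alleles.items():
        if allele == seq:
            return ("known", number)
    closest = max(known_alleles,
                  key=lambda n: percent_identity(seq, known_alleles[n]))
    if percent_identity(seq, known_alleles[closest]) < min_identity:
        return ("error",
                f"less than {min_identity:g}% similar to every allele; wrong locus?")
    # Candidate new allele: compare against `closest` and check the traces.
    return ("new", closest)


# Toy alleles only; real alleles are several hundred bp long.
toy_locus = {1: "ACGTACGTAC", 2: "ACGTACGTAA"}
print(query_locus("ACGTACGTAA", toy_locus))  # prints ('known', 2)
```

A "new" result here is only a prompt to re-examine the traces at the differing positions; allele numbers are assigned by the curator, never locally.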

The profile query options allow you to search the database for allelic profiles matching your own and to obtain information on strains with that allelic profile.   After you have obtained the allele numbers at each locus for your query strain, you can select allelic profile query and enter the seven integers.  If the allelic profile is in the database, the sequence type assigned to this allelic profile will be returned along with details of any SDSE isolates that are identical to the one you submitted.  You can also search for isolates that have allelic profiles that are similar to yours (e.g. isolates that have at least 4/7, 5/7 or 6/7 matches to the submitted allelic profile) and show relationships between your query strain and these strains by using the tree button.
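The logic of the profile query can be illustrated with a small lookup over a table of sequence types. This is a sketch under stated assumptions: the ST numbers and allele numbers below are invented for illustration, the table is a plain dictionary rather than the real database, and the locus order follows the list of loci given earlier on this page.

```python
LOCI = ("gki", "gtr", "murI", "mutS", "recP", "xpt", "atoB")


def count_matches(profile_a, profile_b):
    """Number of loci (out of seven) at which two profiles share an allele."""
    return sum(a == b for a, b in zip(profile_a, profile_b))


def lookup_st(profile, st_table):
    """Return the ST whose allelic profile is identical to `profile`, or None."""
    for st, known in st_table.items():
        if tuple(known) == tuple(profile):
            return st
    return None


def similar_sts(profile, st_table, min_matches):
    """STs sharing at least `min_matches` alleles, best matches first."""
    hits = [(st, count_matches(profile, known)) for st, known in st_table.items()]
    return sorted((h for h in hits if h[1] >= min_matches),
                  key=lambda h: (-h[1], h[0]))


# Hypothetical table: these ST and allele numbers are made up.
toy_sts = {
    1: (1, 1, 1, 1, 1, 1, 1),
    2: (1, 1, 2, 1, 1, 1, 1),
    3: (2, 3, 2, 4, 1, 1, 5),
}
query = (1, 1, 2, 1, 1, 1, 1)
print(lookup_st(query, toy_sts))       # prints 2
print(similar_sts(query, toy_sts, 4))  # prints [(2, 7), (1, 6)]
```

Here the hypothetical ST 3 shares only three alleles with the query, so it is excluded by the 4/7 threshold.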

Further details about strains that are identical, or similar, to the query strain can be obtained by clicking on the strain names.

There is also an option to perform a database query (e.g. to look at the details of all strains of a particular emm-type) or for more advanced querying.

If you have sequenced a large number of strains, options are available in the batch query menu to allow data from multiple strains to be entered simultaneously. 

For many of these pages, help boxes (?) are available with further details on how to enter and retrieve data.

Submitting your data to the MLST database

Each database is maintained by a curator and data can only be entered into a database by a curator.  The curators of the SDSE database are:
David McMillan
Mario Ramirez
Debra Bessen


Submitting a new allele

Please send (preferably by email) two sequence trace files (one in each direction; note: these do not have to be edited) for the new allele to the database curator, along with the trimmed sequence (in a text file or within the body of the email) of the proposed new allele. 

Upon visual inspection of the trace files the curator will assign an allele number and enter the sequence of the new allele into the database.  If the curator believes the trace files do not clearly show the identity of the unique nucleotide(s) a number will not be assigned.  The curator will contact you explaining the reasons why this allele was not accepted and give you the opportunity to submit another trace file for this allele.

NOTE: Several SDSE alleles have 100% identity to S. pyogenes alleles; however, the allele number assignments are completely independent of one another.

Submitting a new allelic profile

To be assigned a new ST designation, you should submit the allelic profile and information on a representative strain, with epidemiological data, to the database curator, who will enter it in the MLST database and assign an ST number. If the new allelic profile contains a new allele, sequence trace files need to be sent to the curator as described above.

Submission of a new ST that is a novel combination of known alleles does not require the submission of sequence trace files. There is, of course, the potential that one of these alleles has been sequenced incorrectly, and the onus is on the submitter to ensure that the allelic profile is correct. It is strongly recommended that if a new ST is identified that varies at only a single locus from a previously identified ST, sequencing of this variant locus is repeated. It is also strongly recommended that if a new emm-type/ST combination is identified that differs from a previously identified emm-type/ST combination, sequencing of the emm gene and one or more of the variant housekeeping loci is repeated.

If you are submitting information on a number of strains at one time, a template Excel form is available which can be used for submissions. The template can be obtained from the database curator.
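Before submitting a new ST, it can help to check automatically whether it is a single-locus variant of an existing ST, since that is the case in which re-sequencing is recommended. The sketch below assumes the known STs are available locally as a dictionary; the function names and the ST and allele numbers are hypothetical.

```python
LOCI = ("gki", "gtr", "murI", "mutS", "recP", "xpt", "atoB")


def variant_loci(new_profile, known_profile, loci=LOCI):
    """Names of the loci at which two allelic profiles differ."""
    return [locus
            for locus, a, b in zip(loci, new_profile, known_profile)
            if a != b]


def loci_to_resequence(new_profile, st_table):
    """Map each existing ST that differs at exactly one locus to that locus."""
    flagged = {}
    for st, known in st_table.items():
        diff = variant_loci(new_profile, known)
        if len(diff) == 1:
            flagged[st] = diff[0]
    return flagged


# Hypothetical table: these ST and allele numbers are made up.
toy_sts = {
    1: (1, 1, 1, 1, 1, 1, 1),
    2: (1, 1, 2, 1, 1, 1, 1),
    3: (2, 3, 2, 4, 1, 1, 5),
}
print(loci_to_resequence((1, 1, 1, 1, 1, 1, 9), toy_sts))  # prints {1: 'atoB'}
```

In this toy case the candidate profile differs from the hypothetical ST 1 only at atoB, so atoB is the locus to sequence again before submission.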

Submitting strain information

Investigators are strongly encouraged to submit ST and strain information on all their isolates, not just ones with new STs.  The database will be of most use to researchers if as much information as possible on as many isolates as possible is included. 

To submit information on isolates with previously reported STs, a template Excel form can be used. This form can be obtained from the database curator.

