Prof. Itay Mayrose Lab - Plant Evolution, bioinformatics, & comparative genomics

Clumpak - Cluster Markov Packager Across K



IMPORTANT NOTICE:

This is a temporary site for your convenience. Our permanent website is temporarily down due to serious security problems. If you run into problems please email evolseq@tauex.tau.ac.il and specify the job ID.
If you haven't received any results within a few hours, please contact us and re-submit. See Known Issues. Thank you!


Main Pipeline


CLUMPAK - Clustering Markov Packager Across K - was developed to help users analyse the results of STRUCTURE-like programs. The software offers a few alternative modes of action; please go to the Help section for details about these modes.

The main pipeline summarises and graphically represents results that the user has already obtained with a STRUCTURE-like program. The input files required by the main pipeline are the Q-matrices obtained by STRUCTURE, or by other STRUCTURE-like programs, properly formatted to match one of the two input formats supported by CLUMPAK. We currently support up to 5000 individuals in a dataset, due to graphical constraints. Since STRUCTURE format is fully supported, STRUCTURE runs do not have to be modified and can be submitted as is. STRUCTURE runs can also be truncated, such that each run is reduced to the Q-matrix part. See the Help section or the links below for examples. An optional input that affects the graphical representation of the results is a file containing population codes and names, in the order that the user wishes to see in the results. For further details on supported formats and how to zip input files, please refer to the Help section.


Run Main Pipeline (click for Instructions):

The required input is all or part of the output of STRUCTURE (or STRUCTURE-like) runs that were produced for the same data set over a range of K values. For example, the input might consist of 10 runs for each K value, with K ranging from 2 to 10. We currently support up to 5000 individuals in a dataset, due to graphical constraints. CLUMPAK fully supports STRUCTURE format, so STRUCTURE outputs do not require any modification. For other STRUCTURE-like programs, small modifications might be required. Please refer to the Help section for a detailed discussion of the supported formats.

If you are using Linux, you can use the command 'zip' to zip result files. If you are using Windows, you can use WinRAR with the 'zip' option to zip files. Result files can be zipped together as one zip regardless of their K value, or zipped separately for each K, and then zipped together to one final zip. Zip files can contain sub-folders or sub-zips, but CLUMPAK expects text files and zip files only. Other files will generate an error.
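As an alternative to the 'zip' command or WinRAR, the packaging step can be sketched in Python with the standard zipfile module. This is only an illustration: the function name zip_runs and the directory layout are hypothetical, and skipping hidden files (such as .DS_Store) is an assumption made here to avoid uploading non-result files, not a documented CLUMPAK rule. The sketch does not check that the remaining files are text files.

```python
import zipfile
from pathlib import Path


def zip_runs(run_dir, zip_path):
    """Bundle every regular, non-hidden file under run_dir into one zip.

    Sub-folders are preserved inside the archive.
    Returns the sorted list of archive member names.
    """
    run_dir = Path(run_dir)
    zip_path = Path(zip_path)
    members = []
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for path in sorted(run_dir.rglob("*")):
            if not path.is_file():
                continue  # directories are implied by member paths
            if path.name.startswith("."):
                continue  # assumption: hidden files are not result files
            if path.resolve() == zip_path.resolve():
                continue  # never add the archive to itself
            arcname = path.relative_to(run_dir).as_posix()
            zf.write(path, arcname)
            members.append(arcname)
    return sorted(members)
```

For example, zip_runs("structure_results", "structure_results.zip") would collect all runs, for every K, into a single archive that can then be uploaded.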

A second input file, which is optional and affects only the graphical display of the results, is a text file that contains two columns: the first column contains the population code (an integer) for each population, and the second column contains the population name. The format of this file is identical to the format of INFILE_LABEL_BELOW in DISTRUCT. If the file is provided, the order of the populations in its rows determines their left-to-right order in the graphs. If this file is not provided, the default is to print the population codes as labels.
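The two-column layout described above can be checked before upload with a short Python sketch. The function name parse_populations is hypothetical, and the sketch assumes that the columns are separated by whitespace and that everything after the code is the population name; the Help section and the DISTRUCT documentation remain the authority on the exact format.

```python
def parse_populations(text):
    """Parse 'code name' rows into an ordered list of (code, name) pairs.

    The row order is preserved, since it sets the left-to-right
    order of populations in the graphs.
    """
    order = []
    seen = set()
    for lineno, line in enumerate(text.splitlines(), start=1):
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        parts = line.split(None, 1)  # assumption: whitespace-separated
        if len(parts) != 2:
            raise ValueError(f"line {lineno}: expected a code and a name")
        try:
            code = int(parts[0])
        except ValueError:
            raise ValueError(f"line {lineno}: code must be an integer")
        if code in seen:
            raise ValueError(f"line {lineno}: duplicate code {code}")
        seen.add(code)
        order.append((code, parts[1]))
    return order
```

With the hypothetical content "3 Europe", "1 Africa", "2 Asia" on three rows, the populations would be listed in that order (Europe, Africa, Asia) from left to right.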



Upload zip file containing STRUCTURE/ADMIXTURE runs:

Please indicate the format of the uploaded result files:
? example
? example

Upload populations file (optional): ? example

Upload labels file for DISTRUCT (optional): ? example

Computational:

CLUMPP: default parameters are the LargeKGreedy algorithm, random input order, and 2000 repeats

Search method:
Full search
Greedy
LargeKGreedy

Please choose the number of random input order repeats:

MCL: threshold for similarity scores:

Default (dynamic)
User Defined: (0≤X<1)

DISTRUCT: threshold for minimal cluster size:

Default
User Defined: (0≤X<1)

Graphical:

Upload a drawparams file (optional): ? example

Upload a colors file (optional): ? example

I agree that the data I submit will be used anonymously for performance evaluation