Welcome. Thank you for using the PSMix program. This program uses a maximum likelihood approach (EM algorithm) to infer population structure via a mixture model using multilocus genotype data.

1. Install

For Windows: Download PSMix_1.0-6.zip to your local hard drive. Open Rgui (the R graphical interface), go to the menu "Packages\install packages from local (zip) files ...", then select the provided zip file. Go to the menu "Packages\load package..." to load the package into R for use.

For Linux/Unix: Download PSMix_1.0-6.tar.gz to your local hard drive. Install the R package from the source file as follows: first go to the directory where the downloaded file is stored, then run "R CMD INSTALL PSMix_1.0-6.tar.gz". For more options, please see the R help on "INSTALL".

2. Input

Genotype data should be stored in a data frame or matrix, with each column holding the information for one marker and two consecutive rows holding the genotype of each individual. The following is a sample input file format:

234 130 na 227 ...
232 130 na 217 ...

Here the first column is the allele information for the first marker, and the first two rows represent the genotype information for the first individual. Missing values are coded as "na".

3. Usage

PSMix(K = 2, Geno, itMax = 10000, eps = 1e-06, seed = 1, MarkerVar = FALSE)

Arguments

K: number of underlying populations
Geno: genotype data
itMax: maximum number of iterations in the EM algorithm
eps: convergence criterion
seed: seed for random number generation (for reproducibility)
MarkerVar: allow individual marker admixture. This feature is not fully implemented in this version.

There is also a demo illustrating the use of the program on a subset of the Pima-Surui dataset. To see the demo, simply type demo(struc).

4. Output

A list containing the following components:

PIk: estimated probability of belonging to each subpopulation for each individual (at each marker if MarkerVar is set to TRUE, which corresponds to the third model in the paper).
Zimak: estimated probability of belonging to each population for each allele (estimated origin of the allele for the ith individual, at the mth marker, on the ath allele copy).
Gkmj: estimated allele frequencies for each subpopulation.

5. Credit

This program was developed based on the algorithm proposed in Nianjun Liu, Baolin Wu and Hongyu Zhao (2004). "Inference of Population Structure Using Mixture Model". Technical report, Division of Biostatistics, Department of Epidemiology and Public Health, Yale University. Please cite this paper if you use this program in research for publication.

6. Support

All questions and comments should be directed to Baolin Wu or Nianjun Liu. Email: baolin@biostat.umn.edu, nianjun.liu@yale.edu.
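
Example: preparing input

As an illustration of the input layout described in section 2, the following base R sketch builds a small genotype matrix with two rows per individual and one column per marker. The first two rows reuse the sample values shown in section 2; the second individual's allele values are made up for illustration. Converting "na" to R's NA on reading is an assumption, not something this README specifies, so please check the package help for how missing values should be coded before passing the data to PSMix.

```r
# Toy genotype data: one column per marker, two consecutive rows per
# individual. Rows 1-2 come from the sample in section 2; rows 3-4 are
# hypothetical values added only to make a second individual.
geno_text <- "
234 130 na 227
232 130 na 217
230 128 101 227
234 130 103 221
"

# Assumption: "na" in the file is mapped to R's NA when the file is read.
Geno <- as.matrix(read.table(text = geno_text, na.strings = "na"))

# Each individual occupies two rows, so the row count must be even.
stopifnot(nrow(Geno) %% 2 == 0)

n_ind     <- nrow(Geno) / 2   # number of individuals
n_markers <- ncol(Geno)       # number of markers

# Rows 2i-1 and 2i hold the two allele copies of individual i.
i <- 1
ind_i <- Geno[c(2 * i - 1, 2 * i), , drop = FALSE]
print(ind_i)
```

With the package loaded, a matrix prepared this way could then be passed to the function from section 3, for example PSMix(K = 2, Geno = Geno), and the components described in section 4 read from the returned list.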