This page is outdated, please also see the man page.

How to run a full pass in trans?

In a full analysis in trans, you need to run separately the nominal pass and the permutation passes. To illustrate this, first download the example data below and go through the two steps; 1. nominal pass and 2. permutation pass.


Step1: Run a nominal pass

To do so, run:

QTLtools trans --vcf genotypes.chr22.vcf.gz --bed genes.simulated.chr22.bed.gz --nominal --threshold 1e-5 --out trans.nominal

This computes all pairwise test between variants and phenotypes (excluding those in cis) and reports the results in 3 output files:

These files are detailed below:

zcat trans.nominal.best.txt.gz | head
ENSG00000215270.3 -1 1.12926e-05
ENSG00000237438.2 -1 2.03908e-05
ENSG00000273203.1 -1 0.000215518
ENSG00000273442.1 -1 1.64291e-05
ENSG00000177663.9 -1 3.32289e-05
ENSG00000183307.3 -1 5.1272e-06
ENSG00000069998.8 -1 3.91133e-06
ENSG00000093072.11 -1 1.82101e-06
ENSG00000099954.14 -1 4.99988e-07
ENSG00000182902.9 -1 9.67087e-06
zcat trans.nominal.bins.txt.gz | head
0 0 0.000231095 1 0.996523 129237
1 0.000231095 0.000462191 0.996523 0.993047 130402
2 0.000462191 0.000693286 0.993047 0.989571 131480
3 0.000693286 0.000924382 0.989571 0.986094 130623
4 0.000924382 0.00115548 0.986094 0.982619 128835
5 0.00115548 0.00138657 0.982619 0.979143 129555
6 0.00138657 0.00161767 0.979143 0.975668 131499
7 0.00161767 0.00184876 0.975668 0.972193 130907
8 0.00184876 0.00207986 0.972193 0.968719 131048
9 0.00207986 0.00231095 0.968719 0.965245 129206
zcat trans.nominal.hits.txt.gz | head
ENSG00000100154.10 chr22 29075853 22_16367079 chr22 16367079 5.25462e-08 -1 0.282742 ENSG00000188064.5 chr22 46373009 22_16505849 chr22 16505849 5.21154e-07 -1 0.261513 ENSG00000211685.2 chr22 23264766 22_16519289 chr22 16519289 4.62478e-07 -1 0.262665 ENSG00000217442.3 chr22 51001334 22_16542420 chr22 16542420 9.67611e-06 -1 0.231457 ENSG00000254413.4 chr22 51021394 22_16935663 chr22 16935663 6.40882e-06 -1 -0.235939 ENSG00000211642.2 chr22 22569184 22_16965224 chr22 16965224 8.49673e-06 -1 0.232881 ENSG00000211642.2 chr22 22569184 22_16965490 chr22 16965490 8.49673e-06 -1 0.232881 ENSG00000211642.2 chr22 22569184 22_16967698 chr22 16967698 1.7548e-07 -1 0.271811 ENSG00000211685.2 chr22 23264766 22_17043332 chr22 17043332 3.24499e-07 -1 0.266049 ENSG00000100228.8 chr22 23487513 22_17043332 chr22 17043332 9.56054e-06 -1 0.231589

This trans nominal pass can be tuned using the following options:


Step2: Run a permutation pass

The permutation pass is relatively similar to the nominal one. For a single permutation, run:

QTLtools trans --vcf genotypes.chr22.vcf.gz --bed genes.simulated.chr22.bed.gz --threshold 1e-5 --permute --out trans.perm123 --seed 123

This command permutes all the phenotypes in a similar way so that correlation between phenotypes is unchanged. All pairwise tests between variants and phenotypes are then computed (excluding the cis-window) and reported in the same 3 files than for the nominal pass: trans.perm123.best.txt.gz, trans.perm123.bins.txt.gz and trans.perm123.hits.txt.gz.

YOU NEED TO CHANGE THE SEED FOR EACH PERMUTATION USING --seed

If you wish to run 20 permutations on 4 threads, you can proceed as follows:

for p in $(seq 1 20); do
    echo "trans --vcf genotypes.chr22.vcf.gz --bed genes.simulated.chr22.bed.gz --permute --out trans.perm$p --seed $p";
done | xargs -P4 -n10 QTLtools

Produce a QQplot from a nominal and a permutation pass

To do so, you can use (and modify to fit your needs) the example script script/plotTrans.R provided with QTLtools as follows:

Rscript ~/script/plotTrans.R QQplot.pdf trans.nominal.hits.txt.gz trans.nominal.bins.txt.gz trans.perm123.hits.txt.gz trans.perm123.bins.txt.gz

For the nominal and paermutation passes you went through, you should get something like this:


Estimate FDR for each nominal hit

To do so, you can use (and modify to fit your needs) the example script script/runFdrTrans.R provided with QTLtools as follows:

Rscript ~/script/runFDR_ftrans.R trans.nominal.hits.txt.gz trans.perm123.hits.txt.gz output.txt

As an output, you get a file output.txt which contains the data in trans.nominal.hits.txt.gz with an additional column that gives the FDR estimate for each hit. This value is obtained for each hit by counting the number of permutation hits with a smaller P-values; number divided by its rank. In other words, the permutation outcome is used to estimate how many smaller P-values are obtained when drawing from the null.

These two R scripts are given as examples, they need to be (1) adapted to your needs, (2) optimized to better scale with large amout of data and (3) extended to account for the results of multiple permutations. For instance, the QQplot can be populated with multiple curves corresponding to multiple permutations and the uncertainty of FDR estimates can be accounted for by averaging across multiple permutations. In practice, we usually run a 100 permutations which gives us reasonnable resolution.
Tuesday 12th July 2016