Epistemology

Permutational Hypothesis Testing

You and your friend each rolled a die 10 times. Your friend's rolls were:

  • {1,2,3,4,5,1,2,3,4,5} with average value of 3

and yours were:

  • {1,2,3,1,2,3,1,2,3,4} with average value of 2.2

Is it fair to say that your friend's die is biased to perform better?

That is, you have two hypotheses at hand:

  • the two dice have the same distribution (the Null Hypothesis)
  • your friend's die rolls higher on average (the Alternative Hypothesis)

The trick is to assume that the two dice have the same distribution and then compute the probability that the scenario at hand (his average exceeding yours by this much or more) could happen purely by chance.

Since collecting additional samples is not possible, you can employ this idea:

The overall combined sample is {1,2,3,4,5,1,2,3,4,5,1,2,3,1,2,3,1,2,3,4} and, if the dice distributions are equal, any 10 of these 20 values were equally likely to be YOUR rolls, with the remaining 10 being your friend's. The observed mean difference is 3 - 2.2 = 0.8, so over all 184756 ways to choose 10 samples out of 20, let's compute how often the difference of means (friend's minus yours) is 0.8 or bigger.

It happens with probability of about 0.1, i.e. a difference of 0.8 or bigger shows up in roughly 10% of the cases. So it's NOT fair to conclude that his die is better: 10% is not a small enough chance to reject the Null Hypothesis. Collect more samples if you can.
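The enumeration described above can be sketched in a few lines of Python. This is a minimal illustration and not a reference implementation: the function name permutation_p_value and the small floating-point tolerance are assumptions introduced here. It treats each of the 20 rolls as distinct, so repeated values are counted once per position.

```python
from itertools import combinations

friend = [1, 2, 3, 4, 5, 1, 2, 3, 4, 5]
you = [1, 2, 3, 1, 2, 3, 1, 2, 3, 4]


def permutation_p_value(a, b):
    """One-sided exact permutation test for mean(a) - mean(b).

    Returns (p_value, number_of_splits). The name and the tolerance
    below are illustrative choices, not part of the original text.
    """
    pooled = a + b
    total = sum(pooled)
    n_a, n_b = len(a), len(b)
    observed = sum(a) / n_a - sum(b) / n_b

    hits = 0
    splits = 0
    # combinations() works by position, so equal values are kept distinct.
    for group in combinations(pooled, n_a):
        s = sum(group)
        diff = s / n_a - (total - s) / n_b
        # Count splits whose difference is at least the observed one.
        if diff >= observed - 1e-12:
            hits += 1
        splits += 1
    return hits / splits, splits


p_value, splits = permutation_p_value(friend, you)
print(f"{splits} splits, one-sided p-value = {p_value:.3f}")
```

Full enumeration is feasible here because 184756 splits is small; for larger samples a common substitute is to draw random splits instead of listing all of them.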

Here is the distribution for mean difference:

[Figure: distribution of the difference of means]

Note that nowhere did we assume that the die roll is uniform, i.e. that the numbers 1-6 are equally likely. This makes the method more broadly applicable: it does not depend on any additional assumptions about uniformity or normality of the data at hand.

Here are additional permutational hypothesis testing ideas:

  • Equality of means:
    • the example above
    • see also the Mann-Whitney rank test
  • To test equality of the mean to some value Mu:
    • the assumption is that the distribution is symmetric around Mu
    • if the distribution is symmetric around Mu, then for every value Xi the value Xi' reflected around Mu is equally likely
    • for example, if Mu = 0, then for every value "Xi" the value "-Xi" is equally likely, so compute the statistic over all 2^n sign changes
  • Equality of paired distributions:
    • you have X1 and X2 which are paired (example: student exam scores before and after 8 hours of sleep)
    • you want to test whether the distributions of X1 and X2 differ
    • if their distributions are equal, then within each pair X1i and X2i could swap places
    • compute the statistic for each of the 2^n swap outcomes and see how extreme the observed one is
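The last two ideas can share one sketch, because swapping X1i and X2i within a pair simply flips the sign of the difference X2i - X1i. The Python below is a hedged illustration: the function name sign_flip_p_value, the choice of a two-sided statistic (absolute sum of deviations), and the exam scores are all made up for this example.

```python
from itertools import product


def sign_flip_p_value(values, mu=0.0):
    """Two-sided exact sign-flip test, assuming symmetry around mu.

    Illustrative helper: enumerates all 2^n sign patterns of the
    deviations from mu and compares |sum| with the observed |sum|.
    """
    devs = [v - mu for v in values]
    observed = abs(sum(devs))

    hits = 0
    patterns = 0
    for signs in product((1, -1), repeat=len(devs)):
        stat = abs(sum(s * d for s, d in zip(signs, devs)))
        if stat >= observed - 1e-12:
            hits += 1
        patterns += 1
    return hits / patterns


# Hypothetical paired exam scores (invented for illustration only).
before = [62, 70, 55, 81, 74, 68]
after = [66, 75, 54, 88, 79, 73]

# Swapping a pair flips the sign of its difference, so the paired test
# reduces to a sign-flip test of the differences around 0.
diffs = [a - b for a, b in zip(after, before)]
p_value = sign_flip_p_value(diffs)
print(f"two-sided p-value = {p_value}")  # 4 of 64 patterns -> 0.0625
```

With n pairs there are 2^n patterns, so exhaustive enumeration is practical only for small n; beyond that, sampling random sign patterns is the usual workaround.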