Sunday, December 29, 2013

Tukey End Count, a Simple, Robust Statistical Method

The Tukey End Count is a statistically based test that answers the question: "Is A different from B?" The Tukey End Count is mathematically rigorous, can be implemented with simple equipment and requires no calculation beyond counting. That is, it is very user friendly in the field.

How it works:

Random samples of approximately the same size (count) are collected from Population A and Population B.  The populations are quarantined or marked (spray paint can work) so there is no risk of mixing.

The samples are ordered by the measurement of interest. It could be the weight of individual chestnuts (large nuts receive a big price premium). It could be the weight of acorns on the terminal 24" of a branch. It could be the length of terminal extension.

Sidebar One: One cool thing about the Tukey End Count is that it is not necessary to measure the samples with precision equipment or to run statistical calculations.  You only need to be able to order the samples.  So you can use visual comparisons instead of a micrometer.  You can use a simple balance beam comparison instead of fancy, digital scales.
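To make the "you only need to order them" point concrete, here is a minimal Python sketch that sorts items using nothing but pairwise "which side of the balance goes down?" comparisons. The item names, the hidden weights and the `balance` function are all made up for illustration; in the field the weights would never be read off at all.

```python
from functools import cmp_to_key

# Hypothetical hidden weights (grams). They exist only so the sketch can
# mimic a balance beam; the sorting code never looks at them directly.
hidden_weight = {"A1": 14.2, "A2": 11.8, "B1": 16.5, "B2": 15.1}

def balance(left, right):
    """Mimic a balance beam: negative if left is heavier, positive if
    right is heavier, zero if the beam stays level."""
    if hidden_weight[left] > hidden_weight[right]:
        return -1
    if hidden_weight[left] < hidden_weight[right]:
        return 1
    return 0

# Order heaviest-to-lightest using pairwise comparisons only.
ordered = sorted(hidden_weight, key=cmp_to_key(balance))

# Keep only the population letter; that is all the end count needs.
labels = [name[0] for name in ordered]
print(ordered)
print(labels)
```

The output of interest is `labels`, the list of population letters in rank order. No measured value is ever recorded.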

The ordered samples are then written out as a list, running from largest to smallest, that records which population each item was drawn from.


Suppose we are sorting for seedling height at the end of one growing season. Assume 100 seeds were planted from population A and another 100 seeds were planted from population B. Also assume that 20% are culled. That is, only the 80 tallest seedlings from each seedlot of 100 seeds are graded. From tallest to shortest, the combined 160 seedlings yield:

Tallest B-B-B-B-A-B....148 seedlings in the middle...B-A-A-A-A-A Shortest

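This sample has a Tukey End Count of 9. The 4 tallest seedlings were from seedlot B and the 5 shortest seedlings were from seedlot A. 4 + 5 = 9. Note that the two ends must come from different populations: if the tallest and the shortest seedling were both from the same seedlot, the end count would be zero.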
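For anyone who wants to check the tally by machine, the counting can be sketched in a few lines of Python. This is an illustration rather than a reference implementation: the function name `tukey_end_count` is my own, ties are assumed not to occur, and the 148 middle seedlings (elided above) are replaced with an arbitrary alternating filler, since only the two ends affect the count.

```python
def tukey_end_count(labels):
    """Tukey end count for population labels listed in rank order.

    `labels` is any sequence such as "BBBBAB...": one label per item,
    ordered by the measurement of interest (either direction).
    """
    labels = list(labels)
    if len(labels) < 2:
        return 0
    top, bottom = labels[0], labels[-1]
    # Both extremes from the same population: the count is zero.
    if top == bottom:
        return 0
    # Length of the unbroken run of `top` labels at the start.
    top_run = 0
    for lab in labels:
        if lab != top:
            break
        top_run += 1
    # Length of the unbroken run of `bottom` labels at the end.
    bottom_run = 0
    for lab in reversed(labels):
        if lab != bottom:
            break
        bottom_run += 1
    return top_run + bottom_run

# Ends copied from the seedling example; the middle is hypothetical filler
# (74 A + 74 B) standing in for the 148 seedlings not listed.
ordered = "BBBBAB" + "AB" * 74 + "BAAAAA"
print(tukey_end_count(ordered))        # 4 + 5 = 9
print(tukey_end_count(ordered[::-1]))  # same count when read shortest-first
```

Reading the list in the opposite direction simply swaps which run is counted first, so the total is unchanged.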

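As a practical matter, a Tukey End Count of 6 or more is sufficient to conclude, at roughly the 95% confidence level, that the two populations are statistically different, provided you predicted in advance which population would come out on top (a one-sided test). If you made no prediction, an end count of 6 corresponds to roughly 90% confidence, and Tukey's two-sided threshold for 95% is an end count of 7.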

Sidebar Two: Another impressive thing about the Tukey End Count is the stability of "6" as a break-point. A Tukey End Count of 6 still works even if you only have three from each population, in which case all three items from one population must rank above all three from the other. This degenerate case has been called "The Shainin Six Pack test". It should be clear that limiting the sample to 3+3 makes sense only when it is extremely expensive to collect more data.
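The 3+3 case is small enough to check by brute force. If A and B are really the same, every ordering of three A's and three B's is equally likely, and there are only 20 of them. The sketch below (function names are my own, ties are ignored) lists all 20 and tallies the end count of each. Only two orderings, AAABBB and BBBAAA, give an end count of 6, so the chance of seeing it by luck is 2/20 = 10% if either direction counts, or 1/20 = 5% if you called the winner beforehand.

```python
from fractions import Fraction
from itertools import combinations

def end_count(labels):
    """Tukey end count of a list of labels in rank order (no ties)."""
    if labels[0] == labels[-1]:
        return 0
    # Index of the first label that breaks the run at each end.
    top = next(i for i, lab in enumerate(labels) if lab != labels[0])
    bottom = next(i for i, lab in enumerate(reversed(labels))
                  if lab != labels[-1])
    return top + bottom

def end_count_distribution(n_a, n_b):
    """Exact distribution of the end count when every ordering of
    n_a A's and n_b B's is equally likely (the 'no difference' case)."""
    total = n_a + n_b
    tally = {}
    arrangements = 0
    for a_positions in combinations(range(total), n_a):
        a_set = set(a_positions)
        labels = ["A" if i in a_set else "B" for i in range(total)]
        c = end_count(labels)
        tally[c] = tally.get(c, 0) + 1
        arrangements += 1
    return {c: Fraction(k, arrangements) for c, k in sorted(tally.items())}

dist = end_count_distribution(3, 3)
print(dist[6])  # chance of complete separation, either direction
```

The same function can be pointed at other small sample sizes, but the number of orderings grows quickly, so it is only practical for a handful of items per population.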

Final note:  It is not necessary to order from large-to-small.  One will get the same Tukey End Count if the individual items are ordered from small-to-large. 
