| 1. |
Jeff, you mentioned testing with only a handful of users. But shouldn't statistically significant results be based on larger samples? How many users do you recommend for a site that receives millions of visitors a week? |
|
Answer» You can actually achieve statistical significance with only a few users. In the examples I used, we had between 10 and 15 users and the results were statistically significant. It helps when you have the same users attempt the same tasks on comparable interfaces (called a within-subjects study). Statistical significance refers to results that are unlikely to be due to chance alone. With smaller sample sizes, we are limited to detecting relatively large differences between designs (large differences in preference and performance). In early-stage designs, however, it is those larger, noticeable differences we are most interested in finding when deciding on the better design. Whether a website has millions of visitors (the examples also came from websites with millions of monthly visitors) or just a thousand, the math works the same. The sample size to use in an evaluation depends on what you are doing (comparing, estimating, or finding usability problems). |
|
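To make the point about small samples concrete, here is a minimal sketch of one way a within-subjects comparison could be checked for significance: an exact two-sided sign test. This is an illustration, not necessarily the test used in the examples from the talk. The counts below (12 users, 11 or 8 of whom did better on design A) are made up for the example, and the function name `sign_test_p_value` is hypothetical.

```python
from math import comb


def sign_test_p_value(wins: int, n: int) -> float:
    """Exact two-sided sign test.

    wins: number of users who did better on design A than on design B
    n:    number of users with a clear winner (ties are assumed to be dropped)

    Under the null hypothesis each user is equally likely to do better on
    either design, so the count of wins follows a Binomial(n, 0.5).
    """
    k = max(wins, n - wins)  # size of the more extreme side
    # Probability of k or more wins out of n under a 50/50 null
    tail = sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n
    # Double the tail for a two-sided test, capped at 1
    return min(1.0, 2 * tail)


# Hypothetical data: 12 users try both designs.
# Large difference: 11 of 12 do better on design A.
print(round(sign_test_p_value(11, 12), 4))  # 0.0063
# Smaller difference: 8 of 12 do better on design A.
print(round(sign_test_p_value(8, 12), 4))   # 0.3877
```

With these made-up numbers, the large difference is significant at the usual 0.05 level while the smaller one is not, which matches the point above: a dozen users is enough to detect large differences, but not subtle ones. The total number of site visitors does not appear anywhere in the calculation.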