Many experiments and clinical trials are run with too few subjects. An underpowered study is a wasted effort: even if the treatment substantially changed the outcome, the study would have only a small chance of detecting a "statistically significant" effect.
When planning a study, therefore, you need to choose an appropriate sample size. The required sample size depends on your answers to these questions:
• How scattered do you expect your data to be?
• How willing are you to risk mistakenly finding a difference by chance?
• How big a difference are you looking for?
• How sure do you need to be that your study will detect a difference, if it exists? In other words, how much statistical power do you need?
The first question requires that you estimate the standard deviation you expect to see. If you can't estimate the standard deviation, you can't compute how many subjects you will need. If you expect lots of scatter, it is harder to discriminate real effects from random noise, so you'll need lots of subjects.
The second question is answered with your definition of statistical significance. Almost all investigators choose the 5% significance level, meaning that P values less than 0.05 are considered to be "statistically significant". If you choose a smaller significance level (say 1%), then you'll need more subjects.
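To make these inputs concrete, here is a minimal sketch (in Python, using the statsmodels package; it is not part of StatMate, and all the numbers are hypothetical) that turns answers to the four questions into a per-group sample size for an unpaired t test, and shows how a stricter significance level raises the required sample size:

    # Sketch: sample size for an unpaired t test from the four inputs.
    # Not StatMate's computation; all numbers are hypothetical.
    from statsmodels.stats.power import TTestIndPower

    expected_sd = 10.0   # question 1: expected scatter (SD)
    alpha = 0.05         # question 2: significance level
    difference = 5.0     # question 3: smallest difference worth detecting
    power = 0.80         # question 4: desired power

    # Standardize the difference (Cohen's d = difference / SD).
    effect_size = difference / expected_sd

    analysis = TTestIndPower()
    n_05 = analysis.solve_power(effect_size=effect_size, alpha=alpha,
                                power=power, alternative='two-sided')
    n_01 = analysis.solve_power(effect_size=effect_size, alpha=0.01,
                                power=power, alternative='two-sided')
    print(f"alpha = 0.05: about {n_05:.0f} subjects per group")
    print(f"alpha = 0.01: about {n_01:.0f} subjects per group")

With these hypothetical inputs, the 5% significance level calls for roughly 64 subjects per group and the 1% level for roughly 95.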
The third and fourth questions are trickier. Everyone would prefer to plan a study that can detect very small differences, but this requires a large sample size. And everyone wants to design a study with lots of power, so it is quite certain to return a "statistically significant" result if the treatment actually works, but this too requires lots of subjects.
Rather than asking you to answer those last two questions, StatMate presents results in a table so you see the tradeoffs between sample size, power, and the effect size you can detect. You can look at this table, consider the time, expense and risk of your experiment, and decide on an appropriate sample size. Note that StatMate does not directly answer the question "how many subjects do I need?" but rather answers the related question "if I use N subjects, what information can I learn?". This approach to sample size calculations was recommended by Parker and Berman (1).
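The kind of table StatMate presents can be imitated with the same tools. The sketch below (again hypothetical Python using statsmodels, not StatMate's own code) asks, for several candidate sample sizes, what standardized effect size (Cohen's d) an unpaired t test could detect at 80% and 90% power:

    # Sketch of a StatMate-style tradeoff table (illustrative only).
    from statsmodels.stats.power import TTestIndPower

    analysis = TTestIndPower()
    alpha = 0.05

    print(f"{'N per group':>12}{'d at 80% power':>17}{'d at 90% power':>17}")
    for n in (10, 20, 50, 100, 200):
        # Leaving effect_size unspecified tells solve_power to solve for it.
        d80 = analysis.solve_power(nobs1=n, alpha=alpha, power=0.80)
        d90 = analysis.solve_power(nobs1=n, alpha=alpha, power=0.90)
        print(f"{n:>12}{d80:>17.2f}{d90:>17.2f}")

Multiplying each d by your expected standard deviation converts it back into a raw difference, so you can weigh the time, expense, and risk of each sample size against what an experiment of that size can actually detect.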
In some cases, StatMate's calculations may convince you that it is impossible to find what you want to know with the number of subjects you are able to use. This can be very helpful. It is far better to cancel such an experiment in the planning stage than to waste time and money on a futile experiment that won't have sufficient power. If the experiment involves any clinical risk or expenditure of public money, performing such a study can even be considered unethical.
One benefit of a larger sample size is more power to detect a specified effect or, at constant power, the ability to detect smaller effects. But there is another reason to choose larger samples when possible: they let you better assess the distribution of the data. Is the assumption of sampling from a Gaussian, or lognormal, distribution reasonable? With larger samples, it is easier to assess whether such assumptions hold.
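As a (hypothetical) illustration of that last point, a larger sample lets you formally check the distributional assumption, for example by applying a normality test to the raw and log-transformed values:

    # Sketch: using a larger sample to choose between Gaussian and
    # lognormal sampling assumptions. Data here are simulated;
    # substitute your own measurements.
    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(0)
    data = rng.lognormal(mean=2.0, sigma=0.5, size=100)  # hypothetical

    for label, values in (("raw", data), ("log-transformed", np.log(data))):
        statistic, p = stats.shapiro(values)
        print(f"{label}: Shapiro-Wilk P = {p:.3f}")

    # A large P for the log-transformed values, with a small P for the
    # raw values, is consistent with a lognormal distribution.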
1. R. A. Parker and N. G. Berman, Sample Size: More Than Calculations, The American Statistician 57:166-170, 2003.