Sample Size Calculation
A sample size calculation is always an educated guess: you have to estimate the results that you expect to find before the study begins. Sample size calculation is therefore not an exact science - if you knew the exact results before you started, you would not need to do the research!
Sample size calculations are different for each type of study, and this is one of the areas that you should discuss with your local statistician in the study design stage.
Information required for a sample size calculation
There are two factors that affect your sample size: (1) The amount of overlap that there is between the groups that you are trying to tell apart and (2) how sure that you want to be about your result.
1) Overlap between the groups
The more the groups overlap, the more difficult it is to tell them apart and the larger the sample needs to be. The amount of overlap depends on the size of the difference between the groups and the amount of variation (width of the distribution) within each group. The smaller the difference and the more variable the population, the greater the number of subjects you need to study to find a difference. In technical terms, the difference between the groups is the difference between means and the amount of variation is the standard deviation. You will need some idea of the difference between means and the standard deviation before you see your statistician.
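To see how the difference between means and the standard deviation drive the numbers, here is a minimal sketch of the standard normal-approximation formula for comparing two means (your statistician will use a refinement of this, e.g. with a t-distribution correction; the function name is our own):

```python
from math import ceil
from statistics import NormalDist

def sample_size_per_group(diff, sd, alpha=0.05, power=0.80):
    """Approximate subjects needed per group to detect a difference
    between two means, using the normal-approximation formula
    n = 2 * ((z_alpha/2 + z_beta) * sd / diff) ** 2."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)   # two-sided test
    z_beta = z.inv_cdf(power)            # power = 1 - beta
    return ceil(2 * ((z_alpha + z_beta) * sd / diff) ** 2)

# Halving the difference (relative to the spread) roughly
# quadruples the sample size:
print(sample_size_per_group(diff=10, sd=10))  # 16 per group
print(sample_size_per_group(diff=5, sd=10))   # 63 per group
```

The key ratio is `diff / sd` (the "effect size"): more overlap between the groups means a smaller ratio, and the required sample size grows with its square.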
2) How sure you want to be of the result
This depends on two factors called the Alpha (a) and Beta (b) values. The most familiar of these is a, which we are all used to looking at, as it is the same as the 'p' value. In interpreting results the p (or a) value tells us the probability that the difference found was due to chance alone. This is the probability of a false positive result (so when a study finds a positive result we look to a, the p value, to tell us how likely this is to be due to chance alone).
The Beta value is just as important, but much less well known. b tells us the probability that a study will miss a difference that actually exists - the probability of a false negative result. The probability that a study will find a difference if one exists is 1 - b, which is often called the 'power' of the study. So when a study finds a negative result we should look to the b value to tell us whether the study was powerful enough to pick up a positive result if one actually existed.
The bigger the sample size, the less chance there is of a false negative or false positive result, so the statistician will want to know how big a chance of a false positive and a false negative result you are prepared to accept. It is pretty standard to use an a value of 5% (0.05) and a b value of 20% (0.2), which corresponds to a power of 80% (0.8).
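The trade-off works in one direction only: demanding a smaller chance of a false positive (smaller a) or a false negative (smaller b, i.e. higher power) always pushes the sample size up. A rough sketch using the same normal-approximation formula as above (the example difference and standard deviation are made up):

```python
from math import ceil
from statistics import NormalDist

def n_per_group(alpha, power, diff=5.0, sd=10.0):
    """n = 2 * ((z_alpha/2 + z_beta) * sd / diff) ** 2, normal approximation."""
    z = NormalDist()
    return ceil(2 * ((z.inv_cdf(1 - alpha / 2) + z.inv_cdf(power)) * sd / diff) ** 2)

# Tightening either alpha or beta (= 1 - power) costs subjects:
for alpha, power in [(0.05, 0.80), (0.05, 0.90), (0.01, 0.80), (0.01, 0.90)]:
    print(f"alpha={alpha}, power={power}: {n_per_group(alpha, power)} per group")
```

For this example the sample per group climbs from 63 (a = 0.05, power 80%) to 120 (a = 0.01, power 90%), which is why the acceptable error rates are a decision for you, not the statistician.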
Getting the information that you need
Deciding on a and b is a straightforward process that you can undertake with your statistician. However, the statistician will also ask you for the size of the difference between the groups and their standard deviation. You will not be able to answer this question accurately (if you could, there would be no point in doing a research project!). There are two ways of making a guess: (1) with data from a pilot study or (2) by estimating the minimum clinically significant difference.
1) If you have followed our strong advice to do a pilot study or if there is published data in similar populations to the one that you are studying you may be able to estimate the size of the difference and the amount of variation.
2) You will probably have some idea of the minimum clinically significant difference that you are interested in. For example, in a trial of a therapy what is the minimum improvement that would make the new treatment worth adopting? If you design your study to be powerful enough to find this minimum worthwhile difference you will be sure that from a clinical point of view you have answered the question (you may miss subtle benefits from the new treatment, but if these are not clinically significant and would not make everyone change to the new treatment - who cares!).
If you are using the minimum clinically significant difference in your sample size calculation you may have to assume that the standard deviation is the same in the study and control groups, but you will need some idea of the amount of variation in controls so you really cannot avoid getting some sort of pilot data!
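Putting the two ingredients together: the standard deviation estimated from pilot (or published) data, plus the minimum clinically significant difference you have chosen. A sketch with made-up pilot measurements, again using the normal-approximation formula:

```python
from math import ceil
from statistics import stdev, NormalDist

# Hypothetical pilot measurements from the control group (made-up numbers):
pilot = [142, 155, 138, 160, 149, 151, 144, 158, 147, 153]

sd = stdev(pilot)        # sample standard deviation from the pilot data
min_worthwhile = 5.0     # minimum clinically significant difference (assumed)

# n per group at a = 0.05 (two-sided) and 80% power:
z = NormalDist()
n = ceil(2 * ((z.inv_cdf(0.975) + z.inv_cdf(0.80)) * sd / min_worthwhile) ** 2)
print(f"pilot SD = {sd:.1f}; about {n} subjects per group needed")
```

Note the assumption buried in this calculation: the standard deviation seen in the pilot controls is taken to apply to both the study and control groups, exactly as described above.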
Work out your sample size with a statistician. Go to the meeting with some pilot data or some idea of the minimum clinically significant difference that you are looking for. Have some information about the amount of variation in the population that you will be studying (i.e. guess the difference between means and the standard deviation). Have some idea about how powerful you want your study to be, that is, how much chance you are willing to accept of producing a false negative or a false positive result (i.e. have some idea of your values for a and b).