
Category Archives: Sample Size

Article by Bartlett, James E. II, Kotrlik, Joe W., and Higgins, Chadwick C. (2001) in Information Technology, Learning, and Performance Journal, 19:1.

 

In one of my papers, I wrote:

The sample size was determined after careful consideration of the following issues:

  1. The sample must be representative of the population. To fulfill this criterion, the sample size was determined with reference to the Krejcie and Morgan (1970) table, where the minimum sample size for a population of N = 6000 is n = 361 (the sample proportion p is within ± 0.05 of the population proportion P with a 95% level of confidence).
  2. The sample size must be adequate for confirmatory factor analysis. Although there is no scientific answer on the minimum sample size for factor analysis, the sample must be sufficiently large so that factors can be selected across a wide range of factor loading values (Bartlett et al., 2001). With reference to a number of rules of thumb available in statistics textbooks, the sample size n = 917[1] is sufficient, as the number of items in this study is only 22. According to the most rigid rule, i.e. the Rule-of-10, the minimum sample size for this study is 220 (22 items x 10 respondents per item = 220 respondents).
  3. The sample size must be sufficient for SEM analysis. With 22 items indicating 5 latent constructs, the measurement model proposed in this study consists of 56 parameters to be estimated. Therefore, the minimum sample size is 280 (Bentler and Chou, 1987).
  4. Anticipating a response rate of only 40%, the researchers distributed over 1,000 questionnaires. At the end of the study, the researchers obtained 917 complete responses, which is 16.24% of the population and well above all of the minimums listed above.
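
The three minimums quoted above can be reproduced with a short calculation. This is only a sketch: it assumes the published Krejcie and Morgan (1970) formula with a chi-square value of 3.841 (1 degree of freedom, 95% confidence), P = 0.5 and d = 0.05, and it assumes the table values are rounded to the nearest integer. The SEM line assumes five respondents per estimated parameter, which is what 56 x 5 = 280 implies. All function and variable names are my own.

```python
def krejcie_morgan(population, chi_sq=3.841, p=0.5, d=0.05):
    """Krejcie and Morgan (1970) sample size for a finite population.

    chi_sq: chi-square value for 1 degree of freedom at 95% confidence
    p:      assumed population proportion (0.5 gives the largest sample)
    d:      accepted margin of error, as a proportion
    """
    numerator = chi_sq * population * p * (1 - p)
    denominator = d ** 2 * (population - 1) + chi_sq * p * (1 - p)
    return numerator / denominator


# Minimum for representativeness (population N = 6000)
n_representative = round(krejcie_morgan(6000))  # 361

# Rule-of-10 for factor analysis: 10 respondents per item
n_factor_analysis = 22 * 10  # 220

# SEM: assumed 5 respondents per estimated parameter
n_sem = 56 * 5  # 280

# The binding requirement is the largest of the three minimums
n_required = max(n_representative, n_factor_analysis, n_sem)

n_obtained = 917
print(n_required, n_obtained >= n_required)
```

With these inputs, the representativeness requirement is the largest of the three, so it is the one that actually sets the minimum.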

 

Before I read this paper a few years back, I thought (and had been taught) that sample size determination should be based only on whether the sample is sufficiently representative of the population. That’s it. Full stop! Only when I read this paper did I realize that there are a few more aspects to consider, such as the statistical analyses used, the number of variables, etc.

 

This paper describes common procedures for determining sample size for simple random and systematic random samples. It is based on the following arguments: (a) “two of the most consistent flaws (in business education research) included (1) disregard for sampling error when determining sample size, and (2) disregard for response and non-response bias”, and (b) “sample size is one of the inter-related features of a study design that can influence the detection of significant differences, relationships or interactions… these designs try to minimize both alpha error (type I error) and beta error (type II error)”.

 

Determining the appropriate sample size should be based on:

  1. Primary Variables of Measurement – The researcher must decide which variables will be incorporated into the formula calculations. For example, a study whose primary variable is continuous (measured on a seven-point scale) will require a smaller sample than a study whose primary variable is categorical.
  2. Error Estimation – two key factors have to be considered: (1) the margin of error, i.e. the error the researcher is willing to accept in the study, and (2) the alpha level, i.e. the risk the researcher is willing to accept that the true margin of error exceeds the acceptable margin of error. For research that intends to identify marginal relationships, differences or other statistical phenomena as a precursor to further studies, an alpha level of 0.1 is advisable, whereas for critical research where errors may cause substantial financial or personal harm, an alpha level of 0.01 is suggested. The generally acceptable margin of error is 5% for categorical data and 3% for continuous data.
  3. Variance Estimation – a critical component of sample size formulas is the estimation of variance in the primary variables of interest in the study. There are four ways of estimating population variances for sample size determinations: (1) take the sample in two steps, and use the results of the first step to determine how many additional responses are needed to attain an appropriate sample size based on the variance observed in the first step data; (2) use pilot study results; (3) use data from previous studies of the same or a similar population; or (4) estimate or guess the structure of the population assisted by some logical mathematical results.
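
To see how these three inputs combine, here is a minimal sketch using Cochran's sample size formulas, which I believe are the ones the paper relies on. Several things are assumptions on my part: the function names, the use of t = 1.96 for an alpha level of 0.05, and the estimate of the standard deviation for a scale variable as the number of scale points divided by 6 (six standard deviations covering nearly the whole range).

```python
def cochran_continuous(t, scale_points, margin, sd_span=6):
    """Cochran's formula for a continuous primary variable.

    t:            t-value for the chosen alpha level (1.96 for alpha = 0.05)
    scale_points: number of points on the scale (e.g. 7)
    margin:       acceptable margin of error as a proportion (e.g. 0.03)
    sd_span:      assumed number of standard deviations spanning the scale
    """
    s = scale_points / sd_span   # estimated standard deviation
    d = scale_points * margin    # margin of error in scale units
    return (t ** 2 * s ** 2) / d ** 2


def cochran_categorical(t, margin, p=0.5):
    """Cochran's formula for a categorical primary variable.

    p = 0.5 gives the maximum possible variance, p * (1 - p) = 0.25.
    """
    return (t ** 2 * p * (1 - p)) / margin ** 2


def finite_population_correction(n0, population):
    """Shrink n0 when it is a sizeable fraction of the population."""
    return n0 / (1 + n0 / population)


n0_continuous = cochran_continuous(t=1.96, scale_points=7, margin=0.03)
n0_categorical = cochran_categorical(t=1.96, margin=0.05)
n_categorical_6000 = finite_population_correction(n0_categorical, 6000)

print(n0_continuous, n0_categorical, n_categorical_6000)
```

The continuous case needs far fewer respondents than the categorical case at the usual margins of error, which is the point made in item 1 above.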

After considering all the above aspects, researchers may refer to the following table to determine the sample size.

[Table: Bartlett et al. (2001)]

But the process is not supposed to end here. Researchers may need to consider the following aspects as well:

  1. Do the researchers plan to perform statistical analyses that require a minimum ratio of sample size to number of variables, such as Regression Analysis, Factor Analysis and Structural Equation Modeling (SEM)? If they plan to perform Regression Analysis, the ratio of observations to independent variables should not fall below five (the conservative view suggests ten). If this minimum is not followed, there is a risk of over-fitting; the results will be too specific to the sample and thus lack generalizability. If they plan to perform Factor Analysis, the number of observations should not be less than 100 [Note: there are a few more rules of thumb for determining the appropriate sample size for factor analysis; I hope to post them all in a coming entry.] The larger the sample, the lower the factor loading threshold can be set. For example, assuming an alpha level of 0.05, a factor would have to load at a level of 0.75 or higher to be significant in a sample size of 50, while a factor would only have to load at a level of 0.30 to be significant in a sample size of 350. A description of the minimum sample size for SEM is not provided in this paper, so I’ll prepare a separate entry in the future to discuss the matter specifically, insya-ALLAH.
  2. Sampling of Non-respondents – a common recommendation is to take a random sample of 10-20% of non-respondents to use in non-respondent follow-up analyses. However, if non-respondents are treated as a potentially different population, this recommendation does not appear to be valid or adequate. Rather, the researcher could consider using Cochran’s formula to determine an adequate sample of non-respondents for the follow-up analyses.
  3. Budget, Time and Other Constraints – no comment!! 8) My lecturer used to tell me, “researching means sacrificing, no sacrifices means you research nothing!”
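
The regression and factor analysis rules of thumb in item 1 are easy to turn into a quick check. This is a rough sketch, not a substitute for a proper power analysis. The function names are mine, and the example study with 8 independent variables is purely hypothetical.

```python
def regression_sample_ok(n_obs, n_predictors, conservative=False):
    """Check the observations-to-independent-variables ratio.

    Minimum ratio is 5; the conservative view asks for 10.
    """
    min_ratio = 10 if conservative else 5
    return n_obs / n_predictors >= min_ratio


def factor_analysis_sample_ok(n_obs, minimum=100):
    """Check the rule that factor analysis needs at least 100 observations."""
    return n_obs >= minimum


# Hypothetical study: 60 respondents, 8 independent variables
print(regression_sample_ok(60, 8))                     # ratio 7.5 against 5
print(regression_sample_ok(60, 8, conservative=True))  # ratio 7.5 against 10
print(factor_analysis_sample_ok(60))
```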


[1] Actually, the decision to take a sample size above the level suggested by the sample size determination table would increase the probability of type I error… syyy!!! 8)
