For pharmacists looking to become board-certified pharmacotherapy specialists, studying statistics is essential preparation for the exam. In this article, two pharmacists who have passed the BCPS exam and have significant research experience provide a BCPS statistics review for pharmacists.
Authored By: Timothy P. Gauthier, Pharm.D., BCPS-AQ ID & Tristan T. Timbrook, Pharm.D., MBA, BCPS
[Last updated: 9 July 2018]
In recent years, data from the American Association of Colleges of Pharmacy have shown an increase in both the number of pharmacy schools and the number of people graduating with doctor of pharmacy degrees.
With more competition for jobs, pharmacists are seeking certifications to set themselves apart within the applicant pool. The Board of Pharmacy Specialties (BPS) is one popular place pharmacists now look to for this purpose. BPS offers certification examinations for numerous pharmacy specialty areas, and each has specific criteria for test eligibility. By far the most popular certification is that of board certified pharmacotherapy specialist (BCPS), a designation more than 21,000 pharmacists have achieved.
Anyone who has passed the BCPS exam will tell you that a basic understanding of statistics is essential to succeed on test day. The most recent BCPS content outline has three domains, and domain #2 (drug information and evidence-based medicine) specifies that pharmacists should have knowledge of biostatistical methods and interpretation, clinical vs. statistical significance, and protocol design, methodology, and biostatistical methods.
As an additional study tool for those seeking BCPS designation, here is a BCPS statistics review for pharmacists.
NOTE: Trial design is not described here in detail, but it has a major impact on the validity of statistical test results and must always be considered. Additionally, this is a summary of select information, and readers are referred to the suggested readings at the end of this article for more in-depth information.
VARIABLE TYPES
To select the proper statistical test it is necessary to first identify the types of variables (i.e., data) that have been used. Data can be nominal, ordinal, or continuous (interval or ratio). Here are notes on the different variable types.
1. Nominal
 Data without order or indication of relative severity
 Is a discrete variable
 Not eligible for parametric tests
 Examples: sex, mortality, presence of a disease state
Picking a Test for NOMINAL Data
Dependence | Samples | # Independent Variables → Test
Independent samples (parallel design) | 2 | 0 → χ² or Fisher’s exact test (if any expected cell count is <5); 1 → Mantel-Haenszel test; ≥2 → Logistic regression
Independent samples (parallel design) | ≥3 | 0 → χ²*; 1 → χ²*; ≥2 → Logistic regression
Dependent (paired) samples (crossover design, related) | 2 | 0 → McNemar test; ≥1 → Repeated measures logistic regression
Dependent (paired) samples (crossover design, related) | ≥3 | 0 → Cochran’s Q*; ≥1 → Repeated measures logistic regression
*Bonferroni correction applied
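The Bonferroni correction flagged with an asterisk above divides the overall significance threshold by the number of comparisons made. Here is a minimal Python sketch, assuming every pair of groups is compared and an overall alpha of 0.05. The function name is illustrative and not taken from any library.

```python
from math import comb

def bonferroni_alpha(n_groups, overall_alpha=0.05):
    """Per-comparison alpha when every pair of groups is compared."""
    n_comparisons = comb(n_groups, 2)  # k groups -> k(k-1)/2 pairwise tests
    return overall_alpha / n_comparisons

# Three groups give 3 pairwise comparisons, each judged against 0.05 / 3
per_test_alpha = bonferroni_alpha(3)
print(round(per_test_alpha, 4))
```

With four groups there are six pairwise comparisons, so each one would be judged against 0.05 / 6.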
2. Ordinal
 Data ranked in a specific order, but lacking a constant difference in magnitude between ranks
 Is a discrete variable
 Not eligible for parametric tests
 Median is the most common measure of central tendency used to describe ordinal data
 Mean and standard deviation NOT used for reporting
 Commonly used in observational studies
 Examples: APACHE II score, trauma score, NYHA class
Picking a Test for ORDINAL Data
Dependence | Samples | # Independent Variables → Test
Independent samples (parallel design) | 2 | 0 → Wilcoxon rank sum or Mann-Whitney U; 1 → 2-way ANOVA; ≥2 → ANOVA ranks
Independent samples (parallel design) | ≥3 | 0 → Kruskal-Wallis*¥; 1 → 2-way ANOVA ranks; ≥2 → ANCOVA ranks
Dependent (paired) samples (crossover design, related) | 2 | 0 → Wilcoxon signed rank test; 1 → 2-way repeated ANOVA ranks; ≥2 → Repeated measures regression
Dependent (paired) samples (crossover design, related) | ≥3 | 0 → Friedman test; 1 → 2-way repeated ANOVA ranks; ≥2 → Repeated measures regression
*Bonferroni correction applied; ¥Multiple Comparison Procedure applied
3. Interval
 Data ranked in a specific order that includes a constant difference in magnitude change between units and zero is an arbitrary value
 A continuous variable
 Mean (average) is the most common measure of central tendency used to describe continuous data
 Eligible for parametric tests if data is normally distributed
 Example: temperature in degrees Celsius or Fahrenheit
4. Ratio
 Data that is ranked in a specific order that includes a constant difference in magnitude change between units and zero is NOT an arbitrary value
 A continuous variable
 Mean (average) is the most common measure of central tendency used to describe continuous data
 Eligible for parametric tests if data is normally distributed
 Examples: blood pressure, heart rate, respiratory rate
Picking a Test for CONTINUOUS Data (Interval or Ratio Data)
Dependence | Samples | # Independent Variables → Test
Independent samples (parallel design) | 2 | 0 → Student’s t-test; 1 → 2-way ANOVA; ≥2 → ANCOVA
Independent samples (parallel design) | ≥3 | 0 → 1-way ANOVA¥; 1 → 2-way ANOVA; ≥2 → ANCOVA
Dependent (paired) samples (crossover design, related) | 2 | 0 → Paired Student’s t-test; 1 → 2-way repeated ANOVA ranks; ≥2 → Repeated measures regression
Dependent (paired) samples (crossover design, related) | ≥3 | 0 → ANOVA for repeated measures; 1 → 2-way repeated ANOVA ranks; ≥2 → Repeated measures regression
¥Multiple Comparison Procedure applied
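The three test-selection tables can be read as a lookup on data type, dependence, and number of samples. The Python sketch below encodes only the rows with 0 additional independent variables. It assumes the ">3" sample rows mean three or more samples and that continuous data are normally distributed. The dictionary and function names are hypothetical.

```python
# Tests for the "0 independent variables" rows of the tables above,
# keyed by (data type, paired?, number of samples: "2" or ">=3").
TESTS = {
    ("nominal", False, "2"): "Chi-square or Fisher's exact test",
    ("nominal", False, ">=3"): "Chi-square (Bonferroni correction)",
    ("nominal", True, "2"): "McNemar test",
    ("nominal", True, ">=3"): "Cochran's Q (Bonferroni correction)",
    ("ordinal", False, "2"): "Wilcoxon rank sum or Mann-Whitney U",
    ("ordinal", False, ">=3"): "Kruskal-Wallis (Bonferroni correction, multiple comparison procedure)",
    ("ordinal", True, "2"): "Wilcoxon signed rank test",
    ("ordinal", True, ">=3"): "Friedman test",
    ("continuous", False, "2"): "Student's t-test",
    ("continuous", False, ">=3"): "1-way ANOVA (multiple comparison procedure)",
    ("continuous", True, "2"): "Paired Student's t-test",
    ("continuous", True, ">=3"): "ANOVA for repeated measures",
}

def pick_test(data_type, paired, n_samples):
    """Return the test named in the tables for the zero-confounder case."""
    if data_type in ("interval", "ratio"):
        data_type = "continuous"  # interval and ratio data share one table
    if n_samples < 2:
        raise ValueError("need at least two samples to compare")
    size = "2" if n_samples == 2 else ">=3"
    return TESTS[(data_type, paired, size)]

print(pick_test("nominal", paired=False, n_samples=2))
print(pick_test("ratio", paired=True, n_samples=3))
```

For exam purposes the useful habit is the order of the questions: identify the variable type first, then whether the samples are independent or paired, then how many groups are being compared.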
REVIEW OF SELECT STATISTICS DEFINITIONS
Null hypothesis: The hypothesis that there is no difference between groups in a study.
P-value: the probability, calculated from sample data, of observing a difference at least as large as the one seen if the estimates being compared were really the same (i.e., if the null hypothesis were true).
Alpha (α): the chance of concluding there is a difference between groups when there is actually no difference.
Type I error: when you conclude there is a difference between groups, but there actually is no difference.
Beta (β): the chance of concluding there is no difference between groups when there actually is a difference.
Type II error: when you conclude there is no difference between groups, but there actually is a difference.
Power: the probability that a study will detect a statistically significant difference between treatment groups when a true difference exists.
Confidence interval: a range of values, calculated from the sample data, that is likely to contain the true treatment effect at a stated level of confidence (e.g., 95%).
Intent-to-treat analysis: when the analysis includes all subjects randomized to a treatment arm, regardless of whether or not the subject completed the study.
Relative risk: compares the risk of an event in a group of individuals with a specific characteristic to the risk of that event in a group of individuals without that specific characteristic.
Absolute risk: the risk of developing a disease over a given time period.
Relative risk ratio: a ratio of the event rate in an intervention group versus the rate of that event in a control group.
Relative risk reduction: how much risk is reduced in the intervention group as compared to the control group.
Absolute risk reduction: the absolute difference in rates of an outcome between treatment and control groups.
Odds ratio: the odds that an outcome will occur in a group of subjects with an intervention, divided by the odds of that outcome occurring in a group of subjects without the intervention.
Number needed to treat: the number of subjects that must be treated in order to benefit one subject.
Number needed to harm: the number of subjects that must be treated in order to harm one subject.
Selection bias: when one group within a study is different than the other(s) due to the manner in which the subjects were selected.
Publication bias: when the available literature favors one outcome. This is typically the result of researchers and journals only reporting favorable study results.
Recall bias: when the subject remembers an event differently from how it actually occurred.
Clinical significance: when the data analysis from the study produces results that change clinical practice.
STUDY TYPE BASICS
Cross-sectional study: a snapshot of a population at a single point in time.
Observational study: a study that observes subjects without any active intervention or randomization of patients.
Case-control study: a retrospective study that examines subjects with the outcome of interest (the cases) versus patients without the outcome of interest (the controls).
Cohort study: an observational study in which one group of subjects has the exposure of interest and one group does not, and both groups are followed for the outcome.
Crossover study: when the intervention group and control group switch and each subject serves as their own comparator.
Follow-up study: observation of subjects that have not yet experienced an outcome until that outcome occurs.
Noninferiority study: aims to demonstrate the intervention is not worse than the comparator by more than a small prespecified amount, M or delta (Δ).
Randomized controlled study: a prospective study in which subjects are randomized into intervention and control groups.
COMMON STATISTICS ABBREVIATIONS
P = p-value
CI = confidence interval
IQR = interquartile range
SD = standard deviation
OR = odds ratio
RR = relative risk
RRR = relative risk reduction
ARR = absolute risk reduction
NNT = number needed to treat
NNH = number needed to harm
STATISTICS EQUATIONS
Power = 1 – β
Group | Outcome Yes | Outcome No
Intervention yes: | A | B
Intervention no: | C | D
OR = (A/C) / (B/D) or (AD/BC)
RR = [A/(A+B)] / [C/(C+D)] …aka… (event rate in the intervention group) / (event rate in the control group)
RRR = 1 – RR
ARR = |A/(A+B) – C/(C+D)| …aka… the absolute difference between (event rate in the intervention group) and (event rate in the control group)
NNT = 1/ARR
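As a worked example of the equations above, the Python sketch below computes each measure from the four cells of the 2×2 table. The counts are hypothetical (10 of 100 intervention subjects and 20 of 100 control subjects with the outcome). ARR is taken as the absolute difference in event rates, and the function name is illustrative.

```python
def two_by_two(a, b, c, d):
    """a, b = intervention with/without outcome; c, d = control with/without outcome."""
    risk_intervention = a / (a + b)
    risk_control = c / (c + d)
    rr = risk_intervention / risk_control
    arr = abs(risk_intervention - risk_control)  # assumes the two rates differ
    return {
        "OR": (a * d) / (b * c),
        "RR": rr,
        "RRR": 1 - rr,
        "ARR": arr,
        "NNT": 1 / arr,
    }

# Hypothetical trial: 10/100 events with the intervention, 20/100 with control
result = two_by_two(10, 90, 20, 80)
for name, value in result.items():
    print(f"{name} = {value:.3f}")
```

In practice the NNT is usually rounded up to the next whole subject.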
SOME STATISTICS BASICS
P-value
 The smaller the p-value, the less likely it is that a difference this large would be seen by chance alone
 The larger the p-value, the more likely it is that a difference this large would be seen by chance alone
 A statistically significant p-value (e.g., P = 0.005) does not always translate into a difference that is clinically significant
 A p-value of 0.01 means that, if there were truly no difference between groups, a result at least this extreme would be expected by chance 1 time in 100
 Does not describe the size of an effect, only the strength of the evidence against the null hypothesis
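To illustrate that a p-value reflects the strength of evidence rather than the size of an effect, the Python sketch below computes a Pearson chi-square test (without continuity correction) for a 2×2 table using only the standard library. The counts are hypothetical, and the function name is illustrative. Both tables compare a 10% event rate with a 20% event rate, and only the sample size differs.

```python
from math import erfc, sqrt

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) and p-value for a 2x2 table."""
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # With 1 degree of freedom, P(chi-square > x) equals erfc(sqrt(x / 2))
    p_value = erfc(sqrt(chi2 / 2))
    return chi2, p_value

# 10% vs 20% event rates with 100 subjects per group
chi2_small, p_small = chi_square_2x2(10, 90, 20, 80)
# The same 10% vs 20% event rates with 1,000 subjects per group
chi2_large, p_large = chi_square_2x2(100, 900, 200, 800)

print(f"n=100 per group:  chi-square = {chi2_small:.2f}, p = {p_small:.3f}")
print(f"n=1000 per group: chi-square = {chi2_large:.2f}, p = {p_large:.2e}")
```

The smaller trial gives a p-value just under 0.05, while the larger trial gives a far smaller one even though the difference in event rates is identical.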
Confidence interval
 A narrow confidence interval = less uncertainty
 A wide confidence interval = more uncertainty
Power
 The greater the power, the lower the chance of a type II error (missing a difference that truly exists)
 A common way to increase the power of a study is to increase the sample size
Sample size
 When an outcome is rare, a larger sample size is commonly needed to detect a difference between an intervention and a control group
 A larger treatment effect on the outcome reduces the sample size needed, even for less common outcomes
 Sample size estimates should be calculated prior to starting a study
 Without an adequate sample size a study is underpowered and may fail to detect a difference that truly exists
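The sample size points above can be illustrated with a common normal-approximation formula for comparing two proportions. This formula is an assumption brought in for illustration and is not described in this review. The event rates are hypothetical, alpha is two-sided at 0.05, and power is 80%.

```python
from math import ceil
from statistics import NormalDist

def n_per_group(p_control, p_intervention, alpha=0.05, power=0.80):
    """Approximate subjects per group to compare two proportions (two-sided test)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)  # power = 1 - beta
    variance = p_control * (1 - p_control) + p_intervention * (1 - p_intervention)
    effect = p_control - p_intervention
    return ceil((z_alpha + z_beta) ** 2 * variance / effect ** 2)

common = n_per_group(0.20, 0.10)         # common outcome, risk halved
rare = n_per_group(0.02, 0.01)           # rare outcome, risk halved
larger_effect = n_per_group(0.20, 0.05)  # common outcome, bigger effect

print(f"common outcome: {common} per group")
print(f"rare outcome: {rare} per group")
print(f"larger effect: {larger_effect} per group")
```

Halving a rare event rate requires many more subjects than halving a common one, and a larger treatment effect lowers the number needed.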
Standard deviation
 Used as a measure of dispersion around the mean
 The more the variability, the larger the standard deviation
 The less variability there is, the smaller the standard deviation
 When data are not normally distributed (e.g., skewed), the standard deviation is less reliable, so the interquartile range, along with the median, is commonly reported instead
 Cannot be used with nominal or ordinal data
 For normally distributed data, approximately 68% of data points are within 1 standard deviation of the mean and approximately 95% are within 2 standard deviations
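The Python sketch below, using only the standard library, shows how a single extreme outlier shifts the mean and standard deviation while leaving the median unchanged, and checks the 68% and 95% figures for a normal distribution. The data values are hypothetical.

```python
from statistics import NormalDist, mean, median, quantiles, stdev

values = [4, 5, 5, 6, 6, 7, 7, 8]  # hypothetical lengths of stay (days)
skewed = values + [80]             # same data plus one extreme outlier

for label, data in (("original", values), ("with outlier", skewed)):
    q1, _, q3 = quantiles(data, n=4)  # quartile cut points
    print(f"{label}: mean={mean(data):.1f} SD={stdev(data):.1f} "
          f"median={median(data)} IQR={q1}-{q3}")

# Share of a normal distribution within 1 and 2 SD of the mean
z = NormalDist()
within_1sd = z.cdf(1) - z.cdf(-1)
within_2sd = z.cdf(2) - z.cdf(-2)
print(f"within 1 SD: {within_1sd:.1%}, within 2 SD: {within_2sd:.1%}")
```

This is why skewed data are usually summarized with the median and interquartile range rather than the mean and standard deviation.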
Odds ratio
 Can be used in case-control studies and cohort studies
 Cannot be used to calculate a number needed to treat
 When the outcome is common, can exaggerate the risk
 An odds ratio < 1 means the outcome is less likely in the intervention group
 An odds ratio > 1 means the outcome is more likely in the intervention group
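To see how the odds ratio can exaggerate risk when the outcome is common, the Python sketch below compares the odds ratio with the relative risk for a rare and a common outcome. The event rates are hypothetical and the function name is illustrative. In both scenarios the relative risk is 2.

```python
def rr_and_or(risk_intervention, risk_control):
    """Return (relative risk, odds ratio) for two event rates."""
    rr = risk_intervention / risk_control
    odds_intervention = risk_intervention / (1 - risk_intervention)
    odds_control = risk_control / (1 - risk_control)
    return rr, odds_intervention / odds_control

rare_rr, rare_or = rr_and_or(0.02, 0.01)      # rare outcome
common_rr, common_or = rr_and_or(0.60, 0.30)  # common outcome

print(f"rare outcome:   RR = {rare_rr:.2f}, OR = {rare_or:.2f}")
print(f"common outcome: RR = {common_rr:.2f}, OR = {common_or:.2f}")
```

For the rare outcome the odds ratio stays close to the relative risk. For the common outcome it is noticeably larger.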
Relative risk
 A relative risk of 1 means there is no association between the intervention and the outcome
 A relative risk < 1 indicates a negative association between the intervention and the outcome
 A relative risk > 1 indicates a positive association between the intervention and the outcome
Relative risk reduction
 Without knowledge of a subject’s baseline risk for the outcome, presenting benefits as a relative risk reduction can be deceiving
 Can make treatments seem more substantial
 Can make toxicities seem more substantial
 Cannot be used in case-control studies
Absolute risk reduction
 Inverse of number needed to treat
 Yields a less exaggerated risk reduction than relative risk reduction
 Expressed in units of baseline risk
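The contrast between relative and absolute risk reduction is easiest to see with numbers. In the Python sketch below, two hypothetical treatments both halve the risk of the outcome, but the baseline risks differ 100-fold. ARR is computed here as control rate minus intervention rate, assuming a beneficial treatment, and the function name is illustrative.

```python
def risk_reductions(risk_control, risk_intervention):
    """Return (RRR, ARR, NNT) for a treatment that lowers the event rate."""
    arr = risk_control - risk_intervention
    rrr = arr / risk_control  # equivalent to 1 - RR
    return rrr, arr, 1 / arr

high_baseline = risk_reductions(0.20, 0.10)    # 20% baseline risk
low_baseline = risk_reductions(0.002, 0.001)   # 0.2% baseline risk

for label, (rrr, arr, nnt) in (("high baseline", high_baseline),
                               ("low baseline", low_baseline)):
    print(f"{label}: RRR = {rrr:.0%}, ARR = {arr:.3f}, NNT = {nnt:.0f}")
```

Both treatments show a 50% relative risk reduction, yet roughly 10 subjects need treatment in the first case and roughly 1,000 in the second.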
Number needed to treat/harm
 Strong treatment effects on positive and negative outcomes lead to small NNT and NNH, respectively, and vice versa.
 As NNT/NNH are derived from ARR estimates, they are also point effect estimates for a true population and therefore have confidence intervals, although not often reported.
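One way to obtain a confidence interval for the NNT is to compute an interval for the ARR and invert its limits. The Python sketch below uses a simple normal-approximation (Wald) interval. That method is an assumption chosen for illustration and is not one specified in this review. The event counts are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def arr_with_ci(events_control, n_control, events_intervention, n_intervention,
                level=0.95):
    """ARR (control rate minus intervention rate) with a Wald confidence interval."""
    p_control = events_control / n_control
    p_intervention = events_intervention / n_intervention
    arr = p_control - p_intervention
    se = sqrt(p_control * (1 - p_control) / n_control
              + p_intervention * (1 - p_intervention) / n_intervention)
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return arr, arr - z * se, arr + z * se

# Hypothetical trial: 20/100 events with control, 10/100 with the intervention
arr, arr_low, arr_high = arr_with_ci(20, 100, 10, 100)

# Inverting the ARR limits gives the NNT limits. This is only valid when the
# ARR interval excludes zero, as it does in this example.
nnt, nnt_low, nnt_high = 1 / arr, 1 / arr_high, 1 / arr_low

print(f"ARR = {arr:.3f} (95% CI {arr_low:.3f} to {arr_high:.3f})")
print(f"NNT = {nnt:.0f} (95% CI {nnt_low:.1f} to {nnt_high:.1f})")
```

A point estimate of about 10 can sit inside a very wide interval when the trial is small, which is a reason to look for the interval when an NNT is reported.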
MISCELLANEOUS NOTES
 Parametric tests generally have greater statistical power than nonparametric tests. Data must be normally distributed and continuous (i.e., ratio or interval) to use parametric tests.
 Combining study endpoints with low event rates into a composite endpoint can allow for enrolling fewer patients while increasing a study’s power, but not all endpoints are appropriate to combine and a significant result does not mean all endpoints are significant.
 Type II errors are common in clinical trials due to low patient enrollment.
 Univariate logistic regression can only identify variables that are potential independent predictors of an outcome; these must be confirmed in multivariable logistic regression while taking into account the validity of the study design.
 Unlike a mean, a median is not strongly affected by extreme outliers.
 By definition, observational studies lack an active intervention.
 Randomized controlled trials have an active intervention, minimize effects of confounding, and have a greater publication impact.
HELPFUL RESOURCES
 Choosing the correct statistical test
 Sample Size Calculator
 NNTonline.net
 MEDCALC: free statistics calculators
 GraphPad: free statistics calculators
 Chi-square calculator
 Learn EBM by BMJ: How to calculate risk
SUGGESTED READINGS
Kier KL. Biostatistical applications in epidemiology. Pharmacotherapy. 2011; 31(1): 9-22.