A BCPS Statistics Review For Pharmacists

For pharmacists looking to become a board-certified pharmacotherapy specialist, it is essential to study statistics in preparation for the exam. In this article two pharmacists who have passed the BCPS exam and have significant research experience provide a BCPS statistics review for pharmacists.

Authored By: Timothy P. Gauthier, Pharm.D., BCPS-AQ ID & Tristan T. Timbrook, Pharm.D., MBA, BCPS

[Last updated: 9 July 2018]

In recent years the data provided by the American Association of Colleges of Pharmacy shows that within the profession of pharmacy there has been an increase in the number of pharmacy schools as well as an increase in the number of people graduating with doctor of pharmacy degrees.

With more competition for jobs, pharmacists are seeking certifications to set themselves apart within the applicant pool. The Board of Pharmacy Specialties (BPS) is one popular place pharmacists now look to for this purpose. BPS offers certification examinations for numerous pharmacy specialty areas and each has specific criteria for test eligibility. By far the most popular certification is that of a board-certified pharmacotherapy specialist (BCPS), a designation more than 21,000 pharmacists have achieved.

Statistics study cheat sheets.

In speaking with anyone who has passed the BCPS exam you will learn that having a basic understanding of statistics is essential to succeed on test day. In the most recent BCPS content outline there are three domains and domain #2 (drug information and evidence based medicine) identifies that pharmacists should have knowledge of biostatistical methods and interpretation, clinical vs. statistical significance, and protocol design, methodology, and biostatistical methods.

The following BCPS statistics review for pharmacists is provided as an additional study tool for those seeking BCPS designation.

NOTE: Trial design is not described here in detail, but has a major impact on the validity of statistical test results and must always be considered. Additionally, this is a summary of select information and readers are referred to the suggested readings at the end of this article for more in-depth information.

VARIABLE TYPES

To select the proper statistical test it is necessary to first identify the types of variables (i.e., data) that have been used. Data can be nominal, ordinal, or continuous (interval or ratio). Here are notes on the different variable types.

1. Nominal

Data without order or indication of relative severity
Is a discrete variable
Not eligible for parametric tests
Examples: sex, mortality, presence of a disease state

Picking a Test for NOMINAL Data
Dependence	Samples	0 Independent Variables	1 Independent Variable	≥2 Independent Variables
Independent Samples, Parallel Design	2	χ² (chi-square), Fisher’s Exact test (if expected cell count <5)	Mantel-Haenszel test	Logistic Regression
Independent Samples, Parallel Design	≥3	χ²*	χ²*	Logistic Regression
Dependent (Paired, Related) Samples, Cross-over Design	2	McNemar Test	Repeated measures logistic regression	Repeated measures logistic regression
Dependent (Paired, Related) Samples, Cross-over Design	≥3	Cochran’s Q*	Repeated measures logistic regression	Repeated measures logistic regression

*Bonferroni correction applied
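
As a rough illustration of the first row of the table, the sketch below computes a chi-square test by hand for a 2x2 table, reports the smallest expected cell count (the usual trigger for switching to Fisher's exact test), and applies a Bonferroni-adjusted alpha. The counts, the function name `chi_square_2x2`, and the assumption of 3 pairwise comparisons are hypothetical and chosen only for demonstration.

```python
import math

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square test (no continuity correction) for a 2x2 table.

    Layout (rows are groups, columns are outcome yes / no):
        group 1: a, b
        group 2: c, d
    Returns (statistic, p_value, smallest_expected_count).
    """
    n = a + b + c + d
    observed = ((a, b), (c, d))
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    statistic = 0.0
    smallest_expected = float("inf")
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / n
            smallest_expected = min(smallest_expected, expected)
            statistic += (observed[i][j] - expected) ** 2 / expected
    # With 1 degree of freedom, the chi-square tail area equals erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value, smallest_expected

# Hypothetical counts: 20/100 events with drug A, 35/100 events with drug B.
statistic, p_value, smallest_expected = chi_square_2x2(20, 80, 35, 65)

# Bonferroni correction: divide alpha by the number of comparisons made.
# Assumption: A vs. B is one of 3 pairwise comparisons among 3 groups.
comparisons = math.comb(3, 2)
bonferroni_alpha = 0.05 / comparisons

print(f"chi-square = {statistic:.2f}, p = {p_value:.4f}")
print(f"smallest expected count = {smallest_expected:.1f} (consider Fisher's exact if < 5)")
print(f"significant at 0.05: {p_value < 0.05}")
print(f"significant at Bonferroni alpha {bonferroni_alpha:.4f}: {p_value < bonferroni_alpha}")
```

In this made-up example the same p-value passes an unadjusted threshold of 0.05 but not the Bonferroni-adjusted one, which is the reason the correction is flagged for comparisons across three or more samples.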

2. Ordinal

Data ranked in a specific order, but lacking a constant difference in magnitude between ranks
Is a discrete variable
Not eligible for parametric tests
Median is the most common measure of central tendency used to describe ordinal data
- Mean and standard deviation NOT used for reporting
Commonly used in observational studies
Examples: APACHE-II score, trauma score, NYHA class

Picking a Test for ORDINAL Data
Dependence	Samples	0 Independent Variables	1 Independent Variable	≥2 Independent Variables
Independent Samples, Parallel Design	2	Wilcoxon rank sum or Mann-Whitney U	2-way ANOVA ranks	ANCOVA ranks
Independent Samples, Parallel Design	≥3	Kruskal-Wallis*¥	2-way ANOVA ranks	ANCOVA ranks
Dependent (Paired, Related) Samples, Cross-over Design	2	Wilcoxon signed rank test	2-way repeated ANOVA ranks	Repeated measures regression
Dependent (Paired, Related) Samples, Cross-over Design	≥3	Friedman test	2-way repeated ANOVA ranks	Repeated measures regression

*Bonferroni correction applied; ¥Multiple Comparison Procedure applied
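
To make the rank-based idea concrete, here is a minimal sketch of the Mann-Whitney U statistic for two independent samples of ordinal data, reported alongside medians rather than means. The NYHA class values and the function name `mann_whitney_u` are hypothetical. Only the test statistic is computed; a p-value would come from a U table or statistical software.

```python
import statistics

def mann_whitney_u(group_a, group_b):
    """Return (U_a, U_b) by counting pairwise comparisons; ties count 0.5."""
    u_a = 0.0
    for x in group_a:
        for y in group_b:
            if x > y:
                u_a += 1.0
            elif x == y:
                u_a += 0.5
    u_b = len(group_a) * len(group_b) - u_a
    return u_a, u_b

# Hypothetical NYHA classes (1 to 4) at the end of a parallel-design study.
treatment = [1, 2, 2, 1, 3]
control = [3, 2, 4, 3, 4]

u_treatment, u_control = mann_whitney_u(treatment, control)

print("median NYHA class, treatment:", statistics.median(treatment))
print("median NYHA class, control:", statistics.median(control))
# The smaller of the two U values is the one compared against a critical value.
print("U statistic:", min(u_treatment, u_control))
```

Because only the ordering of values matters, the result does not depend on the distance between NYHA classes being constant.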

3. Interval

Data ranked in a specific order with a constant difference in magnitude between units; zero is an arbitrary value
A continuous variable
- Mean (average) is the most common measure of central tendency used to describe continuous data
Eligible for parametric tests if data is normally distributed
Example: temperature in degrees Celsius or Fahrenheit

4. Ratio

Data ranked in a specific order with a constant difference in magnitude between units; zero is NOT an arbitrary value
A continuous variable
- Mean (average) is the most common measure of central tendency used to describe continuous data
Eligible for parametric tests if data is normally distributed
Examples: blood pressure, heart rate, respiratory rate

Picking a Test for CONTINUOUS Data (Interval or Ratio Data)
Dependence	Samples	0 Independent Variables	1 Independent Variable	≥2 Independent Variables
Independent Samples, Parallel Design	2	Student’s t-test	2-way ANOVA	ANCOVA
Independent Samples, Parallel Design	≥3	1-way ANOVA¥	2-way ANOVA	ANCOVA
Dependent (Paired, Related) Samples, Cross-over Design	2	Paired Student’s t-test	2-way repeated measures ANOVA	Repeated measures regression
Dependent (Paired, Related) Samples, Cross-over Design	≥3	ANOVA for repeated measures	2-way repeated measures ANOVA	Repeated measures regression

¥Multiple Comparison Procedure (MCP) applied
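
The difference between the independent and paired versions of Student's t-test can be sketched as follows. The blood pressure values and function names are hypothetical, and the code stops at the t statistic and degrees of freedom because the Python standard library does not include the t distribution needed for a p-value.

```python
import math
import statistics

def students_t(sample_1, sample_2):
    """Pooled-variance (Student's) t statistic for two independent samples."""
    n1, n2 = len(sample_1), len(sample_2)
    df = n1 + n2 - 2
    pooled_variance = (
        (n1 - 1) * statistics.variance(sample_1)
        + (n2 - 1) * statistics.variance(sample_2)
    ) / df
    standard_error = math.sqrt(pooled_variance * (1 / n1 + 1 / n2))
    t = (statistics.mean(sample_1) - statistics.mean(sample_2)) / standard_error
    return t, df

def paired_t(before, after):
    """Paired t statistic: a one-sample test on within-subject differences."""
    differences = [a - b for a, b in zip(after, before)]
    n = len(differences)
    standard_error = statistics.stdev(differences) / math.sqrt(n)
    return statistics.mean(differences) / standard_error, n - 1

# Hypothetical systolic blood pressures (mmHg), parallel design.
drug = [120, 125, 130, 135, 140]
placebo = [130, 135, 140, 145, 150]
t_independent, df_independent = students_t(drug, placebo)

# Hypothetical systolic blood pressures (mmHg), same subjects before and after.
before = [150, 142, 160, 155]
after = [140, 138, 150, 149]
t_paired, df_paired = paired_t(before, after)

print(f"independent: t = {t_independent:.2f}, df = {df_independent}")
print(f"paired: t = {t_paired:.2f}, df = {df_paired}")
```

The paired version works on each subject's own change, which is why a cross-over or before-and-after design calls for a different test than a parallel design.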

REVIEW OF SELECT STATISTICS DEFINITIONS

Null hypothesis: The hypothesis that there is no difference between groups in a study.

P-value: the probability, calculated from sample data, of observing a difference between two estimates at least as large as the one seen if the estimates being compared were really the same (i.e., if the null hypothesis were true).

Alpha (α): the chance of concluding there is a difference between groups when there is actually no difference.

Type I error: when you conclude there is a difference between groups, but there actually is no difference.

Beta (β): the chance of concluding there is no difference between groups when there actually is a difference.

Type II error: when you conclude there is no difference between groups, but there actually is a difference.

Power: the probability that a study detects a significant difference between treatment groups when a difference actually exists.

Confidence interval: a range of values, calculated from the study data, that is likely to contain the true treatment effect at a stated level of confidence (e.g., 95%).

Intent-to-treat analysis: when the analysis includes all subjects randomized to a treatment arm, regardless of whether or not the subject completed the study.

Relative risk: compares the risk of an event in a group of individuals with a specific characteristic to the risk of that event in a group of individuals without that specific characteristic.

Absolute risk: the risk of developing a disease over a given time period.

Relative risk ratio: a ratio of the event rate in an intervention group versus the rate of that event in a control group.

Relative risk reduction: how much risk is reduced in the intervention group as compared to the control group.

Absolute risk reduction: the absolute difference in rates of an outcome between treatment and control groups.

Odds ratio: the odds that an outcome will occur in a group of subjects with an intervention, divided by the odds of that outcome occurring in a group of subjects without the intervention.

Number needed to treat: the number of subjects that must be treated in order to benefit one subject.

Number needed to harm: the number of subjects that must be treated in order to harm one subject.

Selection bias: when one group within a study is different than the other(s) due to the manner in which the subjects were selected.

Publication bias: when the available literature favors one outcome. This is typically the result of researchers and journals only reporting favorable study results.

Recall bias: when the subject remembers an event differently from how it actually occurred.

Clinical significance: when the data analysis from the study produces results that change clinical practice.

STUDY TYPE BASICS

Cross-sectional study: a snapshot of a population at a single point in time.

Observational study: a study in which subjects are observed without any active intervention or randomization of patients.

Case control study: a retrospective study that examines subjects with the outcome of interest (the cases) versus patients without the outcome of interest (the controls).

Cohort study: an observational study in which a group of subjects with an exposure of interest and a group without that exposure are followed over time for the outcome.

Crossover study: when the intervention group and control group switch and each subject serves as their own comparator.

Follow-up study: observation of subjects that have not yet experienced an outcome until that outcome occurs.

Non-inferiority study: aims to demonstrate the intervention is not worse than the comparator by more than a small pre-specified amount, M or delta (Δ).

Randomized controlled study: a prospective study in which subjects are randomized into intervention and control groups.

COMMON STATISTICS ABBREVIATIONS

P = p-value

CI = confidence interval

IQR = inter-quartile range

SD = standard deviation

OR = odds ratio

RR = relative risk

RRR = relative risk reduction

ARR = absolute risk reduction

NNT = number needed to treat

NNH = number needed to harm

STATISTICS EQUATIONS

Power = 1 – β

	Outcome Yes	Outcome No
Intervention yes:	A	B
Intervention no:	C	D

OR = (A/C) / (B/D) or (AD/BC)

RR = [A/(A+B)] / [C/(C+D)] …aka… (event rate in the intervention group) / (event rate in the control group)

RRR = 1 – RR

ARR = [C/(C+D)] – [A/(A+B)] …aka… (event rate in the control group) – (event rate in the intervention group)

NNT = 1/ARR
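
The equations above can be checked with a short calculation. The sketch below uses a hypothetical 2x2 table (the counts and the function name `two_by_two_measures` are illustrative only) and takes ARR as the absolute difference between the two event rates so that NNT comes out positive.

```python
def two_by_two_measures(a, b, c, d):
    """Effect measures from a 2x2 table.

    a, b: intervention group with / without the outcome
    c, d: control group with / without the outcome
    """
    risk_intervention = a / (a + b)
    risk_control = c / (c + d)
    relative_risk = risk_intervention / risk_control
    # ARR is taken here as the absolute difference between event rates.
    absolute_risk_reduction = abs(risk_control - risk_intervention)
    return {
        "OR": (a * d) / (b * c),
        "RR": relative_risk,
        "RRR": 1 - relative_risk,
        "ARR": absolute_risk_reduction,
        "NNT": 1 / absolute_risk_reduction,
    }

# Hypothetical trial: 10/100 events with the intervention, 20/100 with control.
measures = two_by_two_measures(10, 90, 20, 80)
for name, value in measures.items():
    print(f"{name} = {value:.3f}")
```

By convention NNT is reported as a whole number of patients, rounded up when the calculation yields a fraction.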

SOME STATISTICS BASICS

P-value

The smaller the p-value, the less likely it is that a difference this large would be seen by chance alone
The larger the p-value, the more likely it is that a difference this large would be seen by chance alone
A statistically significant p-value (e.g., P = 0.005) does not always translate into a difference that is clinically significant
A p-value of 0.01 means that, if there were truly no difference between groups, a result at least this extreme would be expected by chance 1 time in 100
Does not describe the size of an effect, only the strength of the evidence against the null hypothesis
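
The point that a p-value does not describe the size of an effect can be illustrated with a two-proportion z-test, used here as a simple normal-approximation stand-in rather than as the test any particular study would use. The event counts and the function name are hypothetical. Both comparisons below have the same event rates (10% vs. 12%) and differ only in sample size.

```python
import math
from statistics import NormalDist

def two_proportion_p_value(events_1, n_1, events_2, n_2):
    """Two-sided p-value from a pooled two-proportion z-test."""
    p1, p2 = events_1 / n_1, events_2 / n_2
    pooled = (events_1 + events_2) / (n_1 + n_2)
    standard_error = math.sqrt(pooled * (1 - pooled) * (1 / n_1 + 1 / n_2))
    z = (p1 - p2) / standard_error
    return 2 * (1 - NormalDist().cdf(abs(z)))

# Same 2-percentage-point difference, very different sample sizes.
p_small_study = two_proportion_p_value(10, 100, 12, 100)
p_large_study = two_proportion_p_value(1000, 10000, 1200, 10000)

print(f"n = 100 per group:    p = {p_small_study:.3f}")
print(f"n = 10,000 per group: p = {p_large_study:.6f}")
```

The effect size is identical in both cases, so whether a 2-percentage-point difference matters clinically is a separate question from whether it reaches statistical significance.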

Confidence interval

A narrow confidence interval = less uncertainty
A wide confidence interval = more uncertainty
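
As a small sketch of why interval width reflects uncertainty, the code below computes a 95% confidence interval for a proportion using the simple normal (Wald) approximation. The counts and the function name `proportion_ci` are hypothetical, and other interval methods exist that behave better for small samples or rare events.

```python
import math
from statistics import NormalDist

def proportion_ci(events, n, confidence=0.95):
    """Wald (normal approximation) confidence interval for a proportion."""
    p = events / n
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    half_width = z * math.sqrt(p * (1 - p) / n)
    return p - half_width, p + half_width

# Same observed event rate (20%), different sample sizes.
low_small, high_small = proportion_ci(20, 100)
low_large, high_large = proportion_ci(200, 1000)

print(f"n = 100:  {low_small:.3f} to {high_small:.3f}")
print(f"n = 1000: {low_large:.3f} to {high_large:.3f}")
```

The point estimate is the same in both cases, but the larger sample gives a narrower interval.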

Power

The greater the power, the less likely the study is to miss a true difference between groups (type II error)
A common way to increase the power of a study is to increase the sample size

Sample size

When an outcome is rare, a larger sample size is commonly needed to detect a difference between an intervention and a control group
A large treatment effect on the outcome can decrease the needed sample size, even for less common outcomes
Sample size estimates should be calculated prior to starting a study
Without an adequate sample size a study has insufficient power and may fail to detect a difference that truly exists
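
The relationship between event rate, effect size, and sample size can be sketched with a common normal-approximation formula for comparing two proportions. This is one of several formulas in use, and the event rates, alpha of 0.05, and power of 0.80 are assumptions chosen for illustration.

```python
import math
from statistics import NormalDist

def sample_size_per_group(p_control, p_intervention, alpha=0.05, power=0.80):
    """Approximate subjects needed per group to compare two proportions."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)  # power = 1 - beta
    variance_sum = (p_control * (1 - p_control)
                    + p_intervention * (1 - p_intervention))
    n = (z_alpha + z_beta) ** 2 * variance_sum / (p_control - p_intervention) ** 2
    return math.ceil(n)

# Each scenario below halves the event rate or better (hypothetical rates).
n_common = sample_size_per_group(0.20, 0.10)        # common outcome
n_rare = sample_size_per_group(0.02, 0.01)          # rare outcome
n_large_effect = sample_size_per_group(0.20, 0.05)  # larger treatment effect
n_more_power = sample_size_per_group(0.20, 0.10, power=0.90)

print("common outcome, 20% vs 10%:", n_common)
print("rare outcome, 2% vs 1%:", n_rare)
print("larger effect, 20% vs 5%:", n_large_effect)
print("20% vs 10% at 90% power:", n_more_power)
```

Under these assumptions the rare outcome requires far more subjects than the common one, a larger treatment effect requires fewer, and asking for more power requires more.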

Standard deviation

Used as a measure of dispersion of the data around the mean (the measure of central tendency)
The more the variability, the larger the standard deviation
The less variability there is, the smaller the standard deviation
When data are not normally distributed (e.g., skewed), the standard deviation is less reliable and in turn the inter-quartile range, along with the median, is commonly employed instead
Cannot be used with nominal or ordinal data
For normally distributed data, approximately 68% of data points are within 1 standard deviation of the mean and approximately 95% are within 2 standard deviations
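
Two of the points above can be checked numerically: the share of a normal distribution lying within 1 and 2 standard deviations, and how a single outlier affects the mean and standard deviation compared with the median and inter-quartile range. The length-of-stay values are hypothetical.

```python
import statistics
from statistics import NormalDist

# Share of a normal distribution within 1 and 2 standard deviations of the mean.
standard_normal = NormalDist()
within_1_sd = standard_normal.cdf(1) - standard_normal.cdf(-1)
within_2_sd = standard_normal.cdf(2) - standard_normal.cdf(-2)
print(f"within 1 SD: {within_1_sd:.1%}, within 2 SD: {within_2_sd:.1%}")

# Hypothetical hospital length of stay in days, skewed by one long admission.
length_of_stay = [2, 3, 3, 4, 4, 5, 5, 6, 30]
mean = statistics.mean(length_of_stay)
sd = statistics.stdev(length_of_stay)
median = statistics.median(length_of_stay)
q1, _, q3 = statistics.quantiles(length_of_stay, n=4)

print(f"mean (SD): {mean:.1f} ({sd:.1f})")
print(f"median (IQR): {median} ({q1} to {q3})")
```

In this skewed example the mean is pulled above the upper quartile by one value, while the median and inter-quartile range still describe the typical patient.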

Odds ratio

Can be used in case-control studies and cohort studies
Cannot be used to calculate a number needed to treat
When the outcome is common, can exaggerate the risk
An odds ratio < 1 means the outcome is less likely in the intervention group
An odds ratio > 1 means the outcome is more likely in the intervention group
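
The note that an odds ratio can exaggerate risk when the outcome is common can be seen by comparing it with the relative risk in two hypothetical 2x2 tables, both constructed so that the relative risk is exactly 2. The counts and the function name are illustrative only.

```python
def odds_ratio_and_relative_risk(a, b, c, d):
    """a, b: intervention with / without outcome; c, d: control with / without."""
    odds_ratio = (a * d) / (b * c)
    relative_risk = (a / (a + b)) / (c / (c + d))
    return odds_ratio, relative_risk

# Rare outcome: 2% vs. 1% event rates.
or_rare, rr_rare = odds_ratio_and_relative_risk(2, 98, 1, 99)

# Common outcome: 60% vs. 30% event rates.
or_common, rr_common = odds_ratio_and_relative_risk(60, 40, 30, 70)

print(f"rare outcome:   RR = {rr_rare:.2f}, OR = {or_rare:.2f}")
print(f"common outcome: RR = {rr_common:.2f}, OR = {or_common:.2f}")
```

With the rare outcome the two measures nearly agree; with the common outcome the odds ratio is noticeably further from 1 than the relative risk.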

Relative risk

A relative risk of 1 means there is no association between the intervention and the outcome
A relative risk < 1 indicates a negative association between the intervention and the outcome
A relative risk > 1 indicates a positive association between the intervention and the outcome

Relative risk reduction

In the absence of an understanding of a subject’s baseline risk for the outcome, presenting benefits using relative risk reduction can be deceiving
- Can make treatments seem more substantial
- Can make toxicities seem more substantial
Cannot be used in case-control studies

Absolute risk reduction

Inverse of number needed to treat
Yields a less exaggerated risk reduction than relative risk reduction
Expressed in units of baseline risk

Number needed to treat/harm

Strong treatment effects on positive and negative outcomes lead to small NNT and NNH, respectively, and vice-versa.
As NNT/NNH are derived from ARR estimates, they are also point effect estimates for a true population and therefore have confidence intervals, although not often reported.
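
One simple way to obtain a confidence interval for NNT is to compute a Wald interval for ARR and take the reciprocals of its limits. The sketch below does this for hypothetical trial counts, with ARR taken as the control event rate minus the intervention event rate so that a positive value indicates benefit. The function name is illustrative, and the reciprocal step is only attempted when the ARR interval excludes zero.

```python
import math
from statistics import NormalDist

def nnt_with_ci(events_control, n_control, events_intervention, n_intervention,
                confidence=0.95):
    """Return (NNT, (lower, upper)) using a Wald interval for ARR."""
    risk_control = events_control / n_control
    risk_intervention = events_intervention / n_intervention
    arr = risk_control - risk_intervention
    standard_error = math.sqrt(
        risk_control * (1 - risk_control) / n_control
        + risk_intervention * (1 - risk_intervention) / n_intervention
    )
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    arr_low, arr_high = arr - z * standard_error, arr + z * standard_error
    # Reciprocals swap the limits: the largest ARR gives the smallest NNT.
    nnt_ci = (1 / arr_high, 1 / arr_low) if arr_low > 0 else None
    return 1 / arr, nnt_ci

# Hypothetical trial: 200/1000 events with control, 100/1000 with intervention.
nnt, nnt_ci = nnt_with_ci(200, 1000, 100, 1000)
print(f"NNT = {nnt:.1f}, 95% CI {nnt_ci[0]:.1f} to {nnt_ci[1]:.1f}")
```

A stronger treatment effect (larger ARR) gives a smaller NNT, and a more precise ARR estimate gives a tighter NNT interval.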

MISCELLANEOUS NOTES

Parametric tests generally have greater statistical power than non-parametric tests. Data must be normally distributed and continuous (i.e., ratio or interval) to use parametric tests.
Combining study endpoints with low event rates into a composite endpoint can allow for enrolling fewer patients while increasing a study’s power, but not all endpoints are appropriate to combine and a significant result does not mean all endpoints are significant.
Type II errors are common in clinical trials due to low patient enrollment.
Univariate logistic regression can only identify variables that are potential independent predictors of an outcome; these must be confirmed in multivariable logistic regression while taking into account the validity of the study design.
Unlike a mean value, a median value is not strongly impacted by extreme outliers.
By definition, observational studies lack an active intervention.
Randomized controlled trials have an active intervention, minimize effects of confounding, and have a greater publication impact.

HELPFUL RESOURCES