Applied Statistics for the Health Sciences: Core Concepts and Methods

Study Guide - Smart Notes

Descriptive Statistics vs Inferential Statistics

Overview of Statistical Types

Statistics can be broadly divided into two main types: descriptive statistics and inferential statistics. Understanding the distinction between these types is fundamental for analyzing and interpreting data in health sciences and other fields.

Descriptive Statistics: Summarize and describe the main features of a dataset. Examples include measures of central tendency (mean, median, mode) and measures of variability (standard deviation, range).
Inferential Statistics: Allow conclusions to be drawn about a population based on data from a sample. This includes hypothesis testing, estimation, and making predictions.
Example: Calculating the average blood pressure of a sample of patients (descriptive), then using that sample to estimate the average blood pressure of all patients in a hospital (inferential).
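
The descriptive measures listed above can be computed in a few lines. The following is a minimal sketch using Python's standard `statistics` module; the blood pressure readings are hypothetical values chosen for illustration, not data from any study.

```python
import statistics

# Hypothetical systolic blood pressure readings (mmHg) from 10 patients.
readings = [118, 122, 125, 122, 130, 135, 128, 122, 140, 126]

# Measures of central tendency
mean_bp = statistics.mean(readings)
median_bp = statistics.median(readings)
mode_bp = statistics.mode(readings)

# Measures of variability
sd_bp = statistics.stdev(readings)        # sample SD (divides by n - 1)
range_bp = max(readings) - min(readings)

print(f"mean = {mean_bp:.1f}, median = {median_bp:.1f}, mode = {mode_bp}")
print(f"SD = {sd_bp:.2f}, range = {range_bp}")
```

These numbers only describe the 10 readings themselves. Saying anything about all patients in the hospital requires an inferential step.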

Samples and Populations

Definitions and Importance

In statistics, understanding the concepts of samples and populations is essential for designing studies and interpreting results.

Population: The entire group of individuals or items with a particular characteristic of interest. For example, all adults with hypertension in a city.
Sample: A subset of the population selected for analysis. For example, 100 adults with hypertension chosen randomly from the city.
Why use samples?
- More practical and cost-effective than studying entire populations.
- Allows for quicker data collection and analysis.
- Results from samples are used to make inferences about the population.
Statistics vs Parameters:
- Statistics: Numerical measures calculated from sample data (e.g., sample mean $\bar{x}$, sample standard deviation $s$).
- Parameters: Numerical measures describing the population (e.g., population mean $\mu$, population standard deviation $\sigma$).
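
To make the distinction concrete, the sketch below simulates a hypothetical population and draws one sample from it. The population size, mean of 140 mmHg, and standard deviation of 15 mmHg are assumptions made for illustration only. In real studies the parameters are usually unknown, which is why the sample statistics are needed.

```python
import random
import statistics

rng = random.Random(0)  # fixed seed so the example is reproducible

# Hypothetical population: systolic blood pressure (mmHg) of 10,000 adults.
population = [rng.gauss(140, 15) for _ in range(10_000)]

# Parameters describe the whole population.
mu = statistics.fmean(population)
sigma = statistics.pstdev(population)   # population SD (divides by N)

# Statistics are computed from a sample drawn from that population.
sample = rng.sample(population, k=100)
x_bar = statistics.fmean(sample)
s = statistics.stdev(sample)            # sample SD (divides by n - 1)

print(f"parameters: mu = {mu:.1f}, sigma = {sigma:.1f}")
print(f"statistics: x_bar = {x_bar:.1f}, s = {s:.1f}")
```

The sample values will be close to, but not exactly equal to, the population values. That gap is sampling error.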

Sampling Methods

Types of Sampling

Sampling methods determine how samples are selected from populations. The choice of method affects the representativeness and validity of statistical conclusions.

Simple Random Sampling: Every member of the population has an equal chance of being selected. Example: Randomly selecting 50 patients from a hospital database.
Opportunity Sampling: Selecting participants who are readily available. Example: Surveying students present in a classroom.
Snowball Sampling: Existing study participants recruit future participants from among their acquaintances. Example: Studying a rare disease by asking patients to refer others.
Volunteer Sampling: Participants self-select to be part of the study. Example: Posting an online survey and analyzing responses from those who choose to participate.

Comparison of Sampling Methods

Sampling Method	Description	Advantages	Disadvantages
Simple Random	Random selection from entire population	Minimizes bias; representative	May be costly or time-consuming
Opportunity	Uses available participants	Quick and easy	May not be representative
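Snowball	Participants recruit others	Useful for hard-to-reach populations	Sample tends to be limited to participants' social networks, so it may not be representative
Volunteer	Participants self-select	Easy to implement	Self-selection bias: volunteers may differ systematically from non-volunteers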
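
As an illustration of the first two methods, the sketch below draws a simple random sample and an opportunity sample from a hypothetical list of 1,000 patient IDs. The ID list and the idea that the "readily available" patients are the first 50 on the list are assumptions made for the example.

```python
import random

# Hypothetical hospital database: patient IDs 1 to 1000.
patient_ids = list(range(1, 1001))

rng = random.Random(42)  # fixed seed so the draw is reproducible

# Simple random sampling: every patient has the same chance of selection,
# and no patient is selected twice.
simple_random = rng.sample(patient_ids, k=50)

# Opportunity sampling, for contrast: take whoever is easiest to reach,
# here assumed to be the first 50 patients on the list.
opportunity = patient_ids[:50]

print("simple random:", sorted(simple_random)[:5], "...")
print("opportunity:  ", opportunity[:5], "...")
```

The opportunity sample covers only IDs 1 to 50, while the random sample is spread across the whole database. This is why the random sample is more likely to be representative.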

Null Hypothesis Testing

Fundamental Concepts

Null hypothesis testing is a core method in inferential statistics, used to determine whether observed data provide sufficient evidence to reject a default assumption about a population.

Null Hypothesis ($H_0$): Assumes no effect or no difference. Example: There is no difference in blood pressure between two treatments.
Alternative Hypothesis ($H_1$): Assumes there is an effect or a difference. Example: There is a difference in blood pressure between two treatments.
Type I Error ($\alpha$): Incorrectly rejecting the null hypothesis when it is true (false positive).
Type II Error ($\beta$): Failing to reject the null hypothesis when it is false (false negative).

Type I and Type II Errors Table

Decision	Reality: $H_0$ True	Reality: $H_0$ False
Reject $H_0$	Type I Error	Correct Decision
Fail to Reject $H_0$	Correct Decision	Type II Error
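
The Type I error rate can be checked by simulation. The sketch below repeatedly samples from a population in which the null hypothesis is true by construction and counts how often a two-tailed z-test at the 5% significance level rejects it anyway. The population mean, standard deviation, sample size, and number of trials are all hypothetical choices.

```python
import math
import random
from statistics import NormalDist, fmean

rng = random.Random(1)  # fixed seed so the example is reproducible

ALPHA = 0.05
MU_0, SIGMA, N = 120.0, 15.0, 30              # hypothetical values; H0 is true here
Z_CRIT = NormalDist().inv_cdf(1 - ALPHA / 2)  # about 1.96

def rejects_h0(sample):
    """Two-tailed z-test of H0: mean == MU_0, with known SIGMA."""
    z = (fmean(sample) - MU_0) / (SIGMA / math.sqrt(N))
    return abs(z) > Z_CRIT

trials = 5000
false_positives = sum(
    rejects_h0([rng.gauss(MU_0, SIGMA) for _ in range(N)])
    for _ in range(trials)
)
type_i_rate = false_positives / trials

print(f"Estimated Type I error rate: {type_i_rate:.3f}")
```

Because every sample comes from a population where the null hypothesis holds, each rejection is a false positive. The estimated rate should land near the chosen 5% level.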

Statistical Significance and p-values

A p-value is the probability of obtaining results at least as extreme as those observed, assuming the null hypothesis is true.
If $p < 0.05$ (the conventional significance level $\alpha$), results are typically considered statistically significant, and the null hypothesis is rejected.
Statistical significance does not necessarily imply practical significance.

Standard Normal Distribution

Many hypothesis tests are based on the standard normal distribution (also called the $z$ distribution).
The critical value of $z = \pm 1.96$ is often used as a threshold for significance (two-tailed test at the 5% level).
Example: Z-test for comparing a sample mean to a population mean.
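
A one-sample z-test can be sketched as follows, assuming the population standard deviation is known. The function name `one_sample_z_test` and all input values are hypothetical and chosen for illustration.

```python
import math
from statistics import NormalDist

def one_sample_z_test(sample_mean, pop_mean, pop_sd, n):
    """Return (z, two-tailed p-value) for H0: the true mean equals pop_mean."""
    standard_error = pop_sd / math.sqrt(n)
    z = (sample_mean - pop_mean) / standard_error
    # Probability of a result at least this extreme in either tail, under H0.
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value

# Hypothetical study: 36 patients with mean systolic BP 126 mmHg,
# compared against a population mean of 120 mmHg with SD 15 mmHg.
z, p = one_sample_z_test(sample_mean=126, pop_mean=120, pop_sd=15, n=36)

z_crit = NormalDist().inv_cdf(0.975)  # two-tailed critical value, about 1.96
reject_h0 = p < 0.05

print(f"z = {z:.2f}, p = {p:.4f}, critical z = {z_crit:.2f}, reject H0: {reject_h0}")
```

Here z = 2.40 exceeds 1.96 and the p-value is about 0.016, so the null hypothesis would be rejected at the 5% level. Comparing |z| with the critical value and comparing p with 0.05 always lead to the same decision.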

Confidence Intervals

Definition and Interpretation

A confidence interval (CI) provides a range of values within which the true population parameter (such as the mean) is likely to fall, with a specified level of confidence (commonly 95%).

Formula for 95% Confidence Interval:

$$\text{CI} = \bar{x} \pm z \cdot \frac{s}{\sqrt{n}}$$

$\bar{x}$ = sample mean
$z$ = critical value from standard normal distribution (for 95%, $z = 1.96$)
$s$ = sample standard deviation
$n$ = sample size

SPSS and other statistical software can calculate confidence intervals and display them in error bar charts.
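Interpretation: If a 95% CI for the mean blood pressure is [120, 130], we are 95% confident that the true mean lies within this interval; that is, intervals constructed this way would contain the true mean in about 95% of repeated samples.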
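
The calculation itself is short: the interval is the sample mean plus or minus the critical value times the standard error, where the standard error is the sample standard deviation divided by the square root of the sample size. The sketch below applies this with z = 1.96; the function name `ci95_mean` and the input values are hypothetical.

```python
import math

def ci95_mean(sample_mean, sample_sd, n, z=1.96):
    """Return (lower, upper) bounds of a z-based 95% CI for the mean."""
    standard_error = sample_sd / math.sqrt(n)
    margin = z * standard_error
    return sample_mean - margin, sample_mean + margin

# Hypothetical sample: 64 patients, mean 125 mmHg, SD 16 mmHg.
lower, upper = ci95_mean(sample_mean=125, sample_sd=16, n=64)

print(f"95% CI: [{lower:.2f}, {upper:.2f}]")
```

With these inputs the standard error is 16 / 8 = 2, the margin is 1.96 × 2 = 3.92, and the interval is [121.08, 128.92]. For small samples, statistical software typically uses a critical value from the t distribution instead of 1.96, which gives a slightly wider interval.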

Steps for Calculating a 95% CI with SPSS

Open the dataset in SPSS.
Select 'Analyze' → 'Compare Means' → 'Means'.
Choose the variable and grouping factor.
Click 'Options' and select 'Display means for groups' and 'Confidence interval'.
View the output table and error bar chart for the CI.

Graphing Confidence Intervals

Use error bar charts to visually represent confidence intervals around group means.
Helps in comparing means and assessing variability.

Summary Table: Key Concepts

Concept	Definition	Example
Descriptive Statistics	Summarize features of data	Mean, median, mode of blood pressure readings
Inferential Statistics	Draw conclusions about populations	Estimating population mean from sample
Sample	Subset of population	100 patients from a hospital
Population	Entire group of interest	All patients in a city
Confidence Interval	Range for population parameter	[120, 130] for mean blood pressure
