Applied Statistics for the Health Sciences: Core Concepts and Methods
Study Guide - Smart Notes
Descriptive Statistics vs Inferential Statistics
Overview of Statistical Types
Statistics can be broadly divided into two main types: descriptive statistics and inferential statistics. Understanding the distinction between these types is fundamental for analyzing and interpreting data in health sciences and other fields.
Descriptive Statistics: Summarize and describe the main features of a dataset. Examples include measures of central tendency (mean, median, mode) and measures of variability (standard deviation, range).
Inferential Statistics: Allow conclusions to be drawn about a population based on data from a sample. This includes hypothesis testing, estimation, and making predictions.
Example: Calculating the average blood pressure of a sample of patients (descriptive), then using that sample to estimate the average blood pressure of all patients in a hospital (inferential).
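The descriptive half of this example can be sketched in a few lines of Python. The blood pressure readings below are hypothetical values invented for illustration, and only the standard-library `statistics` module is used.

```python
import statistics

# Hypothetical systolic blood pressure readings (mmHg) for a sample of 8 patients.
readings = [118, 122, 125, 130, 122, 135, 128, 120]

# Measures of central tendency
mean_bp = statistics.mean(readings)
median_bp = statistics.median(readings)
mode_bp = statistics.mode(readings)

# Measures of variability
sd_bp = statistics.stdev(readings)          # sample standard deviation (divides by n - 1)
range_bp = max(readings) - min(readings)

print(f"mean={mean_bp}, median={median_bp}, mode={mode_bp}")
print(f"sd={sd_bp:.2f}, range={range_bp}")
```

These numbers only describe the eight patients in the list. Saying anything about all patients in the hospital requires an inferential step such as a confidence interval or a hypothesis test.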
Samples and Populations
Definitions and Importance
In statistics, understanding the concepts of samples and populations is essential for designing studies and interpreting results.
Population: The entire group of individuals or items with a particular characteristic of interest. For example, all adults with hypertension in a city.
Sample: A subset of the population selected for analysis. For example, 100 adults with hypertension chosen randomly from the city.
Why use samples?
More practical and cost-effective than studying entire populations.
Allows for quicker data collection and analysis.
Results from samples are used to make inferences about the population.
Statistics vs Parameters:
Statistics: Numerical measures calculated from sample data (e.g., sample mean $\bar{x}$, sample standard deviation $s$).
Parameters: Numerical measures describing the population (e.g., population mean $\mu$, population standard deviation $\sigma$).
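A small simulation may help make the distinction concrete. The population below is hypothetical (10,000 simulated blood pressure values with an assumed mean of 140 and standard deviation of 15); in a real study the population values are usually unknown, which is why the sample statistics are needed as estimates.

```python
import random
import statistics

random.seed(42)  # fixed seed so the sketch is reproducible

# Hypothetical population: systolic blood pressure (mmHg) of 10,000 adults.
population = [random.gauss(140, 15) for _ in range(10_000)]

# Parameters describe the whole population.
mu = statistics.mean(population)
sigma = statistics.pstdev(population)   # population SD divides by N

# Statistics are computed from a sample and estimate the parameters.
sample = random.sample(population, 100)
x_bar = statistics.mean(sample)
s = statistics.stdev(sample)            # sample SD divides by n - 1

print(f"mu = {mu:.1f}, x_bar = {x_bar:.1f}")
print(f"sigma = {sigma:.1f}, s = {s:.1f}")
```

The sample values should land close to, but not exactly on, the population values. That gap is sampling error.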
Sampling Methods
Types of Sampling
Sampling methods determine how samples are selected from populations. The choice of method affects the representativeness and validity of statistical conclusions.
Simple Random Sampling: Every member of the population has an equal chance of being selected. Example: Randomly selecting 50 patients from a hospital database.
Opportunity Sampling: Selecting participants who are readily available. Example: Surveying students present in a classroom.
Snowball Sampling: Existing study participants recruit future participants from among their acquaintances. Example: Studying a rare disease by asking patients to refer others.
Volunteer Sampling: Participants self-select to be part of the study. Example: Posting an online survey and analyzing responses from those who choose to participate.
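As a rough illustration of the first two methods, the sketch below draws a simple random sample and an opportunity sample from a hypothetical list of 500 patient IDs. The ID list and the "first 50 available" rule are assumptions made for the example.

```python
import random

random.seed(0)  # fixed seed so the sketch is reproducible

# Hypothetical hospital database: patient IDs 1 to 500.
patient_ids = list(range(1, 501))

# Simple random sampling: every patient has the same chance of selection,
# and random.sample draws without replacement.
simple_random = random.sample(patient_ids, 50)

# Opportunity sampling: take whoever is most readily available,
# modelled here as the first 50 records in the database.
opportunity = patient_ids[:50]

print(sorted(simple_random)[:10])
print(opportunity[:10])
```

The opportunity sample covers only IDs 1 to 50, so any characteristic related to record order (for example, admission date) would be over-represented. The random sample spreads across the whole list.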
Comparison of Sampling Methods
| Sampling Method | Description | Advantages | Disadvantages |
|---|---|---|---|
| Simple Random | Random selection from entire population | Minimizes bias; representative | May be costly or time-consuming |
| Opportunity | Uses available participants | Quick and easy | May not be representative |
| Snowball | Participants recruit others | Useful for hard-to-reach populations | Potential for bias |
| Volunteer | Participants self-select | Easy to implement | Potential for bias |
Null Hypothesis Testing
Fundamental Concepts
Null hypothesis testing is a core method in inferential statistics, used to determine whether observed data provide sufficient evidence to reject a default assumption about a population.
Null Hypothesis ($H_0$): Assumes no effect or no difference. Example: There is no difference in blood pressure between two treatments.
Alternative Hypothesis ($H_1$): Assumes there is an effect or a difference. Example: There is a difference in blood pressure between two treatments.
Type I Error (probability $\alpha$): Incorrectly rejecting the null hypothesis when it is true (false positive).
Type II Error (probability $\beta$): Failing to reject the null hypothesis when it is false (false negative).
Type I and Type II Errors Table
| Decision | Reality: $H_0$ True | Reality: $H_0$ False |
|---|---|---|
| Reject $H_0$ | Type I Error | Correct Decision |
| Fail to Reject $H_0$ | Correct Decision | Type II Error |
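One way to see what a Type I error rate means is to simulate many studies in which the null hypothesis is true and count how often it is rejected anyway. The sketch below assumes a two-tailed z-test with a known population standard deviation and a 0.05 significance level; the population values (mean 120, SD 10) and sample size are hypothetical.

```python
import random
from statistics import NormalDist, mean

random.seed(1)  # fixed seed so the sketch is reproducible

alpha = 0.05
z_crit = NormalDist().inv_cdf(1 - alpha / 2)   # two-tailed critical value, about 1.96

# Hypothetical population in which the null hypothesis is TRUE (true mean = mu0).
mu0, sigma, n = 120, 10, 30
n_trials = 2000

rejections = 0
for _ in range(n_trials):
    sample = [random.gauss(mu0, sigma) for _ in range(n)]
    z = (mean(sample) - mu0) / (sigma / n ** 0.5)
    if abs(z) > z_crit:
        rejections += 1      # a false positive: the null was true but got rejected

type_i_rate = rejections / n_trials
print(f"Observed Type I error rate: {type_i_rate:.3f}")
```

The observed rate should sit near 0.05, which is the sense in which the significance level controls the false positive rate.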
Statistical Significance and p-values
A p-value is the probability of obtaining results at least as extreme as those observed, assuming the null hypothesis is true.
If $p < 0.05$, results are typically considered statistically significant, and the null hypothesis is rejected.
Statistical significance does not necessarily imply practical significance.
Standard Normal Distribution
Many hypothesis tests are based on the standard normal distribution (also called the $z$ distribution).
The critical value of $z = \pm 1.96$ is often used as a threshold for significance (two-tailed test at the 0.05 level).
Example: Z-test for comparing sample mean to population mean.
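A minimal sketch of such a z-test follows, assuming the population standard deviation is known and a two-tailed test is wanted. The function name `one_sample_z_test` and the input numbers are hypothetical and chosen only for illustration.

```python
from math import sqrt
from statistics import NormalDist

def one_sample_z_test(sample_mean, pop_mean, pop_sd, n):
    """Return (z, two-tailed p-value) for a one-sample z-test."""
    standard_error = pop_sd / sqrt(n)
    z = (sample_mean - pop_mean) / standard_error
    # Probability of a z at least this extreme in either tail, if the null is true.
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p

# Hypothetical data: 25 patients with mean systolic BP 124 mmHg,
# compared against a population mean of 120 mmHg with SD 10 mmHg.
z, p = one_sample_z_test(sample_mean=124, pop_mean=120, pop_sd=10, n=25)
print(f"z = {z:.2f}, p = {p:.4f}")
```

Here z is 2.0, which lies just beyond 1.96, so the p-value falls slightly below 0.05 and the null hypothesis would be rejected at that level.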
Confidence Intervals
Definition and Interpretation
A confidence interval (CI) provides a range of values within which the true population parameter (such as the mean) is likely to fall, with a specified level of confidence (commonly 95%).
Formula for 95% Confidence Interval:
$$\bar{x} \pm z \cdot \frac{s}{\sqrt{n}}$$
$\bar{x}$ = sample mean
$z$ = critical value from standard normal distribution (for 95%, $z = 1.96$)
$s$ = sample standard deviation
$n$ = sample size
SPSS and other statistical software can calculate confidence intervals and display them in error bar charts.
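Interpretation: If a 95% CI for the mean blood pressure is [120, 130], we are 95% confident that the true mean lies within this interval, in the sense that about 95% of intervals constructed this way from repeated samples would contain the true mean.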
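Outside SPSS, the same calculation (sample mean plus or minus the critical z value times the standard error) can be sketched in Python. The function name `mean_confidence_interval` and the readings are hypothetical. The sketch uses the normal critical value as in these notes, although for a sample this small a t critical value is usually preferred.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

def mean_confidence_interval(data, confidence=0.95):
    """Return (lower, upper) bounds of a z-based CI for the mean."""
    n = len(data)
    x_bar = mean(data)
    s = stdev(data)                                      # sample SD (n - 1)
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)   # about 1.96 for 95%
    margin = z * s / sqrt(n)
    return x_bar - margin, x_bar + margin

# Hypothetical systolic blood pressure readings (mmHg).
readings = [118, 122, 125, 130, 122, 135, 128, 120]
lower, upper = mean_confidence_interval(readings)
print(f"95% CI: [{lower:.1f}, {upper:.1f}]")
```

For these readings the interval is roughly 121.1 to 128.9, centred on the sample mean of 125.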
Steps for Calculating a 95% CI with SPSS
Open the dataset in SPSS.
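Select 'Analyze' → 'Descriptive Statistics' → 'Explore'.
Move the variable into 'Dependent List' and the grouping factor into 'Factor List'.
Click 'Statistics' and check that 'Descriptives' is selected with 'Confidence Interval for Mean' set to 95%.
View the Descriptives output table for the CI bounds; an error bar chart can be built separately from the 'Graphs' menu.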
Graphing Confidence Intervals
Use error bar charts to visually represent confidence intervals around group means.
Helps in comparing means and assessing variability.
Summary Table: Key Concepts
| Concept | Definition | Example |
|---|---|---|
| Descriptive Statistics | Summarize features of data | Mean, median, mode of blood pressure readings |
| Inferential Statistics | Draw conclusions about populations | Estimating population mean from sample |
| Sample | Subset of population | 100 patients from a hospital |
| Population | Entire group of interest | All patients in a city |
| Confidence Interval | Range for population parameter | [120, 130] for mean blood pressure |