This part is about summarizing data so it makes sense at a glance.
1. Measures of Central Tendency
These tell us the "center" of data.
- Mean = average (sum ÷ number of values)
- Median = middle value (when data is sorted)
- Mode = most frequent value
Example: Suppose car defects found in 7 inspections = [2, 3, 3, 4, 6, 7, 10]
- Mean = (2+3+3+4+6+7+10)/7 = 5
- Median = 4 (middle value)
- Mode = 3 (most frequent)
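The three values above can be checked with a few lines of Python. This is a minimal sketch using the standard-library `statistics` module; the variable names (`defects`, `center_mean`, and so on) are illustrative choices, not part of the original example.

```python
from statistics import mean, median, mode

# Defects found in 7 inspections (the example data above).
defects = [2, 3, 3, 4, 6, 7, 10]

center_mean = mean(defects)      # sum of values / number of values
center_median = median(defects)  # middle value once the data is sorted
center_mode = mode(defects)      # most frequent value

print(center_mean, center_median, center_mode)  # prints: 5 4 3
```

Note that the mean (5) sits above the median (4) here because the single large value, 10, pulls the average upward.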
2. Measures of Spread
These tell us how spread out the data is.
- Range = max - min
- Variance = average of squared differences from the mean
- Standard Deviation (SD) = square root of variance (roughly how far values typically lie from the mean, in the same units as the data)
In our defects example above [2, 3, 3, 4, 6, 7, 10]:
- Range = 10 - 2 = 8
- Mean = 5
- Variance = (9+4+4+1+1+4+25)/7 = 48/7 ≈ 6.86
- Standard Deviation = √6.86 ≈ 2.6
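The spread figures can be reproduced the same way. The sketch below assumes the "average of squared differences" definition, which divides by the number of values (the population form, `pvariance` and `pstdev` in Python's `statistics` module). The sample form (`stdev`), which divides by one less than the number of values, is shown only for comparison; the variable names are illustrative.

```python
from statistics import pstdev, pvariance, stdev

defects = [2, 3, 3, 4, 6, 7, 10]

data_range = max(defects) - min(defects)  # 10 - 2 = 8

# Population form: squared differences from the mean, divided by n = 7.
pop_variance = pvariance(defects)  # 48 / 7, about 6.86
pop_sd = pstdev(defects)           # square root of the variance, about 2.6

# Sample form: divides by n - 1 = 6 instead, so it comes out slightly larger.
sample_sd = stdev(defects)         # about 2.83

print(data_range, round(pop_variance, 2), round(pop_sd, 1), round(sample_sd, 2))
```

The value of about 2.6 quoted for this example matches the population form; software that defaults to the sample form would report about 2.83 for the same data.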
3. Data Visualization
- Histogram: shows distribution of numerical data (like a bar chart but for ranges)
- Boxplot: shows median, quartiles, and outliers
- Scatterplot: shows relationship between two variables
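To make the first two chart types concrete, here is a rough sketch that computes the numbers a histogram and a boxplot would display for the defects data, using only the Python standard library. The bin width of 3, the 1.5 × IQR outlier rule, and the default quartile method of `statistics.quantiles` are assumptions made for illustration; plotting tools may use different conventions. A scatterplot is left out because it needs a second variable, which this example does not have.

```python
from collections import Counter
from statistics import quantiles

defects = [2, 3, 3, 4, 6, 7, 10]

# Histogram: count how many values fall into each range (bin).
bin_width = 3  # assumed bin width, chosen only for this illustration
bins = Counter((x // bin_width) * bin_width for x in defects)
for start in sorted(bins):
    print(f"{start}-{start + bin_width - 1}: {'#' * bins[start]}")

# Boxplot: the box spans Q1 to Q3 with a line at the median (Q2).
q1, q2, q3 = quantiles(defects, n=4)
iqr = q3 - q1

# A common convention flags points beyond 1.5 * IQR from the box as outliers.
low_fence, high_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr
outliers = [x for x in defects if x < low_fence or x > high_fence]

print(q1, q2, q3, outliers)
```

With these assumptions the bins 0-2, 3-5, 6-8, and 9-11 hold 1, 3, 2, and 1 values, the quartiles are 3, 4, and 7, and no value is flagged as an outlier.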