From Data to Decisions: A Beginner’s Guide to Statistical Inference

By Janmajay Kumar | October 2024

Real-World Questions That Need Statistical Inference

Statistical inference is essential when we need to make sense of real-world data. Often, we observe differences or trends, but how do we know whether those observations are meaningful or simply due to chance? Let’s look at seven common real-life questions across different fields, each requiring statistical inference to find answers.

  1. Social Science: Does Education Affect Income?

    • Question: Do people with a college degree earn more than those without one?

    • Example: In a survey of 100 people, 50 college graduates have an average income of $60,000, while the average income for 50 people without a degree is $45,000.

    • Objective: We want to determine if the observed income difference is real or simply a result of random variation in the sample.

  2. Business: Does Customer Happiness Drive Sales?

    • Question: Do happier customers spend more money?

    • Example: Out of 200 surveyed customers, those who spent more than $500 rated their satisfaction at 8.5 out of 10, while those who spent less gave a rating of 7.2.

    • Objective: The goal is to find out whether customer satisfaction truly influences how much people spend, or if the difference in ratings is just coincidental.

  3. Finance: Is the Stock Market More Unstable During Recessions?

    • Question: Is the stock market more volatile during economic downturns?

    • Example: During a recession, daily stock price changes averaged 2.8%, while in stable economic times, the changes averaged 1.5%.

    • Objective: We want to figure out if the stock market genuinely becomes more unstable during recessions or if the observed volatility could just be random.

  4. Physics: Does Gravity Vary in Different Locations?

    • Question: Is the gravitational force slightly different at two different locations on Earth?

    • Example: At one location, gravity was measured 10 times and averaged 9.81 m/s²; at a second location, repeated measurements averaged 9.78 m/s².

    • Objective: We aim to see if this small difference in gravity is significant or just due to random measurement variation.

  5. Biology: Is the New Drug Better for Lowering Blood Pressure?

    • Question: Does a new drug reduce blood pressure more effectively than the current treatment?

    • Example: In a clinical trial, 50 patients taking the new drug had an average blood pressure drop of 12 mmHg, while 50 patients on the existing drug experienced a 9 mmHg drop.

    • Objective: We’re interested in determining if the new drug is truly more effective at lowering blood pressure or if the observed difference is just a coincidence.

  6. Sports: Does Practicing More Improve Performance?

    • Question: Do athletes who practice more hours each week perform better in competitions?

    • Example: We tracked the weekly practice hours of 50 soccer players. Players who practiced more than 10 hours per week scored an average of 15 goals in the season, while those practicing fewer than 10 hours scored 10 goals on average.

    • Objective: We want to determine if practicing more hours genuinely improves performance or if the difference in goal-scoring could have occurred by chance.

  7. Sports: Does Team A Have a Home Advantage?

    • Question: Do teams win more often when they play at home compared to playing away?

    • Example: Over a season, Team A won 70% of its home games but only 40% of its away games.

    • Objective: We’re interested in figuring out if the "home advantage" is real or if the difference in winning percentage could be just random variation.

Introduction: What is Statistical Inference?

Statistical inference is the process by which we use data from a sample to make generalizations or conclusions about a larger population. The core idea is that, in many situations, it is impractical or impossible to collect data from every individual in a population. Instead, we gather data from a smaller, manageable subset of the population, known as a sample, and use that information to make educated guesses or inferences about the entire population.

Statistical inference helps answer questions like: "Is this observed effect real, or could it have occurred by chance?" or "How confident can we be in the results from our sample?" Two primary tools used in statistical inference are estimation and hypothesis testing.

Let’s look at two examples to clarify how statistical inference works:

Example 1: Estimating Average Household Income

Suppose a city wants to know the average household income of its residents. Instead of surveying every household, which could be expensive and time-consuming, they survey 500 randomly selected households. From this sample, they find that the average income is $60,000. But the question remains: does this sample accurately reflect the income of all households in the city?

Statistical inference allows us to estimate the true average income of the city’s households based on this sample. By using techniques such as confidence intervals, we can say something like: "We are 95% confident that the true average household income in this city is between $58,000 and $62,000."
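To make this concrete, here is a minimal Python sketch that computes a 95% confidence interval for a mean. The income figures are simulated stand-ins for the real survey data, so treat the specific numbers as assumptions for illustration only:

```python
import numpy as np
from scipy import stats

# Simulated stand-in for 500 surveyed household incomes (in dollars).
rng = np.random.default_rng(42)
incomes = rng.normal(loc=60_000, scale=22_000, size=500)

n = len(incomes)
mean = incomes.mean()
sem = stats.sem(incomes)  # standard error of the mean: s / sqrt(n)

# 95% confidence interval for the population mean, using the t-distribution.
low, high = stats.t.interval(0.95, df=n - 1, loc=mean, scale=sem)
print(f"Sample mean: ${mean:,.0f}")
print(f"95% CI: (${low:,.0f}, ${high:,.0f})")
```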

Example 2: Testing the Effectiveness of a New Drug

Imagine that a pharmaceutical company is testing a new drug that aims to reduce blood pressure. They conduct a clinical trial with 100 participants: 50 take the new drug, and 50 take the standard treatment. After several weeks, they observe that the group taking the new drug has an average reduction in blood pressure of 12 mmHg, while the standard treatment group has an average reduction of 9 mmHg.

Is this difference in blood pressure reduction (3 mmHg) meaningful? Or could it just be due to random chance? Statistical inference allows the researchers to test whether the observed difference is statistically significant. Through hypothesis testing, they can infer whether the new drug is likely to be more effective than the standard treatment for the population at large, or if the observed difference is too small to conclude anything definitively.
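A sketch of how such a test might look in code, with simulated patient data (the 5 mmHg standard deviation is an assumption, since the trial’s raw measurements are not given here):

```python
import numpy as np
from scipy import stats

# Simulated stand-in for the trial: 50 patients per arm, reductions in mmHg.
rng = np.random.default_rng(0)
new_drug = rng.normal(loc=12, scale=5, size=50)
standard = rng.normal(loc=9, scale=5, size=50)

# Welch's two-sample t-test: is the new drug's mean reduction larger?
t_stat, p_value = stats.ttest_ind(new_drug, standard,
                                  equal_var=False, alternative="greater")
print(f"t = {t_stat:.2f}, one-sided p = {p_value:.4f}")
# A small p-value suggests the observed difference is unlikely if the
# two treatments were truly equivalent.
```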

What is to be Inferred?

In statistics, what we infer are characteristics of a population, called parameters, using the corresponding quantities computed from a sample. The most common targets are the population mean (the typical value, such as the average household income in a city), the population proportion (the share with some attribute, such as the fraction of voters supporting a candidate), the mode (the most common value, such as the best-selling shoe size), and the standard deviation (how spread out the values are, such as how much test scores vary around their average). Just as often, we infer comparisons, for example whether the average blood pressure reduction differs between two treatments. In every case the pattern is the same: compute a statistic from the sample, and treat it as an educated estimate of the unknown population parameter.
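As a tiny sketch of the idea (with a made-up sample of ten incomes), each sample statistic below serves as an estimate of the corresponding population parameter:

```python
import numpy as np
from statistics import mode

# Hypothetical sample of 10 household incomes (in thousands of dollars).
incomes = np.array([42, 55, 61, 48, 55, 70, 39, 58, 55, 64])

print("Sample mean (estimates population mean):", incomes.mean())
print("Sample proportion above 50k (estimates population proportion):",
      (incomes > 50).mean())
print("Sample mode (estimates population mode):", mode(incomes))
print("Sample std dev (estimates population std dev):", incomes.std(ddof=1))
```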

How to Infer?

The process of inference starts with sampling: because we can rarely measure everyone, we draw a sample, and how we draw it matters enormously. A random, representative sample lets us generalize to the population; a biased sample does not, no matter how large it is. Two foundational results justify the leap from sample to population. The Law of Large Numbers says that as a sample grows, its average settles toward the true population average. The Central Limit Theorem says that, for reasonably large samples, the sample mean follows an approximately normal (bell-shaped) distribution regardless of the shape of the underlying population, which is what lets us attach probabilities to our estimates.

Three distributions appear again and again, and each connects to a method of inference: the normal distribution underlies large-sample tests and intervals for means and proportions; the t-distribution handles small samples where the population spread must be estimated, as in t-tests comparing two means; and the chi-square distribution underlies tests on categorical data, such as the chi-square test of independence. Conceptually, every hypothesis test asks the same question: if there were truly no effect, how surprising would our data be? The short simulation below illustrates the Central Limit Theorem, the engine behind most of these tests.
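A minimal simulation sketch of the Central Limit Theorem: even though individual draws from an exponential distribution are strongly skewed, the means of repeated samples pile up in a bell shape around the true mean.

```python
import numpy as np

# Sample means from a skewed population (exponential with true mean 1.0).
rng = np.random.default_rng(1)
sample_means = np.array(
    [rng.exponential(scale=1.0, size=50).mean() for _ in range(10_000)]
)

print("Mean of sample means:", sample_means.mean().round(3))  # ~1.0 (LLN)
print("Std of sample means:", sample_means.std().round(3))    # ~1/sqrt(50) ≈ 0.141
# A histogram of sample_means would look approximately normal (CLT),
# even though the underlying population is far from bell-shaped.
```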

How Confident Are We About Our Inference?

Once we have an estimate or a test result, we need to say how much to trust it. A confidence interval turns a single estimate into a range: rather than reporting only "the average income is $60,000," we report something like "$58,000 to $62,000," produced by a procedure that captures the true value in a stated fraction (say, 95%) of repeated samples. Hypothesis tests carry their own uncertainty, described by two kinds of error. A Type I error (false positive) means rejecting a null hypothesis that is actually true, such as declaring a drug effective when it is not; the significance level α is the rate of this error we are willing to tolerate. A Type II error (false negative) means failing to detect an effect that really exists; larger samples and larger true effects make it less likely. The simulation below shows the Type I error rate emerging in practice.
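A sketch of the Type I error rate: when the null hypothesis is actually true (both groups drawn from the same population), a test at α = 0.05 still "rejects" about 5% of the time, purely by chance.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
alpha, n_trials, rejections = 0.05, 10_000, 0

for _ in range(n_trials):
    a = rng.normal(loc=0, scale=1, size=30)
    b = rng.normal(loc=0, scale=1, size=30)  # same distribution: no real effect
    _, p = stats.ttest_ind(a, b)
    if p <= alpha:
        rejections += 1

print("False positive rate:", rejections / n_trials)  # close to 0.05
```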

Common Pitfalls

Statistical inference can be tricky for newcomers, and it’s easy to fall into a few common pitfalls. Let’s highlight some of these mistakes and misconceptions to help you avoid them:

  1. Confusing Correlation with Causation One of the most frequent errors in data analysis is assuming that correlation implies causation. Just because two variables move together doesn’t mean one causes the other. For example, if sales of ice cream increase during hot weather, it doesn’t mean that ice cream causes hot weather! Always be cautious about interpreting relationships — correlation shows association, not causation.

  2. Misunderstanding the p-value Many beginners believe that a p-value is the probability that the null hypothesis is true. This is incorrect. The p-value measures how likely it is to observe your data (or something more extreme) if the null hypothesis were true. For example, a p-value of 0.03 doesn’t mean there’s a 3% chance the null hypothesis is true — it means that, if the null hypothesis were true, there’s a 3% chance of seeing data as extreme as what you observed.

  3. Relying Too Much on Small Sample Sizes

    Small samples can lead to misleading conclusions. A common mistake is thinking that a small sample will always represent the population accurately. In reality, smaller samples are much more likely to show random variations that don’t reflect the true population characteristics. Larger sample sizes generally provide more reliable and stable results.

  4. Ignoring Assumptions of Statistical Tests

    Many statistical tests have underlying assumptions — such as normality of data, equal variances, or independence of observations. Ignoring these assumptions can lead to inaccurate conclusions. For example, using a t-test when your data is not normally distributed or has outliers can distort your results. Always check that your data meets the assumptions of the test you’re using.

  5. Over-Interpreting Confidence Intervals

    Confidence intervals are useful tools, but they can be misinterpreted. A common mistake is to think that a 95% confidence interval means there’s a 95% chance the true parameter lies within the interval. In fact, it means that if you were to repeat the sampling process many times, 95% of those intervals would contain the true parameter — but for any given interval, the parameter either is or isn’t within it. The simulation sketch just after this list makes the distinction concrete.

  6. Cherry-Picking Data

    It’s tempting to choose only the data that supports your hypothesis or business goal. However, cherry-picking data or ignoring contradictory information can lead to biased conclusions and flawed decisions. It’s important to take a holistic view of all the data and remain objective.

  7. Misusing Statistical Significance

    Just because something is statistically significant doesn’t necessarily mean it’s practically important. A result might show statistical significance (e.g., a p-value < 0.05), but the actual effect size could be so small that it’s irrelevant in the real world. Always consider the magnitude of the effect alongside statistical significance.
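As promised under pitfall 5, here is a minimal coverage simulation (assumed normal data with a known true mean of 100): it repeats the sampling process many times and counts how often the 95% interval captures the true mean.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
true_mean, n, trials, covered = 100.0, 50, 10_000, 0

for _ in range(trials):
    sample = rng.normal(loc=true_mean, scale=15, size=n)
    sem = stats.sem(sample)
    low, high = stats.t.interval(0.95, df=n - 1, loc=sample.mean(), scale=sem)
    if low <= true_mean <= high:
        covered += 1

print("Coverage:", covered / trials)  # close to 0.95
# Any single interval either contains 100.0 or it doesn't; the "95%"
# describes the long-run behavior of the procedure, not one interval.
```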

Steps from Data to Decision

In practice, statistical inference follows a series of structured steps that take us from raw data collection to data-driven decisions. Below is an outline of these steps, presented as an algorithm for conducting statistical inference; a compact code sketch of the whole pipeline follows the list.

Algorithm: Steps for Statistical Inference in Practice

  1. Define the Problem and Research Question

    • Clearly articulate the problem you are trying to solve or the question you are investigating.

    • Example: "Is the new drug more effective at lowering blood pressure than the standard treatment?" or "Do college graduates earn more than non-graduates?"

  2. Collect a Representative Sample of Data

    • Design the data collection process carefully, ensuring the sample is random and representative of the population. The quality of the sample is crucial for making valid inferences.

    • Example: Survey 500 randomly selected households for income data, or conduct a controlled clinical trial for the new drug.

  3. Summarize and Explore the Data

    • Organize the data in tables, graphs, and descriptive statistics (mean, median, standard deviation, etc.).

    • Explore patterns, trends, or anomalies in the data to get a preliminary understanding.

    • Example: For income data, calculate the mean and median incomes of the sample and look for any outliers or extreme values.

  4. Formulate the Hypothesis

    • Establish the null hypothesis (H0) and the alternative hypothesis (H1) based on the problem.

      • H0: There is no effect or difference.

      • H1: There is an effect or a difference.

    • Example: For the new drug:

      • H0: The new drug’s effect is the same as the standard treatment.

      • H1: The new drug is more effective.

  5. Choose the Appropriate Statistical Test

    • Select the correct test based on the type of data and the research question. Common tests include:

      • t-test for comparing means.

      • Chi-square test for categorical data.

      • ANOVA for comparing multiple groups.

      • Regression analysis for predicting relationships between variables.

    • Example: For comparing blood pressure reductions between two drugs, you might use a t-test for independent samples.

  6. Check Assumptions

    • Verify the assumptions behind the statistical test (e.g., normal distribution of data, equal variances). If these assumptions do not hold, consider alternative methods or data transformations.

    • Example: Check whether blood pressure reductions are normally distributed and if variances are similar between the two groups.

  7. Compute the Test Statistic and p-Value

    • Perform the statistical test to calculate the test statistic (e.g., t-value, F-value, etc.).

    • Calculate the p-value, which measures the probability of observing an effect at least as extreme as the one in your data, assuming the null hypothesis is true.

    • Example: Compute the t-value for the difference in average blood pressure reduction, and find the associated p-value.

  8. Make the Decision

    • Compare the p-value to a predefined significance level (usually α = 0.05):

      • If p ≤ α, reject the null hypothesis (H0) and conclude that there is a statistically significant effect.

      • If p > α, fail to reject the null hypothesis and conclude that the observed effect may have occurred by chance.

    • Example: If the p-value for the blood pressure test is less than 0.05, conclude that the new drug is statistically more effective than the standard treatment.

  9. Quantify the Uncertainty (Confidence Interval)

    • Calculate the confidence interval (CI) to quantify the uncertainty around your estimate. A 95% confidence interval is produced by a procedure that captures the true parameter in 95% of repeated samples, so it gives a plausible range for the parameter.

    • Example: If the difference in blood pressure reduction is estimated to be 3 mmHg, the 95% confidence interval might be [1.5, 4.5] mmHg.

  10. Draw Conclusions and Make Decisions

    • Use the results of the statistical test and the confidence intervals to make data-driven decisions or recommendations.

    • Example: If the new drug shows statistically significant and clinically meaningful improvements in lowering blood pressure, consider recommending it over the standard treatment.

  11. Report and Communicate Findings

    • Present the results in a clear and understandable format, including key statistics, test results, and confidence intervals.

    • Explain the implications of the results for the decision-making process.

    • Example: Present findings to a medical board or company leadership, recommending the new drug based on the evidence.
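Here is an end-to-end sketch of steps 2 through 9 for the drug example. The data is simulated (the 5 mmHg standard deviation is an assumption), and scipy’s Welch t-test stands in for the generic "appropriate test":

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)
alpha = 0.05

# Steps 2-3: collect and summarize (simulated stand-in for real trial data).
new_drug = rng.normal(loc=12, scale=5, size=50)  # blood pressure drop, mmHg
standard = rng.normal(loc=9, scale=5, size=50)
print("Means:", new_drug.mean().round(1), standard.mean().round(1))

# Steps 4-7: H0 "no difference" vs H1 "new drug better"; Welch t-test.
t_stat, p_value = stats.ttest_ind(new_drug, standard,
                                  equal_var=False, alternative="greater")

# Step 8: decision.
print(f"t = {t_stat:.2f}, p = {p_value:.4f} ->",
      "reject H0" if p_value <= alpha else "fail to reject H0")

# Step 9: 95% CI for the difference in means (Welch SE and degrees of freedom).
diff = new_drug.mean() - standard.mean()
v1, v2 = new_drug.var(ddof=1) / 50, standard.var(ddof=1) / 50
se = np.sqrt(v1 + v2)
df = (v1 + v2) ** 2 / (v1**2 / 49 + v2**2 / 49)
low, high = stats.t.interval(0.95, df=df, loc=diff, scale=se)
print(f"95% CI for the difference: [{low:.1f}, {high:.1f}] mmHg")
```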

Problem and Stepwise Solution

Problem: Does the New Weight Loss Program Work?

Suppose a fitness company has introduced a new weight loss program and claims that participants lose more weight, on average, than with the current program. To test this claim, the company conducts an experiment with 40 people: 20 use the new program, and 20 use the current program. After 8 weeks, the results are as follows:

Program            Mean Weight Loss (kg)   Standard Deviation (kg)   Sample Size
New program                 7                        2.0                  20
Current program             5                        1.5                  20

Objective: We want to determine whether the difference in weight loss between the two programs is statistically significant or if it could have happened by chance.

Step-by-Step Solution

Step 1: State the Hypotheses

We establish the null and alternative hypotheses:

  • H0: The new program produces the same mean weight loss as the current program.

  • H1: The new program produces greater mean weight loss than the current program.

Step 2: Choose the Significance Level (α)

We choose a significance level of α = 0.05.

Step 3: Gather the Data

From the table above: $\bar{X}_{\text{new}} = 7$ kg, $S_{\text{new}} = 2$ kg, $n_{\text{new}} = 20$ for the new program, and $\bar{X}_{\text{current}} = 5$ kg, $S_{\text{current}} = 1.5$ kg, $n_{\text{current}} = 20$ for the current program.

Step 4: Perform the Calculation

1. Calculate the Standard Error (SE) for the difference between means:

$$SE = \sqrt{\left(\frac{S_{\text{new}}^2}{n_{\text{new}}}\right) + \left(\frac{S_{\text{current}}^2}{n_{\text{current}}}\right)}$$

Substitute the values: $$SE = \sqrt{\left(\frac{2^2}{20}\right) + \left(\frac{1.5^2}{20}\right)} = \sqrt{\left(\frac{4}{20}\right) + \left(\frac{2.25}{20}\right)} = \sqrt{0.2 + 0.1125} = \sqrt{0.3125} = 0.559$$

2. Compute the Test Statistic (t-value):

The t-value is calculated as: $$t = \frac{(\bar{X}_{\text{new}} - \bar{X}_{\text{current}})}{SE}$$

Substitute the values: $$t = \frac{(7 - 5)}{0.559} = \frac{2}{0.559} = 3.58$$

Step 5: Determine the p-value

Using a t-distribution table or calculator, the one-tailed p-value corresponding to t = 3.58 with 38 degrees of freedom is approximately: p = 0.0004

Step 6: Make the Decision

Compare the p-value to the significance level α = 0.05:

In this case, p = 0.0004, which is less than α = 0.05, so we reject the null hypothesis. This suggests that the new weight loss program leads to significantly greater weight loss than the current program.

Step 7: Conclusion

Based on the statistical test, the new weight loss program leads to more weight loss than the current program. With a t-value of 3.58 and a p-value of 0.0004, we can confidently conclude that the observed difference in weight loss is statistically significant and not due to random chance.
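As a quick check, scipy can reproduce this t-statistic directly from the summary statistics. A sketch: note that scipy’s unequal-variance (Welch) option computes its own degrees of freedom, about 35 here rather than the 38 used above, so the p-value differs slightly.

```python
from scipy import stats

# Verify the weight-loss t-test from summary statistics alone.
result = stats.ttest_ind_from_stats(
    mean1=7.0, std1=2.0, nobs1=20,   # new program
    mean2=5.0, std2=1.5, nobs2=20,   # current program
    equal_var=False,                 # matches the unpooled SE used above
    alternative="greater",           # one-sided: new program better
)
print(f"t = {result.statistic:.2f}, p = {result.pvalue:.4f}")  # t ≈ 3.58
```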

Problem: Two-Population Testing (Independent t-test)

A school is testing two different teaching methods to determine which one leads to better student performance. The school randomly assigns 30 students to each method. After 12 weeks, the results are as follows:

Method     Mean Test Score   Standard Deviation   Sample Size
Method A         85                  5                 30
Method B         80                  6                 30

Objective: Determine whether the difference in test scores between the two methods is statistically significant.

Step-by-Step Solution

Step 1: State the Hypotheses

  • H0: The two teaching methods produce the same mean test score.

  • H1: Method A produces a higher mean test score than Method B.

Step 2: Choose the Significance Level (α)

We choose a significance level of α = 0.05.

Step 3: Gather the Data

From the table above: $\bar{X}_A = 85$, $S_A = 5$, $n_A = 30$ for Method A, and $\bar{X}_B = 80$, $S_B = 6$, $n_B = 30$ for Method B.

Step 4: Conduct the Independent Two-Sample t-test

We use the following formula to compute the t-statistic for independent samples:

$$t = \frac{\bar{X}_A - \bar{X}_B}{\sqrt{\frac{S_A^2}{n_A} + \frac{S_B^2}{n_B}}}$$

Substituting the values: $$t = \frac{85 - 80}{\sqrt{\frac{5^2}{30} + \frac{6^2}{30}}} = \frac{5}{\sqrt{\frac{25}{30} + \frac{36}{30}}} = \frac{5}{\sqrt{0.833 + 1.2}} = \frac{5}{\sqrt{2.033}} = \frac{5}{1.426} \approx 3.51$$

Step 5: Determine the Degrees of Freedom (df)

The degrees of freedom for an independent t-test are approximated using the following formula: $$df = \frac{\left( \frac{S_A^2}{n_A} + \frac{S_B^2}{n_B} \right)^2}{\frac{\left( \frac{S_A^2}{n_A} \right)^2}{n_A - 1} + \frac{\left( \frac{S_B^2}{n_B} \right)^2}{n_B - 1}}$$ Substituting the values: $$df = \frac{(0.833 + 1.2)^2}{\frac{(0.833)^2}{29} + \frac{(1.2)^2}{29}} = \frac{(2.033)^2}{\frac{0.694}{29} + \frac{1.44}{29}} = \frac{4.133}{\frac{0.694 + 1.44}{29}} = \frac{4.133}{0.0735} \approx 56.22$$ So the degrees of freedom are approximately df = 56.

Step 6: Compare the t-Statistic to the Critical Value

Using a t-distribution table, the critical value for a one-tailed test at α = 0.05 and df = 56 is approximately tcritical = 1.67.

Since the computed t = 3.51 is greater than tcritical = 1.67, we reject the null hypothesis.

Step 7: Conclusion

The test provides sufficient evidence to conclude that Method A leads to significantly higher test scores than Method B.
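A short sketch of the same comparison in code, computing the Welch t-statistic, its degrees of freedom, and the one-tailed critical value with scipy:

```python
import numpy as np
from scipy import stats

# Summary statistics for the two teaching methods.
mean_a, s_a, n_a = 85.0, 5.0, 30
mean_b, s_b, n_b = 80.0, 6.0, 30

se2_a, se2_b = s_a**2 / n_a, s_b**2 / n_b
t = (mean_a - mean_b) / np.sqrt(se2_a + se2_b)

# Welch-Satterthwaite degrees of freedom.
df = (se2_a + se2_b) ** 2 / (se2_a**2 / (n_a - 1) + se2_b**2 / (n_b - 1))

t_critical = stats.t.ppf(0.95, df)  # one-tailed, alpha = 0.05
print(f"t = {t:.2f}, df = {df:.1f}, critical value = {t_critical:.2f}")
print("Reject H0" if t > t_critical else "Fail to reject H0")  # t ≈ 3.51 > 1.67
```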

Problem: Chi-Square Test for Independence

Problem: Is There a Relationship Between Exercise and Sleep Quality?
A health study examines whether exercise frequency is related to sleep quality. The data is collected from 100 individuals and presented in the following contingency table:

Contingency Table for Exercise and Sleep Quality
                       Good Sleep Quality   Poor Sleep Quality   Total
Exercises                      30                   10             40
Does Not Exercise              20                   40             60
Total                          50                   50            100

Objective: Determine whether there is a relationship between exercise and sleep quality.

Step-by-Step Solution

Step 1: State the Hypotheses

  • H0: Exercise frequency and sleep quality are independent (no relationship).

  • H1: Exercise frequency and sleep quality are related.

Step 2: Choose the Significance Level (α)
We choose α = 0.05.

Step 3: Calculate Expected Frequencies

First, we calculate the expected frequencies based on the marginal totals in the contingency table. The formula to compute the expected frequency for each cell is: $$E_{ij} = \frac{(\text{Row Total}) \times (\text{Column Total})}{\text{Grand Total}}$$

The expected frequencies are as follows:

Expected Frequencies for Exercise and Sleep Quality
                                Good Sleep Quality   Poor Sleep Quality   Total
Exercises (Expected)                    20                   20             40
Does Not Exercise (Expected)            30                   30             60
Total                                   50                   50            100

Step 4: Conduct the Chi-Square Test
Using the observed and expected frequencies, we calculate the chi-square statistic using the formula: $$\chi^2 = \sum \frac{(O_i - E_i)^2}{E_i}$$

Observed Values: O = {30, 10, 20, 40}

Expected Values: E = {20, 20, 30, 30}

Now, calculate χ²: $$\chi^2 = \frac{(30 - 20)^2}{20} + \frac{(10 - 20)^2}{20} + \frac{(20 - 30)^2}{30} + \frac{(40 - 30)^2}{30}$$ $$\chi^2 = \frac{(10)^2}{20} + \frac{(-10)^2}{20} + \frac{(-10)^2}{30} + \frac{(10)^2}{30}$$ $$\chi^2 = \frac{100}{20} + \frac{100}{20} + \frac{100}{30} + \frac{100}{30} = 5 + 5 + 3.33 + 3.33 = 16.66$$

Step 5: Compare the Chi-Square Statistic to the Critical Value

The table has 2 rows and 2 columns, so we have df = (r−1)(c−1) = (2−1)(2−1) = 1 degree of freedom.

From the chi-square distribution table, the critical value for α = 0.05 and 1 degree of freedom is 3.84.

Since χ² = 16.66 is greater than 3.84, we reject the null hypothesis.

Step 6: Conclusion

There is sufficient evidence to conclude that exercise frequency is related to sleep quality.
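scipy can reproduce the whole computation from the observed table. A sketch: Yates’ continuity correction, which scipy applies to 2×2 tables by default, is disabled here to match the hand calculation above.

```python
import numpy as np
from scipy import stats

# Observed table: rows = exercises / does not, cols = good / poor sleep.
observed = np.array([[30, 10],
                     [20, 40]])

# correction=False matches the uncorrected hand calculation above.
chi2, p, df, expected = stats.chi2_contingency(observed, correction=False)
print(f"chi2 = {chi2:.2f}, df = {df}, p = {p:.5f}")
print("Expected frequencies:\n", expected)  # [[20, 20], [30, 30]]
```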

Problem: One-Way ANOVA

Problem: Do Different Fertilizers Affect Plant Growth?
A botanist tests whether three different fertilizers lead to different levels of plant growth. She applies Fertilizer A, B, and C to three groups of plants (3 plants each). The growth (in cm) after one month is as follows:

Fertilizer A: 10, 12, 11
Fertilizer B: 14, 15, 16
Fertilizer C: 9, 8, 10

Objective: Determine whether there is a significant difference in plant growth between the three fertilizers.

Step-by-Step Solution

Step 1: State the Hypotheses

  • H0: All three fertilizers produce the same mean plant growth.

  • H1: At least one fertilizer produces a different mean plant growth.

Step 2: Choose the Significance Level (α)
We choose α = 0.05.

Step 3: Conduct the ANOVA

In one-way ANOVA, we decompose the total variation into two components: the between-group variation (how far each group’s mean is from the overall mean) and the within-group variation (how much individual plants vary around their own group’s mean).

Step 3.1: Compute the Group Means and the Overall Mean

$$\text{Group means:} \quad \bar{X}_A = \frac{10 + 12 + 11}{3} = 11, \quad \bar{X}_B = \frac{14 + 15 + 16}{3} = 15, \quad \bar{X}_C = \frac{9 + 8 + 10}{3} = 9$$ $$\text{Overall mean:} \quad \bar{X} = \frac{11 + 15 + 9}{3} = 11.67$$

Step 3.2: Compute the Between-Group Sum of Squares (SSB)

The formula for the between-group sum of squares is: $$SSB = n_A(\bar{X}_A - \bar{X})^2 + n_B(\bar{X}_B - \bar{X})^2 + n_C(\bar{X}_C - \bar{X})^2$$ where nA = nB = nC = 3 (each group has 3 observations). $$SSB = 3(11 - 11.67)^2 + 3(15 - 11.67)^2 + 3(9 - 11.67)^2 = 3(0.4489) + 3(11.0889) + 3(7.1289)$$ $$SSB = 1.3467 + 33.2667 + 21.3867 \approx 56$$

Step 3.3: Compute the Within-Group Sum of Squares (SSW)

The formula for the within-group sum of squares is: $$SSW = \sum (\text{Individual value} - \text{Group mean})^2$$

For Fertilizer A: $$SSW_A = (10 - 11)^2 + (12 - 11)^2 + (11 - 11)^2 = 1 + 1 + 0 = 2$$

For Fertilizer B: $$SSW_B = (14 - 15)^2 + (15 - 15)^2 + (16 - 15)^2 = 1 + 0 + 1 = 2$$

For Fertilizer C: $$SSW_C = (9 - 9)^2 + (8 - 9)^2 + (10 - 9)^2 = 0 + 1 + 1 = 2$$

$$SSW = SSW_A + SSW_B + SSW_C = 2 + 2 + 2 = 6$$

Step 3.4: Compute the Degrees of Freedom

Degrees of freedom between groups: $df_B = k - 1 = 3 - 1 = 2$

Degrees of freedom within groups: $df_W = N - k = 9 - 3 = 6$

Step 3.5: Compute the Mean Squares

$$\text{Mean square between groups (MSB)} = \frac{SSB}{dfB} = \frac{56}{2} = 28$$ $$\text{Mean square within groups (MSW)} = \frac{SSW}{dfW} = \frac{6}{6} = 1$$

Step 3.6: Compute the F-Statistic

$$F = \frac{MSB}{MSW} = \frac{28}{1} = 28$$

Step 3.7: Compare the F-Statistic to the Critical Value

We compare the calculated F-statistic to the critical value of F for dfB = 2 and dfW = 6 at α = 0.05. From the F-distribution table, the critical value is 5.14. Since F = 28 > 5.14, we reject the null hypothesis.

Conclusion: There is sufficient evidence to conclude that at least one fertilizer leads to significantly different plant growth.
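The same ANOVA in code, as a sketch using scipy’s one-way ANOVA on the raw growth measurements:

```python
from scipy import stats

# Raw growth measurements (cm) for each fertilizer group.
fert_a = [10, 12, 11]
fert_b = [14, 15, 16]
fert_c = [9, 8, 10]

f_stat, p_value = stats.f_oneway(fert_a, fert_b, fert_c)
print(f"F = {f_stat:.1f}, p = {p_value:.4f}")  # F = 28.0, p well below 0.05

# With F far above the critical value of 5.14 (df = 2 and 6, alpha = 0.05),
# we reject H0: at least one fertilizer's mean growth differs.
```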