Clarification of confidence interval of difference of means

Created by Sal Khan.

Want to join the conversation?

  • Tommy
    I am still confused about the word "confident". When he says he is confident that there is a 95% chance, it sounds like "it is very probable that there is a 95% probability". That confuses me, so my question is: is there a 95% probability, or is there not? If it is just very probable (we are confident, but not 100% sure) that there is a 95% chance, how probable is it that our confidence is justified?

    Is there one probability involved here, or two?

    Maybe I am reading too much into the word "confident" and its use. But it would be simpler if he just said "there is a 95% chance," if that actually is the case.
    (11 votes)
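    One way to see what "confident" means: the 95% describes the procedure that produces the interval, not any single interval. A quick simulation sketch (made-up population parameters; the sample size and spreads loosely echo the video's numbers) shows that intervals built this way capture the true difference about 95% of the time:

    import numpy as np

    rng = np.random.default_rng(0)
    mu1, sigma1 = 9.0, 4.67    # made-up "true" population 1
    mu2, sigma2 = 7.0, 4.04    # made-up "true" population 2
    true_diff = mu1 - mu2
    n, z, trials = 100, 1.96, 10_000

    covered = 0
    for _ in range(trials):
        x1 = rng.normal(mu1, sigma1, n)
        x2 = rng.normal(mu2, sigma2, n)
        diff = x1.mean() - x2.mean()
        # standard error of the difference of the sample means
        se = np.sqrt(x1.var(ddof=1) / n + x2.var(ddof=1) / n)
        if diff - z * se <= true_diff <= diff + z * se:
            covered += 1

    print(covered / trials)    # roughly 0.95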
  • Robert Sokota
    Sal uses the given variance for the two samples, 4.67 and 4.04. Why doesn't he calculate a pooled variance from the separate variances and use that to calculate his confidence interval estimate?
    (2 votes)
  • Parthiban Rajendran
    Sal, please, please, please prepare a script and say each sentence just once. It's very annoying to hear it repeated until you finish writing.
    (2 votes)
  • Konstantin
    Say that the difference of the two distributions was either less than 0.7 or greater than 3.12. Would observing either of those two scenarios indicate that something went wrong in the replication of the experiment? That perhaps one is accidentally dealing with new distributions?
    (2 votes)
    • akshara757
      Those cases are part of the other 5% outside the 95% confidence interval. That doesn't mean the calculation was wrong. In fact, those cases were taken into account in the calculation by saying that we are 95% confident, not 100% confident, that the actual difference is between 0.7 and 3.12.
      (2 votes)
  • Pavani Jeyathasan
    Can I say that we are 95% confident that the true mean difference in weight loss lies between 0.7 lbs and 3.12 lbs?
    (2 votes)
    • horus.scope
      Yes, that's what it's essentially saying.
      i.e., if we ran this test 1,000 more times, x1 would vary over some approximate range from run to run, and so would x2, but we expect the difference between the two to land between 0.7 and 3.12 in favor of x1 about 95% of the time.
      It's worth reiterating that this is not just a probability statement but an estimate in the strictest sense, since we infer the population standard deviation from a single sample. If you really did run 1,000 more tests, you would get a better approximation by pooling all of the samples into one combined sample of 100,000 and deriving the standard deviation from that instead.
      (1 vote)
  • lrnt.s.sale
    Hi. In the presentations (e.g., Bernoulli / margin of error) and in the course, the standard deviation of the difference uses the sample standard deviations (as done in this video). However, when computing the standard deviation of a sample, the denominator n-1 (degrees of freedom) is used for accuracy. In this video and the next, the standard deviation of the difference uses the sample standard deviations, but the denominator is n, not n-1. Though there is a high level of confidence (99%) that it is my comprehension that fails somewhere, I would be grateful to understand why the denominator n-1 is not used. Thanks a lot for your great videos, and my best wishes for 2016. Kindest regards, Laurent.
    (2 votes)
    • deka
      There seem to be a few different things mixed together here.

      1. n-1 (degrees of freedom) vs. n-1 (denominator)
      1) n-1 as degrees of freedom is different from n-1 as a denominator (even though their values are the same); they are used in two different contexts.
      2) n-1 as degrees of freedom is for looking up the t-table when doing a hypothesis test (it is not part of the calculation itself).
      3) n-1 as a denominator is for calculating the standard deviation of a sample drawn from a population.
      # but we're doing neither 2) nor 3) here

      2. n (denominator) vs. n-1 (denominator)
      1) Why do we use n rather than n-1 as the denominator here?
      : Because what we ultimately care about is the mean of the population (rather than of one sample), and the thing being divided by sqrt(n) is the spread of the sampling distribution of the sample mean.
      2) Equations:
      std_pop = (computed from every data point in the population, with denominator n)
      # but we don't have the population data
      # so we estimate the std of the sampling distribution of the mean (the std across many sample means) from one sample's std
      std_sampling_dist = std_pop / sqrt(n) ≈ std_sample / sqrt(n) = sqrt( variance_sample / n )
      # let's ignore summing the variances of the two different populations for now
      3) std_sample is calculated with denominator n-1, as we learnt.
      # std_sample isn't std_sampling_dist: the former is the std of one sample (and its data points), while the latter is the std of many sample means.

      3. Why n-1?
      1) The reason for using n-1 rather than n for std_sample is that the deviations are measured from the sample mean rather than the population mean,
      2) so the sum of squared deviations (the numerator above) tends to come out smaller than it would for the population,
      3) and dividing by the slightly smaller number n-1 keeps std_sample from being biased low.

      By the way, thanks for asking this. It may help clear up others' confusion about these concepts too. (A short sketch of the distinction follows just below.)
      (1 vote)
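      To make the distinction above concrete, here is a minimal Python sketch (made-up data; ddof=1 is NumPy's way of dividing by n - 1):

      import numpy as np

      x = np.array([3.1, 7.4, 9.3, 5.0, 6.2, 8.8, 4.5, 7.1])  # one sample (made-up data)
      n = len(x)

      std_sample = x.std(ddof=1)             # sample std: squared deviations divided by n - 1
      se_of_mean = std_sample / np.sqrt(n)   # estimated std of the sampling distribution of the mean
      print(std_sample, se_of_mean)          # the second number is what the interval calculation uses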
  • Εμμανουήλ Απέργης
    I don't really understand why our 0.7 to 3.12 interval tells us that we are 95% confident that we will lose weight, and not that we won't lose weight. Why did Sal pick the first?
    (1 vote)
    • horus.scope
      Because of the worst case: x1 still shows 0.7 more weight loss than x2.
      He's saying that if you ran the test again, there's a 95% chance that x1's WORST case is still some amount higher and x2's BEST case is still some amount lower. Even when x1 is at its worst and x2 is at its best (within roughly two estimated standard deviations, at least), x1 is still better.
      (1 vote)
  • Vijay Jayaraman
    I just don't know why he said that there is a 95% chance that 1.91 is within 1.96 sigma of the mu. It has been bothering me since yesterday. Someone please explain.
    (1 vote)

Video transcript

Near the end of the last video, I wasn't as articulate as I would have liked to be, mainly because I think 15 minutes into a video my brain really starts to warm up too much. So what I want to do is restate what I was trying to say. We got this confidence interval, and I'll rewrite it here: the 95% confidence interval for the mean of this distribution is 1.91 plus or minus 1.21.

Near the end of the video I tried to explain why that is neat, because here we have a confidence interval for this weird quantity, the mean of the difference of the sampling means, and that can seem confusing. But as we saw two or three videos ago, this thing right over here, the mean of the difference of the sampling means, is the same thing as the difference of the means of the sampling distributions. And we know that the mean of each sampling distribution is the same as the mean of the corresponding population distribution. So this is the same thing as the mean of Population One minus the mean of Population Two.

And this was the neat result of the last video. This isn't just a 95% confidence interval for this parameter right here; it's actually a 95% confidence interval for this parameter right here, and that is the parameter we really care about: the true difference in weight loss between going on the low-fat diet and not going on the low-fat diet. We have a 95% confidence interval that that difference is between 0.7 and 3.12 pounds, which tells us, with 95% confidence, that you're going to lose some weight. We're not 100% sure; we're confident that there's a 95% probability of that. Anyway, hopefully that clarifies it a little bit. I didn't want to leave you confused by the bungled language I had at the end of the last video.
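For reference, the interval quoted above can be reproduced with a short calculation. This is a sketch that assumes 4.67 and 4.04 are the sample standard deviations and that each group has 100 people; under those assumptions it gives back the 1.91 plus or minus 1.21 figure in the transcript:

import numpy as np

mean_diff = 1.91        # difference of the two sample means (from the video)
s1, s2 = 4.67, 4.04     # sample standard deviations (assumed values)
n1 = n2 = 100           # assumed sample size per group

se = np.sqrt(s1**2 / n1 + s2**2 / n2)   # standard error of the difference of sample means
margin = 1.96 * se                      # 95% interval using the z critical value

print(mean_diff - margin, mean_diff + margin)   # roughly 0.70 to 3.12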