Schoolgirls are smokers, drinkers
This article presents the findings of a survey of 1400 girls aged 11 or 12, using percentages and relative likelihoods. One must assume that the final statement should read "3.8 times more likely" rather than "3.8 per cent more likely", as the latter would equate to only 1.038 times as likely and would probably not be considered worth reporting.
Many conditional probabilities can be found from information in this article if a few assumptions are made. For example:
Pr (Smoke | friends smoke) = 7 Pr (Smoke | friends don't smoke)
Pr (Smoke | parents smoke) = 2.3 Pr (Smoke | parents don't smoke)
Pr (Smoke | low literacy) = 5 Pr (Smoke | high literacy)
[Here Pr (A | B) is the probability of event A occurring given the occurrence of event B].
The information that 26% had smoked in their lives and 6% had drunk a glass of alcohol in the previous month can be combined with the information that girls who had smoked (assuming this refers to the 26%) were 4.9 times more likely to have drunk alcohol in the last month, and with the survey size of 1400, to create a two-way table accounting for all of the girls. To create the table, fill in the row and column totals first, then use the ratio to split the "smoked" column into the numbers who drank and did not drink (4.9 X + X = 364).
                Smoked   Didn't smoke   Totals
Drank               62             22       84
Didn't drink       302           1014     1316
Totals             364           1036     1400
This is a good approximation given that the percentages (26% and 6%) probably involved rounding. Notice the difference in the conditional probabilities available from the table:
Pr (drank | smoked) = 62/364 = .17 while
Pr (smoked | drank) = 62/84 = .74
Also Pr (drank and smoked) = 62/1400 = .04.
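The reconstruction and the three probabilities above can be checked with a short Python sketch. It follows the text's own recipe (totals first, then 4.9 X + X = 364 for the "smoked" column, with X taken as the number who drank). The function name build_table and its parameters are illustrative only and do not come from the article.

```python
def build_table(n, smoked_share, drank_share, ratio):
    """Rebuild the two-way table using the recipe described in the text.

    Assumption: the "smoked" column is split so that ratio * X + X equals
    the column total, where X is the number of smokers who drank.
    """
    smoked = round(n * smoked_share)             # column total
    drank = round(n * drank_share)               # row total
    smoked_drank = round(smoked / (ratio + 1))   # X in ratio * X + X = smoked
    return {
        ("drank", "smoked"): smoked_drank,
        ("drank", "not smoked"): drank - smoked_drank,
        ("not drank", "smoked"): smoked - smoked_drank,
        ("not drank", "not smoked"): n - smoked - drank + smoked_drank,
    }


n = 1400
table = build_table(n, 0.26, 0.06, 4.9)

both = table[("drank", "smoked")]
smoked_total = both + table[("not drank", "smoked")]
drank_total = both + table[("drank", "not smoked")]

p_drank_given_smoked = both / smoked_total   # condition on the column
p_smoked_given_drank = both / drank_total    # condition on the row
p_drank_and_smoked = both / n                # joint probability

print(round(p_drank_given_smoked, 2))  # 0.17
print(round(p_smoked_given_drank, 2))  # 0.74
print(round(p_drank_and_smoked, 2))    # 0.04
```

The only difference between the two conditional probabilities is the denominator: the "smoked" column total in one case and the "drank" row total in the other.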
Two events A and B are said to be independent if
Pr (A | B) = Pr (A) or, equivalently, Pr (B | A) = Pr (B).
Here we see that
Pr (drank) = .06 while Pr (drank | smoked) = .17
and Pr (smoked) = .26 while Pr (smoked | drank) = .74.
The events are not independent.
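An equivalent way to check independence is to compare Pr (A and B) with Pr (A) × Pr (B); for independent events the two are equal. The sketch below applies this to the counts in the table above. The helper name independence_gap is hypothetical, and because the counts are reconstructed from rounded percentages, a small gap should not be over-interpreted.

```python
def independence_gap(joint, a_total, b_total, n):
    """Return Pr(A and B) - Pr(A) * Pr(B); a value of zero means independence."""
    return joint / n - (a_total / n) * (b_total / n)


# Counts from the first table: 62 drank and smoked, 84 drank, 364 smoked.
gap = independence_gap(joint=62, a_total=84, b_total=364, n=1400)

# Number who would have both drunk and smoked if the events were independent.
expected_joint = 84 * 364 / 1400

print(round(gap, 3))             # 0.029
print(round(expected_joint, 1))  # 21.8
```

Under independence about 22 girls would be expected in the "drank and smoked" cell, compared with the 62 in the reconstructed table.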
If, however, we assume that the statement about "4.9 times more likely" refers to the 6% who smoked in the last month, then a totally different picture emerges.
                Smoked   Didn't smoke   Totals
Drank               14             70       84
Didn't drink        70           1246     1316
Totals              84           1316     1400
Pr (smoked | drank) = .17 = Pr (drank | smoked) and
Pr (drank and smoked) = .01.
Again checking for independence of the events, we see that
Pr (drank) = .06 = Pr (smoked),
which is not equal to the conditional probability of .17; thus again the events are not independent. This is what we are intuitively led to believe in reading the article.
What are the implications of the reconstructed data in the two tables? Students need considerable experience interpreting probabilities from two-way tables. This article gives many opportunities.
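One way to begin exploring the two tables is to compute, for each, how many times more likely a girl who smoked was to have drunk alcohol than a girl who did not smoke. The Python sketch below does this. Reading "times more likely" as Pr (drank | smoked) divided by Pr (drank | didn't smoke) is an assumption made here for illustration; the article's wording may intend something else, and the function name relative_likelihood is hypothetical.

```python
def relative_likelihood(drank_smoked, smoked_total,
                        drank_not_smoked, not_smoked_total):
    """Pr(drank | smoked) divided by Pr(drank | didn't smoke)."""
    p_given_smoked = drank_smoked / smoked_total
    p_given_not_smoked = drank_not_smoked / not_smoked_total
    return p_given_smoked / p_given_not_smoked


# First table: "smoked" taken as the 26% who had ever smoked.
first = relative_likelihood(62, 364, 22, 1036)

# Second table: "smoked" taken as the 6% who smoked in the last month.
second = relative_likelihood(14, 84, 70, 1316)

print(round(first, 2))   # 8.02
print(round(second, 2))  # 3.13
```

Under this reading neither table gives a ratio of 4.9, which could prompt discussion of what the reported figure was meant to compare.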