Schoolgirls are smokers, drinkers


This article presents the findings of a survey of 1400 girls aged 11 or 12, using percentages and relative likelihood. One must assume that the final statement should read "3.8 times more likely" rather than "3.8 per cent more likely", as the latter would equate to only 1.038 times as likely and would probably not be considered worth reporting.
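The arithmetic behind this reading can be checked with a short Python sketch. The function name is illustrative only and does not come from the article.

```python
def percent_more_to_factor(percent_more):
    """Convert 'p per cent more likely' into a multiplying factor."""
    return 1 + percent_more / 100


# "3.8 per cent more likely" corresponds to a factor of about 1.038,
# far smaller than the factor of 3.8 implied by "3.8 times more likely".
factor_from_percent = percent_more_to_factor(3.8)
factor_from_times = 3.8

print(round(factor_from_percent, 3))
print(factor_from_times)
```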

Many conditional probabilities can be found from information in this article if a few assumptions are made. For example:

Pr (Smoke | friends smoke) = 7 Pr (Smoke | friends don't smoke)
Pr (Smoke | parents smoke) = 2.3 Pr (Smoke | parents don't smoke)
Pr (Smoke | low literacy) = 5 Pr (Smoke | high literacy)

[Here Pr (A | B) is the probability of event A occurring given the occurrence of event B].

The information that 26% had smoked at some time in their lives and 6% had drunk a glass of alcohol in the previous month can be combined with the statement that girls who smoked (assuming this refers to the 26%) were 4.9 times more likely to have drunk alcohol in the last month, and with the survey size of 1400, to create a two-way table accounting for all of the girls. To create the table, fill in the row and column totals first (26% of 1400 = 364 and 6% of 1400 = 84), then use the ratio in the "smoked" column to work out the numbers who drank and did not drink (4.9 X + X = 364, where X is the number of smokers who drank, giving X = 62 to the nearest whole number).

               Smoked   Didn't smoke   Totals
Drank              62             22       84
Didn't drink      302           1014     1316
Totals            364           1036     1400

This is a good approximation given that the percentages (26% and 6%) probably involved rounding. Notice the difference in the conditional probabilities available from the table:

Pr (drank | smoked) = 62/364 = .17 while
Pr (smoked | drank) = 62/84 = .74
Also Pr (drank and smoked) = 62/1400 = .04.
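
The construction of this table and the three probabilities can be reproduced with a minimal Python sketch. It follows the same working as the text (totals first, then 4.9 X + X = 364). The variable names are illustrative assumptions, not part of the article.

```python
# Sketch: rebuild the first two-way table from the reported figures.
TOTAL = 1400   # girls surveyed
RATIO = 4.9    # used as in the equation 4.9 X + X = 364

# Row and column totals from the reported percentages.
smoked_total = round(0.26 * TOTAL)   # 364
drank_total = round(0.06 * TOTAL)    # 84

# Solve 4.9 X + X = smoked_total for X, the smokers who drank.
smoked_drank = round(smoked_total / (RATIO + 1))              # 62
smoked_no_drink = smoked_total - smoked_drank                 # 302
nonsmoker_drank = drank_total - smoked_drank                  # 22
nonsmoker_no_drink = TOTAL - smoked_total - nonsmoker_drank   # 1014

# Conditional and joint probabilities read from the table.
p_drank_given_smoked = smoked_drank / smoked_total
p_smoked_given_drank = smoked_drank / drank_total
p_drank_and_smoked = smoked_drank / TOTAL

print(round(p_drank_given_smoked, 2))
print(round(p_smoked_given_drank, 2))
print(round(p_drank_and_smoked, 2))
```

Note that the two conditional probabilities share the same numerator (62) and differ only in the total they are divided by.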

Two events are said to be independent if

Pr (A | B) = Pr (A) or Pr (B | A) = Pr (B).

Here we see that

Pr (drank) = .06 while Pr (drank | smoked) = .17
and Pr (smoked) = .26 while Pr (smoked | drank) = .74.

The events are not independent.
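
This comparison can be sketched in Python using the counts from the table above. The function name and the tolerance of 0.01 (to allow for rounding in the reported percentages) are assumptions made for illustration.

```python
def is_independent(n_both, n_a, n_b, n_total, tol=0.01):
    """Return True if Pr(A | B) is within tol of Pr(A).

    n_both  -- count where A and B both occur
    n_a     -- count where A occurs
    n_b     -- count where B occurs
    n_total -- total count
    """
    p_a = n_a / n_total
    p_a_given_b = n_both / n_b
    return abs(p_a_given_b - p_a) <= tol


# A = drank, B = smoked: compares Pr(drank | smoked) with Pr(drank).
drank_vs_smoked = is_independent(62, 84, 364, 1400)

# A = smoked, B = drank: compares Pr(smoked | drank) with Pr(smoked).
smoked_vs_drank = is_independent(62, 364, 84, 1400)

print(drank_vs_smoked, smoked_vs_drank)
```

Both checks return False, in line with the conclusion that the events are not independent.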

If, however, we assume that the statement about "4.9 times more likely" refers to the 6% who smoked in the last month, then a totally different picture emerges.

               Smoked   Didn't smoke   Totals
Drank              14             70       84
Didn't drink       70           1246     1316
Totals             84           1316     1400

Pr (smoked | drank) = .17 = Pr (drank | smoked) and

Pr (drank and smoked) = .01.

Again checking for independence of the events, we see that

Pr (drank) = .06 = Pr (smoked),

neither of which is equal to the conditional probability of .17; thus again the events are not independent. This is what we are intuitively led to believe in reading the article.
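
A complementary check, which is not part of the article's own working, is to compare the observed number of girls who both drank and smoked with the number expected if the two events were independent, namely (row total x column total) / overall total. The sketch below applies this to both reconstructed tables. The function name is an illustrative assumption.

```python
def expected_both(row_total, col_total, n_total):
    """Expected count in a cell if rows and columns were independent."""
    return row_total * col_total / n_total


# First table: 84 drank, 364 smoked, 62 observed doing both.
expected_first = expected_both(84, 364, 1400)
observed_first = 62

# Second table: 84 drank, 84 smoked, 14 observed doing both.
expected_second = expected_both(84, 84, 1400)
observed_second = 14

print(round(expected_first, 2), observed_first)
print(round(expected_second, 2), observed_second)
```

In both tables the observed count is well above the expected count (about 22 and about 5 respectively), which is another way of seeing that drinking and smoking are not independent in the reconstructed data.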

What are the implications of the reconstructed data in the two tables? Students need considerable experience interpreting probabilities from two-way tables. This article gives many such opportunities.

