**Smoking big risk for women**

You may wish to reference this from Sampling or Inference.

This article raises two kinds of issues. The first is the general issue of collecting the data, reporting it and explaining it. The study is an interesting example of a retrospective study because it starts with people who already have lung cancer (probably aged 55 or more, since most had been smoking for about 40 years).

The second issue is the representation of probabilities. It is not at all clear how the probabilities quoted in the article could be calculated from the information about the people in the study. Using a frequency approach to probability, one might conclude that, since the researchers started with people with lung cancer (442 women and 403 men), the probabilities obtainable from the data would be

Pr (smoking | lung cancer) NOT Pr (lung cancer | smoking).

[Here Pr (A | B) is the probability of event A occurring given the occurrence of event B.]
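
The two conditional probabilities are linked by Bayes' theorem. Written out, it shows that the reverse probability also needs Pr (lung cancer) and Pr (smoking) for the whole population, and a study that begins with cancer patients does not supply these by itself:

```latex
\Pr(\text{lung cancer} \mid \text{smoking})
  = \frac{\Pr(\text{smoking} \mid \text{lung cancer}) \, \Pr(\text{lung cancer})}{\Pr(\text{smoking})}
```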

In the light of the comments about healthy people, perhaps the researchers could also calculate

Pr (smoking | healthy).

These, however, are not related directly to the probabilities given in the article.

The matching with "healthy" people of about the same age is of interest because the article does not say whether they were smokers or not. From statements earlier in the article, one might suppose they were non-smokers. This could be important to the interpretation of the findings. Matching of subjects is important for a study such as this, but of course it does not account for other factors (genetic or social) that might lead people to smoke or to develop a propensity for lung cancer. A general discussion of this topic is given in the COMAP video Statistics: Decisions Through Data, Unit 16, "The Question of Causation" (available from the Australian Association of Mathematics Teachers Inc.).

A further question arises as to the basis for the conclusions for "younger" women given at the beginning of the article,

Pr (Cancer | female who smoked 30 pack years) = 27 Pr (Cancer | female non smoker)

and

Pr (Cancer | male who smoked 30 pack years) = 11 Pr (Cancer | male non smoker).

After discussion it could be of interest to follow up this article by looking at the original research report in the American Journal of Epidemiology.
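
As a possible starting point for that follow-up, here is a minimal Python sketch of one standard way a study that starts from cases and healthy controls can yield a risk multiplier: the odds ratio, which approximates the ratio of the two conditional risks when the disease is rare. The article does not say that the researchers used this method. The function name `odds_ratio` is introduced here for illustration, and the counts are invented and are not taken from the study.

```python
def odds_ratio(exposed_cases, unexposed_cases, exposed_controls, unexposed_controls):
    """Return the odds ratio for a 2x2 case-control table.

    Odds of exposure among cases divided by odds of exposure among controls.
    """
    if unexposed_cases == 0 or exposed_controls == 0:
        raise ValueError("odds ratio is undefined when a denominator cell is zero")
    return (exposed_cases * unexposed_controls) / (unexposed_cases * exposed_controls)


# Hypothetical counts for illustration only (NOT data from the study).
cases = {"smokers": 90, "non_smokers": 10}      # people with lung cancer
controls = {"smokers": 40, "non_smokers": 60}   # matched "healthy" people

# The two probabilities the study design can estimate directly.
p_smoking_given_cancer = cases["smokers"] / sum(cases.values())
p_smoking_given_healthy = controls["smokers"] / sum(controls.values())

# The odds ratio combines them into a single multiplier.
ratio = odds_ratio(
    cases["smokers"], cases["non_smokers"],
    controls["smokers"], controls["non_smokers"],
)

print(f"Pr (smoking | lung cancer) = {p_smoking_given_cancer:.2f}")
print(f"Pr (smoking | healthy)     = {p_smoking_given_healthy:.2f}")
print(f"Odds ratio                 = {ratio:.1f}")
```

Students could compare a multiplier obtained this way with the factors of 27 and 11 quoted in the article, and discuss what extra information would be needed to turn it into Pr (lung cancer | smoking).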
