The recently submitted TG7 tutorial, "formulating causal concepts and principled statistics answers", is accompanied by the "simulation learner". This is a simulated dataset, motivated by the Promotion of Breastfeeding Intervention Trial (PROBIT) [1]. Mother-infant pairs were randomised to receive either standard care or a breastfeeding encouragement (BFE) intervention, and weight achieved at age 3 months was the main outcome.

The simulated path from randomised assignment to outcome runs through uptake of the intervention (the education programme), which may be followed by starting breastfeeding and continuing it for a specific duration. The data were further enriched by generating, in addition to the "observed" data, alternative exposure levels with their potential outcomes. This helps the reader of the tutorial, or a student in a course, to understand concepts of causal inference by visualising potential outcomes under different treatments and different causal effect estimands in different populations. It also makes it possible to explore distinct estimation methods and to compare their results with the corresponding parameter of the simulated population. For example, the simulation learner showed us that approaches valid for one type of exposure (e.g. receiving an offer for the BFE programme, or actually following the BFE programme) are not automatically valid for other exposures (e.g. actually starting breastfeeding). R code for data generation and analysis is available on, where SAS and Stata code for analysis is also provided.
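To give a feel for the kind of data described above, the following is a minimal Python sketch, not the simulation learner itself. All names, probabilities and effect sizes (for example `BF_EFFECT`, the uptake probabilities and the unmeasured factor `u`) are hypothetical choices made for illustration and are not taken from PROBIT or from the tutorial's R code. It generates a randomised offer, programme uptake, the start of breastfeeding, and potential outcomes alongside the "observed" outcome, then compares two simple estimates with the parameters known from the simulation.

```python
import random
from statistics import mean

BF_EFFECT = 300.0  # hypothetical effect of starting breastfeeding on weight (grams)


def simulate(n=20000, seed=1):
    """Simulate offer -> uptake -> breastfeeding -> weight, with potential outcomes."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        u = rng.gauss(0.0, 1.0)      # hypothetical unmeasured maternal factor
        high = u > 0
        z = rng.random() < 0.5       # randomised offer of the BFE programme
        e_up, e_bf = rng.random(), rng.random()  # latent draws shared across scenarios

        # Programme uptake is only possible when the programme is offered.
        uptake_if_offered = e_up < (0.8 if high else 0.4)

        # Potential breastfeeding start without and with the offer.
        bf_z0 = e_bf < 0.3 + 0.2 * high
        bf_z1 = e_bf < 0.3 + 0.4 * uptake_if_offered + 0.2 * high

        # Potential weights at 3 months without and with breastfeeding.
        y_bf0 = 6000.0 + 200.0 * u + rng.gauss(0.0, 400.0)
        y_bf1 = y_bf0 + BF_EFFECT

        # Potential weights under each randomised arm.
        y_z0 = y_bf1 if bf_z0 else y_bf0
        y_z1 = y_bf1 if bf_z1 else y_bf0

        rows.append({
            "z": z,
            "uptake": z and uptake_if_offered,
            "bf": bf_z1 if z else bf_z0,   # "observed" exposure
            "y": y_z1 if z else y_z0,      # "observed" outcome
            "y_z0": y_z0,
            "y_z1": y_z1,
        })
    return rows


def diff_in_means(rows, group_key):
    """Difference in mean observed outcome between the two levels of group_key."""
    treated = [r["y"] for r in rows if r[group_key]]
    control = [r["y"] for r in rows if not r[group_key]]
    return mean(treated) - mean(control)


rows = simulate()
itt_true = mean(r["y_z1"] - r["y_z0"] for r in rows)  # known only because simulated
itt_est = diff_in_means(rows, "z")    # comparison by randomised arm
naive_bf = diff_in_means(rows, "bf")  # naive comparison by breastfeeding status

print(f"true effect of the offer (ITT): {itt_true:.1f} g")
print(f"estimate by randomised arm:     {itt_est:.1f} g")
print(f"true effect of breastfeeding:   {BF_EFFECT:.1f} g")
print(f"naive breastfeeding contrast:   {naive_bf:.1f} g")
```

In this sketch the comparison by randomised arm recovers the effect of the offer, whereas the same difference in means applied to breastfeeding status overstates the simulated breastfeeding effect, because the hypothetical factor `u` influences both breastfeeding and weight. Having both potential outcomes for every pair is what allows the true values to be computed for comparison.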

1. Kramer MS, Chalmers B, Hodnett ED, et al. Promotion of breastfeeding intervention trial (PROBIT) - A randomized trial in the Republic of Belarus. Journal of the American Medical Association. 2001;285(4):413-420.

