Document pages: 48 pages
Abstract: This paper combines causal mediation analysis with double machine learning to control for observed confounders in a data-driven way under a selection-on-observables assumption in a high-dimensional setting. We consider the average indirect effect of a binary treatment operating through an intermediate variable (or mediator) on the causal path between the treatment and the outcome, as well as the unmediated direct effect. Estimation is based on efficient score functions, which possess a multiple robustness property w.r.t. misspecifications of the outcome, mediator, and treatment models. This property is key for selecting these models by double machine learning, which is combined with data splitting to prevent overfitting in the estimation of the effects of interest. We demonstrate that the direct and indirect effect estimators are asymptotically normal and root-n consistent under specific regularity conditions and investigate the finite sample properties of the suggested methods in a simulation study when considering lasso as machine learner. We also provide an empirical application to the U.S. National Longitudinal Survey of Youth, assessing the indirect effect of health insurance coverage on general health operating via routine checkups as mediator, as well as the direct effect. We find a moderate short term effect of health insurance coverage on general health which is, however, not mediated by routine checkups.
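For readers unfamiliar with the terminology, the direct and indirect effects named in the abstract can be written in standard potential-outcome notation. The sketch below is illustrative only: the symbols D (binary treatment), M (mediator), Y (outcome), M(d), Y(d, m), \Delta, \theta, and \delta are assumptions introduced here and do not appear in the abstract, so the paper's own notation may differ.

```latex
% Assumed notation (not from the abstract):
%   D in {0,1}  binary treatment
%   M(d)        potential mediator under treatment d
%   Y(d, m)     potential outcome under treatment d and mediator value m
\begin{align*}
  \Delta    &= E\big[\, Y(1, M(1)) - Y(0, M(0)) \,\big]
            && \text{(average total effect)} \\
  \theta(d) &= E\big[\, Y(1, M(d)) - Y(0, M(d)) \,\big]
            && \text{(average direct effect, mediator held at } M(d)\text{)} \\
  \delta(d) &= E\big[\, Y(d, M(1)) - Y(d, M(0)) \,\big]
            && \text{(average indirect effect, treatment held at } d\text{)} \\
  \Delta    &= \theta(1) + \delta(0) \;=\; \theta(0) + \delta(1)
            && \text{(decomposition of the total effect)}
\end{align*}
```

Under this notation, the empirical finding reported in the abstract corresponds to an indirect effect via routine checkups that is close to zero, so that the effect of health insurance coverage on general health is essentially direct.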