Discussion about stat consulting, stat packages, applied statistics and data analysis by statistical consultants of Academic Technology Services at UCLA. Please leave your comments or send us email at stattalk at ats.ucla.edu

Tuesday, March 6, 2007

What The Heck(man) Is Going On?

We had a client come in with a question about a Heckman selection model that was giving her trouble. She had run it several weeks earlier and everything was working fine. She had some missing data among her predictors and decided to do a multiple imputation. After imputing the missing data she got the following error message:

     Dependent variable never censored due to selection.


She couldn't figure out what was wrong until I asked her if she had also imputed the response (dependent) variable. Instantly, she realized what the problem was. Since she had imputed all the variables in her dataset, there were no longer any missing values on her response variable and therefore no way to estimate a selection model.
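The check behind that error message can be sketched in a few lines of Python. This is only an illustration, not how Stata's heckman command is implemented: the helper names (selection_indicator, check_censoring, restore_outcome_missingness) and the toy numbers are hypothetical, and None stands in for a missing value. The last helper shows one possible remedy, assuming the analyst wants to keep the imputed predictors while leaving the response missing wherever it was originally missing.

```python
from typing import List, Optional, Sequence

Value = Optional[float]  # None stands in for a missing value


def selection_indicator(outcome: Sequence[Value]) -> List[int]:
    """Return 1 where the response is observed and 0 where it is missing."""
    return [0 if y is None else 1 for y in outcome]


def check_censoring(outcome: Sequence[Value]) -> int:
    """Return the number of censored (missing) responses.

    Raises ValueError when no response is missing, since a selection
    model then has nothing to estimate the selection equation from.
    """
    selected = selection_indicator(outcome)
    if all(selected):
        raise ValueError("Dependent variable never censored due to selection.")
    return selected.count(0)


def restore_outcome_missingness(
    original: Sequence[Value], imputed: Sequence[Value]
) -> List[Value]:
    """Put back None wherever the original response was missing."""
    if len(original) != len(imputed):
        raise ValueError("original and imputed must have the same length")
    return [None if o is None else i for o, i in zip(original, imputed)]


# Toy data (made up for illustration): response before and after imputation.
original = [2.5, None, 3.1, None]
imputed = [2.5, 2.9, 3.1, 3.4]

try:
    check_censoring(imputed)
except ValueError as err:
    print("After imputing the response:", err)

restored = restore_outcome_missingness(original, imputed)
print("Censored responses after restoring:", check_censoring(restored))
```

Whether the imputed response values should be discarded this way is a modeling decision; the sketch only shows why the fully imputed dataset triggers the error.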

pbe

1 comment:

Unknown said...

There is a bit of a problem here. In order to do multiple imputation you must assume that the probability of missingness can depend on any of the other variables but not on the missing value itself (this is called the Missing At Random, or MAR, assumption). Also, during the imputation all variables, any interactions, and the dependent variable must be used. If the dependent variable contains missing values, then these too must be MAR. However, given that the client wanted to use a Heckman model, I would assume that she did not believe that the MAR assumption holds for the dependent variable.
