StatTalk: June 2007

Wednesday, June 27, 2007

Seemingly Unrelated News

The first copies of Stata 10 arrived on Monday while I was out of town. The copies for our lab and my Mac arrived on Wednesday around noon. At 3pm, a client came into consulting who is analyzing MRI data on several different probes over 300 time points. She has been doing non-linear modeling separately for each of the probes; however, it is the same exponential model in each case. The client wants to test whether the parameters estimated by the exponential models are statistically different or not. This turns out to be a perfect job for the new nlsur (non-linear seemingly unrelated regression) command. Ten minutes of work and it is up and running. It's fun to play with new toys.

pbe

Wednesday, June 13, 2007

The Best of Stata 10

It's dangerous to try to pick out the best new features of a software package before you actually get your hands on it, but I'm feeling in a daring mood. I chose features that I have either been waiting for or that will make my statistical life easier. Since I'm the Mac guy around here, anything that allows me to avoid Windows software will make my life easier.

First feature: Stata 10 will have a mechanism for dealing with strata with singleton PSUs. This will make life much easier because it is a common occurrence among our clients.
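For readers who have not run into this: when a stratum contains only one PSU, there is no between-PSU variation within that stratum, so the usual variance estimate cannot be computed for it. Here is a minimal sketch, in plain Python rather than Stata, of how one might spot such strata before running a survey analysis. The function name and the data are made up for illustration; this is not how Stata itself handles the problem.

```python
from collections import defaultdict

def singleton_strata(records):
    """Return the strata that contain exactly one distinct PSU.

    records: iterable of (stratum, psu) pairs, one pair per observation.
    """
    psus = defaultdict(set)
    for stratum, psu in records:
        psus[stratum].add(psu)
    # A stratum is a problem when all of its observations share one PSU.
    return sorted(s for s, ids in psus.items() if len(ids) == 1)

# Hypothetical data: stratum B has two observations but only one PSU (7).
sample = [("A", 1), ("A", 2), ("B", 7), ("B", 7), ("C", 3), ("C", 4), ("C", 3)]
lonely = singleton_strata(sample)
print(lonely)  # ['B']
```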

Second feature: Stata 10 has a new command for multilevel logit models. This can, of course, already be done in -gllamm-, but it will be interesting to see if it runs faster and is easier to use than -gllamm-. The HLM people will be including this in their SuperMixed program that comes out in the fall.

Third feature: Stata 10 has a new exact logistic estimation command. No more having to run LogXact in Windows. I hope.

Bonus feature: Stata 10 finally gets a full discriminant analysis procedure. I know this is not on top of everybody's wish list, but I like discriminant analysis and find it useful in interpreting some MANOVAs. Further, I will get to retire my -daoneway- ado program.

So these are my picks; what are yours?

pbe

Monday, June 4, 2007

Stata 10

StataCorp announced today that it will release version 10.0 on June 25th. There was a long list of new features and analyses, many of which have been long awaited by the Stata faithful. I don't really want to get into the new stuff in this blog; instead, I want to discuss how ATS Stat Consulting deals with major new software releases.

First off, I am not sure when we will be receiving our copies of the software. The timing of this release is not optimal for our organization because our fiscal year ends on June 30th and the purchasing database shuts down several weeks before that. Furthermore, once the new fiscal year starts you can't order stuff right away because all the finance and business people are involved in preparing end of year fiscal closing reports. So, I'm not sure when we will have our hands on the software.

Basically, ATS Stat Consulting looks at new versions in terms of what web pages need to be revised, new pages that need to be created, and possibly pages that need to be removed.

Let's start with revised pages. When Stata changes how an existing command works, we need to update every page that uses the command. The biggest changes in our short history occurred when Stata did a massive overhaul of the graphics commands in Stata 8. Stata 9 also required numerous revisions, due in large part to the expanded use of prefix commands. On the surface it doesn't look like Stata 10 will require as many revisions, although there are some changes in options and features for some commands.

The tricky part here is that we have to show both the old and the new until most of our users have migrated to Stata 10.

New pages are required for new commands. There are quite a few of these. In particular, the new graph editor will require pages showing how it works. We will also have to develop pages and live presentations demonstrating the best features of Stata 10.

It is not at all clear whether many, or any, pages will need to be removed. I will remove pages that are related to my -daoneway- (discriminant analysis) program since Stata 10 will provide several ways of doing discriminant analysis, but they will be replaced by pages for the new built-in procedures -discrim lda- and -candisc-.

We will be so busy in July and August with all the web stuff that I will hardly have time to work on my Stata 11 wish list.

Update: 6/8/07 -- Looks like we managed to get our purchase order in just under the wire before fiscal closing. Now it's just a matter of waiting for delivery.

pbe

Sunday, June 3, 2007

More Control Groups Gone Wild

We had a client come in to consulting recently who was studying people receiving treatment in mental health clinics. He classified these patients into three groups: Group 1) individuals who had a personal history of depression, Group 2) individuals with a family history of depression, and Group 3) individuals with no history of depression. The last group was the control group. The outcome variable was a binary indicator of whether or not they had experienced a depressive episode since their last visit to the clinic.

The problem was that the control group did not experience any depressive episodes. This, in turn, creates a problem for logistic regression. There was a message indicating that group 3 not equal to zero predicts failure perfectly. And, instead of two degrees of freedom for group, there was only one degree of freedom (comparing Group 1 versus Group 2) and no coefficient for Group 3 versus Group 1.
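The reason this happens is that a group whose outcome never varies gives the likelihood nothing to work with: the coefficient for that group wants to run off to minus infinity. A quick cross-tabulation of group by outcome shows the problem before any model is fit. Here is a minimal sketch of that check in plain Python; the function name and the data are hypothetical, not the client's.

```python
from collections import defaultdict

def perfectly_predicted_groups(groups, outcomes):
    """Return {group: constant_outcome} for groups whose binary outcome never varies."""
    seen = defaultdict(set)
    for g, y in zip(groups, outcomes):
        seen[g].add(y)
    # A group with a single observed outcome value predicts that value perfectly.
    return {g: next(iter(ys)) for g, ys in seen.items() if len(ys) == 1}

# Hypothetical data: Group 3 (the controls) never has a depressive episode.
groups  = [1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3]
episode = [1, 0, 1, 1, 0, 1, 0, 1, 0, 0, 0, 0]

flagged = perfectly_predicted_groups(groups, episode)
print(flagged)  # {3: 0}
```

Any group that shows up in the result will lose its coefficient in an ordinary logistic regression, which is exactly what happened with Group 3 here.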

This could be dealt with by changing one response score in Group 3 at random from zero to one. There was a further complexity, however. Each individual in each of the three groups was measured on 12 occasions, i.e., once a month for a year. And during those twelve months none of the individuals in the control group ever experienced a depressive episode. Since change over time was one of the research questions, it didn't seem right to randomly change responses in the control group to one.

In the end, there just wasn't any useful information available from Group 3. It was clearly one more case of control groups gone wild.

pbe
