I Wish the SAS By Statement Was Retired

I’ve been debating whether the by statement should be retained in Stan. It is easily my least favourite “feature” of SAS. The modern “split-apply-combine” methodology in R and Pandas is far superior and easier to read than the many hacks that come along with the by statement. For example:

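data class;
    set sashelp.class;
    by sex; /* just pretend it's been sorted... */
    length name_list $200; /* without this, name_list gets length 1 */
    retain cumulative_age name_list;
    if first.sex then do;
        cumulative_age = 0;
        name_list = '';
    end;

    cumulative_age = cumulative_age + age;
    /* append to the running list; catx skips blank arguments */
    name_list = catx(', ', name_list, name);
    if last.sex then output;
run;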

is quite a common “recipe” among the SAS coders I have encountered. Often, it gets far more complex very quickly. Under “split-apply-combine”, though, the formula is always the same:

df.groupby("sex").apply(<some function here>)
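As a rough sketch of what that looks like for the data step above, the cumulative age and the name list can both be expressed as one grouped aggregation. The four rows here are a made-up toy sample standing in for sashelp.class, and the output column names simply mirror the SAS variables:

```python
import pandas as pd

# Toy stand-in for sashelp.class (illustrative rows, not the real dataset)
df = pd.DataFrame({
    "name": ["Alfred", "Alice", "Barbara", "Henry"],
    "sex": ["M", "F", "F", "M"],
    "age": [14, 13, 13, 14],
})

# One row per sex: total age and a comma-separated list of names.
# No sorting, retain, first. or last. bookkeeping is needed.
summary = df.groupby("sex", as_index=False).agg(
    cumulative_age=("age", "sum"),
    name_list=("name", lambda s: ", ".join(s)),
)
print(summary)
```

The input does not need to be sorted by sex beforehand; groupby handles the split itself.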

You don’t have to worry about retain statements (I’m not even sure my SAS code above is correct; I can’t afford a SAS license to check) or whether you remembered the clumsy “first.” and “last.” idioms. Let’s not get into running means and other weird custom data steps which can’t be solved via a proc.
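For what it’s worth, even a running mean keeps the same groupby shape in pandas. This is only a sketch on made-up rows; the column name running_mean_age is my own choice for illustration:

```python
import pandas as pd

# Made-up rows for illustration only
df = pd.DataFrame({
    "name": ["Alfred", "Alice", "Carol", "Henry"],
    "sex": ["M", "F", "F", "M"],
    "age": [14, 13, 15, 16],
})

# Running (expanding) mean of age within each sex, aligned to the original rows
df["running_mean_age"] = (
    df.groupby("sex")["age"].transform(lambda s: s.expanding().mean())
)
print(df)
```

Here transform returns one value per input row, so the result slots straight back into the original frame.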

Despite all of this, if Stan actually becomes popular, someone may have to implement the by statement. It will be somewhere far down the to-do list!