Category Archive ‘SAS‘

 
 

List Processing With SAS: A Github Repository

I have a function-like macro (recursive version) to create a sequence:

%macro _list(n,pre=ff);
    %if &n=1 %then &pre.1;
    %else %_list(%eval(&n-1),pre=&pre),&pre.&n;
%mend _list;

%put %_list(3); *produces ff1,ff2,ff3;

But then I read one of Ian Whitlock’s papers, Names, Names, Names – Make Me a List (SGF 2007, SESUG 2008), and said: stop! I’m going to use Ian’s %range instead, and I created a Github page to hold it (with minimal modifications due to personal preference).

I once posted My Collection of SAS Macro Repositories, crediting SAS gurus like Richard DeVenezia. When facing a programming challenge, there is always a trade-off: should I look at what others wrote, or just write it from scratch? Searching also takes a lot of effort, so I plan to use Github pages to minimize my own searching and hope they will be helpful for you too (no more wasted intelligence!). I begin with SAS list processing:

https://github.com/Jiangtang/Programming-SAS/tree/master/ListProcessing

I got most of these utility macros (with detailed comments, examples and sources) from papers, blogs and other websites, and the credit belongs to their authors! Sometimes I will also add my own if I think there are holes to fill. To get started, you may read the README first (I will keep it updated). Besides the individual macros, a combined file (generated by a simple DOS command) is also available.

Localize Your Macro Variable? Mostly Not Needed or Do It If You Only Want to Initiate It


A piece of SAS code to create a list of all variables within a dataset (a nice programming trick from Art Carpenter, 2004):

%macro getvars(dset) ;
   %local varlist ;
    %let fid = %sysfunc(open(&dset)) ;
    %if &fid %then %do ;
        %do i=1 %to %sysfunc(attrn(&fid,nvars)) ;
            %let varlist= &varlist %sysfunc(varname(&fid,&i));
        %end ;
        %let fid = %sysfunc(close(&fid)) ;
    %end ;
    &varlist
%mend ;

%put %getvars(%str(sashelp.iris));

One question would be: why declare the macro variable varlist as local explicitly with a %local statement? Since it is created within the macro %getvars, it goes to the local symbol table automatically, so the %local statement looks unnecessary!

Such an argument is right (at least when no macro variable named varlist already exists in an outer scope). But the %local statement above was not really intended to localize the macro variable; it is actually used to initialize it. If you just delete it, you will get a warning and then an error:

WARNING: Apparent symbolic reference VARLIST not resolved.
ERROR: The text expression &VARLIST SPECIES contains a recursive reference to the macro variable VARLIST.  The
       macro variable will be assigned the null value.

This is because the macro variable &varlist on the right side of the second %let statement had not been defined before. To be clear, you can also replace the %local statement with an explicit initializing statement:

%let varlist=;

In this example, the %local statement and the %let statement do the same job and the choice is up to the programmer’s preference (well, I prefer the latter).
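One caveat is worth a quick sketch: the two statements only behave alike when no macro variable of the same name exists outside the macro. The demo below is a hypothetical illustration (the macro names %with_let and %with_local and the values are made up for this example). A %let inside a macro updates an existing global variable, while %local creates a separate local copy.

```sas
%let varlist = outer value;       /* a global variable that happens to share the name */

%macro with_local;
    %local varlist;               /* creates a new, empty local VARLIST */
    %let varlist = inner value;   /* changes only the local copy */
%mend with_local;

%macro with_let;
    %let varlist = ;              /* no local VARLIST exists, so the GLOBAL one is reset */
%mend with_let;

%with_local
%put After with_local: [&varlist];   /* the global value is untouched */

%with_let
%put After with_let: [&varlist];     /* the global value has been blanked */
```

So if the macro may be called from code that already uses the same variable name, using both statements (%local first, then %let) is the cautious choice.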

So, a take-home question: any comments on the following snippet to count the number of words in a string (also from Art Carpenter, 2004; sorry Carpenter, your books and papers are the sources I know best for learning SAS macro programming):

%macro wordcount(list);
    %local count;
    %let count=0;

    %do %while(%qscan(&list,&count+1,%str( )) ne %str());
        %let count = %eval(&count+1);
    %end;
    &count
%mend wordcount;

%put %wordcount(a b cd);

What’s New

I didn’t blog for a while in the first half of March and there is a bunch of new stuff to catch up on:

I had a new baby! He was delivered on time (and on budget!). Lions, tigers and bears, oh my… His brother is Tiger, so I named him Leo.

And I got the latest SAS 9.3 (TS1M2) installed! SAS is just getting more beautiful.


OpenCDISC had its latest release, Version 1.4, with the new SDTM 3.1.3 validation checks, and yes, CDISC itself also had some significant updates:

SDTMv1.3 and SDTMIGv3.1.3 now have the machine readable metadata online. It’s a nice improvement (last year I just posted The Great, Open, Vendor-neutral, Platform-independent Data Standards, . . . Yet in PDF Formats).

Define-XML now turns to 2.0 (finally).

R had its final 2.* release, Version 2.15.3, and Version 3.0.0 is coming soon. RStudio also had an update recently. RStudio is the best IDE (not just R IDE) I have used.

Google will shut down Google Reader, the best RSS reader ever. It’s a huge loss. For example, the famous SAS and statistical blogger Wensui Liu once posted frequently on Windows Live Space, then Blogspot, and finally WordPress. The former two blogs were closed, and the Google Reader feed is the only archive of those lost posts!

New Game in Town: SAS Metadata Administration

You might have noticed a steady (although still not frequent) stream of posts on SAS metadata querying and other administration tasks in the SAS blogosphere since 2012 (when I started to play with it). It is further evidence that more and more SAS programmers are switching from SAS Foundation to the so-called SAS Intelligence Platform. In the traditional SAS Foundation world, a nice source of metadata is the DICTIONARY tables (or the V* views in the SASHELP library); in the SAS Intelligence Platform, much more comprehensive metadata is stored in a SAS Metadata Server (is “data” singular or plural?).

To get started, Wendy McHenry posted a nice introductory blog, Why I love SAS metadata; see also Angela Hall’s How did you find that metadata? and Jennifer Parks’s Two methods for editing SAS metadata.

To communicate with the SAS Metadata Server programmatically, you can use Java or Base SAS, both of which are well documented. Chris Hemedinger recently shared a bunch of posts on how to use PowerShell to talk to the SAS Metadata Server.

Industry gurus Paul Homes and Gregory Nelson have also begun to blog in SAS User Groups on this subject (Paul’s own website is a great source too).

A community blog “BI Notes” also has some interesting posts on SAS metadata. Check it out.

SAS Snippet: Reshape Data Using SAS DoW Loop (From Long to Wide)


In Art Carpenter’s latest book, Carpenter’s Guide to Innovative SAS Techniques, a data step approach to transposing data (from long to wide) works like this (Ch. 2.4.2):


data tst;
    input type $ grp value $3.;
datalines;
A 1 a
A 2 aa
A 3 aaa
B 1 b
B 2 bb
B 3 bbb
C 1 c
C 2 cc
C 3 ccc
;

data art(keep=type grp1-grp3);
   set tst;
   by type;
   retain grp1-grp3 ;
   array grps {3} $ grp1-grp3;
   if first.type then do i = 1 to 3;
      grps{i} = " ";
   end;

   grps{grp} = value;
   if last.type then output art ;
run;

The same logic can be expressed more compactly with a DoW loop:

data dow(keep=type grp1-grp3);
     array grps[3] $ grp1-grp3;
     do _n_ = 1 by 1 until(last.type);
        set tst;
        by type;
        grps[grp]=value;
     end;
run;

/*Note*/

1. The traditional PROC TRANSPOSE approach:

proc transpose data=tst
               out=trans(drop=_:)
               prefix=grp;
   by type;
   id grp;
   var value;
run;

2. Why use a data step approach (both Art’s and the DoW loop) to transpose data instead of the TRANSPOSE procedure:

  • it can be much faster, since it uses data step arrays
  • it saves code when complex transformations are needed
  • last but not least, it’s cool!

3. Arthur Tabachneck maintains a general data step transposing macro, %transpose, and you can call it like:

%transpose(data=tst, out=mac,
            by=type, var=value,
            id=grp)

Yet Another Undocumented Feature (new in SAS 9.3)

Today I just caught an undocumented SAS feature. I ran the following SAS code to delete a dataset (in Windows 7 with SAS 9.3):

data a;
b=1;
run;

proc datasets noprint;
delete a;
run;
quit;

and the dataset was deleted as I expected:

32   proc datasets noprint;
33       delete a;
34   run;

NOTE: Deleting WORK.A (memtype=DATA).
35   quit;

But when I tested it on a SAS 9.2 machine, I got errors:

25   proc datasets noprint;
-------
22
202
NOTE: Enter RUN; to continue or QUIT; to end the procedure.
ERROR 22-322: Syntax error, expecting one of the following: ;, ALTER, DD, DDNAME, DETAILS,
FORCE, GENNUM, KILL, LIB, LIBRARY, MEMTYPE, MT, MTYPE, NODETAILS, NOFS, NOLIST,
NOWARN, PROTECT, PW, READ.
ERROR 202-322: The option or parameter is not recognized and will be ignored.
26       delete a;
27   run;

NOTE: Statements not processed because of errors noted above.
27 !     quit;

NOTE: The SAS System stopped processing this step because of errors.

Then I checked the SAS help documentation: there is no “noprint” option in the PROC DATASETS statement, only in the CONTENTS statement of the DATASETS procedure. Actually I added the “noprint” option just because I expected it to act like the same option in the PROC FREQ statement and suppress output. And (un)fortunately, it worked on my machine…

UPDATE (2013-02-27): Just got feedback from a SAS developer that this “noprint” option will be documented in the upcoming SAS 9.4; it actually works like the existing “nolist” option.
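For code that has to run on both releases, a minimal sketch using the documented NOLIST option mentioned in the update (it also appears in the list of accepted options in the SAS 9.2 error message above) would be:

```sas
data a;
    b = 1;
run;

/* NOLIST is the documented PROC DATASETS option that suppresses the directory listing */
proc datasets library=work nolist;
    delete a;
run;
quit;
```

The library=work option is spelled out here only to make the target library explicit.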

How to Write a Check? Use SAS Format!


During my initial stay in the US last year, one of the interesting exercises was writing a check. The Arabic numerals (the universal language!) were pretty intuitive, but I didn’t feel comfortable spelling out the face value in words: I had never played that game!

But the good side of this story was that, as a SAS programmer, I could use the WORDFw. format:

data _null_;
    money=8.15;
    put money wordf100.;
run;

and got:

[Screenshot: SAS log output]

Another check with a bigger value:

[Screenshot: SAS log output for the larger amount]
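As a small side-by-side sketch (same made-up amount as above), the related WORDSw. format can be compared with WORDFw. in one step. WORDFw. writes the fractional part as a numeric fraction, which is the usual style on a check, while WORDSw. spells the fraction out in words.

```sas
data _null_;
    money = 8.15;
    put money wordf100.;   /* cents written as a numeric fraction */
    put money words100.;   /* cents spelled out in words */
run;
```

Both lines go to the SAS log, so it is easy to pick whichever wording suits the check.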

I have paid special attention to SAS formats recently because Perl regular expressions were introduced into the PROC FORMAT INVALUE statement in SAS 9.3 (an old procedure plays a new game!). For example, a labeled TIME8. informat can be created by

proc format;
    invalue xxx (default=20)
      '/(\d+):(\d\d)(?:\.(\d+))?/' (REGEXP) = [time8.];
run;

For more, see Rick Langston’s paper, Using the New Features in PROC FORMAT (2012).

Github for Clinical/Statistical Programmers

PhUSE-FDA Working Group 5 (Development of Standard Scripts for Analysis and Programming) just adopted Google Code as its collaborative programming platform. Google Code is one of the most popular and respected open source software hosting sites in the world and it is definitely a good choice for PhUSE-FDA WG5.

But after viewing one of WG5’s working reports, Sharing Standard Statistical Scripts, and getting to know why they finally chose Google Code (rather than Github, which was also tested by WG5 members), I think it’s necessary to clarify some misunderstandings about Github, where I’m also an occasional user.

As stated in Slide 11 of the report mentioned above, Github has:

  • Too complicated an interface
  • Too much overhead for simple development
  • Too much training and education needed

and is designed for classic programming languages like C and Java (not for things like R and SAS).

On the first point, regarding the interface, it seems only the Git command line was tested, and it may indeed be too complicated for “classic statistical programming users”. Actually, Github offers great GUI tools, for example GitHub for Windows, to help users visually clone repositories, commit changes and do other management tasks without typing Git commands:

[Screenshot: GitHub for Windows]

It’s also worth mentioning that with GitHub for Windows, users don’t need to install any separate version control software like Git, CVS or SVN. GitHub for Windows already includes a fully functional version of msysGit. It just makes users’ lives much simpler. To use Google Code, you must install and configure something like TortoiseSVN.

On the second point: is Github suitable for “things like R and SAS”? It’s true that all hosts, including Github, are dominated by “classic programming languages like C and Java”. SAS programmers as a whole are just not active in social coding, but R is actually one of the most used languages on Github.

Google Code is good, and the “Google Code vs Github” question is mostly subjective. It seems to me that WG5’s choice of Google Code over Github was based on incomplete information. I personally prefer Github, and there are some good reasons:

  • With the GUI tool, GitHub for Windows, you keep the Git/SVN/CVS setup to a minimum.
  • Github supplies much richer statistics reports, including charts.
  • Github is more socially oriented, which makes it cool in this Web 2.0 world.

Linguistic Sorting in SAS Proc Sort

Just took a look at the linguistic sorting features in the SAS SORT procedure, and found some neat options to apply to my task. For example, I want to sort ID in the following dataset:

data t1;
    input ID $ ;
datalines;
T20
T4
T3
T1
;

and want to get this intuitive ordering (the way files are sorted in a Windows 7 directory):

[Screenshot: Windows 7 file listing]

But when I apply the default sorting:

proc sort data=t1 out=t2;
    by ID;
run;

I get:

T1
T20
T3
T4

To produce what I expected, add a SORTSEQ option:

proc sort data=t1 out=t3
          SORTSEQ=LINGUISTIC(NUMERIC_COLLATION=ON);
    by ID;
run;

T1
T3
T4
T20

In the first PROC SORT step, the default order is determined by the characters’ positions in the EBCDIC or ASCII tables (depending on the OS). To change this default collating sequence, a linguistic collation option (numeric collation) is added in the second step.

For details, see the corresponding part in SAS SORT Procedure and Collating Sequence in SAS(R) 9.3 National Language Support (NLS) with a great paper.
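For comparison, a plain data step workaround (not taken from the documentation above, just a sketch) is to split off the numeric part of ID and sort on it. It assumes every ID is a single letter followed by digits; the names id_prefix, id_num, t1_keyed and t4 are made up for this example.

```sas
data t1_keyed;
    set t1;
    id_prefix = substr(ID, 1, 1);          /* leading letter, e.g. T */
    id_num    = input(substr(ID, 2), 8.);  /* remaining digits as a number */
run;

proc sort data=t1_keyed out=t4(drop=id_prefix id_num);
    by id_prefix id_num;
run;
```

The SORTSEQ option is clearly shorter; the helper-variable version is mainly useful when the ID pattern is known and you want full control over the sort key.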

SAS ODS Report Writing Interface: A Quick Demo


I personally nominate the SAS ODS Report Writing Interface as one of the best technologies I found in SAS in 2012. It generates reports cell by cell and row by row, and has plenty of flexibility to produce highly customized reports. Basically, to use it,

  • first assign an ODS destination. Nothing new, and I prefer HTML like

ods html   file="output.html" style=sasweb;

  • then create an object (instance) based on the ODS Report Writing class, odsout,

declare odsout myout();

The ODS Report Writing Interface has some so-called object-oriented features. Odsout is a predefined class (or, a SET, a CONTAINER), from which you take an instance or an object, myout (or any other legal SAS variable name). Suppose Odsout is a class for apples; then myout is a specific apple.

  • apply methods associated with the Odsout class, like CELL, ROW, TABLE to generate reports.

Roughly, methods are just the functions (subroutines, procedures) of the object-oriented world. For example, in a SAS data step you would use a function weight(apple) to get the weight of the apple; in the object-oriented world, you use apple.weight() to return the same thing. In the ODS Report Writing Interface, if you want a table, use the TABLE methods:

    • myout.TABLE_START() to start a table
    • myout.TABLE_END() to end a table
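Putting the three steps together, a minimal sketch could look like the following. It uses only the methods that appear in this post; the file name demo.html and the cell texts are made up for illustration.

```sas
ods html file="demo.html" style=sasweb;   /* step 1: assign an ODS destination */

data _null_;
    declare odsout myout();               /* step 2: create an object from the odsout class */

    /* step 3: call methods on the object to build a one-row table */
    myout.table_start();
        myout.row_start();
            myout.format_cell(text: "Hello");
            myout.format_cell(text: "World");
        myout.row_end();
    myout.table_end();
run;

ods html close;
```

Since the data step has no SET statement, it runs once and writes a single table with one row and two cells.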

Then all we need to do next is use the flexible SAS data step to drive the ODS Report Writing Interface methods (see docs). The code, as you can see below, is pretty verbose compared to its counterpart, PROC REPORT, but that’s where it gains the power to build highly customized reports. It’s also very structured (and easy to build, like playing with blocks):

/*making a sortable HTML table*/
%macro ods_html_sort_table;
    <script src='http://goo.gl/Pg0GB'></script>
    <script src='http://goo.gl/ruKEb'></script>
    <script>$(document).ready(function(){$('.table').tablesorter({widgets: ['zebra']});});</script>
%mend;

title ;
ods listing close;
ods html   file="a:\test\iris.html" style=sasweb headtext="%ods_html_sort_table";

data _null_;
    set sashelp.iris;
    by Species;
   
/*    create an object, obj, based on the ODS Report Writing class, odsout*/
    if _n_ = 1 then do;
        dcl odsout obj();
    end;

    if (first.Species) then do; *by group processing;
       obj.title(text: "Fisher’s Iris Data Set by Species"); *title;

/*       start a table*/
       obj.table_start();
               obj.row_start();
              if (Species = "Setosa") then
                 obj.image(file: "Iris_setosa.jpg" );*insert image;
              else if (Species = "Versicolor") then
                 obj.image(file: "Iris_versicolor.jpg" );
              else if (Species = "Virginica") then
                 obj.image(file: "Iris_virginica.jpg" );
            obj.row_end();

            obj.row_start();
                obj.format_cell(text: "Iris Species",  overrides: "fontweight=bold just=right" );
                obj.format_cell(text: Species, column_span: 3, overrides: "just=left");
            obj.row_end();

            obj.row_start();
                obj.format_cell(text: "Unit",  overrides: "fontweight=bold just=right" );
                obj.format_cell(text: "(mm)", column_span: 3, overrides: "just=left");
            obj.row_end();
       obj.table_end();

       /* start another table */
       obj.table_start();
            obj.head_start();
                obj.row_start();
                    obj.format_cell(text: "Sepal Length" , overrides: "fontweight=bold");
                    obj.format_cell(text: "Sepal Width" , overrides: "fontweight=bold");
                    obj.format_cell(text: "Petal Length" , overrides: "fontweight=bold");
                    obj.format_cell(text: "Petal Width" , overrides: "fontweight=bold");
                obj.row_end();
            obj.head_end();
    end;

        obj.row_start();
            obj.format_cell(data: SepalLength);
            obj.format_cell(data: SepalWidth);
            obj.format_cell(data: PetalLength);
            obj.format_cell(data: PetalWidth);
        obj.row_end();

    if (last.Species) then do;
        obj.table_end();

        obj.note(data: "Note: These Tables are Sortable."); *note;

        obj.foot_start(); *footer;
            obj.row_start();
                obj.cell_start();
                    obj.format_text(data: "Footer: Data from SAS V&sysver at &sysscp &sysscpl Sashelp.iris",just:"C");
                obj.cell_end();
            obj.row_end();
        obj.foot_end();

        obj.page();
    end;
run;

ods html close;
ods listing;

Note:

  • A flavor added to get a sortable HTML report. Thanks to Charlie Huang and then Andrew Z for introducing the JavaScript library jQuery to SAS HTML reports.
  • For the full report, see here.
  • If column spanning is needed, use the following code as the header (and see the report here):

    obj.head_start();
        obj.row_start();
            obj.format_cell(text: "Sepal" , overrides: "fontweight=bold",column_span: 2);
            obj.format_cell(text: "Petal" , overrides: "fontweight=bold",column_span: 2);
        obj.row_end();

        obj.row_start();
            obj.format_cell(text: "Length" );
            obj.format_cell(text: "Width" );
            obj.format_cell(text: "Length" );
            obj.format_cell(text: "Width" );
        obj.row_end();
    obj.head_end();

  • The ODS Report Writing Interface will take off its preproduction hat in SAS 9.4, but you have been able to use it in some form since SAS 9.1.3. For more, see the draft SAS 9.4 ODS documentation on the ODS Report Writing Interface.
  • To get started, see the developer’s paper.