Thursday, June 14, 2012

To calculate SAS program run time

Sometimes, I wanted to compare the efficiency between two approaches. I used following codes to get the program execution run-time:
 
%let _sdtm=%sysfunc(datetime());
*** SAS codes ***;
%let _edtm=%sysfunc(datetime());
%let _runtm=%sysfunc(putn(&_edtm - &_sdtm, 12.4));
%put It took &_runtm second to run the program;

It is also very easy to print the date and time:
%put %sysfunc(putn(&_sdtm, datetime20.));

Friday, June 8, 2012

Importing/exporting to XML file

I used to save metadata in SAS datasets so that I could check received data, or do something else. Usually, I save the metadata with data so that I don't create any new libname. But, other users may not want to see the metadata mixed with real data in SAS datasets.

I don't like to export metadata as a MS Excel sheet. MS Excel may change the original data, and some data structure information will be lost during PROC IMPORT/EXPORT. When the Excel sheet is imported back to SAS dataset, it could be significantly changed comparing to the original dataset.

I may also use plain text file, like CSV, ASCII files. It usually needs more customized codes.

One of solutions is using XML file. Below are the sample codes:

LIBNAME myxml XML "path\file_name.xml" XMLMETA=SCHEMADATA XMLTYPE=MSACCESS;

In the codes, XMLMETA=SCHEMADATA makes the XML file contain both metadata and data; XMLTYPE=MSACCESS is best XMLTYPE that will preserve the original data.

Using "PROC COMPARE" to compare the original dataset and dataset imported from XML file, the variables' attributes are almost identical, except format and informat. For data, I only found that datetime values are coverted to INT when datasets are exported to XML file by now. In most of cases, this works fine for me.

Wednesday, June 6, 2012

To select desired outputs in one PDF file


Recently, I was asked to generate a series of correlation coefficient plots (Scatter Plots). I put them in one big PDF file. Basically, the codes are like:

ods pdf file="***.pdf";
... codes for generating plots ...
ods pdf close;

Today, I was asked to add p-Values in those plots. To get p-Values, I have to use “ODS OUTPUT” (true?) to catch it by “PROC CORR”. The codes are like below:

ods output PearsonCorr=out1;
proc corr data=dataset_name PEARSON;
var x y;
run;
ods output off;

Basically, I just want the p-Value in dataset out1 from "ods output". After I add the codes, the outputs of “PROC CORR” are added to the PDF file. ouch!

Maybe, I used wrong terms. I could not find any answer from google to solve the issue. I turned back to SAS help. After researches and tests, I found a solution: adding “ods pdf exclude/select;”  to turn off/on the outputs in the PDF file. 

So the codes are like below (highlighted in light-green):

ods pdf file="***.pdf";
... codes for generating plots ...

ods pdf exclude all;
ods output PearsonCorr=out1;
proc corr data=dataset_name PEARSON;
var x y;
run;
*ods output off;
ods pdf select all;
... codes for generating plots
ods pdf close;

"PROC CORR" outputs are not shown in the pdf file. It works perfectly well.

Friday, June 1, 2012

To get obs number of a dataset


There are some approaches to get observation number of a dataset.

I usually use the observation numbers to decide how to generate a TLG. For example, if there is no data, I would generate a table with text as “No observation found…”, instead of no output or a blank page.

Mostly, I would use following codes:
%LET dsid =%sysfunc(OPEN(dataset_name));
%LET nobs =%sysfunc(ATTRN(&dsid, NLOBS));
%LET rc =%sysfunc(CLOSE(&dsid));
It should be good enough for me, because I know the dataset will be there. This means, above codes will not work for SAS views or if a dataset does not exist. If I want to put above codes in more general programs, I may check &dsid if the dataset exists.

I can also use SASHELP.VTABLE to fetch dataset information I want. It usually consumes a little more brain energy to get there, for example, I have to deal more details on libname and memname, and maybe more keystrokes.

If I want to know the observation numbers of subset of a dataset, which means a dataset with WHERE statement, it seems that I have to count the observations one by one. Usually, I use following codes:
%LET nobs=0;
DATA _null_;
    SET dataset_name (WHERE=(where_statement)) end=eof;
    IF eof THEN CALL SYMPUTX("nobs",_n_);
RUN;

It is important to have "%let nobs=0;" to initiate macro variable nobs.