Phase-III Macro System

From phenixxenia.org
Revision as of 28 June 2013, 16:36


General

The Phase-III Macro System is a flexible, data-independent, parameter-controlled set of SAS macros.

The Phase-III Macro System is not an end-to-end reporting tool.

  • It is a collection of closely interacting macro modules that provide transformation methods for datasets emerging from a study, making use of all the information available in the description (header) part of the dataset being processed. The user receives one or more output datasets containing character columns with standard names and externally controlled attributes.
  • The Phase-III Macro System provides subroutines that take care of data types, formats, labels, headers, missing values, loops and more. Information generated at runtime to control processing is kept in standardized data structures: macro-variable lists (mlists), SAS formats and datasets.
  • Input data structures may need some pre-processing, and output data structures may need some post-processing, to fully meet requirements. The Phase-III Macro System already supports these steps to some extent through its condense, struct and missline functions.

Objective

The Phase-III Macro System is intended as the base for an extensible system that provides mechanisms for shaping input datasets, performing calculations and generating SAS datasets with ready-made text content.

Scope

The Phase-III Macro System interacts with and makes use of other available programs, modules, systems and datasets. Communication and information interchange use SAS macro variables, operating system environment variables and data structures compatible with the SAS System.

Input data generally requires preprocessing, mainly the assignment of formats and labels. Output datasets generally need postprocessing, mainly with merge and set operations.

Characteristics

Modules are kept small (no more than three screen pages) for maintainability and avoid hard-coded references to application-related information such as data types, labels and formats. The coding style makes broad use of automatic documentation and of metadata and lookup tables generated at runtime.

Approach

  1. Avoid dependency of programs on data scope, study characteristics or personal styles.
  2. Have modules implemented in a way to operate in any emerging environment.
  3. Be prepared to add new output structures without substantial delay.
  4. Produce a wide variety of output with a minimum set of modules.
  5. Minimize maintenance efforts through self-documenting and limited program code.
  6. Maximize validation throughput by adopting a non-mutual-impact architecture.

Architecture

Info Modules

Provide information about datasets and variables for correct processing.

Service Modules

Provide frequently requested tasks in a standard format with a limited parameter set.

Core Modules

Perform input transformation, calculations and output transformation.

User Modules

Generate datasets carrying subtables controlled by user-supplied parameters.

Module Details

Info Modules

%GET_ATTR()

Function

Return single attributes like label, format, etc.

Description

Reads the dataset header and returns attributes as undeclared macro variables named after the requested attributes. The information becomes available when the respective variable is declared in the calling environment with a %global or %local statement.
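
The scoping mechanism described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the actual %GET_ATTR() source: the macro name DEMO_ATTR, its parameters dsn= and var=, and the attribute names label and format are hypothetical. It relies on the standard SAS rule that a %LET inside a macro updates a variable already declared in an enclosing scope, and otherwise creates a local variable that disappears when the macro ends.

```sas
/* Hypothetical sketch: DEMO_ATTR is NOT the actual %GET_ATTR() source. */
%macro demo_attr(dsn=, var=);
  %local dsid pos rc;
  %let dsid = %sysfunc(open(&dsn));            /* open the dataset header   */
  %if &dsid %then %do;
    %let pos = %sysfunc(varnum(&dsid, &var));  /* position of the variable  */
    %if &pos %then %do;
      /* LABEL and FORMAT are deliberately not declared in this macro,  */
      /* so the values land in the caller's scope if declared there.    */
      %let label  = %qsysfunc(varlabel(&dsid, &pos));
      %let format = %sysfunc(varfmt(&dsid, &pos));
    %end;
    %let rc = %sysfunc(close(&dsid));
  %end;
%mend demo_attr;

%macro caller;
  %local label format;   /* declaring the names makes the results visible */
  %demo_attr(dsn=sashelp.class, var=age)
  %put NOTE: label=&label format=&format;
%mend caller;
%caller
```

Without the %local statement in the caller, the two values would be created in the local symbol table of DEMO_ATTR and discarded on return, which is the behaviour the description refers to.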

%GRP_DESC()

Function

Provide info about a categorical variable.

Description

Investigates the given categorical variable and provides results in undeclared macro variables: &n_grp: number of distinct values; &v_grp: structured list of distinct unformatted values; &l_grp: structured list of distinct formatted values.
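
One way such results could be collected is sketched below. It is an assumption-laden illustration, not the %GRP_DESC() source: the macro name DEMO_GRP is hypothetical, the "structured list" is assumed to be a plain blank-separated list (which breaks for values containing blanks), and the formatted list &l_grp is left out.

```sas
/* Hypothetical sketch: DEMO_GRP is NOT the actual %GRP_DESC() source. */
%macro demo_grp(dsn=, grp=);
  proc sql noprint;
    /* Number of distinct non-missing values */
    select count(distinct &grp) into :n_grp
      from &dsn;
    /* Assumed list layout: distinct unformatted values, blank-separated */
    select distinct &grp into :v_grp separated by ' '
      from &dsn
      where not missing(&grp);
  quit;
  %let n_grp = &n_grp;   /* strip the leading blanks left by INTO */
%mend demo_grp;

%macro caller;
  %local n_grp v_grp;    /* declare the result names in the calling scope */
  %demo_grp(dsn=sashelp.class, grp=sex)
  %put NOTE: n_grp=&n_grp v_grp=&v_grp;
%mend caller;
%caller
```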

%CHK_LIST()

Function

Provide info about a list-type macro variable.

Description

Reads the supplied list of tokens and returns undeclared macro variables: &n_lst: number of list elements; &v_lst: structured list of supplied elements. Input list elements may be separated only by blanks and commas.
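
The token handling can be sketched with %QSCAN and an explicit delimiter set of blank and comma. The macro name DEMO_LIST and the blank-separated layout of &v_lst are assumptions made for illustration; the actual %CHK_LIST() implementation is not shown in this page.

```sas
/* Hypothetical sketch: DEMO_LIST is NOT the actual %CHK_LIST() source. */
%macro demo_list(list=);
  %local i;
  %let i = 0;
  %let v_lst = ;
  /* Walk the list token by token; blank and comma are the only delimiters */
  %do %while(%length(%qscan(%superq(list), %eval(&i + 1), %str( ,))) > 0);
    %let i = %eval(&i + 1);
    %let v_lst = &v_lst %qscan(%superq(list), &i, %str( ,));
  %end;
  %let n_lst = &i;
%mend demo_list;

%macro caller;
  %local n_lst v_lst;    /* declare the result names in the calling scope */
  %demo_list(list=%str(age, sex  height))
  %put NOTE: n_lst=&n_lst v_lst=&v_lst;
%mend caller;
%caller
```

The %STR() around the argument masks the comma so that it is not read as a parameter separator in the macro call.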

User Modules

%TWO_CATV()

Function

Deliver a percentage/count table from two nested categorical variables.

Description

Perform nested processing of two categorical variables, looping the context variable of the row_* modules over the categories of the outer variable.


Parameters

Name             Description
dsn              input dataset name
row, row2        categorical variable name; row2 = nested variable
exclude          decode of the group to exclude from &ROW
weight           Y/N (multiply percentages for &ROW and &ROW2)
col              categorical variable name used for columns
head2            Y/N (block header for nested variable)
indent, indinc   n (number of indent columns; increment for nested variable)
num              n (sequence number of output)
stat             Y/N (column with statistics names)
space            1/2/3 (blank line before or after output and between nesting levels)
struct, struct2  name of reference dataset used for full decode structure; struct2 = nested variable
condense         var#value (non-distinct variable and true value for &ROW)
misslin2         Y/N (force missing line for nested variable)
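
A call using these parameters might look like the sketch below. The dataset name adae and the variable names aesoc, aedecod and trt are hypothetical, as is the value given to TAB_NAME; the macro source reads &TAB_NAME, but where that variable is normally set is not stated here, so assigning it in open code is an assumption. The call also presumes that %TWO_CATV() and the modules it invokes are available in the session.

```sas
/* Hypothetical call: dataset and variable names are illustrative only. */
%let tab_name = TAB0DEMO;      /* assumed to be set by the calling program */

%TWO_CATV(dsn=adae             /* input dataset                            */
         ,row=aesoc            /* outer categorical variable               */
         ,row2=aedecod         /* nested categorical variable              */
         ,col=trt              /* column variable                          */
         ,num=1                /* sequence number of the output            */
         ,stat=N
         ,weight=N
         ,head2=Y
         ,indent=0
         ,indinc=2
         ,space=2);
```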

Source

/* Declarations and upper-level processing */
%MACRO TWO_CATV(dsn=
               ,exclude=
               ,row=
               ,row2=
               ,col=
               ,indent=0
               ,num=
               ,stat=N
               ,weight=Y
               ,space=2
               ,condense=
               ,struct=
               ,struct2=
               ,head2=N
               ,misslin2=
               ,indinc=2)
/ store des="" 
;
%LOCAL n_grp v_grp n name;
%LET name=TWO_CATV;
%IF &STRUCT  eq %THEN %LET struct =&DSN;
%IF &STRUCT2 eq %THEN %LET struct2=&DSN;
%GRP_DESC(dsn=&DSN
         ,grp=&ROW
         ,miss=n)
;
%TOP_FILT(dsn=&DSN
         ,grp=&ROW
         ,by=&COL
         ,grplvl=&NUM
         ,var=
         ,condense=&CONDENSE)
;
%TOP_FREQ(dsn=top_filt
         ,struct=&STRUCT
         ,grp=&ROW
         ,by=&COL)
;
%TOP_OUTC(dsn=top_freq
         ,head=n
         ,total=n
         ,stat=&STAT
         ,indent=&INDENT
         ,grp=&ROW
         ,rev=n
         ,use=
         ,by=&COL
         ,missline=)
;
/* Loop for lower-level processing */
%DO n=1 %TO &N_GRP;
  %IF %SCAN(&V_GRP,&N) ne &EXCLUDE %THEN %DO;
    %ROW_FILT(dsn=&DSN
             ,context=&ROW
             ,subgrp=&N
             ,grp=&ROW2
             ,by=&COL
             ,var=
             ,miss=n)
    ;
    %ROW_FREQ(dsn=row_filt
             ,sum=top_freq
             ,struct=&STRUCT2
             ,context=&ROW
             ,grp=&ROW2
             ,by=&COL
             ,weight=&WEIGHT)
    ;
    %ROW_OUTC(dsn=row_freq
             ,sum=main_3rd
             ,head=&HEAD2
             ,stat=&STAT
             ,indent=%EVAL(&INDENT+&INDINC)
             ,context=&ROW
             ,grp=&ROW2 
             ,by=&COL
             ,missline=&MISSLIN2)
    ;
  %END;
%END;
/* Name the output dataset and send the completion mail */
%IF &TAB_NAME ne %THEN %DO;
  data %SUBSTR(&TAB_NAME,1,3)&NUM%SUBSTR(&TAB_NAME,5,4);
   set
  %DO n=1 %TO &N_GRP;
    %IF &SPACE eq 1 %THEN dummy ;
    %IF %SCAN(&V_GRP,&N) ne &EXCLUDE %THEN row&NUM._&N ;
    %IF &SPACE eq 2 %THEN dummy ;
  %END;
    %IF &SPACE eq 3 %THEN dummy ;
   ;
  run;
%END;
%GEN_MAIL(name=&NAME);
%MEND TWO_CATV;
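
The output dataset name in the final data step is assembled from the first three characters of &TAB_NAME, the sequence number &NUM, and four characters of &TAB_NAME starting at position 5. The expression can be tried in isolation, as sketched here; the value TAB0DEMO is a hypothetical eight-character example, not a name taken from the system.

```sas
/* Hypothetical values, used only to illustrate the naming expression. */
%let tab_name = TAB0DEMO;
%let num = 3;

/* Same expression as in the data statement of TWO_CATV */
%put %SUBSTR(&TAB_NAME,1,3)&NUM%SUBSTR(&TAB_NAME,5,4);
```

With these values the expression resolves to TAB3DEMO. Because the second %SUBSTR reads four characters from position 5, &TAB_NAME has to be at least eight characters long.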