Category Archive ‘CDISC‘

 
 

Blogging is Awesome: CDISC Bloggers

I remember when blogging was cool.

Before the specializing and monetizing and Twitter-izing.

                                      —Peter Dewolf

Well, I think blogging is still cool (and awesome, and awesome …). The most appealing reason for me personally: blog posts are searchable on Google and suitable for archiving, while tweets are NOT. Admittedly, I hold to a sort of Existentialism 2.0:

if it is not Google searchable, it doesn’t exist!

Last month I wrote a post on how to keep pace with CDISC through its official channels, and it feels cool to add an appendix of sources from the awesome blogosphere. Fortunately or not, CDISC is still a niche topic, so it takes little effort to compile the list (let me know if I missed anyone! If you are a Google Reader user, simply import this file, my Google Reader subscription on CDISC):

1. Blog @ Assero by Dave Iberson-Hurst (“Dave IH”)

http://www.assero.co.uk/category/blog/

Insightful and full of humor. I retweeted all of its latest posts, and you can get a feel for it from these titles (YES, they are about CDISC):

What I Want, What I Really Really Want

Churchill, the FDA and a Fall

Mad March and the FDA

Btw, I write my blog in a casual way, while IH’s impressive writing reminds me of George Orwell’s style.

2. d-Wise Technologies Blog

http://www.d-wise.com/blog/

It is my employer’s official blog, where Chris Decker is the key contributor on CDISC topics. You can check out his latest posts on the FDA/PhUSE Annual Computational Science Symposium, where he served as committee lead:

Overcoming Industry Challenges: A Shift to Collaboration

Validation and Quality: Are They the Same?

I will also commit to updating this blog as my understanding of clinical standards grows. As the saying goes:

look to the master,
follow the master,
walk with the master,
see through the master,
become the master.

3.  XML4Pharma Blog

http://cdiscguru.blogspot.com/

With industry news and hard-core (yet cool) writing on XML (CDISC ODM, define.xml).

4. eClinical Trends by Clinovo

http://blog.clinovo.com/category/cdisc/

Clinovo jumped into this topic by launching CDISC Express, a CDISC SDTM converter.

5. eClinicalOpinion

http://eclinicalopinion.blogspot.com/

This blog focuses mostly on EDC, the clinical data management side. I like its series of discussions on CDISC ODM.

6. eCTD Regulatory Submissions Network

http://ectdregulatorysubmissionsnetwork.blogspot.com/

This is a personal blog by Shakul Hameed. I read it mostly for information on submission requirements from European regulators.

7. HL7 Watch

http://hl7-watch.blogspot.com/

While it is not directly related to CDISC (neither is #6), it’s nice to hear some voices on HL7, which would be the future of CDISC.

8. From a Logical Point of View-CDISC

http://www.jiangtanghu.com/blog/category/cdisc/

Yes, this one, my 2 cents. I will keep recording my personal immersion in, and understanding of, CDISC and related clinical standards. (It is a privilege to cross-reference oneself in one’s own blog! Keep awesome, keep blogging.)

9. Linked Data and URI:s for Enterprises

http://kerfors.blogspot.com/

Look at the colon (:) in the title of this blog and you’re right: this blog plays (at least) with XML. I find it a good resource (thanks @kerfors for the reference!) for learning ODM, the foundation of CDISC. The latest post is

Semantic models for CDISC based standard and metadata management

P.S.: Blogger Chris Hemedinger maintains a nice list of SAS bloggers (blogs by SAS employees, and blogs by SAS customers, consultants, and the analytics community).

OpenCDISC Validator V1.3: An Unboxing Review (1): counting issue

The latest OpenCDISC Validator, version 1.3, was released on 29 March 2012 (btw, there is a typo in Line 1 of CHANGELOG.txt within the package: it should read “2012”, not “2011”). As usual, you can submit the following SAS script to get some basic information (remember to customize your directory):

filename CDISC url "https://raw.github.com/Jiangtang/Programming-SAS/master/Rules_Count_OpenCDISC_XML.sas";

%include CDISC;

%Rules_Count_OpenCDISC_XML(dir=C:\OpenCDISC1.3\compare\opencdisc-validator_1.3\config)

and you get a summary of validation rules of OpenCDISC Validator V1.3 (499 total unique rules):

OpenCDISC_V1.3

where

AD: Analytical Data
CT: Controlled Terminology
DD: Data Definition
OD: Operational Data Model
SD: Study Data
SE: SEND data

For comparison, here is the summary of V1.2.1 (385 total unique rules) that I posted before:

The most significant enhancement of V1.3 over V1.2.1 is the addition of rules for SDTM 3.1.2 with Amendment 1 and for SEND 3.0. You can see there are also some changes in other modules, such as ADaM 1.0 and SDTM 3.1.2. The OpenCDISC release newsletter said that 43 new SDTM rules were added. Well, with rules deleted, rules added and rules commented out, we now have some arithmetical discrepancies.

The script above captures all instances of validation rule IDs (and excludes the commented-out ones; for example, four rules are commented out in config-define-1.0.xml: OD0004, OD0005, OD0007, OD0008). We can also double-check the counts manually:

  • copy all the contents, for example the SDTM 3.1.2 rules on the website, into Notepad++ (which displays line numbers)
  • delete all unnecessary entries
  • the last line number is then the total number of rules (227 in this case).
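For readers without SAS at hand, the counting idea can also be sketched in Python. This is only a minimal illustration, not the macro from the repository: the sample XML is made up, and the assumptions that each validator element carries an `ID` attribute and that the first two letters of the ID identify the module are mine. Rules inside XML comments are not counted, because the parser drops comments.

```python
import re
import xml.etree.ElementTree as ET
from collections import Counter

# Made-up sample (assumption); a real config file would be read with ET.parse(path).
SAMPLE_CONFIG = """<Config xmlns:val="http://example.org/validator">
  <val:Required ID="SD0001" Message="..."/>
  <val:Regex ID="SD0002" Message="..."/>
  <val:Required ID="SD0002" Message="same ID used twice on purpose"/>
  <val:Find ID="AD0061" Message="..."/>
  <!-- <val:Required ID="OD0004" Message="commented out"/> -->
</Config>"""

# Two capital letters followed by four digits, e.g. SD0001.
RULE_ID = re.compile(r"^[A-Z]{2}\d{4}$")


def count_unique_rules(xml_text):
    """Count unique rule IDs per two-letter prefix, ignoring commented-out rules."""
    root = ET.fromstring(xml_text)
    unique_ids = set()
    for element in root.iter():
        rule_id = element.get("ID")
        if rule_id and RULE_ID.match(rule_id):
            unique_ids.add(rule_id)
    return Counter(rule_id[:2] for rule_id in unique_ids)


counts = count_unique_rules(SAMPLE_CONFIG)
print(dict(counts), "total:", sum(counts.values()))
```

Collecting the IDs in a set before counting is what makes the result a count of unique rules rather than of rule instances.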

Another way to check the rules is to open the XML configuration files using a web browser:

Theoretically the three ways give identical counts, but there is an open bug in the style sheet file …\OpenCDISC1.3\opencdisc-validator\config\resources\xsl\config.xsl, Line 175:

<xsl:template match="val:Unique|val:Condition|val:Match|val:Regex

|val:Required|val:Lookup|val:Metadata">

There is no “val:Find” to render the Find validation rules (e.g., AD0061 in config-adam-1.0.xml), so none of the Find validators are displayed. A suggested workaround is simply to add “val:Find” to the file:

<xsl:template match="val:Unique|val:Condition|val:Match|val:Regex

|val:Required|val:Lookup|val:Metadata|val:Find">

Actually, on the “OpenCDISC Validation Framework” page of the OpenCDISC website, the “Find” validator is not documented yet.

<to be continued>

Fetch CDISC Controlled Terminology Files from the NCI Vocabulary Repository: All in One Click

CDISC Controlled Terminology is the most frequently updated of the CDISC standards. Take SDTM as an example: the latest SDTM terminology was released on 23 March 2012, and from 2009 to 2011 there were 15 different SDTM terminology versions! If you rely only on your own local repository, you might fall behind.

Here is a simple approach. Just submit the following one line of code in a shell:

wget http://evs.nci.nih.gov/ftp1/CDISC/ -r --no-parent -l 3

and you will get all the CDISC Controlled Terminology files (plus historical versions), with the proper folder structure, on your local drive (in the current directory of your shell).

For syntax details, you can refer to its online manual:

-r: retrieve recursively

--no-parent: only fetch files under the given URL (never ascend to the parent directory)

-l 3: set a maximum recursion depth of 3

If you are a Windows user, you can install Wget for Windows (on Unix/Linux it is a native GNU tool) and add it to your Path environment variable. You can also paste the above command into Notepad and save it as a .bat file (test.bat, for example). Next time, just double-click test.bat to get all the updates.
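If you would rather schedule the download from a script than click a .bat file, the same command can be wrapped in a few lines of Python. This is just a sketch: the function names are my own, and it assumes wget is already installed and on the Path.

```python
import shutil
import subprocess

CDISC_URL = "http://evs.nci.nih.gov/ftp1/CDISC/"


def build_wget_command(url=CDISC_URL, depth=3):
    """Assemble the wget call: recursive, never ascend to the parent, limited depth."""
    return ["wget", "-r", "--no-parent", "-l", str(depth), url]


def fetch_terminology(url=CDISC_URL, depth=3):
    """Run wget in the current directory; raises if wget is missing or fails."""
    if shutil.which("wget") is None:
        raise RuntimeError("wget was not found on the Path")
    return subprocess.run(build_wget_command(url, depth), check=True)


if __name__ == "__main__":
    # Only print the command here; call fetch_terminology() to actually download.
    print(" ".join(build_wget_command()))
```

Keeping the command as a list of arguments (instead of one string) avoids shell quoting differences between Windows and Linux.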

Wget is a very powerful tool. For me, the download speed is quite acceptable (it depends on the internet connection), with almost no difference between a Windows 7 and an Ubuntu 11 machine:

Wget_Win7

Wget_ubuntu11

Quick Notes on RTP CDISC User’s Group Q1 Meeting

It was my first time attending a local event, the Q1 meeting of the RTP (Research Triangle Park) CDISC User’s Group, and here are some quick notes.

1. people

Almost all fresh faces for me. It was nice to meet Jack Shostak of Duke Clinical Research Institute again. I visited him at Duke last year after SAS Global Forum in Las Vegas. Jack has a forthcoming book on SAS and CDISC, Implementing CDISC Using SAS: An End-to-End Guide. It’s the first book on this topic and worth waiting for!

I also (unexpectedly and excitingly) met a Chinese friend, Chunmao, at the meeting. Very interesting: after the introductions, we realized that we had emailed each other about CDISC mapping before! Chunmao had just moved from DC to the Triangle as a SAS programmer a few weeks earlier (a side note: the Triangle is hiring!). A big bonus for attending this meeting.

My colleague Chris Decker of d-Wise Technologies also showed up at this meeting. Actually, he and Jack both serve as committee members of the RTP CDISC User’s Group (they are also core members of the CDISC community worldwide).

Tom Soeder of Cato (which provided the venue for this meeting) kindly served as host, while Jeff Abolafia of Rho was the moderator.

2. agenda

Jeff and another key member of this group introduced some important updates from CDISC. One of the most interesting messages for me is the regular release cycle for the SDTM Model and Implementation Guide. SDTM will be released semiannually, so we will get SDTMIG 3.1.3 this summer and 3.1.4 at the end of the year, which will mainly hold the recent updates to Trial Summary, an amendment, and the CDISC Device domains, respectively.

SDTM is the flagship model of CDISC. SDTMIG 3.1.1 was published in 2005 and 3.1.2 in 2008. It’s nice to see from the new, more frequent release schedule that the CDISC community is expanding (and becoming more organized and predictable).

Recently SDTM has had lots of updates, including a copy of the Metadata Submission Guideline (MSG). The CDISC organization will also offer periodic webinars on updates.

Chris then gave a summary of the latest FDA/PhUSE Computational Science Symposium (CSS, which Chris organized). You can get more information on Chris’s blog and the CDISC blog. It’s worth keeping CSS on your watch list.

Jack and Jeff commented on working with the FDA/PhUSE working groups.

Peter Schaefer of Certara presented the results of the latest CDISC user network survey, in which SDTM and ADaM are still at the top of users’ lists.

The final part (and the most practical): group exercises! Three groups were assigned to map some challenging CRF pages to SDTM. Some users also brought CRF pages from their own companies for public discussion (nice to have some variety!).

3. Links

RTP CDISC User’s Group on Yahoo Group (the traffic is low but still informative):

http://tech.groups.yahoo.com/group/rtp_cdisc/

CDISC official site:

http://www.cdisc.org/

Now we have more reasons to visit the CDISC website frequently for newly updated models (e.g., Controlled Terminology is also released semiannually) and webinar postings.

FDA/PhUSE working groups Wiki:

http://www.phusewiki.org/wiki/index.php?title=PhUSE_Wiki

Lots of activity to follow from the six working groups.

Chris is one of the core members promoting CDISC across industry and regulators. He is also the most active writer on the d-Wise blog, where you can stay informed:

http://d-wise.com/blog/

GitHub and Weekend Programming

Yihui of Iowa State just texted me that GitHub is the programmers’ Facebook. Inspired by him (great thanks!), I have also begun to play with GitHub:

https://github.com/Jiangtang

Currently I have created only one repo, as a personal SAS code repository. To kill some weekend time, I uploaded a piece of code that counts the OpenCDISC validation rules by model. To use it:

filename CDISC url "https://raw.github.com/Jiangtang/Programming-SAS/master/Rules_Count_OpenCDISC_XML.sas";

%include CDISC;

%Rules_Count_OpenCDISC_XML(dir=C:\temp\OpenCDISC\software\opencdisc-validator\config)

and you get:

OC_by_model

Happy weekend and happy programming.

Face Off: Review OpenCDISC XML files

OpenCDISC, the first open source CDISC validator, is already in the toolbox of FDA reviewers (CDER/CBER; see CDISC Standards in the Regulatory Submission Process, 26 January 2012, P.33). The key feature of OpenCDISC is the dichotomy between validation rules (XML-based) and application logic. Currently OpenCDISC Validator (Version 1.2.1) officially supports the following four CDISC modules:

You can get the corresponding configuration files (validation rules) online or in the software folder (in ..\opencdisc-validator\config, with the .xml extension). Since SDTM 3.1.2 has the richest set of validation rules, drawn from Janus, WebSDM and of course OpenCDISC’s own additional rules, its configuration file (config-sdtm-3.1.2.xml) deserves more attention. A better understanding of config-sdtm-3.1.2.xml is the first step toward customizing the software according to business needs. The following are some personal tips and tricks for playing with, and even “torturing”, the file, using Notepad++, web browsers (IE and Firefox), Excel with MSXML, and SAS XML Mapper.

1. DON’T use the default Windows Notepad to open and edit the XML file

XML_Notepad

The reason:

if you open an XML file in Notepad, you get almost nothing but strings and more strings.

For another supporting reason, see the picture below.

2. USE Notepad++ or other REAL text editors to open and edit it

XML_Notepad

Notepad++ makes the difference. It supports a multi-tab view, XML syntax highlighting, XML tag matching and other fancy stuff never found in plain Notepad. And like OpenCDISC, it’s free, in both the free-beer and the free-speech sense.

Other real text editors include Vim, UltraEdit and the like, but for most users I still think Notepad++ is the handiest one.

3. First, use a web browser to review it

XML_IE

This is the web view of config-sdtm-3.1.2.xml. The secret is a style file, define-1.0.xsl, in ..\opencdisc-validator\config\schematron. This is another story of dichotomy. The config-sdtm-3.1.2.xml file itself only stores the metadata (machine-readable), while the style file (also an XML file) tells how to display it (human-readable). Given a proper internal interface (an XSLT processor), web browsers can render it (I tested IE and Firefox; Google Chrome doesn’t work). Excel can also render this XML file well (tested only on Excel 2010 and 2007), although the web view is much better:

XML_Excel

4. The real awesome job: use Microsoft XML parser or other XML parsers to dig into XML structure

XML_Tags_Excel

I use Excel 2010 with the Microsoft XML parser (MSXML 6.0. You can find your MSXML version by visiting this website in IE; you will get different results in other web browsers, because Firefox and Chrome use other parsers).

You can also get an instance of each XML tag:

XML_Tags_Excel_preview

5. The real awesome job: use SAS XML Mapper to get the tabulation view

You may also want to extract all the tables in the XML file in a tabulation view, ideally as SAS datasets:

For example, the first few rows in config-sdtm-3.1.2.xml:

ODM_xml_tab

and the corresponding SAS dataset:

ODM_tab

Actually, you can put all the data in the XML file into one big dataset, but with lots of redundancy. To use SAS XML Mapper (the latest version is 9.3), you should design a mapping file that describes the structure of the XML file. For the simple ODM dataset, you specify the table name and, for each column, its name, path, type and length:

map

It is never fun to play with XML files. SAS XML Mapper is supposed to read CDISC ODM-based XML files automatically (OpenCDISC XML files are said to be ODM compliant), but at least for this config-sdtm-3.1.2.xml it failed, and that’s why we have to create the mapping file (see above) ourselves. Fortunately, you don’t need to write it from scratch (it would be thousands of lines of code):

  • find a CDISC ODM-based XML file that SAS XML Mapper can read automatically; e.g., at http://www.cdisc.org/define-xml, a file named define-example1.xml works well.
  • use the AutoMap function in SAS XML Mapper to get the mapping file.
  • modify the mapping file to fit your needs.
  • for details, refer to the SAS XML mapping syntax.
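To get a feel for what a “tabulation view” of an XML file means, here is a rough Python sketch that flattens one repeating element into rows, with one column per attribute. It is not how SAS XML Mapper works internally, and the ODM-like sample, the element name and the column list are all invented for illustration.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Invented ODM-like sample (assumption), not the real config-sdtm-3.1.2.xml layout.
SAMPLE_XML = """<ODM FileOID="demo">
  <Study OID="S1">
    <ItemDef OID="IT.USUBJID" Name="USUBJID" DataType="text" Length="20"/>
    <ItemDef OID="IT.AGE" Name="AGE" DataType="integer" Length="3"/>
  </Study>
</ODM>"""


def element_table(xml_text, tag, columns):
    """Return one dict per <tag> element, keeping only the listed attributes."""
    root = ET.fromstring(xml_text)
    return [{col: el.get(col, "") for col in columns} for el in root.iter(tag)]


def to_csv(rows, columns):
    """Render the rows as CSV text, e.g. for a later import into SAS or Excel."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=columns, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


COLUMNS = ["OID", "Name", "DataType", "Length"]
rows = element_table(SAMPLE_XML, "ItemDef", COLUMNS)
print(to_csv(rows, COLUMNS))
```

The tag name plus the column list plays the role of a (very small) mapping file: it says which path becomes a table and which attributes become columns.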

6. Final Notes for Excel

Right-click config-sdtm-3.1.2.xml, then open it with “Microsoft Excel”:

Excel1

Option 2 leads to section 3. If you go with option 1:

Excel2

Option 1-1 and 1-2:   tabulation view in section 5

Option  1-3:  tag view in section 4.

Dive into CDISC Express (5): Generate and Validate SDTM domains and define.xml

Dive into CDISC Express (1): Introductory

Dive into CDISC Express (2): Create a New Study

Dive into CDISC Express (3): Navigate mapping file

Dive into CDISC Express (4): Data manipulation techniques

A friendlier PDF version of this whole CDISC Express series is also available at

http://jiangtanghu.com/docs/en/CDISCExpress.pdf

The remaining tasks, such as generating SDTM domains and define.xml, need just a few button clicks in CDISC Express, given a well-designed mapping file. Thanks to the software, few words are needed.


Click to read more…

Dive into CDISC Express (4): Data manipulation techniques

Dive into CDISC Express (1): Introductory

Dive into CDISC Express (2): Create a New Study

Dive into CDISC Express (3): Navigate mapping file

4.3 Data manipulation techniques in CDISC Express

CDISC Express supplies a relatively rich set of data manipulation techniques, built on the SAS language, for data mapping. The following is a non-exhaustive list, and I will keep it updated.

4.3.1 Reference one dataset

A raw dataset name appearing in the “Dataset” column indicates a “set” operation in SAS.

All dataset options can be used when referencing a dataset, such as

siteinv(drop=invcode)

siteinv(rename=(invcode=inv))

siteinv(where=(invcode ne ""))

You can also reference an external dataset. You should incorporate the external file into the spreadsheet under a name beginning with an underscore, “_” (“_visits” in this case):

clip_image001

Then you can use it in any domains needed, e.g., TV domain:

clip_image003

There is a macro, %cpd_importlist, used to import the external dataset “_visits”. Again, this macro lives in C:\Program Files\CDISC Express\macros\function_library\.

Using a macro call to reshape or modify an input dataset offers great flexibility in referencing data. We will discuss the benefits later on.

4.3.2 Assignment

You can assign a number, a string, or a dataset variable, with any valid SAS functions, to an SDTM domain variable in the “Expression” column.

Sometimes a temporary variable is needed for later calculation. You can create such a temporary variable in the “Dataset” column, with an assignment in the “Expression” column, just as for any other domain variable. There are two differences: first, the names of temporary variables begin with an asterisk, “*”; second, temporary variables are not included in the final domain. Once created, such temporary variables can be used in any other expression.

clip_image005

There are three special symbols used in the “Dataset” column of CDISC Express. The asterisk, “*”, indicates a temporary variable, while the other two are

Tilde, “~”: indicates a variable used for the supplemental qualifiers domain (SUPPQUAL).

Number sign, “#”: indicates a variable used for the comments domain (CO).

Another symbol, the at sign, “@”, is used in the “Expression” column to reference a variable produced earlier:

clip_image006

In this case, “AGEU” uses “AGE” as input, and “AGE” is calculated beforehand. “@AGE” just indicates the dependency. Conceptually, it looks like the “calculated” keyword in SAS PROC SQL:

proc sql;
  select (AvgHigh - 32) * 5/9 as HighC,
         (AvgLow - 32) * 5/9 as LowC,
         (calculated HighC - calculated LowC) as Range
    from temps;
quit;

4.3.3 Match-merging

We already saw a match-merging example. If “all” appears as a dataset in the “Dataset” column, all the previous datasets are merged first, for later processing, by the common key specified in the “Merge Key” column. If no key is assigned, the patient ID is used by the system.

CDISC Express also supports two types of join, inner join and outer join (left, right, full), using data steps. The implementation differs slightly from standard SQL, but the ideas are the same.

We add a new column, “Join”, usually beside the “Merge Key” column.

clip_image008

There are two values for “Join”, “O” or “I”, where “O” stands for “outer join” and “I” for “inner join”. In effect, a join indicator of “I” acts like the dataset option “in=”, while “O” means no such filter. Using the above as an illustration, the corresponding SAS code behind the scenes looks like

data temp;

merge demog(in=a) siteinv(in=b);

by sitecode;

if b;

run;

This is the so-called “right outer join”. Combinations of “I” and “O” on these two datasets can perform all four types of join, one inner join and three outer joins:

[Figure: table of “I”/“O” combinations and the resulting join types]

As we can see, if no “Join” column is specified, CDISC Express performs an inner join by default.
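To make the four combinations concrete, here is a small Python sketch of my reading of the rule: an “I” on a dataset means a merged row is kept only if that dataset contributed to it (like “if a” or “if b” after “in=”), and an “O” puts no condition on that dataset. It is not the CDISC Express implementation, it assumes one row per key in each dataset, and the sample data are made up.

```python
def match_merge(a_rows, b_rows, key, a_join="I", b_join="I"):
    """Merge two lists of dicts by key; an "I" flag requires presence in that dataset."""
    a_by_key = {row[key]: row for row in a_rows}
    b_by_key = {row[key]: row for row in b_rows}
    merged = []
    for k in sorted(set(a_by_key) | set(b_by_key)):
        in_a, in_b = k in a_by_key, k in b_by_key
        if a_join == "I" and not in_a:
            continue  # like "if a;" in the data step
        if b_join == "I" and not in_b:
            continue  # like "if b;" in the data step
        row = {key: k}
        row.update(a_by_key.get(k, {}))
        row.update(b_by_key.get(k, {}))
        merged.append(row)
    return merged


# Made-up sample data: site 1 has no investigator record, site 3 has no subjects.
demog = [{"sitecode": 1, "subjid": "A"}, {"sitecode": 2, "subjid": "B"}]
siteinv = [{"sitecode": 2, "invcode": "X"}, {"sitecode": 3, "invcode": "Y"}]

for a_flag, b_flag, name in [("I", "I", "inner"), ("I", "O", "left outer"),
                             ("O", "I", "right outer"), ("O", "O", "full outer")]:
    keys = [r["sitecode"] for r in match_merge(demog, siteinv, "sitecode", a_flag, b_flag)]
    print(a_flag, b_flag, name, keys)
```

With demog flagged “O” and siteinv flagged “I”, the sketch keeps exactly the rows that “if b;” keeps in the data step above.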

So far CDISC Express does not support multiple merge keys. For example, the following mapping is currently illegal:

Dataset      Merge Key
arm          siteid, grpno
armdescri    siteid, grpno

The developer, Romain, indicated that such an enhancement would be raised for the next round of the product road map, and he also proposed a workaround. To merge on multiple keys, we can create a temporary variable holding the concatenation of those keys, and then use this temporary variable as a single merge key.
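The idea behind the workaround can be sketched in Python as follows. The variable name “*mergekey” and the “|” separator are my own choices for illustration, not something prescribed by CDISC Express; the separator matters because plain concatenation can make different key pairs collide.

```python
def composite_key(row, key_names, sep="|"):
    """Join several key values into one string; the separator prevents collisions."""
    return sep.join(str(row[name]) for name in key_names)


def add_temp_key(rows, key_names, temp_name="*mergekey"):
    """Return copies of the rows with a temporary single merge key added."""
    return [{**row, temp_name: composite_key(row, key_names)} for row in rows]


# Made-up rows: without a separator both would collapse to the same key "123".
arm = [{"siteid": "1", "grpno": "23"}, {"siteid": "12", "grpno": "3"}]
keyed = add_temp_key(arm, ["siteid", "grpno"])
print([row["*mergekey"] for row in keyed])
```

In a SAS expression, the same effect would come from concatenating the key variables with a delimiter before using the result as the merge key.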

4.3.4 Concatenating

Above we discussed a lot about the “merge” operation in CDISC Express. This section is dedicated to the “set” operation. We already know how to “set” one dataset for referencing, but how do we “set” multiple datasets, i.e., concatenate them?

Symmetrically, just as “all” in the “Dataset” column indicates a merge operation, “all (stack)” indicates a concatenation operation:

clip_image014

The above file can also be translated into SAS code for better understanding:


Click to read more…

Dive into CDISC Express (3): Navigate mapping file

Dive into CDISC Express (1): Introductory

Dive into CDISC Express (2): Create a New Study

4. Step 2 of 6: Generate mapping file

Generating the (blank) template mapping file takes very little effort: just submit generate_mapping_template.sas. The toughest part is filling it with mapping rules according to the specific study.

4.1 Get the blank template mapping file (generate_mapping_template.sas)

To get the blank template mapping file, just fill in the single macro call in generate_mapping_template.sas:

%createmapping(filespec=SDTM_Specs_3_1_1.xls, Dom=CM AE TV, req=YES, perm=YES, exp=YES);

You can also specify the SDTM implementation version, 3.1.1 or 3.1.2. As for domains (&Dom), DM, CO and SUPPQUAL are created automatically; you should list the others as needed:

SDTM 3.1.1: SV CM EX AE DS MH DV EG IE LB PE QS SC VS TV TI XD SU XR XS XE TR (Total: 22)

SDTM 3.1.2: AE CE CM DA DS DV EG EX FA IE LB MB MH MS PC PE PP QS SC SE SU SV TA TE TI TS TV VS (Total: 28)

You should also choose the “CORE” variables (REQUIRED, PERMISSIBLE and EXPECTED) by setting &req, &perm, and &exp to “YES” or “NO”. Note that

REQUIRED and EXPECTED variables must always be included (req=YES, exp=YES);

PERMISSIBLE variables are included if needed (perm=YES or perm=NO).
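The effect of the three switches can be pictured with a tiny Python sketch. It only mimics the selection logic as I understand it from the parameters; the spec rows below are a small illustrative sample, not the content of SDTM_Specs_3_1_1.xls, and the function is not part of CDISC Express.

```python
# Illustrative spec rows (assumption): variable name plus its CORE category.
SPEC = [
    {"variable": "STUDYID", "core": "REQUIRED"},
    {"variable": "AESEQ", "core": "REQUIRED"},
    {"variable": "AESTDTC", "core": "EXPECTED"},
    {"variable": "AESPID", "core": "PERMISSIBLE"},
]


def select_variables(spec, req="YES", perm="YES", exp="YES"):
    """Keep the variables whose CORE category is switched on with "YES"."""
    switches = {"REQUIRED": req, "PERMISSIBLE": perm, "EXPECTED": exp}
    wanted = {core for core, flag in switches.items() if flag.upper() == "YES"}
    return [row["variable"] for row in spec if row["core"] in wanted]


print(select_variables(SPEC, perm="NO"))
```

Calling it with perm="NO" leaves only the REQUIRED and EXPECTED variables in the template, which is the minimum the note above allows.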

Submit generate_mapping_template.sas and you get a blank template mapping file, tmpmapping.xls, in C:\Program Files\CDISC Express\temp\.

clip_image002

Copy it to, for example, C:\Program Files\CDISC Express\studies\CLINCAP\doc\Mapping file – working version for the study “CLINCAP”, and then fill in all the blank columns (it NEEDS effort!).

If this mapping file passes the validation process, a final version named mapping.xls will be copied automatically to C:\Program Files\CDISC Express\studies\CLINCAP\doc\Mapping file – validated version\ for later processing.

Note that if you already have a validated mapping file from another study, it would serve as a better starting point than the blank template from scratch.

4.2 Navigate mapping file

Let’s first take a look at the “real” working mapping file for a demo study, in C:\Program Files\CDISC Express\studies\example1\doc\Mapping file – working version\.

The first sheet is a welcome dashboard:

clip_image004

Then the StudyMetadata sheet, an XML metadata specification used to generate define.xml. You only need to add some information in the “Values” column:

clip_image006

The FORMAT sheet:

clip_image007

This format structure is similar to what we get when exporting formats from a format catalog using

proc format library=library cntlout=format_out;

run;

In most production environments, programmers get formats from the clinical data management group. If all the formats are assigned to the proper libraries (work or library), you don’t need to export them into this spreadsheet. Of course, you can also type customized formats into the format sheet.

A typical domain sheet (AT LAST!), which needs effort and an understanding of the software; take DM for example:

clip_image009

In the ‘Dataset’ column, three raw datasets from C:\Program Files\CDISC Express\studies\example1\source\ are needed to map into the DM domain: demog, siteinv and eligassess. Note that you can use any dataset options, such as drop=, rename= and where=, on the input datasets.

At the end of the ‘Dataset’ column, “all” indicates that all the datasets mentioned above should be merged together for final processing.

In the ‘Merge Key’ column, ‘sitecode’ is assigned to the datasets demog and siteinv, which means demog and siteinv should be merged by the common key ‘sitecode’.

As mentioned, all the previous datasets are merged at the end. But no common key is set in the ‘Merge Key’ column for that step. This is a general rule: if no key is specified for a merge, USUBJID is used by default.

The third column is ‘CDISC variable’, which lists all the needed variables according to the implementation version. An important note: you do not need to implement the variables in the order in which they appear in the blank template mapping file. In the blank file, “AGE” in the DM domain is on Line 12, but in this working file “AGE” is calculated second to last. The variable order of the final DM domain will be the same as in the blank one.

This makes sense in practice. For example, a sequence variable such as AESEQ is ordered right after USUBJID, but you can only get the sequence number when all the other variables are done. So SEQ variables are always computed at the final stage in a working mapping file.
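Why the sequence number has to come last is easy to see in a small Python sketch: the records must already be complete and sorted before they can be numbered within each subject. The sort key (start date) and the sample records are my assumptions for illustration; CDISC Express may order the records differently.

```python
from itertools import groupby
from operator import itemgetter


def add_seq(rows, seq_name="AESEQ", subject="USUBJID", order_by=("AESTDTC",)):
    """Number the records 1, 2, 3, ... within each subject after sorting them."""
    ordered = sorted(rows, key=lambda r: (r[subject],) + tuple(r[k] for k in order_by))
    result = []
    for _, group in groupby(ordered, key=itemgetter(subject)):
        for number, row in enumerate(group, start=1):
            result.append({**row, seq_name: number})
    return result


# Made-up adverse event records, deliberately out of order.
events = [
    {"USUBJID": "S2", "AETERM": "NAUSEA", "AESTDTC": "2012-01-05"},
    {"USUBJID": "S1", "AETERM": "HEADACHE", "AESTDTC": "2012-02-01"},
    {"USUBJID": "S1", "AETERM": "RASH", "AESTDTC": "2012-01-10"},
]
sequenced = add_seq(events)
for row in sequenced:
    print(row["USUBJID"], row["AESEQ"], row["AETERM"])
```

If AESTDTC were still missing when the numbering ran, the sort (and therefore the numbering) could not be done, which is why the SEQ row sits at the bottom of the working mapping file.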

The “Expression” column specifies the mapping rules from raw datasets to SDTM domains. Assignments, expressions and macro calls (macros located in C:\Program Files\CDISC Express\macros\function_library\) are allowed in this column, and most of them are straightforward. We will discuss more in the following section.

To sum up, we can “translate” this mapping sheet into SAS code for a better understanding of the CDISC Express architecture:


Click to read more…

Dive into CDISC Express (2): Create a New Study

Dive into CDISC Express (1): Introductory

3. Step 1 of 6: Create a new study (create_new_study.sas)

Open create_new_study.sas in C:\Program Files\CDISC Express\programs\ and you will see only one line, a macro call:

%addnewstudy(studyname=my new study);

Just assign a study name to the macro variable, &studyname, e.g., “CLINCAP”:

%addnewstudy(studyname= CLINCAP);

Submit the code, and you will find a folder named “CLINCAP” in C:\Program Files\CDISC Express\studies\, with the same structure as the two demo studies embedded in this application (example1 and example2); see below (the left and right panels show the folders and files before and after the execution of create_new_study.sas; the same applies to the following figures):

new

Folder ‘doc’ holds the mapping files;

Folder ‘log’ holds log files generated by subsequent macro calls, such as those generating SDTM domains;

Folder ‘results’ and its subfolders hold all the outputs, such as define.xml, SAS transport files, validation reports and SDTM datasets;

Folder ‘source’ holds all the raw clinical data used as inputs for SDTM domains;

Folder ‘tempdata’ holds all the temporary datasets generated by subsequent macro calls.
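Just to visualize the skeleton, here is a Python sketch that creates the same five top-level folders in a temporary directory. It is not what %addnewstudy does internally (the macro also sets up subfolders and a configuration file, which are left out here), and the function name is hypothetical.

```python
import tempfile
from pathlib import Path

# The five top-level study folders described above.
STUDY_FOLDERS = ["doc", "log", "results", "source", "tempdata"]


def create_study_skeleton(studies_root, study_name):
    """Create <studies_root>/<study_name>/ with the five top-level folders."""
    study_dir = Path(studies_root) / study_name
    for folder in STUDY_FOLDERS:
        (study_dir / folder).mkdir(parents=True, exist_ok=True)
    return study_dir


# Use a throwaway directory so nothing is written under Program Files.
with tempfile.TemporaryDirectory() as root:
    study = create_study_skeleton(root, "CLINCAP")
    created = sorted(p.name for p in study.iterdir())
    print(study.name, created)
```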

Also, a configuration file named CLINCAP_configuration.sas is placed in C:\Program Files\CDISC Express\programs\study configuration\. This file is used to set some study-level parameters, such as lab and toxicity specifications (details in C:\Program Files\CDISC Express\specs\Lab specs\).

Two versions of SDTM implementation guides are supported by CDISC Express, CDISC SDTM Implementation Guide Version 3.1.1 and Version 3.1.2. You can find the corresponding specification files in C:\Program Files\CDISC Express\specs\SDTM specs\:

SDTM_Specs_3_1_1.xls

SDTM_Specs_3_1_2.xls

The choice of SDTM implementation version is also coded in the configuration file, on Line 41:

%LET SDTMSPECFILE=SDTM_Specs_3_1_1.xls;

Version 3.1.1 is used by default. You can also choose Version 3.1.2 if needed:

%LET SDTMSPECFILE=SDTM_Specs_3_1_2.xls;

Assign a study name and choose an SDTM implementation version: that’s all that is needed in step 1. Let’s take a few minutes to navigate the software. CDISC Express is a set of macros and Excel files. It is important to know the file structure first:

C:\Program Files\CDISC Express\

├─documentation :FAQ, Quick Start, User Guide

├─macros

│ ├─ClinMap :system level macros

│ └─function_library :study level macros

├─programs :"action taken" macros

│ ├─study configuration :study parameters configuration

├─SDTM Validation :For validation of SDTM domains

│ └─study1

├─specs :specification files

│ ├─Excel engine :ExcelXP tagset file

│ ├─Lab specs :lab and toxicity

│ ├─Mapping validation :validation rules

│ ├─SDTM specs :hold two versions of SDTM implementation

│ └─SDTM Terminology :SDTM codelist(including NCI terminology)

├─studies

│ ├─example1

└─temp :hold temporary data not specified to any studies

As we have seen, all the “action taken” programs such as create_new_study.sas are located in C:\Program Files\CDISC Express\programs\. In create_new_study.sas, one macro is called, %addnewstudy, which is in C:\Program Files\CDISC Express\macros\ClinMap\.

Note that in C:\Program Files\CDISC Express\macros\, there are two sets of macros in different folders:

C:\Program Files\CDISC Express\macros\ClinMap\: this folder holds all the “system” level macros, used by the application only. Modification is not encouraged.

C:\Program Files\CDISC Express\macros\function_library\: macros used for mapping across studies. You can also create your own macros in this folder. The macros embedded in the application are also documented in the user guide.

Next comes the most important part, the mapping file.

TobeContinued