End-to-end CDISC Workflow (Mock Study)
This portfolio project shows an end-to-end CDISC pipeline implemented in SAS OnDemand: curation of raw-like sources → SDTM domains → ADaM analysis datasets → TLFs (tables/listings/figures).
Repo: https://github.com/Ousmanerabi/clinical-trials-programming-portfolio

At a glance
- Objective: produce a regulatory-ready data flow (SDTM → ADaM → TLF) for a mock clinical study.
- Stack: SAS (OnDemand), CDISC SDTM & ADaM principles.
- What you’ll see here: a concise walkthrough, code excerpts, and links to the full programs and outputs.
Note: Why a mock study?
To demonstrate real-world clinical programming skills without sharing proprietary datasets, while following CDISC conventions and good programming practices.

Data sources & scope
- Inputs: synthetic/raw-like CSV/XLSX files and SAS datasets stored in the repo (for demonstration).
- Study artifacts: trial design shells and example CRF mapping logic where needed.
- Outputs: SDTM domains (e.g., DM, AE, EX, VS), then ADaM datasets (e.g., ADSL, ADAE), and selected TLFs.

Repository layout (key folders)
clinical-trials-programming-portfolio/
  01_RAW/            # raw-like inputs (csv/xlsx/sas7bdat)
  02_SDTM/
    programs/        # SAS programs (domain-specific)
    out/             # SDTM outputs (sas7bdat/preview png)
  03_ADaM/
    programs/        # SAS programs (ADSL, ADAE, etc.)
    out/             # ADaM outputs
  04_TLF/
    programs/        # TLF SAS code (tables, listings, figures)
    out/             # rendered artefacts (rtf/pdf/png)
  docs/              # helper docs (specs, trial design)
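
The DM excerpt further down reads from a raw libref and writes to an sdtm libref. A minimal sketch of how those librefs could be mapped onto the folders above is shown here. The root path, the root macro variable, and the adam libref are assumptions for illustration and are not taken from the repository's programs.

```sas
/* Hypothetical project root: replace with your own SAS OnDemand home directory */
%let root = /home/your_user_id/clinical-trials-programming-portfolio;

/* Raw-like inputs are opened read-only so no program can overwrite them */
libname raw  "&root./01_RAW" access=readonly;

/* Output libraries for the SDTM and ADaM layers (adam is an assumed libref) */
libname sdtm "&root./02_SDTM/out";
libname adam "&root./03_ADaM/out";
```

Keeping the root in a single macro variable means that only one line changes when the project is moved to another account or server.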
SDTM (Study Data Tabulation Model)
Goal: create standardized, submission-ready domains that mirror CRF content and study design.
Highlights
- Patient key: USUBJID built from STUDYID + subject identifier.
- Controlled terminology & value-level mapping (e.g., SEX, ARM, AESER).
- Date handling (partial dates), label/length alignment.
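
The controlled terminology and partial date points above can be sketched as follows. This is an illustration under stated assumptions, not the repository's actual code: the $sexct format, the raw field names (sex_raw, yy, mm, dd), the inline sample records, and the fallback to "U" for unmapped values are all hypothetical.

```sas
/* Hypothetical mapping from raw sex values to CDISC-style codes */
proc format;
  value $sexct
    "Male", "M"   = "M"
    "Female", "F" = "F"
    other         = "U";   /* assumption: unmapped or missing values become U */
run;

data work.demo_ct;
  length SUBJID $4 SEX $1 BRTHDTC $10;
  infile datalines dsd truncover;
  /* Hypothetical raw fields: month and day of birth may be missing */
  input SUBJID :$4. sex_raw :$10. yy mm dd;

  /* Value-level mapping through the format */
  SEX = put(sex_raw, $sexct.);

  /* ISO 8601 partial date: keep only the components that were collected */
  if missing(yy) then BRTHDTC = "";
  else if missing(mm) then BRTHDTC = put(yy, z4.);
  else if missing(dd) then BRTHDTC = catx("-", put(yy, z4.), put(mm, z2.));
  else BRTHDTC = catx("-", put(yy, z4.), put(mm, z2.), put(dd, z2.));

  drop sex_raw yy mm dd;
  datalines;
0001,Male,1980,5,17
0002,F,1975,11,
0003,,1990,,
;
run;
```

With these three sample records, BRTHDTC comes out as 1980-05-17, 1975-11, and 1990. Missing components are left off rather than imputed, which is the usual SDTM convention for --DTC variables.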
Example — DM (Demography) derivation (excerpt)
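/* 02_SDTM/programs/sdtm_dm.sas */
data sdtm.dm(label="Demographics");
  /* Declare lengths before SET so they also apply to variables already present in the raw data */
  length STUDYID $12 USUBJID $40 DOMAIN $2 ARM $20 ARMCD $8 SEX $1 AGEU $5;
  set raw.demography;
  DOMAIN  = "DM";
  /* Unique subject key: STUDYID plus zero-padded SUBJID (assumes SUBJID is numeric in the raw data) */
  USUBJID = catx("-", STUDYID, put(SUBJID, z4.));
  AGEU    = "YEARS";
  /* … additional recodes / controlled terms … */
  keep STUDYID DOMAIN USUBJID SUBJID SEX AGE AGEU ARM ARMCD;
run;

/* Sort by the subject key so downstream steps can process BY USUBJID */
proc sort data=sdtm.dm;
  by USUBJID;
run;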
Full program: https://github.com/Ousmanerabi/clinical-trials-programming-portfolio/tree/main/02_SDTM/programs
SDTM outputs: https://github.com/Ousmanerabi/clinical-trials-programming-portfolio/tree/main/02_SDTM/out
Quality gates (SDTM): uniqueness of USUBJID, required variables by domain, controlled codes, and basic cross-domain checks (e.g., DM vs AE subject coverage).
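
Two of the checks listed above, USUBJID uniqueness in DM and DM vs AE subject coverage, could look roughly like the sketch below. It assumes an sdtm.ae dataset exists alongside sdtm.dm. The work table names and log messages are hypothetical and are not the repository's actual QC programs.

```sas
proc sql;
  /* Check 1: USUBJID must identify exactly one DM record */
  create table work.dm_dups as
  select USUBJID, count(*) as n_records
  from sdtm.dm
  group by USUBJID
  having count(*) > 1;

  /* Check 2: every subject reported in AE must also exist in DM */
  create table work.ae_not_in_dm as
  select distinct a.USUBJID
  from sdtm.ae as a
  where a.USUBJID not in (select USUBJID from sdtm.dm);
quit;

/* Summarise both checks in the log; NOBS= is resolved at compile time */
data _null_;
  if 0 then set work.dm_dups      nobs=n_dup;
  if 0 then set work.ae_not_in_dm nobs=n_orphan;
  if n_dup > 0 then put "WARNING: duplicate USUBJID values in DM. " n_dup=;
  else put "NOTE: USUBJID is unique in DM.";
  if n_orphan > 0 then put "WARNING: AE subjects missing from DM. " n_orphan=;
  else put "NOTE: all AE subjects are present in DM.";
  stop;
run;
```

Writing failures to small work tables, rather than only to the log, leaves the offending records available for review after the run.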