Real-World Evidence Study Protocol: Impact of Arterial Catheter Use on Mortality and Resource Utilization in Mechanically Ventilated ICU Patients (MIMIC-II)

RWE
EHR
MIMIC-II
observational-study
ICU
logistic regression
propensity-score
Retrospective cohort assessing IAC placement impact on mortality and resource use in ventilated patients
Author

Ousmane Diallo, MPH-PhD

Published

November 16, 2025

1. Title page

Element | Details
Title | Evaluation of the Association Between Indwelling Arterial Catheter Placement and Clinical Outcomes in Mechanically Ventilated Patients: A Retrospective Cohort Study Using MIMIC-II EHR Data.
Primary Objective | Estimate the adjusted association (aOR) between IAC use and 28-day mortality using multivariable regression models accounting for severity-of-illness confounding.
Secondary Objectives | (1) Assess whether IAC use is associated with higher daily arterial blood gas utilization. (2) Evaluate impact on ICU length of stay (LOS).
Exposure Definition | Exposure defined using aline_flg (1 = IAC used, 0 = no IAC). Timestamp-based new-user design not possible due to lack of procedure timing in MIMIC-II.
Protocol Version | v1.0 (27-Nov-2025)
Contributors | PI: Ousmane Diallo, MPH-PhD (Design, Analysis, Reporting). Contact: ousmanerabi12@gmail.com
Study Registration | OSF.io preregistration (DOI: TBD); ClinicalTrials.gov planned (NCT ID: TBD).
Sponsor | Self-sponsored project for RWE training and portfolio development.
Conflict of Interest | None declared.
Funding | Self-funded; no industry or external support.

2. Abstract

Background

Indwelling arterial catheters (IACs) are frequently used in intensive care units (ICUs) for continuous blood pressure monitoring and repeated arterial blood gas sampling among mechanically ventilated patients. Although clinically useful, they are invasive and may carry risks such as local complications and bloodstream infections.

Despite their routine use, real-world evidence (RWE) describing how IAC use relates to mortality and resource utilization in mechanically ventilated but hemodynamically stable ICU patients remains limited. This study uses the MIMIC-II database to evaluate whether the presence of an arterial line is associated with differences in 28-day mortality and care intensity.

Research question and objectives

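Because MIMIC-II does not record the timing of catheter placement, a timestamp-based new-user design is not possible. Exposure is classified from aline_flg over the whole ICU stay, so immortal time bias cannot be ruled out and is treated as a limitation.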

Among mechanically ventilated adult ICU patients who are not receiving vasopressor therapy, is the use of an indwelling arterial catheter (IAC), defined as any documented arterial line during the ICU stay (aline_flg = 1), compared with no IAC use, associated with:

  • differences in 28-day mortality (primary, patient-centered outcome)?
  • a higher number of blood gas measurements per day (secondary, mechanistic/intermediate outcome)?

Primary Objective:
Estimate the adjusted association (aOR) between IAC use and 28-day mortality using multivariable regression models accounting for confounding by severity of illness.

Secondary Objective:
Evaluate whether IAC use is associated with greater daily arterial blood gas utilization and ICU length of stay.

Methods

We will conduct a retrospective cohort study using the MIMIC-II database, including adult ICU patients receiving invasive mechanical ventilation. Exposure will be defined as documented IAC use at any point during the ICU stay (aline_flg = 1).

To address confounding by indication, we will use two complementary analytical strategies. First, a multivariable logistic regression model will estimate the association between IAC use and 28-day mortality, adjusting for baseline severity of illness (e.g., SAPS I, SOFA), demographics, comorbidities, and day-1 physiologic and laboratory variables. Second, we will estimate propensity scores for IAC use via logistic regression and apply matching to balance covariates between groups before refitting the outcome model.

Effect estimates will be reported as odds ratios with 95% confidence intervals.

3. Amendments and updates

Version date | Version number | Section amended | Amendment / update | Reason for amendment
27-Nov-2025 | v1.0 | Full protocol | Initial version | First complete draft of study protocol
18-Dec-2025 | v2.0 | Data source, exposure, covariates, secondary outcomes | Updated protocol to use MIMIC-III with reconstructed MV time zero and early IAC exposure window; excluded high-frequency event tables (labs/vitals) due to computational/storage constraints; focused on primary mortality outcome | Improved temporal validity and feasibility; minimizes immortal time bias while preserving reproducibility

Note: This protocol is version 1.0 (27-Nov-2025). Future updates will be documented in the table above. Amendments may include:

  • changes in exposure definition (e.g., refinement of aline_flg logic),
  • updates to covariate selection or variable construction,
  • modifications to statistical methodology (e.g., switching from IPTW to PS matching),
  • adjustments to inclusion/exclusion criteria,
  • addition of new sensitivity analyses,
  • corrections following peer review or analytic QA.

Each amendment will be timestamped, versioned, and archived to ensure full traceability, transparency, and reproducibility.

4. Milestones

The study timeline includes the following milestones:

Milestone | Date
Protocol finalization | December 2025
Data extraction & preprocessing | December 2025 – January 2026
Exploratory data analysis | January 2026
Propensity score modeling | January – February 2026
Primary analysis (mortality models) | February 2026
Secondary analysis (ABG/day, LOS) | February 2026
Sensitivity analyses | March 2026
Portfolio documentation & reporting | March 2026

5. Rationale and background

Critically ill patients receiving invasive mechanical ventilation require close monitoring of oxygenation, ventilation parameters, and hemodynamic status. Clinicians frequently rely on arterial blood gas (ABG) measurements and continuous arterial pressure monitoring to guide ventilator settings and detect physiologic deterioration. Mechanical ventilation is associated with significant morbidity and mortality, and monitoring intensity varies widely across patients and institutions.

Indwelling arterial catheters (IACs) are commonly used in ICUs to provide continuous invasive blood pressure measurements and facilitate frequent ABG sampling. Their use is widespread among ventilated patients, particularly those with greater severity of illness. Although IACs offer clinical advantages, they are invasive and associated with risks such as catheter-related infections, vascular complications, and bloodstream infections. Previous studies suggest possible associations between IAC use and increased mortality, though results are inconsistent and confounded by the fact that sicker patients are more likely to receive an IAC (“confounding by indication”).

Despite their ubiquitous use, the net clinical impact of IAC placement on patient-centered outcomes—especially mortality, resource utilization, and monitoring intensity—remains unclear. Randomized trials are scarce and often underpowered, and observational evidence is limited by confounding, heterogeneity in practice, and missing procedural timing in many EHR datasets. Furthermore, real-world data sources like MIMIC-II rarely provide granular timestamps for line insertion, complicating causal interpretation and mirroring common constraints faced in operational RWE analyses.

Using the MIMIC-II EHR dataset, this study will estimate the association between IAC use and 28-day mortality among mechanically ventilated adults in the absence of vasopressor therapy. By applying both multivariable regression and propensity score methods (matching or IPTW), the study aims to reduce confounding and generate more balanced comparisons between IAC users and non-users. Beyond answering the clinical question, this work demonstrates a transparent, reproducible framework for conducting ICU-focused RWE analyses using EHR data under realistic data limitations. The study therefore contributes methodological insight (handling confounding, handling limited timestamps) and practical evidence on IAC-related outcomes.

6. Research question and objectives

A. Primary research question and objective

Study element | Specification
Objective | To estimate the association between indwelling arterial catheter (IAC) use and 28-day mortality among mechanically ventilated adult ICU patients not receiving vasopressors.
Hypothesis | Null hypothesis: after adjusting for baseline severity of illness, IAC use is not associated with a difference in 28-day mortality.
Population (key inclusion/exclusion criteria) | Adult ICU patients in MIMIC-II who: (1) received invasive mechanical ventilation; (2) were not receiving vasopressor therapy on day 1; (3) had sufficient data available for key covariates and outcomes.
Exposure | Documented use of an indwelling arterial catheter during the ICU stay (aline_flg = 1).
Comparator | No documented IAC use during the ICU stay (aline_flg = 0).
Outcome | 28-day all-cause mortality, assessed using in-hospital death flags and follow-up information.
Time (when follow-up begins and ends) | Time zero: initiation of invasive mechanical ventilation. Follow-up: up to 28 days after MV initiation or until hospital discharge or death, whichever occurs first.
Setting | Adult ICUs from the MIMIC-II clinical database (Beth Israel Deaconess Medical Center).
Main measure of effect | Adjusted odds ratio (aOR) for 28-day mortality from multivariable logistic regression and propensity-score–weighted models.

B. Secondary research question and objective

Study element | Specification
Objective | To assess whether IAC use is associated with increased monitoring intensity (arterial blood gas measurements per day) and greater ICU resource utilization (ICU length of stay).
Hypothesis | IAC use is associated with more frequent ABG sampling and increased ICU length of stay compared with no IAC use.
Population (key inclusion/exclusion criteria) | Same as primary analysis: mechanically ventilated adult ICU patients in MIMIC-II who are not receiving vasopressor therapy on day 1 and have sufficient data for covariates and outcomes.
Exposure | Documented IAC use during the ICU stay (aline_flg = 1).
Comparator | No documented IAC use during the ICU stay (aline_flg = 0).
Outcome | (1) Number of arterial blood gas measurements per ICU day; (2) ICU length of stay (LOS) in days.
Time (when follow-up begins and ends) | Time zero: initiation of invasive mechanical ventilation. Follow-up: for ABG/day and LOS, the entire ICU stay until ICU discharge or in-ICU death.
Setting | Adult ICUs from the MIMIC-II clinical database.
Main measure of effect | (1) Rate ratio or adjusted mean difference in ABG/day (e.g., via negative binomial or linear regression). (2) Adjusted mean difference or hazard ratio for ICU LOS.

7. Research methods

Study design

Retrospective observational cohort study using routinely collected intensive care unit (ICU) electronic health record data from the MIMIC-II database.

A cohort design is appropriate because the research question compares outcomes (28-day mortality, monitoring intensity, ICU length of stay) between patients with and without exposure to an indwelling arterial catheter (IAC). The cohort framework preserves the temporal sequence between baseline covariates, IAC use, and subsequent outcomes, and allows estimation of absolute risks and adjusted measures of association (e.g., odds ratios).

Using an existing ICU EHR cohort (MIMIC-II) is efficient, reflects real-world practice, and provides rich information on severity of illness, physiology, and resource use needed to adjust for confounding by indication. Alternative designs such as case-control or cross-sectional studies would provide less transparent temporality and weaker support for causal interpretation in this context.

Study design diagram

The study follows a retrospective cohort structure:

  1. Cohort entry (“Time zero”)
    • Defined as the initiation of invasive mechanical ventilation (MV).
    • All baseline covariates (demographics, severity scores, comorbidities, day-1 labs and vitals) are assessed at or before MV initiation.
  2. Exposure assessment
    • Patients are classified as IAC users if aline_flg = 1 at any time during the ICU stay.
    • Patients with aline_flg = 0 serve as the unexposed group.
  3. Follow-up period
    • For mortality outcomes: from MV initiation until 28 days, hospital discharge, or death.
    • For secondary outcomes (ABG/day, LOS): from MV initiation until ICU discharge or in-ICU death.
  4. Analysis
    • Comparison of outcomes between exposed and unexposed groups using multivariable models and propensity-score–based methods.

A DAG illustrating confounding by severity of illness (e.g., SAPS I, SOFA) and its relationship to both IAC use and mortality will be provided in the Appendix.

Settings and variables

Setting
This study is conducted using the MIMIC-II clinical database, derived from ICU encounters at Beth Israel Deaconess Medical Center (Boston, USA). The dataset includes high-resolution clinical measurements, laboratory results, medications, interventions, and outcomes.

Key variables

  1. Exposure
    • Indwelling arterial catheter use: aline_flg (1 = used, 0 = not used)
  2. Primary outcome
    • 28-day all-cause mortality: constructed from in-hospital death flag and time to event fields.
  3. Secondary outcomes
    • Arterial blood gas measurements per ICU day: count of ABG records / ICU length of stay.
    • ICU length of stay: number of days from ICU admission to ICU discharge.
  4. Covariates (baseline)
    • Demographics: age, sex, ethnicity
    • Severity of illness: SAPS I, SOFA score
    • Comorbidities: Charlson comorbidity index
    • Day-1 laboratory values: pH, PaO2, PaCO2, lactate, sodium, potassium, hematocrit, WBC
    • Day-1 vital signs: heart rate, mean arterial pressure, temperature, oxygen saturation
    • Clinical status: vasopressor use, fluid balance
    • ICU type: medical, surgical, trauma

Covariate selection is based on clinical relevance and prior literature on factors associated with both IAC use and mortality.
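The outcome definitions above could be derived along the following lines. This is a minimal sketch, not the study's actual derivation code: apart from aline_flg, the protocol does not name the underlying fields, so the arguments used here (`died`, `days_from_mv_to_death`, `abg_count`, `icu_los_days`) are hypothetical, as is the choice to return `None` when an outcome cannot be determined.

```python
from typing import Optional


def death_within_28d(died: bool, days_from_mv_to_death: Optional[float]) -> Optional[int]:
    """Return 1 if death occurred within 28 days of MV initiation, 0 if not,
    and None when a death is recorded but its timing is missing.

    Argument names are hypothetical; they are not MIMIC-II column names.
    """
    if not died:
        return 0
    if days_from_mv_to_death is None:
        return None  # death recorded but timing unknown: outcome undetermined
    return 1 if 0 <= days_from_mv_to_death <= 28 else 0


def abg_per_day(abg_count: int, icu_los_days: Optional[float]) -> Optional[float]:
    """Return ABG measurements per ICU day (count / ICU length of stay).

    Returns None when the length of stay is missing or not positive,
    so that division by zero never produces a spurious rate.
    """
    if icu_los_days is None or icu_los_days <= 0:
        return None
    return abg_count / icu_los_days
```

Returning `None` rather than 0 for undetermined cases keeps missing outcomes visible to the missing-data assessment instead of silently counting them as survivors.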

Data analysis

1. Descriptive analysis

  • Summaries of baseline characteristics by exposure group (IAC vs. no IAC).
  • Continuous variables: mean ± SD or median (IQR).
  • Categorical variables: counts and percentages.
  • Standardized mean differences (SMD < 0.1 indicates good balance).
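As an illustration of the balance metric in the last bullet, the sketch below computes the standardized mean difference using the common pooled-standard-deviation form (the square root of the average of the two group variances). The protocol does not state which SMD formula will be used, so this choice is an assumption, and the function names are illustrative.

```python
import math
from statistics import mean, variance


def _safe_ratio(diff: float, pooled_sd: float) -> float:
    """Divide the mean difference by the pooled SD, guarding against SD = 0."""
    if pooled_sd == 0:
        return 0.0 if diff == 0 else math.copysign(math.inf, diff)
    return diff / pooled_sd


def smd_continuous(treated, control) -> float:
    """SMD for a continuous covariate: (mean1 - mean0) / sqrt((var1 + var0) / 2)."""
    pooled_sd = math.sqrt((variance(treated) + variance(control)) / 2)
    return _safe_ratio(mean(treated) - mean(control), pooled_sd)


def smd_binary(treated, control) -> float:
    """SMD for a 0/1 covariate, using p(1 - p) as the variance in each group."""
    p1, p0 = mean(treated), mean(control)
    pooled_sd = math.sqrt((p1 * (1 - p1) + p0 * (1 - p0)) / 2)
    return _safe_ratio(p1 - p0, pooled_sd)
```

The absolute value of each result would be compared against the 0.1 threshold.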

2. Primary analysis (28-day mortality)

  • Multivariable logistic regression model:
    logit(P(death_28d = 1)) = β0 + β1*IAC + β2*SAPS_I + β3*age + ...
  • Report adjusted odds ratios (aOR) with 95% CIs.
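For reference, the adjusted odds ratio follows from the model above by exponentiating the IAC coefficient. The Wald-type interval shown here is an assumption; the protocol does not specify how the 95% CI will be computed (a profile-likelihood interval would be an alternative).

```latex
\[
\operatorname{logit} P(\text{death}_{28d} = 1)
  = \beta_0 + \beta_1\,\text{IAC} + \beta_2\,\text{SAPS\_I} + \beta_3\,\text{age} + \dots
\]
\[
\text{aOR}_{\text{IAC}} = \exp(\hat{\beta}_1),
\qquad
95\%\ \text{CI} = \exp\!\left(\hat{\beta}_1 \pm 1.96\,\widehat{\text{SE}}(\hat{\beta}_1)\right)
\]
```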

3. Propensity score model

  • Logistic regression to estimate probability of receiving an IAC:
    PS = P(aline_flg = 1 | covariates)
  • Two approaches:
    • Nearest neighbor matching (1:1 or 1:3)
    • Inverse probability of treatment weighting (IPTW) using stabilized weights
  • Assess balance using standardized mean differences before and after adjustment.
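The stabilized weights mentioned above can be sketched as follows, assuming the propensity scores have already been estimated. The marginal probability of treatment is taken as the observed proportion of IAC users, which is the usual numerator for stabilized weights; the function name and argument layout are illustrative only.

```python
def stabilized_weights(treatment, ps):
    """Return stabilized IPTW weights.

    treatment: list of 0/1 exposure indicators (e.g., aline_flg).
    ps: list of estimated propensity scores P(aline_flg = 1 | covariates).

    Treated patients get P(A=1) / ps; untreated get P(A=0) / (1 - ps).
    """
    if len(treatment) != len(ps):
        raise ValueError("treatment and ps must have the same length")
    if not treatment:
        raise ValueError("inputs must not be empty")

    p_treated = sum(treatment) / len(treatment)  # marginal P(A = 1)

    weights = []
    for a, e in zip(treatment, ps):
        if not 0 < e < 1:
            raise ValueError("propensity scores must lie strictly between 0 and 1")
        if a == 1:
            weights.append(p_treated / e)
        else:
            weights.append((1 - p_treated) / (1 - e))
    return weights
```

Stabilized weights have a mean close to 1 when the propensity model is well specified, which makes extreme weights easier to spot than with unstabilized weights.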

4. Secondary analyses

  • ABG/day:
    • Count model (Poisson or negative binomial depending on dispersion).
    • Outcome: number of ABG per ICU day.
  • ICU length of stay (LOS):
    • Linear regression or Cox proportional hazards model depending on distribution.
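The choice between Poisson and negative binomial models hinges on overdispersion. A crude first look, sketched below, compares the sample variance of the ABG counts with their mean. This ignores covariates and any exposure-time offset, and the 1.5 cut-off is an arbitrary illustrative value rather than something the protocol specifies; a formal check would use the Pearson or deviance dispersion statistic from the fitted model.

```python
from statistics import mean, variance


def dispersion_ratio(counts) -> float:
    """Return sample variance divided by sample mean of the counts.

    Under a Poisson distribution this ratio is expected to be close to 1.
    """
    m = mean(counts)
    if m == 0:
        raise ValueError("mean count is zero; dispersion ratio is undefined")
    return variance(counts) / m


def suggest_count_model(counts, threshold: float = 1.5) -> str:
    """Suggest a count model from the crude dispersion ratio.

    The threshold is an illustrative assumption, not a protocol value.
    """
    if dispersion_ratio(counts) > threshold:
        return "negative binomial"
    return "Poisson"
```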

5. Sensitivity analyses

  • Alternative severity covariate sets.
  • PS trimming (e.g., 1st–99th percentile).
  • Complete-case vs. multiple imputation for missing covariates.
  • Excluding patients with extremely short ICU stays (< 24 h).
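The propensity score trimming step could look like the sketch below. It assumes trimming on the pooled propensity score distribution (not separately within each exposure group) and uses the inclusive percentile method from the Python standard library; both choices are assumptions, since the protocol only gives the 1st–99th percentile range as an example.

```python
from statistics import quantiles


def trim_by_ps_percentile(ps, lower_pct: int = 1, upper_pct: int = 99):
    """Return indices of observations whose propensity score lies within
    the [lower_pct, upper_pct] percentile range of the pooled distribution.
    """
    if not 0 < lower_pct < upper_pct < 100:
        raise ValueError("percentiles must satisfy 0 < lower < upper < 100")

    # quantiles(..., n=100) returns 99 cut points; cuts[k - 1] is the k-th percentile.
    cuts = quantiles(ps, n=100, method="inclusive")
    low, high = cuts[lower_pct - 1], cuts[upper_pct - 1]

    return [i for i, e in enumerate(ps) if low <= e <= high]
```

The returned indices can be used to subset the cohort before refitting the weighted or matched outcome model.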

All analyses will be conducted in R or Python using reproducible scripts.

Data sources

The study uses the MIMIC-II database, a publicly available ICU EHR dataset hosted by PhysioNet. MIMIC-II includes de-identified patient-level clinical data such as:

  • demographics
  • vital signs at high temporal resolution
  • laboratory measurements
  • comorbidities and diagnoses
  • procedures and interventions
  • ICU outcomes (length of stay, mortality)

Only researchers who meet PhysioNet data-use requirements and hold CITI training certification are granted access.

Data management

  • Data will be downloaded from PhysioNet servers once the data-use agreement has been approved (no separate IRB review is required).
  • The database will be stored locally in a PostgreSQL environment for efficient querying.
  • Data preprocessing will include:
    • creation of analytic cohort (mechanically ventilated, no vasopressors)
    • construction of exposure, outcome, and covariate variables
    • unit harmonization and outlier handling
    • missing data assessment and documentation
  • All data transformations will be recorded in version-controlled scripts (GitHub).
  • No identifiable information exists in MIMIC-II; all analyses will be conducted on de-identified data.

Quality control

  • Use of reproducible scripts (R or Python) with clear documentation.
  • Independent verification of inclusion/exclusion logic.
  • Internal data consistency checks (range checks, type validation).
  • PS balance checks with SMD thresholds (<0.1).
  • Replication of key results using both regression and PS-weighted models to assess robustness.
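The range and type checks listed above might be implemented along these lines. The field names and plausible ranges in `EXAMPLE_RANGES` are illustrative placeholders, not clinically validated limits or MIMIC-II column names; the study's actual limits would be documented in the variable definitions appendix.

```python
# Illustrative ranges only; not clinically validated thresholds.
EXAMPLE_RANGES = {
    "ph": (6.5, 8.0),
    "heart_rate": (0, 300),
}


def range_violations(rows, ranges):
    """Return a list of (row_index, field, value, kind) for failed checks.

    rows: list of dicts, one per patient record.
    ranges: dict mapping field name to an inclusive (low, high) range.
    kind is "type" for non-numeric values and "range" for out-of-range values.
    Missing values (None or absent keys) are skipped; missingness is
    assessed separately.
    """
    violations = []
    for i, row in enumerate(rows):
        for field, (low, high) in ranges.items():
            value = row.get(field)
            if value is None:
                continue
            if isinstance(value, bool) or not isinstance(value, (int, float)):
                violations.append((i, field, value, "type"))
            elif not low <= value <= high:
                violations.append((i, field, value, "range"))
    return violations
```

Flagged records would be reviewed and the handling rule (set to missing, winsorize, or exclude) recorded in the version-controlled preprocessing scripts.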

Study size and feasibility

MIMIC-II includes thousands of ICU patients, with a substantial proportion receiving invasive mechanical ventilation and/or arterial catheterization. This is expected to provide adequate statistical power for:

  • multivariable regression with multiple covariates
  • propensity score matching or weighting
  • subgroup and sensitivity analyses

Given the large sample size and rich covariate set, the study is feasible and is expected to have adequate power to detect clinically meaningful associations.

8. Limitations of the methods

  • No timestamp for IAC insertion in the derived dataset: exposure misclassification is possible.
  • Residual confounding despite regression and PS methods (unmeasured illness severity, clinician judgment).
  • Confounding by indication: sicker patients are more likely to receive an IAC.
  • Single-center study: may limit generalizability.
  • Missing data: depends on ICU documentation patterns; imputation may be necessary.
  • Cannot infer causality: observational design remains subject to unmeasured confounding.

9. Protection of human subjects

This study uses the de-identified MIMIC-II database, which meets HIPAA requirements for removal of personal identifiers. No direct patient interaction occurs, and no attempt at re-identification will be made.

The study qualifies as non-human subjects research, and no IRB approval is required. Data access is granted only after completing CITI training and signing PhysioNet’s data-use agreement, which prohibits attempts at patient re-identification.

10. Reporting of adverse events

As this study uses retrospective, de-identified EHR data, no clinical interventions are performed and no adverse events can occur as part of the research process. Therefore, adverse event reporting does not apply.

11. References

  1. Hsu et al. (Crit Care Med 2017) PMID:28426500.
  2. Goldberger et al. (MIMIC-II Sci Data 2017).
  3. O’Grady et al. (IAC risks, Lancet 2001).
  4. Hernán et al. (Target Trial Emulation, NEJM 2016).

12. Appendices

  • Appendix A: DAG representing confounding structure
  • Appendix B: Variable definitions and coding
  • Appendix C: SQL queries and pre-processing workflow
  • Appendix D: Propensity score diagnostics (SMD plots, Love plot)
  • Appendix E: Regression model specifications

Ousmane Diallo, MPH-PhD – Biostatistician, Data Scientist & Epidemiologist based in Chicago, Illinois, USA. Specializing in SAS programming, CDISC standards, and real-world evidence for clinical research.
