Registry development and structure. The SYSTEMIC registry collects detailed data about patients (sociodemographic characteristics and clinical information) as well as baseline and follow-up characteristics of the ostomy and pouching system (eg, shape, use of additional aids). The registry database is accessed only through the Research Electronic Data Capture (REDCap) data management application.15 The project was developed under an umbrella cooperation agreement between the Health Professional Management Service of the University Hospital of Padova and the Unit of Biostatistics, Epidemiology and Public Health of the Department of Cardiac, Thoracic, Vascular Sciences, and Public Health at the University of Padova, which is also in charge of its maintenance. Registry development involved the clinicians and the research team and comprised the following steps: 1) creation of a new project in REDCap with the completion and submission of all required information regarding the project settings, 2) creation of data collection instruments after the definition of variables and their properties, 3) preview and test of data entry, 4) configuration of study members’ permissions and users’ rights according to their role in the project, and 5) pilot data collection. Access to the final dataset was granted only to the Unit of Biostatistics, Epidemiology and Public Health staff through their usernames and passwords. The clinical data are encrypted and stored on a secure server. Files downloaded from REDCap for analysis are deidentified using the deidentification options on the REDCap export page.
Inclusion criteria. In the registry, data were collected from adult patients with any type of gastrointestinal or urinary stoma. For preliminary data analysis, patients with ureter-ileum ostomy and jejunum ostomy were excluded due to the limited number of cases.
Study design and setting. The prospective observational registry was started in June 2018 at the University Hospital of Padova. Patients admitted for the first time to the ostomy ambulatory were consecutively included in the database. The Padova University Hospital ostomy ambulatory is 1 of 2 referral centers for ostomy and incontinence care in the Veneto region, Italy. Services include management, consultation, and protocol development for the continuity of care after hospital discharge.
Study variables. All variables included in the registry were selected after a review of the literature and discussion with WOC nurses in the ostomy ambulatory. The variables used for study analysis were as follows: 1) sociodemographic characteristics (sex, age, diagnosis, comorbidities [yes/no], and body mass index [BMI; healthy, overweight, obese]) and social situation (living alone, with a spouse, with family, or in a community); 2) baseline characteristics of the ostomy (elective or emergency surgery, type of ostomy [colostomy/ileostomy], preoperative ostomy marking [yes/no], size and features of the ostomy location, ostomy duration [temporary or permanent], characteristics of the pouching system [one/two pieces, flat/convex/convex light, flange presence, manufacturer], ostomy complications [yes/no], presence or absence of preoperative radiation therapy and chemotherapy); and 3) follow-up evaluation (management of the ostomy [caregiver assistance, patient’s ability to see the stoma], ostomy complications, and changes in the pouching system used). A half-circular ostomy measuring guide graduated in millimeters was used to measure the diameter and height of the stoma; the measurement was taken from the base to the edge.
Data management and safety. Data entered in the registry are deidentified and collected in the context of routine clinical practice. During hospital admission, patients provide their consent for data use for scientific purposes. For this reason, it was not necessary to obtain ethical board approval for the current study. Electronic case report forms developed within the REDCap platform15 were used for data collection. The study design, along with the data collection procedure, was performed while adhering to clinical practice regulations. The treatment of personal information was carried out in agreement with the Italian legislative decrees 211/2003-196/2003 and the European Regulation 2016/679.
If peristomal complications were detected at the follow-up evaluation, they were treated directly in the ambulatory department or referred to the appropriate clinicians.
Data collection procedures. All data were collected during routine visits (Figure 1). Baseline data were collected at the first examination before hospital discharge but after the procedure; follow-up data were collected during the first follow-up visit after hospital discharge, which usually took place at least 30 days after surgery. Patients not requiring an ambulatory evaluation were contacted by telephone. They were asked for information on stoma characteristics and possible complications.
Patients reporting complications (without an evaluation performed by other health care professionals) were referred to the ambulatory department for further assessment. For patients evaluated by other health care providers, complications were recorded when their pouching system was changed after the baseline visit.
Data entry and quality control. Data were entered directly into the database created for the study. REDCap supports restricted data formats, permitted ranges for numbers and dates, data validation, and warnings when entered values violate specified limits. When outliers or discrepancies were encountered, health care personnel were consulted.
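The kind of range checking described above can be illustrated with a short Python sketch. It is not REDCap code and does not use the REDCap API; the field names and limits (`age_years`, `stoma_diameter_mm`, `visit_date`) are hypothetical and chosen only to mirror the adult inclusion criterion and the June 2018 registry start.

```python
from datetime import date

# Hypothetical field rules; names and limits are illustrative only,
# not taken from the registry's data dictionary.
RULES = {
    "age_years": {"type": int, "min": 18, "max": 110},
    "stoma_diameter_mm": {"type": int, "min": 5, "max": 100},
    "visit_date": {"type": date, "min": date(2018, 6, 1), "max": None},
}


def validate(record):
    """Return a list of warnings for values that are missing, mistyped, or out of range."""
    warnings = []
    for field, rule in RULES.items():
        value = record.get(field)
        if value is None:
            warnings.append(f"{field}: missing")
            continue
        if not isinstance(value, rule["type"]):
            warnings.append(f"{field}: expected {rule['type'].__name__}")
            continue
        if rule["min"] is not None and value < rule["min"]:
            warnings.append(f"{field}: below minimum {rule['min']}")
        if rule["max"] is not None and value > rule["max"]:
            warnings.append(f"{field}: above maximum {rule['max']}")
    return warnings


clean = validate({"age_years": 45, "stoma_diameter_mm": 30, "visit_date": date(2019, 1, 10)})
flagged = validate({"age_years": 16, "stoma_diameter_mm": 30, "visit_date": date(2019, 1, 10)})
```

A record that passes every rule yields an empty list, while a flagged record carries one message per violated rule, which is the point at which health care personnel would be consulted.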
Bayesian machine learning framework. A machine learning (ML) approach within a Bayesian framework was used to develop an algorithm for predicting complications. The advantage of an ML algorithm is that it does not aim to infer associations between variables and outcomes, as traditional statistical models do (eg, linear or logistic regression), but instead aims to build a prediction model. ML algorithms are also better suited to identifying complex and nonlinear relationships in the data. Moreover, Bayesian inference was chosen because it can incorporate into the final inference the information provided by the data (objective prior) or by expert opinion (subjective prior).16,17 This information is considered “prior” because it exists before or regardless of the data.18 This approach allows explicit probability statements to be made about the hypothesis, which cannot be obtained through classic methods.19 Notably, Bayesian inference accounts for all sources of uncertainty in the modeling process, allowing maximum flexibility in the modeling procedure.20 More specifically, the authors used the naïve Bayes classifier, a method useful for modeling probabilistic associations between categorical variables.21 This method has performed well in studies with small sample sizes, as in the current study.22
Use of the Bayesian ML technique is increasing in clinical studies, especially for predicting outcomes. For example, Fojo et al23 developed a Bayesian algorithm in 2444 participants from 2 National Network of Depression Centers and the Johns Hopkins HIV Clinical Cohort to predict mental health symptoms (eg, depression, anxiety, and mania) and substance use (eg, alcohol, heroin, and cocaine). Another recent study24 compared 3 Bayesian ML techniques, naïve Bayes, Bayesian Network, and Bayesian Additive Regression Trees, to predict extraintestinal manifestations of Crohn’s disease; in this case, the algorithm could be important for clinical decision-making. The naïve Bayes classifier also was used to explore the health care orientation process indicators in newly hired nurses and physicians.22 It was also applied to predict patient hospitalization and emergency department visits in home health care clinical notes25 and to predict hospital admissions at the emergency department.26
Algorithm development. The naïve Bayes algorithm21 was employed to predict the risk of developing complications. The naïve Bayes model predictors were ostomy type, ostomy duration (temporary or permanent), condition necessitating the ostomy surgery, BMI, surgery type, preoperative site marking, ostomy location, and the patient’s ability to see the ostomy site. Predictors were chosen based on those reported in the literature as risk factors for ostomy complications.7,27,28
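As an illustration of how a naïve Bayes classifier combines categorical predictors, the following Python sketch fits one on a toy dataset. It is a minimal sketch, not the authors' implementation: the six records, the two predictors (`ostomy_type`, `site_marked`), and the Laplace smoothing constant `alpha` are all assumptions made for the example.

```python
import math
from collections import Counter, defaultdict


def fit(rows, labels):
    """Count outcomes and, per predictor, how often each level occurs with each outcome."""
    class_counts = Counter(labels)
    level_counts = defaultdict(Counter)  # (predictor, outcome) -> Counter of levels
    levels = defaultdict(set)            # predictor -> set of observed levels
    for row, label in zip(rows, labels):
        for name, level in row.items():
            level_counts[(name, label)][level] += 1
            levels[name].add(level)
    return class_counts, level_counts, levels


def predict_proba(model, row, alpha=1.0):
    """Posterior outcome probabilities for one patient, with Laplace smoothing alpha."""
    class_counts, level_counts, levels = model
    n = sum(class_counts.values())
    log_post = {}
    for label, n_label in class_counts.items():
        # Prior: observed outcome frequency in the training data.
        score = math.log(n_label / n)
        for name, level in row.items():
            seen = level_counts[(name, label)][level]
            k = len(levels[name])
            # Naive assumption: predictors are independent given the outcome,
            # so each one contributes its own multiplicative factor.
            score += math.log((seen + alpha) / (n_label + alpha * k))
        log_post[label] = score
    # Normalize in log space for numerical stability.
    top = max(log_post.values())
    weights = {c: math.exp(s - top) for c, s in log_post.items()}
    total = sum(weights.values())
    return {c: w / total for c, w in weights.items()}


# Hypothetical toy records; the outcome is "complication: yes/no".
rows = [
    {"ostomy_type": "ileostomy", "site_marked": "no"},
    {"ostomy_type": "ileostomy", "site_marked": "no"},
    {"ostomy_type": "ileostomy", "site_marked": "yes"},
    {"ostomy_type": "colostomy", "site_marked": "yes"},
    {"ostomy_type": "colostomy", "site_marked": "yes"},
    {"ostomy_type": "colostomy", "site_marked": "no"},
]
labels = ["yes", "yes", "no", "no", "no", "yes"]

model = fit(rows, labels)
probs = predict_proba(model, {"ostomy_type": "ileostomy", "site_marked": "no"})
```

For this toy input, the smoothed likelihoods are 0.6 × 0.8 for "yes" and 0.4 × 0.2 for "no" with equal priors, so the posterior probability of a complication is 6/7.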
Two scenarios concerning the prior choices were implemented to illustrate how the prior information may affect and enhance the algorithm’s performance.
Data-driven prior model. The frequency of ostomy complications observed in the SYSTEMIC registry data was used to define the prior probability distribution.
Objective prior model. The data provided by studies conducted in similar research settings were considered to derive a prior distribution for the frequency of complications among patients with an ostomy. A frequency of ostomy complications of 50% was seen in similar studies in the literature.3-5 Bayesian ML models were adopted to obtain the posterior probability distributions from the data.29
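The effect of the prior choice can be sketched with Bayes' rule for a binary outcome. In the Python example below, the 50% prior is the literature value cited above; the 35% "registry" prior and the two likelihood values are placeholders invented for the illustration and are not results from the SYSTEMIC registry.

```python
def posterior_complication(prior, lik_given_complication, lik_given_none):
    """Bayes' rule for a binary outcome: P(complication | predictor profile)."""
    numerator = prior * lik_given_complication
    denominator = numerator + (1.0 - prior) * lik_given_none
    return numerator / denominator


# Hypothetical likelihoods of one predictor profile under each outcome.
lik_yes, lik_no = 0.30, 0.15

# Placeholder registry frequency (NOT the registry's actual value).
data_driven = posterior_complication(0.35, lik_yes, lik_no)

# 50% complication frequency, as reported in similar studies.
literature = posterior_complication(0.50, lik_yes, lik_no)
```

With identical likelihoods, the posterior moves in the same direction as the prior: a higher assumed baseline frequency of complications yields a higher predicted risk for the same patient profile.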
The predictive performance of the naïve Bayes classifier was evaluated using the following measures: accuracy, sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), and the area under the receiver operating characteristic curve (AUC).
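These measures can be computed from observed outcomes and predictions as in the Python sketch below. The labels, predictions, and scores are hypothetical, and the AUC is computed through its rank interpretation (the probability that a randomly chosen patient with a complication receives a higher score than one without), which is one of several equivalent ways to obtain it.

```python
def ratio(numerator, denominator):
    """Safe division that returns NaN when the denominator is zero."""
    return numerator / denominator if denominator else float("nan")


def classification_metrics(y_true, y_pred):
    """Confusion-matrix measures for binary labels coded 1 (complication) and 0 (none)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return {
        "accuracy": ratio(tp + tn, tp + tn + fp + fn),
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),
        "npv": ratio(tn, tn + fn),
    }


def auc(y_true, scores):
    """AUC as the share of positive-negative pairs ranked correctly (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return ratio(wins, len(pos) * len(neg))


# Hypothetical outcomes, hard predictions, and predicted probabilities.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 1, 0, 0, 0, 1, 0, 1]
scores = [0.9, 0.8, 0.4, 0.3, 0.2, 0.6, 0.1, 0.7]

metrics = classification_metrics(y_true, y_pred)
area = auc(y_true, scores)
```

In this toy example, each confusion-matrix measure equals 0.75 (3 true positives, 3 true negatives, 1 false positive, 1 false negative), and 15 of the 16 positive-negative pairs are ranked correctly.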
SYSTEMIC registry expected sample size. To determine a reasonable minimal sample size for this pilot study, the authors assumed a complication incidence of 40%.27 A sample size estimation procedure for the proportion of ostomy complications was performed using a precision approach based on the confidence interval estimate. A pilot study of 52 patients will ensure a precision (error) of the final estimate of 13.5% for a 95% confidence interval. With a sample size of 200 patients (approximately 6 months of expected recruitment), the confidence interval error will decrease to approximately 7%.
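The precision approach can be approximated with the normal (Wald) confidence interval for a proportion. This is an assumption: the text does not state which interval formula was used, so the Python sketch below should be read as a plausibility check rather than a reproduction of the original calculation.

```python
import math


def ci_half_width(p, n, z=1.96):
    """Half-width of the normal-approximation (Wald) 95% CI for a proportion."""
    return z * math.sqrt(p * (1.0 - p) / n)


# Assumed complication incidence of 40%, as stated in the text.
pilot = ci_half_width(0.40, 52)    # pilot sample
target = ci_half_width(0.40, 200)  # planned sample

print(f"n=52:  +/- {pilot:.3f}")
print(f"n=200: +/- {target:.3f}")
```

Under this approximation the half-width is roughly 0.13 for 52 patients and roughly 0.07 for 200 patients, close to the figures reported above; small differences may reflect a different interval method.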
Statistical analysis. Continuous variables were reported as first quartile, median, and third quartile, and categorical variables as percentages. The Wilcoxon test was performed for continuous variables; the likelihood ratio chi-square test from the proportional odds model was performed for ordered categorical variables, and the Pearson chi-square test was performed for nonordered categorical variables.30 For the analysis, diagnosis was grouped into 4 main categories (diverticula and bowel perforation, inflammatory bowel disease, neoplasia, and other). BMI was calculated by dividing weight (in kilograms) by the square of height (in meters). Statistical analysis was performed using the BlueSky Statistics System and R statistical software.
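BMI is weight in kilograms divided by the square of height in meters. The Python sketch below computes it and maps it to the three groups named in the study variables. The cutoffs (25 and 30 kg/m²) are the standard WHO thresholds and are an assumption here, since the text does not state which cutoffs were applied; values below 18.5 kg/m² have no stated group and fall into "healthy" in this sketch.

```python
def bmi(weight_kg, height_m):
    """Body mass index in kg/m^2."""
    return weight_kg / height_m ** 2


def bmi_category(value):
    """Three-level grouping using WHO cutoffs (assumed, not stated in the text)."""
    if value >= 30.0:
        return "obese"
    if value >= 25.0:
        return "overweight"
    return "healthy"


# Hypothetical patients: (weight in kg, height in m).
examples = [(70.0, 1.75), (80.0, 1.75), (95.0, 1.75)]
groups = [bmi_category(bmi(w, h)) for w, h in examples]
```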