PUBLISHED ARTICLES

Sensitivity Analysis of G-estimators to Invalid Instrumental Variables

V. Vancak & A. Sjölander 

Instrumental variables regression is a tool that is commonly used in the analysis of observational data. The instrumental variables are used to make causal inference about the effect of a certain exposure in the presence of unmeasured confounders. A valid instrumental variable is a variable that is associated with the exposure, affects the outcome only through the exposure (exclusion criterion), and is not confounded with the outcome (exogeneity). These assumptions are generally untestable and rely on subject-matter knowledge. Therefore, a sensitivity analysis is desirable to assess the impact of assumption violations on the estimated parameters. In this paper, we propose and demonstrate a new method of sensitivity analysis for G-estimators in causal linear and non-linear models. We introduce two novel aspects of sensitivity analysis in instrumental variables studies. The first is a single sensitivity parameter that captures violations of exclusion and exogeneity assumptions. The second is an application of the method to non-linear models. The introduced framework is theoretically justified and is illustrated via a simulation study. Finally, we illustrate the method by application to real-world data and provide practitioners with guidelines on conducting sensitivity analysis.


V. Vancak & A. Sjölander (2023), Sensitivity Analysis of G-estimators to Invalid Instrumental Variables, Statistics in Medicine, 42 (23), 4057-4299 
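
As a rough illustration of why such a sensitivity analysis matters, the sketch below simulates a linear model in which the instrument also affects the outcome directly (an exclusion violation). It is not the G-estimation or sensitivity method of the paper; it uses the simple ratio (Wald) instrumental variable estimator, and the model, the parameter names beta, alpha and gamma, and the helper functions are all assumptions made for this example. In this assumed model the ratio estimator converges to beta + alpha / gamma rather than to the causal effect beta.

```python
import random


def wald_iv_estimate(z, x, y):
    """Ratio (Wald) IV estimate: cov(Z, Y) / cov(Z, X)."""
    n = len(z)
    mz, mx, my = sum(z) / n, sum(x) / n, sum(y) / n
    cov_zy = sum((zi - mz) * (yi - my) for zi, yi in zip(z, y))
    cov_zx = sum((zi - mz) * (xi - mx) for zi, xi in zip(z, x))
    return cov_zy / cov_zx


def simulate_wald(beta, alpha, gamma, n=50_000, seed=1):
    """Simulate a hypothetical linear model and return the Wald estimate.

    beta  : causal effect of the exposure X on the outcome Y
    alpha : direct effect of the instrument Z on Y (0 means exclusion holds)
    gamma : effect of Z on X (instrument strength)
    """
    rng = random.Random(seed)
    z, x, y = [], [], []
    for _ in range(n):
        zi = rng.gauss(0, 1)
        ui = rng.gauss(0, 1)  # unmeasured confounder of X and Y
        xi = gamma * zi + ui + rng.gauss(0, 1)
        yi = beta * xi + alpha * zi + ui + rng.gauss(0, 1)
        z.append(zi)
        x.append(xi)
        y.append(yi)
    return wald_iv_estimate(z, x, y)


if __name__ == "__main__":
    # Valid instrument: the estimate should be close to beta.
    print(simulate_wald(beta=1.0, alpha=0.0, gamma=1.0))
    # Exclusion violated: the estimate drifts towards beta + alpha / gamma.
    print(simulate_wald(beta=1.0, alpha=0.5, gamma=1.0))
```

The confounder biases a naive regression of Y on X in both scenarios, while the instrument removes that bias only when alpha is zero.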


The Number Needed to Treat Adjusted for Explanatory Variables and Survival Analysis: Theory and Application

V. Vancak, Y. Goldberg, S.Z. Levine

The number needed to treat (NNT) is an efficacy index commonly used in randomized clinical trials. The NNT is the average number of treated patients for each undesirable patient outcome, for example, death, prevented by the treatment. We introduce a systematic theoretically-based framework to model and estimate the conditional and the harmonic mean NNT in the presence of explanatory variables, in various models with dichotomous and nondichotomous outcomes. The conditional NNT is illustrated in a series of four primary examples: logistic regression, linear regression, Kaplan-Meier estimation, and Cox regression models. Also, we establish and prove mathematically the exact relationship between the conditional and the harmonic mean NNT in the presence of explanatory variables. We introduce four different methods to calculate asymptotically-correct confidence intervals for both indices. Finally, we implemented a simulation study to provide numerical demonstrations of the aforementioned theoretical results and the four examples. Numerical analysis showed that the parametric estimators of the NNT with nonparametric bootstrap-based confidence intervals outperformed other examined combinations in most settings. An R package and a web application have been developed and made available online to calculate the conditional and the harmonic mean NNTs with their corresponding confidence intervals.

V. Vancak, Y. Goldberg, S.Z. Levine (2022), The Number Needed to Treat Adjusted for Explanatory Variables and Survival Analysis: Theory and Application, Statistics in Medicine, 41(17), 3299-3320 
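
The term "harmonic mean NNT" in the abstract above can be made concrete with a small arithmetic sketch. It assumes, for illustration only, that a covariate-specific NNT is the reciprocal of a covariate-specific treatment benefit; the benefit values and function names below are hypothetical and are not taken from the paper. Under that assumption, the reciprocal of the average benefit equals the harmonic mean of the covariate-specific NNTs, not their arithmetic mean.

```python
from statistics import harmonic_mean, mean


def conditional_nnts(benefits):
    """Reciprocal of each (assumed positive) covariate-specific benefit."""
    return [1.0 / b for b in benefits]


def nnt_from_average_benefit(benefits):
    """Reciprocal of the average benefit over the covariate distribution."""
    return 1.0 / mean(benefits)


if __name__ == "__main__":
    # Hypothetical treatment benefits for three equally likely covariate values.
    benefits = [0.1, 0.2, 0.5]
    nnts = conditional_nnts(benefits)

    print(nnt_from_average_benefit(benefits))  # reciprocal of the mean benefit
    print(harmonic_mean(nnts))                 # same value up to rounding
    print(mean(nnts))                          # arithmetic mean is larger
```

Averaging the conditional NNTs arithmetically would overstate the number of patients needed, which is one reason the harmonic mean is the natural summary here.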


Guidelines to Understand and Compute the Number Needed to Treat 

V. Vancak, Y. Goldberg, S.Z. Levine

The NNT is an efficacy index that is commonly used in randomized clinical trials. When the NNT is computed, it is assumed that the outcome is dichotomous, reflecting response or non-response. The NNT is the average number of patients needed to treat to obtain one successful outcome (i.e., response) due to treatment. We developed the nntcalc R-package for desktop use and extended it to a user-friendly web application. We described the various analytic options that accompany the web application and developed and provided users with a user-friendly step-by-step guide. The application calculates the NNT for various models with and without explanatory variables, which correspond to the adjusted and unadjusted NNT, respectively. If no explanatory variables are available, one can compute the unadjusted Laupacis' NNT, Kraemer & Kupfer's KK-NNT, and Furukawa & Leucht's NNT. The implemented models for the adjusted NNT are linear regression and ANOVA, logistic regression, Kaplan-Meier, and Cox regression. All NNT estimators are computed with their associated appropriate 95% confidence intervals. All calculations are in R and are replicable.

V. Vancak, Y. Goldberg, S.Z. Levine (2021). Guidelines to Understand and Compute the Number Needed to Treat, Evidence-based Mental Health, Statistics in Practice, 24, 131-136
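
For a dichotomous outcome, the unadjusted NNT described above is commonly computed as the reciprocal of the difference between the response probabilities in the treatment and control arms. The following is a minimal sketch of that arithmetic in Python, not the nntcalc implementation; the function name and the convention of returning infinity when the treatment shows no benefit are assumptions made for this example.

```python
def unadjusted_nnt(p_treatment: float, p_control: float) -> float:
    """Reciprocal of the difference in response probabilities.

    Returns infinity when the treatment arm does not respond better than
    the control arm (a convention assumed for this sketch).
    """
    for p in (p_treatment, p_control):
        if not 0.0 <= p <= 1.0:
            raise ValueError("probabilities must lie in [0, 1]")
    benefit = p_treatment - p_control
    if benefit <= 0:
        return float("inf")
    return 1.0 / benefit


if __name__ == "__main__":
    # 50% response under treatment versus 25% under control:
    # on average, 4 patients are treated for one additional response.
    print(unadjusted_nnt(0.5, 0.25))  # 4.0
```

In practice the two probabilities are replaced by sample proportions, and a confidence interval should accompany the point estimate.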


Systematic Analysis of the Number Needed to Treat

V. Vancak, Y. Goldberg, S.Z. Levine

The number needed to treat is often used to measure the efficacy of a binary outcome in randomized clinical trials. There are three different available measures of the number needed to treat. Two of these measures, Furukawa and Leucht’s and Kraemer and Kupfer’s, focus on converting Cohen’s δ index into the number needed to treat, while Laupacis et al.’s measure deals primarily with the number needed to treat’s estimation rather than with a reformulation. Mathematical and numerical analysis of numbers needed to treat and their estimators was conducted. Three novel number needed to treat estimators were introduced to supplement the numbers needed to treat introduced by Laupacis, Furukawa and Leucht, and Kraemer and Kupfer. The analysis showed that Laupacis et al.’s number needed to treat is intrinsically different from Kraemer and Kupfer’s number needed to treat, and that Furukawa and Leucht’s estimator is appropriate to use only for normally distributed outcomes with equal standard deviations. Based on the numerical analysis, the novel numbers needed to treat outperformed the existing ones under correct model specifications. Asymptotic analysis was used to test three different types of confidence intervals to supplement the numbers needed to treat. An R-package to calculate these numbers needed to treat and their confidence intervals has been developed and made available for users online. 

V. Vancak, Y. Goldberg, S.Z. Levine (2020). Systematic Analysis of the Number Needed to Treat, Statistical Methods in Medical Research, 29, 2393-2410
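
The two effect-size conversions mentioned in the abstract above can be sketched as follows. The formulas are the commonly cited forms: Kraemer and Kupfer's conversion through the probability that a treated patient has a better outcome than a control patient, and Furukawa and Leucht's conversion through the control event rate. Both are written here under the assumption of normally distributed outcomes with equal standard deviations. The function names, the argument names d and cer, and the handling of a non-positive benefit are assumptions of this sketch, which does not implement the novel estimators of the paper.

```python
from math import sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal distribution, mean 0 and sd 1


def kk_nnt(d: float) -> float:
    """Kraemer and Kupfer style NNT: 1 / (2 * AUC - 1), AUC = Phi(d / sqrt(2))."""
    benefit = 2.0 * _STD_NORMAL.cdf(d / sqrt(2.0)) - 1.0
    return 1.0 / benefit if benefit > 0 else float("inf")


def fl_nnt(d: float, cer: float) -> float:
    """Furukawa and Leucht style NNT from an effect size d and a control
    event rate cer: 1 / (Phi(d + Phi^{-1}(cer)) - cer)."""
    if not 0.0 < cer < 1.0:
        raise ValueError("cer must lie strictly between 0 and 1")
    treated_rate = _STD_NORMAL.cdf(d + _STD_NORMAL.inv_cdf(cer))
    benefit = treated_rate - cer
    return 1.0 / benefit if benefit > 0 else float("inf")


if __name__ == "__main__":
    # A medium effect size, with a hypothetical 20% control response rate.
    print(round(kk_nnt(0.5), 2))
    print(round(fl_nnt(0.5, 0.2), 2))
```

The two functions generally return different values for the same effect size, in line with the abstract's remark that the measures are intrinsically different.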



Continuous statistical models: With or without truncation parameters?

V. Vancak, Y. Goldberg, S.K. Bar-Lev, B. Boukai

Lifetime data are usually assumed to stem from a continuous distribution supported on [0, b) for some b ≤ ∞. The continuity assumption implies that the support of the distribution does not have atom points, particularly not at 0. Accordingly, it seems reasonable that with an accurate measurement tool all data observations will be positive. This suggests that the true support may be truncated from the left. In this work we investigate the effects of adding a left truncation parameter to a continuous lifetime data statistical model. We consider two main settings: right truncation parametric models with possible left truncation, and exponential family models with possible left truncation. We analyze the performance of some optimal estimators constructed under the assumption of no left truncation when left truncation is present, and vice versa. We investigate both asymptotic and finite-sample behavior of the estimators. We show that when left truncation is not assumed but is in fact present, the estimators have a constant bias term and will therefore result in inaccurate and inefficient estimation. We also show that assuming left truncation where there is actually none typically does not result in substantial inefficiency, and some estimators in this case are asymptotically unbiased and efficient.

V. Vancak, Y. Goldberg, S.K. Bar-Lev, B. Boukai (2015). Continuous Statistical Models – With or Without Truncation Parameters?, Mathematical Methods of Statistics, 24, 55-73

SUBMITTED ARTICLES

Estimation of the Number Needed to Treat, the Number Needed to Expose, and the Exposure Impact Number with Instrumental Variables

V. Vancak & A. Sjölander 

The number needed to treat (NNT) is an efficacy index defined as the average number of patients needed to treat to attain one additional treatment benefit. In observational studies, specifically in epidemiology, the adequacy of the populationwise NNT is questionable since the exposed group characteristics may substantially differ from the unexposed. To address this issue, groupwise efficacy indices were defined: the Exposure Impact Number (EIN) for the exposed group and the Number Needed to Expose (NNE) for the unexposed. Each defined index answers a unique research question since it targets a unique sub-population. In observational studies, the group allocation is typically affected by confounders that might be unmeasured. The available estimation methods that rely either on randomization or the sufficiency of the measured covariates for confounding control will result in inconsistent estimators of the true NNT (EIN, NNE) in such settings. Using Rubin's potential outcomes framework, we explicitly define the NNT and its derived indices as causal contrasts. Next, we introduce a novel method that uses instrumental variables to estimate the three aforementioned indices in observational studies. We present two analytical examples and a corresponding simulation study. The simulation study illustrates that the novel estimators are consistent, unlike the previously available methods, and their confidence intervals meet the nominal coverage rates. Finally, a real-world data example of the effect of vitamin D deficiency on the mortality rate is presented.


V. Vancak, A. Sjölander, Estimation of the Number Needed to Treat, the Number Needed to Expose, and the Exposure Impact Number with Instrumental Variables, submitted



Students' Difficulties with the Discipline-Oriented Teaching of Programming: The Case of Teaching R

O. Elior, V. Vancak, H. Mosheiff

The paper discusses difficulties evinced by students, particularly novices, in studying programming. It focuses on difficulties that ensue from teaching a programming language with the aim of providing limited knowledge of it necessary for carrying out domain-specific tasks. We refer to this approach as Discipline-Oriented Teaching (DOT). Focusing on DOT of R and based on existing scholarship, the paper begins with an analysis of DOT's nature and three aspects of this teaching methodology that may lead to difficulties: fast pace, parrot-fashion learning, and shallow learning. It next presents the first systematic research of students' difficulties with DOT of R. Carried out in a 14-week R course intended for undergraduates in a Psychology Department and having all three aspects of DOT, the research is based on an analysis of replies obtained to weekly questionnaires. The questionnaires were answered by 68 students who took part in the course. It was found that the three said aspects of DOT put various obstacles in the students' path to successful learning, inter alia, forgetting previously taught material, fragmentary knowledge, inability to answer non-replication questions, and variation in solution methods. We suggest that the research results should be considered in designing programming courses based on DOT.

O. Elior, V. Vancak, H. Mosheiff, Students' Difficulties with the Discipline-Oriented Teaching of Programming: The Case of Teaching R, R&R

WORKING PAPERS

V. Vancak & A. Sjölander, The Direct and the Indirect Number Needed to Treat

REVIEWER


Statistical Methods in Medical Research, Statistics in Medicine, Pharmaceutical Statistics, Biometrical Journal, Value in Health, BMC Infectious Diseases, Journal of Clinical Epidemiology