| Title: | Multiple Bias Analysis in Causal Inference |
|---|---|
| Description: | Quantify exposure-outcome causal effects with adjustment for multiple biases. The functions can simultaneously adjust for any combination of uncontrolled confounding, exposure/outcome misclassification, and selection bias. The underlying method generalizes the combination of inverse probability of selection weighting with predictive value weighting. Simultaneous multi-bias analysis can be used to enhance the validity and transparency of real-world evidence obtained from observational, longitudinal studies. Based on the work from Paul Brendel, Aracelis Torres, and Onyebuchi Arah (2023) <doi:10.1093/ije/dyad001>. |
| Authors: | Paul Brendel [aut, cre, cph] |
| Maintainer: | Paul Brendel <[email protected]> |
| License: | MIT + file LICENSE |
| Version: | 1.7.3 |
| Built: | 2026-05-20 10:03:48 UTC |
| Source: | https://github.com/pcbrendel/multibias |
bias_params is one of two different options to represent bias assumptions for bias adjustment. The multibias_adjust() function will apply the assumptions from these models and use them to adjust for biases in the observed data. It takes one input, a list, where each item in the list corresponds to the necessary models for bias adjustment. See below for bias models.
For each of the following bias models, the variables are defined:
X = True exposure
X* = Misclassified exposure
Y = True outcome
Y* = Misclassified outcome
C = Known confounder
j = Number of known confounders
U = Uncontrolled confounder
S = Selection indicator
bias_params(coef_list)
| Argument | Description |
|---|---|
| coef_list | List of coefficient values from the applicable bias models. Each item of the list is written as an equation. The left side identifies the model (e.g., "u" for the model predicting the uncontrolled confounder). For the multinomial models, name each item after the numerator (e.g., "x1u0", "x0u1", "x1u1" for the three multinomial models in Uncontrolled Confounding & Exposure Misclassification, Option 2). The right side is the vector of values corresponding to the model coefficients (from left to right). |
list_for_uc <- list(
  u = c(-0.19, 0.61, 0.70, -0.09, 0.10, -0.15)
)
bp_uc <- bias_params(coef_list = list_for_uc)

list_for_em_om <- list(
  x1y0 = c(-2.18, 1.63, 0.23, 0.36),
  x0y1 = c(-3.17, 0.22, 1.60, 0.40),
  x1y1 = c(-4.76, 1.82, 1.83, 0.72)
)
bp_em_om <- bias_params(coef_list = list_for_em_om)
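Coefficient vectors like these might be obtained by fitting the relevant bias model to data in which the normally missing variable is recorded. The sketch below assumes, purely for illustration, that the "u" model is a logistic regression of U on X, Y, and three confounders. The simulated dataframe `src` and its generating coefficients are hypothetical and are not one of the package datasets.

```r
# Hypothetical source data in which the uncontrolled confounder U is observed.
set.seed(1)
n <- 5000
src <- data.frame(
  C1 = rbinom(n, 1, 0.5),
  C2 = rbinom(n, 1, 0.2),
  C3 = rbinom(n, 1, 0.8),
  U = rbinom(n, 1, 0.5)
)
src$X <- rbinom(n, 1, plogis(-2 + 0.4 * src$C1 + 0.6 * src$U))
src$Y <- rbinom(n, 1, plogis(-2.5 + log(2) * src$X + 0.5 * src$C1 + 0.7 * src$U))

# Assumed bias model: logit(P(U = 1)) regressed on X, Y, and the confounders.
u_model <- glm(U ~ X + Y + C1 + C2 + C3, family = binomial(link = "logit"), data = src)

# Coefficients in left-to-right order (intercept first), shaped like a coef_list.
coef_list <- list(u = unname(round(coef(u_model), 2)))
print(coef_list)
```

The resulting list has the same shape as list_for_uc and could, under these assumptions, be passed as bias_params(coef_list = coef_list).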
data_observed combines the observed dataframe with specific identification
of the columns corresponding to the exposure, outcome, and confounders. It is
an essential input of the multibias_adjust() function.
data_observed(data, bias, exposure, outcome, confounders = NULL)
| Argument | Description |
|---|---|
| data | Dataframe for bias analysis. |
| bias | String type(s) of bias distorting the effect of the exposure on the outcome. Can choose from a subset of the following: "uc", "em", "om", "sel". These correspond to uncontrolled confounding, exposure misclassification, outcome misclassification, and selection bias, respectively. |
| exposure | String name of the exposure column in the dataframe. |
| outcome | String name of the outcome column in the dataframe. |
| confounders | String name(s) of the confounder column(s) in the dataframe. |
An object of class data_observed containing:
| Component | Description |
|---|---|
| data | A dataframe with the selected columns |
| bias | The type(s) of bias present |
| exposure | The name of the exposure variable |
| outcome | The name of the outcome variable |
| confounders | The name(s) of the confounder variable(s) |
df <- data_observed(
  data = df_uc,
  bias = "uc",
  exposure = "X_bi",
  outcome = "Y_bi",
  confounders = c("C1", "C2", "C3")
)
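The argument contract described above (bias drawn from a fixed set of strings, and column names that must exist in the dataframe) can be sketched as a small input check. The helper `check_observed_inputs` and the `toy` dataframe are hypothetical illustrations and are not part of the package.

```r
# Hypothetical helper (not a package function) mirroring the documented arguments.
check_observed_inputs <- function(data, bias, exposure, outcome, confounders = NULL) {
  allowed <- c("uc", "em", "om", "sel")
  stopifnot(is.data.frame(data), is.character(bias), length(bias) >= 1)

  # Every requested bias type must be one of the documented strings.
  bad_bias <- setdiff(bias, allowed)
  if (length(bad_bias) > 0) {
    stop("Unknown bias type(s): ", paste(bad_bias, collapse = ", "))
  }

  # Every named column must be present in the dataframe.
  missing_cols <- setdiff(c(exposure, outcome, confounders), names(data))
  if (length(missing_cols) > 0) {
    stop("Column(s) not found in data: ", paste(missing_cols, collapse = ", "))
  }
  invisible(TRUE)
}

toy <- data.frame(X_bi = c(0, 1), Y_bi = c(1, 0), C1 = c(0, 0))
check_observed_inputs(toy, bias = "uc", exposure = "X_bi", outcome = "Y_bi", confounders = "C1")
```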
data_validation is one of two different options to represent bias
assumptions for bias adjustment. It combines the validation dataframe
with specific identification of the appropriate columns for bias adjustment,
including: true exposure, true outcome, confounders, misclassified exposure,
misclassified outcome, and selection. The purpose of validation data is to
use an external data source to transport the necessary causal relationships
that are missing in the observed data.
data_validation(
  data,
  true_exposure,
  true_outcome,
  confounders = NULL,
  misclassified_exposure = NULL,
  misclassified_outcome = NULL,
  selection = NULL
)
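| Argument | Description |
|---|---|
| data | Dataframe of validation data |
| true_exposure | String name of the true exposure column in the validation dataframe. |
| true_outcome | String name of the true outcome column in the validation dataframe. |
| confounders | String name(s) of the confounder column(s) in the validation dataframe. |
| misclassified_exposure | String name of the misclassified exposure column in the validation dataframe. |
| misclassified_outcome | String name of the misclassified outcome column in the validation dataframe. |
| selection | String name of the selection indicator column in the validation dataframe. |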
An object of class data_validation containing:
| Component | Description |
|---|---|
| data | A dataframe with the selected columns |
| true_exposure | The name of the true exposure variable |
| true_outcome | The name of the true outcome variable |
| confounders | The name(s) of the confounder variable(s) |
| misclassified_exposure | The name of the misclassified exposure variable |
| misclassified_outcome | The name of the misclassified outcome variable |
| selection | The name of the selection indicator variable |
df <- data_validation(
  data = df_sel_source,
  true_exposure = "X",
  true_outcome = "Y",
  confounders = c("C1", "C2", "C3"),
  selection = "S"
)
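To illustrate how validation data with a selection indicator can supply the missing relationship, the sketch below fits a selection model and applies inverse probability of selection weighting in base R. This is a simplified stand-in rather than the package's internal procedure. The simulated dataframe `val`, its coefficients, and the choice to take the observed sample from the same simulation are all assumptions made for illustration.

```r
# Hypothetical validation data in which selection S is recorded for everyone.
set.seed(2)
n <- 20000
val <- data.frame(C1 = rbinom(n, 1, 0.5))
val$X <- rbinom(n, 1, plogis(-1 + 0.5 * val$C1))
val$Y <- rbinom(n, 1, plogis(-2 + log(2) * val$X + 0.5 * val$C1))
val$S <- rbinom(n, 1, plogis(-1 + 1.0 * val$X + 1.0 * val$Y))

# Selection model learned from the validation data.
sel_model <- glm(S ~ X + Y + C1, family = binomial, data = val)

# Observed data: only selected individuals, without the S column.
observed <- val[val$S == 1, c("X", "Y", "C1")]

# Weight each observed row by the inverse of its predicted selection probability.
observed$w <- 1 / predict(sel_model, newdata = observed, type = "response")

naive <- glm(Y ~ X + C1, family = binomial, data = observed)
# quasibinomial avoids warnings about non-integer weighted counts.
weighted <- glm(Y ~ X + C1, family = quasibinomial, data = observed, weights = w)

or_naive <- exp(coef(naive)[["X"]])
or_weighted <- exp(coef(weighted)[["X"]])
print(c(naive = or_naive, weighted = or_weighted))
```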
Data containing one source of bias, three known confounders, and
100,000 observations. This data is obtained from df_em_source
by removing the column X. The resulting data corresponds to
what a researcher would see in the real world: a misclassified exposure,
Xstar, and no data on the true exposure. As seen in
df_em_source, the true, unbiased exposure-outcome odds ratio = 2.
df_em
A dataframe with 100,000 rows and 5 columns:
misclassified exposure, 1 = present and 0 = absent
outcome, 1 = present and 0 = absent
1st confounder, 1 = present and 0 = absent
2nd confounder, 1 = present and 0 = absent
3rd confounder, 1 = present and 0 = absent
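The relationship between a source dataset and its observed counterpart can be sketched with simulated data: the true exposure is recorded alongside a misclassified copy, and the observed dataframe is the source with the true column dropped. The dataframe `source_df`, the sensitivity of 0.8, and the specificity of 0.9 are hypothetical values chosen for illustration, not the ones used to build the package data.

```r
set.seed(3)
n <- 20000
source_df <- data.frame(C1 = rbinom(n, 1, 0.5))
source_df$X <- rbinom(n, 1, plogis(-1 + 0.5 * source_df$C1))
source_df$Y <- rbinom(n, 1, plogis(-2 + log(2) * source_df$X + 0.5 * source_df$C1))

# Non-differential misclassification: Xstar depends only on the true X
# (assumed sensitivity 0.8, specificity 0.9).
source_df$Xstar <- rbinom(n, 1, ifelse(source_df$X == 1, 0.8, 0.1))

# Observed data: the source data with the true exposure removed.
observed_df <- source_df[, setdiff(names(source_df), "X")]

# Odds ratio using the true exposure versus the misclassified exposure.
or_true <- exp(coef(glm(Y ~ X + C1, family = binomial, data = source_df))[["X"]])
or_obs <- exp(coef(glm(Y ~ Xstar + C1, family = binomial, data = observed_df))[["Xstar"]])
print(c(true = or_true, observed = or_obs))
```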
Data containing two sources of bias, three known confounders, and
100,000 observations. This data is obtained from df_em_om_source
by removing the columns X and Y. The resulting data corresponds
to what a researcher would see in the real world: a misclassified exposure,
Xstar, and a misclassified outcome, Ystar. As seen in
df_em_om_source, the true, unbiased exposure-outcome
odds ratio = 2.
df_em_om
A dataframe with 100,000 rows and 5 columns:
misclassified exposure, 1 = present and 0 = absent
misclassified outcome, 1 = present and 0 = absent
1st confounder, 1 = present and 0 = absent
2nd confounder, 1 = present and 0 = absent
3rd confounder, 1 = present and 0 = absent
Data with complete information on the two sources of bias, three known
confounders, and 100,000 observations. This data is used to derive
df_em_om and can be used to obtain bias parameters for purposes
of validating the simultaneous multi-bias adjustment method with
df_em_om. With this source data, the fitted regression
logit(P(Y=1)) = α0 + α1X + α2C1 + α3C2 + α4C3
shows that the true, unbiased exposure-outcome odds ratio = 2.
df_em_om_source
A dataframe with 100,000 rows and 7 columns:
true exposure, 1 = present and 0 = absent
true outcome, 1 = present and 0 = absent
1st confounder, 1 = present and 0 = absent
2nd confounder, 1 = present and 0 = absent
3rd confounder, 1 = present and 0 = absent
misclassified exposure, 1 = present and 0 = absent
misclassified outcome, 1 = present and 0 = absent
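The statement that a fitted logistic regression shows an odds ratio of 2 amounts to exponentiating the exposure coefficient. A minimal sketch with simulated data follows; the dataframe `sim` and its generating coefficients are assumptions, with the exposure coefficient set to log(2) so that the true odds ratio is 2.

```r
set.seed(4)
n <- 50000
sim <- data.frame(
  C1 = rbinom(n, 1, 0.5),
  C2 = rbinom(n, 1, 0.2),
  C3 = rbinom(n, 1, 0.8)
)
sim$X <- rbinom(n, 1, plogis(-1 + 0.4 * sim$C1 + 0.3 * sim$C2))
# True exposure coefficient is log(2), i.e. an odds ratio of 2.
sim$Y <- rbinom(n, 1, plogis(-2 + log(2) * sim$X + 0.5 * sim$C1 - 0.3 * sim$C2 + 0.2 * sim$C3))

fit <- glm(Y ~ X + C1 + C2 + C3, family = binomial(link = "logit"), data = sim)

# Odds ratio and Wald 95% confidence interval for the exposure.
or_hat <- exp(coef(fit)[["X"]])
or_ci <- exp(confint.default(fit)["X", ])
print(round(c(or = or_hat, lower = or_ci[[1]], upper = or_ci[[2]]), 2))
```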
Data containing two sources of bias, three known confounders, and
100,000 observations. This data is obtained by sampling with replacement
with probability = S from df_em_sel_source and then removing the
columns X and S. The resulting data corresponds to what a
researcher would see in the real world: a misclassified exposure,
Xstar, and missing data for those not selected into the study
(S=0). As seen in df_em_sel_source, the true, unbiased
exposure-outcome odds ratio = 2.
df_em_sel
A dataframe with 100,000 rows and 5 columns:
misclassified exposure, 1 = present and 0 = absent
outcome, 1 = present and 0 = absent
1st confounder, 1 = present and 0 = absent
2nd confounder, 1 = present and 0 = absent
3rd confounder, 1 = present and 0 = absent
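Sampling with replacement with probability = S can be sketched with base R's sample(): because S is a 0/1 indicator, rows with S = 0 have zero sampling probability and never appear in the result. The dataframe `source_df` and its coefficients below are hypothetical.

```r
set.seed(5)
n <- 10000
source_df <- data.frame(X = rbinom(n, 1, 0.4))
source_df$Y <- rbinom(n, 1, plogis(-2 + log(2) * source_df$X))
source_df$S <- rbinom(n, 1, plogis(-1 + source_df$X + source_df$Y))

# Draw n row indices with replacement; rows with S = 0 get probability zero.
idx <- sample(seq_len(n), size = n, replace = TRUE, prob = source_df$S)

# Keep the sampled rows and drop the selection indicator.
observed_df <- source_df[idx, c("X", "Y")]
rownames(observed_df) <- NULL
```

The observed dataframe keeps the original number of rows, but every row comes from the selected subgroup.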
Data with complete information on the two sources of bias, three known
confounders, and 100,000 observations. This data is used to derive
df_em_sel and can be used to obtain bias parameters for purposes
of validating the simultaneous multi-bias adjustment method with
df_em_sel. With this source data, the fitted regression
logit(P(Y=1)) = α0 + α1X + α2C1 + α3C2 + α4C3
shows that the true, unbiased exposure-outcome odds ratio = 2.
df_em_sel_source
A dataframe with 100,000 rows and 7 columns:
true exposure, 1 = present and 0 = absent
outcome, 1 = present and 0 = absent
1st confounder, 1 = present and 0 = absent
2nd confounder, 1 = present and 0 = absent
3rd confounder, 1 = present and 0 = absent
misclassified exposure, 1 = present and 0 = absent
selection, 1 = selected into the study and 0 = not selected into the study
Data with complete information on one source of bias, three known
confounders, and 100,000 observations. This data is used to derive
df_em and can be used to obtain bias parameters for purposes
of validating the simultaneous multi-bias adjustment method with
df_em. With this source data, the fitted regression
logit(P(Y=1)) = α0 + α1X + α2C1 + α3C2 + α4C3
shows that the true, unbiased exposure-outcome odds ratio = 2.
df_em_source
A dataframe with 100,000 rows and 6 columns:
true exposure, 1 = present and 0 = absent
outcome, 1 = present and 0 = absent
1st confounder, 1 = present and 0 = absent
2nd confounder, 1 = present and 0 = absent
3rd confounder, 1 = present and 0 = absent
misclassified exposure, 1 = present and 0 = absent
Data containing one source of bias, three known confounders, and
100,000 observations. This data is obtained from df_om_source
by removing the column Y. The resulting data corresponds to
what a researcher would see in the real world: a misclassified outcome,
Ystar, and no data on the true outcome. As seen in
df_om_source, the true, unbiased exposure-outcome odds ratio = 2.
df_om
A dataframe with 100,000 rows and 5 columns:
exposure, 1 = present and 0 = absent
misclassified outcome, 1 = present and 0 = absent
1st confounder, 1 = present and 0 = absent
2nd confounder, 1 = present and 0 = absent
3rd confounder, 1 = present and 0 = absent
Data containing two sources of bias, three known confounders, and
100,000 observations. This data is obtained by sampling with replacement
with probability = S from df_om_sel_source and then removing the
columns Y and S. The resulting data corresponds to what a
researcher would see in the real world: a misclassified outcome,
Ystar, and missing data for those not selected into the study
(S=0). As seen in df_om_sel_source, the true, unbiased
exposure-outcome odds ratio = 2.
df_om_sel
A dataframe with 100,000 rows and 5 columns:
exposure, 1 = present and 0 = absent
misclassified outcome, 1 = present and 0 = absent
1st confounder, 1 = present and 0 = absent
2nd confounder, 1 = present and 0 = absent
3rd confounder, 1 = present and 0 = absent
Data with complete information on the two sources of bias, three known
confounders, and 100,000 observations. This data is used to derive
df_om_sel and can be used to obtain bias parameters for purposes
of validating the simultaneous multi-bias adjustment method with
df_om_sel. With this source data, the fitted regression
logit(P(Y=1)) = α0 + α1X + α2C1 + α3C2 + α4C3
shows that the true, unbiased exposure-outcome odds ratio = 2.
df_om_sel_source
A dataframe with 100,000 rows and 7 columns:
exposure, 1 = present and 0 = absent
true outcome, 1 = present and 0 = absent
1st confounder, 1 = present and 0 = absent
2nd confounder, 1 = present and 0 = absent
3rd confounder, 1 = present and 0 = absent
misclassified outcome, 1 = present and 0 = absent
selection, 1 = selected into the study and 0 = not selected into the study
Data with complete information on one source of bias, three known
confounders, and 100,000 observations. This data is used to derive
df_om and can be used to obtain bias parameters for purposes
of validating the simultaneous multi-bias adjustment method with
df_om. With this source data, the fitted regression
logit(P(Y=1)) = α0 + α1X + α2C1 + α3C2 + α4C3
shows that the true, unbiased exposure-outcome odds ratio = 2.
df_om_source
A dataframe with 100,000 rows and 6 columns:
exposure, 1 = present and 0 = absent
true outcome, 1 = present and 0 = absent
1st confounder, 1 = present and 0 = absent
2nd confounder, 1 = present and 0 = absent
3rd confounder, 1 = present and 0 = absent
misclassified outcome, 1 = present and 0 = absent
Data containing one source of bias, three known confounders, and 100,000
observations. This data is obtained by sampling with replacement with
probability = S from df_sel_source and then removing the S
column. The resulting data corresponds to what a researcher would see
in the real world: missing data for those not selected into the study
(S=0). As seen in df_sel_source, the true, unbiased
exposure-outcome odds ratio = 2.
df_sel
A dataframe with 100,000 rows and 5 columns:
exposure, 1 = present and 0 = absent
outcome, 1 = present and 0 = absent
1st confounder, 1 = present and 0 = absent
2nd confounder, 1 = present and 0 = absent
3rd confounder, 1 = present and 0 = absent
Data with complete information on study selection, three known
confounders, and 100,000 observations. This data is used to derive
df_sel and can be used to obtain bias parameters for purposes
of validating the simultaneous multi-bias adjustment method with
df_sel. With this source data, the fitted regression
logit(P(Y=1)) = α0 + α1X + α2C1 + α3C2 + α4C3
shows that the true, unbiased exposure-outcome odds ratio = 2.
df_sel_source
A dataframe with 100,000 rows and 6 columns:
true exposure, 1 = present and 0 = absent
outcome, 1 = present and 0 = absent
1st confounder, 1 = present and 0 = absent
2nd confounder, 1 = present and 0 = absent
3rd confounder, 1 = present and 0 = absent
selection, 1 = selected into the study and 0 = not selected into the study
Data containing one source of bias, three known confounders, and
100,000 observations. This data is obtained from df_uc_source
by removing the column U. The resulting data corresponds to
what a researcher would see in the real world: information on known
confounders (C1, C2, and C3), but not on
confounder U. As seen in df_uc_source, the true, unbiased
exposure-outcome effect estimate = 2.
df_uc
A dataframe with 100,000 rows and 7 columns:
binary exposure, 1 = present and 0 = absent
continuous exposure
binary outcome corresponding to exposure X_bi, 1 = present and 0 = absent
continuous outcome corresponding to exposure X_cont
1st confounder, 1 = present and 0 = absent
2nd confounder, 1 = present and 0 = absent
3rd confounder, 1 = present and 0 = absent
Data containing two sources of bias, three known confounders, and
100,000 observations. This data is obtained from df_uc_em_source
by removing the columns X and U. The resulting data
corresponds to what a researcher would see in the real world: a
misclassified exposure, Xstar, and missing data on a confounder
U. As seen in df_uc_em_source, the true, unbiased
exposure-outcome odds ratio = 2.
df_uc_em
A dataframe with 100,000 rows and 5 columns:
misclassified exposure, 1 = present and 0 = absent
outcome, 1 = present and 0 = absent
1st confounder, 1 = present and 0 = absent
2nd confounder, 1 = present and 0 = absent
3rd confounder, 1 = present and 0 = absent
Data containing three sources of bias, three known confounders, and
100,000 observations. This data is obtained by sampling with replacement
with probability = S from df_uc_em_sel_source and then removing
the columns X, U, and S. The resulting data corresponds
to what a researcher would see in the real world: a misclassified exposure,
Xstar; missing data on a confounder U; and missing data for
those not selected into the study (S=0). As seen in
df_uc_em_sel_source, the true, unbiased exposure-outcome
odds ratio = 2.
df_uc_em_sel
A dataframe with 100,000 rows and 5 columns:
misclassified exposure, 1 = present and 0 = absent
outcome, 1 = present and 0 = absent
1st confounder, 1 = present and 0 = absent
2nd confounder, 1 = present and 0 = absent
3rd confounder, 1 = present and 0 = absent
Data with complete information on the three sources of bias, three known
confounders, and 100,000 observations. This data is used to derive
df_uc_em_sel and can be used to obtain bias parameters for purposes
of validating the simultaneous multi-bias adjustment method with
df_uc_em_sel. With this source data, the fitted regression
logit(P(Y=1)) = α0 + α1X + α2C1 + α3C2 + α4C3 + α5U
shows that the true, unbiased exposure-outcome odds ratio = 2.
df_uc_em_sel_source
A dataframe with 100,000 rows and 8 columns:
true exposure, 1 = present and 0 = absent
outcome, 1 = present and 0 = absent
1st confounder, 1 = present and 0 = absent
2nd confounder, 1 = present and 0 = absent
3rd confounder, 1 = present and 0 = absent
unmeasured confounder, 1 = present and 0 = absent
misclassified exposure, 1 = present and 0 = absent
selection, 1 = selected into the study and 0 = not selected into the study
Data with complete information on the two sources of bias, three known
confounders, and 100,000 observations. This data is used to derive
df_uc_em and can be used to obtain bias parameters for purposes
of validating the simultaneous multi-bias adjustment method with
df_uc_em. With this source data, the fitted regression
logit(P(Y=1)) = α0 + α1X + α2C1 + α3U
shows that the true, unbiased exposure-outcome odds ratio = 2.
df_uc_em_source
A dataframe with 100,000 rows and 7 columns:
true exposure, 1 = present and 0 = absent
outcome, 1 = present and 0 = absent
1st confounder, 1 = present and 0 = absent
2nd confounder, 1 = present and 0 = absent
3rd confounder, 1 = present and 0 = absent
unmeasured confounder, 1 = present and 0 = absent
misclassified exposure, 1 = present and 0 = absent
Data containing two sources of bias, three known confounders, and
100,000 observations. This data is obtained from df_uc_om_source
by removing the columns Y and U. The resulting data
corresponds to what a researcher would see in the real world: a
misclassified outcome, Ystar, and missing data on the binary
confounder U. As seen in df_uc_om_source, the true, unbiased
exposure-outcome odds ratio = 2.
df_uc_om
A dataframe with 100,000 rows and 5 columns:
exposure, 1 = present and 0 = absent
misclassified outcome, 1 = present and 0 = absent
1st confounder, 1 = present and 0 = absent
2nd confounder, 1 = present and 0 = absent
3rd confounder, 1 = present and 0 = absent
Data containing three sources of bias, three known confounders, and
100,000 observations. This data is obtained by sampling with replacement
with probability = S from df_uc_om_sel_source and then removing
the columns Y, U, and S. The resulting data
corresponds to what a researcher would see in the real world:
a misclassified outcome, Ystar; missing data
on a confounder U; and missing data for those not selected
into the study (S=0). As seen in df_uc_om_sel_source,
the true, unbiased exposure-outcome odds ratio = 2.
df_uc_om_sel
A dataframe with 100,000 rows and 5 columns:
exposure, 1 = present and 0 = absent
misclassified outcome, 1 = present and 0 = absent
1st confounder, 1 = present and 0 = absent
2nd confounder, 1 = present and 0 = absent
3rd confounder, 1 = present and 0 = absent
Data with complete information on the three sources of bias, three known
confounders, and 100,000 observations. This data is used to derive
df_uc_om_sel and can be used to obtain bias parameters for purposes
of validating the simultaneous multi-bias adjustment method with
df_uc_om_sel. With this source data, the fitted regression
logit(P(Y=1)) = α0 + α1X + α2C1 + α3C2 + α4C3 + α5U
shows that the true, unbiased exposure-outcome odds ratio = 2.
df_uc_om_sel_source
A dataframe with 100,000 rows and 8 columns:
exposure, 1 = present and 0 = absent
true outcome, 1 = present and 0 = absent
1st confounder, 1 = present and 0 = absent
2nd confounder, 1 = present and 0 = absent
3rd confounder, 1 = present and 0 = absent
unmeasured confounder, 1 = present and 0 = absent
misclassified outcome, 1 = present and 0 = absent
selection, 1 = selected into the study and 0 = not selected into the study
Data with complete information on the two sources of bias, three known
confounders, and 100,000 observations. This data is used to derive
df_uc_om and can be used to obtain bias parameters for purposes
of validating the simultaneous multi-bias adjustment method with
df_uc_om. With this source data, the fitted regression
logit(P(Y=1)) = α0 + α1X + α2C1 + α3U
shows that the true, unbiased exposure-outcome odds ratio = 2.
df_uc_om_source
A dataframe with 100,000 rows and 7 columns:
exposure, 1 = present and 0 = absent
true outcome, 1 = present and 0 = absent
1st confounder, 1 = present and 0 = absent
2nd confounder, 1 = present and 0 = absent
3rd confounder, 1 = present and 0 = absent
unmeasured confounder, 1 = present and 0 = absent
misclassified outcome, 1 = present and 0 = absent
Data containing two sources of bias, three known confounders, and 100,000
observations. This data is obtained by sampling with replacement with
probability = S from df_uc_sel_source and then removing
the columns U and S. The resulting data corresponds to
what a researcher would see in the real world: missing data on
confounder U and missing data for those not selected into the study
(S=0). As seen in df_uc_sel_source, the true, unbiased
exposure-outcome odds ratio = 2.
df_uc_sel
A dataframe with 100,000 rows and 5 columns:
exposure, 1 = present and 0 = absent
outcome, 1 = present and 0 = absent
1st confounder, 1 = present and 0 = absent
2nd confounder, 1 = present and 0 = absent
3rd confounder, 1 = present and 0 = absent
Data with complete information on the two sources of bias, three known
confounders, and 100,000 observations. This data is used to derive
df_uc_sel and can be used to obtain bias parameters for purposes
of validating the simultaneous multi-bias adjustment method with
df_uc_sel. With this source data, the fitted regression
logit(P(Y=1)) = α0 + α1X + α2C1 + α3C2 + α4C3 + α5U
shows that the true, unbiased exposure-outcome odds ratio = 2.
df_uc_sel_source
A dataframe with 100,000 rows and 7 columns:
true exposure, 1 = present and 0 = absent
outcome, 1 = present and 0 = absent
1st confounder, 1 = present and 0 = absent
2nd confounder, 1 = present and 0 = absent
3rd confounder, 1 = present and 0 = absent
unmeasured confounder, 1 = present and 0 = absent
selection, 1 = selected into the study and 0 = not selected into the study
Data with complete information on one source of bias, three known
confounders, and 100,000 observations. This data is used to derive
df_uc and can be used to obtain bias parameters for purposes
of validating the simultaneous multi-bias adjustment method with
df_uc. With this source data, the fitted regression
g(P(Y=1)) = α0 + α1X + α2C1 + α3C2 + α4C3 + α5U
shows that the true, unbiased exposure-outcome effect estimate = 2 when:
g = logit, Y = Y_bi, and X = X_bi or
g = identity, Y = Y_cont, X = X_cont.
df_uc_source
A dataframe with 100,000 rows and 8 columns:
binary exposure, 1 = present and 0 = absent
continuous exposure
binary outcome corresponding to exposure X_bi, 1 = present and 0 = absent
continuous outcome corresponding to exposure X_cont
1st confounder, 1 = present and 0 = absent
2nd confounder, 1 = present and 0 = absent
3rd confounder, 1 = present and 0 = absent
uncontrolled confounder, 1 = present and 0 = absent
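The two settings described for this source data (logit link with binary variables, identity link with continuous variables) can be sketched on simulated data, fitting each model with and without the uncontrolled confounder. The dataframe `d` and all generating coefficients are hypothetical; only the column naming follows the dataset description.

```r
set.seed(7)
n <- 20000
d <- data.frame(C1 = rbinom(n, 1, 0.5), U = rbinom(n, 1, 0.5))

# Binary exposure and outcome: true odds ratio of 2.
d$X_bi <- rbinom(n, 1, plogis(-1 + 0.5 * d$C1 + 1.0 * d$U))
d$Y_bi <- rbinom(n, 1, plogis(-2 + log(2) * d$X_bi + 0.5 * d$C1 + 1.0 * d$U))

# Continuous exposure and outcome: true slope of 2.
d$X_cont <- rnorm(n, mean = 0.5 * d$C1 + 1.0 * d$U)
d$Y_cont <- rnorm(n, mean = 2 * d$X_cont + 0.5 * d$C1 + 1.0 * d$U)

# g = logit: odds ratio with and without adjustment for U.
or_full <- exp(coef(glm(Y_bi ~ X_bi + C1 + U, family = binomial, data = d))[["X_bi"]])
or_no_u <- exp(coef(glm(Y_bi ~ X_bi + C1, family = binomial, data = d))[["X_bi"]])

# g = identity: linear regression slope with and without adjustment for U.
b_full <- coef(lm(Y_cont ~ X_cont + C1 + U, data = d))[["X_cont"]]
b_no_u <- coef(lm(Y_cont ~ X_cont + C1, data = d))[["X_cont"]]

print(round(c(or_full = or_full, or_no_u = or_no_u, b_full = b_full, b_no_u = b_no_u), 2))
```

In this simulation U raises both exposure and outcome, so leaving it out pushes the continuous-case slope above its true value.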
multibias_adjust returns the exposure-outcome odds ratio and confidence
interval, adjusted for one or more biases.
multibias_adjust(
  data_observed,
  data_validation = NULL,
  bias_params = NULL,
  bootstrap = FALSE,
  bootstrap_reps = 100,
  level = 0.95
)
| Argument | Description |
|---|---|
| data_observed | Object of class data_observed, as created by data_observed(). |
| data_validation | Object of class data_validation, as created by data_validation(). |
| bias_params | Object of class bias_params corresponding to the bias parameters used to adjust for bias in the observed data. There must be parameters corresponding to the bias or biases specified in data_observed. |
| bootstrap | Boolean for whether to perform bootstrapping to obtain the estimate and confidence interval. |
| bootstrap_reps | Integer number of bootstrap samples to run in bootstrapping. |
| level | Value between 0 and 1 giving the confidence level of the interval. Default is 0.95. |
Bias adjustment can be performed by inputting either a validation dataset or
the necessary bias parameters. Values for the bias parameters
can be applied as fixed values or as single draws from a probability
distribution (e.g., rnorm(1, mean = 2, sd = 1)). The latter has
the advantage of allowing the researcher to capture the uncertainty
in the bias parameter estimates. To incorporate this uncertainty in the
estimate and confidence interval, this function should be run in a loop across
bootstrap samples of the dataframe for analysis. The estimate and
confidence interval would then be obtained from the median and quantiles
of the distribution of odds ratio estimates.
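The loop described in this paragraph can be sketched in base R. To keep the example self-contained, the adjustment step is a hypothetical stand-in, `adjust_or`, which applies inverse probability of selection weighting with selection coefficients drawn from normal distributions. It is not multibias_adjust(), and the simulated data and distribution parameters are assumptions.

```r
set.seed(6)
n <- 2000
obs <- data.frame(X = rbinom(n, 1, 0.4))
obs$Y <- rbinom(n, 1, plogis(-1.5 + 0.9 * obs$X))

# Hypothetical stand-in for the adjustment: weight by 1 / P(S = 1 | X, Y),
# where s holds the intercept, X, and Y coefficients of the selection model.
adjust_or <- function(df, s) {
  df$w <- 1 / plogis(s[1] + s[2] * df$X + s[3] * df$Y)
  fit <- glm(Y ~ X, family = quasibinomial, data = df, weights = w)
  exp(coef(fit)[["X"]])
}

reps <- 200
level <- 0.95
est <- numeric(reps)
for (i in seq_len(reps)) {
  # 1. Bootstrap sample of the dataframe for analysis.
  boot <- obs[sample(seq_len(n), size = n, replace = TRUE), ]
  # 2. Single draw of each bias parameter from its assumed distribution.
  s_draw <- c(rnorm(1, 0, 0.1), rnorm(1, 0.7, 0.1), rnorm(1, 0.2, 0.1))
  # 3. Bias-adjusted odds ratio for this replicate.
  est[i] <- adjust_or(boot, s_draw)
}

# Point estimate from the median, interval from the quantiles.
alpha <- 1 - level
result <- c(estimate = median(est), quantile(est, c(alpha / 2, 1 - alpha / 2)))
print(round(result, 2))
```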
A list including: the bias-adjusted effect estimate of the exposure on the outcome, the standard error, and the confidence interval as the vector: (lower bound, upper bound).
# Adjust for exposure misclassification -------------------------------------
df_observed <- data_observed(
  data = df_em,
  bias = "em",
  exposure = "Xstar",
  outcome = "Y",
  confounders = "C1"
)

# Using validation data
df_validation <- data_validation(
  data = df_em_source,
  true_exposure = "X",
  true_outcome = "Y",
  confounders = "C1",
  misclassified_exposure = "Xstar"
)

multibias_adjust(
  data_observed = df_observed,
  data_validation = df_validation
)

# Using bias_params
bp <- bias_params(coef_list = list(x = c(-2.10, 1.62, 0.63, 0.35)))

multibias_adjust(
  data_observed = df_observed,
  bias_params = bp
)

# Adjust for three biases ---------------------------------------------------
df_observed <- data_observed(
  data = df_uc_om_sel,
  bias = c("uc", "om", "sel"),
  exposure = "X",
  outcome = "Ystar",
  confounders = c("C1", "C2", "C3")
)

# Using validation data
df_validation <- data_validation(
  data = df_uc_om_sel_source,
  true_exposure = "X",
  true_outcome = "Y",
  confounders = c("C1", "C2", "C3", "U"),
  misclassified_outcome = "Ystar",
  selection = "S"
)

multibias_adjust(
  data_observed = df_observed,
  data_validation = df_validation
)

# Using bias_params
bp1 <- bias_params(
  coef_list = list(
    u = c(-0.32, 0.59, 0.69),
    y = c(-2.85, 0.71, 1.63, 0.40, -0.85, 0.22),
    s = c(0.00, 0.74, 0.19, 0.02, -0.06, 0.02)
  )
)

multibias_adjust(
  data_observed = df_observed,
  bias_params = bp1
)

bp2 <- bias_params(
  coef_list = list(
    u1y0 = c(-0.20, 0.62, 0.01, -0.08, 0.10, -0.15),
    u0y1 = c(-3.28, 0.63, 1.65, 0.42, -0.85, 0.26),
    u1y1 = c(-2.70, 1.22, 1.64, 0.32, -0.77, 0.09),
    s = c(0.00, 0.74, 0.19, 0.02, -0.06, 0.02)
  )
)

# with bootstrapping
## Not run:
multibias_adjust(
  data_observed = df_observed,
  bias_params = bp2,
  bootstrap = TRUE,
  bootstrap_reps = 1000
)
## End(Not run)
This function generates a forest plot comparing the observed effect estimate with adjusted effect estimates from sensitivity analyses. The plot includes point estimates and confidence intervals for each analysis.
multibias_plot(data_observed, multibias_result_list, log_scale = FALSE)
| Argument | Description |
|---|---|
| data_observed | Object of class data_observed, as created by data_observed(). |
| multibias_result_list | A named list of sensitivity analysis results. Each element should be a result from multibias_adjust(). |
| log_scale | Boolean indicating whether to display the x-axis on the log scale. Default is FALSE. |
A ggplot object showing a forest plot with:
Point estimates (blue dots)
Confidence intervals (gray horizontal lines)
A vertical reference line at x=1 (dashed)
Appropriate labels and title
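A plot with these elements could be assembled along the following lines. This is a sketch, not the function's actual implementation: the element names `estimate` and `ci` are assumed, and the numbers in `results` are made-up placeholders rather than output from any analysis. The plotting step runs only if ggplot2 is installed.

```r
# Placeholder results (illustrative values only, not real estimates).
results <- list(
  "Observed" = list(estimate = 1.6, ci = c(1.4, 1.8)),
  "Adjusted (scenario A)" = list(estimate = 2.0, ci = c(1.7, 2.4))
)

# One row per analysis; reversed factor levels keep the first analysis on top.
plot_df <- data.frame(
  analysis = factor(names(results), levels = rev(names(results))),
  estimate = vapply(results, function(r) r$estimate, numeric(1)),
  lower = vapply(results, function(r) r$ci[1], numeric(1)),
  upper = vapply(results, function(r) r$ci[2], numeric(1)),
  row.names = NULL
)

if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(plot_df, aes(x = estimate, y = analysis)) +
    geom_vline(xintercept = 1, linetype = "dashed") +              # reference line at 1
    geom_linerange(aes(xmin = lower, xmax = upper), color = "gray40") +  # confidence intervals
    geom_point(color = "blue", size = 3) +                        # point estimates
    labs(x = "Effect estimate", y = NULL, title = "Observed vs. bias-adjusted estimates")
}
```

Printing `p` renders the plot. Adding scale_x_log10() would give a log-scale x-axis comparable to log_scale = TRUE.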
## Not run:
df_observed <- data_observed(
  data = df_em,
  bias = "em",
  exposure = "Xstar",
  outcome = "Y",
  confounders = "C1"
)

bp1 <- bias_params(coef_list = list(x = c(-2.10, 1.62, 0.63, 0.35)))
bp2 <- bias_params(coef_list = list(x = c(-2.10 * 2, 1.62 * 2, 0.63 * 2, 0.35 * 2)))

result1 <- multibias_adjust(
  data_observed = df_observed,
  bias_params = bp1
)
result2 <- multibias_adjust(
  data_observed = df_observed,
  bias_params = bp2
)

multibias_plot(
  data_observed = df_observed,
  multibias_result_list = list(
    "Adjusted with bias params" = result1,
    "Adjusted with bias params doubled" = result2
  )
)
## End(Not run)