P(A=1|Z=1)=(54+6)/(54+6+24+36)=0.5
P(A=1|Z=0)=(28+12)/(28+12+8+32)=0.5
P(A=1)=0.5
Suppose Z, A, and Y are binary variables. The original sample sizes are shown below:
| | treatment (A=1) | control (A=0) |
|---|---|---|
| Z=1 | a | b |
| Z=0 | c | d |
For Z=1:
In the treatment group: PS=P(A=1|Z=1)=a/(a+b), weight1=1/PS=(a+b)/a
In the control group: 1−PS=P(A=0|Z=1)=b/(a+b), weight2=1/(1−PS)=(a+b)/b
Therefore,
the weighted sample size with Z=1 in the treatment (A=1) group is a∗weight1=a∗(a+b)/a=a+b;
the weighted sample size with Z=1 in the control (A=0) group is b∗weight2=b∗(a+b)/b=a+b.
For Z=0:
In the treatment group: PS=P(A=1|Z=0)=c/(c+d), weight3=1/PS=(c+d)/c
In the control group: 1−PS=P(A=0|Z=0)=d/(c+d), weight4=1/(1−PS)=(c+d)/d
Therefore,
the weighted sample size with Z=0 in the treatment (A=1) group is c∗weight3=c∗(c+d)/c=c+d;
the weighted sample size with Z=0 in the control (A=0) group is d∗weight4=d∗(c+d)/d=c+d.
To sum up, the weighted sample sizes are:
| | treatment (A=1) | control (A=0) |
|---|---|---|
| Z=1 | a+b | a+b |
| Z=0 | c+d | c+d |
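That is, in the weighted sample the distribution of Z is the same in the treatment and control groups, so exchangeability is achieved!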
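The algebra above can be checked numerically. The sketch below uses made-up cell counts (a=30, b=10, c=20, d=40 are assumptions for illustration, not values from these slides) and base R only; the object names are likewise hypothetical.

```r
# Hypothetical cell counts: rows are Z = 1 and Z = 0, columns are A = 1 and A = 0.
counts <- matrix(c(30, 10,
                   20, 40),
                 nrow = 2, byrow = TRUE,
                 dimnames = list(Z = c("1", "0"), A = c("1", "0")))

# Propensity score within each level of Z: P(A = 1 | Z)
ps <- counts[, "1"] / rowSums(counts)

# Inverse probability weights: 1/PS for treated cells, 1/(1 - PS) for control cells
weights <- cbind("1" = 1 / ps, "0" = 1 / (1 - ps))

# Weighted (pseudo-population) cell sizes: each row becomes (a+b, a+b) or (c+d, c+d)
weighted_counts <- counts * weights
print(weighted_counts)

# Share of Z = 1 within each treatment arm, before and after weighting
print(prop.table(counts, margin = 2)["1", ])
print(prop.table(weighted_counts, margin = 2)["1", ])
```

With these counts the share of Z=1 differs between arms before weighting (0.6 versus 0.2) and is 0.4 in both arms afterwards, which is the balance the table describes.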
IPW=Ti/PS+(1−Ti)/(1−PS)
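As a minimal sketch, the formula can be wrapped in a small base R function. The name `ipw_ate` is hypothetical; the inputs are the treatment indicators and the rounded propensity scores printed for the first three households in the mosquito-net example in this deck.

```r
# ATE-type inverse probability weight: Ti/PS + (1 - Ti)/(1 - PS)
ipw_ate <- function(treated, ps) {
  # treated must be 0/1 and propensity scores must lie strictly between 0 and 1
  stopifnot(all(treated %in% c(0, 1)), all(ps > 0 & ps < 1))
  treated / ps + (1 - treated) / (1 - ps)
}

# Households 1-3 of the example: net use (1 = TRUE) and rounded propensity scores
treated <- c(1, 0, 0)
ps <- c(0.367, 0.389, 0.158)
ipw <- round(ipw_ate(treated, ps), 2)
print(ipw)  # 2.72 1.64 1.19
```

Treated units are weighted by 1/PS and control units by 1/(1−PS), so a unit that received an unlikely treatment given its covariates counts for more.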
Our goal in this example is to estimate the causal effect of bed net usage on malaria risk using only observational data.
The mosquito_net data set contains the following variables:
Malaria risk (malaria_risk): The likelihood that someone in the household will be infected with malaria. Measured on a scale of 0–100.
Mosquito net (net and net_num): A binary variable indicating if the household used mosquito nets.
Eligible for program (eligible): A binary variable indicating if the household is eligible for the free net program.
Income (income): The household’s monthly income.
Nighttime temperatures (temperature): The average temperature at night.
Health (health): Self-reported healthiness in the household. Measured on a scale of 0–100.
Number in household (household): Number of people living in the household.
Insecticide resistance (resistance): Some strains of mosquitoes are more resistant to insecticide and thus pose a higher risk of infecting people with malaria. This is measured on a scale of 0–100, with higher values indicating higher resistance.
```r
pacman::p_load(readr, dplyr)  # dplyr supplies %>%, rename(), select(), and mutate() used below
nets <- read_csv("mosquito_nets.csv")
head(nets[, c(1:6)])
```

```
## # A tibble: 6 x 6
##      id net   net_num malaria_risk income health
##   <dbl> <lgl>   <dbl>        <dbl>  <dbl>  <dbl>
## 1     1 TRUE        1           33    781     56
## 2     2 FALSE       0           42    974     57
## 3     3 FALSE       0           80    502     15
## 4     4 TRUE        1           34    671     20
## 5     5 FALSE       0           44    728     17
## 6     6 FALSE       0           25   1050     48
```

```r
head(nets[, c(7:10)])
```

```
## # A tibble: 6 x 4
##   household eligible temperature resistance
##       <dbl> <lgl>          <dbl>      <dbl>
## 1         2 FALSE           21.1         59
## 2         4 FALSE           26.5         73
## 3         3 FALSE           25.6         65
## 4         5 TRUE            21.3         46
## 5         5 FALSE           19.2         54
## 6         1 FALSE           25.3         34
```

```r
# Fit a logistic regression to get the propensity score
model_net <- glm(net ~ income + temperature + health,
                 data = nets,
                 family = binomial(link = "logit"))
# tidy(model_net, exponentiate = TRUE)

pacman::p_load(broom)  # augment_columns(), tidy()
net_probabilities <- augment_columns(
  model_net,                 # model
  nets,                      # original data
  type.predict = "response"  # type of the prediction
) %>%
  rename(propensity = .fitted)

net_probabilities %>%
  select(id, net, income, temperature, health, propensity) %>%
  head()
```

```
## # A tibble: 6 x 6
##      id net   income temperature health propensity
##   <dbl> <lgl>  <dbl>       <dbl>  <dbl>      <dbl>
## 1     1 TRUE     781        21.1     56      0.367
## 2     2 FALSE    974        26.5     57      0.389
## 3     3 FALSE    502        25.6     15      0.158
## 4     4 TRUE     671        21.3     20      0.263
## 5     5 FALSE    728        19.2     17      0.308
## 6     6 FALSE   1050        25.3     48      0.429
```

```r
net_ipw <- net_probabilities %>%
  mutate(ipw = (net_num / propensity) + ((1 - net_num) / (1 - propensity)))

# Look at the first few rows of a few columns
net_ipw %>%
  select(id, net, income, temperature, health, propensity, ipw) %>%
  head()
```

```
## # A tibble: 6 x 7
##      id net   income temperature health propensity   ipw
##   <dbl> <lgl>  <dbl>       <dbl>  <dbl>      <dbl> <dbl>
## 1     1 TRUE     781        21.1     56      0.367  2.72
## 2     2 FALSE    974        26.5     57      0.389  1.64
## 3     3 FALSE    502        25.6     15      0.158  1.19
## 4     4 TRUE     671        21.3     20      0.263  3.81
## 5     5 FALSE    728        19.2     17      0.308  1.44
## 6     6 FALSE   1050        25.3     48      0.429  1.75
```

```r
# pacman::p_load(jstable)
pacman::p_load(tableone)  # svyCreateTableOne()
pacman::p_load(survey)    # svydesign()

sdm_dat_weighted <- svydesign(ids = ~ id, strata = ~ net, weights = ~ ipw,
                              nest = FALSE, data = net_ipw)
sdm_dat_unweighted <- svydesign(ids = ~ id, strata = ~ net, weights = ~ 1,
                                nest = FALSE, data = net_ipw)

tabWeighted <- svyCreateTableOne(vars = c('income', 'health', 'temperature'),
                                 strata = "net", data = sdm_dat_weighted, test = FALSE)
tabunweighted <- svyCreateTableOne(vars = c('income', 'health', 'temperature'),
                                   strata = "net", data = sdm_dat_unweighted, test = FALSE)

## Show table with SMD
print(tabunweighted, smd = TRUE)
```

```
##                          Stratified by net
##                           FALSE             TRUE              SMD
##   n                       1071.00            681.00
##   income (mean (SD))       872.75 (172.69)   955.19 (201.63)   0.439
##   health (mean (SD))        48.06 (17.22)     54.91 (18.93)    0.379
##   temperature (mean (SD))   24.09 (4.03)      23.38 (4.20)     0.172
```

```r
print(tabWeighted, smd = TRUE)
```

```
##                          Stratified by net
##                           FALSE             TRUE              SMD
##   n                       1741.09           1784.96
##   income (mean (SD))       899.89 (175.51)   890.39 (210.85)   0.049
##   health (mean (SD))        50.38 (17.49)     49.71 (19.40)    0.036
##   temperature (mean (SD))   23.83 (4.03)      23.91 (4.24)     0.019
```

```r
model_unweighted <- lm(malaria_risk ~ net, data = nets)
tidy(model_unweighted)
```

```
## # A tibble: 2 x 5
##   term        estimate std.error statistic   p.value
##   <chr>          <dbl>     <dbl>     <dbl>     <dbl>
## 1 (Intercept)     41.9     0.405     104.  0
## 2 netTRUE        -16.3     0.649     -25.1 2.25e-119
```

```r
model_ipw <- lm(malaria_risk ~ net, data = net_ipw, weights = ipw)
tidy(model_ipw)
```

```
## # A tibble: 2 x 5
##   term        estimate std.error statistic  p.value
##   <chr>          <dbl>     <dbl>     <dbl>    <dbl>
## 1 (Intercept)     39.7     0.468      84.7 0
## 2 netTRUE        -10.1     0.658     -15.4 3.21e-50
```

Calculation: IPW=Ti/PS+(1−Ti)/(1−PS)
Target population: the whole population, both treated and control units (ATE).
Calculation: IPW=(PS∗Ti)/PS+[PS∗(1−Ti)]/(1−PS)
Target population: the treated population (ATT).
Calculation: IPW=[(1−PS)∗Ti]/PS+[(1−PS)∗(1−Ti)]/(1−PS)
Target population: the control population (ATC).
Calculation: IPW=min(PS,1−PS)/[Ti∗PS+(1−Ti)∗(1−PS)]
Target population: a population resembling a 1:1 matched sample (matching weights, ATM).
Calculation: IPW=(1−PS)∗Ti+PS∗(1−Ti)
Target population: the overlap population, i.e. units with a substantial probability of either treatment (overlap weights, ATO).
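The five calculations above can be collected into one helper for side-by-side comparison. This is a sketch in base R: the function name `ps_weights` and the `type` labels ("ate", "att", "atc", "matching", "overlap") are hypothetical names chosen here, not part of any package.

```r
# Propensity-score weights for different target populations.
# treated: 0/1 treatment indicator (Ti); ps: estimated propensity score (PS)
ps_weights <- function(treated, ps,
                       type = c("ate", "att", "atc", "matching", "overlap")) {
  type <- match.arg(type)
  stopifnot(all(treated %in% c(0, 1)), all(ps > 0 & ps < 1))
  # ATE weight: Ti/PS + (1 - Ti)/(1 - PS); ATT and ATC rescale it by PS or 1 - PS
  ate <- treated / ps + (1 - treated) / (1 - ps)
  switch(type,
    ate      = ate,
    att      = ps * ate,
    atc      = (1 - ps) * ate,
    matching = pmin(ps, 1 - ps) / (treated * ps + (1 - treated) * (1 - ps)),
    overlap  = (1 - ps) * treated + ps * (1 - treated)
  )
}

# One treated and one control unit, both with PS = 0.25 (made-up values)
treated <- c(1, 0)
ps <- c(0.25, 0.25)
weight_table <- sapply(c("ate", "att", "atc", "matching", "overlap"),
                       function(type) ps_weights(treated, ps, type))
print(round(weight_table, 3))
```

In the mosquito-net example, a call such as `ps_weights(net_ipw$net_num, net_ipw$propensity, "att")` would give ATT weights that could be passed to `lm(..., weights = )` in place of `ipw`.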