class: center, middle, inverse, title-slide

# Evaluating causality in an ideal setting
## RCTs, DAGs, and Bias
### Mathew Kiang, Zhe Zhang, Monica Alexander
### March 15, 2017

---
<!-- class: center, middle .center[<img src="./assets/do_you_like_dags.jpg">] -->

# Roadmap

???
`\(\def\indep{\perp \! \! \perp}\)`

So we've got some of the basic definitions down. We know what a causal effect is and we have a framework for estimating causal effects under certain assumptions. Now we're going to talk about the best case scenario for estimating causal effects — RCTs.

**NEXT SLIDE** While you probably won't get to do a lot of randomized experiments in the real world, they are useful to think about because every study design or analytic method in causal inference is ultimately just trying to mimic an RCT.

**NEXT SLIDE** After talking a bit about RCTs, we'll take a bit of a detour and talk about ways you can graphically state your causal and statistical assumptions: causal DAGs. I think they're very useful and hope I can convince you of that; however, a lot of people disagree. Regardless, you should know about them because you'll likely come across them in the causal inference literature.

**NEXT SLIDE** Finally, we'll combine RCTs and DAGs to show how bias can affect estimates when we don't have an RCT (and why RCTs are so powerful). Then next lecture, Zhe will talk more about the math behind these different biases.

--

### Randomized Controlled Trials
- Thinking about causality in the ideal setting

--

### Causal Directed Acyclic Graphs (DAGs)
- Useful ways of encoding our beliefs and assumptions
- Helpful for thinking about ways our assumptions can be wrong

--

### Understanding Bias from a Graphical Perspective
- More about these biases next lecture
- Why RCTs avoid many of these biases

---
# Randomized Controlled Trials

.pull-right[<img src="./assets/james_lind.jpg" width="250">]

???
Again, we'll start off with a bit of motivation and history.

**NEXT SLIDE** This is James Lind. He was a Scottish physician in the Royal Navy and he conducted the first recorded RCT in the 1700s. Specifically, he was interested in scurvy, which is a disease caused by Vitamin C deficiency with pretty bad symptoms. It's a disease of the connective tissue: because Vitamin C is required to create connective tissue, all sorts of bad things happen. Severe bleeding, gum disease, inability to heal from superficial wounds, etc. People eventually get personality changes and then die of infection or bleeding. That said, it's very easy to treat. Just some Vitamin C and improvement occurs in days and full recovery in about a month. It was very common back when there were long sea voyages and they were unable to keep fresh fruit available.

**NEXT SLIDE** So Lind was on a ship and he found 12 sailors with scurvy. He randomly split them up into 6 groups and gave each group a different treatment in addition to their regular diet. These are the six different interventions.

**NEXT SLIDE** As he suspected, only the group that had citrus fruits showed substantial improvement, with one returning to work and one on the mend (they ran out of fruit before he could completely heal).

--

.footnote[Wikipedia]

- First recorded RCT was done in 1747 by James Lind

--

- Scurvy is a terrible disease caused by Vitamin C deficiency. Serious issue during long sea voyages.
--

- Lind took 12 sailors with scurvy and split them into six groups of two:
    - Groups were assigned: (1) 1 qt cider, (2) 25 drops of vitriol, (3) 6 spoonfuls of vinegar, (4) 1/2 pt of sea water, (5) garlic, mustard, and barley water, (6) 2 oranges and 1 lemon

--

- Only Group 6 (citrus fruit) showed substantial improvement.

---
# Randomized Controlled Experiments

.center[<img src="./assets/facebook_emotion_header.jpg" width="600">]

???
So you are all probably more familiar with this study in PNAS. It was done by the Facebook Core Data Science Team.

**NEXT SLIDE** They wanted to better understand social contagion, so they conducted an experiment on over 600,000 Facebook users where they purposely hid negative or positive messages from users' News Feeds and then tracked whether or not the users themselves posted more negative or positive emotions.

**NEXT SLIDE** What they found was that emotional contagion existed within Facebook networks. That is, if you saw fewer negative posts, you were less negative and more positive in your own posts, and vice versa. In other words, the emotions of your friends seemed to spread through Facebook.

**NEXT SLIDE** It was really quite innovative because most social contagion work at this point had been done either with data and networks that were not designed for it (like the Framingham Heart Study) or in a laboratory setting with very few subjects. They were able to show that social contagion was real in an experimental setting and that you didn't actually need physical face-to-face interaction for it to happen.

**NEXT SLIDE** It was also incredibly controversial because they never got informed consent from the users and yet they were manipulating users' emotions. Thankfully, the effect size was very small, but what would have happened if the effect size wasn't small? And even if you have a small effect size, when you experiment on over half a million people, you may get a few adverse events.

**NEXT SLIDE**

--

- Massive experiment involving >600,000 people over 1 week

--

- Found that emotions expressed by others (on Facebook) affect our own emotions.

--

- Innovative because it showed that emotional contagion did not require directed social interaction, non-verbal cues, or face-to-face interaction.

--

- Also incredibly controversial due to manipulating emotions of real people on a large scale with no informed consent.

---
# Randomized Controlled Trials/Experiments

.footnote[Hernan Epi 201]

### What's the big deal about randomization?

???
There are a few things worth noting about these randomized experiments.

**NEXT SLIDE** First, Lind showed a strong causal effect with a tiny sample. An n=12 study would literally never get published today, but nonetheless, it's hard to argue with a strong causal effect even in a small RCT.

**NEXT SLIDE** Second, even though there was just a tiny effect size, a large sample and randomization allowed Facebook to make pretty strong causal claims.

**NEXT SLIDE** Third, even though Lind was wildly wrong about **why** the RCT worked, he was right about the causal effect of citrus fruit on preventing and curing scurvy. Back then, medicine didn't recognize the importance of vitamins, so Lind believed that scurvy, or all the bleeding, was a result of the body beginning to putrefy and that acid stopped the process. Similarly, we don't really understand the psychology of why social contagion works or much about it, yet it is hard to argue that it does not exist with such a large experiment.
**NEXT SLIDE**

--

- Even with a tiny sample, Lind was able to show a convincing causal effect

--

- Conversely, despite a very small causal effect, Facebook was able to show convincing causality (via large sample size)

--

- RCTs allow us to gain knowledge about causal effects **without knowing the mechanism**
    - Lind believed acid stopped putrefaction
    - Medicine had not yet understood the importance of vitamins
    - Previous research suggested emotional contagion required non-verbal cues to be effective

---
# Randomized Controlled Trials

### But RCTs aren't perfect!

???
BUT! RCTs have very real drawbacks, or else we wouldn't even be here learning about causal inference on observational data.

**NEXT SLIDE** This is a real article out of BMJ and the caption at the bottom says "Parachutes reduce the risk of injury after gravitational challenge, but their effectiveness has not been proved by randomised controlled trials." The point being that not everything can be tested with a randomized controlled trial.

**NEXT SLIDE**

--

.footnote[.red[*]Smith and Pell, 2003.]

.pull-right[<img src="./assets/parachutes_rcts.jpg" width="650">]

.pull-left["Parachutes reduce the risk of injury after gravitational challenge, but their effectiveness has not been proved with randomised controlled trials."]

---
# Randomized Controlled Trials

### But RCTs aren't perfect

- Expensive — limited feasibility

???
However, it is worth noting that you can't always do an RCT. You've never read an RCT on the causal effects of smoking and lung cancer in humans. It would be wildly unethical to force some humans to smoke and prevent others from doing so. The famous example people like to use is that you've never read an RCT on the effectiveness of parachutes in preventing death from jumping out of a plane. We know they work. We have observational studies to indicate they work. We have strong priors. Etc.

**NEXT SLIDE** It's also important to note that RCTs are often very costly and have a smaller, more select sample size. While the *internal* validity is high, the *external* validity may not be. In other words, I know the causal effect is true for people *inside of my study*. But I cannot be certain that it is true for everybody outside of my study as well.

--

- Ethical concerns — real effects on real people
    - Facebook and emotional manipulation study

--

- Limited generalizability
    - The *internal validity* of an RCT is very high, but the *external validity* is often not when the sample is small or selective
    - For example, perhaps Facebook users value social posts more than non-Facebook users

--

- Difficult — good RCTs are very hard to do

---
# RCT/E Definitions

.footnote[Hernan Epi 201]

- **Randomized**: The intervention is assigned randomly (not necessarily the same as the treatment is *received* randomly)

???
Briefly, let's go over some definitions. When people talk about randomized designs, they mean that the treatment was randomly assigned to groups — not that the people inside of the study are random (that's random sampling). Also note that assignment to treatment is not the same as receiving treatment.

**NEXT SLIDE** Controlled just means there is at least one well-defined comparison group to act as your counterfactual.

**NEXT SLIDE** An experiment just means investigators directly control the intervention. And a trial is just an experiment with the goal of studying the effect of some sort of medical intervention on humans. There's a distinction because trials have ethical requirements that experiments may or may not have.
**NEXT SLIDE** Lastly, let's define an IDEAL RCT as one with no loss to follow-up, so you see everybody's data for the entire study; perfect compliance — everybody does exactly as you ask them to; only one version of treatment; and double blinding.

--

- **Controlled**: There is (at least) one well-defined comparison group

--

- **Experiment**: Investigators (directly) control the intervention
    - **Trial**: An experiment where the goal is to study the effect of some medical intervention on humans
    - Trials have ethical requirements that experiments may not (e.g., equipoise)

--

- **Ideal setting**: In an ideal RCT, there is no loss to follow-up, perfect compliance, only one version of treatment, double (or triple) blind assignment

---
# Randomized Controlled Trials

- Recall from last lecture, because of the *Fundamental Problem of Causal Inference*, we can never observe both counterfactuals.

???
Ok. So why do people like randomized experiments? Recall that the fundamental problem of causal inference states we cannot observe both counterfactuals.

**NEXT SLIDE** So instead we try to approximate it. Under what scenarios would the difference in treatment groups be the same as the difference in counterfactuals?

**NEXT SLIDE** Then recall our assumptions. Here they are.

**NEXT SLIDE** These conditions are all met under a randomized experiment. The most important is exchangeability, so we'll go over that now.

**NEXT SLIDE**

--

- Instead, we ask: under what conditions do the data we have properly estimate the quantity we want?

`$$Pr(Y^{a=1}=1)-Pr(Y^{a=0}=1)=Pr(Y=1|A=1)-Pr(Y=1|A=0)$$`

--

- Again, recall our assumptions for the Rubin causal model:

--

- How do we ensure our groups are *exchangeable*?
    - That we have *positivity*?
    - And that we have *stability*?

--

### .center[These are all met in an ideal RCT]

---
# Randomized Controlled Trials

.footnote[Hernan Epi 201]

- Suppose we have a very large (nearly infinite) population of perfect compliers

???
So here's a simple thought experiment. Suppose you have a large population. Let's say everybody in Mexico, and everybody has agreed to do exactly as you say.

**NEXT SLIDE** You take them and you randomly assign them to two groups.

**NEXT SLIDE** By flipping a coin.

**NEXT SLIDE** Now you decide that you're going to treat Group Heads and not Group Tails. You run the experiment and at the end, you find that the risk of death in Group Heads is .25.

**NEXT SLIDE** Now pretend you did it the other way. You decided to treat Group Tails and not Group Heads. What would the risk of death be?

**NEXT SLIDE** Still .25. Because we randomized, on average, our groups will be the same across all characteristics.

**NEXT SLIDE**

--

- We divide them into two groups: (1) Group Heads and (2) Group Tails

--

- We assigned the groups by flipping a (fair) coin

--

- Now, we decide to treat Group Heads and not treat Group Tails
    - The risk of death in Group Heads is: `\(Pr(Y=1|A=1) = .25\)`

--

- Pretend that instead of doing this, we treated Group Tails and did not treat Group Heads. What would the expected risk of death in Group Tails be?

--

- Risk of death in Group Tails would still be `\(.25\)` if we flipped treatment.
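The coin-flip logic can be checked with a quick simulation. This is a minimal sketch, not part of the original slides; the population size, the 0.25 and 0.50 risks, and all variable names are made up for illustration.

```r
# Minimal sketch of the coin-flip thought experiment (hypothetical numbers).
set.seed(42)
n  <- 1e6                      # "nearly infinite" population of perfect compliers
y1 <- rbinom(n, 1, 0.25)       # counterfactual death if treated
y0 <- rbinom(n, 1, 0.50)       # counterfactual death if untreated
heads <- rbinom(n, 1, 0.5)     # fair coin: 1 = Group Heads, 0 = Group Tails

# Scenario 1: treat Group Heads
y_obs <- ifelse(heads == 1, y1, y0)
mean(y_obs[heads == 1])        # Pr(Y=1|A=1), roughly 0.25

# Scenario 2: treat Group Tails instead
y_alt <- ifelse(heads == 0, y1, y0)
mean(y_alt[heads == 0])        # still roughly 0.25 -- the groups are exchangeable

# The observed contrast recovers the counterfactual contrast
mean(y_obs[heads == 1]) - mean(y_obs[heads == 0])
mean(y1) - mean(y0)
```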
---
class: center, middle

# Exchangeability is a consequence of randomization
##

???
So exchangeability is a consequence of randomization and that's why experiments are so powerful.

**NEXT SLIDE**

---
class: center, middle

# Exchangeability is a consequence of randomization
## Unfortunately, we (usually) won't get to randomize

???
Unfortunately, we don't get to randomize very often, so instead we need to think of other ways that we can safely assume exchangeability. So let's define it formally.

**NEXT SLIDE**

---
# Formal definition of exchangeability

.footnote[Hernan Epi 201]

The counterfactual outcome `\(Y^a\)` is independent of the actual treatment `\(A\)`.

`$$Y^a \indep A \text{ for all } a$$`

???
Exchangeability means that your counterfactual outcome is independent of your treatment.

**NEXT SLIDE** It is not straightforward to express using statistical language.

**NEXT SLIDE** Note that it is different from independence. We actually expect your actual outcome to be related to your actual assignment.

**NEXT SLIDE**

--

- Exchangeability is "another causal concept that cannot be represented by associational (statistical) language"

--

### NOTE! Exchangeability is not the same as independence. That is,

`$$Y^a \indep A \neq Y \indep A$$`
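To see the distinction concretely, here is a minimal sketch (hypothetical numbers and variable names, not from the original slides) in which the counterfactual outcome is independent of assignment but the observed outcome is not:

```r
# Sketch: Y^a is independent of A, but the observed Y is not.
set.seed(1)
n  <- 1e6
y1 <- rbinom(n, 1, 0.25)      # counterfactual outcome under treatment
y0 <- rbinom(n, 1, 0.50)      # counterfactual outcome under no treatment
a  <- rbinom(n, 1, 0.5)       # randomized assignment
y  <- ifelse(a == 1, y1, y0)  # observed outcome

mean(y1[a == 1]) - mean(y1[a == 0])  # ~0: counterfactual risk does not depend on A
mean(y[a == 1])  - mean(y[a == 0])   # ~-0.25: observed risk does depend on A
```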
---
class: center, middle

# All design and analytic methods try to mimic randomization in order to "create" exchangeability

???
Ok, so that section on experiments was just to emphasize the importance of exchangeability. All design and analytic methods we talk about are just efforts to mimic randomization and "create" exchangeability.

---
class: center, middle, inverse

# The Life-changing Magic of DAGs

---
# Causal Directed Acyclic Graphs (DAGs)

.footnote[.red[*]Ch 16 by Maria Glymour in Methods Social Epidemiology]

- Causal DAGs provide a mathematical link between statistical association and causality.

???
Let's take what may seem like a little detour and talk about DAGs. DAGs are directed acyclic graphs. DAGs allow us to visually express our causal assumptions and see which statistical associations those assumptions imply. That is, you have some data and you run a regression with a bunch of variables and you get out a bunch of coefficients.

**NEXT SLIDE** Drawing a DAG allows you to very explicitly state the relationship between all those statistical associations and how they lead to some causal effect.

**NEXT SLIDE** Importantly, they allow you to also do the reverse. That is, given a bunch of associations from your regression model, what are the different causal structures that could have produced them, and in what ways could your causal interpretation be wrong? DAGs have a strong mathematical underpinning and link causality and statistics. If you're interested in learning more about them, definitely check out *Causality* by Judea Pearl.

**NEXT SLIDE**

--

- DAGs allow us to easily express the statistical associations of different variables implied by our assumed causal structure

.center[<img src="./assets/basic_dag.jpg" width="200">]

--

- Even better, they also allow us to do the reverse. Given a set of associations in our data, diagram the possible causal structures that could result in the same associations

.center[<img src="./assets/total_confounding.jpg" width="300">]

---
# DAGs and Causal Inference

.center[<img src="./assets/basic_dag.jpg" width="200">]

.footnote[.red[*]Ch 16 by Maria Glymour in Methods Social Epidemiology]

- Graphically, the goal of causal inference is to estimate the magnitude and direction of the arrow

???
So in causal inference, you want to estimate the direction and magnitude of this arrow. X goes to Y, so how much does changing X also change Y?

**NEXT SLIDE** Now, we can actually only estimate the magnitude. Our tools only allow us to estimate association, so even if your model is right and your assumptions are right and your data are perfect, you're still only getting the magnitude of the effect.

**NEXT SLIDE** And this estimate can occur for any one of these reasons. You could have just gotten lucky with your sample. It could actually be causal. It could be reverse causation, so maybe Y caused X and that's why they are associated. Maybe there's a common cause for both X and Y and you're estimating an association through a backdoor path. And lastly, maybe you are conditioning on something that is actually a common effect of X and Y.

**NEXT SLIDE**

--

- Statistically, we can only estimate the magnitude

--

- The edge represents statistical association, which can exist due to one or more of the following reasons:
    1. **Randomness**: Just got lucky
    1. **Causal**: `\(X\)` really does cause `\(Y\)`
    1. **Reverse Causation**: `\(Y\)` actually causes `\(X\)`
    1. **Confounding**: `\(X\)` and `\(Y\)` share a common cause
    1. **Collider/Selection**: `\(X\)` and `\(Y\)` have a common effect that we are conditioning on

---
class: center, middle

# Your job is to figure out which of these reasons is most consistent with the data and rule out all other explanations

???
Of these reasons, which one is the most plausible? Which ones can you rule out? That's the point of causal inference. Statistical inference just asks whether you are using the right model and making the right assumptions to get an appropriate estimate of the association, but causal inference goes one step farther. If you have an association, can you rule out other possible explanations of its causes?

---
# Rules for drawing a DAG

.center[<img src="./assets/example_dag.jpg" width="200">]

.footnote[.red[*]Ch 16 by Maria Glymour in Methods Social Epidemiology]

- Time flows left to right (thus arrows always point right)

???
Ok. So let's go over the rules for drawing and reading a DAG. First, when you draw a causal DAG, assume time goes left to right. Thus arrows will always point right, because something from the future cannot cause something from the past.

**NEXT SLIDE** Any arrow implies your belief that one variable caused another. Conversely, no arrow implies your belief that it did **not** cause something. This allows us to encode expert or domain-specific knowledge. If you know something cannot cause something else, from previous research or from a biological limitation or whatever, then not drawing that arrow allows you to encode it.

**NEXT SLIDE** Finally, everything you care about is in that DAG. Even things that are not in your regression. If it is plausible that it is somewhere in your causal pathway, then it must be included in the DAG. Not including it implies you do not believe it exists in the causal pathway.

**NEXT SLIDE**

--

- An arrow implies our belief that something is *causal*. Conversely, the lack of an arrow implies our belief that things are *not* causal

--

- Everything we are concerned with exists in the DAG (all common causes must be in the DAG)

---
# Reading DAGs: Causal assumptions

.center[<img src="./assets/example_dag.jpg" width="200">]

.footnote[.red[*]Ch 16 by Maria Glymour in Methods Social Epidemiology]

- `\(X\)` and `\(U\)` are direct causes of `\(Y\)`

???
Ok. How do we read a causal DAG? There are two parts to reading a causal DAG. First, the causal part.

**NEXT SLIDE** In the DAG above, it is clear that both X and U independently are direct causes of Y. What is a direct cause of Z?

**NEXT SLIDE** Does Z have any indirect causes? Yes, both U and X can cause Z through Y (i.e., Y mediates the U -> Z relationship).

**NEXT SLIDE** X and U do not cause each other.

**NEXT SLIDE** And no two variables share a common cause. (But Y is a common effect of both U and X.) There are also a lot of statistical implications, and this is where DAGs are really powerful. So we are going to kind of skim this because d-separation is useful but beyond the scope of this class. Really all you need to know are these rules.

**NEXT SLIDE**

--

- `\(Y\)` is a direct cause of `\(Z\)`

--

- `\(X\)` and `\(U\)` are *indirect* causes of `\(Z\)` through `\(Y\)`

--

- `\(X\)` and `\(U\)` are not causes of each other

--

- No two variables share a common cause
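As an aside (not in the original slides), these causal readings can also be queried in code. A sketch using the `dagitty` R package, assuming it is installed; the graph string simply encodes the example DAG above:

```r
# Encode the example DAG (X -> Y <- U, Y -> Z) and query its causal structure.
# install.packages("dagitty")  # if needed
library(dagitty)

g <- dagitty("dag { X -> Y ; U -> Y ; Y -> Z }")

parents(g, "Y")     # direct causes of Y: U and X
parents(g, "Z")     # direct cause of Z: Y
ancestors(g, "Z")   # Z plus all of its direct and indirect causes: Y, U, X
```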
---
# Reading DAGs: Mini d-Separation

.center[<img src="./assets/example_dag.jpg" width="200">]

.footnote[.red[*]Ch 16 by Maria Glymour in Methods Social Epidemiology]

- An edge implies statistical dependence

???
Any edge implies statistical dependence.

**NEXT SLIDE** However, there is no statistical dependence if there is a collider. So statistical association cannot flow from X to U because Y is a collider.

**NEXT SLIDE** There is no statistical dependence if you condition on, or block, a variable.

**NEXT SLIDE** However, blocking a collider or a descendent of that collider opens up the path through that collider.

**NEXT SLIDE**

--

- No statistical dependence if there is a *collider*. That is, when two arrowheads point to the same variable (e.g., `\(U \rightarrow Y \leftarrow X\)`)

--

- No statistical dependence if we *condition* on or *block* a variable (denoted with a box around that variable)

--

- There is statistical dependence if we block a collider or a *descendent* of that collider

---
# Reading DAGs: Statistical implications

.footnote[.red[*]Ch 16 by Maria Glymour in Methods Social Epidemiology]

.center[<img src="./assets/example_dag.jpg" width="200">]

???
Ok. So which pairs of variables in our DAG are statistically dependent?

**NEXT SLIDE**

--

- `\(X\)` and `\(Y\)` are statistically dependent
- `\(U\)` and `\(Y\)` are statistically dependent
- `\(Y\)` and `\(Z\)` are statistically dependent

--

- `\(X\)` and `\(Z\)` are statistically dependent (i.e., `\(X \rightarrow Y \rightarrow Z\)`)
- `\(U\)` and `\(Z\)` are statistically dependent (i.e., `\(U \rightarrow Y \rightarrow Z\)`)

---
# Reading DAGs: Statistical implications

.footnote[.red[*]Ch 16 by Maria Glymour in Methods Social Epidemiology]

.center[<img src="./assets/example_dag.jpg" width="200">]

???
Is there anything in our DAG that is statistically independent?

**NEXT SLIDE**

--

- `\(X\)` and `\(U\)` are statistically independent (blocked by a collider `\(Y\)`)

--

- `\(X\)` and `\(U\)` are statistically dependent, conditional on `\(Y\)` (conditioning on a collider unblocks that path)

--

- `\(X\)` and `\(U\)` are statistically dependent, conditional on `\(Z\)` (conditioning on a collider's descendent unblocks that path)

--

- `\(X\)` and `\(Z\)` are statistically independent, conditional on `\(Y\)` (conditioning on `\(Y\)` blocks the `\(X \rightarrow Y \rightarrow Z\)` path)

--

- Similar to above, `\(U\)` and `\(Z\)` are statistically independent, conditional on `\(Y\)`.
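These implications can also be read off mechanically. A sketch with the `dagitty` package (assuming it is installed), using the same example DAG as above:

```r
# d-separation queries for the example DAG (X -> Y <- U, Y -> Z).
library(dagitty)

g <- dagitty("dag { X -> Y ; U -> Y ; Y -> Z }")

# All conditional independencies implied by the DAG, e.g., U _||_ X and X _||_ Z | Y
impliedConditionalIndependencies(g)

# TRUE means the DAG implies no association between the pair, given the conditioning set
dseparated(g, "X", "U", list())   # TRUE:  the collider Y blocks the path
dseparated(g, "X", "U", "Y")      # FALSE: conditioning on the collider opens it
dseparated(g, "X", "U", "Z")      # FALSE: conditioning on the collider's descendent opens it
dseparated(g, "X", "Z", "Y")      # TRUE:  conditioning on Y blocks X -> Y -> Z
```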
---
# Causal DAGs Assumptions

.footnote[.red[*]Ch 16 by Maria Glymour in Methods Social Epidemiology]

.center[<img src="./assets/example_dag.jpg" width="200">]

1. **Causal Markov Assumption.** Any variable `\(X\)` is independent of any other variable `\(Y\)` conditional on the direct causes of `\(X\)` (unless `\(Y\)` is an effect of `\(X\)`).

???
The CMA basically means that if I know Y, knowing U and X won't provide me with more information about Z. If you just follow the rules in the slide before, this follows.

**NEXT SLIDE** Faithfulness is easy... We're going to assume that if there are positive and negative causal effects, they never cancel out exactly, so a real causal effect will never show up as an association of exactly 0.

**NEXT SLIDE** Negligible randomness just means we are going to assume our sample size is of sufficient size that we aren't just getting lucky draws from our population.

**NEXT SLIDE**

--

1. **Faithfulness.** Positive and negative causal effects never *perfectly* offset.

--

1. **Negligible randomness.** Statistical associations (or lack of such associations) are never due to random chance (i.e., large sample size assumption).

---
class: center, middle

# Understanding Bias Graphically
## Structural underpinnings of bias

???
I'm going to go over the ways our models can be wrong using DAGs. It's really just sort of a high-level conceptual view of bias and causal inference. Then Zhe will talk about bias in a more mathematical way.

---
# The Structure of Bias

.center[<img src="./assets/partial_confounding.jpg" width="200">]

- Recall the five ways we can find statistical association:
    1. **Randomness**: Just got lucky
    1. **Causal**: `\(X\)` really does cause `\(Y\)`
    1. **Reverse Causation**: `\(Y\)` actually causes `\(X\)`
    1. **Confounding**: `\(X\)` and `\(Y\)` share a common cause
    1. **Collider/Selection**: `\(X\)` and `\(Y\)` have a common effect that we are conditioning on

???
So recall this slide from before. These are the only ways a statistical association can exist.

---
# The Structure of Bias

.center[<img src="./assets/partial_confounding.jpg" width="200">]

- Recall the five ways we can find statistical association:
    1. ~~**Randomness**: Just got lucky~~
    1. **Causal**: `\(X\)` really does cause `\(Y\)`
    1. **Reverse Causation**: `\(Y\)` actually causes `\(X\)`
    1. **Confounding**: `\(X\)` and `\(Y\)` share a common cause
    1. **Collider/Selection**: `\(X\)` and `\(Y\)` have a common effect that we are conditioning on

- Ignore (1) because of the negligible randomness assumption (in other words, assume we have a sufficient sample size).

???
Using the negligible randomness assumption, let's just forget about (1).

---
# Reverse Causation

- In observational data, we often do not have *temporal ordering*; that is, we don't know for sure our treatment occurred before our outcome.

???
In observational data, we often only have a cross-section of data and thus we don't know the temporal ordering for sure. We can't be positive that our treatment occurred before our outcome. This is also true for more abstract or ill-defined causal questions.

**NEXT SLIDE** Does income affect health? That is, do richer people live longer because they are richer? It's unclear... There's ambiguous temporal ordering.

**NEXT SLIDE** Maybe people who get sick a lot miss a lot of school, don't do as well, get a lower quality education, and are hired into lower-paying jobs?

**NEXT SLIDE** Now, remember I said arrows should always point left to right? That's still true. Reverse causation is a general term and we use right-to-left arrows as shorthand, but they are almost always just special cases of collider bias or confounding.

**NEXT SLIDE**

--

- For example, the causal question "Does income affect health?" has ambiguous temporal ordering.

--

- Perhaps people who are sick a lot missed a lot of school and thus have lower-paying jobs.

.center[<img src="./assets/reverse_single.jpg" width="200">]

--

- "Reverse causality" is a more general term for bias. It is usually a special case of *confounding* or *collider bias*.

---
# Reverse Causation: Example

- In reality, it is often a mix of both and causation is bidirectional.

.center[<img src="./assets/reverse_double.jpg" width="200">]

???
For these sorts of ill-defined or abstract questions, the directionality is often bidirectional and we are trying to decompose the effects.

**NEXT SLIDE** Another example question is the obesity paradox, which has a lot of problems. But the obesity paradox is that people with high BMI (obese) seem to have better outcomes after certain events than those with a normal BMI. For example, heart attacks. After a heart attack that requires surgical intervention, obese people have a lower mortality rate than normal-weight people.

**NEXT SLIDE** One proposed reason has been reverse causality: really sick people lose weight quickly and are at higher risk of death.

**NEXT SLIDE**

--

- Healthier people tend to have characteristics that also lead to higher income (relative to sicker people).
- Higher income also allows you to live a healthier lifestyle (e.g., gym membership, health insurance, quality food, etc.)

--

- **Obesity paradox**: People with high BMI seem to have better outcomes after heart attacks than those with normal BMI.

--

- One explanation has been reverse causality — people who are very ill tend to lose weight rather quickly (and are also at higher risk of death).
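Here is a toy sketch of the income/health reverse-causation story (not from the original slides; all numbers and variable names are hypothetical): health affects income, income has no effect on health, yet a naive regression still finds an "effect" of income on health.

```r
# Reverse causation sketch: health -> income only, no income -> health arrow.
set.seed(7)
n      <- 1e5
health <- rnorm(n)                                      # latent health (higher = healthier)
income <- 30000 + 5000 * health + rnorm(n, sd = 5000)   # sicker people earn less

coef(lm(health ~ income))["income"]   # positive "effect" of income on health,
                                      # even though no such arrow exists here
```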
---
# Confounding

.footnote[.red[*]Ch 16 by Maria Glymour in Methods Social Epidemiology]

- When a causal relation between two variables differs from their statistical association, the relationship is **confounded**.

???
There are other definitions that talk about backdoor paths and various conditions, but those are actually poorly defined and it is not that hard to find a pathological case where the conditions do not hold yet the relationship is confounded. Instead, we will use the general definition of confounding: a relationship is confounded when its causal relation differs from its statistical association.

**NEXT SLIDE** For example, the causal effect of X on Y does not exist here, and yet if you ran a regression and did not control for U, you would find a statistical association. In this case, we have total confounding.

**NEXT SLIDE** Similarly, in this DAG, there is a causal relationship, but it is not accurately estimated because the backdoor path from X through U to Y changes the statistical association. This is partial confounding.

**NEXT SLIDE**

--

- For example, here the relationship between `\(X\)` and `\(Y\)` does not exist causally, but there is a statistical association induced by *total* confounding via `\(U\)`.

.center[<img src="./assets/total_confounding.jpg" width="200">]

--

- Similarly, below, even when a causal relationship does exist, the statistical association is poorly estimated due to *partial* confounding through `\(U\)`.

.center[<img src="./assets/partial_confounding.jpg" width="200">]
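A toy sketch of total confounding (not from the original slides; variable names and effect sizes are hypothetical): U causes both X and Y, X has no effect on Y, and blocking U removes the spurious association.

```r
# Total confounding sketch: U -> X and U -> Y, but no X -> Y arrow.
set.seed(11)
n <- 1e5
u <- rnorm(n)            # common cause
x <- u + rnorm(n)        # U -> X
y <- u + rnorm(n)        # U -> Y

coef(lm(y ~ x))["x"]     # naive estimate is clearly non-zero (confounded)
coef(lm(y ~ x + u))["x"] # blocking the confounder U: approximately zero
```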
---
# Confounding

.footnote[.red[*]Ch 16 by Maria Glymour in Methods Social Epidemiology]

- Confounding is specific to variables of interest. Here, `\(U\)` confounds the `\(X \rightarrow Y\)` relationship, but does not confound the `\(Z \rightarrow Y\)` relationship.

.center[<img src="./assets/iit_iv.jpg" width="200">]

???
Note that confounding is specific to a pair of variables. Here, U confounds X to Y, but it does not confound Z to Y. So we need to specify how a relationship is confounded.

**NEXT SLIDE** Recall from our DAG rules that we could block U and it would allow us to properly estimate the effect of X on Y. Also note that this is true for any single confounder but can also be done for a set of confounders. If we have a set of variables and blocking all of them allows us to estimate the causal effect, we call the set a sufficient set. (Also note that often U will represent a sufficient set of variables and not just a single variable.)

**NEXT SLIDE** Back to the obesity paradox. Another reason some people think the obesity paradox exists is confounding. For example, smoking will lead to lower weight but also much worse cardiovascular outcomes.

**NEXT SLIDE**

--

- Blocking a confounder or a *sufficient set* of confounders allows for the proper estimation of the causal effect.

.center[<img src="./assets/blocked_confounder.jpg" width="200">]

--

- **Obesity paradox**: Smoking leads to lower weight (on average), but also worse health outcomes.

---
# Confounding: Example

.center[<img src="./assets/ice_cream_iq.png" width="800">]

???
This is a graph that came out last year in The Economist. On the x-axis is the PISA reading score for different countries, which tells you the literacy of that country. On the y-axis is the number of liters of ice cream per year per capita. Does this indicate we should eat more ice cream to be smarter? Probably not. It's probably confounded by the income of the country. Also note it came out on April Fools' Day, but many people took it seriously.

---
# Collider/Selection Bias

.footnote[.red[*]Ch 16 by Maria Glymour in Methods Social Epidemiology]

- Collider bias is "the change in association between two variables when conditioning on their common effect".

???
Remember that association doesn't flow through a collider unless you block it. Just as you need to block the appropriate variables, you can also block variables inappropriately and draw out a false association.

**NEXT SLIDE** The most common form is called selection bias, but you can also induce this type of bias by stratifying inappropriately, M-bias, or inappropriate covariate adjustment.

**NEXT SLIDE** Here is what it looks like graphically.

**NEXT SLIDE** Now, back to the obesity paradox. I'm trying to use the same paradox to show that many questions can suffer from multiple sources of bias. Maybe obese patients who live long enough to have a heart attack are different from obese patients who don't.

**NEXT SLIDE**

--

- Most common form is selection bias, but it can also occur due to stratification, M-bias, or improper covariate adjustment.

--

- When we condition on some common effect (or a descendent of that common effect), we open up the path that was previously blocked by the collider.

.center[<img src="./assets/selection_bias.jpg" width="200">]

--

- **Obesity paradox**: Is there selection on patients who live long enough to get a heart attack? Perhaps these patients are different from the general obese population.

---
# Collider/Selection Bias: Example

.center[<img src="./assets/nba_players.jpg" width="600">]

???
Let's pretend that in order to play in the NBA, you have to be either very fast or very tall. This is Isaiah Thomas. He's 5' 9". The average height in the NBA is 6' 7". What do we know about his athletic ability? We know he is fast, but we only know that because we are looking at NBA players. In the general population (that is, if we didn't condition on being in the NBA), there's no reason to believe there is a connection between height and speed.
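A toy version of the NBA story (not from the original slides; the thresholds and variable names are hypothetical): height and speed are independent overall, but selecting on their common effect, playing in the NBA, induces a negative association.

```r
# Collider/selection bias sketch: selecting on a common effect of height and speed.
set.seed(12)
n      <- 1e5
height <- rnorm(n)                          # standardized height
speed  <- rnorm(n)                          # standardized speed, independent of height
in_nba <- (height > 1.5) | (speed > 1.5)    # hypothetical selection: very tall OR very fast

cor(height, speed)                  # ~ 0 in the general population
cor(height[in_nba], speed[in_nba])  # negative among the "NBA players" (collider bias)
```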
---
class: center, middle

# Back to RCTs

???
Ok. We took a bit of a detour, but let's go back to RCTs and hopefully we can see why they are so powerful.

---
# Strength of RCTs

.center[<img src="./assets/partial_confounding_blocked.jpg" width="200">]

- The main strength of randomization is that you eliminate any confounding — both measured and unmeasured.

???
Randomization is the only way we can ensure exchangeability and no confounding. It basically eliminates the U to X pathway.

**NEXT SLIDE** Another way to think about it is that if you consider Z as your randomization (that is, your coin flip), then you can regress Y on Z, and the only effect Z has on Y is through X, so it represents your causal effect. This is called IV or ITT (intention-to-treat), and we'll talk about it more with Zhe.

**NEXT SLIDE** Lastly, a sufficiently powered RCT also reduces the chance that the association is just random luck.

**NEXT SLIDE** Note, however, collider bias is still possible even in an RCT.

--

.center[<img src="./assets/iit_iv.jpg" width="170">]

- Quick aside: Another way to think about this is that if we regress on the randomization (e.g., the coin flip), then we are estimating `\(Z \rightarrow X \rightarrow Y\)` for our causal effect, and this relationship is unconfounded. (This is known as IV or intention-to-treat. More in a bit.)

--

- RCTs also address reverse causality through proper temporal ordering. And lastly, a sufficient sample size eliminates random chance.

--

- **NOTE**: Does not eliminate collider bias.

---
class: center, middle

# Thanks!

???
Ok. Going to hand it over to Zhe, who will talk a bit more about the mathematical underpinnings of bias.

---
# Sources

- [Chapter 16 Methods in Social Epidemiology](http://publicifsv.sund.ku.dk/~nk/epiF14/Glymour_DAGs.pdf) by Maria Glymour
- [The Economist: Ice Cream and IQ](http://www.economist.com/blogs/graphicdetail/2016/04/daily-chart)
- [Miguel Hernan Causal Inference Book](https://www.hsph.harvard.edu/miguel-hernan/causal-inference-book/)
- [Smith and Pell — Parachutes and RCTs](http://www.bmj.com/content/327/7429/1459)
- [Facebook Emotion Study](http://www.pnas.org/content/111/24/8788.full)

---
class: center, middle

# Additional slides

---
# Example of M-Bias

.footnote[.red[*]Ch 16 by Maria Glymour in Methods Social Epidemiology]

.center[<img src="./assets/m_bias_example.jpg" width="650">]

- Controlling for mother's diabetes status induces a correlation between low education and diabetes.

---
# Example of Survivor Bias

.footnote[.red[*]Ch 16 by Maria Glymour in Methods Social Epidemiology]

.center[<img src="./assets/survivor_bias.jpg" width="600">]

---
# Other measures of causal effect

If exchangeability holds, we can express the causal effect in different ways:

1. **Absolute difference**: `\(ATE = Pr[Y=1|A=1] - Pr[Y=1|A=0]\)`
1. **Risk ratio**: `\(RR = \frac{Pr[Y=1|A=1]}{Pr[Y=1|A=0]}\)`
1. **Odds ratio**: `\(OR = \frac{Pr[Y=1|A=1] / Pr[Y=0|A=1]}{Pr[Y=1|A=0] / Pr[Y=0|A=0]}\)`
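As a quick worked example (not from the original slides; the risks are hypothetical and echo the coin-flip example, `\(Pr[Y=1|A=1] = 0.25\)` and `\(Pr[Y=1|A=0] = 0.50\)`):

```r
# Worked example of the three measures with hypothetical risks.
p1 <- 0.25   # Pr[Y=1|A=1]
p0 <- 0.50   # Pr[Y=1|A=0]

p1 - p0                            # absolute difference (risk difference): -0.25
p1 / p0                            # risk ratio: 0.5
(p1 / (1 - p1)) / (p0 / (1 - p0))  # odds ratio: 1/3
```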