# AnalyticBridge

# Predictive modeling: how to identify cause, consequence, feedback loops or external factors?

Let's say that you have a predictive model such as Y = f(X), where both X and Y can be multivariate. How do you determine whether X causes Y, whether Y causes X, or whether some components of X cause some components of Y and the other way around? There may even be no direct association at all, as in the model Income = f(Quality of your Health): the older you are, the worse your health but the higher your income, so "age" is actually the hidden variable explaining the relation.

How, and how well, do structural equation modeling and other techniques answer these questions?
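The hidden-variable example in the question can be illustrated with a small simulation. This is only a sketch under assumed, made-up numbers: the coefficients, noise levels and sample size are not from any real data set, and health is constructed to have no direct effect on income. It shows that the raw correlation between health and income is strong, but nearly vanishes once the linear effect of age is removed from both (a partial correlation).

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5000

# Assumed toy data-generating process: age drives both variables,
# and health has NO direct effect on income.
age = rng.uniform(20, 65, n)
health = 100 - 0.8 * age + rng.normal(0, 5, n)  # health declines with age
income = 20 + 1.2 * age + rng.normal(0, 5, n)   # income rises with age


def residuals(y, x):
    """Residuals of y after a least-squares fit on x (with intercept)."""
    X = np.column_stack([np.ones_like(x), x])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return y - X @ beta


# Raw association between health and income.
raw_corr = np.corrcoef(health, income)[0, 1]

# Partial correlation: correlate what is left after removing age from both.
partial_corr = np.corrcoef(residuals(health, age), residuals(income, age))[0, 1]

print(f"raw correlation:     {raw_corr:.3f}")
print(f"partial correlation: {partial_corr:.3f}")
```

Note that this only works because age was measured and included. If the hidden variable is not in the data, no amount of regression on health and income alone will reveal it.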

### Replies to This Discussion

Statistics can’t determine causality. Ever. Only association. Causality isn’t a statistical phenomenon.

For years the government couldn’t say that smoking causes cancer in humans. There was lots of evidence of association, but you can only determine causality with an experiment, i.e. by assigning one group to smoke and another group not to smoke. Obviously there are ethical problems with this type of experiment, which is why it wasn’t done.

“Statistical causality” is a different animal. First, it requires you to alter the traditional definition of causality; the altered versions go by names such as statistical causality, conditional causality, Granger causality, etc. Granger causality works with two variables, and problems occur if a third is involved. If C causes both A and B, A can Granger-cause B even though changes in A wouldn’t change B. So despite their names, these notions don’t imply true causality. There are tests of statistical causality with multiple variables in the VAR (vector autoregression) model. Granger won the Nobel Prize in Economics in part for his work on the subject. The real problem is that you have to know the true structure of relationships in advance, from God, in order for any of these things to really work. Which is why they don’t.
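The common-cause problem described above can be reproduced in a few lines. This is a minimal sketch, not a full Granger test: the series, lags and noise levels are invented for illustration, and the helper `granger_f` is a hypothetical name that computes only the F statistic for adding one lag of the candidate cause to a one-lag autoregression of the effect. Here C drives A with a lag of 1 and B with a lag of 2, and A never enters the equation for B.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 2000

# Assumed toy process: C is the only real cause.
c = rng.normal(size=n)
a = np.zeros(n)
b = np.zeros(n)
a[1:] = c[:-1] + 0.5 * rng.normal(size=n - 1)  # A_t = C_{t-1} + noise
b[2:] = c[:-2] + 0.5 * rng.normal(size=n - 2)  # B_t = C_{t-2} + noise (no A term)


def rss(y, X):
    """Residual sum of squares of a least-squares fit of y on X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    r = y - X @ beta
    return float(r @ r)


def granger_f(cause, effect):
    """F statistic for adding one lag of `cause` to an AR(1) model of `effect`."""
    y = effect[3:]        # skip the first samples, which were initialised to zero
    own = effect[2:-1]    # effect at t-1
    other = cause[2:-1]   # candidate cause at t-1
    ones = np.ones_like(y)
    rss_r = rss(y, np.column_stack([ones, own]))          # restricted model
    rss_u = rss(y, np.column_stack([ones, own, other]))   # unrestricted model
    dof = y.size - 3
    return (rss_r - rss_u) / (rss_u / dof)


f_ab = granger_f(a, b)  # does A "Granger-cause" B?
f_ba = granger_f(b, a)  # does B "Granger-cause" A?
print(f"F(A -> B) = {f_ab:.1f}")
print(f"F(B -> A) = {f_ba:.1f}")
```

The statistic for A -> B comes out very large simply because A carries an earlier, noisy copy of the information in C, even though intervening on A in this simulation would leave B untouched.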
Sun,

A lot of work is being done nowadays under the Six Sigma umbrella to determine cause and effect, and many different techniques are available. Take a look at Root Cause Analysis, and Why-Because....

-Ralph Winters

What about time series analysis, autoregression, etc.?

Simpler: I have had a lot of cases with quite big changes in two or more variables. In such cases you can analyse not only the relationship between the variables themselves, but also, let's say, their "first derivatives". In this way you can see which variables are really connected as cause and effect.
For your example above: if you see a big jump in income, you can easily exclude age as its cause, since age doesn't change stepwise. The same may hold for health...
So, first look at the data carefully... :-)

Andrzej
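The "first derivative" idea from the reply above can be sketched as follows. Everything here is an assumption made for illustration: the monthly series are simulated, `job_level` is a hypothetical step variable standing in for whatever actually caused the jump, and the helper `corr` returns 0 for a series with no variation, since a variable that changes at a constant rate cannot line up with a jump.

```python
import numpy as np

rng = np.random.default_rng(2)
months = np.arange(120)

# Assumed toy data: age grows smoothly, job_level jumps once at month 60.
age = 30 + months / 12.0
job_level = (months >= 60).astype(float)
income = 40 + 2.0 * (age - 30) + 15.0 * job_level + rng.normal(0, 0.5, months.size)


def corr(x, y):
    """Pearson correlation, defined as 0.0 when either series is constant."""
    if np.std(x) < 1e-9 or np.std(y) < 1e-9:
        return 0.0
    return float(np.corrcoef(x, y)[0, 1])


# In levels, income looks strongly related to both candidates.
level_age = corr(income, age)
level_job = corr(income, job_level)

# In first differences, only the variable that jumped with income survives.
diff_age = corr(np.diff(income), np.diff(age))
diff_job = corr(np.diff(income), np.diff(job_level))

print(f"levels:      age {level_age:.2f}, job_level {level_job:.2f}")
print(f"differences: age {diff_age:.2f}, job_level {diff_job:.2f}")
```

Differencing removes the shared trend, so age drops out as an explanation for the jump. It does not prove that the surviving variable is the cause, only that it remains a candidate.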