(2014-Sep-23, 21:08:42) menghu1001 Wrote:

(2014-Sep-18, 15:26:09) Emil Wrote:

P1 = predictor 1, P2 = predictor 2, etc.

V1 = outcome var 1, V2 = outcome var 2, etc.

I understand the labels, but I quoted this sentence, "An interaction would be that a given predictor P1 is better at predicting variable V1 than P2, but that P2 is better at predicting V2 than P1", for another reason: I don't understand what it means. By saying V1 and then V2, you have two different regression models in mind.

Originally, the question was asked by Dalliard:

Quote: 3) "Are some predictors just generally better at predicting than others, or is there an interaction effect between predictor and variables?"

Not sure what you mean by interaction here. The question is whether any of the predictors have unique predictive power.

What you said concerns whether including interaction terms affects the (independent) relative strength of your independent variables within the same regression, not across two different ones.

There is another thing I don't understand. An interaction between predictors is meant to answer whether adding interaction terms such as P1*P2, with or without polynomial terms (P1+P2+P2^2+P2^3+P1*P2+P1*P2^2+P1*P2^3), changes the coefficients of your independent variables. If the slopes are not linear but curvilinear, adding an interaction term will fit the data better. In general, when an interaction is meaningful, the main effects (i.e., P1 and P2) are attenuated. Even if one of the two predictors is more attenuated than the other, I don't think that is relevant here. The interpretation of the main effects becomes totally different when you add interactions. With an interaction, P1 and P2 are the effects net of the interaction, but the interaction itself includes and confounds the effects of both.

I remember that several months ago I ran a regression with Wordsum as the dependent variable and race + SES as independent variables. With the race*SES interaction included, the coefficient of race was near zero. A plot of the predicted values from the model revealed that at very low SES levels the BW gap in Wordsum was negligible, but that it increased considerably as SES increased. In such a situation, how can we say that race has become less important?

You cannot say that SES is more important than race just because the interaction term nullifies the main effect of race, because the interaction term confounds the two effects. (When I say "more important", I am of course talking about the direct effects of the independent variables.)
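The point about main effects changing meaning once an interaction is added can be illustrated with a minimal R sketch. Everything in it is an assumption made for illustration: the data are simulated, and the names `x` (a continuous predictor coded so that 0 is its lowest level), `g` (a binary group indicator) and `y` are hypothetical. It is not the Wordsum analysis described above.

```r
# Hypothetical simulated data, NOT the Wordsum data discussed above.
set.seed(1)
n <- 200
x <- runif(n, 0, 10)             # continuous predictor, 0 = lowest level
g <- rep(c(0, 1), each = n / 2)  # binary group indicator
# By construction the group gap is 0 at x = 0 and grows by 1 per unit of x.
y <- 0.5 * x + 1.0 * g * x + rnorm(n, sd = 0.1)

fit.add <- lm(y ~ x + g)  # main effects only
fit.int <- lm(y ~ x * g)  # main effects plus the x:g interaction

coef(fit.add)["g"]  # roughly the average gap over the observed range of x
coef(fit.int)["g"]  # the gap at x = 0 only, so it is near zero here

# Model-implied group gap at x = 10 from the interaction model
gap.at.10 <- unname(coef(fit.int)["g"] + 10 * coef(fit.int)["x:g"])
gap.at.10
```

In this sketch the coefficient of `g` in the interaction model is close to zero even though the group gap is large at high values of `x`, because that coefficient only describes the gap where `x` is 0.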

Meng Hu,

Here's a very simple scenario:

Imagine we have three predictors and 5 outcome variables.

Suppose we obtain their prediction vectors, i.e. the correlations between each of the three predictors and each of the 5 outcome variables.

Code:

```
vec.a = c(.1, .2, .3, .4, .5) # vector a
vec.b = c(.15, .15, .4, .6, .8) # vector b
vec.c = c(0.26, -1.12, -0.33, -0.34, 0.06) # vector c
DF.vec = cbind(vec.a, vec.b, vec.c) # dataset
DF.vec.cor = cor(DF.vec) # correlation matrix
round(DF.vec.cor, 2)
```

Which gives:

Code:

```
      vec.a vec.b vec.c
vec.a  1.00  0.97  0.11
vec.b  0.97  1.00  0.33
vec.c  0.11  0.33  1.00
```

So we see that the r between a and b is very high: they behave in the same way across the outcome variables, although they may not be equally strong predictors. However, c is clearly very different and has low r's with the other two. This means there is a predictor x outcome variable interaction.
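Whether two predictors are equally strong is a separate question from whether their prediction vectors correlate. A minimal sketch, reusing the toy vectors a and b and taking the mean absolute value of each vector as an assumed index of overall strength (the name `strength` is just for illustration):

```r
vec.a = c(.1, .2, .3, .4, .5)   # same toy vector a as above
vec.b = c(.15, .15, .4, .6, .8) # same toy vector b as above

# Assumed strength index: mean absolute correlation across the 5 outcomes
strength = c(a = mean(abs(vec.a)), b = mean(abs(vec.b)))
round(strength, 2)  # a = 0.30, b = 0.42
```

So in this toy case b is the stronger predictor on average, even though a and b have nearly the same profile across outcome variables.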

I am only talking about correlations, not regression models. I am not talking about adding interaction variables (e.g. a*b) to regression models.

I am also not talking about predicting unique parts of the variance in multiple regression.

Apparently, the term has confused some readers. What term would you prefer I use? Perhaps I should just talk about testing the generality vs. specificity of the predictors' predictive power?

---

Also, Dalliard's point made in the review of the International S factor paper about the use of variables that have not been reversed holds here as well. If one reverses them so that predictors always predict something better with a positive value, the correlations will get smaller.

I have used the data as given by the sources and not biased them in any way. Reversing them arguably makes the results less interpretable, e.g. using Islam prevalence to predict low crime as opposed to high crime.

However, suppose one really wants to minimize correlations. If one reverses variables consistently so as to make as many correlations positive as possible, the new mean abs. correlations are .54 (Norwegian datasets) and .90 (Danish). The Danish dataset is much better since it has much less sampling error (all 25 vars have near N=70). So even arguably biasing the results against the hypothesis yields a strong positive outcome.