
[OQSPS] Explaining Terrorism Threat Level Across Western Countries

#1
Journal:
Open Quantitative Sociology and Political Science

Authors:
Noah Carl

Title:
Explaining Terrorism Threat Level Across Western Countries on the Day of the Brussels Terrorist Attacks

Abstract:
The Foreign and Commonwealth Office of the British government reports a terrorism threat level for every country. This paper analyses variation in terrorism threat level across Western countries on 22nd March 2016, the day of the airport and metro station bombings in Brussels. It finds that percentage of Muslims in the population and military intervention in the Middle East are independently associated with terrorism threat level. In other words, Western countries which have a higher percentage of Muslims, and which have intervened militarily in the Middle East, tend to exhibit higher terrorism threat levels in March 2016. Despite a small sample size, these results are fairly robust across different specifications.

Key words:
Terrorism; Western countries; percentage Muslim; military intervention; Brussels

Length:
11 pages; 2,824 words


Attached Files
.pdf   Terrorism Threat.pdf (Size: 172.23 KB / Downloads: 434)
.xlsx   Terrorism Threat.xlsx (Size: 60.83 KB / Downloads: 372)
#2
paper Wrote:The fact that many Islamist terrorist attacks have been perpetrated in Muslim countries (such as Turkey, Syria, Iraq, Tunisia and Egypt) constitutes rather strong evidence that at least some Islamist terrorists are motivated by grievances other than Western military intervention


A further option is that these countries, despite being inhabited mostly by Muslims, have been allied with Western governments in attacks or perceived attacks on Muslims. Turkey is a member of NATO and has taken part in aggression against Muslim states: it took part in the aggression in Libya and Afghanistan, as well as the current conflict in Syria. https://en.wikipedia.org/wiki/List_of_wa...ing_Turkey

paper Wrote:Terrorism threat level is treated as a linear scale, running from 1 (low) to 4 (high).


I think you mean an ordinal or perhaps interval scale. Multiple regression/correlation assumes an interval scale.

You may also want to model it using latent correlations, although I'm not sure how one does this for multiple regression. For the bivariate case, see http://john-uebersax.com/stat/tetra.htm.
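For instance, the bivariate latent correlation between the ordinal threat level and the continuous percentage Muslim could be estimated as a polyserial correlation. A minimal sketch in R, assuming the polycor package and hypothetical column names (threat, muslim_pct):

Code:
library(polycor)

# Hypothetical data frame with an ordinal threat level (1-4) and a
# continuous percentage-Muslim variable; column names are placeholders
d <- read.csv("terrorism_threat.csv")

# polyserial() takes the continuous variable first and the ordinal one second;
# it estimates the correlation between the assumed underlying normal variables
polyserial(d$muslim_pct, d$threat, std.err = TRUE)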

paper Wrote:One major caveat concerning this measure is that it was not possible to discern how the FCO actually puts it together. In particular, it was not possible to rule out that the measure is partly based on information such as percentage of Muslims in the population or military intervention in the Middle East. If it is partly based on such information, then the analyses in Section 3 are somewhat tautological. In an attempt to discern how the measure is in fact constructed, two emails were sent to the FCO (see Appendix A). However, in both cases, the reply received was wholly uninformative: each one simply provided a link to the FCO’s travel advice page, namely FCO (2016). The analyses in Section 3 are predicated on the assumption that terrorism threat level is based on information such as secret intelligence reports, rather than demographic or foreign policy statistics.


I am happy you mentioned this problem. It is unfortunate that they did not provide more useful answers.

Do other countries or agencies publish similar terrorism threat levels? If they do, it would be nice to replicate the analyses with their measures, and perhaps also use factor analysis if one could find a number of sources. I seem to recall that EUROPOL also publishes some similar data.

Finally, does the FCO still have their older terrorism threat estimates? It may be interesting to look at the threat level longitudinally. It is possible to estimate the proportion of Muslims in countries longitudinally as well (using country of origin information + a simple compositional model á la http://openpsych.net/ODP/2015/03/increas...1980-2014/).

paper Wrote:Three measures of military intervention in the Middle East were utilised: first, whether a country sustained any military deaths in the Iraq (Operation Iraqi Freedom) or Afghanistan (Operation Enduring Freedom) wars, as reported by iCasualities.org (2016a,b); second, whether a country sustained at least 50 military deaths in the Iraq or Afghanistan wars, as reported by iCasualities.org (2016a,b); and third, whether a country is part of the anti-ISIS military coalition, as reported by Wikipedia (2016). 21 countries (75%) in the sample sustained at least one military death in Iraq or Afghanistan; 8 (29%) sustained at least 50 military deaths in Iraq or Afghanistan; and 7 (25%) are part of the Anti-ISIS military coalition.


Is there a reason that these threshold variables are used instead of a (log transformed perhaps) death count? The number 50 seems arbitrary and may as well have been e.g. 25. One might also argue that one should use per capita death counts. The datafile only has the dummy coded variables, not the actual counts, unfortunately, so others cannot easily try a continuous approach.
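For illustration, a continuous specification might look like the sketch below (placeholder variable names, since the raw counts are not in the shared datafile):

Code:
# Continuous codings of military deaths; 'deaths' and 'pop' are hypothetical
# column names because only the dummy-coded variables are in the datafile
d$log_deaths        <- log1p(d$deaths)      # log(1 + deaths), handles zeros
d$deaths_per_capita <- d$deaths / d$pop     # per capita alternative

summary(lm(threat ~ log_deaths, data = d))
summary(lm(threat ~ deaths_per_capita, data = d))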

There is also the question of why the countries only include Iraq and Afghanistan and not, say, Syria or Libya. ISIS is based in Syria and Iraq, so there would seem to be prima facie reason to include these two given that ISIS is the most active Islamic terrorist organization currently operating (as far as I know).

paper Wrote:Three control variables were utilised: first, GDP per capita at PPP for 2014, taken from OECD (2016a); harmonised unemployment rate for 2014, taken from OECD (2016b); and post-tax post-transfer Gini coefficient (a measure of income inequality), taken from OECD (2016c). Because there was no recent year in which the Gini coefficient was available for all countries in the sample, the maximum value observed between 2009 and 2011 was utilised. In order to reduce skewness, the logarithmic transformation was applied to GDP per capita.


Why were these particular variables chosen and not others? The paper does not mention any reason one might want to control for these variables.

The betas are not shown for the control variables. Why is this?

paper Wrote:(p < 0.001)


I am not a big fan of p values. I would be very happy if you instead used confidence intervals for the reasons given in e.g. http://pss.sagepub.com/content/25/1/7.

However, if you really want to use p values, perhaps it would be a good idea to supplement them with confidence intervals in important cases. Since there are relatively few datapoints, it would perhaps be best to use bootstrapped CIs, because these do not involve parametric assumptions (are your variables normally distributed?).
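As a sketch of what I mean, a percentile bootstrap CI for the bivariate correlation could be computed like this in R (boot package; variable names are placeholders):

Code:
library(boot)

# Statistic: the correlation recomputed on each bootstrap resample of rows
cor_stat <- function(data, idx) cor(data$threat[idx], data$muslim_pct[idx])

set.seed(1)
b <- boot(d, cor_stat, R = 5000)   # 5000 resamples
boot.ci(b, type = "perc")          # percentile CI; no normality assumption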

paper Wrote:although 50+ military deaths in Iraq or Afghanistan is only significant at the 10% level


I would prefer that you drop any mention of 10% 'significance'. That alpha level is too high, in my opinion.

-

What happens if one includes more than one of the military intervention against Muslim states predictors? I imagine they show appreciable levels of collinearity, so that may not yield anything useful.
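One way to check would be to fit such a model and inspect the variance inflation factors, e.g. (a sketch assuming the car package and placeholder names for the intervention dummies):

Code:
library(car)

# All three intervention measures plus percentage Muslim in one model
fit <- lm(threat ~ muslim_pct + any_deaths + deaths_50plus + anti_isis, data = d)
vif(fit)   # VIFs much above ~5-10 would indicate problematic collinearity

# Raw intercorrelations among the intervention dummies
cor(d[, c("any_deaths", "deaths_50plus", "anti_isis")])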

paper Wrote:Note that percentage Muslim was Winsorized at its second largest value (namely 19%, for Israel), because its largest value (namely 99%, for Turkey) skewed the variable so substantially.


An alternative choice is log transformation. Was this also tried?
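For concreteness, the two choices side by side (a sketch with placeholder column names, winsorising at the second-largest observed value as described in the paper):

Code:
# Winsorise percentage Muslim at its second-largest value vs. log-transform it
second_max    <- sort(d$muslim_pct, decreasing = TRUE)[2]
d$muslim_wins <- pmin(d$muslim_pct, second_max)
d$muslim_log  <- log1p(d$muslim_pct)

cor(d$threat, d$muslim_wins)
cor(d$threat, d$muslim_log)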

-

The datafile is attached. Could you instead place it on OSF? This is a better way to keep files and has built in versioning.

The variables are mentioned in the first sheet, but there are no links to the sources. Presumably these are web sources, so it would be very helpful to include links to them.

-

Finally, how were the data analyzed? There is no code file attached so that others may review the analysis code.

-

There is some evidence that the countries with more Muslims also have Muslims that are on average more extreme. Muslims in Western countries are less extreme in their beliefs than Muslims in their home countries.

It is hard to find data about this. I analyzed Pew Research's Muslim dataset and found a clear general religious extremism factor that varied by country. Unfortunately, there are no Western countries in the dataset, so one cannot easily compare with the extremism of Muslims in e.g. Germany. This means that there is substantial restriction of range, which decreases the observed correlation. Still, there is a small positive correlation between mean extremism and the proportion of the population that is Muslim. See the attached plot.

Blogpost: http://emilkirkegaard.dk/en/?p=5485

If this correlation is real and causal, it is a confound for your models.

-

I will do a full analytic replication of the analyses in the paper at some later point to confirm all the results. From the looks of it, the analyses conducted are fairly simple, so this should not take long.


Attached Files Thumbnail(s)
#3
Many thanks for commenting on the manuscript. I attach a revised paper (with corrections highlighted), a revised pdf (without corrections highlighted), a datafile, and a Stata do file (in Word format).

Emil Wrote:A further option is that these countries, despite being inhabited mostly by Muslims, have been allied with Western governments in attacks or perceived attacks on Muslims. Turkey is a member of NATO and has taken part in aggression against Muslim states: it took part in the aggression in Libya and Afghanistan, as well as the current conflict in Syria. https://en.wikipedia.org/wiki/List_of_wa...ing_Turkey


I have changed the relevant sentence.

Emil Wrote:I think you mean an ordinal or perhaps interval scale. Multiple regression/correlation assumes an interval scale.


I have changed the relevant sentence.

Emil Wrote:I am happy you mentioned this problem. It is unfortunate that they did not provide more useful answers.

Do other countries or agencies publish similar terrorism threat levels? If they do, it would be nice to replicate the analyses with their measures, and perhaps also use factor analysis if one could find a number of sources. I seem to recall that EUROPOL also publishes some similar data.


I am not personally familiar with any other terrorism threat measure similar in nature to the one provided by the FCO.

Emil Wrote:Finally, does the FCO still have their older terrorism threat estimates? It may be interesting to look at the threat level longitudinally. It is possible to estimate the proportion of Muslims in countries longitudinally as well (using country of origin information + a simple compositional model á la http://openpsych.net/ODP/2015/03/increas...1980-2014/).


That would be a very interesting analysis to conduct. However, I do not believe it would be feasible to obtain the relevant data from the FCO in any reasonable time frame, especially given how unresponsive they were to my emails.

Emil Wrote:Is there a reason that these threshold variables are used instead of a (log transformed perhaps) death count? The number 50 seems arbitrary and may as well have been e.g. 25. One might also argue that one should use per capita death counts. The datafile only has the dummy coded variables, not the actual counts, unfortunately, so others cannot easily try a continuous approach.


I have utilised log of 1 + military deaths instead of at least 50 military deaths in all the multiple regression models.

Emil Wrote:There is also the question of why the countries only include Iraq and Afghanistan and not, say, Syria or Libya. ISIS is based in Syria and Iraq, so there would seem to be prima facie reason to include these two given that ISIS is the most active Islamic terrorist organization currently operating (as far as I know).


Including these interventions doesn't really make any difference––unless I'm mistaken––since the countries that took part are simply a subset of those that participated in Iraq & Afghanistan. In addition, very few military deaths have been sustained by Western forces in Syria and Libya, and there doesn't seem to be any standardised database (equivalent to iCasualties) for those conflicts.

Emil Wrote:Why were these particular variables chosen and not others? The paper does not mention any reason one might want to control for these variables.


I have included a paragraph explaining why each control variable was chosen.

Emil Wrote:The betas are not shown for the control variables. Why is this?


Given that the focus of the paper is the effects of percentage Muslim and military intervention, I would prefer not to unnecessarily clutter the regression tables with more coefficients. This practice is quite common in economics and sociology.

Emil Wrote:I am not a big fan of p values. I would be very happy if you instead used confidence intervals for the reasons given in e.g. http://pss.sagepub.com/content/25/1/7.

However, if you really want to use p values, perhaps it would be a good idea to supplement them with confidence intervals in important cases. Since there are relatively few datapoints, it would perhaps be best to use bootstrapped CIs, because these do not involve parametric assumptions (are your variables normally distributed?).


I have reported 95% confidence intervals for the raw estimates.

Emil Wrote:I would prefer that you drop any mention of 10% 'significance'. That alpha level is too high, in my opinion.


I have eliminated mentions of 10% significance.

Emil Wrote:What happens if one includes more than one of the military intervention against Muslim states predictors? I imagine they show appreciable levels of collinearity, so that may not yield anything useful.


Yes––unless I'm mistaken––the collinearity is close to perfect.

Emil Wrote:An alternative choice is log transformation. Was this also tried?


The paper now reports that the log transformation was also tried, and it yielded highly similar results.

Emil Wrote:The datafile is attached. Could you instead place it on OSF? This is a better way to keep files and has built in versioning.


I'm not sure how to do this. I haven't used OSF before.

Emil Wrote:The variables are mentioned in the first sheet, but there are no links to the sources. Presumably these are web sources, so it would be very helpful to include links to them.


I have provided links to the sources.

Emil Wrote:Finally, how were the data analyzed? There is no code file attached so that others may review the analysis code.


I have provided the Stata do file (in Word format) used to analyse the data.

Emil Wrote:There is some evidence that the countries with more Muslims also have Muslims that are on average more extreme. Muslims in Western countries are less extreme in their beliefs than Muslims in their home countries.

It is hard to find data about this. I analyzed Pew Research's Muslim dataset and found a clear general religious extremism factor that varied by country. Unfortunately, there are no Western countries in the dataset, so one cannot easily compare with the extremism of Muslims in e.g. Germany. This means that there is substantial restriction of range, which decreases the observed correlation. Still, there is a small positive correlation between mean extremism and the proportion of the population that is Muslim. See the attached plot.

Blogpost: http://emilkirkegaard.dk/en/?p=5485

If this correlation is real and causal, it is a confound for your models.


Interesting point. I would prefer not to deal with this issue in the present paper, especially given the fact that there were no Western countries in your sample. Perhaps it could be looked at in a future analysis.


Attached Files
.docx   Terrorism Threat 2.docx (Size: 113.04 KB / Downloads: 468)
.pdf   Terrorism Threat 2.pdf (Size: 177.26 KB / Downloads: 378)
.xlsx   Terrorism Threat 2.xlsx (Size: 63.59 KB / Downloads: 327)
.docx   Terrorism Threat do.docx (Size: 13.77 KB / Downloads: 349)
#4
Files have now been uploaded to OSF, as requested: https://osf.io/5tv3a/
#5
This post contains my analytic replication of Noah's analyses. Quotes are from the paper unless otherwise stated. My code is here: https://gist.github.com/Deleetdk/a5913f7...0adb9a1456

Quote:"The mean terrorism threat level in the sample is 2.4, while the median is 2.5. "

Replicated.

Quote:"This variable ranges from 0 (Czech Republic) to 8.2 (France), with a mean of 3.1, and a median of 2.6."

Replicated, except I get a median of 2.52. Typo perhaps.

Quote:"21 countries (75%) in the sample sustained at least one military death in Iraq or Afghanistan; the mean number of military deaths sustained in Iraq or Afghanistan is 294, while the median is 11; 7 countries (25%) are part of the anti-ISIS military coalition."

Replicated.

Quote:"The correlation between terrorism threat level and percentage Muslim is r = .64 (p < 0.001; 95% CI = [.33, .95])"

Correlation replicated, CI differed. I get:

Code:
> cor.test(d_main$terror[d_main$west], d_main$muslim15[d_main$west])

    Pearson's product-moment correlation

data:  d_main$terror[d_main$west] and d_main$muslim15[d_main$west]
t = 4.2844, df = 26, p-value = 0.0002219
alternative hypothesis: true correlation is not equal to 0
95 percent confidence interval:
0.3555625 0.8196607
sample estimates:
      cor
0.6433038


This is an analytic CI. The paper does not say which kind of CI is used, so I assumed it was an analytic one.
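For reference, the analytic CI here is based on the Fisher z transformation of the correlation; a minimal sketch that approximately reproduces the interval above:

Code:
# Analytic 95% CI for a correlation via the Fisher z transformation
r <- 0.6433; n <- 28
z  <- atanh(r)                 # Fisher transformation
se <- 1 / sqrt(n - 3)          # standard error on the z scale
tanh(z + c(-1, 1) * qnorm(0.975) * se)   # back-transform; roughly [0.36, 0.82]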

Quote:"When percentage Muslim squared was included in a model of terrorism threat level alongside percentage Muslim it was not significant (p > 0.1), indicating minimal non-linearity."

Replicated. R output:

Code:
Call:
lm(formula = "terror ~ poly(muslim15, 1)", data = d_main, subset = d_main$west)

Residuals:
     Min       1Q   Median       3Q      Max
-2.33148 -0.57178  0.03772  0.43508  1.85348

Coefficients:
                  Estimate Std. Error t value Pr(>|t|)    
(Intercept)          3.258      0.256  12.726 1.13e-12 ***
poly(muslim15, 1)   27.366      6.387   4.284 0.000222 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.8865 on 26 degrees of freedom
Multiple R-squared:  0.4138,    Adjusted R-squared:  0.3913
F-statistic: 18.36 on 1 and 26 DF,  p-value: 0.0002219

Call:
lm(formula = "terror ~ poly(muslim15, 2)", data = d_main, subset = d_main$west)

Residuals:
    Min      1Q  Median      3Q     Max
-2.2881 -0.3663 -0.1067  0.6465  1.6725

Coefficients:
                   Estimate Std. Error t value Pr(>|t|)
(Intercept)          -7.519      8.858  -0.849    0.404
poly(muslim15, 2)1 -315.886    282.091  -1.120    0.273
poly(muslim15, 2)2  -72.623     59.668  -1.217    0.235

Residual standard error: 0.8784 on 25 degrees of freedom
Multiple R-squared:  0.4466,    Adjusted R-squared:  0.4024
F-statistic: 10.09 on 2 and 25 DF,  p-value: 0.0006133


Quote:"The standardised difference in terrorism threat level by any military deaths in Iraq or Afghanistan is d = 0.67 (p > 0.05; 95% CI = [–0.20, 1.54])."

I get d = .69.

R output:

Code:
d estimate: -0.6888248 (medium)
95 percent confidence interval:
       inf        sup
-1.6418342  0.2641846


It was difficult to calculate a p value in R for the SMD. However, I think I managed to do it and got .13.
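One simple route is to note that the null hypothesis d = 0 is the same as that of a two-sample t-test on the raw outcome, so something like the following gives an equivalent p value (sketch; variable names as in my replication data, where any_1 is the any-deaths dummy):

Code:
# p value for the standardised mean difference via a pooled-variance t-test
t.test(terror ~ factor(any_1), data = d_west, var.equal = TRUE)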

Quote:"The correlation between terrorism threat level and log of 1 + military deaths is r = .40 (p = 0.037; 95% CI = [.03, .77])"

Correlation and lower CI replicated, upper CI did not. R output:

Code:
> cor.test(d_west$terror, d_west$deaths2)

    Pearson's product-moment correlation

data:  d_west$terror and d_west$deaths2
t = 2.1992, df = 26, p-value = 0.03696
alternative hypothesis: true correlation is not equal to 0
95 percent confidence interval:
0.02693856 0.67010334
sample estimates:
      cor
0.3960353


Quote:"When log of 1 + military deaths squared was included in a model of terrorism threat level alongside log of 1 + military deaths it was not significant (p > 0.1), indicating minimal non-linearity."

Replicated.

Quote:"The standardised difference in terrorism threat level by part of anti-ISIS military coalition is d = 1.34 (p < 0.001; 95% CI = [0.60, 2.08])."

I get 1.63.

R:

Code:
Cohen's d

d estimate: -1.632013 (large)
95 percent confidence interval:
       inf        sup
-2.6807130 -0.5833124


I did not replicate the p value because the function I used rounded the number down to 0 (not your fault).

Quote:"Table 1. Effects of Muslim percentage and military intervention in the Middle East on terrorism threat level among Western countries."

Replicated.

Quote:"Table 2. Effects of Muslim percentage and military intervention in the Middle East on terrorism threat level among all OECD countries."

Replicated.

Quote:"Table 3. Effects of Muslim percentage and military intervention in the Middle East on terrorism threat level among OECD countries located in Europe."

Replicated.

--

I'd like to resolve these slight discrepancies but note that the results generally replicated.
#6
I'd like to request that you add the main numerical datapoints of interest to the abstract: effect sizes and sample size. If there are too many effect sizes, I usually give ranges or the mean/median.

I'd like to request that you provide exact p values instead of inequalities. This allows for better estimation of the strength of the evidence from reading, and also allows for automatic checking with data mining tools. See e.g. https://peerj.com/preprints/1642/. When the numbers are very small, it is better to use scientific notation, e.g. 1.4 * 10^-5.

Noah Carl Wrote:I have changed the relevant sentence.


Come to think of it, it is better to use a proper method than to assume that a clearly non-normal, non-continuous variable is both, at least as a robustness check. I used ordered logistic regression (http://www.ats.ucla.edu/stat/r/dae/ologit.htm) and tried the 3 main models with the western subsample. Results were similar to your OLS regressions.
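For reference, a minimal sketch of this robustness check in R (MASS::polr; variable names as in my replication script, which may differ from yours):

Code:
library(MASS)

# Treat threat level as an ordered factor instead of a numeric scale
d_west$terror_ord <- factor(d_west$terror, ordered = TRUE)

fit_ord <- polr(terror_ord ~ muslim15 + deaths2 + part_1,
                data = d_west, Hess = TRUE)
summary(fit_ord)   # coefficients are on the log-odds scale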

Noah Carl Wrote:Given that the focus of the paper is the effects of percentage Muslim and military intervention, I would prefer not to unnecessarily clutter the regression tables with more coefficients. This practice is quite common in economics and sociology.


It is a dangerous argument to make that a practice is okay because it is common. I'd like you to add the full results to the supplementary materials (either in the appendix or as output files in the OSF repository).

Noah Carl Wrote:Yes––unless I'm mistaken––the collinearity is close to perfect.


I checked; the correlations are not that strong: r's of .33, .50, and .69.

In two places you use the fact that adding a second-order term did not result in p < alpha to argue that there is no non-linearity. This conclusion is too strong. There are many kinds of non-linearity, many of which are not detected by this crude method. In my experience, detecting non-linearity requires a somewhat large sample size (or very strong associations), larger than this study has. So, I think what you did is alright, provided you tone down the language.
#7
Thanks to Emil for undertaking a replication of my analyses. I have uploaded a new Stata do file to the OSF page for this paper: https://osf.io/5tv3a/

Emil Wrote:Replicated, except I get a median of 2.52. Typo perhaps.


This was indeed a typo.

Emil Wrote:Correlation replicated, CI differed... Correlation and lower CI replicated, upper CI did not.


I had not applied the Fisher transformation to the raw confidence intervals included in the Stata OLS output. Reported confidence intervals have been changed accordingly, and are the same as Emil's. (They were calculated using the cii2 command, as shown in the new Stata do file.)

Emil Wrote:I get d = .69... I get 1.63...


I calculated d-values by simply taking the regression coefficients from OLS models in which the dependent variable (namely, terrorism threat level) had been standardised. In other words, B1 in the following model was used as an estimate of d:

terrorism_threat_level_z-score = alpha + B1(any_deaths_Iraq_Afghanistan)

When computing Cohen's d directly (using the esize command in Stata), I get the same results as Emil. I presume the discrepancies arise from different methods of calculating the pooled standard deviation. I chose to use the OLS method, rather than calculating Cohen's d directly, because the results are more comparable with the conditional standardised differences obtained from the multivariate models (Tables 1-3). I will report the exact d-values in the paper, if preferred.
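In R terms, the difference between the two quantities is roughly the following (a sketch with placeholder names; my own calculations were done in Stata, as per the do file):

Code:
# d as the slope from an OLS model with a standardised outcome
# (this divides by the total-sample SD of the outcome)
coef(lm(scale(terror) ~ any_deaths, data = d))["any_deaths"]

# versus Cohen's d computed with the pooled within-group SD
library(effsize)
cohen.d(d$terror[d$any_deaths == 1], d$terror[d$any_deaths == 0])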

Emil Wrote:I'd like to request that you add the main numerical datapoints of interest to the abstract


I will do this once the discussion regarding d-values has been resolved.

Emil Wrote:I'd like to request that you provide exact p values instead of inequalities.


I will include exact p-values for raw estimates in the next version of the paper.

Emil Wrote:It is better to use a proper method than assuming that a clearly non-normal, non-continuous variable is both.


In my experience, logistic and probit regression (binary and ordered) nearly always produce highly similar point estimates (average effects) to OLS, so I prefer to use the latter, given its greater simplicity and ease of interpretation. But I can note in the paper that results were similar when using ordered logistic regression.

Emil Wrote:I'd like that you add the full results to the supplementary materials


I will include an Appendix or file of supplementary analyses with the next version of the paper.

Emil Wrote:I checked, the correlations are not that strong: r's .33, .50, .69.


My mistake. Would you suggest that I try utilising an additional variable, namely the number of major military interventions in the Middle East, ranging from 0 to 4?
#8
In reply to Noah's post.

Noah Carl Wrote:I calculated d-values by simply taking the regression coefficients from OLS models in which the dependent variable (namely, terrorism threat level) had been standardised. In other words, B1 in the following model was used as an estimate of d:

terrorism_threat_level_z-score = alpha + B1(any_deaths_Iraq_Afghanistan)

When computing Cohen's d directly (using the esize command in Stata), I get the same results as Emil. I presume the discrepancies arise from different methods of calculating the pooled standard deviation. I chose to use the OLS method, rather than calculating Cohen's d directly, because the results are more comparable with the conditional standardised differences obtained from the multivariate models (Tables 1-3). I will report the exact d-values in the paper, if preferred.


Yes, it is because your method uses the total-sample SD, whereas it is customary to use the pooled SD (i.e. the square root of the average of the within-group variances, weighted by group size). I replicated your number using the total-sample SD.

I'd like you to report the standard d value (based on the pooled SD). The whole point of using standardized mean differences is to get measures that are comparable across analyses and studies that do not use the same units. As far as I know, it is most common to use the pooled SD, and this is the default method unless otherwise specified in statistical programs/languages. Thus, to make your result most comparable with other studies, you should use the pooled SD.
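Concretely, the conventional pooled-SD version is (a sketch; the call at the bottom uses the variable names from my replication data):

Code:
# Cohen's d with the pooled within-group SD (the conventional definition)
cohen_d_pooled <- function(x, y) {
  nx <- length(x); ny <- length(y)
  sp <- sqrt(((nx - 1) * var(x) + (ny - 1) * var(y)) / (nx + ny - 2))
  (mean(x) - mean(y)) / sp
}

cohen_d_pooled(d_west$terror[d_west$any_1 == 1],
               d_west$terror[d_west$any_1 == 0])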

Noah Carl Wrote:In my experience, logistic and probit regression (binary and ordered) nearly always produce highly similar point estimates (average effects) to OLS, so I prefer to use the latter, given its greater simplicity and ease of interpretation. But I can note in the paper that results were similar when using ordered logistic regression.


Please include a paragraph/sentence mentioning that you tried using ordered logistic modeling (or whatever you deem appropriate) and that the results were similar. I agree with your experiences.

Noah Carl Wrote:My mistake. Would you suggest that I try utilising an additional variable, namely the number of major military interventions in the Middle East, ranging from 0 to 4?


I tried all the possible OLS models with your predictors. I always do this unless it is impossible because there are too many predictors (e.g. >15); the number of models to try is 2^p - 1, so with 7 predictors, it's only 127 models to try. This takes a few seconds even without using parallel computing.

The best model according to BIC was:

Muslim + any death + part of anti-ISIS + gdp + unemploy

which had adjusted R2 of .77. The best model with only one of the intervention predictors has adjusted R2 of .67, so there is a bit of evidence that using multiple intervention predictors is superior.
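A sketch of that exhaustive search (using the leaps package and the variable names from my replication script):

Code:
library(leaps)

# Exhaustive search over all subsets of the 7 predictors, scored by BIC
all_fits <- regsubsets(terror ~ muslim15 + any_1 + deaths2 + part_1 +
                         gdp_log + unemp14 + ineq0911,
                       data = d_west, nvmax = 7)
s <- summary(all_fits)
s$which[which.min(s$bic), ]   # predictor set of the best model by BIC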

In general however, this approach tends to overfit models by using too many predictors. One could use lasso regression with cross-validation to get more robust results. I did this for you. Cross-validation has a random component, so I ran it 500 times and summarized the results. The results indicated that all the predictors were useful predictors.

Code:
                muslim15 any_1 deaths2 part_1 gdp_log unemp14 ineq0911
mean               0.431 0.266   0.107  0.507   0.105   0.228    0.072
median             0.432 0.262   0.108  0.508   0.101   0.225    0.072
sd                 0.010 0.072   0.004  0.008   0.041   0.043    0.005
mad                0.004 0.057   0.002  0.003   0.036   0.034    0.002
fraction_zeroNA    0.000 0.014   0.000  0.000   0.046   0.000    0.000


The last row is the proportion of runs that produced a zero coefficient for that predictor, i.e. found it to be useless. As can be seen, the worst predictor was found to be useless only 4.6% of the time. Based on this, I'd tentatively (because of the sample size) conclude that it is best to use all the predictors together.
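A sketch of the lasso procedure (glmnet; I standardise the predictors so the coefficients are comparable, and repeat the cross-validation because of its random component):

Code:
library(glmnet)

X <- scale(as.matrix(d_west[, c("muslim15", "any_1", "deaths2", "part_1",
                                "gdp_log", "unemp14", "ineq0911")]))
y <- d_west$terror

# Repeat cross-validated lasso 500 times; keep the coefficients at lambda.min
coefs <- replicate(500, {
  cv <- cv.glmnet(X, y, alpha = 1)                 # alpha = 1 -> lasso
  drop(as.matrix(coef(cv, s = "lambda.min")))[-1]  # drop the intercept
})

rowMeans(coefs)        # mean coefficient for each predictor
rowMeans(coefs == 0)   # fraction of runs where the predictor is dropped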
#9
The latest versions of the pdf and the do file have been uploaded to the OSF page (the word version of the paper has been deleted, to avoid redundancy): https://osf.io/5tv3a/

Emil Wrote:I'd like you to report the standard d value (based on the pooled SD).


I have done so.

Emil Wrote:Please include a paragraph/sentence mentioning that you tried using ordered logistic modeling (or whatever you deem appropriate) and that the results were similar.


I have done so.

Emil Wrote:I tried all the possible OLS models with your predictors. I always do this unless it is impossible because there are too many predictors (e.g. >15); the number of models to try is 2^p - 1, so with 7 predictors, it's only 127 models to try. This takes a few seconds even without using parallel computing.

The best model according to BIC was:

Muslim + any death + part of anti-ISIS + gdp + unemploy

which had adjusted R2 of .77. The best model with only one of the intervention predictors has adjusted R2 of .67, so there is a bit of evidence that using multiple intervention predictors is superior.

In general however, this approach tends to overfit models by using too many predictors. One could use lasso regression with cross-validation to get more robust results. I did this for you. Cross-validation has a random component, so I ran it 500 times and summarized the results. The results indicated that all the predictors were useful predictors.


Since the purpose of my paper is simply to show that percentage Muslim and military intervention in the Middle East are independently associated with terrorism threat level, I would prefer not to include additional discussion pertaining to model fit.
#10
I note that you added effect sizes to the abstract. However, you left out the confidence intervals, which are necessary for interpretation.

The sample size is also not noted. I would add this e.g. thusly:

"This paper analyses variation in terrorism threat level across Western countries (N=28) on 22nd March 2016,"

I would add the standardized betas. E.g.

"It finds that percentage of Muslims in the population and military intervention in the Middle East are independently associated with terrorism threat level (standardized betas .42 to .56 and .47 to 1.00 for percentage Muslim and military intervention, respectively)."

Otherwise, I have no further comments, and I will approve this paper once the matter with the abstract has been settled.