
[ODP] An update on the narrowing of the black-white gap in the Wordsum

(2014-Oct-15, 18:55:13)Chuck Wrote: To be clearer: Either (a) explain why I'm wrong about this or (b) briefly acknowledge the issue in your paper.

Not sure what you mean. Earlier, you said "It's not clear which better represents the true cohort effect, which would best be indexed by looking at same age persons at widely different times." But if I control for age, that means the d*cohort interaction is expressed in terms of constant age. When you control for age, it is held fixed for all the other independent variables. So, when using age as a control variable, I am looking "at same age persons at widely different times", no?

Besides, I also wanted to say that I have fixed a few words (whites, wordsum) that had not been capitalized. More importantly, I will redo all my regressions (OLS and tobit). A few moments ago I reread a draft blog post I wrote in Word several months ago. It says that when we use age in a regression (and even in a correlation), we should always add the squared (age^2) and cubed (age^3) terms alongside the main effect of age, because in many instances age is not linearly related to the outcome. Of course, in my analysis age is not the key variable, but controlling for it with at least the squared and cubed terms is more effective than omitting them. Still, it is annoying that I knew about this nonlinearity problem with age but completely forgot to deal with it. I look like a fool; this is pathetic.

I attach a picture of the nonlinearity of age with Wordsum, just to show you. The second graph depicts the curves for blacks and whites. There is a positive BW gap interaction with age because older blacks lose a lot of "IQ" points whereas middle-aged and older whites stay flat.

The syntax was:

* Polynomial terms in age and their interactions with race
gen age2 = age^2
gen age3 = age^3
gen bw1age  = bw1*age
gen bw1age2 = bw1*age2
gen bw1age3 = bw1*age3

* Tobit with censoring limits at the observed minimum and maximum of wordsum
tobit wordsum age age2 age3 [pweight = weight], ll ul
predict preda, xb
scatter preda age

tobit wordsum age age2 age3 bw1 bw1age bw1age2 bw1age3 [pweight = weight], ll ul
predict predb, xb
graph twoway (scatter predb age if bw1==0, msymbol(Oh)) (scatter predb age if bw1==1, msymbol(O)), legend(label(1 black) label(2 white))
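The thread's models are in Stata, but the point about polynomial age terms is language-neutral. As a hedged illustration (toy data and function names of my own, not the paper's), here is a plain-Python sketch that fits a cubic by solving the normal equations, the same kind of age + age^2 + age^3 specification discussed above (ordinary least squares only, no tobit censoring):

```python
# Hypothetical illustration (not the thread's Stata code): fit a cubic in "age"
# by solving the normal equations (X'X)b = X'y with plain Python.
def fit_poly(xs, ys, degree):
    k = degree + 1
    X = [[x ** j for j in range(k)] for x in xs]          # design matrix
    A = [[sum(X[i][r] * X[i][c] for i in range(len(xs))) for c in range(k)]
         for r in range(k)]                               # X'X
    b = [sum(X[i][r] * ys[i] for i in range(len(xs))) for r in range(k)]  # X'y
    # Gaussian elimination with partial pivoting
    for col in range(k):
        piv = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[piv] = A[piv], A[col]
        b[col], b[piv] = b[piv], b[col]
        for r in range(col + 1, k):
            f = A[r][col] / A[col][col]
            for c in range(col, k):
                A[r][c] -= f * A[col][c]
            b[r] -= f * b[col]
    coef = [0.0] * k
    for r in range(k - 1, -1, -1):
        coef[r] = (b[r] - sum(A[r][c] * coef[c] for c in range(r + 1, k))) / A[r][r]
    return coef

# Data generated from a known cubic; the fit recovers its coefficients.
xs = list(range(8))
ys = [1 + 2 * x - 0.5 * x ** 2 + 0.1 * x ** 3 for x in xs]
coef = fit_poly(xs, ys, 3)   # approximately [1.0, 2.0, -0.5, 0.1]
```

If the true age curve really is nonlinear, a linear-only age control leaves the curvature in the residuals, which is exactly the problem the post describes.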

I will update the whole thing when it's done. If anyone has other things they would like me to include, just say so.

Quote: I attach the picture of nonlinearity of age with wordsum, just to show you. The second graph depicts the curve for blacks and whites. There is a positive bw gap interaction with age because older blacks lose a lot of "IQ" points whereas old and middle-aged whites are stagnating.

This could potentially be a real effect, not some quirk. There is evidence that higher g protects against diseases of old age such as Alzheimer's. Indeed, one of the replicated g SNPs is one of the Alzheimer's SNPs too.

Or maybe it is something else.
(2014-Oct-15, 21:18:47)Emil Wrote: There is evidence that higher g protects against diseases of old age such as Alzheimer's. Indeed, one of the replicated g SNPs is one of the Alzheimer's SNPs too.

I agree it's a potential explanation. That said, for blacks who lose IQ points, an environmental hypothesis can say that it's due to the cumulative effect of pervasive racism in the job market, which keeps black wages low, causes them to live poorly, and so on. They can come up with so many plausible theories that I would prefer not to take the risk of assuming that higher IQ protects against disease.
(2014-Oct-15, 20:20:15)Meng Hu Wrote: So, when using age as control var, I'm looking "at same age persons at widely different times", no ?

I was conflating two issues. Imagine the following scenario: Hypothetical Wordsum data (with the same age distribution that you have):

year1 0.7699
year2 0.6358
year3 0.6787
year4 0.7120

year5 0.7699
year6 0.6358
year7 0.6787
year8 0.7120

year9 0.7699
year10 0.6358
year11 0.6787
year12 0.7120

MH's data

year13 0.7699
year14 0.6358
year15 0.6787
year16 0.7120

Based on your year 13 to 16 data you couldn't possibly infer the year 1 to 4 data. So, no, you are not looking at same-age persons at widely different times. (What I mean is that the results don't imply that the gap in, e.g., 1925 or 1950 was much larger than the gap in 1975 or 2000.) If you do the analysis correctly, though, you can estimate the independent effect of cohort for years 13 to 16. But you have to use a viable APC (age, period, cohort) model, (arguably) such as those discussed in the attached papers. See, for example: http://yangclaireyang.web.unc.edu/age-pe...chapter-7/
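The underlying identification problem can be stated in one line: cohort = period - age, an exact identity, so the three effects cannot all be estimated freely in one linear model. A minimal Python sketch (invented numbers, mine) makes the linear dependence explicit:

```python
# The APC identification problem in one identity: for every respondent,
# birth cohort = survey year (period) - age, exactly.
ages    = [25, 40, 55, 30, 62]            # invented respondents
periods = [1975, 1990, 2000, 1985, 1994]  # invented survey years
cohorts = [p - a for p, a in zip(periods, ages)]

# A fixed linear combination of the three columns is zero in every row, so a
# design matrix with age, period, cohort and a constant is rank-deficient:
# no regression can separate the three effects without extra constraints.
dependence = [a + c - p for a, c, p in zip(ages, cohorts, periods)]
```

This is why APC models need some identifying restriction, e.g. the grouping and random-effects structure used in the attached papers.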

Attached Files
.pdf   A MIXED MODELS APPROACH TO.pdf (Size: 116.49 KB / Downloads: 558)
.pdf   Variance Function Regression in Hierarchical Age-Period-Cohort Models: Applications to the Study of Self-Reported Health.pdf (Size: 1.14 MB / Downloads: 122)
There are two new things in my latest version (still in progress).

First, instead of using the age variable, I use a series of age dummies, from the recode (18/26=1) (27/35=2) (36/44=3) (45/55=4) (56/69=5). I think this is better than using age+age^2+age^3, because with dummies the reference category has a direct impact on the intercept. For example, if age dummy 3 is the reference category, the intercept reflects this, and should be interpreted as the Wordsum score for race=0, cohortdummy=1, agedummy=3. As the mean age of the entire sample is 41.48, I think it is more appropriate to use dummies and specify agedummy3 as the reference category. I did just that, but as expected, the result is unchanged compared to what you have in version 2.
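A quick way to verify the intercept claim: in a model containing only the age dummies (no other regressors), the OLS intercept equals the mean of the reference category exactly, and each dummy coefficient is that category's mean minus the reference mean. A toy pure-Python check (invented scores, not GSS data; the equality is exact only because the dummies saturate the factor, with other covariates the intercept becomes an adjusted mean):

```python
# Toy check (invented scores, not GSS data): with reference-coded dummies and
# no other regressors, the OLS intercept is the reference category's mean.
scores = {1: [4, 5, 6], 2: [5, 6, 7], 3: [6, 6, 6], 4: [5, 5, 7], 5: [4, 4, 6]}

def mean(xs):
    return sum(xs) / len(xs)

ref = 3                                  # reference category (agedummy3)
intercept = mean(scores[ref])            # the reference-group mean
coefs = {g: mean(ys) - intercept for g, ys in scores.items() if g != ref}

# Fitted value for each category = intercept + its dummy coefficient
# = that category's own mean (exact, because the dummies saturate the factor).
for g, ys in scores.items():
    assert abs(intercept + coefs.get(g, 0.0) - mean(ys)) < 1e-12
```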

Second, as there is no recent report of the reliability of the Wordsum, I have attempted to estimate it using Cronbach's alpha in Stata. The reliability is 0.71 for the total sample (restricted to age<70), 0.63 for blacks, and 0.70 for whites. These are comparable with the numbers reported by H&H (0.63 and 0.71). For dichotomous items, however, it is better to use the Kuder-Richardson (KR-20) method. The numbers are essentially identical; see below:

keep if age<70

replace wordsum = . if wordsum<0
replace wordsum = . if wordsum>10

* For each item: negative codes to missing, code 9 scored as 0
foreach v in worda wordb wordc wordd worde wordf wordg wordh wordi wordj {
    replace `v' = . if `v'<0
    replace `v' = 0 if `v'==9
}

ssc install kr20

kr20 worda wordb wordc wordd worde wordf wordg wordh wordi wordj if !missing(wordsum) & bw1==0

Kuder-Richardson coefficient of reliability (KR-20)

Number of items in the scale = 10
Number of complete observations = 3550

                     Item       Item     Item-rest
  Item   |  Obs   difficulty  variance  correlation
worda   | 3550     0.7854     0.1686     0.2691
wordb   | 3550     0.8293     0.1416     0.4023
wordc   | 3550     0.1285     0.1120     0.2212
wordd   | 3550     0.8408     0.1338     0.3905
worde   | 3550     0.6882     0.2146     0.4460
wordf   | 3550     0.5645     0.2458     0.4025
wordg   | 3550     0.1741     0.1438     0.1130
wordh   | 3550     0.1454     0.1242     0.2309
wordi   | 3550     0.6530     0.2266     0.3028
wordj   | 3550     0.1121     0.0995     0.2410
   Test  |          0.4921                0.3019

KR20 coefficient is 0.6366

kr20 worda wordb wordc wordd worde wordf wordg wordh wordi wordj if !missing(wordsum) & bw1==1

Kuder-Richardson coefficient of reliability (KR-20)

Number of items in the scale = 10
Number of complete observations = 18606

                     Item       Item     Item-rest
  Item   |  Obs   difficulty  variance  correlation
worda   | 18606    0.8369     0.1365     0.2627
wordb   | 18606    0.9459     0.0512     0.3144
wordc   | 18606    0.2434     0.1842     0.3696
wordd   | 18606    0.9561     0.0419     0.2853
worde   | 18606    0.7681     0.1781     0.4391
wordf   | 18606    0.8296     0.1413     0.4290
wordg   | 18606    0.3574     0.2297     0.3974
wordh   | 18606    0.3225     0.2185     0.4274
wordi   | 18606    0.7986     0.1609     0.2707
wordj   | 18606    0.2672     0.1958     0.4444
   Test  |          0.6326                0.3640

KR20 coefficient is 0.7015
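For reference, the KR-20 formula is short: r = k/(k-1) * (1 - sum(p_i*q_i)/var(X)), where p_i is the pass rate of item i, q_i = 1 - p_i, and var(X) is the variance of total scores. A pure-Python sketch on invented 0/1 data (not the GSS items) also shows why KR-20 and Cronbach's alpha agree for dichotomous items:

```python
# KR-20 computed by hand on invented 0/1 data (not the GSS Wordsum items).
def kr20(items):
    """KR-20 reliability for a list of respondents' 0/1 item-response rows."""
    n = len(items)                 # respondents
    k = len(items[0])              # items
    p = [sum(row[j] for row in items) / n for j in range(k)]  # item pass rates
    totals = [sum(row) for row in items]
    mean_t = sum(totals) / n
    var_t = sum((t - mean_t) ** 2 for t in totals) / n        # population variance
    return k / (k - 1) * (1 - sum(pi * (1 - pi) for pi in p) / var_t)

data = [          # 6 invented respondents x 4 items
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 1, 1],
    [0, 0, 0, 0],
    [1, 1, 1, 1],
    [0, 1, 0, 0],
]
print(round(kr20(data), 4))   # 0.6564
```

For 0/1 items, p_i*(1 - p_i) is exactly the (population) item variance, so KR-20 is term-for-term Cronbach's alpha; that is why the two sets of numbers in the post match.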

The corresponding syntax in SPSS is:

RECODE wordsum (0 thru 10=COPY) (ELSE=SYSMIS) INTO GSSwordsum.
RECODE worda (0=0) (1=1) (9=0) (ELSE=SYSMIS) INTO word_a.
RECODE wordb (0=0) (1=1) (9=0) (ELSE=SYSMIS) INTO word_b.
RECODE wordc (0=0) (1=1) (9=0) (ELSE=SYSMIS) INTO word_c.
RECODE wordd (0=0) (1=1) (9=0) (ELSE=SYSMIS) INTO word_d.
RECODE worde (0=0) (1=1) (9=0) (ELSE=SYSMIS) INTO word_e.
RECODE wordf (0=0) (1=1) (9=0) (ELSE=SYSMIS) INTO word_f.
RECODE wordg (0=0) (1=1) (9=0) (ELSE=SYSMIS) INTO word_g.
RECODE wordh (0=0) (1=1) (9=0) (ELSE=SYSMIS) INTO word_h.
RECODE wordi (0=0) (1=1) (9=0) (ELSE=SYSMIS) INTO word_i.
RECODE wordj (0=0) (1=1) (9=0) (ELSE=SYSMIS) INTO word_j.

SELECT IF age<70.

RELIABILITY
  /VARIABLES=word_a word_b word_c word_d word_e word_f word_g word_h word_i word_j
  /MODEL=ALPHA.

Using the data set here, you get similar results (up to rounding). The above syntax is for Cronbach's alpha, not KR-20. I don't know how to compute KR-20 in SPSS, but since the two sets of results are essentially identical, it's not really a problem.

Now, the problem is what Chuck has said: neither H&H's method nor mine can really answer the question of whether the BW gap has diminished (due to the possible confounding of the race gap with age effects over time). The method I should have employed is probably a hierarchical (mixed-effects) regression model. I'm not familiar with the technique, so I need more time before I submit the new version.

Of course, I could instead admit that my method cannot disentangle age, period, and cohort effects, and Chuck said it would be OK this way, but I think it's better to try HLM. I have to say, however, that if I employ HLM in my additional analyses, the HLM is not a tobit model and thus does not correct for the censored distribution. I don't think this is too much of a problem: OLS and tobit regressions have produced similar results.
I have been busy asking people around. At least one person has agreed to help me, but he is busy and I have to wait. Just to let you know in advance: my linear mixed-effects (LME) model contradicts my tobit regressions.

Here's the syntax and output:

replace wordsum = . if wordsum<0
replace wordsum = . if wordsum>10

recode cohort (1905/1920=1) (1921/1930=2) (1931/1939=3) (1940/1947=4) (1948/1953=5) (1954/1959=6) (1960/1968=7) (1969/1979=8) (1980/1994=9), generate(cohort9)

replace cohort9 = . if cohort9>9

recode age (18/23=1) (24/28=2) (29/33=3) (34/38=4) (39/44=5) (45/50=6) (51/56=7) (57/62=8) (63/69=9), generate(age9)

gen bw1age9 = bw1*age9

xtmixed wordsum bw1 || cohort9: age9 bw1age9

Performing EM optimization:

Performing gradient-based optimization:

Iteration 0:   log likelihood = -46852.106  
Iteration 1:   log likelihood = -46852.106  

Computing standard errors:

Mixed-effects ML regression                     Number of obs      =     22156
Group variable: cohort9                         Number of groups   =         9

                                                Obs per group: min =       877
                                                               avg =    2461.8
                                                               max =      3663

                                                Wald chi2(1)       =    253.03
Log likelihood = -46852.106                     Prob > chi2        =    0.0000

     wordsum |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
         bw1 |   1.122804   .0705859    15.91   0.000     .9844581     1.26115
       _cons |   4.751633    .134117    35.43   0.000     4.488769    5.014497

  Random-effects Parameters  |   Estimate   Std. Err.     [95% Conf. Interval]
cohort9: Independent         |
                    sd(age9) |   .1354198   .0394182      .0765443    .2395803
                 sd(bw1age9) |   .0628879   .0207408      .0329485    .1200323
                   sd(_cons) |   .3193018   .0947142      .1785296     .571074
                sd(Residual) |   2.001279   .0095137      1.982719    2.020012
LR test vs. linear regression:       chi2(3) =   401.60   Prob > chi2 = 0.0000

Note: LR test is conservative and provided only for reference.

predict xtfitted9, fitted

twoway (scatter xtfitted9 cohort9, msymbol(Oh) jitter(0)) (lfit xtfitted9 cohort9 if bw1==0, lcolor(red)) (lfit xtfitted9 cohort9 if bw1==1, lcolor(green)), by(age9) legend(cols(4) size(small)) title("Wordsum trend", size(medsmall)) legend(order(2 "black" 3 "white"))

The random-effects coefficients by themselves tell you very little. I think the most important output is the graph, so I attached the nine graphs generated by Stata, one per age category. As you can see, the reduction in the BW gap is present in all nine age categories, but it is relatively small.

Now, I should explain the logic of the mixed-effects model. Here, the race variable is the fixed effect, in the fixed component of the LME. Cohort9 enters as a random intercept, i.e., the mean Wordsum score is allowed to vary freely over my cohort categories. Finally, age and its interaction with race are random slopes within cohort9; that is, I'm specifying that the effect of age, and of age*race, varies across cohorts. The reason race should be in the fixed component (although using it as a random intercept does not meaningfully change the result) is that the random component of an LME evaluates deviations across the values of the variable, and with a dichotomous variable like bw1 I don't think that is wise. Generally, in medical or experimental studies, I see people use LME models with groups as fixed variables and individuals (i.e., id) as random intercepts, with random slopes such as doses, tests, or scores varying freely across individuals. On the other hand, I have often seen textbooks (specifically, Stata books) use examples with US regions (4) or census divisions (9): the number of level-2 categories is small, but it is still doable. A good introduction to LME models is given by Chuck Huber on the Stata blog, here.
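The xtmixed output cannot be reproduced without the data, but the core random-intercept idea can be sketched in plain Python: each cohort's predicted intercept is a precision-weighted compromise (shrinkage) between its own mean and the grand mean. This is a toy sketch with invented scores and assumed, not estimated, variance components (loosely echoing sd(Residual) and sd(_cons) above):

```python
# Toy shrinkage (empirical-Bayes / BLUP-style) of cohort intercepts toward the
# grand mean, with variance components ASSUMED known rather than estimated.
groups = {                      # invented cohort -> wordsum scores
    "c1": [5.0, 6.0, 7.0],
    "c2": [4.0, 4.5],
    "c3": [6.5, 7.0, 7.5, 8.0],
}
s2 = 4.0    # assumed residual variance (cf. sd(Residual)^2 above)
t2 = 0.10   # assumed random-intercept variance (cf. sd(_cons)^2 above)

all_scores = [y for ys in groups.values() for y in ys]
grand = sum(all_scores) / len(all_scores)

shrunk = {}
for g, ys in groups.items():
    n, m = len(ys), sum(ys) / len(ys)
    w = (n / s2) / (n / s2 + 1 / t2)     # weight on the group's own mean
    shrunk[g] = w * m + (1 - w) * grand  # pulled toward the grand mean
```

Small cohorts get pulled harder toward the grand mean; this partial pooling is what distinguishes the random intercept from fitting nine separate cohort means.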

What I did is not exactly what Yang & Land (2008) did, but I don't care. I think it does what I want.

In any case, if the LME tells the truth, the controversy raised by H&H about the difference between period and cohort effects is mostly (but not entirely) due to age effects that were wrongly taken as cohort effects. Chuck was probably right.

Any updates on this submission?
I'm working on it. As you already know, multilevel regression is very complex, probably no less complex than SEM, MGCFA, or IRT. I thought I would have published it two weeks ago. Then I came across the problem of applying sampling weights in multilevel models, which is not straightforward; Brady T. West helped me with that.

Next, I tried to find a way to compare the random-intercept model and the random-slope model using a likelihood ratio test and a measure of effect size. I contacted many authors. They say they are not aware of any measure other than R² and Cohen's f² (of the two, I would recommend f², not R²) for the effect size of a given (set of) variables. But in the many papers and books I have read, f² and R² apply only to the fixed component of the multilevel model, not the random component. So I gave up on that.

The other way of comparing model fit is the likelihood ratio test (LRT) for nested models (which I have). But then I read this page, which says that applying sampling weights in analyses involving maximum likelihood estimation does not allow the LRT, on the grounds that "The “likelihood” for pweighted or clustered MLEs is not a true likelihood" and "Where there are pweights, the “likelihood” does not fully account for the “randomness” of the weighted sampling". Worse, in Stata, when you run a regression with robust standard errors and compute an LRT between two nested models, you get an error message saying that the LRT is not suited to models with robust standard errors. And in Stata, when you use sampling weights in a multilevel regression, you automatically get robust standard errors instead of ML standard errors. Of course, I conducted the LRT without sampling weights, for both ML and REML estimation, but I would like to run the same test with the sampling weights applied.
The linked page recommends the Wald test, but I have heard this is not ideal, especially when the number of groups (i.e., the level-2 variable) is small, which is true in my article: the latest version has only 21 cohort groups (some statisticians recommend at least 30 or 50 groups for unbiased standard errors). Some authors prefer the Hausman test, but I still need to read up on what exactly it is.
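For the unweighted case, the LRT itself is one line: twice the difference in maximized log-likelihoods, compared to a chi-square with df equal to the number of extra parameters. A pure-Python sketch with made-up log-likelihoods (not my models' values; 7.815 is the standard chi-square(3) 95% critical value):

```python
# Likelihood-ratio test for nested models. Valid for unweighted ML fits only;
# as discussed above, it does not apply with pweights / robust standard errors.
def lr_stat(ll_restricted, ll_full):
    return 2.0 * (ll_full - ll_restricted)

CHI2_3_CRIT_95 = 7.815   # 95th percentile of chi-square with 3 df

# Made-up log-likelihoods for two nested models (3 extra parameters).
ll0, ll1 = -46900.0, -46852.106
stat = lr_stat(ll0, ll1)          # 2 * 47.894 = 95.788
reject = stat > CHI2_3_CRIT_95    # reject the restricted model at 5%
```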

This is the only question I still need to answer; the other points made in my article are solid, and there is no black-white gap narrowing at all. I think it should be ready soon (I hope; I have been saying this for many days now).
You should probably recruit an external reviewer for this paper. I don't know whether anyone else here is very familiar with these methods. I am certainly not. Perhaps an econometrician or statistician.
Every claim I make concerning the multilevel model and the tobit regression is backed by references (books and pages), and they are available (anyone can email me for the listed references). Unless I misunderstood the passages of the texts I referred to, it's unlikely I'm mistaken. You'll see when the latest version is available. If you want, I will quote in this thread all of the relevant paragraphs of the books/articles I have cited, and you can see for yourself.

I would like to encourage some experts to review it, but given the extremely low response rate to my very modest, short questions (e.g., what is the consequence of ignoring the slope-intercept covariance? can an effect-size measure be used for the random component of a multilevel regression?), I don't think they will spend the much greater time needed to review a 21-page article. If someone accepts to review it, I will keep you informed.