congress109: Ideology in Political Speeches

Description

Phrase counts and ideology scores by speaker for members of the 109th US congress.

Value

congress109Counts: A simple_triplet_matrix of phrase counts, with speakers as rows and phrases as columns.
congress109Ideology: A matrix containing the associated repshare and common scores [cs1, cs2], as well as speaker characteristics: party ('R'epublican, 'D'emocrat, or 'I'ndependent), state, and chamber ('H'ouse or 'S'enate).
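
To make the triplet storage of the count matrix concrete, here is a minimal base-R sketch. It uses a made-up 3-speaker by 4-phrase example, not the congress109 data. The names `i`, `j`, `v`, `dense`, and `totals` are illustrative assumptions. The idea is that only the nonzero counts are stored, each as a (row, column, value) triplet.

```r
# Toy illustration only: 3 hypothetical speakers by 4 hypothetical phrases.
# A triplet representation keeps just the nonzero counts as (row, column, value).
i <- c(1, 1, 2, 3, 3)   # speaker (row) index of each nonzero count
j <- c(1, 3, 2, 1, 4)   # phrase (column) index of each nonzero count
v <- c(2, 1, 5, 3, 1)   # the count itself

# Expand to a dense matrix to show what the triplets encode.
dense <- matrix(0, nrow = 3, ncol = 4,
                dimnames = list(paste0("speaker", 1:3), paste0("phrase", 1:4)))
dense[cbind(i, j)] <- v

# Total phrase usage per speaker, computed directly from the triplets.
totals <- tapply(v, factor(i, levels = 1:3), sum)

print(dense)
print(totals)
```

With about 529 speakers and 1000 phrases, most cells are likely zero, which is presumably why a sparse format is used for the counts.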

Details

These data originally appear in Gentzkow and Shapiro (GS; 2010) and consider the text of the 2005 Congressional Record, containing all speeches in that year for members of the United States House and Senate. In particular, GS record the number of times each of 529 legislators used terms in a list of 1000 phrases (i.e., each document is a year of transcripts for a single speaker). The associated sentiments are repshare, the two-party vote share obtained by George W. Bush in the 2004 presidential election in each speaker's constituency (congressional district for representatives; state for senators), and the speaker's first and second common-score values (from http://voteview.com). Full parsing and sentiment details are in Taddy (2011; Section 2.1).
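
As a rough illustration of a two-party vote share, the sketch below assumes it is computed as the Republican vote divided by the sum of the Republican and Democratic votes, with third-party votes excluded. The `votes` data frame and its numbers are invented for the example and are not taken from the dataset.

```r
# Hypothetical vote totals for three made-up constituencies (not real data).
votes <- data.frame(
  constituency = c("A", "B", "C"),
  republican   = c(120000, 80000, 150000),
  democrat     = c(100000, 120000, 50000),
  other        = c(5000, 3000, 2000)
)

# Assumed definition: the two-party share leaves third-party votes out of the denominator.
votes$repshare <- votes$republican / (votes$republican + votes$democrat)

print(votes[, c("constituency", "repshare")])
```

Under this definition, a value above 0.5 means the Republican candidate out-polled the Democratic candidate in that constituency.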

References

Gentzkow, M. and J. Shapiro (2010), What drives media slant? Evidence from U.S. daily newspapers. Econometrica 78, 35-71. The full dataset is at http://dx.doi.org/10.3886/ICPSR26242.

Taddy (2011), Inverse Regression for Analysis of Sentiment in Text. http://arxiv.org/abs/1012.2098

Examples

## congress109 ships with the textir package
library(textir)
data(congress109)

## Inverse Regression Sentiment Modeling 
fit <- mnlm(congress109Counts, congress109Ideology[,6:7], normalize=TRUE, bins=10)
par(mfrow=c(1,2))
plot(fit, type="reduction", v=congress109Ideology$repshare, xlab="Republican Vote-Share",
     covar=1, pch=21, bg=c(4,3,2)[congress109Ideology$party], main="1st common score")
plot(fit, type="reduction", v=congress109Ideology$repshare, xlab="Republican Vote-Share",
     covar=2, pch=21, bg=c(4,3,2)[congress109Ideology$party], main="2nd common score")

## example usage of the predict method
predict(fit, type="reduction", newdata=congress109Counts[c(68,388),])
predict(fit, type="response", newdata=congress109Ideology[c(68,388),6:7])[,c(995,997)]

## example usage of summary method
summary(fit, y=congress109Ideology$repshare)

## A small topic model 
par(mfrow=c(1,1))
tpx <- topics(congress109Counts, K=15)
plot(tpx, group=congress109Ideology$party=="R", col=c(4,2), labels=c("Dem","GOP"))
summary(tpx)