Let’s say you want to use social scientific tools to predict the outcome of the Democratic nomination, by predicting the winner of each of the remaining state primaries/caucuses. You want to evaluate competing predictions:
1. Because Sanders relies on support from independents, he will start faring worse as more and more states hold closed primaries—thereby securing the nomination for Clinton rather easily.
2. Because Clinton relies on overwhelming support from African American voters, she will start faring worse as the remaining states do not tend to have very large African American populations, giving Sanders a chance to catch momentum and catch up in the delegate count.
Imagine that you decided to develop a statistical model to analyze these competing claims, based on data that are available from the 25 or so contests that have already taken place this year (Iowa, New Hampshire, South Carolina, and so on). Using those data, your model examines the percentage of the vote that has gone to Sanders (relative to Clinton) in each state, looking at how that percentage varies in each state according to
a) The percentage of independents that comprise the voting day electorate in each state, and
b) The percentage of African American voters that comprise the voting day electorate in each state.
To make sure you’re really evaluating the causal impact of each of these, you decide to also include state characteristics that could account for any observed relationships between the Sanders percentage of the vote and the percentages of the electorate that are independent and African American.
Namely, you include:
a) Whether a state is a caucus or primary
b) Whether the state is in the American South
c) Whether the state went for Barack Obama in the Electoral College in 2008
d) Percentage of the electorate composed of Millennials
You then perform your analysis, comparing Sanders’s vote share across the contests that have already taken place. Specifically, you look at the percentage increase or decrease in Sanders’s vote share that is associated with each percentage point increase or decrease in independents in the electorate from one state to another. You do the same thing with regard to the percentage point increases or decreases of African Americans in the electorate from one state to another, while also taking account of the other things mentioned above.
You then compare the size of the effect associated with the African American variable to the size of the effect associated with the percentage independent variable, and you calculate what Sanders’s vote share would have been in each state had there been no independents (closed caucus) and no African Americans. This enables you to project what Sanders’s vote share will be in each of the 25 state contests that have yet to take place, thereby allowing you to predict the overall outcome of the nomination.
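To make the analysis described above concrete, here is a minimal sketch of one way it could be set up, assuming the model is an ordinary least squares regression (the scenario does not name a specific estimator). Every number in it is made up for illustration and is not real 2016 data: the toy vote shares are generated from an assumed set of coefficients so that the fit can be checked against them. All names (CONTESTS, TRUE_COEFS, fit_ols, predict, upcoming) are hypothetical.

```python
# Illustration only: all values are invented, not actual primary/caucus results.

# Assumed coefficients used to generate the toy outcomes, in this order:
# intercept, pct_independent, pct_black, caucus, south, obama_2008, pct_millennial
TRUE_COEFS = [30.0, 0.5, -0.4, 10.0, -3.0, 1.0, 0.4]

# One row per state contest (the unit of analysis):
# (pct_independent, pct_black, caucus, south, obama_2008, pct_millennial)
CONTESTS = [
    (20, 3, 1, 0, 1, 18), (40, 2, 0, 0, 1, 19), (16, 61, 0, 1, 0, 15),
    (18, 13, 1, 0, 1, 17), (27, 8, 0, 0, 1, 16), (22, 27, 0, 1, 0, 14),
    (25, 19, 0, 1, 1, 20), (33, 1, 1, 0, 0, 21), (21, 47, 0, 1, 0, 13),
    (30, 21, 0, 0, 1, 22), (24, 5, 1, 0, 0, 12), (35, 10, 0, 1, 1, 23),
]


def design_row(features):
    """Prepend a 1 for the intercept term."""
    return [1.0] + [float(v) for v in features]


def predict(coefs, features):
    """Predicted Sanders vote share for one state's characteristics."""
    return sum(c * x for c, x in zip(coefs, design_row(features)))


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_ols(rows, outcomes):
    """Least squares via the normal equations (X'X) b = X'y."""
    X = [design_row(r) for r in rows]
    k = len(X[0])
    xtx = [[sum(x[i] * x[j] for x in X) for j in range(k)] for i in range(k)]
    xty = [sum(x[i] * y for x, y in zip(X, outcomes)) for i in range(k)]
    return solve(xtx, xty)


# Toy dependent variable: Sanders vote share in each past contest.
outcomes = [predict(TRUE_COEFS, row) for row in CONTESTS]
coefs = fit_ols(CONTESTS, outcomes)

# Projection for a hypothetical upcoming state, and the same state
# with no independents and no African American voters in the electorate.
upcoming = (28, 6, 0, 0, 1, 17)
projected = predict(coefs, upcoming)
baseline = predict(coefs, (0, 0) + upcoming[2:])

print("effect per point of independents:", round(coefs[1], 3))
print("effect per point of African American voters:", round(coefs[2], 3))
print("projected share:", round(projected, 1), "baseline:", round(baseline, 1))
```

Because the toy outcomes contain no noise, the fitted coefficients simply recover the assumed ones; with real contest data the estimates would carry error, and with only about 25 observations and six predictors that error could be substantial.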
1. What is the unit of analysis in your study?
2. What is the dependent variable?
3. What are the 2 independent variables?
4. What are the control variables?
5. Is this an observational, correlational, or quasi-experimental design? Why?
6. It is impossible to test your hypotheses using an experimental design. Why?
7. Evaluate the internal validity and external validity of the design. Fully explain your evaluations.
8. Evaluate the validity and reliability of your independent variable and control variable measures. Do NOT just simply state that they’re reliable and valid or not, but demonstrate that you know what those things mean.
a. How much error do you think is present in your control variable measures, specifically? Explain.
b. What are the consequences of that, in terms of your ability to draw meaningful conclusions from your statistical results?
Having read through Chapters 1-5 in Representing Red and Blue (only skimming Chapters 2 and 3), answer the following:
1. What is the research question and hypothesis, generally speaking (don’t get into the weeds of measurement yet) in this half of the book?
2. What is the unit of analysis in all of the analyses performed in Chapters 4 and 5?
3. What is the instrument that is used to operationalize the study in Chapters 4 and 5?
4. What is the dependent variable in Chapter 4? What is it in Chapter 5?
5. What type of research design do the analyses described in Chapter 4 represent? What about Chapter 5?
6. What is the independent variable in Chapter 5, conceptually speaking? What is the specific measure used to capture that concept? Evaluate that measure in terms of validity and reliability.
7. Why are control variables so present in Chapter 4 but not in Chapter 5?