Network Analysis - Statistical Analysis of Social Network Data
Prerequisites and content
Prerequisite knowledge for the course includes the fundamentals of probability and statistics, especially hypothesis testing and regression analysis. This intermediate-level course assumes that students can interpret the results of Ordinary Least Squares, Probit, and Logit regressions. They should also be familiar with the problems that are most common in regression, such as multicollinearity, heteroscedasticity, and endogeneity. Finally, students should be comfortable working with computers and data. No prior knowledge of R or network analysis is required.
The concept of “social networks” is increasingly a part of social discussion, organizational strategy, and academic research. The rising interest in social networks has been coupled with a proliferation of widely available network data, but there has not been a concomitant increase in understanding how to analyze social network data. This course presents concepts and methods applicable for the analysis of a wide range of social networks, such as those based on family ties, business collaboration, political alliances, and social media.
Classical statistical analysis is premised on the assumption that observations are sampled independently of one another. In the case of social networks, however, observations are not independent of one another, but are dependent on the structure of the social network. The dependence of observations on one another is a feature of the data, rather than a nuisance. This course is an introduction to statistical models that attempt to understand this feature as both a cause and an effect of social processes.
Since network data are generated in a different way than many other kinds of social data, the course begins by considering the research designs, sampling strategies, and data formats that are commonly associated with network analysis. A key aspect of performing network analysis is describing various elements of the network’s structure. To this end, the course covers the calculation of a variety of descriptive statistics on networks, such as density, centralization, centrality, connectedness, reciprocity, and transitivity. We consider various ways of visualizing networks, including multidimensional scaling and spring embedding. We learn methods of estimating regressions in which network ties are the dependent variable, including the quadratic assignment procedure and exponential random graph models (ERGMs). We consider extensions of ERGMs, including models for two-mode data and networks over time.
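To make three of the descriptive statistics named above concrete, the following is a minimal sketch in Python rather than R, which the course exercises use. The five-actor network is invented for illustration, and the function names are hypothetical. The definitions shown (ties over possible ties, returned ties over existing ties, closed two-paths over all two-paths) are one common convention for directed networks; software packages offer several variants.

```python
# Illustrative sketch only: a made-up directed network of five actors.
# Rows are senders, columns are receivers; 1 means a tie is present.
adjacency = [
    [0, 1, 1, 0, 0],
    [1, 0, 1, 0, 0],
    [0, 0, 0, 1, 0],
    [0, 0, 1, 0, 1],
    [0, 0, 0, 0, 0],
]


def density(adj):
    """Share of possible directed ties (self-ties excluded) that are present."""
    n = len(adj)
    ties = sum(adj[i][j] for i in range(n) for j in range(n) if i != j)
    return ties / (n * (n - 1))


def reciprocity(adj):
    """Share of existing ties that are returned by the receiver."""
    n = len(adj)
    ties = [(i, j) for i in range(n) for j in range(n) if i != j and adj[i][j]]
    mutual = sum(1 for i, j in ties if adj[j][i])
    return mutual / len(ties) if ties else 0.0


def transitivity(adj):
    """Share of directed two-paths i -> j -> k that are closed by a tie i -> k."""
    n = len(adj)
    two_paths = 0
    closed = 0
    for i in range(n):
        for j in range(n):
            if i == j or not adj[i][j]:
                continue
            for k in range(n):
                if k in (i, j) or not adj[j][k]:
                    continue
                two_paths += 1
                closed += adj[i][k]
    return closed / two_paths if two_paths else 0.0


print(density(adjacency), reciprocity(adjacency), transitivity(adjacency))
```

In this toy network, 7 of the 20 possible ties are present, 4 of those 7 ties are reciprocated, and 2 of the 5 two-paths are closed.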
Instruction is split between lectures and hands-on computer exercises. Students may find it to their advantage to bring with them a social network data set that is relevant to their research interests, but doing so is not required. The instructor will provide data sets necessary for completing the course exercises.
Day 1: Fundamentals of Network Analysis
- Why undertake network analysis?
- How network analysis differs from other statistical methods
- Elements of networks (nodes, links, modes, attributes, matrices, graphs)
- Key concepts (directionality, symmetry)
- Survey methods
- Working with network data in R
Day 2: Descriptive and Inferential Statistics
- Degree distributions
- Centrality (degree, betweenness, closeness, power)
- Components and cores
- Triads, triples, and transitivity
- Correlation and the Quadratic Assignment Procedure
- Random graphs
- Descriptive and inferential statistics in R
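As an illustration of the Quadratic Assignment Procedure listed for Day 2, the sketch below correlates two networks measured on the same actors and then builds a reference distribution by relabeling the actors of one network at random. It is written in Python rather than R, the two five-actor matrices are invented, and the function names (`offdiag_correlation`, `permute`, `qap_test`) are hypothetical. The one-sided p-value and the default of 2000 permutations are assumptions made for the example, not course requirements.

```python
import random

# Made-up networks on the same five actors (illustrative data only).
friendship = [
    [0, 1, 1, 0, 0],
    [1, 0, 1, 0, 0],
    [1, 1, 0, 1, 0],
    [0, 0, 1, 0, 1],
    [0, 0, 0, 1, 0],
]
advice = [
    [0, 1, 0, 0, 0],
    [1, 0, 1, 0, 0],
    [1, 1, 0, 0, 0],
    [0, 0, 1, 0, 1],
    [0, 0, 0, 0, 0],
]


def offdiag_correlation(a, b):
    """Pearson correlation between the off-diagonal cells of two matrices."""
    n = len(a)
    xs = [a[i][j] for i in range(n) for j in range(n) if i != j]
    ys = [b[i][j] for i in range(n) for j in range(n) if i != j]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def permute(matrix, order):
    """Relabel actors: apply the same reordering to rows and columns."""
    n = len(matrix)
    return [[matrix[order[i]][order[j]] for j in range(n)] for i in range(n)]


def qap_test(a, b, draws=2000, seed=0):
    """Return the observed correlation and a one-sided permutation p-value."""
    rng = random.Random(seed)
    observed = offdiag_correlation(a, b)
    order = list(range(len(a)))
    at_least_as_large = 0
    for _ in range(draws):
        rng.shuffle(order)
        if offdiag_correlation(permute(a, order), b) >= observed:
            at_least_as_large += 1
    return observed, at_least_as_large / draws


observed, p_value = qap_test(friendship, advice)
print(observed, p_value)
```

Permuting rows and columns together keeps each network's structure intact while breaking the link between the two networks, which is why the procedure is suited to dyadic data whose observations are not independent.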
Day 3: Exponential Random Graph Models (ERGMs)
- Goodness of Fit
- Working with one-mode and two-mode ERGMs in R
Day 4: Network Data over Time Using Temporal ERGMs
Day 5: Student Presentations and Extensions of ERGM
- Student Presentations
- Additional extensions of ERGMs, if time allows
- Concluding Discussion
Breiger, Ronald L. 1974. “The Duality of Persons and Groups.” Social Forces 53 (2): 181-190.
Burt, Ronald S. 1992. Structural Holes: The Social Structure of Competition. Cambridge, MA: Harvard University Press. Pp. 8-49.
Butts, Carter T. 2008. “network: A Package for Managing Relational Data in R.” Journal of Statistical Software 24 (2): 1-36.
Butts, Carter T. 2008. “Social Network Analysis with sna.” Journal of Statistical Software 24 (6): 1-51.
Cranmer, Skyler J., Bruce A. Desmarais and Jason W. Morgan. 2020. Inferential Network Analysis. New York: Cambridge University Press.
Cranmer, Skyler J., Philip Leifeld, Scott D. McClurg, and Meredith Rolfe. 2017. “Navigating the Range of Statistical Tools for Inferential Network Analysis.” American Journal of Political Science 61 (1): 237-251.
Denny, Matthew J. 2016. “Getting Started with GERGM.” https://www.mjdenny.com/getting_started_with_GERGM.html
Emirbayer, Mustafa. 1997. “Manifesto for a Relational Sociology.” American Journal of Sociology 103 (2): 281-317.
Freeman, Linton C. 1977. “A Set of Measures of Centrality Based on Betweenness.” Sociometry 40 (1): 35-41.
Gould, Roger V., and Roberto M. Fernandez. 1989. “Structures of Mediation: A Formal Approach to Brokerage in Transaction Networks.” Sociological Methodology 19: 89-126.
Granovetter, Mark. 1973. “The Strength of Weak Ties.” American Journal of Sociology 78 (6): 1360-1380.
Heaney, Michael T. 2014. “Multiplex Networks and Interest Group Influence Reputation: An Exponential Random Graph Model.” Social Networks 36 (1): 66-81.
Heaney, Michael T., and Philip Leifeld. 2018. “Contributions by Interest Groups to Lobbying Coalitions.” Journal of Politics 80 (2): 494-509.
Heckathorn, Douglas D. 1997. “Respondent-Driven Sampling: A New Approach to the Study of Hidden Populations.” Social Problems 44 (2): 174-199.
Hunter, David R., Mark S. Handcock, Carter T. Butts, Steven M. Goodreau, and Martina Morris. 2008. “ergm: A Package to Fit, Simulate and Diagnose Exponential-Family Models for Networks.” Journal of Statistical Software 24 (3): 1-29.
Krackhardt, David. 1992. “The Strength of Strong Ties: The Importance of Philos in Organizations.” Pp. 216-239 in Nitin Nohria and Robert Eccles, eds., Networks and Organizations: Structure, Form, and Action. Boston, MA: Harvard Business School Press.
Laumann, Edward O., Peter V. Marsden, and David Prensky. 1983. “The Boundary Specification Problem in Network Analysis.” Pp. 18-34 in Ronald S. Burt and Michael Minor, eds., Applied Network Analysis. Beverly Hills, CA: Sage.
Leifeld, Philip, and Skyler J. Cranmer. 2019. “A Theoretical and Empirical Comparison of the Temporal Exponential Random Graph Model and the Stochastic Actor-Oriented Model.” Network Science 7 (1): 20-51.
Leifeld, Philip, Skyler J. Cranmer, and Bruce A. Desmarais. 2018. “Temporal Exponential Random Graph Models with btergm: Estimation and Bootstrap Confidence Intervals.” Journal of Statistical Software 83 (6): 1-36.
McPherson, Miller, Lynn Smith-Lovin, and James M. Cook. 2001. “Birds of a Feather: Homophily in Social Networks.” Annual Review of Sociology 27: 415-444.
Morris, Martina, Mark S. Handcock, and David R. Hunter. 2008. “Specification of Exponential-Family Random Graph Models: Terms and Computational Aspects.” Journal of Statistical Software 24 (4): 1-24.
Podolny, Joel M. 2001. “Networks as the Pipes and Prisms of the Market.” American Journal of Sociology 107 (1): 33-60.
Scott, John T. 2017. Social Network Analysis, 4th ed. London: Sage.
Strogatz, Steven. 2010. “The Enemy of My Enemy.” New York Times (February 14).
Watts, Duncan. 1999. Small Worlds: The Dynamics of Networks Between Order and Randomness. Princeton: Princeton University Press. Pp. 3-40.
75%: There will be one written, computer-based problem set each day from Monday through Thursday (four assignments in total). Time will be allocated in class to complete each assignment, which must be submitted on the day it is assigned.
25%: On the final day of the course, each student will make a presentation to the class on the results of her or his research project for the week. Giving a presentation is required to receive a satisfactory grade in the course.