Researchers tend to co-author with colleges with similar research interests, while they predominately only cite those with differing interests (Ding, 2011). In this report we test the effect on network structure when various co-authorship propensities are altered. We start with a null model resembling Newman’s astrophysics co-authorship network (2004), and vary the degree to which authors are willing to coauthor with others who have dissimilar interests to see how this effects macro-level network properties.

By altering the propensity of simulated researchers to match with co-authors of dissimilar interests, we can model the effect on the overall disconnectedness of a particular field, and the role individuals with differing distributions of expertise have on that structure. In doing so, we explore and make quantifiable the possible benefits of higher cross-expertise collaboration in the academic sphere.

Literature Review

Individuals have some agency regarding who they choose to associate with. While the utility of numerous weak ties has been explored (Granovetter, 1977), individuals will likely invest more heavily in some relationships over others. The selection of these close associates can be based on dyadic properties such as homophily (McPherson, Smith-Lovin, & Cook, 2001), necessity of resources (Lee, 2014), or based on the potential associates’ position in a broader social network (Buechel & Buskens, 2008; Jackson & Wolinsky, 1996). Regardless of reason, trusting relationships are understood to facilitate the transmission of resources between individuals (Coleman, 1988).

One example of this exists within the co-authorship networks of academia. Co-authorship networks differ structurally across disciplines (Newman, 2004), thus the ability for social resources to flow across these networks also differ. It has been shown that strong, direct ties between collaborators produces higher quality research, while working with colleges with similar research interests only influences research output (Gonzalez-Brambila, Veloso, & Krackhardt, 2013). Further, betweenness centrality has shown to be a particularly good predictor of preferential treatment in a co-authorship network, as new entrants to the network approach these brokers when making their first connections (Abbasi, Hossain, & Leydesdorff, 2012). We examine the effects on these metrics using the following model.

Methods

We specify the following parameters in out model: - The number of authors in the model (\(N\)),

The number of iterations the model will run (\(T\)),
The starting threshold \(\sigma_0\) each agent uses to decide if another agent is similar enough in interests to co-author with (so that the individual thresholds \(\sigma_i\) for each agent \(i\) are equal to \(\sigma_0\) at the start of the model),
A decay parameter \(\gamma\) indicating how quickly agents will seek out agents who are less similar in interests,
The maximum number of co-authors for any one agent (\(c_{max}\)),
The max number of times an agent can get rejected by someone before giving up on forming a link with them (\(r_{max}\)), and
The rate parameter \(\lambda\) of the exponential distribution used to determine the talent level in our 5 sub-topics, producing a talent vector \(\beta_i\) for each author \(i\).

The model proceeds as follows. At a given time \(t\), we randomly allow each node \(p\) to initiate a connection with another node \(q\). If a node already has a partner, it is not eligible to be paired with. At time \(t + 1\), each of these connections are tested for affinity (trust). If the similarity of \(q\)’s interests to \(p\) (measured as \(1-JS(\beta_p,\beta_q)\), where \(JS(\dot)\) is the Jensen-Shannon distance) are above \(p\)’s similarity threshold \(\sigma_p\), and if \(q\) has not already formed \(c_{max}\) other connections, then this connection is solidified as a trusting relationship and an edge is formed between \(p\) and \(q\). Otherwise, the connection is dropped, and \(p\) ``lowers their standards’’ by decaying their threshold \(\sigma_p\) by a factor of \(\gamma\).

Results

Bridge Example

Example Model with \(N\) = 9, \(T\) = 50 , \(\sigma_0\) = 0.75, \(\gamma\) = 0.08, \(c_{max}\) = 3, \(\lambda\) = 3.

Here we see an example random model using the above parameters. At \(T\) = 1, we see agents attempting to create a new tie at random (grey tie). As the model progresses, agents decide to either keep that relationship (green tie) given their current \(\sigma\) and co-author saturation, or they drop that relationship and start to look elsewhere. Agents may return to an alter they have already talked to at a later point with lower standards, and may form a connection then, but only until the number of attempted connections equals \(\lambda\) (3), at which point that alter is removed from the list of potential partners the agent will attempt talking to in the future.

Click on the nodes to see their sub-topics and TWD!

If the pop-up does not appear, zoom in or out on the network and re-select the node.

Effect of Lowering Decay Parameter on Betweenness Centrality

Here we see what happens if agents become more willing to co-author with people with different specialties, separated by the ‘talent’ of the agent. To construct the talent metric for a given author \(p\), we compute the entropy of their talent vector \(H(\beta_p)\) and then multiply this entropy by the ‘total talent’ \(\| \beta_p \|_1 = \sum_{i}(\beta_p)_i\), to produce a ‘talent-weighted-diversity’ measure \(TWD(p) = \| \beta_p \|_1H(\beta_p)\). We then plot the betweenness centrality of the authors with the lowest, highest, and median TWD scores, for varying levels of ‘expectation lowering rates’ \(\gamma \in \{0.00, 0.01, \dots, 0.19, 0.20\}\). Following the literature, we would expect the most ‘effective’ co-authorship networks to be those in which the high-TWD authors serve as ‘connectors’ between sub-field clusters. And thus we can read a preliminary result from the plot: that the ‘optimal’ rate of ‘expectation lowering’ – the rate which best facilitates high-TWD authors serving as connectors – is approximately \(\gamma = 0.11\), i.e., an \(11\%\) decay after each failed connection.

We then run an empirical model calibration process as follows. First, we compute summary statistics for both the empirical data, in this case the average degree, the average distance between two nodes, and the clustering coefficient. We then learn weights to align these three quantities such that they have comparable magnitudes. Finally, we perform a grid search over the decay rates \(\gamma \in \{0.00, 0.01, \dots, 0.19, 0.20\}\), compute the weighted sum of absolute differences, and compute the argmin of these sums across the choices of \(\gamma\), giving us the choice of \(\gamma\) which generates a model most similar to the real-world coauthorship network. After calibration of the model with the astrophysics co-authorship network, however, we see that astrophysicists essentially exhibit overly-high ‘standards’, failing to lower their similarity thresholds at a rate which would best allow the emergence of inter-sub-field connectors. Expanding from calibration of the toy model with \(N = 9\) to calibrating larger co-authorship networks (\(N \in \{100, 200, 400, 800\}\)), we find that this result continues to hold: the astrophysics co-authorship network ‘matches’ with networks generated by \(\gamma \in [0.05,0.09]\), while the optimal \(\gamma\) for calibrated networks remains almost identical at \(0.11\).

Discussion

Our network evolution models was parameterized to mimic the network metrics found in Newman’s astrophysics co-authorship network. We see agents creating distinct clusters by sub-topic affinity, and two actors acting as brokers between distinct components. Both of these broker agents have high sub-topic affinity, and relative TWD. The structural network position of these agents would garner them preferential treatment within the network, and from new individuals attempting to join the network. These nodes have been theorized by Abbasi and Hossain (2012) to mimic established researchers within departments, whose networks link across departments and school, while also introducing new agents (graduate students) to the co-authorship network.

We see in the decay parameter graph that the parameter closest to the real life network is not the most beneficial for any agent (though it is moderately helpful for those of average talent). Our model indicates that if researchers were to be just slightly more willing to reach outside their specific interest and sub-fields, brokerage between topics will increase for academics across the spectrum of talent. This increase in brokerage would not only help ideas travel between cliques in the larger network, but may increase the quality of academic pursuits.

Conclusion

In creating this model, we showcase how small changes in the co-authorship behavior of researchers can greatly impact the structure of a field. Given that the type and quality of co-authorship ties have been shown to impact research quality and output, knowing what co-authorship behaviors to encourage can help enhance the scientific pursuit. With these findings, we hope to encourage researchers to reach farther outside their immediate specialties, and explore co-authorship opportunities with colleges whose interests are dissimilar.

References

Abbasi, A., Hossain, L., & Leydesdorff, L. (2012). Betweenness centrality as a driver of preferential attachment in the evolution of research collaboration networks. Journal of Informetrics, 6, 403–412. https://doi.org/10.1016/j.joi.2012.01.002

Buechel, B., & Buskens, V. (2008). The dynamics of closeness and betweenness.

Coleman, J. S. (1988). Social Capital in the Creation of Human Capital. Knowledge and Social Capital, 17–41.

Ding, Y. (2011). Scientific collaboration and endorsement: Network analysis of coauthorship and citation networks. J Informetr, 5, 187–203. https://doi.org/10.1016/j.joi.2010.10.008

Gonzalez-Brambila, C. N., Veloso, F. M., & Krackhardt, D. (2013). The impact of network embeddedness on research output. Research Policy, 42, 1555–1567. https://doi.org/10.1016/j.respol.2013.07.008

Granovetter, M. S. (1977). The strength of weak ties. In Social networks (pp. 347–367). Elsevier.

Handcock, M. S., Hunter, D. R., Butts, C. T., Goodreau, S. M., & Morris, M. (2003). Statnet: Software tools for the Statistical Modeling of Network Data.

Jackson, M. O., & Wolinsky, A. (1996). A Strategic Model of Social and Economic Networks. Journal of Economic Theory, 71(1), 44–74. https://doi.org/10.1006/jeth.1996.0108

Lee, N. H. (2014). The Search for an Abortionist: The Classic Study of How American Women Coped with Unwanted Pregnancy before Roe v. Wade. Open Road Media.

McPherson, M., Smith-Lovin, L., & Cook, J. M. (2001). Birds of a Feather: Homophily in Social Networks. Annual Review of Sociology, 27, 415–444. https://doi.org/10.1146/annurev.soc.27.1.415

Newman, M. E. (2004). Coauthorship networks and patterns of scientific collaboration. Proc Natl Acad Sci U S A, 101 Suppl 1, 5200–5205. https://doi.org/10.1073/pnas.0307545100

“R Core Team”. (2017). R: A Language and Environment for Statistical Computing. Vienna, Austria: R Foundation for Statistical Computing.

Trust and Suitable Partners

Jeff Jacobs, Political Science, Columbia University Jared Joseph, Sociology, University of California - Davis

June 21, 2018