Homework - Academics choose research topics

By Fabian Held and Antonio Miscio

Introduction

We developed a simple agent-based model of academic research. Our agents are born as PhD students, graduate into full-blown academics and attempt several research projects in the course of their academic lives. Different factors affect the choice of what topic to study: some relate to the researcher’s characteristics and some to the research environment in which the researcher operates. We analyse the long-run effect of several parameters of interest on the amount of knowledge discovered, the number of academics still active in research and the number of initial “schools of thought” still surviving after the passage of time.

Theory and assumptions

When academics choose which projects to work on, they are typically influenced or constrained by a number of factors. Our assumptions are intended to represent what we consider the most relevant factors in this choice, and they fall into three areas: knowledge base, the academic’s own characteristics, and peers.

· Knowledge base: new research topics build on the existing knowledge base

· Academic’s own characteristics: researchers have a field of expertise, i.e. a field in which they have done some work in the past; they are also born with a natural inclination for a particular area of research, for instance natural vs. social sciences; finally, an academic’s choice of topics also depends on factors such as age, whether they hold a tenured position, and how stubborn they are. We condense these last factors into one, assuming that they are highly correlated.

· Peers: the research environment and the influence of peers can matter in different ways; for instance, they can signal which topics “sell” well at the moment, what the predominant paradigm in the discipline is, or which recent developments are the most promising.

Moreover, in addition to working on some projects on their own, academics sometimes team up with one another to produce multidisciplinary research projects, in which case the resulting topic is a hybrid object.

Finally, academics are “born” as PhD students, and at the beginning of their career they are closely associated with a more senior academic; this obviously has an impact on the initial topic chosen by the student.

Implementation

We used NetLogo to model the following elements:

· Knowledge: Knowledge is represented by a two-dimensional grid which is initially unexplored (patches are initially black), i.e. ideas exist but are not known until an academic successfully discovers them (patches become white). Each angle in the 0-360 degree range represents an area of knowledge (e.g. 60 degrees = biology, 95 degrees = anthropology).

· Academics: Academics are born (more on this to follow) in a location on the grid, somewhere at the frontier of discovered knowledge. They are endowed with a natural inclination for a discipline, represented by their ideal angle a; they are aware of the position of their peers on the grid; they choose research topics and attempt a project in the direction of the chosen topic; in some cases they generate offspring (i.e. little PhDs); over time academics age and stop doing research.

· Choice of a topic: an academic starts by observing her own current position; she also observes where her peers are (i.e. the agents located within a certain distance of her) and calculates the direction (an angle b) in which she would have to walk to reach the average peer position from her current position; finally, a topic is selected by taking a weighted average of a and b, where the relative weight on the natural inclination a increases with age (two interpretations: academics become more stubborn, or academics become less influenced by peer researchers).

· Project: once a topic (i.e. a direction) has been determined following the procedure detailed above, the academic attempts a project, i.e. she walks in that direction on the grid until she finds an area of knowledge which has not been explored yet. Projects are successful with a probability that depends on the environment: the more knowledge has already been explored in the neighborhood of the new topic, the more likely it is that a new project on that topic will be successful. When a project turns out to be successful (the cell turns white), the academic can continue doing research and a new PhD student is generated at that location. If instead it is unsuccessful, the academic stays on that patch until another academic successfully explores it.

· PhD students: Even though a new PhD student is born in the same location as where his supervisor is at that moment, he is endowed with his own a, independent of the supervisor’s. After the first period, the PhD student “graduates” and becomes a full-blown academic. We also introduced a parameter representing the survival rate of PhD students in academia, because we want to analyse what happens when a variable number of them quit academia at the end of the PhD and join the private sector.

· Multidisciplinarity: a multidisciplinary project is the result of a random match of two academics; the hybrid project inherits one trait from each academic, i.e. the current position of one of them and the preferred direction a of the other.
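
The topic-choice rule described in the list above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the NetLogo code of the model: the function names are hypothetical, angles follow the usual mathematical convention (counter-clockwise from the positive x axis) rather than NetLogo headings, and the weight on the natural inclination a is assumed to grow linearly with age, whereas the text only says that it increases with age. The average of a and b is taken on the circle, so that angles on either side of 0 degrees blend sensibly.

```python
import math


def heading_to(origin, target):
    """Direction in degrees (normalised with % 360) from origin to target."""
    dx, dy = target[0] - origin[0], target[1] - origin[1]
    return math.degrees(math.atan2(dy, dx)) % 360.0


def blend_angles(a, b, weight_a):
    """Weighted average of two angles (degrees), computed via unit vectors.

    A plain arithmetic mean breaks at the wrap-around: 350 and 10 should
    blend to roughly 0 (equivalently 360), not to 180.
    """
    x = weight_a * math.cos(math.radians(a)) + (1 - weight_a) * math.cos(math.radians(b))
    y = weight_a * math.sin(math.radians(a)) + (1 - weight_a) * math.sin(math.radians(b))
    return math.degrees(math.atan2(y, x)) % 360.0


def choose_topic(position, inclination_a, peer_positions, age, max_age):
    """Return the heading (degrees) of the academic's next project."""
    if not peer_positions:
        # ASSUMPTION: with no peers in range, follow the natural inclination.
        return inclination_a % 360.0
    mean_x = sum(p[0] for p in peer_positions) / len(peer_positions)
    mean_y = sum(p[1] for p in peer_positions) / len(peer_positions)
    b = heading_to(position, (mean_x, mean_y))
    # ASSUMPTION: the weight on a rises linearly from 0 to 1 over the career.
    weight_a = min(max(age / max_age, 0.0), 1.0)
    return blend_angles(inclination_a, b, weight_a)


# A young academic follows her peers, an old one her own inclination.
young = choose_topic((0, 0), 90, [(1, 0), (3, 0)], age=0, max_age=5)
old = choose_topic((0, 0), 90, [(1, 0), (3, 0)], age=5, max_age=5)
print(young, old)
```

Any other monotone weighting (for instance a step at tenure) would fit the verbal description equally well; the linear form is chosen here only because it is the simplest.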

Results

We explored the model's behaviour across a sensible range of its parameter space. Four model parameters served as independent variables: PhD retention rate (15%, 20%, 30%, 40%), initial number of academics (10, 20, 30, 40, 50), probability of multidisciplinary research (1%, 3%, 5%) and maximum age of academics (5 or 7 projects). The model was run 25 times for every combination of these parameters. The maximum duration was 70 steps, because initial exploration showed that by that time the model had either stopped because all academics had disappeared, or the population of academics had become so large that NetLogo slowed down substantially.
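
To make the size of this experiment concrete, the parameter grid can be enumerated as follows. This is only a sketch for counting runs; the variable names are ours and it does not reproduce how the sweep was actually driven in NetLogo.

```python
from itertools import product

# Parameter values as listed in the text (variable names are illustrative).
phd_retention_pct = [15, 20, 30, 40]
initial_academics = [10, 20, 30, 40, 50]
p_multidisciplinary_pct = [1, 3, 5]
max_age_projects = [5, 7]
RUNS_PER_COMBINATION = 25

grid = list(product(phd_retention_pct, initial_academics,
                    p_multidisciplinary_pct, max_age_projects))
total_runs = len(grid) * RUNS_PER_COMBINATION

print(len(grid), "combinations,", total_runs, "runs")  # 120 combinations, 3000 runs
```

The 120 combinations are consistent with the 115 residual degrees of freedom reported in the regressions (120 observations minus 5 estimated coefficients), which suggests one observation per parameter combination; this is an inference from the numbers, not something the text states.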

The dependent variables were the maximum amount of knowledge discovered, the number of academics still actively pursuing research and the number of original ideas or "schools of thought" that had persisted until the end of the model run.

We used regression analysis to assess how strongly each model parameter affects these outcomes.

All independent variables turned out to be statistically significant at the 5% level in all three regressions, except for the probability of multidisciplinary research:

Experiment 1: MaxKnowledge ~ PhDSurvival + InitAgents + pMultidisciplinarity + MaxAge

Coefficients:	Estimate	Std. Error	t value	Pr(>|t|)
(Intercept)	-11451.79	1804.06	-6.348	4.49e-09 ***
PhDSurvival	170.02	25.97	6.547	1.71e-09 ***
InitAgents	54.04	17.63	3.065	0.00271 **
pMultidisciplinarity	23.43	152.69	0.153	0.87831
MaxAge	1098.66	249.34	4.406	2.37e-05 ***

Signif. codes: *** 0.001; ** 0.01; * 0.05.

Residual standard error: 2731 on 115 degrees of freedom

Multiple R-squared: 0.384, Adjusted R-squared: 0.3626

F-statistic: 17.92 on 4 and 115 DF, p-value: 1.831e-11

Experiment 2: Turtles.at.End ~ PhDSurvival + InitAgents + pMultidisciplinarity + MaxAge

Coefficients:	Estimate	Std. Error	t value	Pr(>|t|)
(Intercept)	-18430.45	3394.79	-5.429	3.20e-07 ***
PhDSurvival	256.80	48.87	5.255	6.89e-07 ***
InitAgents	96.17	33.18	2.899	0.004488 **
pMultidisciplinarity	65.77	287.32	0.229	0.819352
MaxAge	1721.94	469.20	3.670	0.000369 ***

Signif. codes: *** 0.001; ** 0.01; * 0.05.

Residual standard error: 5140 on 115 degrees of freedom

Multiple R-squared: 0.3011, Adjusted R-squared: 0.2768

F-statistic: 12.38 on 4 and 115 DF, p-value: 2.077e-08

Experiment 3: Original.Ideas.Left ~ PhDSurvival + InitAgents + pMultidisciplinarity + MaxAge

Coefficients:	Estimate	Std. Error	t value	Pr(>|t|)
(Intercept)	-1.693538	0.411095	-4.120	7.18e-05 ***
PhDSurvival	0.022373	0.005918	3.781	0.00025 ***
InitAgents	0.010000	0.004018	2.489	0.01424 *
pMultidisciplinarity	0.018750	0.034794	0.539	0.59100
MaxAge	0.150000	0.056818	2.640	0.00944 **

Signif. codes: *** 0.001; ** 0.01; * 0.05.

Residual standard error: 0.6224 on 115 degrees of freedom

Multiple R-squared: 0.1944, Adjusted R-squared: 0.1664

F-statistic: 6.937 on 4 and 115 DF, p-value: 4.872e-05
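
As a reading aid, the reported figures can be checked for internal consistency using standard OLS identities: each t value is the estimate divided by its standard error, the adjusted R-squared follows from R-squared and the degrees of freedom, and the F-statistic follows from R-squared. The sketch below is our own arithmetic check and not part of the original analysis; the numbers are transcribed from the three outputs above, and the tolerances only allow for rounding in the printed values.

```python
N_PREDICTORS = 4
DF_RESIDUAL = 115
N_OBS = DF_RESIDUAL + N_PREDICTORS + 1  # 120 observations implied by the output

# (estimate, standard error, reported t value) per coefficient, in table order.
experiments = {
    "MaxKnowledge": {
        "r2": 0.384, "adj_r2": 0.3626, "f": 17.92,
        "coef": [(-11451.79, 1804.06, -6.348), (170.02, 25.97, 6.547),
                 (54.04, 17.63, 3.065), (23.43, 152.69, 0.153),
                 (1098.66, 249.34, 4.406)],
    },
    "Turtles.at.End": {
        "r2": 0.3011, "adj_r2": 0.2768, "f": 12.38,
        "coef": [(-18430.45, 3394.79, -5.429), (256.80, 48.87, 5.255),
                 (96.17, 33.18, 2.899), (65.77, 287.32, 0.229),
                 (1721.94, 469.20, 3.670)],
    },
    "Original.Ideas.Left": {
        "r2": 0.1944, "adj_r2": 0.1664, "f": 6.937,
        "coef": [(-1.693538, 0.411095, -4.120), (0.022373, 0.005918, 3.781),
                 (0.010000, 0.004018, 2.489), (0.018750, 0.034794, 0.539),
                 (0.150000, 0.056818, 2.640)],
    },
}


def check(exp):
    """Return (t values ok, adjusted R-squared ok, F-statistic ok)."""
    t_ok = all(abs(est / se - t) < 0.005 for est, se, t in exp["coef"])
    adj = 1 - (1 - exp["r2"]) * (N_OBS - 1) / DF_RESIDUAL
    f = (exp["r2"] / N_PREDICTORS) / ((1 - exp["r2"]) / DF_RESIDUAL)
    return t_ok, abs(adj - exp["adj_r2"]) < 5e-4, abs(f - exp["f"]) < 0.05


for name, exp in experiments.items():
    print(name, check(exp))
```

All three outputs pass these checks, so the tables appear to have been transcribed without error. The low R-squared values (0.19 to 0.38) are a reminder that most of the variation between runs is not explained by a linear effect of the four parameters.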

Furthermore, the results and dynamics are provided graphically:

Link to result tables

Further research

Given the limited time available for this homework, we had to remove from the model a number of interesting features which are worth further investigation. In particular, we would have liked to have knowledge represented by a rugged landscape instead of a uniform one. The interpretation would be that some areas of knowledge are harder to discover than others (hence a rugged environment where cells have different altitudes), so that a research project is less likely to succeed in discovering those areas.

Although we modeled a notion of incremental knowledge to some extent, our patches change discretely from a value of zero to one (from undiscovered to discovered knowledge) as soon as a project is successful. We could instead imagine that academics, when successful, add a certain amount of understanding to that patch, so that multiple academics could add understanding to the same patch.

Moreover, the matching mechanism behind a multidisciplinary project is entirely random; a more plausible one would be conditional on characteristics of the academics such as their past record of projects, natural inclination, etc.

Finally, all projects in this model last only one period and only one project per period can be attempted; that is, an academic attempts a project only once, and if she happens to be unsuccessful she is stuck there for some time. The model would gain realism by allowing academics to work on the same project for longer than one period and to work on multiple projects at the same time. These features could of course interact with the academic’s project record and should depend on the research environment too.