A glossary of terms for navigating the field of social network analysis
Penelope Hawe (1), Cynthia Webster (2), Alan Shiell (1)

  1. Department of Community Health Sciences, University of Calgary, Canada and School of Public Health, La Trobe University, Australia
  2. School of Marketing, University of New South Wales, Australia
Correspondence to: Dr P Hawe, Department of Community Health Sciences, University of Calgary, 3330 Hospital Drive NW, Calgary, Alberta T2N 4N1, Canada; phawe@ucalgary.ca

Abstract

Social network analysis is the study of social structure. This glossary introduces basic concepts in social network analysis. It is designed to help researchers to be more discriminating in their thinking and choice of methods.

  • social capital
  • social network analysis
  • social structure
  • social relations

“Network” is an increasingly popular word in health research and health services delivery. The word is often used synonymously with “partnership”, “collaboration”, “alliance”, or even “group”. At other times, it is used with more specific intention to describe the relationships that exist between groups of individuals or agencies, and the resources to which membership of such groups facilitates access. These relationships can be investigated empirically. The role of one’s personal social networks in the development of morbidity and mortality is a well established field of inquiry in social epidemiology.1–3 Networks and network resources are also an important component of the growing literature on social capital.4,5 Network analysis is becoming popular in infectious disease epidemiology, especially HIV.6–8 There is also a strong tradition of using inter-organisational network analysis to investigate patterns of healthcare delivery such as referral patterns, service integration, coordination, and collaboration.9–13

Social network analysis is the study of structure.14 It involves relational datasets. That is, structure is derived from the regularities in the patterning of relationships among social entities, which might be people, groups, or organisations. Social network analysis is quantitative. It has a long history in sociology and mathematics and it is creeping into health research as its analytical methods become more accessible with user friendly software. See Wellman for an overview of the concept of the social network and a history of network analysis.15

From a network perspective, it is the structure of the network and how the structural properties affect behaviour that is informative, not simply the characteristics of the network members (the latter comprise attribute datasets). This glossary has been prompted by observations in health research that many investigators ask questions about properties of social networks (for example, “how many people would you usually socialise with in a typical month”) and then call this “social network analysis.” But it isn’t, and our field of inquiry can easily become confused or compromised as a consequence. In such studies, questions about network composition might characterise people in terms of gender, or occupation, for example. Questions about network structure might include size of a network (“How many close friends do you have?”) or the frequency of interaction (“How many times per month do you have dinner with close friends?”). Questions about function might include the social support or the resources (social capital) that a person draws upon from that network. These might be questions about the amount or quality of informational support, material support, or emotional support provided.16 Such specific information has proved remarkably powerful in explaining some variation in health.17–19 But it provides only a partial view of a person’s social networks. Missing is any information on the position of the person within the network, of the relationships between other network members, of the characteristics of the network structure (whether it is dense or loose), of the ties that connect actors (whether they be strong or weak), and of the relationships between network structure and position, and access to the resources embedded within those networks. True network data, such as these, can add enormously to our understanding of how physical and social environments impact on health and behaviour.20,21

In this glossary we provide introductory level, non-technical definitions of the main concepts and measures used in network analysis. See Wasserman and Faust22 and Scott23 for a comprehensive review and the web page of the International Network for Social Network Analysis (http://www.sfu.ca/~insna) for a guide to texts, journals, conferences, and statistical software. Our purpose is twofold. Firstly, we want to prevent mislabelling in this field, in particular where the rhetoric of social network analysis is invoked but then coupled with measures that are limited to properties of individuals. Secondly, by elucidating the range and complexity of concepts in this arena, we want to promote more discriminating research of social phenomena in the health sciences. By this we mean more precise methods and analysis tied to more precise hypotheses about properties of networks, but more particularly, their structure.

BASIC ELEMENTS IN NETWORK ANALYSIS

Actors are network members that are distinct individuals (for example, clients of a health service, residents of a neighbourhood) or collective units (for example, health organisations within a community).

Relational ties link actors within a network. These ties can be informal (for example, whether people in one organisation know people in another organisation) or formal (for example, whether one organisation funds another). Actors can have multiple ties with other actors, a feature known as multiplexity.

TYPES OF NETWORKS

One mode networks involve relations among a single set of similar actors, such as information exchange among physicians within a hospital.

Two mode networks involve relations among two different sets of actors. An example would be the analysis of a network consisting of private, for profit organisations and their links to non-profit agencies in a community. Two mode networks are also used to investigate the relationship between a set of actors and a series of events. For example, although people may not have direct ties to each other, they may attend similar events or activities in a community, and this sets up opportunities for the formation of “weak ties.”24

Socio-centric or complete networks consist of the relational ties among members of a single, bounded community. An example would be relational ties among all of the teachers in a high school.

Ego-centric or personal networks are defined from a focal actor’s perspective only. This refers to the ties directly connecting the focal actor (ego) to others (ego’s alters) in the network, plus ego’s views on the ties among his or her alters. An example would be if we asked a teacher to nominate the people he/she socialises with outside of school, and then asked that teacher to indicate who in that network socialises with the others nominated.

NETWORK DATA COLLECTION

Saturation surveys are used to map complete or whole networks. Relevant relational data (for example, type of relation, strength of tie) are collected from each actor in the network, allowing a complete analysis of network relations and the resources embedded therein. For fairly small networks (50 actors or fewer), each actor can be provided with a list of all actors in the network and asked to indicate those with whom she or he has a particular relation (and any other relevant relational information such as strength of the tie). For relatively large networks each actor can be asked to recall freely her or his relations within the specified network.

For ego-centric networks, in which it is not possible to survey every network participant, two methods of data collection can be used: name generators and position generators.

Name generators involve asking a focal actor for the names of people to whom he or she is connected in a particular way. Connections might involve identifying people the focal actor “discusses important matters with”25 or “frequently socialises with.”26 A snowball sampling technique is typically followed, in which a set number of focal actors are randomly chosen from the larger population to interview initially.27–29 From the list of names generated by the focal actors, called the actors’ alters, either all of the named individuals are then interviewed or a specified number of alters are randomly chosen to be the next interviewed. This procedure continues for a fixed number of steps.
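The snowball procedure can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names in `NOMINATIONS` are invented stand-ins for name generator responses, the function `snowball_sample` and its parameters are hypothetical, and the seed actors are passed in directly rather than drawn at random from a population.

```python
import random

# Hypothetical name generator responses: each actor lists the alters he or she names.
NOMINATIONS = {
    "Ana": ["Ben", "Cal"],
    "Ben": ["Ana", "Dee"],
    "Cal": ["Dee", "Eli"],
    "Dee": ["Fay"],
    "Eli": [],
    "Fay": ["Ana"],
}

def snowball_sample(nominations, seeds, n_waves, max_alters=None, rng=None):
    """Return a list of interview waves; wave 0 holds the seed actors."""
    rng = rng or random.Random()
    interviewed = set(seeds)
    waves = [sorted(seeds)]
    current = list(seeds)
    for _ in range(n_waves):
        named = set()
        for ego in current:
            named.update(nominations.get(ego, ()))
        fresh = sorted(named - interviewed)  # alters not yet interviewed
        if max_alters is not None and len(fresh) > max_alters:
            # Interview only a random subset of the newly named alters.
            fresh = sorted(rng.sample(fresh, max_alters))
        if not fresh:
            break
        waves.append(fresh)
        interviewed.update(fresh)
        current = fresh
    return waves

waves = snowball_sample(NOMINATIONS, ["Ana"], n_waves=2)
print(waves)  # [['Ana'], ['Ben', 'Cal'], ['Dee', 'Eli']]

capped = snowball_sample(NOMINATIONS, ["Ana"], n_waves=2, max_alters=1,
                         rng=random.Random(0))
```

Leaving `max_alters` unset corresponds to interviewing every named alter; setting it corresponds to randomly choosing a specified number of alters at each step.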

Name generators such as these typically elicit strong ties in dense network sectors.30 To identify weaker ties in more wide ranging network sectors, acquaintance name generators can be used.31 Name generators should be followed up by a series of “name interpreter” questions, designed to elicit information about the named actors, their characteristics, relations to the focal actor, and their relationship to the other named alters. Without information on the interrelationships among the alters, no structural analyses can be performed.32

Position generators are used to identify people who fill particular valued roles or positions such as lawyers, physicians, or politicians and who therefore have access to a range of resources (for example, information, skills, links to other networks).33,3 The roles are specified by the analyst and the focal actors are asked if they know anybody in each of these roles. As with name generators, name interpreter questions should follow.

Data about networks are obtained in much the same way as data about individuals in traditional health research. That is, they are drawn principally from interviews, self completed questionnaires, document analysis, diary methods, and observation. Issues to do with the reliability and validity of these data sources are often similar to those in attribute data collection, and a useful review is provided by Marsden.35 On the whole, people are better at recalling typical or routine relationships and interactions than they are at recalling transactions that occur within highly specific time frames.35 Informant “accuracy” in studies of social structure is an interesting conceptual issue and one that encourages researchers to reflect carefully on the theory underlying their analysis of social structure. For example, if an actor says that he/she has a tie with a particular alter, but the researcher finds that the alter does not verify it, does that mean that the tie does not “exist”? Or is the subjective cognition of the tie by the actor the most important interpretation in this context?35 Another important methodological area of research in social networks is how to select samples and set boundaries for networks, that is, deciding who is “in” and “out” of the study.36

MEASURES OF NETWORK STRUCTURE

Network data are collected at the individual level, but as the following definitions indicate, the analyses occur at the structural level.

Data from a network survey are typically entered into a database as a square actor by actor similarity or distance matrix. Presence of a tie is indicated with a “1” and absence of a tie with a “0”. Table 1 is a matrix of network relationships among 19 organisations. It shows data generated from the question “From this list, can you identify which organisations your own organisation currently sits on committees with?” If strength of tie is being investigated (for example, how much a person likes another person, or how regularly a person socialises with another person), this is represented as valued data (typically numbers from 1 to 5, with 5 being the highest strength). In a similarity matrix, a large number in the cell connecting two actors indicates a strong tie. Just the opposite is the case in a distance matrix. A distance matrix is like a road map: larger numbers denote greater distances between actors or, in other words, weaker ties. These data can be converted into graphs and analysed using special network analysis software packages, such as Ucinet 6 (Harvard, Analytic Technologies), Pajek (http://vlado.fmf.uni-lj.si/pub/networks/pajek/default.htm), and StOCNET (version 1.4, Groningen, ProGAMMA/ICS).

Table 1

 Square matrix illustrating committee ties between 19 organisations
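To illustrate the matrix form, the following minimal Python sketch builds a 19 by 19 binary matrix from a list of ties. Table 1 is not reproduced in this text, so the edge list `EDGES` is an assumption: it is a reconstruction inferred from the descriptions of figure 1 given in this glossary (direct ties, cliques, distances), not the authors' published data. The function name `adjacency_matrix` is likewise illustrative.

```python
N_ACTORS = 19

# Assumed reconstruction of the committee ties (inferred from the text, not Table 1).
EDGES = [
    (1, 2), (1, 3), (1, 6), (1, 7), (1, 9), (1, 10),
    (2, 3), (2, 9), (2, 19),
    (3, 5), (3, 7), (3, 9), (3, 10), (3, 11), (3, 19),
    (6, 7), (6, 9), (6, 11), (6, 19),
    (7, 11), (7, 19),
    (8, 19), (9, 19), (12, 14), (12, 19), (13, 19), (14, 15), (17, 19),
]

def adjacency_matrix(n, edges):
    """Return an n-by-n 0/1 matrix; actor k is stored at row and column k-1."""
    matrix = [[0] * n for _ in range(n)]
    for a, b in edges:
        matrix[a - 1][b - 1] = 1
        matrix[b - 1][a - 1] = 1  # sitting on a committee together is mutual
    return matrix

matrix = adjacency_matrix(N_ACTORS, EDGES)
ones = sum(sum(row) for row in matrix)
print(ones)  # 56
```

Each committee tie fills two cells, one in each direction, so the 28 reconstructed ties give 56 cells containing a 1.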

Graphs are visual representations of networks, displaying actors as nodes and the relational ties connecting actors as lines. The data in table 1 are represented as a graph in figure 1. Immediately we see that three of the organisations have no formal committee ties and nine of the organisations have many committee links to one another. One organisation, actor 19, clearly is in a unique position being the only organisation connecting six other organisations to the larger group of nine.

Figure 1

 Graphical display of an interorganisational network with 19 actors.

Cohesion describes the interconnectedness of actors in a network. There are three common measures of cohesion:

Distance between two actors in a network (or nodes in a graph) is calculated by summing the number of distinct ties (lines) that exist along the shortest route between them. So in figure 1 actor 15 is a distance of 5 from actor 11. This is the notion of “degrees of separation” made familiar to many by a popular play.37
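A breadth first search is one common way to compute this distance. The sketch below is illustrative only: the helper names are hypothetical, and `EDGES` is an assumed reconstruction of the ties in figure 1 inferred from the descriptions in this glossary, since Table 1 is not reproduced here.

```python
from collections import deque

# Assumed reconstruction of the committee ties (inferred from the text, not Table 1).
EDGES = [
    (1, 2), (1, 3), (1, 6), (1, 7), (1, 9), (1, 10),
    (2, 3), (2, 9), (2, 19),
    (3, 5), (3, 7), (3, 9), (3, 10), (3, 11), (3, 19),
    (6, 7), (6, 9), (6, 11), (6, 19),
    (7, 11), (7, 19),
    (8, 19), (9, 19), (12, 14), (12, 19), (13, 19), (14, 15), (17, 19),
]

def build_adjacency(edges):
    """Map each actor to the set of actors it is directly tied to."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj

def distance(adj, source, target):
    """Number of ties on the shortest route, or None if target is unreachable."""
    steps = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            return steps[node]
        for nxt in adj.get(node, ()):
            if nxt not in steps:
                steps[nxt] = steps[node] + 1
                queue.append(nxt)
    return None

adj = build_adjacency(EDGES)
print(distance(adj, 15, 11))  # 5
```

Under this reconstruction, one shortest route from actor 15 to actor 11 runs through actors 14, 12, 19, and 3.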

Reachability measures whether actors within a network are related, either directly or indirectly, to all other actors.38 Actors who are not connected to any other actors are called isolates. With the exception of the three isolates (actors 4, 16, and 18), all of the remaining actors in figure 1 can reach one another.

Density of a network is the total number of relational ties divided by the total possible number of relational ties. There are 56 ties out of a possible 342 for the interorganisational network in figure 1, giving a density of 0.164.
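The arithmetic can be checked with a short Python sketch. The function `density` and its `directed` flag are illustrative names, not part of any network package. The figure of 342 possible ties corresponds to counting each pair of the 19 actors in both directions (19 × 18); if the ties are treated as mutual and each is counted once (28 of 171), the value is the same.

```python
def density(n_actors, n_ties, directed=True):
    """Observed ties divided by the number of possible ties."""
    possible = n_actors * (n_actors - 1)  # ordered pairs of distinct actors
    if not directed:
        possible //= 2  # each unordered pair counted once
    return n_ties / possible

print(round(density(19, 56), 3))                  # 0.164
print(round(density(19, 28, directed=False), 3))  # 0.164
```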

Density is one of the most basic measures in network analysis and one of the most commonly used notions in social epidemiology. Some network structures are particularly advantageous for certain functions. For example, dense networks are particularly good for coordination of activity among the actors (because everyone knows everyone’s business). The downside is that such networks entrench particular value systems and norms. In a classic study of family networks, Bott39 showed that loose knit networks are particularly useful if an actor wants to deviate from the norms of his or her immediate social circle.

Subgroup measures show how a network can be partitioned.

A component is a portion of the network in which all actors are connected, directly or indirectly, by at least one tie. By definition, each isolate is a separate component. There are four components in figure 1, one large component and three isolates.
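Components can be found by repeatedly collecting everything reachable from an actor not yet visited. The sketch below is a minimal illustration; the function name is hypothetical, and `EDGES` is an assumed reconstruction of figure 1 inferred from the descriptions in this glossary rather than the published Table 1.

```python
# Assumed reconstruction of the committee ties (inferred from the text, not Table 1).
EDGES = [
    (1, 2), (1, 3), (1, 6), (1, 7), (1, 9), (1, 10),
    (2, 3), (2, 9), (2, 19),
    (3, 5), (3, 7), (3, 9), (3, 10), (3, 11), (3, 19),
    (6, 7), (6, 9), (6, 11), (6, 19),
    (7, 11), (7, 19),
    (8, 19), (9, 19), (12, 14), (12, 19), (13, 19), (14, 15), (17, 19),
]
ACTORS = range(1, 20)  # 19 organisations, including those with no ties

def components(actors, edges):
    """Return a list of sets, one per connected component."""
    adj = {actor: set() for actor in actors}
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)
    visited = set()
    result = []
    for start in adj:
        if start in visited:
            continue
        member, stack = set(), [start]
        while stack:
            node = stack.pop()
            if node not in member:
                member.add(node)
                stack.extend(adj[node] - member)
        visited |= member
        result.append(member)
    return result

parts = components(ACTORS, EDGES)
isolates = sorted(next(iter(p)) for p in parts if len(p) == 1)
print(len(parts), isolates)  # 4 [4, 16, 18]
```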

A clique is a subgroup of actors who are all directly connected to one another, such that no additional network member exists who is also connected to all members of the subgroup.40 A total of 11 cliques are found in figure 1: {1,6,7}; {6,7,11}; {6,7,19}; {3,7,19}; {3,7,11}; {1,3,7}; {1,3,10}; {1,2,3,9}; {2,3,9,19}; {6,9,19}; {1,6,9}. Note the substantial amount of overlap among the actors identified in each of the cliques. An analysis of the overlapping allows the core members of the network to be identified. The core members of the network are actors 3 and 7, both of whom are in six cliques, four of which overlap.
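One standard way to enumerate such maximal subgroups is the Bron–Kerbosch algorithm, sketched below in its basic form without pivoting. This is an illustration under assumptions, not the procedure used for the published analysis: `EDGES` is a reconstruction of figure 1 inferred from the text, the function names are hypothetical, and a minimum clique size of three is assumed because that matches the cliques listed.

```python
from collections import Counter

# Assumed reconstruction of the committee ties (inferred from the text, not Table 1).
EDGES = [
    (1, 2), (1, 3), (1, 6), (1, 7), (1, 9), (1, 10),
    (2, 3), (2, 9), (2, 19),
    (3, 5), (3, 7), (3, 9), (3, 10), (3, 11), (3, 19),
    (6, 7), (6, 9), (6, 11), (6, 19),
    (7, 11), (7, 19),
    (8, 19), (9, 19), (12, 14), (12, 19), (13, 19), (14, 15), (17, 19),
]

def build_adjacency(edges):
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj

def maximal_cliques(adj):
    """Basic Bron-Kerbosch enumeration of all maximal cliques."""
    found = []

    def expand(current, candidates, excluded):
        if not candidates and not excluded:
            found.append(sorted(current))  # nothing can extend this clique
            return
        for node in list(candidates):
            expand(current | {node}, candidates & adj[node], excluded & adj[node])
            candidates = candidates - {node}
            excluded = excluded | {node}

    expand(set(), set(adj), set())
    return found

adj = build_adjacency(EDGES)
cliques = [c for c in maximal_cliques(adj) if len(c) >= 3]  # assumed minimum size
membership = Counter(actor for clique in cliques for actor in clique)
print(len(cliques), membership[3], membership[7])  # 11 6 6
```

Counting how often each actor appears across the cliques is a simple way to see the overlap that marks actors 3 and 7 as core members.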

Clique analysis is the most common technique used to identify the dense subgroups within a network. Subgroup detection has been a particularly important element in diffusion and adoption studies.41 The main network theory used in these studies is Granovetter’s (“the strength of weak ties”).24,42 This theory proposes that information spreads rapidly through densely knit subgroups because actors are strongly connected to one another and they directly share the information. Access to new information, however, comes into strongly connected groups through sources with external connections, which are likely to be weak.

One of the most well known network experiences, the small world phenomenon,43,44 combines the notions of connectivity and subgroup clustering. It is the surprising, often reported experience that everyone in the world is able to reach one another by going through a small number of others. A small world graph is formalised as a sparse network that is highly clustered, containing a large number of actors, none of whom are dominant.45 These structures can have dire consequences for the spreading of diseases, as the highly clustered structure creates a sense of isolation yet the short global divisions among clusters allow for rapid infection.46

Centrality measures identify the most prominent actors, that is, those who are extensively involved in relationships with other network members.47 Centrality indicates one type of “importance” of actors in a network: in lay terms, these are the “key” players.

Degree centrality is the number of other actors who are directly connected to ego. It signifies activity or popularity. Lots of ties coming in and lots of ties coming out of an actor would increase degree centrality. In figure 1, actor 19 has the highest degree centrality with nine direct ties, and actor 3 is the next most central with eight direct ties.
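Degree centrality amounts to counting ties per actor, as in this minimal Python sketch. `EDGES` is an assumed reconstruction of figure 1 inferred from the descriptions in this glossary (Table 1 is not reproduced here), and ties are treated as mutual, so ties in and ties out are not distinguished.

```python
from collections import Counter

# Assumed reconstruction of the committee ties (inferred from the text, not Table 1).
EDGES = [
    (1, 2), (1, 3), (1, 6), (1, 7), (1, 9), (1, 10),
    (2, 3), (2, 9), (2, 19),
    (3, 5), (3, 7), (3, 9), (3, 10), (3, 11), (3, 19),
    (6, 7), (6, 9), (6, 11), (6, 19),
    (7, 11), (7, 19),
    (8, 19), (9, 19), (12, 14), (12, 19), (13, 19), (14, 15), (17, 19),
]

degree = Counter()
for a, b in EDGES:
    degree[a] += 1  # each tie adds one direct connection to both actors
    degree[b] += 1

print(degree.most_common(2))  # [(19, 9), (3, 8)]
```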

Closeness centrality is based on the notion of distance. If an actor is close to all others in the network, a distance of no more than one, then she or he is not dependent on any other to reach everyone in the network. Closeness measures independence or efficiency. With disconnected networks, closeness centrality must be calculated for each component. For the largest component in figure 1, actors 19 and 3 again are the most central. Actor 19 is the most independent actor with a total of only 22 ties connecting it to all other organisations in the component, while actor 3 requires 25 ties.
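The totals quoted here are sums of shortest path distances from one actor to every other actor in the component. The sketch below computes that sum by breadth first search; it is illustrative only, with hypothetical function names and an edge list `EDGES` that is an assumed reconstruction of figure 1 inferred from this glossary, not the published Table 1.

```python
from collections import deque

# Assumed reconstruction of the committee ties (inferred from the text, not Table 1).
EDGES = [
    (1, 2), (1, 3), (1, 6), (1, 7), (1, 9), (1, 10),
    (2, 3), (2, 9), (2, 19),
    (3, 5), (3, 7), (3, 9), (3, 10), (3, 11), (3, 19),
    (6, 7), (6, 9), (6, 11), (6, 19),
    (7, 11), (7, 19),
    (8, 19), (9, 19), (12, 14), (12, 19), (13, 19), (14, 15), (17, 19),
]

def build_adjacency(edges):
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj

def total_distance(adj, source):
    """Sum of shortest path lengths from source to every actor it can reach."""
    steps = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nxt in adj[node]:
            if nxt not in steps:
                steps[nxt] = steps[node] + 1
                queue.append(nxt)
    return sum(steps.values())

adj = build_adjacency(EDGES)  # contains only the 16 actors of the large component
totals = {actor: total_distance(adj, actor) for actor in adj}
ranked = sorted(totals, key=totals.get)  # smallest total = closest to everyone
print(ranked[:2], totals[19], totals[3])  # [19, 3] 22 25
```

A smaller total means greater closeness; some software reports the reciprocal, scaled by the number of other actors, instead of the raw sum.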

Betweenness centrality is the number of times an actor connects pairs of other actors, who otherwise would not be able to reach one another. It is a measure of the potential for control as an actor who is high in “betweenness” is able to act as a gatekeeper controlling the flow of resources between the alters that he or she connects. Actor 19 is by far the most powerful actor in the network depicted in figure 1. All actors in the network must go through actor 19 to reach actors 8, 13, and 17, and with the exception of actors 14 and 15, all actors also must go through actor 19 to reach actor 12.
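The gatekeeping role described here can be checked by removing actor 19 and seeing who becomes unreachable. The sketch below does not compute a betweenness score; it is a simpler cut point check. `EDGES` is an assumed reconstruction of figure 1 inferred from this glossary (Table 1 is not reproduced here), and the function names are hypothetical.

```python
# Assumed reconstruction of the committee ties (inferred from the text, not Table 1).
EDGES = [
    (1, 2), (1, 3), (1, 6), (1, 7), (1, 9), (1, 10),
    (2, 3), (2, 9), (2, 19),
    (3, 5), (3, 7), (3, 9), (3, 10), (3, 11), (3, 19),
    (6, 7), (6, 9), (6, 11), (6, 19),
    (7, 11), (7, 19),
    (8, 19), (9, 19), (12, 14), (12, 19), (13, 19), (14, 15), (17, 19),
]

def build_adjacency(edges):
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj

def reachable(adj, start, removed=frozenset()):
    """Actors reachable from start when the actors in `removed` are taken out."""
    seen = {start}
    stack = [start]
    while stack:
        node = stack.pop()
        for nxt in adj.get(node, ()):
            if nxt not in seen and nxt not in removed:
                seen.add(nxt)
                stack.append(nxt)
    return seen

adj = build_adjacency(EDGES)
with_19 = reachable(adj, 1)
without_19 = reachable(adj, 1, removed={19})
cut_off = with_19 - without_19 - {19}
print(sorted(cut_off))                          # [8, 12, 13, 14, 15, 17]
print(sorted(reachable(adj, 12, removed={19}))) # [12, 14, 15]
```

Without actor 19, actors 8, 13, and 17 are left with no ties at all, while actor 12 can still be reached only by actors 14 and 15.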

These measures of centrality are purely structural measures of popularity, efficiency, and power in a network, namely that the more connected or central an actor is the more popular, efficient, or powerful. However, some actors may wield power while being on the boundary of the network. For example, some organisations within an inter-organisational network can exercise power by refusing to lend their credibility to the network. They remain on the periphery structurally, but are able to influence the direction the network takes entirely because of their size, reputation, or through the power of sanctions. Such organisations typically have considerable resources in their own right (status or authority).

To capture this complexity, the hypotheses leading the network analysis have to be specific and tailored to the context. Qualitative data alongside the quantitative analysis may be vital to a full understanding.

Role and position measures reveal subsets of actors whose relations are similarly structured.

Structural equivalence identifies actors that have exactly the same ties to exactly the same others in a network.48 In figure 1, the only actors that are structurally equivalent are actors 8, 13, and 17, all of whom are tied to actor 19 and no others. Actors 2 and 9 are very close to being structurally equivalent: both are connected to actors 1, 3, and 19, but actor 9 has one additional connection to actor 6.
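For binary, mutual ties, structural equivalence can be tested by comparing two actors' sets of direct ties, ignoring any tie between the two actors themselves. The following Python sketch is illustrative only: `EDGES` is an assumed reconstruction of figure 1 inferred from this glossary, the function names are hypothetical, and the three isolates (which trivially share an empty set of ties) are left out.

```python
from itertools import combinations

# Assumed reconstruction of the committee ties (inferred from the text, not Table 1).
EDGES = [
    (1, 2), (1, 3), (1, 6), (1, 7), (1, 9), (1, 10),
    (2, 3), (2, 9), (2, 19),
    (3, 5), (3, 7), (3, 9), (3, 10), (3, 11), (3, 19),
    (6, 7), (6, 9), (6, 11), (6, 19),
    (7, 11), (7, 19),
    (8, 19), (9, 19), (12, 14), (12, 19), (13, 19), (14, 15), (17, 19),
]

def build_adjacency(edges):
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj

def equivalent_pairs(adj):
    """Pairs of actors with identical ties to all third parties."""
    pairs = []
    for a, b in combinations(sorted(adj), 2):
        if adj[a] - {b} == adj[b] - {a}:
            pairs.append((a, b))
    return pairs

adj = build_adjacency(EDGES)
pairs = equivalent_pairs(adj)
print(pairs)  # [(8, 13), (8, 17), (13, 17)]

# Ties that keep actors 2 and 9 from being structurally equivalent.
difference = (adj[2] - {9}) ^ (adj[9] - {2})
print(difference)  # {6}
```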

Regular equivalence is a relaxation of structural equivalence.49,50 Actors who are “regularly equivalent” have identical ties to equivalent, but not necessarily identical, others. For example, two mental health agencies that provide the same services but to different clients are “regularly equivalent” but not “structurally equivalent” as they do not service exactly the same people. Regular equivalence finds actors 8, 13, and 17 as well as actors 5 and 15 to be similarly positioned. In figure 1 all of these actors are only connected to one other actor.

One might hypothesise that actors who occupy similar positions or similar roles would behave similarly. This can be a fascinating field for exploration of local social structures. For example, when contrasting networks from one place to another, we may learn that even though an actor might carry the same name as another actor (for example, “father” in the analysis of family networks or “community health service” in an investigation of community agencies), those actors may behave and relate differently in their own local contexts. Fathers in one local cultural context may occupy role positions more like mothers in another. Public sector community health agencies in one context may behave more like private for profit agencies in another. Within a single network, actors with different names may occupy similar positions. Rich opportunities for investigation are thus provided, for example, identifying and developing the potential of “natural helpers” in a community51 as a prelude to the design of a community intervention to promote health.

CONCLUSION

Different functions and types of social networks may be critical for different health outcomes at different times and at different ages and stages. We predict an expansion in the use of network analysis in health research, as researchers better appreciate the nested multilevel environments (or contexts) within which behaviour occurs. We see this as part of the new frontier of complex networks and complex interventions. An excellent overview on this is provided by Newman.52 In population health, the structure of networks and the dynamics of local processes may prove critical to understanding the way actions and interactions in local settings “cumulate into outcomes at higher levels (communities, populations).”53 Health researchers should take more opportunity to become familiar with, and more discriminating about, the way they theorise social relations and measure social structures.

Acknowledgments

PH and AS are Senior Scholars of the Alberta Heritage Foundation for Medical Research, Canada. PH holds the Markin Chair in Health and Society at the University of Calgary.
