Assessing social-emotional development in children from a longitudinal perspective
S A Denham (1), T M Wyatt (1), H H Bassett (1), D Echeverria (2), S S Knox (3)

  1. George Mason University, Virginia, USA
  2. Battelle Memorial Institute, Columbus, Ohio, USA
  3. University of West Virginia, Morgantown, West Virginia, USA

Correspondence to: Dr S A Denham, George Mason University, 4400 University Drive, Fairfax, VA 22030-4444, USA; sdenham{at}


This paper provides an overview of methodological challenges related to the epidemiological assessment of social-emotional development in children. Because population-based studies involve large cohorts and are usually multicentre in structure, they have cost, participant burden and other specific issues that affect the feasibility of the types of measures that can be administered. Despite these challenges, accurate in-depth assessment of social-emotional functioning is crucial, based on its importance to child outcomes like mental health, academic performance, delinquency and substance abuse. Five dimensions of social-emotional development in children are defined: (1) social competence; (2) attachment; (3) emotional competence; (4) self-perceived competence; and (5) temperament/personality. Their measurement in a longitudinal study and associated challenges are discussed. Means of making valid, reliable assessments while at the same time minimising the multiple challenges posed in the epidemiological assessment of social-emotional development in children are reviewed.

Accurate in-depth assessment of social-emotional functioning during development has important implications for public health from infancy to adulthood, because such functioning predicts and is associated with mental health, academic performance, delinquency, substance abuse and workplace performance.1–3 For example, when developmental milestones of social-emotional competence are not negotiated successfully, children are at risk not only for psychopathology4–11 but also for multiple behaviour problems, poor school performance and drug abuse. In contrast, the successful development of such functioning is associated with adaptive resilience in the face of stressful circumstances.12

Given the crucial nature of social-emotional development, the first aim of this paper is to describe and enumerate developmental milestones for five dimensions that have previously been found to be associated with optimal (and non-optimal) outcomes for children and even adults. The second objective is to make recommendations concerning appropriate assessments for these constructs at different developmental stages. The dimensions are: (1) social competence; (2) attachment; (3) emotional competence; (4) self-perceived competence; and (5) temperament/personality. In addition, the current discussion will address the co-action of parental influences with these various dimensions. These specific dimensions were chosen because they form a comprehensive theoretical structure of interrelated intrapersonal (eg, self-perceived competence, temperament) and interpersonal (eg, social competence) constructs,13 14 which means that they are fundamental to how people function in social and familial relationships.

To illustrate, there is ample evidence that socially competent children demonstrate more positive school behaviours and fewer diagnoses of psychopathology than children who lack social competence.9 15 There are also numerous findings that individuals with secure attachments during infancy and childhood develop positive social-emotional competence, cognitive functioning, physical health and mental health.16 17 Even academic outcomes such as focused attention and participation in class, as well as higher grade point averages, are associated with secure attachment.18 The link between secure attachment and later positive outcomes extends well into adolescence, particularly in areas of peer and romantic relationships, school competence and psychological adjustment.19–23

Links also are being found between emotional competence and children’s social competence, mental health and academic success.24 For example, both positive and negative expressiveness show measurable effects on how well children get along with peers,25 and children more adept at understanding emotion are also more socially competent from the perspective of both teachers and peers.26 In addition, both concurrent and earlier emotional expressiveness and regulation are related to adolescent social skills, prosocial behaviour, popularity, disruptive behaviour and aggression.27 Moreover, developmental change in self-perceived competence is related to changes in motivation and to school-related affect and anxiety,28 as well as to depressive symptomatology.29 Finally, infants’ and children’s negatively reactive temperaments increase their risk of behaviour problems, whereas temperamentally-based attention regulation is related to successful school functioning.30 In sum, these dimensions singly and additively predict many positive outcomes from infancy through young adulthood.

Social-emotional development does not take place in a vacuum but is substantially affected by a child’s rearing environment. The assessment of parenting practices and parent/child interactions is therefore important. For example, punitive power-assertive techniques used as a preferred mode of discipline have detrimental effects on a child’s social-emotional competence, whereas more authoritative but less authoritarian patterns are associated with more positive outcomes.31 Similar findings extend to adolescent school achievement, peer deviance and delinquency.32 33 In particular, emotion-related parenting practices also support the development of social-emotional competence.34 In determining the developmental trajectories of the social-emotional constructs to be considered here, parenting behaviour is a crucial contextual variable, knowledge of which (along with other important aspects of the environment) will greatly enhance the ability to understand change and stability in development.

Our second aim, along with discussion of the dimensions and milestones of social-emotional development across the first two decades of life, is to suggest promising means of assessing change and stability in these dimensions. Our third aim is to provide an overview of methodological challenges related to the longitudinal epidemiological assessment of social-emotional development in infants, children and young adults. As noted by Knox and Echeverria in the introduction to this special issue, population-based studies involve large cohorts and are usually multicentre in structure; because of these attributes, they have cost and participant burden issues that affect the feasibility of the types of measures that can be administered. A measure that has been demonstrated to be the “gold standard” for predicting a particular outcome in smaller studies may simply not be feasible because of the size of the cohort, the administration, training and/or coding time required, and the simultaneous need to measure many other aspects of social-emotional and other domains. Longitudinal assessment of social-emotional development in children presents additional challenges, one of which is finding measures that assess the same theoretical constructs across developmental stages. If one is trying to assess the influence of an exposure on the developmental trajectory of a factor such as social competence, then one must be sure that the same construct is being assessed at each measurement time point. Because children get older and develop cognitively between measurements, separate assessments may need to be developed for different developmental stages. Thus, special care must be taken when choosing measures to see that construct validity is maintained across time and informants. Unfortunately, much preliminary psychometric work related to longitudinal construct validity across developmental stages of social-emotional development still needs to be done. These are but two of a number of pressing methodological issues that will be addressed in this review.

In the following sections we will first outline the dimensions and their developmental milestones, illustrating the measures we have determined to be good choices for assessing children’s social-emotional competence, both theoretically and empirically (eg, via their psychometric properties). We will then describe, with specific examples, the methodological challenges of such measurement. Finally, it is important to note that the constructs and measures that we are discussing involve competencies, not psychopathology or problems. Failure to successfully navigate development in these areas can lead to both behavioural and psychological problems. However, issues related to the more detailed measurement of psychopathology are addressed in the paper by McClellan et al elsewhere in this supplement.


Any study of development must take into consideration the cognitive and maturational changes that occur as children get older. A developmental task represents a given culture’s definition of typical development at different points in the lifespan, describing the accomplishments (eg, becoming autonomous) that we expect during a particular period.35 36 Table 1 outlines the general developmental tasks that should be assessed in each dimension of social/emotional development for each developmental period, and will be referenced as we discuss each dimension and its measures. The recommendations are those of the current authors. At times one measure will tap more than one dimension; when this occurs, it will be so noted.

Table 1 General developmental tasks that should be assessed in each dimension of social/emotional development for each developmental period

Dimension 1: Social competence

The first dimension is social competence, which we define as effectiveness in developmentally appropriate social interactions.14 Each measure should address a number of age-appropriate specific skills crucial to social effectiveness. In general, however, the specific skills should include such elements as cooperation, helpfulness and the ability to resolve conflicts (ie, measures should assess effectiveness in social interaction). The types of tasks that typify these characteristics, and the expectations associated with them, will vary with age. Sufficient breadth of coverage at each age level for the various skills that illustrate such effectiveness is desirable and is illustrated in the following measures chosen by our team for the National Children’s Study.

Infancy and toddlerhood

As noted in table 1, individual differences in social interaction begin during infancy. Even babies show interest in people and social interaction with adults and children their own age. Toddlerhood marks the inception of peer interactions and relationships, along with incipient prosocial behaviours and empathy.31

Table 2 shows measures selected to assess social competence at this and the following age ranges. To obtain information on infants’ and toddlers’ social and emotional competence, the Infant-Toddler Social-Emotional Assessment (ITSEA) or its brief version (BITSEA) was chosen because no other measure for this age period captures, in one measure, so many specific aspects of social competence (eg, the measure includes scales for compliance, prosocial behaviour and peer interaction). The ITSEA also taps aspects of emotional competence, so it is also discussed in that section.37

Table 2 Recommended assessments for social competence by developmental stage


Preschool

During the preschool period, many aspects of social and emotional competence begin to blossom. Children’s interactions with peers increase in frequency and importance, but relationships with adults remain important (see table 1). Peer relationships also become more complex, with clear evidence of specific friendships and peer status. At the same time, children become capable of prosociality, making it crucial for them to regulate the emotional arousal so often attendant with peer interaction. To obtain information about these important aspects of social competence (especially prosocial interaction), the Social Competence Behaviour Evaluation (Short Form; SCBE-30) is recommended owing to its close adherence to developmental milestones of social competence (already noted), and the attention it gives to both expression and regulation of emotions (see emotional competence dimension).38 In short, the SCBE-30 conforms most closely to the construct definitions of both social and emotional competence.

Alternatives or additions noted in table 2 include the Devereux Early Childhood Assessment39 and Penn Interactive Peer Play Scale,40 which tap important aspects of social and emotional competence. The Penn Interactive Peer Play Scale was developed to measure very similar dimensions to the SCBE-30, but also to be particularly ecologically valid in its focus on play and its creation in consultation with early childhood educators and care providers. The Devereux Early Childhood Assessment assesses issues pertaining to attachment and self-control (which includes elements of emotion regulation, part of the emotional competence dimension covered here) and is very quick to complete.

During the preschool period the Social Skills Rating System, which allows for multiple informants, can also begin to be used.41 It is recommended as an adjunct to the SCBE-30 because of its longitudinal range from this age through late adolescence. The Minnesota Preschool Affect Checklist allows observers to capture a complete snapshot of children’s social and emotional competence in 20 min of observation.42 Sociometric ratings performed by children’s peers can also begin to be used in this stage of development, when peer views of social competence become both possible and useful.43 Finally, the social scales of the Berkeley Puppet Interview (ie, peer acceptance and rejection, bullied by peers, asocial with peers, social inhibition, over-aggression/hostility, relational aggression, prosocial behaviour) give the child’s own view of his/her social competence.44 These last three measures add important additional information to that obtained from parental report.

Grade school

During grade school, many aspects of social competence remain important while increasing in complexity. Children’s interactions with peers—both in terms of dyadic friendships and overall peer likeability—become absolutely crucial to their social success and subjective well-being. Successful negotiation of this stage of development is accompanied by a decrease in aggression as the child becomes more nuanced in social interaction. Furthermore, children themselves can now report (even more reliably and via pencil-and-paper measures) on important aspects of their own peer experience. It is therefore recommended that usage of the age-appropriate version of the Social Skills Rating System be continued. The Rochester cluster of social skills measures (ie, the Parent-Child Rating Scale, Teacher-Child Rating Scale and Child Rating Scale) was not the primary recommendation because it is less comprehensive than the Social Skills Rating System and edged out by that measure in terms of psychometric properties.45 46

Sociometric ratings and self-report questionnaires on loneliness, social avoidance and anxiety, social experiences of aggression and prosocial behaviour, and friendship quality are also recommended.43 These self-report measures were chosen as the most concise measures germane to children’s view of their own success or failure in the peer world.47–49


Adolescence

During adolescence many already-mentioned aspects of social and emotional competence remain important and continue to increase in complexity. Within social competence, relationships with peers of the same and opposite sex become more intimate; adolescents are balancing relationships with parents and peers, as well as an increasing need for independence. It is therefore recommended that use of the age-appropriate version of the Social Skills Rating System be continued, in addition to sociometric ratings, to the extent that these can be feasibly obtained.

Late adolescence/early adulthood

Although social competence remains important during late adolescence and early adulthood, the recommendation is to retain only measures of emotional competence, attachment and personality/self. The wealth of data already obtained across numerous years for social competence, as well as the fact that the Social Skills Rating System reaches maximum usefulness in secondary school, make further social competence assessments less meaningful. Lessening the reporting burden on newly independent young adults also increases the probability of compliance in a group that may be more difficult to retain as study participants.

Dimension 2: Attachment

Attachment begins as the deep and enduring connection established between a child and his/her caregiver in the first several years of life. This ability to form a positive attachment with the primary caregiver reverberates throughout a child’s life, becoming a foundation for his/her ability to form close relationships with others. Furthermore, the properties of childhood attachment and adult attachment are much the same and show similar characteristics.50

Several properties of attachment are found throughout early childhood, adolescence and even into adulthood. “Proximity seeking behaviour” is an individual’s attempts to remain within a self-defined protective range. This range is constricted during threatening situations where closer proximity to the attachment figure is needed. In the “secure base” phenomenon, the presence of the attachment figure fosters security and leads to exploration. “Separation protest” is typically manifested when there is a threat to the accessibility of an attachment figure which leads to protest and attempts to avoid separation. In fact, attachment feelings and orientation towards the attachment figure during such times of threat reflect the quality of attachment referred to as “elicitation by threat”. Children as well as older individuals also attempt to substitute other figures as their secure base at times; where this attempt is unsuccessful, even when quality of care and attention is equivalent, “specificity of the attachment relationship” is seen. “Inaccessibility to conscious control” refers to how feelings of attachment and separation protest persist even after permanent separation (eg, death). Furthermore, for some individuals, attachment has been found not to wane through habituation. Separation produces “persistence”, which can be identified by pining; it only slowly abates and does not desist but, when prolonged, is incorporated into a despairing outlook. The final unique characteristic of the attachment relationship is “insensitivity to the attachment figure’s behaviour”, where the attachment persists even where the attachment figure’s behaviour is abusive. This problem can result in the association of feelings of anger or miscue with attachment feelings which may give rise to internal and interpersonal conflict.

The security of attachment relationships appears to be the most important aspect of attachment for overall functioning. As already noted, this security is often related to competence in other domains of social, emotional and even cognitive development. To measure attachment at differing age periods, it is recommended that parent-toddler/child relationships be measured through early childhood. During middle childhood and adolescence, relationships with peers should be examined and, finally, towards the end of adolescence, attachment with romantic partners can be assessed.50

Infancy, toddlerhood and preschool

Clearcut attachments to caregivers find their inception during the first 2 years of life and remain important thereafter (table 1).51 Table 3 shows measures selected to be used to assess attachment at this and subsequent age ranges. To measure attachment in infancy within epidemiological studies, the Attachment Q-sort (AQS) is strongly recommended with mothers as informants to avoid the lengthy training and observation times required of independent observers.52 Maternal AQS sorts have been shown to have validity in US samples.53 54

Table 3 Measures to be used to assess attachment across measure technique and across each developmental period

During the preschool period, attachment relationships to caregivers other than parents are often developed, but those with parents remain paramount. An additional Attachment Q-sort rating by mothers at 3.5 years of age is therefore recommended. A different assessment method for attachment in this age range uses the child’s perspective and is referred to as the Narrative Story Stem Test.55 This measure shows much promise in predicting children’s behavioural strengths and difficulties,56 but requires extensive coding and training, rendering it most useful for substudies rather than as a core measure in large epidemiological studies. The Student-Teacher Relationships Scale, while interesting in terms of examining non-parental attachment relationships, would also be more useful for substudies.57 Finally, the Devereux Early Childhood Assessment mentioned earlier during the discussion of social competence includes an attachment subscale and constitutes the minimum that should be completed by mothers and teachers.

Grade school and adolescence

As middle childhood progresses into adolescence, children remain close to parents but also to teachers and peers and, during the later part of this period, they begin to explore romantic relationships (table 1). Children of this age are more capable of reporting on their own security of attachment and their view of different attachment relationships.

We therefore examined the existing self-report measures of attachment to parents and peers usable for this age range. The Attachment Security Scale58 and Inventory of Parent and Peer Attachment are recommended because they both have adequate psychometric properties as well as the ability to assess attachment to both parents (mothers and fathers) and peers.59 For those moving into relatively stable romantic relationships later in adolescence and in young adulthood, the Hazan and Shaver Scale60 is recommended; it is widely used in research and well validated. Other attachment questionnaires exist, but they are far longer.

Dimension 3: Emotional competence

The third dimension relevant to social/emotional development is emotional competence, which we define as the multifaceted ability strategically to be aware of one’s own and others’ emotions and to act on this awareness, to negotiate interpersonal exchanges and regulate emotional experience.61 Constituent elements of emotional competence include abilities to: (1) express and experience a broad variety of well-modulated, but not incapacitating, emotions; (2) regulate the experience and expression of emotion when “too much” or “too little” emotional experience or the expression of emotions interferes with one’s intrapersonal or interpersonal goals; and (3) understand one’s own emotions as well as those of others. Given this definition, recommended assessment measures at all age periods include expression and experience, regulation and understanding of emotions.13

There may be a certain amount of overlap between skills related to emotional competence and those associated with social competence, simply because all aspects of social interchange involve emotion. Skills of emotional competence such as understanding others’ emotions may support social competence.62

However, we consider it important theoretically and methodologically to differentiate the elements of emotional competence from other competencies because they are uniquely predictive of optimal interpersonal and intrapersonal functioning.13 Furthermore, not all of the emotional competencies—for example, regulating internal experience of emotion (as opposed to its outward expression) or understanding one’s own emotion—necessarily relate to social experience. For example, although a child’s control of her nervousness at a piano recital does serve to make a better “presentation of self” to the audience, her parents and her teacher (via her displayed emotions), it also serves an arguably more important function not related to social interaction, namely that of allowing her to continue to function well during the recital and to feel self-confident about her abilities afterwards. Correctly identifying such anxiety in oneself is probably an important prerequisite to successful emotional regulation. In sum, although there are connections between social and emotional competence, they possess unique aspects that render it important to measure both.

Another source of overlap in the related dimensions described here is between the expressiveness aspects of emotional competence and the reactivity component of temperament; both involve emotional response. At the same time, emotion regulation and the regulatory component of temperament also share features.63 64 Some children high on the temperament dimension of negative affectivity can be easily angered in many situations. Others high on this temperament dimension are also anxious and fearful in new situations and become easily saddened.

Surgency is an aspect of temperament associated with, among other things, positive emotional expressiveness and high level pleasure. Effortful control is an aspect of temperament associated with the abilities to focus and shift attention voluntarily, and to disengage attention from one’s own perspective to attend to another’s; it may be seen as fundamental to emotion regulation.65 Despite these associations of temperament with emotional expressiveness and regulation, there are important distinctions between the two dimensions. For example, the reactivity aspect of temperament emphasises intensity (eg, time to peak expression, frequency, duration, recovery time) that seems to be a general aspect of personality and is not limited to specific emotions. Conceptualisation and measurement of emotional competence, in contrast, more often focus on the specific emotions experienced or expressed by the child, along with contextual parameters (eg, to whom the emotion is expressed, when it is suppressed, specific strategies used for regulation, etc).

It is important to note these links among the dimensions of social-emotional development in order to make difficult decisions about which measures to include and exclude in a large longitudinal epidemiological study. Given the aforementioned considerations, it seems that attention needs to be given to all three of these overlapping dimensions of social-emotional development: social competence, emotional competence and temperament.

Infancy, toddlerhood and preschool

Children are capable of expressing all the “basic” emotions (ie, happiness, sadness, anger and fear) by the end of toddlerhood. They also evidence clearly discernible emotionally expressive styles, as well as the rudiments of emotion understanding (eg, shown in social referencing of caregivers’ emotions) and regulation (although regulation needs to be supported by adults). Table 4 shows measures recommended for assessment of emotional competence at different developmental stages. To obtain information on infants’ and toddlers’ emotional competence, the Infant-Toddler Social-Emotional Assessment or its brief version is recommended because it allows the assessment of two dimensions with one instrument. Along with its scales on social competence (previously mentioned), it taps several aspects of emotional competence (eg, emotional positivity, emotional awareness and negative emotionality).

Table 4 Measures to be used to assess emotional competence across domain and across each developmental period

During the preschool period the elements of emotional competence (ie, expression, understanding and regulation) emerge even more distinctly and adult-independent emotion regulation appears. Children express more nuanced emotions such as ambivalence as well as social emotions like guilt, shame and empathy. They understand many basic expressions and states of emotion and show rudiments of understanding others’ unique perspectives on emotional experience.13 With this evolving emotional maturity (table 1), obtaining a differentiated view of emotional competence becomes easier and its components are more easily assessed.

Emotional expression can be tapped via parent report on subscales of the Negativity and Surgency components of the Rothbart Child Behavior Questionnaire (also important for Temperament assessment, with the advantage of parallel measures across ages and informants),66 direct assessment via the Emotion Matters protocol67 and observers’ completion of the Minnesota Preschool Affect Checklist.42 For emotion knowledge, the Affect Knowledge Test is recommended as a direct assessment with the child; this aspect of emotional competence is predictive of many later social outcomes.68 69

For emotion regulation, teachers can complete the self-control scale of the Devereux Early Childhood Assessment and the Emotion Regulation Checklist.39 70 Information on emotion regulation can also be obtained via parent report on the Rothbart Child Behavior Questionnaire effortful control component,66 direct assessment via the Emotion Matters protocol67 and observers’ completion of the Minnesota Preschool Affect Checklist.42 It is notable here that one parental questionnaire (the Rothbart Child Behavior Questionnaire), two very short teacher questionnaires (the Devereux Early Childhood Assessment and the Emotion Regulation Checklist), one observer checklist (the Minnesota Preschool Affect Checklist), and one direct assessment (Emotion Matters) can yield important detailed information on both emotional expressiveness and regulation, central components of emotional competence.
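
As an illustrative sketch only, the preschool battery just described can be tabulated by informant so that the reporting burden on each informant is explicit. The instrument names, informants and components are taken from the text above; the `Instrument` class, its field names and the helper functions are hypothetical and introduced purely for illustration. The Affect Knowledge Test is left out because it targets emotion knowledge rather than expressiveness or regulation.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Instrument:
    """Hypothetical record for one measure in the battery."""
    name: str
    informant: str          # who completes or administers the measure
    components: frozenset   # components of emotional competence it taps


# Preschool measures of expressiveness and regulation, as listed in the text.
PRESCHOOL_BATTERY = [
    Instrument("Rothbart Child Behavior Questionnaire", "parent",
               frozenset({"expression", "regulation"})),
    Instrument("Devereux Early Childhood Assessment (self-control scale)",
               "teacher", frozenset({"regulation"})),
    Instrument("Emotion Regulation Checklist", "teacher",
               frozenset({"regulation"})),
    Instrument("Minnesota Preschool Affect Checklist", "observer",
               frozenset({"expression", "regulation"})),
    Instrument("Emotion Matters protocol", "direct assessment",
               frozenset({"expression", "regulation"})),
]


def instruments_per_informant(battery):
    """Count how many instruments each informant is asked to complete."""
    return Counter(item.informant for item in battery)


def informants_for(battery, component):
    """List the informants who contribute data on a given component."""
    return sorted({item.informant for item in battery
                   if component in item.components})


counts = instruments_per_informant(PRESCHOOL_BATTERY)
for informant, n in sorted(counts.items()):
    print(f"{informant}: {n} instrument(s)")
print("expression covered by:", informants_for(PRESCHOOL_BATTERY, "expression"))
print("regulation covered by:", informants_for(PRESCHOOL_BATTERY, "regulation"))
```

A tabulation of this kind makes it easy to check, when a measure is added or dropped, that each component is still covered by more than one informant and that no single informant carries a disproportionate share of the burden.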

Grade school

At this age the elements of emotional competence (ie, expression, understanding and regulation) remain important. In terms of emotional expression, subtlety—or “the cool rule”—becomes the norm (table 1). Cognitive emotion regulation strategies (eg, “thinking about something else”) are at times independently used. The unique emotional perspectives of others, as well as cultural display rules, are now capable of being understood.

In terms of expression/experience and regulation of emotion, we recommend obtaining reports from parents, teachers and the children themselves. Thus, for expression of emotions, parents and teachers can complete the Emotional Expressiveness and Affect Intensity Scales71 72 as well as age-appropriate Rothbart scales of temperament. Children can begin to complete the How I Feel Scale,73 the Positive and Negative Affect Scale,74 as well as the Test of Self Conscious Affect and the Bryant Empathy scale.75 76 Given children’s new-found use of display rules and more subtle expression of emotions, it is important to get their own report of emotional experience. Among measures of emotion understanding, the Kusché Affect Interview was chosen as the most comprehensive.77

Adolescence/early adulthood

Emotional expression, regulation and understanding remain important and are often increasingly subtle and sophisticated (table 1). More and more aspects of emotional competence are obtainable via self-report, with parent and teacher report becoming far less important. The Emotional Expressiveness Scales, the Positive and Negative Affect Scales and the Affect Intensity Scale,72 74 all of which have been obtained from various reporters since grade school and remain among the most accessible and well-used measures of emotional expressiveness and experience, should be completed by the adolescent/young adult only (rather than by parents and/or teachers). The Test of Self Conscious Affect can also still be used and, new for the period, the Rothbart Early Adolescent (or Adult) Temperament Questionnaire-Revised is recommended. The Bryant Empathy Scale is still usable, but only early in adolescence.

For emotion regulation, adolescents and young adults alike can provide important information via the Trait Meta-Mood Scale and Berkeley Emotion Regulation Scale.78 These two scales were chosen from a number of emerging scales, not only because of their useful psychometric properties but also for their conceptual clarity.

The more comprehensive and increasingly well regarded Mayer-Salovey-Caruso Emotional Intelligence Test and its youth version79 could also be used as an emerging means of measuring all these aspects of emotional competence, including understanding. Emotion knowledge is also assessed via the Toronto Alexithymia Scale.80 81

Dimension 4: Self-perceived competence

Children’s self-perception of competence is a multidimensional construct that increases in complexity and differentiation with age.82 83 It is defined as one’s evaluations of one’s own abilities, including the child’s own assessment of his/her cognitive, physical and social abilities, especially in comparison with those of others. Logically, evaluations by peers and teachers contribute to these self-evaluations of abilities, and thus evaluations by others are associated with children’s self-perceived competence.84 Self-perceptions are important because they influence the child’s task motivation and performance.85 86

Self-perceived competence is distinguishable from both self-esteem and self-concept. Specifically, self-esteem is a global affective evaluation of the self and can be difficult to measure with adequate psychometric reliability and validity owing to the number of different evaluative components lumped into one overall index.87 A few measures with good psychometric properties do exist for assessing self-esteem in children and adolescents, but the literature convincingly portrays the better specificity and predictive power of “self-perceived competence”. Self-concept, when used appropriately as a term, refers to the descriptive components a child or adolescent would use to answer the question “Who am I?” (eg, “I am a girl, I do well in school, I live in Maryland and own a poodle”). Such descriptions, although inherently interesting and changing across time, are probably not as germane as self-perceived competence to the study of social/emotional developmental trajectories, especially when the goal is to study the effects from a broad range of exposures on development.

Assessment of self-perceived competence is therefore specifically recommended instead of self-concept or self-esteem, not only because of the arguable flaws in measuring those constructs87 88 but also because self-perception measures are more domain-specific. For example, a child may feel quite competent in one area (eg, cognitive performance) but inadequate in another (eg, peer relations). Furthermore, these measures are themselves important reflections of developmental outcomes but are also predictors of future outcomes, as noted earlier (table 5).

Table 5 Measures to be used to assess self-perceived competence/temperament/personality across measure technique and across each developmental period


Preschool children begin to show differentiated self-perceptions (table 1) which could be assessed with the Berkeley Puppet Interview.44 For example, the academic scales of the Berkeley Puppet Interview can be administered in one additional 20 min interval; together with the Berkeley Puppet Interview scales already administered regarding self-perceived social competence, they form a complete evaluation of self-perceived competence.

Grade school and adolescence

It is during middle childhood that earlier notions of self-perceived competence are solidified.20 Views of self become much more complex and associations with school success and psychological health become more pronounced.28 29 Self-perceived competence should be assessed via the Multidimensional Self Concept Scales.89 There are many scales that purport to tap important aspects of children’s self-esteem and self-perceived competence, but the Buros Mental Measurement Yearbook notes that the Multidimensional Self Concept Scales are among the very best validated and conceptually substantiated of all those available.

Dimension 5: Temperament/personality

The fifth dimension important to social/emotional development is temperament/personality. Temperament is defined as individual differences in reactivity and self-regulation assumed to have a constitutional basis; also “the characteristic phenomena of an individual’s emotional nature, including his susceptibility to emotional stimulation, his customary strength and speed of response, and the quality of his prevailing mood, these phenomena being regarded as dependent upon constitutional make-up”.90 Current theoretical and empirical views of temperament emphasise these reactivity and regulation dimensions. We have already noted the overlap of this dimension with aspects of emotional competence but argue that, despite this overlap, much important unique variance remains and thus the dimension should be tested.

As noted earlier, emotional reactivity specifically refers to the speed and intensity with which individuals respond to stimulation.91 The construct of executive function, also related to temperament, refers to the processes that affect the initiation, inhibition or modification of behaviour, including those related to effortful control.92 Although maturation contributes to the growth of executive control, young children vary in this capacity. Such individual differences led Rothbart and Bates to describe effortful control, a component of executive function, as an important core temperament characteristic.93

Rothbart and colleagues often refer to these two overarching dimensions of temperament as reactivity and regulation. In general, reactivity is related to negative outcomes whereas regulation/effortful control is most often related to positive outcomes, particularly in interaction with environmental factors.63 Because of their clear theoretical and empirical value, temperament measures of reactivity and regulation are recommended.

Starting in the neonate, the biological predispositions of temperament are modified by environmental influences; gene × environment interactions are the rule rather than the exception. Although temperament is considered generally to be “constitutional” in origin, it has been demonstrated in animal research that the physiological processes underlying reactivity can be permanently altered by early rearing practices94 and by stress, illustrating the importance of gene × environment interactions.95 Aspects of these biologically-based temperamental characteristics can change at varying time points from infancy through the adolescent period; they may increase, diminish or be silenced altogether, based on the pattern of reinforcement, parental nurturance and other exposures during development. The processes by which these changes occur include learning processes, environmental elicitation, environmental construal, social comparison processes, environmental selection and environmental manipulation.96

During this process, aspects of temperament become elaborated into individual difference characteristics more similar to adult dimensions of personality.96 97 These individual difference characteristics include (but are not limited to) sociability, social inhibition, dominance, negative emotionality, aggressiveness, prosocial disposition, persistence/attention, mastery motivation, inhibitory control and activity level.

During childhood these individual difference characteristics begin to mature into cognitive and affective representations that are quickly and frequently activated (ie, personality traits). Personality is defined as “the dynamic organisation within the individual of those psychophysical systems that determine his unique adjustments to his environment”.96 98 There is evidence that child/adolescent personality dimensions are associated with—and become increasingly similar to—the “Big Five” personality traits in adulthood which include the following dimensions:97

  • Extroversion: active, assertive, energetic, enthusiastic, outgoing, surgent and talkative versus inwardly oriented, requiring a lower level of stimulation and reserved.

  • Agreeableness: appreciative, forgiving, generous, kind, sympathetic and trusting versus hostile, selfish, unsympathetic, uncooperative, rude and mistrustful.

  • Conscientiousness: efficient, organised, planful, reliable, responsible, thorough, able to delay gratification and has high aspirations versus careless, negligent and unreliable.

  • Neuroticism: anxious, self-pitying, tense, touchy, unstable, worrying and moody.

  • Openness to experience or intellect: artistic, curious, imaginative, creative, has wide interests and insightful versus shallow and imperceptive.

During adulthood these dimensions of personality or closely-related demarcations are differentially associated with successful adaptation including academic attainment, work competence, rule-abiding versus antisocial conduct and romantic and friend relationships.99 100 As yet, it remains difficult to pinpoint exactly how childhood/adolescent personality affects these later outcomes. Longitudinal designs and more dynamic models of personality development are needed to resolve such process-oriented questions.101

It is obviously difficult to disentangle temperament and personality because they are inter-related. On the one hand, temperament is seen as more biologically-based and is most often studied in infants and children. On the other hand, it can be assessed through adulthood, and research suggests links between temperament dispositions and the Big Five personality factors.64 It is therefore difficult to specify any excision of one of these constructs or the other to save time and/or money. Based on the process of change outlined above, however, temperament may be more important in younger children, whereas the emergence of specific personality traits may take precedence in adolescents.

Infancy, toddlerhood and preschool

Infants show distinct patterns of self-regulation and reactivity and, through the toddler period, moderate continuity is seen in these dimensions (table 1). The regulatory aspects of the temperament of toddlers become more and more important due to anterior brain development. Temperamental patterns begin to be consolidated slowly into personality during the preschool period.

Table 5 shows measures selected to be used to assess temperament at this and the following age ranges. A number of measures are available for measuring temperament, usually overlapping in content to a great extent. The Rothbart Scales were chosen for two reasons: (1) the questionnaires are derived from documented neuroscientific findings and take an integrative approach, cutting across social and cognitive areas with parallel measures available from infancy to adulthood; (2) the item content of the questionnaires best fits the important social-emotional constructs put forward here, particularly in their emphasis on reactivity, regulation and the relations of the reactivity and regulation with personality.

Grade school, adolescence and early adulthood

From grade school through early adulthood, continuity of temperament and personality remains evident and aspects of personality become more differentiated, although regulatory and reactivity components of temperament remain important (table 1). The Big Five Questionnaire for Children (see also parallel measure for adults) was therefore chosen because of its parallel with well-studied dimensions of adult personality,102 excellent psychometric properties and its ability to obtain information from parent, teacher and self-reports. Temperament can still be well assessed via the Rothbart scales throughout the period.

Influence of parenting

Although it is important to evaluate the social and emotional status of children and adolescents, the behaviour of adults is also pivotal because of its influence on the development of these attributes. It is therefore necessary to identify elements of parenting that are important in fostering or hindering social-emotional competencies across developmental epochs. Examples of socialisation dimensions include both those related to general parenting practices (positive and negative) as well as those more specifically related to socialisation of emotional competence.

Commonly accepted dimensions of parenting which have been found to contribute to later child and adolescent outcomes103 include the styles of warmth (eg, affection and sharing activities) and limit-setting (eg, structuring the child’s environment and having “house rules”). Reasoning/inductive discipline and power assertive/punitive discipline are important specific parenting practices. These parenting styles and practices, in interaction with other factors such as temperament, influence important child and adolescent outcomes, although there are suggestions that some of the effects may be culture-specific.104

At the same time, there are even more specific important parenting practices related to the socialisation of emotional and social competencies.13 105 These dimensions include parents’ openness to teaching about emotions, reacting to children’s emotions and expressing emotions around the child.

Efforts by parents to teach about emotions are related to how well children understand emotions.106 The ability to differentiate emotions is something that children are not born with, but learn through parenting, culture and other influences. This ability remains important throughout the lifespan, as seen in alexithymia (ie, the inability to describe and differentiate feelings on a psychological level),107 which is associated with an increase in somatisation in adults. Teaching about emotions can be carried out via parent-child reminiscences concerning emotional experiences the child has had, especially negative ones, or through helping a child label his/her emotions (eg, if a child who is on his way to the doctor starts complaining that his tummy is upset, the parent can say, “I think you are feeling anxious about the shot you are going to get and sometimes being anxious can make your tummy upset”). These types of conversations can be seen as impacting children’s developing emotion knowledge, but most especially their “emotional self-concept”, in that they contribute to children’s abilities to define the emotional self (“this is the kind of emotional person I am”), to define the emotional-self-in-relation (“this is how I express and share my emotions with others”) and to understand their own emotion regulation (“this is how I cope with and resolve negative emotion”).

Reactions to children’s emotions are also an important dimension because they influence children’s expressiveness and emotion knowledge108—that is, supportive reactions are generally positively related to aspects of emotional competence and punitive reactions are generally negatively related. A parent’s own emotions are also an important dimension to consider because these emotions form the affective environment in which the child is being raised and are related to children’s own expressive styles and emotion knowledge.109

Preschool and grade school

During these age periods, obtaining information on the behaviour of socialisation agents is crucial. Specifically regarding their emotion socialisation behaviour, parent-report questionnaires were chosen about parental reactions to emotions, modelling of emotional expressiveness and teaching children about emotion (table 6). These measures were chosen in part because of their established psychometric utility, the dearth of alternatives and because they are usable across several years of the children’s lives. They include the Coping with Children’s Negative Emotions Scale, Self Expressiveness in the Family Questionnaire and Toronto Alexithymia Scale.80 81 108 110 Age-specific emotion teaching scales include the Emotion-Related Beliefs and Emotional Styles questionnaires;111 112 these are no longer applicable after the preschool period.

Table 6 Measures to be used to assess parenting, overall childrearing practices and the socialisation of emotional and social competence across each developmental period

With reference to overall parenting behaviour, the Parent Practices Questionnaire was chosen for its valid and reliable demonstration of commonly cited dimensions of parenting (ie, authoritative, authoritarian and permissive)113 as well as the ability to use the questionnaire through grade school; several potential alternatives are noted. The Parenting Feelings Questionnaire is also recommended at this age period because of the importance of parent affect.114


At this age period it remains important to obtain information on the behaviour of socialisation agents. The measures already noted for the grade school period remain usable, except for the Coping with Children’s Negative Emotions Scale and the Emotional Styles Questionnaire, which are no longer age-appropriate. Finally, the Parenting Practices Scales of Robinson et al113 should be replaced by the Steinberg measure,115 again because of the importance of accessing age-appropriate item content. Furthermore, the Steinberg measure is adolescent report, acknowledging the importance of the adolescents’ newly independent views of their social surround.


Having outlined the domains of social-emotional development and parental socialisation, along with attendant assessment possibilities, it is important to reflect upon important methodological issues that need attention when these constructs are assessed in large-scale studies.

Continuity of assessment

Continuity of assessment is the first issue to consider when selecting assessment measures in any domain and was considered carefully in the choice of measures included here. Developmentalists grapple with the issue of whether the validity is homotypic (ie, construct defined and expressed in the same way across time and assessed by a very similar measure such that the earlier measures have predictive validity for the later ones) or heterotypic (ie, construct defined and expressed in a dissimilar way across time, necessitating new measures). Ideally, similar measures would provide continuous assessment of progress from the age of 1 year through to 21 years of age, and it is true that we tried to find measures or families of measures such as the Social Skills Reporting System, the Rothbart Temperament Scales, the Multidimensional Self-Concept Scales, the Mayer-Salovey-Caruso Emotional Intelligence Test, the Positive and Negative Affect Scale, and several of the parenting measures that could be used across several assessment points.66 74 78 89 116

However, the milestones of different developmental stages often require that different instruments be used to assess the same construct at different ages so that validity is, at least to some extent, heterotypic. For example, the qualitative changes in emotion knowledge from preschool through adolescence require differing assessment content: preschool children are working on basic understanding of the causes and expressions of emotions like happiness, sadness and fear, whereas adolescents are trying to make sense of much more subtle issues, including complex emotions like guilt and shame and others’ expression of blended emotions. Moreover, the ability to use certain measurement techniques changes with a child’s age. For example, to access children’s emotional expressiveness, it may be useful (and, arguably, necessary) to observe them during the preschool period, but not until grade school can self-report of emotional experience be easily obtained. Even within self-report (eg, within the dimension of self-perceived competence), developmental abilities may dictate specific methodologies, as with the Berkeley Puppet Interview used during the preschool period and the paper-and-pencil Multidimensional Self Concept Scales which children can begin to complete during grade school. Thus, both construct-based and methods-based continuity of assessment is often ill-advised and even impossible, given the very developmental nature of the data and questions asked of it. We must find ways to work within these constraints.

Type I and type II errors

Another set of problems related to the continuity of assessment concerns conditions which elicit type I and type II errors. False positives and false negatives are always important issues with large data sets containing many measures, but are of particular concern in longitudinal studies where measures may be repeated by the same respondent at intervals short enough that respondents remember their answers (eg, on the Self Expressiveness in the Family Scale or the Coping with Children’s Negative Emotions Scale, both of which assess emotion-related aspects of parenting at contiguous age periods). In this case, “real” change would be hard to detect; there might appear to be no change where actual change exists (type II error or “false negative” findings).

Conversely, type I errors (false positive findings) can occur when the novelty of taking the test experienced during the first test period does not occur at the second because the examinee has habituated to the test’s requirements (eg, the Berkeley Regulation Measure and Trait Meta-Mood Scales, which focus on one’s private emotions with items like “I’m ashamed of my mood”, could seem very novel at their first completion). If there has been a relatively short lapse of time, the test score at the second session could appear better (ie, show improvement) when, in fact, the better score was based on test sophistication or increased ease at being asked about such things, not a change in emotion regulation. The subsequent measure, affected by the decrease in novelty experienced, could again make “real” change difficult to pinpoint.

Fortunately, in longitudinal studies spanning many years, these issues associated with repeated measures tend to be less of a problem simply because the time period between assessments tends to be quite long (often years rather than weeks or months). When temporal spacing of assessments is not a good solution, another option is the use of multilevel modelling in the analyses of results. It is crucial to note that, in repeated measures, observations within the child are generally correlated; this correlation is called intraclass correlation.117 One of the assumptions of general linear models (GLM; eg, ordinary least squares regression and analysis of variance) is the independence of observations; thus, using GLM when the intraclass correlations are present creates serious problems including underestimation of standard errors.
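To make the intraclass correlation concrete, the following is a minimal sketch (not taken from the studies cited here) that estimates a one-way intraclass correlation from repeated scores and converts it into the standard design-effect factor, 1 + (k - 1) x ICC, by which within-child correlation inflates the variance of a mean. The scores and function names are invented for illustration, and an equal number of observations per child is assumed.

```python
from math import sqrt
from statistics import mean, stdev


def icc_oneway(scores_by_child):
    """One-way ICC(1); assumes every child has the same number of observations."""
    n = len(scores_by_child)          # number of children
    k = len(scores_by_child[0])       # observations per child
    grand = mean(x for child in scores_by_child for x in child)
    child_means = [mean(child) for child in scores_by_child]
    ms_between = k * sum((m - grand) ** 2 for m in child_means) / (n - 1)
    ms_within = sum(
        (x - m) ** 2
        for child, m in zip(scores_by_child, child_means)
        for x in child
    ) / (n * (k - 1))
    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)


def design_effect(icc, k):
    """Factor by which within-child correlation inflates the variance of a mean."""
    return 1 + (k - 1) * icc


# Hypothetical data: four children, each assessed three times.
scores = [[10, 11, 12], [20, 21, 22], [30, 31, 32], [40, 41, 42]]
icc = icc_oneway(scores)
deff = design_effect(icc, k=3)

all_scores = [x for child in scores for x in child]
naive_se = stdev(all_scores) / sqrt(len(all_scores))  # treats all 12 scores as independent
adjusted_se = naive_se * sqrt(deff)                   # allows for clustering within child
```

In this invented example almost all of the variance lies between children, so the naive standard error is considerably smaller than the adjusted one; this is the underestimation described above.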

Underestimation of standard errors may lead to type I errors.118 For example, ordinary least squares regression may lead to the conclusion that there is an age-related change in self-perceived social competence when in fact the difference could be attributed to chance. In order to estimate correct standard errors in repeated measures, both time (the individual observations) and child levels need to be allowed for. Because individuals are observed at multiple time points, repeated measures can be considered as a special case of nesting of observations within a child; thus, there are at least two levels in repeated measures. By creating a multilevel model in which individual observations are set as level 1 (within-individuals) in each child (level 2: between-individual), both the time and child levels will be included in the analysis. Another advantage of using multilevel modelling in repeated measures is that it can handle missing data without using a listwise or casewise deletion of records.117
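The two-level structure just described can be written, as a sketch only, in conventional multilevel notation. The symbols are not drawn from the text: Y is the repeated outcome (eg, self-perceived social competence), t indexes occasions, i indexes children, and age is used as an assumed time-varying predictor.

```latex
% Level 1 (within child): score of child i at occasion t
Y_{ti} = \pi_{0i} + \pi_{1i}\,\mathrm{age}_{ti} + e_{ti}, \qquad e_{ti} \sim N(0, \sigma^2)

% Level 2 (between children): child-specific intercepts and slopes
\pi_{0i} = \beta_{00} + r_{0i}, \qquad \pi_{1i} = \beta_{10} + r_{1i}

% Intraclass correlation in the intercept-only version of the model
\rho = \frac{\tau_{00}}{\tau_{00} + \sigma^2}, \qquad \tau_{00} = \mathrm{Var}(r_{0i})
```

Here the fixed effect for age carries the average age-related change, and its standard error reflects both the within-child and between-child variance components rather than treating every observation as independent.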

Item response theory (IRT)

Item response theory (IRT), rather than classical test theory (CTT), may be at least a partial answer to both ensuring continuity of assessment and minimising type I and II errors. It provides several advantages over CTT methods for constructing tests and examining measurement equivalence. Unlike CTT item statistics, which depend on the subset of items and persons examined, IRT item and person parameters are invariant. This invariance makes it possible to examine the contribution of items individually as they are added and removed from a test. Moreover, IRT allows researchers to calculate conditional standard errors of measurement based on a test information function rather than assuming an average standard error across all trait levels, as in CTT. This error calculation allows researchers to select items that provide maximum measurement precision in a particular trait range.119
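The idea of a conditional standard error can be illustrated with a minimal sketch. It assumes a two-parameter logistic (2PL) model for yes/no items, chosen here only for simplicity; rating-scale instruments such as those discussed in this paper would need a polytomous model. The item parameters are invented.

```python
from math import exp, sqrt


def prob_endorse(theta, a, b):
    """2PL probability of endorsing an item with discrimination a and location b."""
    return 1.0 / (1.0 + exp(-a * (theta - b)))


def item_information(theta, a, b):
    """Fisher information contributed by one 2PL item at trait level theta."""
    p = prob_endorse(theta, a, b)
    return a * a * p * (1.0 - p)


def conditional_se(theta, items):
    """Conditional standard error: 1 / sqrt(test information at theta)."""
    info = sum(item_information(theta, a, b) for a, b in items)
    return 1.0 / sqrt(info)


# Hypothetical (a, b) parameters for a four-item scale.
items = [(1.0, -1.0), (1.5, 0.0), (1.2, 0.5), (0.8, 2.0)]
for theta in (-2.0, 0.0, 2.0):
    print(f"theta={theta:+.1f}  SE={conditional_se(theta, items):.2f}")
```

Because most of these invented items are located near the middle of the trait range, measurement is more precise there than at the extremes, which a single average standard error would conceal.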

Based on these principles, IRT could help in calibrating items, leading to high-quality item pools that measure latent traits equivalently across developmental stages. Moreover, different scales created from the item bank can be placed on a common metric and scores are interval rather than ordinally scaled. Our dimensions can be seen as latent traits for which item pools can be created; thus, for example, an item pool for emotion knowledge could be created, with equivalence across time of items from both the direct assessments of the puppet-based Affect Knowledge Test, the Kusché Affect Interview and the self-report Toronto Alexithymia Scale. Specific items could be selected from the item pools to best match subjects’ latent trait on the dimensions of interest; only those items that are appropriate for a specific developmental state of a child could be used, leading to precise measurement and minimising the risk of a child receiving the same set of items at two consecutive measurement time points. IRT has other benefits which will be addressed when discussing other methodological issues. However, it is important that these types of measures also have established criterion validity before use in epidemiological studies.
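Selecting items from a calibrated pool could work roughly as follows. This is a sketch under strong assumptions: a 2PL model, invented item identifiers and parameters, and a simple rule of choosing the most informative items at the child's estimated trait level while skipping those administered at the previous wave.

```python
from math import exp


def item_information(theta, a, b):
    """Fisher information of a 2PL item at trait level theta."""
    p = 1.0 / (1.0 + exp(-a * (theta - b)))
    return a * a * p * (1.0 - p)


def select_items(theta, pool, n_items, already_seen=()):
    """Pick the n_items most informative items at theta, skipping ones already administered."""
    candidates = [item for item in pool if item["id"] not in already_seen]
    ranked = sorted(
        candidates,
        key=lambda item: item_information(theta, item["a"], item["b"]),
        reverse=True,
    )
    return [item["id"] for item in ranked[:n_items]]


# Hypothetical calibrated pool for emotion knowledge; ids and parameters are invented.
pool = [
    {"id": "label_happy_face", "a": 1.4, "b": -2.0},
    {"id": "cause_of_fear", "a": 1.2, "b": -1.0},
    {"id": "display_rules", "a": 1.3, "b": 0.5},
    {"id": "blended_emotions", "a": 1.5, "b": 1.5},
    {"id": "guilt_vs_shame", "a": 1.1, "b": 2.0},
]

# A younger child with a lower trait estimate, then the same child some years later.
wave_1 = select_items(theta=-1.5, pool=pool, n_items=2)
wave_2 = select_items(theta=1.0, pool=pool, n_items=2, already_seen=wave_1)
```

With these invented parameters the first wave draws on the basic labelling and causes items and the second on display rules and blended emotions, so no item is repeated across the two time points.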

Additional criteria for inclusion/exclusion of specific assessment measures

There are several other important criteria to consider when choosing assessments for longitudinal epidemiological research in children.

All measures cited here should meet high psychometric standards for reliability and validity; from the CTT perspective, these are absolute minimum characteristics even before IRT methods are invoked. It is important to consider test-retest reliability, inter-rater reliability and internal consistency as crucial. Types of validity required include predictive validity (the assessment reliably predicts a future outcome), construct validity (often used to define a “latent” construct such as executive function or “general adjustment” which is operationally defined), content validity (the content of the items adequately assesses the field it is intended to cover), concurrent validity (correlates well with a measure that has been previously validated) and discriminant validity (correlates well with measures of the exact construct, but not as strongly with related constructs). It is also important to note that the criteria used to establish validity should be made explicit and built upon sound theory and/or previous research. Finally, external anchors to help interpret score change over time—otherwise referred to as practical utilities—are critical for sound psychometric properties. Unless all measures selected meet similarly high standards, results showing that some measures have greater predictive power than others may reveal more about the quality of the measures selected than about child development and the factors that influence it.
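Of the reliability indices listed above, internal consistency is the simplest to compute. The following minimal sketch calculates Cronbach's alpha, one common internal consistency coefficient, from invented ratings; it assumes complete data and is not a substitute for the fuller psychometric evaluation described here.

```python
from statistics import variance


def cronbach_alpha(responses):
    """responses: one list of item scores per respondent (same items, no missing values)."""
    k = len(responses[0])  # number of items
    item_vars = [variance([row[i] for row in responses]) for i in range(k)]
    total_var = variance([sum(row) for row in responses])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


# Hypothetical ratings from five respondents on a four-item scale (1 to 5).
responses = [
    [4, 5, 4, 5],
    [2, 2, 3, 2],
    [3, 3, 3, 4],
    [5, 4, 5, 5],
    [1, 2, 1, 2],
]
alpha = cronbach_alpha(responses)
```

Alpha approaches 1 when items rise and fall together across respondents, as they do in this invented data set.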

Examiner effects are the second critical psychometric property to consider. In other words, it is critical to think about whether characteristics of the examiner are likely to influence the results. Issues such as child or examiner gender, interacting with a stranger120 or the extent to which the examiner’s ethnicity matches with the child’s demographic characteristics may be important factors.121 It is also important to standardise training and certification of examiners and to maintain quality control through periodic observation and retraining where needed; such procedures must be followed, for example, in using the Affect Knowledge Test.

The use of multiple informants, where possible, is recommended because children’s behaviour is so often context-specific, making it difficult to determine what a child knows or can do from a brief assessment conducted by one informant at one specific point in time. In fact, meta-analytical studies on cross-informant ratings of child/adolescent behavioural and emotional problems122 and social competence123 reported only small to medium correlations between informants, and Renk asserts that use of multiple informant ratings of child and adolescent behaviour has become a “gold standard” method.124 These authors all concluded that using multiple informants is important for capturing a more complete picture of the child, using information from all the differing contexts in which the informants are situated.125 Optimally, the list of informants for these dimensions should include parents, teachers (preschool/childcare, elementary, high school), age mates (ie, peers) and children themselves, as well as independent observers.14 Although it is unlikely that such a large number of informants is feasible in an epidemiological study, it is important to keep in mind the concept of multiple perspectives.
Each of these informants has a unique viewpoint that can enrich our understanding of the child’s strengths and weaknesses in the domain of social competence.125 Thus, social and emotional competence, in particular, are best measured using a multi-informant, multi-method perspective from preschool onward (ie, one that considers the effectiveness of social interaction across situations from differing perspectives using multiple sources of information in an effort to triangulate upon reliable valid information).126 The use of multiple informants may also help to guard against the problem of shared method variance, which is inherent when the same informant reports upon related phenomena, thus rendering it more difficult to differentiate hypothesised findings (or lack thereof) from the very fact that the same person is providing information across several constructs.

Multiple informants are also desirable because parent and teacher reports may reflect characteristics or biases of the respondent.123 125 For example, similar to potential examiner effects already remarked upon, teacher reports especially may be biased according to child characteristics that include (but are not limited to) culture, ethnicity, race and gender. Moreover, discriminations among children tend to improve with teacher education, although teachers with more years of experience also tend to give children higher ratings. In addition, precautions should be taken to ensure teacher-rating tools do not lose sensitivity when used with multiple children; rating every child on each specific item rather than rating each child individually on all items is one way to guard against halo effects.

Another reason that multiple informants are useful is that aggregation allows one to partial out error variance due to unreliability of measurement.125 126 However, the issue of aggregation brings up a potential stumbling block in the uses of multiple informants: does one average the informants’ scores, determine an “optimal informant” for each construct or scale, or somehow weight informants’ information based on their competence to report or one’s confidence in their reporting?124 127 Van Bruggen et al note that confidence or competence-based weights provide significant gains in estimation accuracy over simply averaging informant reports.127 Romig suggests that even the use of non-weighted maternal, teacher, and adolescent self ratings as a group results in better explanation of variation in adolescent outcomes than any one informant’s rating alone.128 Following this logic, Kuo et al note that hierarchical linear modelling (HLM) can be used to integrate information on child outcomes from different sources;129 two advantages these authors note are that HLM allows assessment of the interactions between risk factors and informants and it uses all available data, even when data from one or more informants is missing. In sum, there is some disagreement between experts about whether to somehow amalgamate or keep separate the important information that multiple informants can provide. However, we find the classic reference of Rushton et al126 to be persuasive if one can assure, through assessment documentation and training, that the construct definition is relatively invariant across informants.
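The choice between simple averaging and weighting can be illustrated with a small sketch. It is neither the Van Bruggen et al procedure nor hierarchical linear modelling; it only contrasts an unweighted mean with a weighted mean and shows one way of using whatever informants are available. The scores and weights are invented, and any real weights would need empirical justification.

```python
def combine_informants(ratings, weights=None):
    """Weighted mean of informant ratings, ignoring informants with no rating (None)."""
    available = {who: r for who, r in ratings.items() if r is not None}
    if not available:
        return None
    if weights is None:
        weights = {who: 1.0 for who in available}  # plain average
    total_weight = sum(weights[who] for who in available)
    return sum(weights[who] * r for who, r in available.items()) / total_weight


# Hypothetical standardised social-competence scores for one child.
ratings = {"parent": 0.6, "teacher": -0.2, "self": 0.1}
# Hypothetical confidence weights (assumed, not taken from any cited study).
weights = {"parent": 1.0, "teacher": 2.0, "self": 1.0}

simple = combine_informants(ratings)
weighted = combine_informants(ratings, weights)
# The teacher report is missing; the remaining informants are still used.
partial = combine_informants({"parent": 0.6, "teacher": None, "self": 0.1})
```

Giving the teacher report twice the weight pulls the composite towards the teacher's lower rating, which shows how strongly the weighting decision can shape the resulting score.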

Regarding the cost of assessment in terms of time, skill and equipment used, we struggle with the very real trade-offs between scientific adequacy and the logistical demands of large studies. For example, in measuring attachment during infancy, the complex nature of the construct renders many measurement techniques very time-consuming in terms of training, observation and coding. Specifically, although the Strange Situation has been the “gold standard” in attachment research for decades,130 131 its use in a large study would be prohibitive in terms of training, administration and coding. In contrast, the observation and sorting time required for each mother to complete the Attachment Q-Sort is justified, given the huge theoretical and empirical importance of the construct.

What this paper adds

  • This paper enumerates methodological challenges related to the epidemiologic assessment of social-emotional development in children.

  • Despite many challenges, accurate in-depth assessment of social-emotional functioning is crucial, and this paper shows how such assessment can be performed from a developmental perspective across dimensions of: (1) social competence; (2) attachment; (3) emotional competence; (4) self-perceived competence; and (5) temperament/personality.

However, some investigators ill-advisedly try to shorten or simplify measures. When valid scales of a construct have been developed, they cannot simply be changed, shortened or otherwise pulled apart to be “mixed and matched” with other items without establishing the construct validity of the new combinations and/or the changed or shortened scales. Changes to the Attachment Q-sort involving, for example, Likert scale ratings rather than Q-sort methodology are unacceptable because they degrade the measure’s validity and reliability, creating what is essentially an “unknown quantity”. Many decisions such as those noted here need to be made when social-emotional measures are selected for longitudinal studies.

Policy implications

  • Accurate and comprehensive assessment of social-emotional development is necessary in epidemiological studies of children.

  • Accurate in-depth assessment of social-emotional functioning is crucial, based on its importance to child outcomes like mental health, academic performance, delinquency and substance abuse.

  • The assessments and answers to challenges given in this paper suggest ways that research policy may be shaped.

Here, too, IRT can be useful. Shorter targeted scales can be as reliable as the longer scales required by CTT. Furthermore, the utility of IRT could be maximised if it is combined with computerised adaptive test (CAT) technology (ie, item banking allows for the development of CATs that reduce respondent burden and increase the reliability of measurement by using a methodology that homes in on a respondent’s true score). Items can be selected that provide the most information for each examinee. This process can dramatically reduce the time and costs associated with test administration.119
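As an illustration of how a CAT can select the item that provides the most information for a given examinee, the sketch below assumes a two-parameter logistic (2PL) IRT model, in which an item with discrimination a and difficulty b has information a²·P·(1 − P) at trait level θ, where P is the probability of endorsing the item. The choice of the 2PL model, the function names and the small item bank are assumptions made for illustration only; an operational CAT would also need a rule for updating the trait estimate and a stopping rule, which are omitted here.

```python
import math
from typing import Dict, Optional, Set, Tuple

# Hypothetical item bank: item label -> (discrimination a, difficulty b).
ItemBank = Dict[str, Tuple[float, float]]


def p_endorse(theta: float, a: float, b: float) -> float:
    """2PL probability that a respondent at trait level theta endorses the item."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def item_information(theta: float, a: float, b: float) -> float:
    """Fisher information of a 2PL item at trait level theta: a^2 * P * (1 - P)."""
    p = p_endorse(theta, a, b)
    return a * a * p * (1.0 - p)


def next_item(theta: float, bank: ItemBank, administered: Set[str]) -> Optional[str]:
    """Return the not-yet-administered item with maximum information at theta.

    Returns None when every item in the bank has already been administered.
    """
    candidates = [label for label in bank if label not in administered]
    if not candidates:
        return None
    return max(candidates, key=lambda label: item_information(theta, *bank[label]))


# Illustrative bank with equal discriminations and spread-out difficulties.
bank: ItemBank = {"easy": (1.0, -2.0), "medium": (1.0, 0.0), "hard": (1.0, 2.0)}
first = next_item(0.0, bank, set())        # provisional estimate at the mean
second = next_item(1.8, bank, {"medium"})  # estimate revised upwards
```

Because information under this model peaks where item difficulty is close to the current trait estimate, each respondent is given the items that are most informative for him or her, which is how a short adaptive scale can approach the reliability of a longer fixed one.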

Appropriateness to varying subpopulations is the final criterion that must be considered for the inclusion or exclusion of a measure. Norms and psychometric data for measures must be obtained for diverse samples that represent the demographic characteristics of US children and families (based on our use of measures for the National Children’s Study; obviously, other culture-specific norms would be necessary for other usage). Large-scale studies provide an opportunity to obtain this information.

Looking at this issue from another angle, IRT could also be useful in allowing researchers to conduct rigorous tests of measurement equivalence across experimental groups. This is particularly important where cultural groups are expected to show mean differences on the attribute being measured. IRT methods can distinguish item bias from true differences on the attribute measured whereas CTT methods cannot.132 Given potential group mean differences, problems with existing instruments such as floor and ceiling effects also need to be rectified.

Two subcriteria are important here. First, the native language and dialect of the child or adolescent must be considered when selecting, using or developing measures. This can be very difficult when children are partially bilingual; in many such cases, assessments cannot be fully performed in either language. Such a state of affairs suggests that composite scores, based on partial administration in each language, may be preferable. Moreover, when children are not proficient in the original language of the scale, simple translations may not yield equivalent measures. Second, cultural sensitivity must be considered when selecting constructs and instruments. Differences in cultural norms and values (eg, Asian and US Caucasian values regarding emotion regulation and child competence) have implications for using information gleaned from the assessment measures selected here.133 Most behaviours (eg, self-regulatory behaviours) are important for human functioning in a variety of cultures, but the contexts for displaying these behaviours and the conditions that elicit them (or not) may differ. Ultimately, decisions about measurement probably depend in part on the purpose of the study. A desirable approach would be to operationally define a set of core expected outcomes, assess whether cultural differences moderate effectiveness and, if so, determine how and why.


This paper has discussed developmental milestones for social and emotional development from infancy to young adulthood and has suggested multiple measures that are appropriate for longitudinal epidemiological studies. Because successful negotiation of developmental milestones at any age has a significant impact on children’s concurrent and later well-being, it is crucial to understand stability and change in important dimensions of social and emotional development. To do so, a longitudinal design is necessary. There are, however, several challenges inherent to a longitudinal design, such as cost, participant burden and finding or developing age-appropriate measures for each developmental milestone. The measures listed in this paper were chosen based on both theoretical constructs and empirical evidence (eg, psychometric properties). Even though the measures recommended have shown good to excellent validity and reliability, psychometric work related to longitudinal validity across developmental stages still needs to be done.




  • Competing interests: None.
