Keeping in mind the Unit 7 file (attached to order), please answer the following questions, and support your answers with appropriate references:
Is the development of theory of mind continuous or discontinuous?
What is the evidence for and against the position that language is innate?
Unit 7 – Theory of mind and Language
Having focused on prenatal and neonatal development, we will now turn to focus on early childhood development and particularly the development of two areas of
cognition: theory of mind and the use of language.
At the end of this unit, you should:
1. Be able to describe and evaluate theories of and research on theory of mind development.
2. Compare and contrast stage-like vs. continuous models of theory of mind development.
3. Critically compare and evaluate the nativist and constructivist views of language acquisition.
What is Theory of Mind (ToM)?
According to Lewis & Mitchell (1994), having a ‘theory of mind’ means having the “ability to make inferences about others’ representational states and to predict
behaviour accordingly”. For example, when we see someone behaving in a certain way (such as putting their umbrella up or saying “it is raining hard”), we will make the
inference that they have a belief about the current weather conditions (i.e. they believe that it is raining). We can use our understanding of their mental state to
predict other behaviour (e.g. driving to the shops rather than walking).
The term ‘theory of mind’ was coined by Premack & Woodruff (1978) and was actually used to refer to chimpanzees rather than humans. However, since the early 1980s, a
vast body of research on theory of mind in humans (particularly children and clinical populations) has been carried out. Although this term is now universally used to
refer to the understanding of mental states, a number of other terms have been used to describe similar phenomena. Whiten (1994) lists some of these:
Alternative labels for, and Concepts about, Mindreading (Whiten, 1994)
• Folk Psychology (Wundt, 1916)
• Consciousness of the feeling of their fellows (Thorndike, 1911)
• Imputation to others of firsthand experience (Lloyd Morgan, 1930)
• Naïve psychology (Heider, 1958)
• Second order intentionality (Dennett, 1971)
• Intersubjectivity (Trevarthen, 1977)
• Theory of mind (Premack & Woodruff, 1978)
• Metarepresentation (Pylyshyn, 1978)
• Belief-desire reasoning (Davidson, 1980)
• Natural psychology (Humphrey, 1980)
• Social referencing (Feinman, 1982)
• Mindreading (Krebs & Dawkins, 1984)
• Mental simulation (Gordon, 1986)
• Mentalising (Morton, 1989)
• Perception of intentionality (Dasser et al., 1989)
• (Mental) attribution (Cheney & Seyfarth, 1990)
• Mentalistic theory of behaviour (Perner, 1991)
• Representational theory of mind (Perner, 1991)
Someone with a theory of mind imputes mental states to himself and others. He understands that other people and himself have beliefs, thoughts, knowledge, feelings,
desires etc. Crucially this ability allows us to grasp the concept that our behaviour is a consequence of what we believe to be true, which doesn’t always correspond
to what is actually true.
For example, imagine a scenario in which you tell a lecturer that you couldn’t make it to a lecture because your car broke down. The lecturer will probably believe
this to be true, even if the reality is that you had a very late night and couldn’t get out of bed in time. Of course, this can get complicated: If the lecturer saw
you out the night before then they probably won’t believe your story and will presume that you couldn’t get out of bed. You, however, will believe that the lecturer
Another example is the ‘double cross’ spy stories: The Russians think that the spy is working for them, but the spy is actually pretending that he’s working for the
Russians whilst working for the British all the time. This, of course, can get more complicated. In the ‘double double cross’, the spy thinks that the Russians think
that he is working for them, but in reality the Russians know that the spy is double crossing them and is working for the British. All of this relies on us and the spy
and the Russians having a theory of mind, which means that we are able to see that what people believe to be true can be different from what actually is true. In
particular we are able to reason about what others believe to be true – this is why it’s called a ‘theory’ of mind; it’s the ability to hold reasoned rational
explanations for the behaviour of others.
What about ToM in Children?
Of interest to us, of course, are the questions of how and when this ability to understand other people’s (and our own) mental states develops. The ‘classic’ answer to
the question of when theory of mind develops is that children below about 4 years of age do not have theory of mind. This finding results from a number of experiments
which try to determine whether children are aware of false beliefs – beliefs that differ from reality – and whether they can distinguish between appearance and reality.
False Belief Tests
There are two basic types of false belief task: The ‘unexpected transfer’ task and the ‘deceptive box’ task. There are lots of variations on the unexpected transfer
task, but the original is Wimmer and Perner’s (1983) ‘Maxi and chocolate’ test. This type of task involves two dolls or actors, one of whom performs an action of which
the other is unaware. In the ‘Maxi’ test, children are presented with a story about a boy called Maxi who had a bar of chocolate (see Figure 1). The scene is Maxi’s
kitchen. Not being a greedy boy, Maxi decides to save some of his chocolate to eat later so puts it in the green drawer. Maxi then leaves the room to play outside. In
his absence his mother arrives and starts to make a cake. To do so she needs to use some of Maxi’s chocolate so takes it out of the drawer, uses some, then returns
what is left to a different, blue drawer. Subsequently Maxi returns and wants to finish off his chocolate. The participants are asked: Where will Maxi look for his chocolate?
Figure 1 – Maxi & Chocolate Task
Source – Mitchell, P. (1997). Introduction to theory of mind: children, autism & apes. London: Arnold.
Children over about 4 years get it right, answering that Maxi will look in the green drawer (where he left it and where he will still believe it to be). Children under
the age of 4 tend to get it wrong, answering that Maxi will look where the chocolate really is – in the blue drawer. These findings have classically been interpreted
as showing that young children cannot understand the minds of others and cannot understand that another person can hold a belief that is different from reality and
different from the child’s own belief. This is similar to Piaget’s ideas of egocentrism and lack of conservation, in which children’s thought processes are restricted
to what they know and to what they see and hear; they cannot solve problems logically.
The other main type of false belief task is the ‘deceptive box test’ (Perner, Leekam and Wimmer, 1987), also often referred to as the ‘Smarties experiment’ (see Figure
2). In this experiment, children are shown a Smarties tube and asked “what is in here?”. The children will answer “Smarties”. The tube is then opened and it is
revealed that there is actually a pencil in the box. The experimenter then says “Your friend Billy hasn’t seen this box. When he comes in what will he say is in
there?”. Children over the age of 4 say “Smarties”, children under 4 tend to answer that their friend will think there is a pencil in the tube. Again, the younger
children cannot understand that others can hold a belief that is false; they are misled by what they know to be true. Interestingly, Gopnik and Astington (1988) have
shown that younger children not only fail to predict their friend’s false belief but fail to remember their own false belief. When, after the participants have been
shown the pencil, they are asked “what did you originally say was in the box?”, they tend to answer “a pencil”, seemingly forgetting that they originally said “Smarties”.
Figure 2 – The Deceptive Box Task
Source – Mitchell, P. (1997). Introduction to theory of mind: children, autism & apes. London: Arnold.
False belief is not the only aspect of theory of mind that has been tested. Another is children’s perception of the world and whether they can distinguish between the
way the world really is and the way the world seems. This has been examined using ‘appearance-reality’ tasks.
Flavell, Flavell and Green (1983) showed children a sponge that had been shaped and painted to look like a rock and asked them what it was. All the children said it
was a rock. The children were then asked to hold the sponge, making it clear that it was in fact a sponge. The children were then asked two questions: What does it
look like? And what is it really? Children under the age of four made two types of errors. They either said that it was a sponge and that it looked like a sponge or
they said that it was a rock and that it looked like a rock. Four year olds, however, were able to recognise that although it looked like a rock it was in fact a
sponge. They could understand that appearance and reality can differ.
Representational Deficit Theory and ‘Theory’ Theory
A fairly dominant theoretical view of the development of theory of mind is the ‘representational deficit’ theory. This is based on Piaget’s idea that development
occurs in stages. Stage theorists such as Josef Perner (1991) suggest that theory of mind cannot be acquired by children until there is a radical shift in their
thought processes. Younger children have a representational deficit which means they cannot mentally represent reality and a false belief of reality at the same time.
Once this shift in understanding has taken place, children have a theory of mental states which they can use to explain and predict behaviour. This idea is sometimes
referred to as ‘cognitive deficit theory’, ‘conceptual change theory’ or ‘theory theory’. There is sometimes disagreement between theorists regarding whether they
agree with each other or not. Many people place John Flavell, Alison Gopnik and Henry Wellman alongside Josef Perner as they all argue for conceptual change. However,
Perner and Flavell both offer domain-general accounts of conceptual change, whereas Gopnik and Wellman discuss domain-specific changes within a number of domains
including biology and physics as well as mind. Wellman, in particular, often draws distinctions between himself and other theorists. When you read papers and chapters
which discuss theories of theory of mind, some will make distinctions between these (and other) theorists and others won’t.
More specifically, whereas Perner focuses on children’s developing understanding of representations in general (including physical representations such as photographs,
as well as mental representations), Gopnik, Slaughter and Meltzoff (1994) and Gopnik and Wellman (1992) describe four conceptual changes in the development of theory
of mind. They identify three precursors to a true theory of mind: Understanding of pretence, understanding of desire and understanding of the perceptions of others.
Gopnik and colleagues argue that children initially understand misrepresentation in the context of desire and perception and then extend it to belief. Each new
‘theory’ replaces the previous one. Before 30 months, children have a foundational non-representational understanding of perception, evidenced by their use of joint
attention and understanding of object permanence. From 30 months onwards, children develop an understanding of perception and desire. They can understand that if
someone desires something, this will affect their actions. At around three years, children show a more complex understanding of desires and perspectives. They can
understand that other people’s desires may conflict with their own (e.g. different food preferences) and they begin to understand perceptual misrepresentation (e.g.
that a green object viewed through a pink screen can look black). Finally at about four years, children realise that they can generalise the notion of
misrepresentation (that what you see isn’t necessarily what is real) from perspective contexts to belief contexts. They now have a theory of mind.
In addition to the evidence presented above – which strongly suggests that change of some form is taking place around 4 years of age – these ideas are also supported
by evidence that the radical shift in thought processes from failure to success at theory of mind tasks is universal, regardless of cross-cultural boundaries. Avis and
Harris (1991), noting that all false belief tasks had been conducted on children in western cultures, went to Cameroon to investigate theory of mind with a tribe
of hunter-gatherer Baka pygmies who had had little contact with western civilisation. Avis and Harris adapted the unexpected transfer test to involve a mango, a
cooking pot and a bowl. They found that, just as in western children, there seemed to be a radical shift in thought processes, although a little later than that
experienced by western children. Generally, children under five could not give the correct answer; children over five could. They concluded that the acquisition of
theory of mind as part of a stage of development was universal.
Challenges to Representational Deficit Theories
Over the years, there have been a number of challenges to the class of theory outlined above. These challenges can be categorised into four broad groups.
1. Claims that false belief tasks are flawed
2. Evidence for earlier abilities
3. Evidence for later development
4. Claims that the apparent onset of theory of mind can be varied by task manipulations
Challenge 1: False Belief Tasks are Flawed
Most of the evidence for a stage-like change in children’s theory of mind ability comes from false belief tasks, which are experiments conducted in a lab. There have
been many researchers who suggest that this reliance on experiments distorts our view of children’s ability. In part, this is because the false belief tasks themselves
are flawed. In much the same way that researchers have criticised Piaget’s experiments for underestimating children’s ability, researchers have criticised false belief
tasks for misleading children. There are a number of variations of false belief tasks that try to show that young children under four years can actually pass these tasks.
For example, Lewis and Osborne (1990) suggested that the question in the deceptive box task confuses children. They argue that asking children “what will he think is
in the box?” may be interpreted as “what will he think when he looks in the box”. They changed the question to “what will he think before he opens the lid” and found
that many more 3 year old children succeeded in the task than expected. Making precise time references helped young children acknowledge another’s false belief.
Another problem is that children may have difficulty understanding and integrating the key elements of the story. Lewis, Freeman, Hagestadt and Douglas (1994) adapted
the traditional unexpected transfer story so that instead of showing the participants dolls acting out the story, they showed participants a picture book story. In
this way the children were able to go over the key points of the story, move back and forth over the pages and get the sequence of events clear in their minds. Lewis
et al. found that the children who showed good comprehension of the story were more likely to succeed in the test.
A third problem was identified by Wimmer, Hogrefe and Sodian (1988). They suggested that all the false belief tasks show is that young children do not know that
‘seeing is believing’, that what you see happening is what you believe has happened. Adults and older children know that if you see something inside a box you know
that it is there. Wimmer et al. suggested that younger children do not make this link. They demonstrated this with an experiment. They showed children a box and asked
whether another person knew what was in it. In one condition, the other person had looked in the box; in others, they hadn’t. They found that although three year olds
could judge whether the person had looked in the box or not, they could not judge whether the other person knew what was in the box. Whether they themselves knew what
was in the box didn’t make a difference; their answers to the question appeared to be random. In contrast, older children did not make random judgements. They made the
link between seeing and believing. Wimmer et al. suggested that what distinguishes a three year old from a four year old is the knowledge that information (or seeing)
leads to belief.
Another argument is that children are not able to articulate their false beliefs because they have implicit, rather than explicit knowledge. Freeman, Lewis and Doherty
(1991) modified the unexpected transfer task such that instead of asking children to tell them where the character in the story would look for the item, they asked the
children to act out what the character would do. They found that 66 older three year olds and young four year olds failed to verbalise the false belief but correctly
acted out the character’s journey to the empty location. This suggests that younger children may have an understanding of theory of mind but not be able to verbalise it.
Challenge 2: Evidence for early ToM (before 4 years)
Another line of argument is to show that although children younger than four may not be able to solve false belief tasks they show other theory of mind abilities very
early on which suggests that the achievements at four years are not the result of a conceptual shift in children’s thought processes but the culmination of a long line
of related development. In other words, the ability to reason about minds that develops at four is the product of continuous development of rudimentary theory of mind
skills throughout the pre-school years.
Judy Dunn has carried out a lot of observational work on children in naturalistic settings. She argues that laboratory experiments are meaningless to children as they
are detached hypothetical situations. She argues that we should study children in their natural worlds, in a setting which has real significance for them and allows us
to look at children performing at their best. Studies conducted in naturalistic surroundings suggest that younger children’s understanding of the feelings, intentions
and beliefs of others is more sophisticated than the experiments suggest, supporting the idea that theory of mind develops slowly over a number of years, starting much
earlier than age four and continuing well into later childhood.
Dunn (1988) studied 49 second-born children in their second and third years of life. She made unstructured observations at the children’s homes when the children were
playing with their mothers and their siblings. This allowed her to examine children’s behaviour within their families. One aspect she looked at was conflict between
siblings. Younger children (aged less than two) would carry out simple teasing behaviours, such as stealing their siblings’ special toys. Children aged between two and
three demonstrated more frequent and sophisticated teasing. For example, they might pretend to be their sibling’s imaginary friend. Carrying out this pretence involves
knowledge of what would upset their sibling and the ability to transform their own identity – both of which suggest a knowledge of mental states. Similar behaviours
include blaming a sibling for causing a dispute and drawing attention to a sibling’s naughty behaviour to draw attention away from their own behaviour. As children get
older, their excuses for bad behaviour become more sophisticated – from “I didn’t mean to” to “I’m going to try and get the paint off” as a justification for staying on
the sofa and playing with the TV when the child’s mother wanted him to get off. This was obviously a blatant lie as there was no paint on the TV. This same growth of
sophistication can be seen in other contexts such as jokes, prosocial behaviour, cooperation, pretend play and conversations. All of these take place before the
supposed ‘shift’ in conceptual understanding.
There is also other evidence of very early abilities that seem to indicate early proto-theory of mind, such as deception. Deception involves theory of mind skills; in
order to deceive someone we have to have some knowledge of what they are likely to be thinking in order to determine what is most likely to mislead them. Lewis,
Stanger and Sullivan (1989) devised an experiment to test if three-year-old children could lie. They introduced the children to a toy that was covered by a sheet and
told each child that they were not allowed to touch the toy yet but that they would be able to play with it shortly. The experimenter then asked the children to wait
patiently while they did something important in another room. When the experimenter left the room the children were observed covertly. Of course, they were unable to
resist peeping under the cloth; 29/33 children looked at the toy. When the experimenter came back, the children were asked if they had looked under the sheet. 18 of
the 29 children who had looked under the sheet attempted to conceal their behaviour. 7 of them told a straight lie and 11 refused to answer the question. Importantly,
when Lewis et al. showed videos of the children’s answers to a panel of adults, the adults were unable to detect which children were lying and which
children were telling the truth.
Chandler, Fritz and Hala (1989) also demonstrated that young children could engage in deceitful behaviour in order to prevent someone from finding some hidden
treasure. Their study consists of a child and two experimenters in a room containing four dustbins, some treasure and a roll-along doll. One of the experimenters
leaves and the child is asked to conspire with the remaining experimenter to play a trick on their absent colleague. The child is asked to use the doll to hide the
treasure in one of the dustbins so that it can’t be found. The child does this, but unfortunately the doll leaves a trail because its feet have been in an ink pad. The
child is asked how they will trick the absent experimenter. Participants were able to do this by wiping the inky footprints away (it had already been demonstrated that
this was possible). In addition, some children actually took hold of the doll and led her to a different bin – one that didn’t hold the treasure – thus laying a false
trail. Some of the youngest children were capable of carrying out these deceptions, even the two year olds. These results suggest that children can behave in a way
that shows that they can predict the mental states of others and modify their behaviour accordingly in order to deceive.
Communicative skills are another set of potential proto-theory-of-mind abilities. Butterworth & Jarrett (1991) argue that the fact that children below four years fail
on tasks that require them to reason about others’ minds does not necessarily mean that they have no earlier means to understand other minds. For example, babies as
young as six months are able to change their own line of sight to follow a change in the attention of another person. A change in the focus of an adult seems to signal
to the infant that something is worth looking at. If an adult turns slowly and deliberately to look at a target, babies turn their heads and find the target the adult
is looking at. Early in life, babies just look in the general direction of the adult’s gaze, but by about twelve months the baby will look in the precise location.
This occurs with pointing too. Babies will follow the line of their mother’s finger and focus on the object that she is pointing at. This suggests that the infant
takes her own visual field to be in common with that of others which means that infants know in some sense that others can have a perspective. It can be argued that
this is a rudimentary knowledge of others’ minds. Whilst it is not a theory of mind and not reasoning about mind, it is still early awareness of mind.
Finally, Leslie (1987) discusses children’s early ability to pretend as an important development in understanding of mental states. Children start to pretend from
about 18 months. Leslie argues that pretence implies an understanding of mind in two respects. First, the toddler is somehow ‘simulating’ the existence of other minds.
If a toddler pretends that one brick is a mother and another is a baby then the toddler knows enough about what people are to be able to create a rich simulation of
the pretend people’s behaviour. Second, pretence also involves being able to keep track of reality despite being in the confines of a game. When pretending, children
aren’t confused about reality; they are patently aware that pretence is not the same as reality. In fact, if an adult treats the game as if it is real (e.g. asking
“are you a fairy?”), toddlers may offer responses such as “no, I’m a little girl!”
Challenge 3: Evidence for later development (post-4 years)
In addition to evidence for early abilities, there is also evidence for aspects of theory of mind which develop after the age of four. Knowledge about the minds of
others doesn’t just mean knowing when someone is holding a false belief. There are other developments that take place later. For example, Perner and Wimmer (1985)
studied second order belief attribution. They were interested in determining whether children understand that people hold beliefs not just about reality but about
other people’s beliefs. Participants were told a story which was enacted with dolls (see Figure 3): Mary and John saw the ice-cream van in the park. Mary went home for
some money and meanwhile John saw the ice-cream van move to the church. On her way back to the park, Mary unexpectedly saw the ice-cream van at the church, so her
belief about the van’s location remains true. John sets out to look for Mary who, he is told, has gone for an ice cream. Participants were asked “Where will John
think that Mary has gone?”. Obviously, the correct answer is the park as John holds a false belief about Mary’s belief. Children don’t tend to get the answer right
until they are seven years old. The argument here is that if we are to say there’s a conceptual shift at age four we also need some way to explain the development of
Figure 3 – Second Order False Belief Task
Source – Mitchell, P. (1997). Introduction to theory of mind: children, autism & apes. London: Arnold.
Challenge 4: Evidence that the onset of ToM can be varied by varying the task
There is also evidence that older children and adults can be manipulated to fail theory of mind tasks. Peter Mitchell (1997) argues that the reason young children fail
false belief tasks is that they pay too much attention to reality (see further down for more detail on Mitchell’s theory). If this is the case, the young child is not
qualitatively different from older children and adults. Instead young children, older children and adults differ only in the degree to which they rely on reality in
making judgements. Therefore, it should be possible to induce false belief failure in older children and even adults.
Steverson (1996) carried out a variant of the deceptive box task, in which participants were asked to make a judgment about the false belief held by a puppet called
Sweep. In this task, the standard Smarties tube is presented and Sweep says there are Smarties inside. The tube is opened to reveal a pencil inside. The experimenter
says to the child “When Sweep first saw the tube he thought there was a pencil inside, didn’t he?”. Children aged 5 and 6 agreed with this judgement. It seems that
hindering the children by asserting (incorrectly) a true belief on the part of the puppet meant they failed the task. Could it be, however, that children simply agreed
with the experimenter? No. There was also a control condition in which the same procedure was used with the sole difference that the experimenter claimed that Sweep’s
initial belief was that there were jelly babies inside the tube. In this condition, children did not agree with the experimenter. This suggests that it remains tempting
for children to judge all beliefs to be true even after five years old. The task simply needs to be manipulated to make it a bit harder to get the correct answer.
Adult false belief task (Mitchell, Robinson, Isaacs & Nye, 1996)
Kevin stood on a chair and looked into a jug on a shelf. He saw there was orange juice in the jug. Kevin then left the room. While Kevin was out of the room, Rebecca
entered the room, poured out the orange juice and replaced it with milk. Kevin returned to the room with Rebecca who announced that there was milk in the jug.
What does Kevin think is in the jug? Choose an answer.
a) Orange Juice
Pilot studies showed that people were more likely to say that Kevin will believe what he saw rather than what Rebecca told him. Mitchell et al. argued that under these
conditions when there’s no right or wrong answer it may be possible for participants’ own beliefs to distort their judgement of what Kevin would believe. So, in
certain circumstances, we can induce false belief task ‘failure’ in adults. Does this mean that adults have no theory of mind? Obviously not. Therefore, argue Mitchell et al.,
we shouldn’t conclude that children who can be manipulated to fail false belief tasks have no theory of mind either.
Those are the four main challenges to the representational deficit theory and theory theory. We will now look at two other theories which claim that theory of mind is
present from birth: Alan Leslie’s Theory Of Mind Mechanism theory and Peter Mitchell’s Reality Bias theory.
‘Theory’ or Module?
Alan Leslie (1987, 1994, 2000; see also Scholl & Leslie, 1999, 2001) proposed a domain-specific learning device, an innate theory of mind module referred to as the
Theory Of Mind Mechanism. Leslie argues that the development of theory of mind is a continuous process and that early task failure is a result of performance
limitations. He supports this assertion with evidence from the literature on autism (more on this in the section on autism).
Older autistic children seem to have a poor understanding of specific aspects of mind, such as false beliefs and the appearance/reality distinction. They are also
believed to have an inborn neurological deficit. Therefore, according to Leslie, there is a specific theory of mind module which is absent in autistic children. More
specifically, Leslie’s explanation involves two components, the theory of mind mechanism and a ‘Selection Processing’ process which inhibits salient, but unwanted,
responses. Leslie argues that typically developing children possess a theory of mind, but that the default response is to judge all beliefs as being true. The lack of
a mature selection processing process means that this default is not inhibited in favour of the false belief judgment.
Mitchell’s Reality Bias
Like Leslie, Peter Mitchell (1997) argues that a theory of mind is a product of evolutionary forces, part of our genetic inheritance and, therefore, innate. He points
out that to be able to understand the intent of your enemy, to be able to engage in deception, and to be able to communicate effectively – which relies on interpreting
the speech of others, which is much more efficient if you have a theory of mind – would give you a huge evolutionary advantage. Thus, if theory of mind is innate, it
must be an ability present in young children. Mitchell argues that, in false belief tasks, children are guided by something that prevents them making correct theory of
mind responses. He calls this the reality criterion and argues that young children are predisposed to follow the reality criterion which takes priority over the theory
of mind criterion. The reality criterion is of great relevance for young children because they will be unable to sustain an independent existence if they do not master
the nature of their physical environment. The physical environment is not static and predictable; children therefore need to be able to believe that what they know and
see is true. On the other hand, it is not necessary for young children to be able to interpret mental states as the attachment relationship with a caregiver means that
they will be cared for without needing an understanding of the caregiver’s mental states. So whenever the reality criterion and the theory of mind ‘belief criterion’
come into conflict, the reality criterion will dominate and so young children will make errors on false belief tasks.
For Mitchell, the difference between older and younger children is not that younger children lack a concept of belief. Instead, they merely differ in the degree
to which they are ‘locked into’ or misled by reality. As children age, the reality criterion becomes less important and the belief criterion becomes more important.
Consider the deceptive box task. When children are asked what they originally thought was in the tube, they are actually being asked to retrieve a memory of a false
belief. Once they have seen that there is actually a pencil inside the tube, the pencil is physically tangible, so it competes more successfully for the child’s
attention as a candidate for judging what they used to think.
To test the idea that reality is more salient for young children, Mitchell and Lacohee (1991) modified the experiment to embody the child’s original belief in a
physical reality (see Figure 5). They asked the participants what was in the tube. The children answered “Smarties”. The experimenters then asked the children to
identify a picture of Smarties from a set of pictures presented, and to post this picture into a special post box where it remained out of sight until the end of the
experiment. The tube was then opened to reveal the pencil. The tube was closed again and the children were asked what they had thought was in the tube when they posted
the picture into the postbox. In the standard version of the task (without the pictures and the postbox), only 23% of 3 and 4 year olds correctly remembered their
false belief. In the modified version, 63% of 3 and 4 year olds correctly remembered their previous false belief. This supports Mitchell’s prediction that children are
misled by reality.
Figure 5 – Modified Deceptive Box Task
Challenges to the Challengers
However, the stage theorists are not yet finished. Remember that there were four types of criticism of stage theories:
1. Claims that false belief tasks are flawed.
2. Evidence for earlier abilities
3. Evidence for later development
4. Claims that the apparent onset of theory of mind can be varied by task manipulations
Stage theorists have a series of responses to these challenges:
1. False belief tasks are not flawed
2. There is no evidence for early theory of mind as such
3. Early abilities may be precursors to a ToM, but this doesn’t mean that the shift at age 4 doesn’t exist
4. Evidence for later development (post-4 years) is irrelevant
Response 1: False belief tasks are not flawed
The first argument is that manipulations of false belief tasks which enable younger children to pass the tasks are artificially boosting children’s performance by
social scaffolding; the children are being implicitly told the answers. Even with manipulations, children under 4 still frequently fail the tasks. Perner (2000) also
reports that Lewis & Osborne’s 1990 study has not been replicated.
Response 2: There is no evidence for early theory of mind as such
The second argument is that although children may develop an understanding of other mental states such as wanting or emotions earlier, this is irrelevant. Perner
(1991) argues that it is only when children understand false belief that they achieve a truly representational theory of mind. He dismisses other abilities as
resulting from a ‘mentalist theory of behaviour’. His argument is that younger children do not really understand others’ minds; they are just interpreting behaviour
in terms of their past experience of what that behaviour means. For example, when infants follow the gaze or the pointing finger of their caregivers, it may be that
infants simply understand that pointing or changes in another’s gaze are good predictors of where an object may be located. Similarly, other early abilities may be
explained without reference to another’s mind. Some (e.g. Trevarthen, 1990) have argued that the fact that infants respond to the facial expressions of their mother
indicates that they understand something (albeit rudimentary) of the emotions behind the expressions. Perner argues that they may simply understand the environmental
correlation between the emotional expression and whether or not an object or event is threatening. Thus, these achievements do not prove that children understand the
mental states of others.
Sodian, Taylor, Harris and Perner (1991) repeated Chandler et al.’s (1989) treasure hunt experiment. They did find that children would lay false trails, but the
children still thought that other people would know where the treasure really was, despite the false trail. Also, they laid false trails even when asked to help
another person find the treasure. This suggests that children’s apparent understanding of deception is not as sophisticated as was first thought.
Regarding pretence, Perner argues that pretence and false belief differ in that in pretence it is not necessary to hold the representation as being true of the world.
In the Maxi task, children have to be able to know that Maxi really holds a false belief, whereas in pretence children only have to acknowledge that someone is
pretending. This is not a conflict between belief and reality. Furthermore, Perner, Baker and Hutton (1994) showed that children do not understand pretence as a state
of mind. They conducted an experiment in which participants were told a story about a girl called Jane who was feeding her rabbit. Participants were assigned to two
conditions. In both conditions there was no rabbit actually in the hutch. In one condition, the children were told that Jane knew that there was no rabbit in the
hutch; in the other condition Jane was shown as thinking that the rabbit was in the hutch. The children were asked if Jane was pretending that the rabbit was in the
hutch or whether she really thought the rabbit was in the hutch. Children aged 5 years and older distinguished between the 2 scenarios. They showed awareness that if
Jane didn’t know that the rabbit was absent, she really thought the rabbit was in the hutch, but if Jane did know the rabbit was absent, she was only pretending that
the rabbit was in the hutch. Children under 5 answered that Jane was pretending in both situations. They failed to discriminate between pretending – feeding the rabbit
in full awareness that the rabbit was absent – and false belief – feeding the rabbit in the false belief that the rabbit is present. They concluded that children do
not understand pretence as a state of mind, therefore early evidence of pretence is not evidence of early theory of mind.
Response 3: Early abilities may be precursors to a ToM, but this doesn’t mean that the shift at age 4 doesn’t exist
Gopnik and colleagues’ version of ‘theory theory’ includes four conceptual changes in the development of theory of mind (see above). This account allows for children
to demonstrate ‘precursor’ abilities without denying the conceptual shift at four years old.
Response 4: Evidence for later development (post-4 years) is irrelevant
Similarly, although there are developments after four years (e.g. second and third order belief attribution), these are just developments that occur after the
conceptual shift. According to the stage theorists, they do not count against the shift itself. A comparison can be drawn with puberty. Puberty does not happen
overnight and we keep changing for a few years afterwards, but no one would argue that puberty is a slow developmental sequence that starts at birth – it matures at
adolescence. Thus, theory of mind matures at about four years and then continues to develop from there.
Wellman et al’s (2001) Meta-Analysis
In 2001, Wellman, Cross and Watson conducted a meta-analysis on 77 articles or reports which reported results on false belief tasks. This encompassed 178 studies in
which there were a total of 591 conditions. In a meta-analysis, it is the conditions, rather than the studies or the participants, which are the unit of analysis. The
dependent variable was the proportion of children in a condition who made correct false belief judgements. Wellman et al. looked at a large number of variables, but
these are the significant findings:
There were 6 main effects (factors which influenced task performance), 5 of which did not interact with age. First, performance was better if deception was
the motive for the change (e.g. if the chocolate was moved in order to trick the protagonist). Second, children were more likely to pass if they carried out the
transformation themselves. In some studies, children passively watched the narrative; in others they helped to enact the story by, for example, moving the chocolate
themselves. Third, performance was improved if the target object was not present when the false belief question was asked (e.g. if the chocolate had been moved from the
drawer and eaten). Fourth, performance was better if the protagonist’s belief was explicitly stated, for example if children were told “Maxi believes the chocolate
is in the green drawer – where will he look for it?”. Fifth, there was an effect of country of origin. Children from the US, the UK and Korea performed similarly,
whereas children from Australia and Canada performed somewhat better and children from Austria and Japan performed somewhat worse. Finally, emphasis of time frame
facilitated performance. In other words, if children were asked “where will he look first?”, they had a greater chance of success. However, this factor interacted with
age such that an effect was only found in children older than 4. This is possibly because including this information increases the length and complexity of the
questions, which may affect younger children’s performance.
However, the basic developmental trend was still observed. Wellman et al. reported that children aged 4 and over performed above chance, whereas children aged 3;5 and
younger performed below chance.
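To make the unit of analysis concrete, here is a minimal Python sketch of how a single condition might be scored and compared with chance. It is an illustration only, not Wellman et al.'s procedure: the counts are invented, the function names are hypothetical, and chance is assumed to be 0.5 (as in a task with two possible locations).

```python
from math import comb

def proportion_correct(n_correct, n_children):
    """Dependent variable for one condition: proportion of children passing."""
    return n_correct / n_children

def binomial_two_sided_p(n_correct, n_children, chance=0.5):
    """Exact two-sided binomial test against chance.

    Sums the probabilities of every outcome that is no more likely
    than the observed one.
    """
    probs = [comb(n_children, k) * chance**k * (1 - chance)**(n_children - k)
             for k in range(n_children + 1)]
    observed = probs[n_correct]
    return sum(p for p in probs if p <= observed * (1 + 1e-9))

# Hypothetical conditions, invented for illustration (not real data).
conditions = [
    {"label": "younger group", "n_correct": 4, "n_children": 20},
    {"label": "older group", "n_correct": 16, "n_children": 20},
]

for c in conditions:
    prop = proportion_correct(c["n_correct"], c["n_children"])
    p = binomial_two_sided_p(c["n_correct"], c["n_children"])
    if prop > 0.5:
        direction = "above chance"
    elif prop < 0.5:
        direction = "below chance"
    else:
        direction = "at chance"
    print(f'{c["label"]}: {prop:.2f} correct ({direction}), p = {p:.3f}')
```

Note that "below chance" is informative in its own right: a group that systematically picks the wrong location is not guessing, which is why the younger children's pattern is usually read as a consistent error rather than random responding.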
Language is unique to humans. Whilst other species communicate with body movements and/or vocalisations, these messages have fixed meanings which relate to concrete
events or needs in the present. Other species cannot combine bits of messages with bits of other messages to create new meanings. Human language, on the other hand, is
extremely flexible, allowing for an infinite number of combinations of words carrying meaning about abstract and hypothetical events which may be in the past or in the
future. Both adults and children produce utterances they have never heard before. In doing so, however, we adhere to the rules of grammar. We cannot just stick words
together in any order; children do not produce phrases such as:
want shops I go to the to
Whilst there have been attempts to get other species (notably chimps and gorillas) to use sign language (e.g. Gardner & Gardner, 1969; Premack, 1976; Patterson, 1978),
these animals use very few combinations of words other than those they have been taught and any they do produce tend to be limited to the concrete. In addition,
language is something that arises spontaneously in all human infants; there are no cases of animals attempting to use language without human teaching.
Not only is language a uniquely human ability, we become very good at comprehending and producing language in a relatively short period of time. As an adult, you know
around 50,000 words and you are capable of fitting these words together at the rate of 150 per minute whilst making few grammatical mistakes. Most of this development
occurs between the ages of 1 and 6, by which time children have a vocabulary of 6000 words and can produce most sentence types. Some argue that this rapid development
means that we must have innate knowledge of linguistic structures. Others argue that we have very powerful learning mechanisms.
Researchers who study the development of language are usually interested in one (or more) of four things:
Phonology: how children learn to distinguish between sounds and how they learn to produce speech sounds
Semantics: how children learn words and the meanings of words
Grammar: how children learn about the structure of language
Pragmatics: how children learn to use language socially
This section will focus on the development of grammar, which is divided into syntax and morphology.
Syntax is concerned with how to fit words together into sentences. Children have to learn what grammatical categories words fit into. Here is a list of grammatical
categories with examples of words which belong in them:
Nouns: dog, cat, bird, mummy, house, thing
Pronouns: I, we, you, she, that, these
Verbs: like, love, run, jump, kiss
Adjectives: big, hot, little, red
Adverbs: quietly, loudly, fast, slow
Prepositions: on, in, under, up
Conjunctions: and, because, but
Determiners: a, this, the, some, my, your
Infinitive: to (as in to run, to jump)
Quantifiers: half, all
Wh-words: what, who, when
Children also have to learn what grammatical relations exist between these categories. Grammatical relations are things such as ‘subject’, ‘verb’ and ‘object’.
John      kissed   Mary
Noun      Verb     Noun
Subject   Verb     Object
These relations are important as they contribute to the way in which sentences are constructed. For example, the rule ‘subject-verb agreement’ means that the form of
the verb (i.e. whether it is singular or plural, past or present tense) ‘agrees’ with the subject of a sentence. In the phrase ‘Gemma loves cats’ the verb ‘loves’
takes a form relative to the subject (‘Gemma’). If verbs agreed with objects, we would end up with ‘Gemma love cats’.
In the example ‘John kissed Mary’, it would seem that the grammatical relations of subject and object are the same as the grammatical category of noun. However, the
categories of relations are broader than the categories of words. All the following sentences have the same subject-verb-object order:
John kissed Mary
The boy kissed the girl
The tall dark boy gently kissed the attractive pale girl
The subject and object can be either nouns or noun phrases and the verb can be a verb or a verb phrase. A simple noun phrase can consist of an optional determiner,
optional adjectives and a noun. A simple verb phrase can consist of an optional adverb and a verb.
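The phrase structure just described can be written down as a small recogniser. The following Python sketch is only an illustration under stated assumptions: the lexicon is a hypothetical handful of words taken from the examples in this section, proper names are treated as nouns, and the function names are invented here.

```python
# Tiny lexicon (an assumption for illustration; a real lexicon is far larger).
LEXICON = {
    "determiner": {"the", "a", "this", "some", "my", "your"},
    "adjective": {"big", "hot", "little", "red", "tall", "dark", "attractive", "pale"},
    "noun": {"john", "mary", "boy", "girl", "dog", "cat"},
    "adverb": {"gently", "quietly", "loudly"},
    "verb": {"kissed", "like", "love"},
}

def parse_noun_phrase(words, i):
    """Consume an optional determiner, any adjectives, then a noun.

    Returns the index just after the phrase, or None if there is no noun phrase.
    """
    if i < len(words) and words[i] in LEXICON["determiner"]:
        i += 1
    while i < len(words) and words[i] in LEXICON["adjective"]:
        i += 1
    if i < len(words) and words[i] in LEXICON["noun"]:
        return i + 1
    return None

def parse_verb_phrase(words, i):
    """Consume an optional adverb, then a verb."""
    if i < len(words) and words[i] in LEXICON["adverb"]:
        i += 1
    if i < len(words) and words[i] in LEXICON["verb"]:
        return i + 1
    return None

def parse_svo(sentence):
    """Split a sentence into subject, verb and object phrases, or return None."""
    words = sentence.lower().split()
    s_end = parse_noun_phrase(words, 0)
    if s_end is None:
        return None
    v_end = parse_verb_phrase(words, s_end)
    if v_end is None:
        return None
    o_end = parse_noun_phrase(words, v_end)
    if o_end != len(words):
        return None
    return {
        "subject": " ".join(words[:s_end]),
        "verb": " ".join(words[s_end:v_end]),
        "object": " ".join(words[v_end:]),
    }

print(parse_svo("John kissed Mary"))
print(parse_svo("The tall dark boy gently kissed the attractive pale girl"))
print(parse_svo("want shops I go to the to"))
```

All three example sentences receive the same subject-verb-object analysis even though the phrases differ in length, whereas the scrambled string is rejected because it does not begin with a noun phrase.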
It is these grammatical relations we manipulate, rather than words, to create different types of sentences. The passive construction, for example, has the order
object-verb-participle-subject. We have to move the whole grammatical relation (subject/object) not just the words (boy/girl) to create the same sentence in the passive:
The attractive pale girl   was    kissed       by the tall dark boy
Object                     Verb   Participle   Subject
Questions are another example:
Did         the tall dark boy   gently kiss   the attractive pale girl?
Auxiliary   Subject             Verb          Object
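The point that whole grammatical relations are moved, rather than individual words, can be illustrated with a short Python sketch. It is a simplification under stated assumptions: the relation is supplied ready-made as a dictionary with hypothetical field names, the adverb is left out, and the passive assumes a singular object and the past tense (hence the fixed ‘was’).

```python
def capitalise_first(text):
    """Upper-case only the first character, leaving the rest untouched."""
    return text[0].upper() + text[1:]

def active(rel):
    """Subject - verb - object."""
    return capitalise_first(f'{rel["subject"]} {rel["past"]} {rel["object"]}')

def passive(rel):
    """Object - 'was' - participle - 'by' - subject."""
    return capitalise_first(
        f'{rel["object"]} was {rel["participle"]} by {rel["subject"]}')

def question(rel):
    """Auxiliary 'did' - subject - bare verb stem - object."""
    return capitalise_first(f'did {rel["subject"]} {rel["stem"]} {rel["object"]}?')

# Each relation is a whole phrase, so it moves as a single unit.
relation = {
    "subject": "the tall dark boy",
    "object": "the attractive pale girl",
    "stem": "kiss",
    "past": "kissed",
    "participle": "kissed",
}

print(active(relation))
print(passive(relation))
print(question(relation))
```

Swapping only the nouns ‘boy’ and ‘girl’ would leave the adjectives attached to the wrong person; moving the whole subject and object phrases keeps the meaning intact.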
In addition to learning about the relations between grammatical categories, children have to learn about morphology. Morphology refers to the inflections (or ‘little
bits’) of words, such as the ‘-ed’ which is added to regular verbs to create the past tense (kiss-ed) and the ‘-s’ which is added to nouns to make plurals (cat-s).
These added bits of words are called morphemes, a term which refers to the smallest meaningful units of language. Thus, ‘cats’ is composed of two morphemes: ‘cat’ and
‘s’. Little ‘function’ words such as ‘on’, ‘the’ and ‘are’ are also morphemes, as they cannot be subdivided into smaller units. This can make things confusing but when
we talk about the acquisition of morphology we’re usually talking about the addition of bits to words to make plurals, past tenses, etc.
The Sequence of Acquisition
Language development is very varied between children in terms of the age at which particular milestones are met. One child may start stringing words together before 18
months whereas another may not do so until a year later. However, the sequence of development – the order in which children use aspects of language – remains the same.
At around 12 months children produce their first words. By 18 months they can produce about 50 words. Over the next 6 months there is a ‘vocabulary spurt’ in which
another 150 or so words are learnt. By 30 months, children can produce 600 words and by 6 years they can produce 10,000-15,000 words.
Towards the end of their second year, children start to put words together into 2 word utterances. These are very simple utterances but they are constructed, not
rote-learned. This is often referred to as the ‘telegraphic speech stage’. When sending telegrams people paid by the word so kept their messages as brief as possible
(e.g. “horse dead send money”). Children’s speech is like a telegram in that the more subtle aspects are omitted. Children in the 2 word stage leave out what we call
the grammatical morphemes. For example, they may say “mummy shoe” rather than “mummy’s shoe” or “two cat” instead of “two cats”. This, however, is a generalisation.
There are again individual variations. Some children are ‘analytic’: they stick to the telegraphic style whereas others are ‘gestalt’: they seem to be trying to put
the morphemes in even when they don’t know what they are. Also there are languages in which it is impossible to speak without the grammatical inflection. In English we
can use the bare stem of a verb such as ‘talk’ (e.g. “I talk”, “you talk”). In Spanish there always has to be an inflection. The bare stem of ‘talk’ in Spanish is
‘habl-’, which cannot be used without an inflectional morpheme. We can say “hablo”, “hablas”, “hablamos” and “habla”, but not just “habl-”.
The first 2-word utterances tend to have simple meanings:
recurrence: more milk
non-existence: allgone egg
nomination: this truck
This is then extended to a broader range of meanings. Roger Brown (1973) listed some of the basic semantic relations demonstrated in two-word speech:
agent-action: mummy push
action-object: eat dinner
agent-object: mummy pigtail
action-location: play garden
entity-location: cookie plate
possessor-possession: mummy scarf
attribute-entity: green car
demonstrative-entity: that butterfly
Slobin (1970) noted that these initial meanings are largely the same across languages. Over the next 2 years, children start to add function words, verb endings and
plurals. Within any language, most children acquire these morphemes in about the same order, famously catalogued for English by Brown (1973):
Morpheme                             Example     Age (months)
present progressive ‘-ing’           walking     19-28
Prepositions                         on, in      27-30
plural ‘-s’                          cats        24-33
irregular past tense                 broke       25-46
Possessive                           Mummy’s     26-40
Articles                             a, the      28-46
regular past tense                   kissed      26-48
third person singular -s on verbs    wants       26-46
irregular third person singular      has, does   28-50
auxiliary/copula ‘be’                is, are     29-50
By 24 months children can produce sentences of up to 4 or 5 words; by 30 months they can produce sentences of up to 8 words (Fenson et al., 1994). Later developments
include the production of complex sentence types such as questions, negatives, passives and compound sentences (sentences consisting of two simpler sentences linked by
‘and’ or ‘because’).
Issues in the Acquisition of Grammar
There are a lot of quite complicated aspects of language that are mastered by young children, but how do they do it? Before we look at the theories which try to
explain this, there are a couple of issues to bear in mind.
Do children learn language using a ‘mental organ’ (or module)? Noam Chomsky’s ‘mental organ’ fits in well with Jerry Fodor’s (1983) proposal that the mind comprises a
number of specialised modules including a language system. Although modules must obviously interact, they are relatively independent in that at least some of the
principles of organisation of each module are not shared with other cognitive systems. Alternatively, is language acquisition a problem which is solved by general
intelligence? In other words, is language acquisition a domain-specific or domain-general process? Broadly speaking, nativists favour domain specificity and
constructivists often specify domain-general processes, so the question of modularity is strongly linked to the issue of learning vs. innateness.
Learning and Innateness
We know that all humans possess language (either spoken or signed), but animals do not, so the capacity for language must involve heredity. However, a child growing up
in Tokyo speaks Japanese whereas the same child brought up in Derby would speak English, so the environment must also be involved. Any theory of language acquisition
has to take both factors into account. Steven Pinker (1995) argues that a theory which does not consider innate structure will result in a hypothetical child who does
not have a complex, true language, and that a theory which does not consider the environment will result in a child capable of acquiring one specific language but not
another. If we knew the degree to which language can be learned, it would give us some idea of how much innate structure or knowledge is necessary. One way of
answering this question is to use a branch of computer science called learnability theory.
Gold (1967) used a formal theoretical proof to argue that natural human languages are not learnable from the input alone, as that input contains a massive number of positive examples but no
negative examples. According to learnability theory, the task of language learning involves a class of languages – all possible human languages – of which one is the
‘target’ language, a linguistic environment, a strategy for creating and testing hypotheses about the target language, and some way of judging whether a hypothesis is
correct. The basis of Gold’s argument is that if a language-learning child has a set of possible solutions to the problem of identifying the grammar of the target
language which includes at least one incorrect grammar, then the child will not be able to learn the target grammar, as an incorrect grammar can be used to generate
incorrect utterances. Gold’s work has satisfied many nativist theorists that language is not learnable, that it is not a soluble problem. However, more recent research
using computational models has shown that it may indeed be possible to learn language on the basis of information contained in that language – we’ll return to this.
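The core of the learnability problem can be illustrated with a toy example in Python. This is not Gold's proof, only a sketch of the situation it describes: the two ‘languages’ are tiny hypothetical sets of sentences, one of which is a superset of the other, and the function name is invented for illustration.

```python
# Toy 'languages' as finite sets of sentences (hypothetical, for illustration).
TARGET = {"i go", "i went"}
OVERGENERAL = TARGET | {"i goed"}  # a superset that also allows an incorrect form

CANDIDATES = {"target": TARGET, "overgeneral": OVERGENERAL}

def consistent(candidates, positive, negative=()):
    """Return the names of candidate languages that contain every positive
    example and none of the negative examples."""
    return sorted(
        name for name, language in candidates.items()
        if all(s in language for s in positive)
        and not any(s in language for s in negative)
    )

# Positive evidence only: every sentence the child hears is in both languages,
# so the overgeneral grammar can never be ruled out.
print(consistent(CANDIDATES, positive=["i go", "i went"]))

# With a negative example (being told that 'i goed' is wrong), it can.
print(consistent(CANDIDATES, positive=["i go", "i went"], negative=["i goed"]))
```

However many correct sentences are added to the positive examples, both candidates remain consistent with them; only negative evidence separates the two.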
In addition to addressing the issues of modularity and learning, a theory of language acquisition must account for the rapid acquisition of language, children’s
correct language use, children’s mistakes and the sequence of acquisition (why some aspects of language are always acquired before others). We will consider three
broad categories of theory: behaviourism, nativism and constructivism.
B. F. Skinner (1957) argued that language could be and was learnt, with no need for innate knowledge. Skinner suggested that language is acquired according to the
general laws of learning and is therefore no different to any other learnt behaviour. Language develops as a result of imitation of adult speech and adults’
reinforcement and gradual shaping of children’s babbling and speech.
However, if this were the case, we would expect children to produce only those utterances they have heard and the syntactic structures they use should be the same as
those found in adult speech. Evidence from children’s speech data demonstrates that this is not the case:
“Nobody don’t like me” (McNeill, 1970)
“Me tipen that over” (Fletcher, 1983)
Although these examples do not conform to a full adult grammar, they are not random combinations of words. They have a clear grammatical structure and they are clearly
novel utterances (i.e. combinations of words that the children had not heard before). Also, Brown & Hanlon (1970) noted that parents rarely correct their children’s
grammar. Even when they do, children do not seem to pay attention to the corrections.
Here is an example of Martin Braine’s (1971) attempt to correct his 2 year old daughter:
Child: Want other one spoon, Daddy.
Daddy: You mean, you want the other spoon.
Child: Yes, I want the other one spoon, please, Daddy.
Daddy: Can you say ‘the other spoon’?
Child: Other … one … spoon.
Daddy: Say ‘other’.
Child: Other.
Daddy: ‘Spoon’.
Child: Spoon.
Daddy: ‘Other spoon’.
Child: Other … spoon. Now give me other one spoon?
Here is another example from McNeill (1970):
Child: Nobody don’t like me.
Mother: No, say ‘nobody likes me’.
Child: Nobody don’t like me.
(Eight repetitions of this exchange)
Mother: No, now listen carefully; say ‘nobody likes me’.
Child: Oh! Nobody don’t likes me.
In response to Skinner’s (1957) claims that language is learned by imitation, Chomsky (1957, 1959) published the original nativist account of language acquisition.
Chomsky argued that children learn languages that are governed by highly subtle and abstract principles, and they do so without explicit instruction or any other
environmental clues to the nature of such principles. Hence language acquisition depends on an innate, species-specific module that is distinct from general
intelligence. Various modifications and updates have been made to basic nativist theory, but there are three main arguments used to justify an approach which
specifies innate linguistic knowledge.
First, the poverty of the stimulus argument states that the language heard by children does not contain all the information necessary to ‘decode’ it or interpret it
correctly. Not all language addressed to children consists of well-formed utterances. Children hear utterances that contain incomplete and ungrammatical sentences and
have a limited exposure to the full range of structures present in the language. Any given word may have a number of different interpretations, depending on context.
The Chomskyan argument is that this information is not present in the stimulus itself, so it must be present in the child. In addition, children are exposed to
poor, incorrect grammar; how do children, in the face of complex spoken language, decide which utterances are grammatical and which are not unless they have innate
principles to guide them?
Second, children come to use sentences that never occur in their language-learning environment but they form very few ungrammatical utterances. They make errors that
suggest that they’re not simply imitating or mimicking the language they hear but are actually working out what the rules of language are. Children make
overgeneralisation errors; they create words such as ‘runned’ or ‘goed’ which follow the rules of grammar for regular verbs, but are extremely unlikely to have been
heard by the child. They seem to know somehow what the rules of English are – that ‘-ed’ is added to the end of a verb to create the past tense and that ‘-s’ is added
to a noun to make it plural. They then use these rules in situations where adults will never have used them. The implication of this is that children are being guided
by rules that govern the grammaticality of their production.
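The logic of overgeneralisation can be sketched in a few lines of Python. This is an illustration, not a model of acquisition: the irregular forms are a hypothetical three-item list, the spelling rules are a rough approximation for short verbs, and the function names are invented here.

```python
VOWELS = set("aeiou")

# Irregular past tenses have to be stored item by item (small sample only).
IRREGULAR_PAST = {"run": "ran", "go": "went", "break": "broke"}

def regular_past(stem):
    """Apply the regular rule: add '-ed' (with rough spelling adjustments)."""
    if stem.endswith("e"):
        return stem + "d"                      # like -> liked
    if (len(stem) >= 3
            and stem[-1] not in VOWELS and stem[-1] not in "wxy"
            and stem[-2] in VOWELS
            and stem[-3] not in VOWELS):
        return stem + stem[-1] + "ed"          # double the final consonant
    return stem + "ed"                         # kiss -> kissed

def adult_past(stem):
    """Adult speaker: a stored irregular form blocks the regular rule."""
    return IRREGULAR_PAST.get(stem, regular_past(stem))

def overgeneralised_past(stem):
    """Overgeneralising child: the rule is applied to every verb."""
    return regular_past(stem)

for verb in ["kiss", "like", "run", "go", "break"]:
    print(verb, adult_past(verb), overgeneralised_past(verb))
```

The child's forms differ from the adult's only for the irregular verbs, which is exactly the pattern that imitation alone cannot explain: ‘runned’ and ‘goed’ are produced by the rule, not copied from adult speech.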
Third, the problem of no negative evidence: children do not generally receive feedback regarding the grammaticality of their speech. Without this feedback (or negative
evidence) children cannot learn which utterances are ungrammatical. This is related to the logical argument from learnability theory discussed above. Even in cases
where children do receive feedback, they do not seem to incorporate this information into their speech. This suggests that children are not being ‘taught’ language by
These three arguments – poverty of the stimulus, overgeneralisation errors, and no negative evidence – have been used to argue that language learning is an impossible
task, unless one posits innate linguistic knowledge. All nativist theories attempt to discover and define exactly what the properties of this innate knowledge are and
how children are able to access this information. The Chomskyan linguistic goal, therefore, is to define a grammar which is capable of generating all legal sentences
in a language, but none of the illegal ones.
The vast majority of nativist work has been dominated by the work of Chomsky. Chomsky contends that the capacity to comprehend and generate language is innate, and the
principles by which it develops are not the same as those underlying other human behaviours. Chomsky’s (1965) response to the problems noted above was to propose a
mental language organ which came to be known as the Language Acquisition Device (LAD), a device through which Universal Grammar (UG), the universal rules of all human
languages, could be accessed. Nativist researchers have tended not to be interested in the mechanism by which the LAD works, concentrating instead on working out what
the rules of UG are and how children come to apply them to the specific language they are learning. The LAD is an innate structure containing knowledge about the
structure of language. It also includes some abilities specialised to language learning so that children can work out the rules of syntax.
Of course, not all human languages share the same grammar and it is possible for any child to be raised in any linguistic community, and therefore have the potential
to learn any language. A nativist solution to the problem of acquiring language has to account for the presence of errors in children’s speech and the ability to
acquire any one of a set of grammars. Chomsky’s Principles and Parameters Theory (PPT) (1981), which has replaced the LAD, deals with both of these problems by giving
the child’s linguistic input a role in the acquisition process. The notion of a set of rules governing grammar is removed in favour of the idea that UG contains some
universal principles – features that are present in all languages – such as the existence of a noun category, and some features that are subject to parametric
variation depending on the language, such as whether the subject is obligatory. In French and English a subject is required. We cannot say “Am a good kid”; we need to
include the subject: “I am a good kid”. In Spanish and Italian, on the other hand, subjects are optional. In Italian, both “Io sono bravo tato” (I am good kid) and
“sono bravo tato” (am good kid) are allowed (Valian, 1991). These differences are called parametric variation. Acquiring language involves identifying the correct
parameter from a range of innately specified possibilities. So if we are learning English we would have to learn to set the subject parameter to the ‘obligatory’
option. As very little learning about grammar takes place, language must necessarily contain ‘trigger’ information which enables parameter setting. Errors occur when
these parameters are not yet set correctly.
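The idea of a single parameter with different settings in different languages can be sketched as follows. This Python example is a deliberately crude illustration: it checks only the subject parameter, it recognises only pronoun subjects in sentence-initial position, and the word lists and function names are assumptions made here rather than part of the theory.

```python
# Subject pronouns used to detect an overt subject (small illustrative lists).
SUBJECT_PRONOUNS = {
    "english": {"i", "you", "he", "she", "we", "they"},
    "italian": {"io", "tu", "lui", "lei", "noi", "loro"},
}

# One parameter, two settings: is an overt subject obligatory?
SUBJECT_OBLIGATORY = {"english": True, "italian": False}

def has_overt_subject(sentence, language):
    """True if the sentence starts with a subject pronoun of that language."""
    words = sentence.lower().split()
    return bool(words) and words[0] in SUBJECT_PRONOUNS[language]

def is_allowed(sentence, language):
    """Check the sentence against the subject parameter only."""
    if SUBJECT_OBLIGATORY[language]:
        return has_overt_subject(sentence, language)
    return True  # subject optional: allowed with or without an overt subject

for sentence, language in [
    ("I am a good kid", "english"),
    ("Am a good kid", "english"),
    ("Io sono bravo tato", "italian"),
    ("sono bravo tato", "italian"),
]:
    print(f"{language}: {sentence!r} allowed = {is_allowed(sentence, language)}")
```

On this view the child's task is not to learn the checking procedure, which is assumed to be innate, but only to work out from the input which value of `SUBJECT_OBLIGATORY` applies to the language being learnt.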
Chomsky’s theory is thus able to explain how it is that children are able to acquire language quickly and how language is learnt despite the problems of the poverty of
the stimulus and no negative evidence. However, three major problems with this theory have been noted. First, empirical studies of child language data have failed to
find any evidence that children are operating with adultlike, abstract grammatical knowledge. For example, Bowerman (1973) and Braine (1976) reported that children’s
early utterances were much more restricted in their terms of reference than would be expected if they were working with adultlike grammatical categories. They argued
that the structure of such utterances could be captured more accurately in terms of semantic or positional formulae (e.g. agent + action: hitter + hit). Second,
children make errors which do not fit in with the theory. For example they omit obligatory constituents such as determiners, subjects and auxiliaries. Third, children make
grammatical errors not only when very young, but for many years afterwards, having had extensive exposure to their language. For example, English children continue to
omit subjects in sentences for a very long time. According to the theory, they should learn very quickly that subjects are obligatory in English. This suggests that
children either do not have access to UG or that that access is in some way limited.
New Nativist Theories
In response to these criticisms, nativist theorists have developed ideas which try to account for the differences between adult and child speech, while remaining true
to the assumptions that a complex structure such as grammar cannot be learnt without access to innate linguistic principles. In order to achieve this, theorists have
begun to argue that children are somehow prevented from making use of their full knowledge, although such knowledge is available to them. What it is that prevents
children using their knowledge is, however, hotly debated. Modern nativist theory can be broadly divided into two camps: those that suggest that children have access
to adultlike grammatical knowledge but are prevented from utilising this knowledge in their production (continuity theories), and those that suggest that some aspects
of linguistic knowledge only become available to the child at a later point in time (competence theories).
Continuity theorists such as Virginia Valian (1991) argue that children have access to the full blueprint for language, but that this knowledge is not reflected in
their speech production, which is constrained. Valian (1991) and Paul Bloom (1990) suggest performance limitations due to processing factors which affect both adults
and children, but are more discernible in children as children are not expert at manipulating language. Adults are constrained by factors including those related to
planning, organising syntactic structures and choosing appropriate lexical items. Children have these constraints and, in addition, as young children have a smaller
working memory than adults, they are more likely to produce shorter utterances with shorter constituents. Other proposed performance-related limits on production
include length, the content of the message the child wants to convey, syntactic and discourse requirements and pragmatic factors. Processing factors also come into
play: Valian (1986) suggests that there is a high processing load at the beginning of utterances. As children get older, these constraints are either reduced or cease
to have such a large impact on speech production as the child’s linguistic competence increases.
Although this can be considered a parsimonious explanation, due to the fact that no discontinuity between adult and child knowledge of grammar has to be accounted for,
there are two major problems with this approach. First, it has been argued by Theakston et al. (2001) that the theory does not make clear predictions. Logically, it is
hard to see how any data could contradict such an account. This is because a combination of different performance limits could, in theory, predict any pattern of
acquisition. For example, auxiliary omission could be predicted if we argue that restrictions on utterance length lead children to omit items that carry the least
semantic information. Conversely, subject omission could be explained by a performance limit that acts on the beginning of the sentence. Second, Pine, Lieven & Rowland
(1998) argue that this approach attributes too much knowledge to the child and cannot explain the restricted grammar found in children’s speech. For example, Valian
argues that children as young as 2 know that if you want to say that you ate a cake, you say ‘I ate a cake’, not ‘me ate a cake’. Knowledge about which pronoun to
use (I, me, my) is called case-marking. Valian argues that 2 year olds correctly use the nominative case (I) in subject position. Pine et al. report that although
children do correctly put subjects at the start of sentences, the children’s use of the accusative pronoun (me) in the correct position is not significantly different
from chance. In other words, children with adultlike knowledge of case-marking should handle ‘me’ and ‘I’ equally well, but in fact they make many more errors with ‘me’ than with ‘I’.
Competence theorists such as Andrew Radford (1990) and Nina Hyams (1986) explain the discrepancy between adult and child speech by arguing that certain aspects of
innate linguistic knowledge are not accessible by the child initially, coming ‘on-line’ as maturation progresses. One of the main driving forces behind the development
of both competence and continuity theories is the phenomenon of subject omission, in which obligatory subjects are omitted to produce phrases such as “want milk”
rather than “I want milk”. Hyams (1986) accounts for this finding in her ‘pro drop hypothesis’, in which she argues that the category of subject is mandatory, but that
the ‘subject’ parameter is initially set to a null value (correct for some languages, such as Italian); English-speaking children have to reset this parameter value
before they will achieve the correct grammar. This is accomplished by paying attention to features in the linguistic input which inform the child that subjects are
obligatory. Valian (1991) argues that this explanation cannot be correct as her analysis of American children (who regularly omitted subjects) and Italian children
shows a different pattern of subject use between the two populations. It is not the case, therefore, that young English-speaking children are mis-identifying English
as Italian. An additional problem with this sort of parameter-setting account is that one would expect an abrupt transition between the production of subjectless
utterances and grammatically correct utterances once the parameter is correctly set. Bloom (1990) notes that this does not occur – the transition is a gradual one.
Furthermore, Valian (1990) points out that it is hard to explain how the ‘triggering’ process, which re-sets parameters, can consistently be relied upon given that the
linguistic environment contains a lot of noise and examples of incorrect grammatical rules.
A quite different example of a competence theory is Radford’s (1990) ‘small-clause hypothesis’. For Radford, the difference between child and adult speech is that
children’s sentences lack functional categories. Radford argues for three stages of acquisition. First, children’s speech is acategorical. Children seem to have little
or no knowledge of how to access grammatical properties and rules. Second, at about 20 months, children enter the categorical stage in which grammatical knowledge
starts to mature, but they only have access to some categories such as noun, verb, adjective and preposition. Finally at about 24 months, the functional categories of
determiner, inflection and complementizer mature. The absence of functional categories in the child’s linguistic system before age 2 means that there is a range of
structures the child cannot produce. Radford cites evidence from a large number of child speech corpora showing that structures such as the possessive ‘s’, case-marked
pronouns, modal auxiliaries, the infinitive ‘to’ and nominative case-marking do not appear in children’s data until at least age 2.
Unfortunately, evidence from other studies seems to suggest that the data does not fit the predictions of the theory. In fact, it is now generally agreed that,
contrary to the small clause hypothesis, items associated with the functional categories (e.g. “a doggie”) are present in the earliest multi-word speech and that
children continue to make grammatical errors well after the functional categories are hypothesised to come on-line (Gathercole & Williams, 1994; Rowland, 2000).
What are the Universal Rules of Grammar?
A problem with nativism is the lack of an adequate description of the universal rules of grammar. Other languages are often very different to our own. For example,
English and Dutch are grammatically very similar when compared with English and Japanese; however, even between English and Dutch
there are some large differences. For example, in English we use the preposition ‘on’ to mean ‘on’ anything, regardless of the means of support: A cup on the table, a
coat on the hook, words on the page. In Dutch there are two words for ‘on’ – ‘op’ and ‘aan’ – ‘op’ for on the table or on the chair and ‘aan’ for on the wall or on the
hook. For languages that are even more different the situation seems even less universal. We have already looked at the ‘pro drop’ or ‘obligatory/optional subject’
parameter. Hyams first suggested that there is a universal rule that states that the subject of a sentence is either obligatory or optional; all children had
to do was to learn whether their language had optional or obligatory subjects and set their subject parameter accordingly. However, recent research on Asian
languages suggests that this is far too simplistic. In Japanese, children can leave the subject out, but only in certain situations. The optionality is decided by
discourse factors such as whether the information carried by the subject is new or old. It is hard to see how to create a universal grammatical rule that says ‘either
have obligatory subjects or optional subjects but pay attention to the discourse to see when you can leave the subject out in the particular language you are learning;
be careful though because the circumstances in which you can leave it out may be quite complex’. With a rule this broad, there seems to be little advantage in having
universal rules at all and, if it is so difficult to capture language universals, perhaps the input – the language we hear – is more important than the nativists would
have us believe.
Constructivist theorists argue that language can be learnt, although not in the simple behaviourist way suggested by Skinner. They agree that something must be innate;
there must be something in us all that allows us all to learn language. However, they differ from nativists in two important respects. First, they do not believe that
there is an innate language acquisition device; rather, there is a predisposition to be good at picking up the types of things that are necessary to learn language.
Second, they do not believe that the innate mechanism is all that important in learning language. Most of the work is done by children learning from their experiences
with the environment.
Although the logical argument from learnability theory appears to pose a problem for non-nativist accounts of language acquisition, this has been questioned by a
number of researchers such as Rohde & Plaut (1999), who suggest that statistical learning techniques provide evidence that natural languages are learnable. Although
statistical and distributional learning methods have been widely investigated, it is not clear exactly what contribution this type of learning makes with respect to
the problem of language acquisition (how it interacts with other aspects of language acquisition), what statistical information is available to children, and how this
information is utilised. However, there is a strong case for the argument that innate knowledge of language is not necessary. This does leave the problem of explaining
how, in the absence of innate knowledge, language is learned. There are many different types of constructivist theory, some of which are variously known as
‘empiricist’, ‘interactionist’ and ‘emergentist’ theories; however, all will be referred to here as ‘constructivist’.
Early Constructivist Theories
Social Interactionist Theory
The fundamental argument of constructivist accounts of language acquisition is that the input received by children is not impoverished – rather, it contains all the
information necessary for language learning. Catherine Snow (1977) argues that child-directed speech is different to adult-directed speech in that it is simpler and
‘cleaner’, therefore easier to learn language from, though it is still complex and consistent with ‘full’ adult speech. Child directed speech – also referred to as
‘motherese’ and ‘babytalk’ – is characterised by things that make it easier for children to decode the meaning of sentences. These include slow speech rate, making it
easier for the child to segment words from the speech stream; exaggerated intonation, making stress patterns more obvious to the child; a high fundamental frequency
(or pitch), which children are ‘tuned’ to; repetition; simple syntax, making the grammatical rules of language more obvious; and simple and concrete vocabulary. Also,
although Brown & Hanlon (1970) found that parents do not explicitly correct children’s grammar (the ‘no negative evidence’ problem), social interactionists have found
that they do provide children with more subtle feedback. They expand or recast children’s utterances into a grammatically correct equivalent and they prompt for more
details and ask follow-up questions. These responses contain an implicit correction of children’s grammar.
There are problems with this idea, though. First, child directed speech contains a lot of questions. Interrogatives (e.g. “what are you doing?”) have a more complex
linguistic structure than declaratives (e.g. “Susie likes the horse”) and some argue that this makes child directed speech more complex for children. Second, child
directed speech as described above is not universal. In some cultures people do not use child directed speech at all. In other languages (e.g. Quiche Mayan) child
directed speech is used, but is characterised by different patterns such as a low fundamental frequency rather than a high one. It is hard to argue that both low and
high fundamental frequencies make language learning easier!
Semantic and Cognitive Accounts
Although the issue of whether child directed speech is necessary for language learning is debated, the claim that speech to children is necessarily impoverished has
lost some of its weight as a result. At the very least, it suggests that the role of input should be studied in more detail before it can be dismissed. However, if
constructivists were to argue that children do learn language, they still had to explain how it was learnt. One solution to this problem was to hypothesise that
children accessed grammar through a more salient route. Early constructivist accounts of acquisition, therefore, argued that children could learn the grammatical
categories underlying sentences by associating them with their semantic or cognitive equivalents.
Semantic accounts (e.g. Bates & MacWhinney, 1982) are based on the notion that syntax is learned by associating grammatical categories with their semantic equivalents;
semantic roles (agent, action, patient etc.) are mapped onto their syntactic counterparts (subject, verb, object etc.). In a ‘typical’ semantic constructivist theory,
words are classified by semantic properties and the grammatical properties of these words are gradually analysed. Words that cannot be classified in this
straightforward manner are compared in terms of their grammatical properties as well as their semantic properties. The key idea is that words are analysed both in form
and function and narrow categories are formed on this basis. These eventually become adult-like syntactic categories. Two arguments against semantic theories come from
Maratsos. First, he notes that it is not clear that children construct initial semantic categories (Maratsos, 1979). Second, he argues that grammar cannot be acquired
in this way as not all languages have reliable semantic-syntactic correspondences (Maratsos, 1999).
Another proposal (e.g. Bates, 1979) is that language development follows on from the child’s mastery of the relevant cognitive achievements. According to Piaget
(1955), conceptual development provides the basis for linguistic development. Piaget thus accounted for the fact that language does not develop until the second year
of life by arguing that children spent the first 18 months exploring the world, discovering the natures of things in the world and constructing the ability to use
symbols to represent objects. It is not until 18 months that we have the conceptual structures necessary for language. We have seen (see section 3) that children’s
knowledge and abilities develop much younger than Piaget thought. However, other theorists have used Piaget’s ideas to develop theories of language development which
predicate linguistic achievements on cognitive achievements. For example, according to Gopnik and Meltzoff (1986), disappearance words (e.g. “gone”) appear when
children have mastered object permanence. Wh-words such as “when” and “why” are mastered once the child has acquired the concept of temporal and causal conjunctions.
This seems like a sensible idea; however, it is very hard to test. A lot of supportive research comes from experiments showing correlations between the age at which
children achieve a particular cognitive skill (e.g. symbolic play) and related language (e.g. mental state words such as “believe” and “think”). However, correlation
does not equal cause – just because two things occur together doesn’t mean that one causes the other. Children start to combine words into phrases at about 2 years of
age, at about the same age as they start to gain bladder control, but no one would suggest that gaining bladder control causes language acquisition. Without
evidence of a causal relationship, the theory is not really supported. Further criticism comes from studies of children with severe physical or mental
disabilities. For Piaget, development is reliant on acting upon and experiencing the world – exploring your surroundings, but children who are unable to have the usual
interaction with the environment are still able to master language. A group of children born in the late 1950s and early 1960s, whose mothers had taken the drug thalidomide during
pregnancy, were born with missing limbs and were unable to have typical sensorimotor experiences, but were just as capable of acquiring language as other children.
Also, children with severe cognitive learning disabilities (e.g. Williams syndrome) often develop good language skills. Severe cognitive deficits do not seem
necessarily to lead to the severe language difficulties that cognitive theory would predict.
More recently, constructivist theories have used statistical and distributional information about language to argue that children can learn language from the input
they receive. The semantic-distributional approach includes a number of different accounts with one aspect in common: They are based on the idea that distributional or
positional commonalities in the language guide children’s learning of grammatical categories and rules. Theorists such as Martin Braine (1987) argue that nativist and
traditional semantic accounts attribute to children categories and rules that are more abstract than the data would indicate they possess. They suggest instead that
children learn language by picking up distributional patterns (as well as the semantic regularities and similarities common to certain word classes) and use these to
form syntactic categories. For example the phrases “a dog”, “a cat”, “a bird” and “a car” all follow the pattern ‘a + noun’. Grammatical rules and categories are
implicit in the distributional characteristics of language. Children attend to the patterns present in language and are able to form syntactic categories on the basis
of these patterns, using general cognitive tools. Other examples of such patterns are ‘noun + s’ for plurals and ‘verb + ed’ for the past tense. Braine (1987) suggests
that we first learn individual words, then we build up connections between related words (e.g. grow, show, mow; growed, showed, mowed). In this way, we learn the
grammatical patterning of our language. Distributional theories are often implemented as computer programs – cognitive models. There are two broad classes of cognitive
model: symbolic models and connectionist models. Connectionist models are computer models that learn by building up connections between related words, in the same way
that Braine suggested for children. Rumelhart and McClelland (1986), for example, have had some success at modelling past tense acquisition. We will look at an example
of a symbolic cognitive model of language acquisition in the next section.
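To make the idea of distributional learning concrete, the following is a minimal sketch in Python of a learner that groups words according to the word that precedes them (the ‘a + noun’ pattern described above). It is an illustration only: the toy utterances and the function name group_by_left_context are invented here, and the sketch is far simpler than the connectionist and symbolic models discussed in this unit.

```python
from collections import defaultdict

# Toy input; these utterances are invented for illustration.
utterances = [
    "a dog", "a cat", "a bird", "a car",
    "the dog", "the cat",
    "you grow", "you show", "you mow",
    "we grow", "we show",
]

def group_by_left_context(utterances):
    """Map each word to the set of words that directly follow it."""
    followers = defaultdict(set)
    for utterance in utterances:
        tokens = utterance.split()
        # Pair every word with the word immediately after it.
        for left, right in zip(tokens, tokens[1:]):
            followers[left].add(right)
    return followers

followers = group_by_left_context(utterances)

# Words sharing the frame 'a + ___' form one candidate category,
# words sharing the frame 'you + ___' form another.
noun_like = followers["a"]
verb_like = followers["you"]
```

Here the words that follow ‘a’ (dog, cat, bird, car) fall into one candidate category and the words that follow ‘you’ (grow, show, mow) into another, without the learner being given the labels ‘noun’ or ‘verb’ in advance.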
Braine’s model has been criticised as being too simplistic by nativists such as Pinker (1984), who argues that there are simply too many possible interconnections
for the child to keep track of. Also, Sabbagh & Gelman (2000) accept that domain-general learning mechanisms are undoubtedly powerful (and necessary), but
query whether they are sufficient. They ask how we are able to classify words in terms of abstract categories, unless we already have knowledge of these categories.
Similarly, Pinker (1987) notes that distributional learning can explain the identification of items with respect to their syntactic categories on the basis of context,
but regards this as problematic because these contexts have to exist already. A strong constructivist perspective, such as the new emergentist school of thought
(MacWhinney, 1999), argues that this is something of a red herring – there is no need for abstract categories to exist prior to language learning. These can emerge
from the information present in the linguistic environment, when coupled with powerful, yet constrained, learning mechanisms.
Constructivist models have also been criticised on logical grounds. Certain formal properties of human languages, such as the presence of long-distance
dependencies, have been seen as ruling out such accounts. Consider the following sentences:
The cats chase mice
The girl who likes cats chases mice
In the first sentence, the verb ‘chase’ agrees with ‘cats’ – this is the ‘subject-verb’ agreement mentioned earlier. In the second sentence, the subject is ‘the girl who
likes cats’ and the verb ‘chases’ agrees with ‘the girl’. This is an example of a long-distance dependency. There is some ‘distance’ between ‘girl’ and ‘chases’
whereas, in the first sentence, ‘cats’ directly precedes ‘chase’. The argument against distributional accounts is that it would be impossible for either children or
computer programmes to use the long-distance information in the second sentence. However, recent work by Elman (1993) has shown it is possible for a relatively simple
distributional learning mechanism to learn such dependencies. This suggests that such logical arguments should be treated with a certain amount of scepticism since
they derive much of their power from the way in which they conceptualise language acquisition as a single logical problem rather than as a complex developmental process.
Regardless of the specific theory used to describe the process of acquisition, what becomes clear is that there is much evidence in support of the general
constructivist paradigm. With regard to lexical learning, Huttenlocher et al. (1991) demonstrated that there is a relationship between individual differences in
vocabulary growth and the amount of parental speech input children receive. Redington et al. (1993), who used cluster analysis techniques to examine a large corpus
of speech data, showed that grammatical categories are implicit in the linguistic input children receive. The results of this cluster analysis, based on the
distributional statistics of language, provide evidence that syntactic categories are clearly differentiated in adult speech.
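The logic of this kind of analysis can be sketched as follows. This is a minimal Python illustration, not the procedure used by Redington et al.: the five-sentence corpus, the function names and the choice of cosine similarity as the comparison measure are all assumptions made for the example. Each word is described by counts of the words that appear immediately before and after it, and words with similar counts are treated as candidates for the same category.

```python
import math
from collections import Counter, defaultdict

# Toy corpus; invented for illustration, not real child-directed speech.
corpus = [
    "the dog sees the cat",
    "the cat sees the bird",
    "a bird likes a dog",
    "a dog likes the bird",
    "the cat likes a dog",
]

def context_vectors(sentences):
    """For each word, count the words immediately to its left and right."""
    vectors = defaultdict(Counter)
    for sentence in sentences:
        # Boundary markers so that first and last words also have two neighbours.
        tokens = ["#START"] + sentence.split() + ["#END"]
        for i in range(1, len(tokens) - 1):
            vectors[tokens[i]][("left", tokens[i - 1])] += 1
            vectors[tokens[i]][("right", tokens[i + 1])] += 1
    return vectors

def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[key] * v[key] for key in u if key in v)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

vectors = context_vectors(corpus)
noun_similarity = cosine(vectors["dog"], vectors["cat"])
verb_similarity = cosine(vectors["sees"], vectors["likes"])
mixed_similarity = cosine(vectors["dog"], vectors["sees"])
```

In this toy corpus ‘dog’ and ‘cat’ share contexts and so score well above zero, as do ‘sees’ and ‘likes’, whereas ‘dog’ and ‘sees’ share no contexts and score zero. A cluster analysis would group words on the basis of scores of this kind.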