Written by: Lucie Betáková, English Department
Pedagogical Faculty, University of South Bohemia
Jeronýmova 10, 371 15 České Budějovice, Czech Republic
In this paper I would like to say a few words about my experience with testing the speaking skills of advanced learners at university level.
I usually teach practical English courses to students in the final year of their studies. There are two major types of courses: pre-service courses for future teachers of English, and courses for students of two languages (English and German, or English and French) who are preparing for work in business or in the administration of the European Union. It is important to say that the course I teach is a GPE (general purpose English) course; the language of the students’ specialisation (teaching, business or administration) is taught in specialised courses.
At the end of the course there is an exam, the final language exam in the study programme. The aim of the exam is to test the overall language proficiency of the learners, whose level should reach that of CPE (the Cambridge Certificate of Proficiency in English) or at least be somewhere between CAE (the Cambridge Certificate in Advanced English) and CPE. For the exam I chose a test battery, which (according to Underhill) usually consists of several tests of different kinds: structure, extended writing, listening and an oral test. The battery I designed consisted of a listening task (a multiple-choice exercise), a reading comprehension test (open-ended or short-answer questions), an essay based on the reading passage, and a test of communicative performance.
The main aim of this contribution is to describe the oral test.
Oral tests

There are many formats for testing speaking. We can divide them according to the type of interaction which takes place during the test.
According to Underhill, there are four basic possibilities for this interaction:
a) The testee speaks to an interviewer who is also the assessor.
b) The testee speaks to an interlocutor who is not involved in assessment.
c) The testee speaks to another testee.
d) The testee speaks to a group of testees.
Underhill (p. 7) then explains the terms interviewer, interlocutor and assessor. Instead of the word testee he uses the term learner, which does not, in his words, imply being only an object of testing.
An interviewer is a person who talks to a learner in an oral test and controls, to a greater or lesser extent, the direction and topic of the conversation. He may intervene but should not talk too much. An interviewer also takes the role of the assessor.
An interlocutor is a person who talks with a learner in an oral test and whose specific aim is to encourage the learner to display, to the assessor, his oral fluency in the best way possible. Some oral tests have such a person, whose job is to help the learner to speak but who is not required to assess him. An interlocutor is not an assessor. He may be known to the learner, for example as his teacher.
An assessor is a person who listens to a learner speaking in an oral test and makes an evaluative judgement on what he hears. The assessor will be aided by pre-defined guidelines such as rating scales, which give considerable help in making these judgements. Ultimately, the decision is a subjective one, which is to say that it is a human one made on the basis of judgement, intuition and experience. Having more than one assessor usually means a more reliable judgement.
The roles of the interlocutor and the assessor may be combined. This is the most common and most economical arrangement, but as Underhill points out, it is difficult for one person to concentrate on assessing effectively while at the same time trying to appear interested in what the learner is saying and involved in serious communication with him. This dual role is particularly tiring, and frequent rest breaks are necessary (Underhill, p. 28).
Learner-learner interaction

As has been pointed out, the testee can speak either to the interviewer, who also serves as the assessor, or to the interlocutor, who is not directly involved in assessing the testee. For the learner this need not make much difference. The important thing is whether the person who communicates directly with the learner pays enough attention to what the learner is saying. The learner should be informed beforehand what the roles of the two people, interlocutor and assessor, are going to be in the test. The assessor usually takes notes to remember the learner’s mistakes. I think it is also important to tell the learner that notes will be taken, and to explain at the same time that the assessor does not consider only the weaknesses but also notes down the strengths of the learner’s speech. Not every note the assessor takes therefore represents a mistake in the learner’s speech.
Another possibility for oral testing is the learner talking to another learner or to a group of learners. Then we speak about learner/learner interaction.
The idea is that two learners speak together to carry out a set task, while the assessor listens without intervening. The assessor can then concentrate fully on the performance of the learners because he no longer has to worry about keeping the conversation going and eliciting the language.
The advantage of this type of interaction is that the learners may feel they are talking to someone whose language level is approximately the same as their own and whose interests are very similar, unlike the possible interests of the interlocutor or assessor. This can make the communication more fluent, natural and authentic. In my experience it makes the learners (with only a few exceptions) quite willing to speak.
Authors dealing with the testing of oral interaction suggest that some care has to be taken in pairing the learners, because of their personality (extraversion and introversion), their interests and especially their language level. The suggested technique is to put together learners of the same or very similar level of proficiency.
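The pairing suggestion above can be illustrated with a minimal sketch. It assumes that each learner has some numeric proficiency score (for example from an earlier written test); the function name, the names of the learners and the scores are hypothetical and do not come from the exam described in this paper. Personality and interests, which the authors also mention, are not modelled here and would still need the teacher's judgement.

```python
def pair_by_level(scores):
    """Pair learners whose proficiency scores are closest to each other.

    `scores` maps a learner's name to a hypothetical numeric score.
    Learners are sorted by score (ties broken by name) and each learner
    is paired with the next one in the ranking. With an odd number of
    learners, the highest-ranked one is left unpaired.
    """
    ranked = sorted(scores, key=lambda name: (scores[name], name))
    pairs = [(ranked[i], ranked[i + 1]) for i in range(0, len(ranked) - 1, 2)]
    unpaired = ranked[-1] if len(ranked) % 2 else None
    return pairs, unpaired


# Hypothetical example data, for illustration only.
example_scores = {"Anna": 82, "Petr": 64, "Jana": 80, "Karel": 66, "Eva": 90}
example_pairs, example_unpaired = pair_by_level(example_scores)
```

With the example data, Petr and Karel form one pair, Jana and Anna another, and Eva remains unpaired, so the assessor would have to decide whether she joins a pair or is tested in another format.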
Techniques for testing learner/learner interaction

Discussion, conversation
Underhill points out that this is the most natural thing in the world: two people having a conversation on a topic of common interest. It is also the hardest to make happen in the framework of a language test. It can only occur when both parties are relaxed and confident and have something to contribute to the conversation. Then the conversation itself becomes dominant and the real purpose, testing, becomes subordinate. The oral test then reaches the highest degree of authenticity. The learners here, unlike in the interview, have the initiative in bringing up a new topic, developing it or bringing it to a close. The directions taken by the conversation are the result of the interaction between the people involved in a negotiation process.
In practice, success depends very much on the ability of the interviewer to create the right atmosphere. It is a question of the personality of the interviewer but, I think, also of the learners and their mutual relationship. When authentic conversation takes place, says Underhill, a test suddenly becomes a human encounter, a meeting of three people. It is true that only learners with quite a high level of proficiency are sufficiently equipped linguistically to feel at ease.
Other types of activities for testing spoken interaction include learner/learner decision making and discussing various types of input (e.g. a video sequence or a reading passage). The task usually involves taking information from written documents and coming to a decision or consensus about certain questions through discussion.
Another useful technique for testing spoken interaction is role-play. If it is used well, it can reduce the artificiality of the classroom and provide a reason for speaking, especially a reason for talking to other learners. The situations and roles must be selected with the needs and interests of the students in mind.
A possible problem with role-plays is the reluctance of some learners to participate, because role-play implies pretending either to be someone else or to be in an imaginary situation, and some learners find it quite difficult to simulate such a situation. Some people are willing to pretend, others are not. Role-playing with the assessor or teacher is especially unpleasant, first because the two are not at the same level of language proficiency, and also because the assessor is always in a position of power, so the learners may find it particularly difficult to pretend that they are friends or colleagues, if required. I have some experience with using role-plays with advanced learners. I would say the activities were usually very successful, but it took the students some time to get started. At the beginning they were usually shy and did not want to be the first to speak. It usually helped when I took a funny role myself; it broke the ice. I only used more complicated role-plays and simulations with highly motivated and really advanced learners. I would also like to point out that the success of this type of activity depends very much on the atmosphere in the class and the relationships among the members of the group. The group has to be supportive; the people I used it with had known each other, and me, for some time.
With less advanced and less motivated learners it is better to use more controlled role-plays, in which the students are told at least in general terms what they should say or are given a model conversation. All these facts, I think, speak against the use of role-plays for testing purposes. I myself would probably use them only for testing simple conversational routines.
Most teachers will agree that assessing or marking spoken language, and spoken interaction in particular, is a real pain. In subjective tests such as speaking or free writing, both inter-marker and intra-marker reliability are rather low. This does not mean, however, that we should not test free speaking and writing. Within the communicative method, oral communication is crucial, so we should not be afraid to test it. If we decide to test spoken interaction, we should keep in mind the importance of validity and design a test which really measures spoken interaction, that is, one which involves the participants in meaningful and authentic spoken communication with other participants. This means devising a subjective test but making it reliable, i.e. producing consistent results.
There are two ways of marking productive skills: global marking and analytic marking. In analytic marking the assessor has a set of categories (criteria) and gives a separate mark for each category.
The most commonly used criteria for assessing spoken interaction are the following:

  • Grammar
  • Vocabulary
  • Pronunciation, stress, intonation
  • Style and fluency
  • Content

Underhill suggests his own performance criteria:

  • Size (length) of the speech
  • Complexity (whether the learner attempts to use complex language or not)
  • Speed of the speech
  • Flexibility (whether the learner is able to adapt to changes in topic or task)
  • Language accuracy
  • Appropriacy of the language
  • Independence
  • Repetition (how often the question or stimulus has to be repeated)
  • Hesitation (how much the learner hesitates)

We can see that some criteria can be assessed quite objectively, others cannot.
The advantage of analytic assessment is that it is up to the assessor to choose the criteria corresponding to the aims or purposes of the test.
Global or impression marking, on the other hand, means that the assessor awards a mark on the basis of the learner’s overall performance without examining any particular features. It is especially useful for categories which are very difficult to measure but very important for successful communication, such as fluency, authenticity and naturalness of speech.
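The analytic marking described above, with a separate mark for each criterion, can be illustrated with a minimal sketch. It assumes that each criterion is marked on the same numeric scale and that the overall mark is a weighted average; the weighting scheme, the function name and the example marks are assumptions made for illustration and are not the procedure used in the exam described in this paper.

```python
def analytic_mark(marks, weights=None):
    """Combine separate criterion marks into one overall mark.

    `marks` maps a criterion name to the mark awarded for it.
    `weights` optionally maps criterion names to their relative
    importance; if omitted, all criteria count equally.
    The result is the weighted average of the marks.
    """
    if not marks:
        raise ValueError("at least one criterion mark is required")
    if weights is None:
        weights = {criterion: 1.0 for criterion in marks}
    missing = set(marks) - set(weights)
    if missing:
        raise ValueError(f"no weight given for: {sorted(missing)}")
    total_weight = sum(weights[criterion] for criterion in marks)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    weighted_sum = sum(marks[criterion] * weights[criterion] for criterion in marks)
    return weighted_sum / total_weight


# Hypothetical marks on a 1-5 scale, for illustration only.
example_marks = {
    "grammar": 4,
    "vocabulary": 5,
    "pronunciation": 3,
    "fluency": 5,
    "content": 3,
}
# Hypothetical weighting in which fluency counts double.
example_weights = {
    "grammar": 1,
    "vocabulary": 1,
    "pronunciation": 1,
    "fluency": 2,
    "content": 1,
}
```

Calling `analytic_mark(example_marks)` treats all five criteria equally, while `analytic_mark(example_marks, example_weights)` gives fluency more influence. Choosing the criteria and their weights is how the assessor adapts the marking to the aims of a particular test.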
My test

For my oral test I chose interaction between two students, with myself as both interlocutor and assessor. The reason was that I wanted the interaction to be as natural as possible and to engage the students in real discussion. I was sure that if I had been the only partner in the discussion, the students would have been shy, would always have given me the right to speak, and would have said much less than they do when they speak to a fellow student. In such a situation the students might even avoid the topics they find most controversial.
I also wanted to avoid engaging a separate assessor, because I felt that it would make the situation more formal and the students would feel less at ease.
I therefore let the students speak to each other and only reacted to what they said or answered when they asked me a question. At the end I asked both students some questions. I usually found what they had said quite interesting, so I wanted to clarify something or find out more about the issue, and I also wanted to show them that I had been listening carefully and was interested in what they said.

For the format I chose a discussion, for the reasons already stated. Discussion is very natural, students feel more at ease in it than in role-play, and it is also much easier to prepare. To elicit enough language from the students I decided to base the discussion on controversial statements. From my practice I know, however, that students strongly dislike some controversial statements from textbooks and often have nothing to say about them, whereas other topics are very successful. That is why I decided to give the students topics which would correspond to their knowledge and interests. In one of our language sessions I asked the students to carry out a brainstorming activity which would produce 20 controversial topics for the exam discussion. Here is the list of topics the students produced:

  1. Mobile phones are a real nuisance and should be banned.
  2. Men and women solve problems in a different way.
  3. We can never understand our parents and they can never understand us.
  4. The ideal partner should be first of all intelligent. (or handsome, or rich, etc.)
  5. Christmas is all about money.
  6. Nowadays, people are not able to appreciate the beauty of nature.
  7. Are we becoming a nation of hypermarkets?
  8. Teaching is art, not science.
  9. Being single is in, marriage is out.
  10. What should an attractive woman be like? Always young and anorexic?
  11. Men should participate in running the household.
  12. Young people don’t read any more. They watch TV.
  13. The Internet helps us find useful information.
  14. Drugs should never be legalized.
  15. I am proud of being Czech.
  16. Culture shock: does it exist?
  17. There is no such thing as a national character.
  18. There is a lack of communication among people nowadays.
  19. What is the role of the mother tongue in our society?
  20. English should become a compulsory language for all basic school children.

Giving the students the topics beforehand means giving them time to prepare. Is it good or bad?
Underhill asks this question in his book. He believes that the advantage of test preparation is that it promotes learners’ confidence. On the other hand, the more learners are allowed or encouraged to prepare for a specific test, the less their performance will represent their ordinary oral ability in a natural situation. The best test is so authentic and flexible that the only preparation possible is successful learning, and learning just for the test becomes impossible.
I expected that the students would take the chance to look up some topic-related vocabulary which would enable them to speak about the topic in depth. What they had really done, I realised later, was to discuss some of the topics they found difficult to speak about. They had not rehearsed anything; they had just talked about the topics in pubs or coffee shops. This was really interesting because it made their oral production even more natural, as they could start the conversation with statements like "Well, we spoke about this problem yesterday and we thought...", "we had totally different opinions", or "we found out that...".
So I realised that the exam did not push the students so much to work on the language; rather, it pushed them to think about many problems of modern life, which, I think, is not a bad by-product. The students’ preparation for the test also showed that such a test can have a very positive washback effect on the students’ preceding autonomous learning, since it gets them engaged in natural discussion. Such a test might promote discussion not only in English but also in the students’ mother tongue, which is also valuable. Discussion involves not only presenting one’s own opinions but also listening to other people’s opinions and giving them a chance to speak and express themselves. The test format I chose appeared to be quite successful, as can, I hope, be seen from the accompanying video.
An oral test at a higher level of language proficiency will take longer, because the learners will need more time to demonstrate their greater proficiency and, especially, to express their ideas. A test based on more general aims will also require more time to generate a big enough sample of language on which to base the assessment.
According to Underhill, the best test lasts as long as it takes the interviewer to form a confident judgement. My test was over when the students had nothing more to contribute, felt happy with what had been said, and I had a big enough sample to be able to assess them.
If one of the pair spoke much less than the other, I always asked him or her some questions to make sure I elicited enough language.
Bachman, L. F. 1990. Fundamental Considerations in Language Testing. Oxford: Oxford University Press.
Hughes, A. 1989. Testing for Language Teachers. Cambridge: Cambridge University Press.
Underhill, N. 1987. Testing Spoken Language. Cambridge: Cambridge University Press.
Weir, C. J. 1990. Communicative Language Testing. Prentice Hall International.