Computer-based language assessment: the future is here

David Booth

Many people are surprised at the idea of a computer program marking an exam paper. However, computer-based testing already exists in a wide range of formats and fields, and many of the tests that form part of our daily life are taken on computers. If you’ve ever learned to drive, sat a citizenship test, done a training course at work, or completed a placement test for a language course, the odds are that you’ve already taken an automated test.

Yet despite it being so common, there is still a lack of understanding when it comes to computer-based language assessment and how a computer can evaluate productive skills like speaking and writing.

Computer-based testing: a closer look

A common issue is that people have different ideas of what these tests entail. Computers can fulfill several essential roles in the testing process, but these often go unacknowledged. For example, a variety of test questions are needed to administer an exam, along with relevant data, and computers are used to store both the questions and the data. When it comes to creating randomized exams, computer software is used to select the exam questions, based on this data.
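To make the idea of software-selected questions concrete, here is a minimal sketch in Python. It is only an illustration, not how PTE Academic or any real exam is assembled: the `Question` fields, the 0.0 to 1.0 difficulty scale and the `assemble_exam` function are all assumptions made up for this example.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Question:
    question_id: str
    skill: str         # hypothetical label, e.g. "reading" or "listening"
    difficulty: float  # hypothetical scale from 0.0 (easy) to 1.0 (hard)


def assemble_exam(bank, questions_per_skill, difficulty_range=(0.0, 1.0), seed=None):
    """Randomly pick questions for each skill from a stored question bank."""
    rng = random.Random(seed)  # a fixed seed makes the same exam reproducible
    low, high = difficulty_range
    exam = []
    for skill, count in questions_per_skill.items():
        # Use the stored data (skill and difficulty) to narrow the pool.
        pool = [
            q for q in bank
            if q.skill == skill and low <= q.difficulty <= high
        ]
        if len(pool) < count:
            raise ValueError(f"not enough {skill} questions in the bank")
        # sample() draws without replacement, so no question repeats.
        exam.extend(rng.sample(pool, count))
    return exam


# A toy question bank: five reading and five listening questions.
bank = [
    Question(f"{skill[0].upper()}{n}", skill, difficulty=n / 10)
    for skill in ("reading", "listening")
    for n in range(1, 6)
]

exam = assemble_exam(
    bank,
    {"reading": 2, "listening": 3},
    difficulty_range=(0.2, 0.5),
    seed=7,
)
```

Each call with a different seed produces a different exam drawn from the same bank, while the stored data keeps every version within the same skills and difficulty range.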

Computers can make complex calculations far more quickly and accurately than humans. This means that processes that previously took a long time are completed in days, rather than weeks.

Artificial intelligence (AI) technology is now capable of grading exam papers, for example. This means a shorter wait for exam results. In PTE Academic, candidates receive their results in an average of two days rather than waiting weeks for an examiner to mark their paper by hand.

The benefits for students and teachers

People take exams to prove their skills and abilities. Depending on their goals, the right result can open the door to many new opportunities, whether that is simply moving on to the next stage of a course, or something as life-changing as allowing you to take up your place on a university course in another country.

A qualification can act as a passport to a better career or an enhanced education, and for that reason, it’s important that both students and teachers can have faith in their results.

Computer programs have no inherent bias, which means that candidates can be confident that they will all be treated the same, regardless of their background, appearance or accent. PTE Academic, just one of Pearson’s computer-based exams, offers students the chance to score additional points on the exam with innovative integrated test items.

This integration means that the results are a far more accurate depiction of the candidate’s abilities and provide a truer reflection of their linguistic prowess.

More than questions on a screen

It’s not as easy as simply transferring the questions onto a computer screen. All that does is remove the need for pen and paper; this is a missed opportunity to harness the precision and speed of a computer, as well as its learning potential.

Tests that have been fully digitized, such as PTE Academic, benefit from that automation: it eliminates examiner bias, makes the test fairer and calculates the results more quickly. Automated testing builds on the technological tradition of opening doors for the future – not closing them.

How technology enhances language testing

The development of automated testing technologies doesn’t merely make the examination process quicker and more accurate – it also gives us the chance to innovate. Speaking assessments are an excellent example of this.

Previously, this part of a language exam involved an interview, led by an examiner, who asked questions and elicited answers. But now that we have the technological capability, using a computer offers students the chance to be tested on a much wider range of speaking skills, without worrying about the inherent bias of the examiner.

Indeed, the use of a computer-based system facilitates integrated skills testing. Traditionally, language exams had separate papers focusing on the four skills of reading, listening, speaking and writing. But the more modern concept of language testing aims to assess these linguistic skills used together, just as they are in real-life situations.

Afterwards, the various scores are categorized to allow learners an insight into their strengths and weaknesses, which helps both students and teachers identify areas which need improvement. This useful feedback is only possible because of the accuracy and detail of automated exam grading.

The space race on paper

Back in the 1960s, during the space race, computers were still a relatively new concept. Katherine Johnson, one of the first African-American women to work for NASA as a scientist, was a mathematician with a reputation for doing incredibly complex manual calculations. Although computers had made the orbital calculations for John Glenn's 1962 flight, the astronaut asked for Katherine to check those calculations by hand before he would fly.

This anecdote reminds us that, although computer technology is an inherent part of everyday life, now and then we still need to check that these systems are working as they should. Human error still comes into play – after all, humans program these systems.

PTE Academic – a fully digitized exam

Almost every stage of PTE Academic, from registration to practice tests to results (both receiving them and sharing them with institutions), happens online. It may come as a surprise to learn that the test itself is not taken online. Instead, students attend one of over 295 test centers to take the exam, which comes with the highest levels of data security.

This means that each student can sit the exam in an environment designed for that purpose. It also allows the receiving institutions, such as universities and colleges, to be assured of the validity of the PTE Academic result.

The future is here

We created computers, but they have surpassed us in many areas – exam grading being a case in point. Computers can score more accurately and consistently than humans, and they don’t get tired late in the day, or become distracted by a candidate’s accent.

The use of AI technology to grade student responses represents a giant leap forward in language testing, leading to fairer and more accurate student results. It also means more consistency in grading, which benefits the institutions, such as universities, that rely on these scores to accurately reflect ability.

And here at Pearson, we are invested in staying at the cutting edge of assessment. Our test developers are incorporating AI solutions now, using its learning capacity to create algorithms and build programs that can assess speaking and writing skills accurately and quickly. We’re expanding the horizons of English language assessment for students, teachers and all the other professionals involved in each stage of the language learning journey.

More blogs from Pearson

  • AI scoring vs human scoring for language tests: What's the difference?

    By Charlotte Guest
    Reading time: 6 minutes

    When entering the world of language proficiency tests, test takers are often faced with a dilemma: Should they opt for tests scored by humans or those assessed by artificial intelligence (AI)? The choice might seem trivial at first, but understanding the differences between AI scoring and human language test scoring can significantly impact preparation strategy and, ultimately, determine test outcomes.

    The human touch in language proficiency testing and scoring

    Historically, language tests have been scored by human assessors. This method leverages the nuanced understanding that humans have of language, including idiomatic expressions, cultural references, and the subtleties of tone and even writing style. Human scorers can appreciate the creative and original use of language, potentially rewarding test takers for flair and originality in their answers. They are particularly effective at evaluating progress or achievement tests, which assess a student's language knowledge after a particular chapter or unit, or at the end of a course, and so reflect how well the test taker is performing in their language studies.

    One significant difference between human and AI scoring is how they handle context. Human scorers can understand the significance and implications of a particular word or phrase in a given context, while AI algorithms rely on predetermined rules and datasets.

    The adaptability of human scorers, who adjust and learn from new information as they mark, also contributes significantly to the effectiveness of scoring in language tests.

    Advantages:

    • Nuanced understanding: Human scorers are adept at interpreting the complexities and nuances of language that AI might miss.
    • Contextual flexibility: Humans can consider context beyond the written or spoken word, understanding cultural and situational implications.

    Disadvantages:

    • Subjectivity and inconsistency: Despite rigorous training, human-based scoring can introduce a level of subjectivity and variability, potentially affecting the fairness and reliability of scores.
    • Time and resource intensive: Human-based scoring is labor-intensive and time-consuming, often resulting in longer waiting times for results.
    • Human bias: Assessors, despite being highly trained and experienced, bring their own perspectives, preferences and preconceptions into the grading process. This can lead to variability in scoring, where two equally competent test takers might receive different scores based on the scorer's subjective judgment.

    The rise of AI in language test scoring

    With advancements in technology, AI-based scoring systems have started to play a significant role in language assessment. These systems utilize algorithms and natural language processing (NLP) techniques to evaluate test responses. AI scoring promises objectivity and efficiency, offering a standardized way to assess language proficiency level.

    Advantages:

    • Consistency: AI scoring systems provide a consistent scoring method, applying the same criteria across all test takers, thereby reducing the potential for bias.
    • Speed: AI can process and score tests much faster than human scorers can, leading to quicker results turnaround.
    • Great for more nervous test takers: Not everyone likes taking a test in front of a person, so AI scoring removes that extra stress.

    Disadvantages:

    • Lack of nuance recognition: AI may not fully understand subtle nuances, creativity, or complex structures in language the way a human scorer can.
    • Dependence on data: The effectiveness of AI scoring is heavily reliant on the data it has been trained on, which can limit its ability to interpret less common responses accurately.

    Making the choice

    When deciding between tests scored by humans or AI, consider the following factors:

    • Your strengths: If you have a creative flair and excel at expressing original thoughts, human scorers might appreciate your unique approach more. Conversely, if you excel in structured language use and clear, concise expression, AI-scored tests could work to your advantage.
    • Your goals: Consider why you're taking the test. Some organizations might prefer one scoring method over the other, so it's worth investigating their preferences.
    • Preparation time: If you're on a tight schedule, the quicker turnaround time of AI-scored tests might be beneficial.

    Ultimately, both scoring methods aim to measure and assess language proficiency accurately. The key is understanding how each approach aligns with your personal strengths and goals.

    The bias factor in language testing

    An often-discussed concern in both AI and human language test scoring is the issue of bias. With AI scoring, biases can be ingrained in the algorithms due to the data they are trained on, but a well-designed system can remove this bias and provide fairer scoring.

    Conversely, human scorers, despite their best efforts to remain objective, bring their own subconscious biases to the evaluation process. These biases might be related to a test taker's accent, dialect, or even the content of their responses, which could subtly influence the scorer's perceptions and judgments. Efforts are continually made to mitigate these biases in both approaches to ensure a fair and equitable assessment for all test takers.

    Preparing for success in foreign language proficiency tests

    Regardless of the scoring method, thorough preparation remains, of course, crucial. Familiarize yourself with the test format, practice under timed conditions, and seek feedback on your performance, whether from teachers, peers, or through self-assessment tools.

    The distinctions between AI and human scoring in language tests continue to blur, with many exams now incorporating a mix of both to leverage their respective strengths. Understanding and interpreting written language is essential in preparing for language proficiency tests, especially for reading tests. By understanding these differences, test takers can better prepare for their exams, setting themselves up for the best possible outcome.

    Will AI replace human-marked tests?

    The question of whether AI will replace markers in language tests is complex and multifaceted. On one hand, the efficiency, consistency and scalability of AI scoring systems present a compelling case for their increased utilization. These systems can process vast numbers of tests in a fraction of the time it takes markers, providing quick feedback that is invaluable in educational settings. On the other hand, the nuanced understanding, contextual knowledge, flexibility, and ability to appreciate the subtleties of language that human markers bring to the table are qualities that AI has yet to fully replicate.

    Both AI and human-based scoring aim to accurately assess language proficiency levels, such as those defined by the Common European Framework of Reference for Languages or the Global Scale of English. On these scales, a level like C2 or 85-90 indicates that a student can understand virtually everything, has mastered the foreign language, and may even have superior knowledge compared to a native speaker.

    The integration of AI in language testing is less about replacement and more about complementing and enhancing the existing processes. AI can handle the objective, clear-cut aspects of language testing, freeing markers to focus on the more subjective, nuanced responses that require a human touch. This hybrid approach could lead to a more robust, efficient and fair assessment system, leveraging the strengths of both humans and AI.

    Future developments in AI technology and machine learning may narrow the gap between AI and human grading capabilities. However, the ethical considerations, such as ensuring fairness and addressing bias, along with the desire to maintain a human element in education, suggest that a balanced approach will persist. In conclusion, while AI will increasingly play a significant role in language testing, it is unlikely to completely replace markers. Instead, the future lies in finding the optimal synergy between technological advancements and human judgment to enhance the fairness, accuracy and efficiency of language proficiency assessments.

    Tests to let your language skills shine through

    Explore Pearson's innovative language testing solutions today and discover how we are blending the best of AI technology and our own expertise to offer you reliable, fair and efficient language proficiency assessments. We are committed to offering reliable and credible proficiency tests, ensuring that our certifications are recognized for job applications, university admissions, citizenship applications, and by employers worldwide. Whether you're gearing up for academic, professional, or personal success, our tests are designed to meet your diverse needs and help unlock your full potential.

    Take the next step in your language learning journey with Pearson and experience the difference that a meticulously crafted test can make.

  • Using language learning as a form of self-care for wellbeing

    By Charlotte Guest
    Reading time: 6.5 minutes

    In today’s fast-paced world, finding time for self-care is more important than ever. Among a range of traditional self-care practices, learning a language emerges as an unexpected but incredibly rewarding approach. Learning a foreign language is a key aspect of personal development and can help your mental health, offering benefits like improved career opportunities, enhanced creativity, and the ability to connect with people from diverse cultures.

  • Understanding dialects in the English language

    By Charlotte Guest
    Reading time: 7 minutes

    Language reflects the diversity of human culture and society. Among its most fascinating parts are dialects, regional or social varieties of a language distinguished by pronunciation, grammar and vocabulary. Dialects are the heartbeat of a language, pulsing with the rich stories, traditions and identities of those who speak them.

    Understanding dialects and their significance can enrich the learning experience for language learners, offering a deeper appreciation of a language and its speakers. Dialects are not just variations within a language; they are often considered separate entities, each with its own rich history and cultural significance, highlighting the complexity and diversity of linguistic expression.

    What exactly is a regional dialect?

    At its core, a dialect is a variation of a language spoken by a particular group of people. However, the distinction between a dialect and a separate language can often be subjective. These variations can occur for geographical, social, ethnic, or historical reasons. While all speakers of a language share the same basic grammar rules and vocabulary, those speaking different dialects might use unique words and slang or have distinct pronunciations, highlighting the lack of an objective difference between dialects and languages.

    For instance, British and American English are two dialects of the English language that are mutually intelligible, meaning speakers of either dialect can understand, and be understood by, the other. They share the same foundational grammar and most of the core vocabulary but differ in pronunciation, spelling, and some aspects of vocabulary and idioms. Similarly, within Britain or the United States, there are numerous regional dialects (e.g., Yorkshire English, Southern American English) that further showcase the diversity within a single language. Some of these dialects are considered by their speakers to be distinct languages, emphasizing the complex nature of linguistic identity and classification.

    What is an example of a dialect?

    An example of dialect variation can be seen in the Italian language, which boasts a wide range of regional dialects spoken across Italy.

    For example, the Tuscan dialect has historically been recognized as the basis for standard Italian, largely due to its use in influential literature. However, other dialects from regions like Sicily or Lombardy vary significantly from Tuscan Italian in terms of pronunciation, vocabulary and syntax, reflecting the diverse cultural landscapes and histories of Italy’s regions.

    Another example of dialect variation within a single language is found in the United Kingdom. For instance, the Cockney dialect, originating from London’s East End, is renowned for its rhyming slang and distinct vowel sounds, serving as a prime example of spoken dialects that emphasize the importance of oral tradition. In contrast, the Geordie dialect, native to Newcastle and the surrounding areas, boasts an entirely different set of vocabulary, pronunciation patterns, and even grammatical structures, further highlighting the diverse range of spoken dialects within the English language.

    What is the difference between a dialect and an accent?

    The distinction between a dialect and an accent is subtle yet significant. An accent relates solely to differences in pronunciation: the distinct manner in which people say words, often influenced by unique speech patterns that can vary significantly across different languages and regions.

    In contrast, a dialect encompasses not only pronunciation but also specific grammar and vocabulary. Accents can be a component of a dialect, but dialects offer a broader spectrum of linguistic variety, including lexical and grammatical differences.

    For instance, someone might speak English with a Scottish accent but use the same grammatical structures and vocabulary as an English speaker from London; however, Scots, a variety spoken in Scotland, is considered a dialect (or even a separate, distinct language, by some) because it possesses unique grammar, vocabulary and pronunciation.

    Why are different dialects important?

    Dialects are more than just linguistic variations; they are windows into communities' cultural and social fabric. They carry with them histories, traditions and the identity of their speakers. Some dialects are even considered 'distinct languages' by their speakers, highlighting the deep cultural significance of these linguistic forms. Learning about dialects, including regional dialects, can thus offer insights into:

    • Cultural contexts: Understanding the dialects of a language, especially regional dialects, can provide language learners with a richer cultural understanding and a more nuanced perspective of the language’s speakers. This exploration into regional dialects reveals the arbitrary distinction between 'standard' and 'nonstandard' dialects, which is often based on social, political, cultural, or historical considerations.
    • Social dynamics: Dialects can reflect social distinctions, historical migrations and contact with other languages, offering clues about social hierarchies, historical conflicts and integrations.
    • Language evolution: Studying dialects reveals how languages change over time, adapting to societies' needs, migrations and innovations.