Why the world’s learning company has to love data


This article is a response to a piece we published on this blog by Dr Ben Williamson of Stirling University, in which he explores an “emerging criticism of Pearson among education researchers”. In his piece Dr Williamson refers to two activities of Pearson: the Centre for Digital Data, Analytics and Adaptive Learning and The Learning Curve. Dr. Kristen DiCerbo and Dr. John Behrens of the Centre jointly author the response on the former; and Dominic Collard responds on the points made in relation to The Learning Curve.

We would like to thank Dr. Williamson for his interest in Pearson’s use of data and our research efforts. We believe strongly that open dialogue is key to the world making progress in education.


Dr. Kristen DiCerbo - (Centre for Digital Data, Analytics and Adaptive Learning.)

One of the questions Dr. Williamson asks is “Who at Pearson is collecting the data, designing the algorithms to analyse it, and checking the analytics for their accuracy?”, so I thought that would be a good place to begin. I have written this with my colleague John Behrens, who is referenced with me in Dr. Williamson’s article.

One of our great delights in our years at Pearson is how much the company values our varied backgrounds as social science researcher-practitioners and supports our involvement in public and academic discourse. My Ph.D. is in educational psychology; after completing a school psychology program and becoming a certified psychologist, I worked in schools as a school psychologist and continued on to a research career that included observing classroom instruction around the world. John was a social worker before obtaining degrees in special education and educational psychology with cognates in instruction and cognition, as well as measurement, statistics & methodological studies. He was a professor for 10 years and gave up tenure to work in contexts in which he could apply learning and analytics at scale.

So when Dr. Williamson writes that Pearson is “beginning to challenge the existing authority of social scientists and psychologists to study, understand and produce new knowledge about key aspects of education such as assessment and learning,” we find this a surprising conclusion, since we in fact identify as social scientists and psychologists (and technologists and educators).

Like Dr. Williamson, we believe the digital revolution is a remarkable event in the evolution of human interaction. We study this phenomenon and participate in academic communities to reflect, discuss, and have broad interchanges about these societal changes. We, and other colleagues at Pearson, publish papers, present in open scientific forums, and provide ongoing community service through external advisory boards, editorial boards, and support of journals with peer review, among other activities. We support graduate student training with internships and mentoring and have had a broad range of collaborations with academics in fields related to our interests. This provides future scholars with unique access to both the processes and challenges of research and innovation in business environments.

We are, however, not just researchers, but practitioners as well. A common concern for researchers and sponsors of research is the lack of a mechanism for translating learning science research into practice. An embedded research group is one way to make that happen. We feel privileged to work side by side with product design and engineering teams to help build the most efficacious products and services we can for our customers. This means not just studying corporations and other loci of innovation and development, but working within them. We believe that we have a responsibility as stewards of educational data to conduct research to further our understanding of both learning and data methodology. There is no reason this activity should be the sole province of academia or organizations based on their tax status. Education is a complex endeavor and, as Dr. Williamson points out, requires many actors and perspectives.

Our theory of action

So, what exactly is Pearson trying to accomplish with the funding of data and learning science research? Dr. Williamson asks, “why is Pearson investing in such a massive effort to conduct educational data science?” Our answer is simple: we want to serve students, parents, teachers, and administrators in the best possible way, by considering all the tools that can be fruitfully brought to bear.

Like our goal, our theory of action is simple: Better data analysis → better understanding of students’ attributes/curriculum/learning trajectories → better instructional decisions → improved learner outcomes.

By using better data analysis techniques applied to data captured from better designed activities, we hope to build more complete and accurate models of learners’ knowledge, skills, and attributes that will provide better information to teachers and learners and provide systems that are relevant to each student’s individual proficiency levels, interests, and current states. As we discussed in Impacts of the Digital Ocean on Education (DiCerbo & Behrens, 2014), our starting point on this journey is not that we should make the natural activity of society more digital, but rather that, as it is already happening organically, the educational community needs to understand the opportunities and challenges that emerge. If students are working in digital systems throughout the year, we think it essential to give them feedback along the way, and irresponsible to ignore the opportunity. Indeed, it is our hope that increased awareness about learner progress throughout the year can change the balance of need for the much-maligned annual test. We are proud to work at a company that emphasizes learner outcomes (see our efficacy efforts for more on this) and whose results can be accepted or rejected by the consumer.

Our belief system

Dr. Williamson states that one of his main concerns is that our work is, “premised on a kind of big data belief system which assumes that massive quantities of data can reveal truthful and meaningful patterns about the reality they’re taken from—that the data can speak for themselves free of human bias.”  While this is a common characterization of modern analytics writ large, a simple review of our writing suggests a different stance. Way back in 1997 John wrote (with Mary Lee Smith in the Handbook of Educational Psychology) that data analysis must be understood in the “context of history, the context of application, the context of practice and the context of alternative methods” (p. 945). More recently he advised the Learning Analytics community that “The successful learning analyst will avoid two common errors: Failure to understand the context and failure to become intimately familiar with the data.” (Learning Analytics & Knowledge Conference, 2013).

In the Impacts of the Digital Ocean on Education paper, the following figure is one of our favorites:

[Figure from Impacts of the Digital Ocean on Education (DiCerbo & Behrens, 2014)]

In the paper, we write, “Data is only a representation or symbol of what happens in the world. In most contexts, the goal of data collection and analysis is to provide insight and inform decisions. Accordingly, there is a long chain of reasoning that needs to be considered.” We recognize that data is a representation of the world and, like all representations, it is an imperfect system which will not perfectly capture the detail of the world. We also believe that all of the activity coming after that (analysis, interpretation, etc.) is a human endeavor, involving all the benefits and challenges that implies. This view of data analysis as a human process that requires an understanding of context and social negotiation is, in fact, a consistent theme over our careers, as reflected in such works as Why People Are the Real Power Behind Big Data, Technological Implications for Assessment Ecosystems, and Activity Theory and Assessment Theory in the Design and Understanding of the Packet Tracer Ecosystem. Finally, interested readers can read more about how to avoid being “fooled by data” in our writings on exploratory data analysis (here, here, and here, for example).

Final thoughts

We hope that Dr. Williamson is correct that we are well-positioned to create new knowledge and methods. Pearson is a dynamic and evolving company working in a dynamic and evolving set of social, technological, political and economic contexts.  We are energized by the opportunity to serve the global community of learners and educators, and to work at the intersection of academic exploration and end-user service.

Dr. Williamson asks about what our work looks like “from the inside.” Given our experiences across a variety of research settings, we think he would be surprised to see how much the work we do looks just like work done in education research labs everywhere, with the added component that we are directly implementing our findings to impact the lives of learners. Just as with anyone else interested in what we do, we would be delighted to take him through our work in more detail.


The Learning Curve (Dominic Collard)

At Pearson, we believe that data helps unlock the secrets of learning. That alongside the know-how and experience of teachers and educators, data can reveal things that are invisible to the human eye and the human brain, and so help us all make better decisions.

It’s a belief that requires data to be not just robust, but also seen and used. The professional researcher may be comfortable navigating through labyrinths of numbers, but most of the rest of us are not. Teachers, parents, government officials… anyone interested in what is working well in education - most of us probably don’t have the time, the skills or the inclination to get really deep into the data.

The Learning Curve - essentially a collection of thousands of education data points collected from all over the world over the last 25 years - is one attempt to make data seen and used by more people. We want people to discover their own conclusions and draw their own correlations between education inputs (e.g. spend, teacher salaries, class sizes) and education and socioeconomic outcomes (e.g. literacy levels, graduation rates, crime and unemployment).

None of the data on The Learning Curve ‘belongs’ to Pearson. The Economist Intelligence Unit gathers it all for us, from sources such as the OECD, UNESCO, The World Bank and the International Labour Organisation, to name just a few. Dr Williamson is correct that the EIU is an independent business within The Economist Group, in which until recently Pearson had a stake. And it is equally true that few other organisations could manage the systematic and regular collection of the wide range of data that The Learning Curve demands.

All that data is then presented via a range of interactive visualisations, designed so the user is able to control the parameters of what they are seeing. For instance, you may like to know how the US and the UK compared in 2001 for public expenditure per pupil as a % of GDP. Or you may like to play that comparison out for all countries across 25 years. At a few touches of a button you can do both, and everything in between. Or, if you are confident using large spreadsheets of data, then we also give you the option of downloading everything to an Excel file. The Learning Curve has been specifically designed so nobody has to second guess what the user wants to understand, or the method by which they want to discover it.

There is another section of the site - the Index - which I suspect Dr Williamson is referring to when he argues it “limits what kinds of analyses can be done and what can be said about the data because it has been designed to prioritize the measurement and comparison of ‘effective’ education…”. The Index is an attempt to rank countries based on their overall education performance - a global league table of education standards. We think the way we have calculated where countries rank stands up to scrutiny (and we provide a full explanation of the methodology on the site so people can judge for themselves) - but we also know that you could legitimately calculate this in many, many other ways. We have never suggested the Index should be seen as the final say, and have always gone to great lengths to explain that it is just one interpretation, whose value reduces the more you read it in isolation. Pearson would absolutely agree with Dr Williamson on the importance of understanding “...social and cultural context, emotional complexity, and the qualitative dimensions of human relations” in education systems. The truth is, though, that for now these things are much harder to measure and collect data on. That’s why we see The Learning Curve as the start of the conversation, not the end.

There is one more point I would like to make about The Learning Curve, one that I appreciate is not brought up by Dr Williamson. It is free. As long as you have an internet connection and a device to access it, you can spend as long as you like exploring what it has to reveal; three-quarters of a million people worldwide have done so.

The Learning Curve is not a modest undertaking for Pearson - in terms of cost or time - and there is no immediate revenue incentive for us either. Of course, we hope that it helps our reputation and so our ability to take part in the conversations that shape education. And, yes, that should then help our commercial performance in the long-run. But the absolute opposite will be the case if The Learning Curve somehow doesn’t stand up; if somehow we are using it to steer people away from the evidence and towards something we’d wish they’d believe, if only it were true.

Like my colleagues Kristen and John, I’d be delighted to spend time with Dr Williamson to show him behind the scenes of The Learning Curve, and of course get his view on where we might be able to improve things.