Reflections on leading a Randomised Controlled Trial
Randomised Controlled Trial. The words themselves sound clinical, conjuring up images of lab rats in cages and doctors in white coats. Indeed, until recent years, Randomised Controlled Trials (RCTs) were used almost exclusively in medical research. However, whilst they undoubtedly have their critics, RCTs are increasingly seen as the gold standard of educational research too, because random allocation guards against the selection bias inherent in traditional, more qualitative methods. As a Research School, a large part of our remit is interpreting the evidence from large-scale trials (such as those conducted by the EEF), which provide us with invaluable information about interventions which have been shown to work in other contexts. Nevertheless, the prospect of conducting our own trial was daunting to say the least. So, when, back in October, we were invited to join a ‘Neuroscience-Informed, Teacher-Led RCT project’ led by the Education Development Trust and the Wellcome Trust, we jumped at the chance.
We knew straight away that we wanted to focus our trial on retrieval practice. Like the majority of teachers up and down the country, we have been grappling with preparing students for the demands of the new curriculum and the longer, tougher exams they will sit at the end of their courses. From trial and error in our own classrooms, and from reading the existing research (this summary, from Deans for Impact, is a good starting point), we felt that retrieval practice was at least part of the answer in helping teachers and students overcome these challenges. However, we had no way of knowing whether retrieval practice (as summarised brilliantly here by the Learning Scientists), just one of many classroom interventions in place in our schools, was the intervention which was going to make the difference. This is where the RCT process is crucial: we needed to test our hypothesis.
Once a hypothesis has been arrived at, the next important step in the RCT process is the trial design. The possibilities here are almost endless, but our key mantra was to keep things simple. You need to know at the outset what you are going to measure and how; once this has been decided, everything else falls into place quite easily. We chose to test whether regular retrieval practice (in the form of simple 5-question quizzes at the start of lessons) improved students’ retention of key vocabulary. Students were tested before the start of the trial (the ‘pre-test’) and again at the end (the ‘post-test’) using a 50-word multiple-choice question (MCQ) vocabulary test. This would allow us to collect gained score data (each pupil’s post-test score minus their pre-test score) which could be analysed to see whether the intervention had had an impact.
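As an illustration of the gained-score idea, here is a minimal sketch (the scores are invented; the real tests were 50-mark MCQs):

```python
# Hypothetical pre- and post-test scores (out of 50) for three pupils.
pre_scores = [22, 30, 17]
post_scores = [31, 36, 25]

# Gained score = post-test score minus pre-test score for each pupil.
gains = [post - pre for pre, post in zip(pre_scores, post_scores)]
print(gains)  # [9, 6, 8]
```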
We recruited 286 Year 8 students to our trial (12 classes in total). Classes were randomly assigned to either ‘Control’ or ‘Intervention’ (the randomisation process is actually very straightforward: you can simply use the RAND function in Excel, as we did, or literally pick names out of a hat). During the 7-week trial, students in the Control group (‘business as usual’) were taught the regular scheme of work for the English Literature unit they were studying, while the Intervention group were taught a slightly modified scheme of work where they received a 5-question vocabulary quiz at the start of each lesson on key words which had been introduced in previous lessons. That was it. Staff were aware of whether they were teaching a control or intervention group and were given appropriate training and resources to deliver the content effectively. Students, however, were not aware that they were taking part in a trial, so had no reason to respond any differently to how they would usually (in RCT language this is known as ‘mundane realism’ and is desirable, as participants who are aware they are in a trial can often behave differently simply because of this awareness. This is known as the ‘Hawthorne effect’ after the trial where this behavioural change was first observed).
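For anyone curious what the RAND-style randomisation looks like outside Excel, here is a rough Python equivalent (the class labels are made up; the seed is fixed purely so the sketch is reproducible):

```python
import random

# Hypothetical labels for the 12 Year 8 classes in the trial.
classes = ["8" + letter for letter in "ABCDEFGHIJKL"]

# Mirror the Excel RAND approach: attach a random number to each class,
# sort by it, then split the list down the middle.
random.seed(1)  # fixed seed only so this example is reproducible
shuffled = sorted(classes, key=lambda _: random.random())
intervention, control = shuffled[:6], shuffled[6:]
print(len(intervention), len(control))  # 6 6
```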
So far, so good. The trial progressed without incident and two months later, we had our data. This was the part we had been waiting for: the chance to calculate an ‘effect size’ which would tell us whether our hypothesis was right. But it was also the part we were most intimidated by. As a geography teacher, I have always felt I had a reasonably good understanding of statistics, but this took things to a whole new level. Luckily, we had access to a range of whizzy spreadsheets that took the hard work out of this stage for us (for those not so fortunate, there are a range of freely available tools, including Excel Stats Wizard, R stats and the EEF’s own DIY evaluation guide). And whilst we didn’t even pretend to understand some of the calculations involved, we finally had some actual results in the form of an r-value (the effect size) and a p-value (the probability of seeing results at least this extreme if the intervention had made no real difference).
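For readers who would rather see the arithmetic than trust a spreadsheet, here is a rough sketch of one effect-size and p-value calculation using only the Python standard library. The gain scores are invented; the effect size shown is Cohen’s d (one common measure, not necessarily the one our spreadsheets used); and the p-value uses a normal approximation to the t distribution, which is only reasonable for samples far larger than this toy example:

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

# Hypothetical gain scores (post-test minus pre-test) for each arm.
control_gains = [3, 5, 2, 6, 4, 7, 1, 5, 4, 6]
intervention_gains = [4, 6, 3, 7, 5, 6, 2, 8, 5, 7]

n1, n2 = len(control_gains), len(intervention_gains)

# Pooled standard deviation across the two arms.
s_pooled = sqrt(((n1 - 1) * stdev(control_gains) ** 2 +
                 (n2 - 1) * stdev(intervention_gains) ** 2) / (n1 + n2 - 2))

# Cohen's d: difference in mean gains, in pooled-SD units.
d = (mean(intervention_gains) - mean(control_gains)) / s_pooled

# Two-sided p-value for the difference in means, approximating the
# t distribution with a normal (fine for large samples; with only ten
# pupils per arm, as here, the p-value comes out large).
t = (mean(intervention_gains) - mean(control_gains)) / (s_pooled * sqrt(1 / n1 + 1 / n2))
p = 2 * (1 - NormalDist().cdf(abs(t)))
print(round(d, 2), round(p, 2))  # 0.53 0.24
```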
Now this was the exciting part...except, at first glance, we found our results a little underwhelming. Although our p-value was very encouraging (a result of our large sample size and effective trial design), our overall effect size came in at 0.2 – indicating a positive impact, but only a relatively small one. Feeling a bit frustrated – after all, our gut told us that this intervention should really work – we decided to analyse our results further, looking at pupils within the trial grouped by prior attainment, gender, attendance and so on. And at this point we really did get excited. What we found was that for all but one of the classes who had received the intervention, there was a significant positive effect size of 0.5, showing that regular retrieval practice was associated with gained scores which were more than double those of their peers. This was true for pupils regardless of gender, prior attainment or disadvantage. There was then one intervention class whose results didn’t fit this pattern and for whom the intervention appeared to have little impact; this group had lower attendance on average than the others, which may explain their lower scores, but a replication would be needed to investigate this properly.
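The subgroup analysis described above amounts to grouping gain scores by a pupil characteristic and comparing the group averages. A minimal sketch with invented data, assuming records of (subgroup, gain) pairs:

```python
from collections import defaultdict
from statistics import mean

# Hypothetical (subgroup, gain score) records; the real analysis cut the
# data by prior attainment, gender, attendance and disadvantage.
records = [
    ("high prior attainment", 9), ("high prior attainment", 7),
    ("low prior attainment", 8), ("low prior attainment", 6),
]

# Collect the gains for each subgroup, then compare average gains.
by_group = defaultdict(list)
for group, gain in records:
    by_group[group].append(gain)

for group, gains in sorted(by_group.items()):
    print(group, mean(gains))
```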
So what happens now? Well, for us, the next steps are threefold. Firstly, we have been spurred on by our initial findings and want to replicate our trial on a bigger scale; we will probably apply for an Innovation Evaluation Grant from the IEE to help us do this. Secondly, we feel that we have some concrete evidence to take back into our schools and use as a basis for reviewing and modifying current classroom practice. We will use the EEF’s recently published Implementation Guide as a model for leading this change. Finally, having lived and breathed it ourselves for the last few months, we now feel much better equipped to support other schools with the RCT process. RCTs might not be able to provide the answers to every research question, but ours has certainly provided clarity for us and we look forward to helping other schools achieve this too.
To find out more about our work or get in touch regarding your research journey, visit our website https://www.shottonhallresearchschool.co.uk/