A Review of Evidence on Literacy ‘Catch-Up’ During Transition to Secondary School

Introduction

This is a rapid review of the existing evidence on interventions to assist children from disadvantaged backgrounds in danger of missing expected levels in literacy at the end of Year 6 and start of Year 7. The focus of the report will be on tested interventions and promising approaches that would be effective for children eligible for the pupil premium. The search will be for international work ‘published’ in English since 2001, involving rigorously evaluated interventions for literacy of 10 to 12 year olds in mainstream schooling.


  • This is a rapid review of evidence on what we know and what we need to know about literacy ‘catch-up’ for pupils moving into Year 7 at secondary school.
  • The evidence consists of 43 studies meeting minimal design and quality standards for programme evaluation, supplemented by expert knowledge of the field.
  • The studies emerged from a systematic search of nine key bibliographic databases including grey literature and unpublished dissertations.
  • Most of the evaluations have been conducted in US settings, and most are not specifically about transition.
  • There is no approach to overcoming low levels of literacy that is definitely known to work for children in mainstream settings. Where programmes are more effective, they have been used with younger children.
  • It seems that neither the timing nor the medium of instruction is what matters. Neither the use of technology nor summer schools is of help in itself. There is no clear pattern of greater success with Year 6 or Year 7.
  • It is not motivation alone that matters. Motivation without competence does not make sense. Therefore providing incentives for inputs is more effective than for outcomes.
  • Specific classroom programmes and interventions generally find it hard to shift patterns of low literacy by age 10 or 11.
  • The individual interventions that reported success tended to be single-issue, clearer and simpler in approach, and further removed from normal practice than the less successful ones. Examples included emphasis on grammar, comprehension, or how to ask and respond to questions.
  • There were some more complex packages that have been quite heavily evaluated, usually with mixed results. These could repay an investigation for the preferred age group in a UK context.
  • The most promising of these is the programme entitled ‘Response to Intervention’. There may also be some merit in cross-age peer-assisted learning.
  • Other elements common to more than one promising intervention include teacher development from the outset, provision of new learning materials, ongoing and preferably individual support for learners, and a focus on those clearly below expected levels. However, there is no suggestion that these elements are either necessary or sufficient.
  • There must be a wide range of as yet untested or as yet undeveloped interventions that could be effective. It is important that new work is not constrained overmuch by the ideas that have been tested before. However, such new ideas must start evaluation at an earlier phase in the design cycle.
  • New studies should be as large as possible, with randomisation, control of diffusion and process evaluation as standard. Outcomes at pre- and post-test should be assessed using the same instrument, preferably a single standardised test, to avoid the temptation of ‘fishing’.


This very rapid review of evidence was commissioned by the Education Endowment Foundation to assist it in identifying areas of work that need doing urgently. The concern is with pupils in England from disadvantaged backgrounds, such that they might count towards the pupil premium, and who leave primary school with below the ‘expected’ level of literacy. The review set out to identify, evaluate and synthesise:

  • the most promising interventions worldwide that have an impact on the literacy of pupils in upper primary Year 6 (5th grade or last year of elementary school) or lower secondary Year 7 (6th grade or first year of middle school)
  • the key features of interventions that help pupils identified as having literacy difficulties at primary school to ‘catch up’ in secondary school
  • areas and approaches where possible interventions have been shown not to work
  • and areas and approaches where possible interventions are as yet untested

The focus is on literacy. However, where evidence is found relevant to numeracy at the same stages it can be included. The purpose of the review is to help uncover what is known about what works to improve primary:secondary transition for disadvantaged pupils with lower than expected literacy. Therefore, we report here only on the evaluation of interventions.

Methods used for review

We sought evidence through a systematic search of nine key electronic databases covering education, sociology and psychology, and supplemented what we found with literature known to us from our previous work in the field. We limited the search to studies reported from 2001 to May 2012, in the English language. Some studies from before this period were also included because they were well-cited pioneering work, directly relevant, and validated by the What Works Clearinghouse.

We conducted a broad search using very general keywords like secondary transition, Key Stage 2, Key Stage 3, Year 6, Year 7, grade 5, grade 6, and literacy, reading, writing, English, and randomised controlled trials, trials, interventions, propensity score, instrumental variable, experiment and their synonyms. In this way we sought studies that used an appropriate design to evaluate an intervention that referred to literacy or similar and to the preferred age range of pupils. The initial database searches were also compared to the equivalent search on Google Scholar and to a list of relevant studies already known to us. This helped us to see where relevant studies were not being captured by the search terms and to adjust them accordingly for the full search. The search had to be comprehensive but not unwieldy.

The search began with seven key relevant databases – ASSIA, ERIC, Social Services Abstracts, Sociological Abstracts, PsycInfo, International Bibliography of the Social Sciences, ProQuest Dissertations and Theses. From previous reviews we knew that the majority of randomised controlled trials were in grey literature, particularly dissertations and theses, hence we included the database ProQuest. The syntax for PsycInfo had to be slightly different because it does not recognise the use of “” as is now standard for most other search engines. The syntax settled on was:

((“secondary transition” OR “Key stage 2” OR “Key stage 3” OR “Year 6” OR “Year 7” OR “grade 5” OR “grade 6” OR “grade 7” OR “grade 8”) AND (reading OR literacy OR English OR writing) AND (randomi* OR trial* OR experiment* OR “instrumental variable*” OR “propensity score” OR “difference in difference” OR “regression discontinuity”) AND (intervention*))

This yielded 3,897 hits. Of these, only 82 were judged from reading the abstract to be relevant to the topic searched and not duplicates of each other. If the abstract made it clear that the study did not include children in Years 6 and 7 or their equivalent then it was excluded. If the abstract made it clear that there was no intervention to improve literacy or equivalent then it was excluded. If the abstract was not clear on this, the piece was retained for the present so as not to exclude potentially relevant studies. Again we noted how poor many abstracts are, and how coy the authors appear to be about what they did and found.

This set of reports was supplemented with a search of the British Education Index and the Australian Education Index databases, using equivalent search terms, altered only because the systems are incompatible. It would be very useful if these could be standardised. The syntax here was:

((all(“secondary transition”) OR all(“Key stage 2”) OR all(“Key stage 4”) OR all(“Year 6”) OR all(“Year 7”) OR all(“grade 5”) OR all(“grade 6”) OR all(“grade 7”) OR all(“grade 8”)) AND (all(reading) OR all(literacy) OR all(English) OR all(writing)) AND (all(randomi*) OR all(trial*) OR all(experiment*) OR all(“instrumental variable*”) OR all(“propensity score”) OR all(“difference in difference”) OR all(“regression discontinuity”)) AND (all(intervention*)))
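Both search strings share the same shape: four groups of alternatives joined by OR, with the groups joined by AND. As a minimal sketch of how strings of this shape could be assembled for replication, the following builds a plain version and an all()-wrapped version from shared term lists. The function names and the `wrap` parameter are hypothetical, the term lists are copied from the first string above, and straight quotation marks are used in place of typographic ones, so the output is not guaranteed to match character for character what was submitted to each database.

```python
# Term groups copied from the first search string quoted above.
STAGE_TERMS = ["secondary transition", "Key stage 2", "Key stage 3",
               "Year 6", "Year 7", "grade 5", "grade 6", "grade 7", "grade 8"]
SUBJECT_TERMS = ["reading", "literacy", "English", "writing"]
DESIGN_TERMS = ["randomi*", "trial*", "experiment*", "instrumental variable*",
                "propensity score", "difference in difference",
                "regression discontinuity"]
INTERVENTION_TERMS = ["intervention*"]
GROUPS = [STAGE_TERMS, SUBJECT_TERMS, DESIGN_TERMS, INTERVENTION_TERMS]


def quote(term):
    """Put multi-word phrases in double quotes; leave single words bare."""
    return f'"{term}"' if " " in term else term


def or_group(terms, wrap=None):
    """Join alternatives with OR, optionally wrapping each as wrap(term)."""
    parts = [quote(t) for t in terms]
    if wrap:
        parts = [f"{wrap}({p})" for p in parts]
    return "(" + " OR ".join(parts) + ")"


def build_query(groups, wrap=None):
    """Join the OR-groups with AND inside one outer pair of brackets."""
    return "(" + " AND ".join(or_group(g, wrap) for g in groups) + ")"


plain_query = build_query(GROUPS)                # shape of the first search
wrapped_query = build_query(GROUPS, wrap="all")  # shape of the second search
print(plain_query)
print(wrapped_query)
```

Keeping the term lists in one place means that a change to a term is carried into every database-specific variant, which reduces the risk of the variants drifting apart.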

The full syntax is given here for each search so that replication is possible for interested parties. This second search yielded 78 potential reports which, added to the first 82, gives a total of 160. Of these, two were duplicates of others. The remaining 158 were read as full papers or reports. The fuller reports resolved ambiguities from the abstracts, meaning that 105 further studies were excluded. They are listed below, largely for readers who seek evidence on other age groups or school topics, and for each one we specify the main reason for exclusion. Of course, some studies could be excluded on two or more criteria, for example because they did not involve the right age group, were not about an intervention to improve literacy, and were outside the specified publication dates.

Six studies were excluded as having an inappropriate design for a robust evaluation. These were Bryan et al. (2007), Cates et al. (2007), Compton et al. (2005), McKeown (1997), Moccia (2005) and Culican et al. (2001). Five were not primary research – WWC (2010), Bowers et al. (2010), Sharples et al. (2011), Slavin et al. (2009), Suggate (2010). A further 41 were not about literacy but school transition more generally – What Works. Research about Teaching and Learning. Second Edition (1987), Addressing Barriers to Pupil Learning & Promoting Healthy Development: A Usable Research-Base. A Center Brief (2000), Saxon Math. What Works Clearinghouse Intervention Report (2010), Connected Mathematics Project (CMP). What Works Clearinghouse Intervention Report (2010), Coping Power. What Works Clearinghouse Intervention Report (2011), Agarwal et al. (2010), Allen (2011), Anderson (2006), August et al. (2011), August et al. (2010), Batchelor et al. (2005), Bennett et al. (2003), Bornarel (1990), Buhrman (2010), Capstick (2007), Coffman et al. (2007), Gallagher (1968), Martin et al. (2012), Mazur et al., McMahon (2011), Neale et al. (1966), Orsos et al. (2001), Ostrogorsky (2008), Patrikakou (2004), Prescott (2010), Sanders et al. (2008), Silverthorn et al. (2005), Sinclair et al. (2005), Spier (2010), Springer-Schwatken (2004), Spurgeon (2003), Srofe (2009), Swezey (2004), Valentine et al. (2009), Whitehouse (2002), Williams (2007), Woods et al. (2010), Wynn and Ransom (1977), Zanobini and Usai (2002), Zoblotsky (2003). Seven were interventions concerning general behaviour or skills – Fudge et al. (2008), Jason (1980), Johnson (1991), Meadows (1995), Mendelowitz (1991), Miller (2012), Mitchell (1993). Three concerned English as a second language – Green et al. (2011), Kim et al. (2011), Mannion (2009). A further 16 did not intervene to improve literacy as such – Corcos and Willows (2009), Crowell (2011), Hadley et al. (2010), Hallberg et al. (2011), Hamm et al. (2011), Haring (2007), Hu et al. (2011), Jackson (2010), Jolly and Turner (1979), Lewis (1995), Mustain (2006), Nalls (2011), Outhouse (2008), Tilstra (2007), Wright (2010), Van Keer and Vanderlinde (2010). For example, Van Keer and Vanderlinde (2010) evaluated the impact of cross-age peer tutoring on pupils’ use and awareness of reading strategies, rather than their reading comprehension performance.

Of the remainder, four were not about mainstream education – Diliberto et al. (2009), Ehrlich (1979), Helou et al. (2007), Kellems and Morningstar (2010) – and 23 did not involve the preferred age group or were outside the specified date of publication. These were Success for All[R]. WWC Intervention Report (2009), Arnold (1986), Bishop (1979), Braun et al. (2011), Bruce and Chan (1994), Centeno (2005), Chaparro et al. (2012), Darcy et al. (1974), Dutch and McCall (1974), Johnston (1973), Jones (1987), Kamps et al. (2007), Kim (2006), Lockley, McDiarmid (1993), McKeown (1997), Simmons et al. (2010), Straksis (2010), Torres (2004), Weldy (1991), Williams (1975), Williams (1999), White (2012).
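The screening counts given above can be cross-checked with a short tally. This is only a sketch: the figures are the counts stated in the text for each stage and each exclusion reason, not a recount of the citation lists, and the variable names and shortened reason labels are our own.

```python
# Reports judged relevant from abstracts in each search, as stated in the text.
first_search_relevant = 82
second_search_relevant = 78
duplicates_between_searches = 2

# Stated number of full reports excluded, keyed by main reason
# (labels are shortened paraphrases of the reasons given in the text).
exclusions = {
    "inappropriate design": 6,
    "not primary research": 5,
    "school transition generally, not literacy": 41,
    "general behaviour or skills": 7,
    "English as a second language": 3,
    "no intervention to improve literacy as such": 16,
    "not mainstream education": 4,
    "wrong age group or publication date": 23,
}

combined = first_search_relevant + second_search_relevant
read_in_full = combined - duplicates_between_searches
total_excluded = sum(exclusions.values())

print(f"Combined from both searches: {combined}")        # 160
print(f"Read as full reports:        {read_in_full}")    # 158
print(f"Excluded after full reading: {total_excluded}")  # 105
```

The eight stated reasons account for all 105 exclusions, which agrees with the total given in the text.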

 The remaining 43 reports are included in the narrative synthesis below, supplemented by around the same number of studies traced through the references in the other 43, already known to us through other work, or pointed out to us by colleagues. Work is included irrespective of scale or quality at this stage. Because of time constraints (10-day turnaround) we only reviewed full dissertations/theses where they were available on-line, published as journal articles or presented at conferences. Otherwise we worked with as much as did appear on-line. The Endnote file is available via the authors for interested readers, and a fuller summary of each piece appears as an Appendix to this report.

It is acknowledged to be difficult to make a review comprehensive in the sense of including all relevant material without also having to read a disproportionate set of seemingly relevant material. This review is no different, except that it was conducted rapidly. There will be studies that have been missed. This only matters if their inclusion would have substantially altered the conclusions based on filtering through the 4,000 studies found here. It seems unlikely, and this is the most copious and up-to-date review on this specific topic. A more concerning issue is that there may be studies or commercial evaluations of learning artefacts missed because they have no publicly available or on-line reports. These are perhaps less likely to be positive evaluations than negative or neutral ones. Given that there are also well-known problems like the so-called ‘Hawthorne’ effect, the necessary use of volunteer schools and families in any trial, and the higher effect sizes encountered in research with training, expertise, resource and enthusiasm than in roll out of the same interventions, readers should assume that this review paints a slightly more optimistic picture of any intervention than it ought to.

The logic of this review suggests that the findings should be classified into four groups – a brief section on specific methods of delivering literacy interventions (such as via ICT), interventions where there is considerable promise as a result of previous evaluation, interventions where the evidence for success is inconclusive, and finally a section where interventions show very little promise or have been demonstrated not to work. These groupings are not about individual studies, since no study, however large and high quality, is convincing on its own. The groups into which studies are placed here may be suggested by the studies themselves, or by prior reviews such as those of the What Works Clearinghouse. However, all are presented here based on our judgements, and some studies are deemed relevant to more than one area. No area contains enough robust studies involving the same factors to conduct a formal meta-analysis.

Delivery of literacy interventions

One issue for all potential interventions is their method of delivery, including how and when delivery takes place.

Summer Schools and Programmes

As far as we can tell from the evidence here, summer school programmes in themselves are not effective in improving literacy for pupils in transition. One study of around 2,000 pupils in transition from primary to secondary divided them non-randomly into two groups. It found no difference in literacy gain scores between the group who attended a 50-hour summer literacy school and a control group. Both groups demonstrated an equivalent decline in scores from pre- to post-test (Sainsbury et al. 1998). Therefore, it seems that the reason for any decline over that crucial summer is nothing to do with whether literacy practice and teaching takes place. It could be due to anxiety about changing school, a change in school routine or a different curriculum emphasis. A smaller, more recent study from the US involved 331 pupils from grades 1 to 5 in one school (Kim 2007). Using stratification in terms of pre-test reading ability, pupils were randomly allocated to a treatment or delayed treatment in a waiting-list design. The treatment involved receiving 10 free books to read during the summer vacation, including postcards and letters to stimulate reading. According to self-reports, the treatment group read three more books, on average, than the control group. However, this did not convert to any difference in the literacy scores between the groups after the vacation. The number of pupils in the preferred age range (grade 5) is quite small, and 52 pupils moved away during the summer (proportionately for each group and stratum). Put another way, what these two studies may show is not that summer interventions cannot work, but that it is not enough to do something well-meaning and plausible in summer. For example, it may be necessary to have some further input rather than just providing books. On the other hand, the first study suggests that simply having more ‘school’ over summer does not help either.

Use of technology and software

 Almost exactly the same situation arises from the evidence concerning technology-based literacy enhancement. The use of software, in itself, does not work and is not a solution. The vehicle of delivery, as with the issue of timing with summer schools, is not the active ingredient. The results may depend on the precise activities undertaken. This is especially clear where a technology-based approach is compared directly to another form of literacy learning (unlike the summer schools which were compared with no treatment at all). Khan and Gorard (2012) randomised 23 initial Year 7 literacy classes involving 672 pupils either to standard treatment or use of a piece of widely-used commercial software that claimed to improve literacy levels in six weeks. The process evaluation suggested that staff, pupils and even parents were more enthusiastic about the software approach that was used in all literacy lessons for a full term. Both groups improved their literacy scores, but the standard treatment group made much more progress (an effect size of nearly 0.4 standard deviations from pre- to post-test). Far from helping, the use of software may have hindered. Brooks et al. (2006) report a smaller trial of 155 Year 7 pupils in one school – which may have led to some diffusion of the process, and around 25 pupils dropped out of the study before analysis. Pupils were randomised either to a treatment group receiving one hour per day for 10 days of literacy development via computer, or to standard treatment (with a waiting-list design). There was no statistically significant gain for the treatment group in spelling. The control group had significantly higher gain scores for reading than the treatment. So, again, software by itself does not work, at least not after 10 hours, and may actually harm progress in literacy.
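The pattern described here, in which both groups improve but the comparison group improves more, can be made concrete with a small sketch. The gain scores below are invented purely for illustration and are not data from either trial. The effect-size formula, the difference in mean gains divided by the pooled standard deviation of the gains, is one common choice and is our assumption, since neither study's calculation is described here.

```python
from math import sqrt
from statistics import mean, stdev


def gain_effect_size(treatment_gains, control_gains):
    """Standardised difference in mean gain (treatment minus control).

    Uses the pooled standard deviation of the gain scores; a negative
    value means the control group gained more than the treatment group.
    """
    n_t, n_c = len(treatment_gains), len(control_gains)
    s_t, s_c = stdev(treatment_gains), stdev(control_gains)
    pooled_sd = sqrt(((n_t - 1) * s_t ** 2 + (n_c - 1) * s_c ** 2)
                     / (n_t + n_c - 2))
    return (mean(treatment_gains) - mean(control_gains)) / pooled_sd


# Hypothetical post-test minus pre-test scores for each pupil.
software_gains = [2, 5, 3, 6, 4, 1, 5, 4]
standard_gains = [4, 7, 5, 8, 6, 3, 7, 6]

print(f"Mean gain with software:  {mean(software_gains):.2f}")
print(f"Mean gain with standard:  {mean(standard_gains):.2f}")
print(f"Effect size for software: "
      f"{gain_effect_size(software_gains, standard_gains):.2f}")
```

In this invented example both mean gains are positive, so a before-and-after comparison of the software group alone would look favourable, while the comparison with the control group gives a negative effect size.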

The producers of the software tested by Khan and Gorard (2012) had evidence that their product worked, based on before-and-after data. This demonstrated that pupils using the software for a long period improved their literacy on average, and this claim is valid. What the trial showed was that the control group improved even more. This is why before-and-after designs without a true counterfactual are misleading (Gorard 2013). One such study is that reported by Johnson and Howard (2003). It looked at the impact of using a specific software package on the reading achievement and vocabulary development of 755 3rd, 4th and 5th graders from low socio-economic backgrounds in the US. There was no control and so no randomisation. The paper does not present the improvement scores, but does claim that ‘high’ users of the software showed greater gains than ‘low’ users. Without a real comparator this is meaningless, as the usage might be the cause or the effect of literacy gains, or of something else like motivation. This is no better than the evidence presented by the software manufacturers. Similarly, Meyer et al. (2010) conducted a study to evaluate the effects of a web-based tutoring system on reading comprehension for grades 5 and 7. Two key design features were the type of feedback offered by the system (elaborated or simple) and the degree of choice pupils had in practice lessons (choice or no choice). They explored the effects of these on different measures of reading comprehension and the extent to which gains were maintained across the summer break, giving them 12 treatment and control conditions, and pupils were allocated by stratified randomisation to each. This all sounds perfectly plausible, except that the study involved only 111 pupils in both grades combined, giving an average of fewer than five pupils per grade and treatment arm.
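The cell-size concern about the Meyer et al. (2010) design follows from simple division, sketched below. The calculation assumes pupils were spread evenly across grades and conditions, which the summary above does not state, and the function name is our own.

```python
def average_cell_size(total_pupils, *factor_levels):
    """Average pupils per cell when a sample is split evenly across factors."""
    cells = 1
    for levels in factor_levels:
        cells *= levels
    return total_pupils / cells


# 111 pupils, 2 grades (5 and 7), 12 treatment and control conditions.
pupils_per_cell = average_cell_size(111, 2, 12)
print(f"Average pupils per grade and condition: {pupils_per_cell:.1f}")  # 4.6
```

With between four and five pupils in each cell on average, the loss of a single pupil changes a cell mean substantially.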

Promising approaches

We found no intervention that had been evaluated more than once successfully without also finding an equivalent or greater body of evidence that it does not work. The interventions that have been tried repeatedly either clearly do not work, or the picture is unclear. This section summarises some promising individual evaluations of specific bespoke programmes.

Sundry individual studies with promise

Some specific interventions have apparently only been evaluated once. They are summarised individually and briefly here, if they show some promise at this stage.

Cantrell et al. (2010) looked at LSC (Learning Strategies Curriculum), which is an adolescent reading intervention programme to improve reading comprehension for 6th to 9th grade pupils, as a supplement to the regular curriculum. The study involved 862 6th and 9th grade pupils in 12 middle and 11 high schools. Experimental pupils were exposed to an extra 50-60 minutes of LSC per day over the course of the school year. The initiative comprised a whole-school model involving professional development for all content teachers in content area literacy, and the LSC intervention for 6th and 9th grade pupils who scored two grade levels below their grade level on the pre-test (Group Reading Assessment and Diagnostic Evaluation, GRADE). All pupils were given the whole-school model, but only a randomly selected group of struggling readers received the targeted LSC intervention on top. Teachers on the programme were trained and taught six strategies of the LSC. In the final analysis, pre-test and post-test results were available for only 655 pupils: 302 6th graders (171 intervention, 131 control) and 353 9th graders (194 intervention, 159 control). Outcome measures for pre- and post-test were collected using GRADE, a norm-referenced standardised test of reading achievement. The 6th grade intervention pupils outperformed pupils in the control group on reading comprehension, although the effect size is small (0.22). However, no significant differences between treatment and control groups were found among 9th graders on either NCE or GSC.

Myhill et al. (2012) set out to determine the impact of contextualised grammar teaching on pupil writing and meta-linguistic understanding, investigating the relationships between pedagogical support for grammar teaching, teachers’ subject knowledge about grammar, and improvements in pupils’ writing. The intervention group received detailed support materials, with appropriate training, and the control group only an outline scheme of work without the pedagogical support. Pupils attended 32 different schools and fell within the age range 11-18, by the end of the study numbering 744. All pupils showed a mean improvement in pre- and post-test scores of 9.24%, but the intervention group (n=412) showed a mean improvement of 11.52% against the control group (n=332) of 6.41%. This was deemed significant. Given that the pupils were aged 11-18 from 32 schools, on average there can only have been about three pupils of each age in each school. This is not discussed in the report.
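The figures reported for Myhill et al. (2012) can be checked for internal consistency with a short calculation. This sketch assumes that ages 11 to 18 are counted inclusively as eight age groups and that pupils were spread evenly across schools and ages; neither assumption is stated in the report, and the variable names are our own.

```python
# Group sizes and mean improvements (percentage points) as reported above.
n_intervention, gain_intervention = 412, 11.52
n_control, gain_control = 332, 6.41

n_total = n_intervention + n_control
overall_gain = (n_intervention * gain_intervention
                + n_control * gain_control) / n_total

schools = 32
age_groups = 18 - 11 + 1  # ages 11 to 18 inclusive (assumption)
pupils_per_school_age = n_total / (schools * age_groups)

print(f"Total pupils:               {n_total}")                    # 744
print(f"Weighted mean improvement:  {overall_gain:.2f}%")          # 9.24%
print(f"Pupils per school and age:  {pupils_per_school_age:.1f}")  # 2.9
```

The weighted mean of the two group improvements reproduces the overall 9.24% reported, and the division gives just under three pupils of each age in each school.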

Brown (2004) reports a quasi-experimental pre- and post-test design to test the impact of a question-answering intervention on 267 Year 5 pupils’ reading comprehension, question-answering and vocabulary performance. Participants were from 10 classes across three schools, assigned (in an unspecified manner) to treatment (question-answering program) or control conditions (regular reading classes). There were 167 in the treatment group and 100 in the control group, and teachers volunteered to take either treatment or control classes. Reading comprehension was measured using three tests: the standardised reading comprehension test (Progressive Achievement Tests in Reading: Comprehension) and two curriculum-based written question-answering tests which included a narrative passage and a factual passage, and a test of reading vocabulary and reading fluency. The standardised reading comprehension test (PAT) was used for both pre- and post-tests. Pre- and post-tests of reading fluency were also administered. The experimental group scored higher (73%) compared to the control group (64%).

Coe (2011) is a cluster-randomised experimental study of the 6+1 Trait Writing intervention involving grade 5 pupils in 74 schools in Oregon (US), of which 39 schools were in the treatment arm. Teachers in treatment schools were trained to use the 6+1 Trait Writing model. Teachers in the control schools used the regular instruction (‘business as usual’). Participants included 102 teachers and 2,230 pupils in the intervention group, and 94 teachers and 1,931 pupils in the control condition. Pupils in both groups wrote essays at the beginning of the school year, and their scores were used as baseline measures. At the end of the school year, pupils wrote essays again, and these scores were used as outcome measures. Essays were rated on each of the 6 core characteristics of writing quality included in the 6+1 Writing model. The intervention increased pupil writing scores in the year that it was introduced, but only by a small overall effect size.

Approaches with inconclusive evidence 

Financial incentives

In terms of improving literacy for low achieving and disadvantaged pupils, the evidence here is that paying pupils, families, schools or teachers for results does not work. Perhaps this is because, like using technology or timing interventions over the summer vacation, it is not the incentive itself that would be the active ingredient. For example, in the study reported by Bettinger (2010), pupils in US grades 3 to 6 were paid in gift certificates for every good test result in five core subjects. Each school had two grades from years 3 to 6 randomised to take part, with other grades in the same schools and the same grades in different schools acting as comparators (not an ideal design). There were 24 grade/year groups in each arm of the trial. The trial showed no difference in post-test reading scores. Financial incentives did not work for literacy, perhaps because extrinsic motivation is more effective for less conceptual tasks. Pupils can memorise a series of facts or formulae to prepare them for the tests, but it is more difficult for pupils to prepare for reading a specific text or writing on a particular subject. This may explain why there was a reported difference in maths scores. Fryer (2010) looked at the use of financial incentives for pupil achievement and a range of outcomes, using around 38,000 pupils from about 260 public schools in US cities. They ranged from 1st to 9th grade, but none were 5th grade, and only in Columbia were any pupils in 6th grade. In each city, a treatment group was given monetary payments for verified performance in school. When payments were made for attainment results, there were no significant gains for standardised maths and reading outcomes. The incentives did not work, and the process evaluation suggested this was because the pupils did not know how to improve. Motivation without competence does not work in the short term, perhaps. 
When payments were made for inputs to the education system, however, the outcomes were different. This led to a marked improvement in inputs such as attendance, behaviour, homework, and wearing correct uniforms. These are all areas where the pupils knew what was expected and were capable of improvement. The money merely provided the incentive. Therefore, paying pupils to read books could yield an increase in reading comprehension. Again, the incentive has to be for inputs not outcomes. Lauen (2011) examined the incentive effects of North Carolina’s practice of awarding teacher performance bonuses for pupil test score achievement on the state tests. Bonuses are at the school level and dependent on whether a school exceeds a certain performance threshold score. Based on a sharp regression discontinuity design, schools that had just missed out on the previous year’s threshold achieved higher test score gains between 4th and 5th grade in reading. However, the cohorts of pupils differed, there may be an element of regression to the mean, and anyway the author notes that the improvements occur only for the highest achieving pupils in the school.

READ 180

One intervention that has been extensively tested, but without a consistent picture emerging, is READ 180 (WWC 2009). This is a reading programme designed for pupils in both elementary and high school whose reading achievement is considered below the proficient level. The programme is a combination of computer program, literature and direct instruction in reading skills. The software is used to track and respond to individuals’ progress. There are workbooks designed to address reading comprehension skills, paperback books for independent reading and audiobooks to model reading. The READ 180 instructional model is a 90-minute session made up of three components:

  • 20 minutes of whole group direct instruction where the teacher provides direct instruction on reading, writing and vocabulary;
  • 20-minute rotations of smaller groups of pupils through 3 activities. These include small group direct instruction in which the teacher uses resource books and works closely with individual pupils, pupils’ independent use of the READ 180 computer program to practise reading skills, and modelled and independent reading in which pupils use READ 180 paperbacks or audiobooks;
  • 10 minutes of wrap-up discussion with whole class.

A very large number of studies have investigated the impact of this intervention. WWC (2009) lists 111, but most of these did not meet basic WWC standards for evidence and design, and are not considered here. Of the remainder, all are based in the US and used an experimental or quasi-experimental design. Several found no clear impact from READ180 on comprehension for pupils in grades 5 or 6. Interactive Inc. (2002) reported mixed results, but a re-analysis by WWC found no significant differences. Assignment to groups was violated as a number of schools included pupils in the treatment whom they thought would most benefit from the intervention, and parents/caregivers and pupils were anyway allowed to request inclusion or exclusion, while pupils with a reading score lower than grade 1.5 were not allowed to take part at all. White et al. (2005) found no effect of READ 180 on the standardised New York State English Language Arts Test for grades 4 or 8, or on the CTB/McGraw-Hill Reading Test for grade 6. The average effect of READ180 across the three grades as calculated by WWC also suggests that the effect was not large enough to be considered important. Woods (2007) did not find any significant effects of READ 180 on the criterion-referenced Degrees of Reading Power comprehension test for pupils in grades 6 to 8. Two studies did find apparently significant gains in comprehension, but these only involved older pupils in 9th grade (Lang et al. 2008, White et al. 2006). And even here Lang et al. (2008) found no gain for pupils at ‘high risk’ from poor literacy.

Kim et al. (2010) conducted a randomised experiment to examine the causal effects of READ180 on word reading efficiency, reading comprehension and vocabulary and oral reading fluency for struggling readers in grades 4 to 6. The sample included 294 children from three elementary schools with a large proportion of struggling readers, defined as those who scored below proficiency on the Massachusetts Comprehensive Assessment System (MCAS), a standardised state test for English language arts. They were assigned either to READ180 or the district after-school programme which was not specifically focused on literacy, and included art projects, games and commercially produced materials like InstaCamp theme kits for astronomy, history and space exploration. The format and timing of READ180 had to be adapted slightly to conform to timings – with 60 minutes rather than 90 per session. And the other after-school programme may have had some impact on literacy. Both groups were equivalent on baseline measures, and several literacy outcomes were compared after six months of programme implementation. The final sample was 264, as 30 children dropped out between pre- and post-test, but there was no link between drop out and treatment. There was no significant difference between the groups on norm-referenced measures of word reading efficiency, reading comprehension and vocabulary. There was no difference in oral reading fluency for pupils in grades 5 and 6, and no difference in scores on the MCAS English language arts test. Both groups still scored below the minimum proficiency score of 240 at the end of the period of the intervention.

On the other hand, Caggiano (2007) used a non-equivalent control group design to estimate the impact of READ180 on reading outcomes for struggling readers in grades 6, 7 and 8. Participants were 120 pupils from one middle school in America (60 in each arm, with 20 per grade). Instruction was carried out every other day (instead of the daily sessions recommended by Scholastic Inc) for 90 minutes each session. There were no significant differences between groups in any grade on the Virginia Standards of Learning Assessments in reading. There were no significant differences in grades 7 and 8 on any assessment. However, there were significant differences between the experimental and control groups on the Scholastic Reading Inventory comprehension assessment for grade 6 only. So, only a test designed by the developer of the program appears to show a positive effect, and then only for grade 6 pupils’ reading comprehension.

Similarly, Scholastic Research (2008), the same organization that created and markets the intervention, reported significantly different results in general literacy for 285 pupils in grades 6, 7, and 9 after one year of READ180, compared to 285 matched pupils. All were considered to be struggling as readers, and a majority had English as a second language. The outcome measure was the gain score in the English Language Arts subtest of the California Standards Test.

White et al. (2005) evaluated the READ180 program in 16 public schools in the US, with 617 pupils identified as below their grade level in grades 4 to 8. Outcome measures were compared with a control group of 4,619 pupils from the same schools who did not receive the intervention. It is not clear whether these were matched or, more likely, just the remaining pupils. The treatment group showed higher gains in reading scores than the control, but this could be regression to the mean if the other group were not also below grade level to start with. The study made a very common mistake in this literature, in using significance testing with a sample that was neither randomly drawn nor randomly allocated to groups.
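The regression-to-the-mean concern can be illustrated with a minimal simulation. The sketch below is not based on the White et al. data: the noise levels, cut-off and sample size are assumptions chosen purely for illustration. Every simulated pupil has a fixed underlying ability and receives no intervention at all, yet the group selected for low pre-test scores still shows a positive average 'gain', while the remaining pupils show a negative one.

```python
import random
import statistics


def simulate_gains(n_pupils=5000, cutoff=-0.5, seed=0):
    """Simulate pre/post scores with NO treatment effect (illustrative only).

    Each pupil has a stable ability; each test adds independent noise.
    Pupils whose pre-test falls below `cutoff` form the 'selected' group.
    Returns (mean gain of selected group, mean gain of everyone else).
    """
    rng = random.Random(seed)
    selected_gains, other_gains = [], []
    for _ in range(n_pupils):
        ability = rng.gauss(0.0, 1.0)         # stable underlying skill
        pre = ability + rng.gauss(0.0, 1.0)   # noisy pre-test
        post = ability + rng.gauss(0.0, 1.0)  # noisy post-test, no intervention
        if pre < cutoff:
            selected_gains.append(post - pre)
        else:
            other_gains.append(post - pre)
    return statistics.mean(selected_gains), statistics.mean(other_gains)


selected, others = simulate_gains()
print(f"Mean gain, low pre-test group: {selected:+.2f}")
print(f"Mean gain, remaining pupils:   {others:+.2f}")
```

A comparison group that was not also selected for low initial scores would therefore make an ineffective programme look successful.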

Woods (2007) examined the impact of READ180 on levels of reading achievement involving 384 pupils in grades 6 to 8, assessed by teachers as below grade-level readers, in a middle school in Virginia, over a three-year period. There were reported implementation problems in the first year, and no difference was found in progress between those assigned to the intervention or control after one year. However, the gains for the treatment group were significantly higher in the second and third years following treatment. The report does not describe how the cases were assigned, and therefore whether “significance” is relevant.

Sprague et al. (2010) present what were in effect separate evaluations of READ180 in five sites across the US. One site in Ohio involved only youth detention centres, and is ignored here. Otherwise, pupils ranged from grade 6 to grade 10, mostly in Title 1 funded schools, and defined as struggling readers below their grade level. In the first year 5,551 pupils were randomly assigned to treatment or control groups. However, only 4,443 pupils were included in the study. There was a high attrition rate in one site (Portland) where only 45% of initial pupils were included in the final analysis. Some sites used the State Assessment tests, and some used the Scholastic Reading Inventory as pre-tests. Eligible pupils were randomly assigned to one of two supplemental programs – READ180 or Xtreme Reading – or to ‘business-as-usual’ where they received regular instruction. All sites used standardised and state level assessments as outcome measures (e.g. Stanford Diagnostic Reading Test, Stanford Achievement Test, New Jersey State Language Arts assessment, Iowa Test of Basic Skills, Tennessee Comprehensive Assessment Program, California Assessment Test, Scholastic Reading Inventory). Only one site showed significant intervention effects, and then only for middle school (not high school). But this site was Portland with 55% dropout from the study, which the multi-level modelling and other complex analyses presented by the authors simply do not address.
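The attrition figures quoted above can be checked with simple arithmetic. The sketch below uses only the counts reported in the text (5,551 assigned, 4,443 analysed, and 45% retained in Portland); the function name is an illustrative assumption.

```python
def attrition_rate(assigned, analysed):
    """Return the proportion of assigned pupils missing from the analysis."""
    if assigned <= 0 or not 0 <= analysed <= assigned:
        raise ValueError("need assigned > 0 and 0 <= analysed <= assigned")
    return (assigned - analysed) / assigned


# Overall figures reported for the first year of the study.
overall = attrition_rate(assigned=5551, analysed=4443)
print(f"Overall attrition: {overall:.1%}")

# Portland: 45% of initial pupils were retained, so the rest were lost.
portland = 1 - 0.45
print(f"Portland attrition: {portland:.0%}")
```

Overall attrition is therefore around one fifth, but it is far from evenly spread: the only site reporting a significant effect lost more than half of its randomised pupils.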

Taken overall, the evidence presented here comes close to showing that READ180 does not work. It did not work when measured with standardised assessments rather than those created by the programme producers. It did not work in the few randomised controlled trials presented. It did not work for those most at risk, or those in grades 5 and 6. Reading intervention has been found to be less effective for older children. Yet there is some more positive evidence as well, presented by the programme producers, or in studies without randomization or where there is high dropout. In many of the studies, the intervention was for a one-year duration and perhaps it is not realistic to expect one intervention, albeit a comprehensive one like READ 180, to compensate for children’s many years of reading failure and lack of exposure to a supportive learning environment. The fact that results were mixed across sites and across year groups and for different components of literacy suggests that there may be an issue with the consistency with which the program has been implemented. Also, the duration of the program and the amount of training teachers received may differ from school to school.

Project CRISS

Project CRISS – Creating Independence through Pupil-owned Strategies – is a programme aimed at improving reading, writing and learning for 3rd to 12th grade pupils. It involves teachers modelling strategies for pupils and providing time for practice to help them to understand the learning process and to use the strategies learnt to develop independent learning. It requires teachers to adopt purportedly new teaching styles that include monitoring learning, integrating new information with prior knowledge, and which encourage active learning through discussion, writing, organising information and analysing structure of text to improve comprehension. Teachers are given training in the form of workshops, to become CRISS-certified trainers. Like READ180, this programme has been the basis of extensive research (WWC 2010). However, only two studies out of 31 met WWC minimal evidence standards.

Horsfall and Santa (1994) conducted a randomised controlled trial of Project CRISS in sixteen 4th, 6th, 8th and 11th grade classrooms in three schools. Teachers were randomly assigned either to Project CRISS and received training, or to regular instruction. In total the sample included 120 pupils in six intervention classrooms and 111 pupils in six control classrooms in grades 4 and 6 (the only years considered for this review). Around four or five pupils dropped out from each class, in a programme lasting 18 weeks. Project CRISS was judged to have a positive effect on comprehension for both grades, after nine months, measured using teacher-developed ‘free recall’ comprehension tests.

James-Burdumy et al. (2009) report on a randomised controlled trial that examined the impact of Project CRISS and three other reading comprehension curricula (Read About, Read for Real and Reading for Knowledge). The study involved 6,350 grade 5 pupils from 89 schools in districts with at least 12 Title 1 schools and not already implementing any of the four curricula. Schools were randomly assigned to one of the four interventions or to control groups. This review only considers the effect of Project CRISS and those pupils and schools that were evaluated by WWC. So, only 1,155 pupils attending 17 Project CRISS schools, and 1,183 pupils in control schools were included in the analysis reported here. Pre- and post-test comprehension ability was assessed using standardised norm-referenced diagnostic tests, the Group Reading Assessment and Diagnostic Evaluation (GRADE). For the pre-test pupils took the passage comprehension subtest of GRADE and the Test of Silent Contextual Reading Fluency (TOSCRF). For the post-test, pupils only took the passage comprehension subtest. In addition, pupils were randomly selected to take one of the two comprehension assessments developed for the study (Educational Testing Service Science reading comprehension or Educational Testing Service Social Studies reading comprehension assessments). Unlike the earlier study by Horsfall & Santa (1994), the programme in this study by James-Burdumy et al. lasted nine months. The study found no statistically significant effects of Project CRISS on the passage comprehension subtest of GRADE, nor the science or social studies reading comprehension assessments. In fact, the fourth curriculum, Reading for Knowledge, had a statistically-significant negative impact on fifth-grade reading comprehension. When all four intervention groups were combined, intervention group pupils scored lower than control group pupils on the GRADE and the composite test score.

As with READ180, this programme might just as well have been placed in the section on unpromising interventions. The longest, largest and most recent study based on standardised norm-referenced assessments found no effect. An older study, conducted by the programme producers themselves, found an effect only with a specially created test of free recall. Another possibility is that the ideas of the programme have become more general in teacher development anyway since 1994.

Peer Assisted Learning

Peer-Assisted Learning Strategies (PALS) is a peer-tutoring instructional programme that supplements the primary reading curriculum from grade 1 onwards. Pupils work in pairs on reading activities aimed to improve reading accuracy, fluency and comprehension. Pupils, taking turns to be tutor and tutee, read aloud and listen to each other providing feedback during the various activities. For the purpose of this review, we will only look at studies that evaluate the impact of the PALS programme for pupils in grades 5 and 6, and which meet the WWC evidence standards. WWC (2012) reviewed 97 studies that examined the effects of PALS, and found only one study that met minimum standards for evidence and design.

Fuchs et al. (1997) report a randomised controlled trial involving 12 elementary and middle schools with mixed reading scores and free or reduced-price lunch figures. In each arm, 20 teachers, teaching grades 2 to 6 and dealing with at least one learning disabled pupil, were assigned. Each teacher then identified three pupils to take part in the study – a low achiever with a learning disability, a low achiever without learning disability, and an average achiever. Altogether there were 60 pupils in the PALS group and 60 pupils in the comparison group. The program was implemented during scheduled reading lessons conducted three times a week for 35 minutes each. Performance was measured after 15 weeks of programme implementation, using the Comprehensive Reading Assessment Battery (CRAB) for both pre- and post-test. The study reported significant gains for the PALS group on reading comprehension. However, WWC (2012) reanalyzed the data taking account of clustering and found no significant difference, despite a reasonable effect size. Anyway, the baseline scores were not equivalent, the authors of the paper were also the developers of PALS, and there was a high dropout rate of 45% that was not evenly divided between the groups.
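Why taking account of clustering can remove an apparently significant result may be sketched with the standard design-effect approximation, 1 + (m - 1) × ICC, where m is the number of pupils per cluster. The cluster size of three pupils per teacher and the total of 120 pupils come from the study description above, ignoring dropout; the intra-class correlation of 0.2 is an assumed value for illustration only, since none is reported here.

```python
def design_effect(cluster_size, icc):
    """Approximate variance inflation from sampling pupils in equal clusters."""
    return 1 + (cluster_size - 1) * icc


def effective_sample_size(n_total, cluster_size, icc):
    """Number of independent pupils the clustered sample is 'worth'."""
    return n_total / design_effect(cluster_size, icc)


# 40 teachers x 3 pupils = 120 pupils; ICC = 0.2 is a hypothetical value.
deff = design_effect(cluster_size=3, icc=0.2)
n_eff = effective_sample_size(n_total=120, cluster_size=3, icc=0.2)
print(f"Design effect: {deff:.2f}")
print(f"Effective sample size: {n_eff:.1f}")
```

Under these assumptions the 120 pupils carry roughly the information of 86 independent cases, so standard errors are wider than an analysis that ignores the teacher clusters would suggest.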

Van Keer (2004) reports on a study aimed at linking reading strategy instruction with different models of peer tutoring, including teacher-led whole class activities, same-age peer tutoring activities and cross-age peer tutoring activities. Pupils ranged from 9 to 12 years old, and 454 were involved in 19 schools with 22 grade 5 teachers in Flanders. The experimental interventions took place with a matched control group across a school year. Pre- and post-test scores were measured, using the Dutch standardised and IRT-modelled test battery ‘Toetsen Begrijpend Lezen’ (Reading Comprehension Tests) consisting of three modules. Six explicit reading instruction strategies were chosen on the basis of previous research: activating background knowledge, predicting reading, discerning main and side issues, recording and managing understandings of words and of comprehension of texts, distinguishing the genres and adapting reading accordingly. Both the whole-class and the cross-age peer tutoring groups made significantly larger pre-test to post-test gains than the control group. No significant differences were detected in the same-age peer tutoring group compared with the control group. Although the pupils were reportedly from 9 to 12 years old, the results were only reported for one age group without specifying which age group that was. Also with three age groups, three conditions and a control, it is not clear that 22 teacher clusters is sufficient. Therefore, while there are suggestions that peer mentoring could assist, the evidence is not strong, not based in the UK, and not specific to disadvantaged and struggling Years 6 and 7 pupils.

Response to Intervention

Response to Intervention (RTI) is a school-wide multitier programme that measures pupils’ response to research-based instruction. The model works in two ways. One is problem-solving by identifying the reasons for underachievement via a case-by-case analysis and tailoring instruction based on these reasons. Another way is the use of a standard treatment protocol (STP) which is administered to struggling pupils to prevent failure. RTI is delivered in three tiers of increasingly intense instruction. Graves et al. (2011) report on a quasi-experimental study that compares Tier 2 intensive reading instruction with a control group (‘business as usual’) for 6th graders with and without learning disabilities. These were pupils who performed ‘below’ or ‘far below’ basic level in literacy. The study was conducted in a large urban school where all the pupils were on free/reduced price lunch and where 90% of the children were not native English language speakers. The intervention was in the form of small-group instruction for 30 hours over 10 weeks. The study reported that the treatment was more efficacious for pupils with learning disabilities and for oral reading fluency, but less so for reading comprehension.

Faggella-Luby and Wardwell (2011) conducted a randomised controlled trial of a Tier 2 literacy intervention for 86 at-risk pupils in the 5th and 6th grades in an urban middle school. Struggling pupils were randomly assigned to one of three treatments – Story Structure (SS), Typical Practice (TP) and Sustained Silent Reading (SSR). Only post-test scores were available, based on Cloze (a standardised, curriculum-based cloze test), Strategy-Use test (used to assess whether experimental pupils have learnt SS strategies taught in the intervention) and Gates-MacGinitie Reading Comprehension. The Strategy-Use test assessed strategies that were taught only to pupils in the intervention group (SS), and is not a test of general comprehension or reading – the skills that the other conditions were supposed to develop. So using this test may not be valid for comparison. 6th grade pupils in the SS and TP groups scored higher than those in SSR on all three outcomes. This was less clear for 5th grade pupils.

Vaughn and Fletcher (2012) examined the efficacy of Response to Intervention (RTI), a reading remedial intervention for middle school pupils in grades 6 to 8 with reading difficulties at primary level (Tier 1), secondary level (Tier 2) and tertiary level (Tier 3). Pupils identified as being ‘at-risk’ (scoring below the expected level on the state-level reading test) were randomly assigned to treatment or control groups. In the 6th grade, treatment pupils were given a daily intervention by trained reading specialists. Intervention pupils in the 7th and 8th grades were given small-group instruction (about 5 pupils) or large-group instruction (about 10 pupils). Pupils who received both Tier 1 and Tier 2 interventions made small gains in decoding, fluency and reading comprehension (d = 0.16) compared to those who only received Tier 1 intervention. This paper is a summary of studies previously conducted by the researchers over the years. It is not clear how many pupils were involved in the study, for example. No standardised state level assessment is reported. The study is more about how RTI should be implemented than whether it works.
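For reference, an effect size such as d = 0.16 is a standardised mean difference: the gap between the group means divided by a pooled standard deviation. The sketch below shows one common way of computing it. The scores are invented for illustration and are not data from Vaughn and Fletcher (2012).

```python
import math
import statistics


def cohens_d(treatment, control):
    """Standardised mean difference using the pooled sample standard deviation."""
    n_t, n_c = len(treatment), len(control)
    var_t = statistics.variance(treatment)  # sample variance (n - 1 denominator)
    var_c = statistics.variance(control)
    pooled_sd = math.sqrt(
        ((n_t - 1) * var_t + (n_c - 1) * var_c) / (n_t + n_c - 2)
    )
    return (statistics.mean(treatment) - statistics.mean(control)) / pooled_sd


# Hypothetical post-test scores, not taken from any study in this review.
treatment_scores = [52, 55, 58]
control_scores = [50, 53, 56]
print(round(cohens_d(treatment_scores, control_scores), 2))
```

On this scale, a value of 0.16 means the treatment group mean sits about one sixth of a standard deviation above the comparison group mean.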

Leroux et al. (2011) is a randomised controlled study to examine the effects of an intensive, small group tutoring treatment on the reading outcomes of 30 pupils in grades 6 to 8 with severe reading difficulties, in three US middle schools. All had shown little response to two previous years of intensive intervention (Tier 2). The control group included those who were randomly assigned to ‘business as usual’ in Year 1 of the study (Tier 1). Both treatment pupils and control pupils were identified as low responders based on the same end-of-year non-response criteria, but the treatment group had two previous years of intensive intervention. Outcome measures were based on a number of literacy tests (e.g. Woodcock-Johnson III Word Attack and Letter Word Identification; Sight Word Efficiency and Phonemic Decoding Efficiency, Word Reading Efficiency, Test of Sentence Reading Efficiency, Gates MacGinitie Passage Comprehension). Results show that there were significant differences between treatment and comparison groups on the Gates MacGinitie assessment, and the Woodcock-Johnson Letter Word Identification subtest, but less so on the TOWRE Phonemic Decoding subtest. However, the difference was largely due to the continuing decline of the Tier 1 pupils in the comparison group. The sample of only 30 pupils is small, and given that Tier 1 pupils (control group) were randomised two years prior to the study, there may be questions concerning internal validity.

Again, there is mixed and incomplete evidence here, from a very different context, with small samples and sometimes involving only those with learning disabilities (the term used in the studies themselves). It is not certain that RTI does not work, but neither is it clear that this is the way to go in dealing with 10 and 11 year-olds struggling with literacy in the UK. Further work in this area could be useful.

Unpromising approaches

Sundry individual studies with little or no promise

Some specific interventions have apparently only been evaluated once. They are summarised briefly here, if they show no promise at this stage.

Torgesen et al. (2007) is the final report of a randomised trial of four reading interventions for ‘striving’ readers. The results were not promising.

Lingard (2005) pursued Literacy Acceleration, an intervention developed by the author for pupils identified as having literacy difficulties in the first two years of secondary education (Years 7 and 8). The intervention involves pupils being given 50 minutes of help every day in small literacy groups of 10-16 pupils. Every day, pupils are engaged in individual reading with individual help with phonics. They follow a structured spelling programme, and receive help with writing based on a model. The study took place in one large comprehensive secondary school in the UK, using pupils who were mostly below Level 3 at Key Stage 2. The author claims that the 15 pupils in the first cohort showed significant improvement in reading scores, as did the 23 pupils in a second cohort. Less progress was made in spelling. However, the study must be largely disregarded because the comparison group consisted of those pupils who had achieved Level 3 at KS2. This means first that significance is irrelevant as the sampling is not random, and second that the comparison is unfair. The study has other obvious problems including the small number of cases.

Puma et al. (2007) evaluated a structured writing programme, Writing Wings, for disadvantaged pupils in 3rd, 4th and 5th grades. Two cohorts of high-poverty elementary schools (17 schools in the first year, 22 in the second) randomly assigned two 3rd grade and two 4th grade classrooms to either an intervention, or a control group where teaching went on as usual. In the second year, a 5th grade class was included in each school. The combined sample was about 3,000 pupils, with 152 classroom teachers in 39 schools across 29 US states. This was reduced to 2,405 pupils in the final impact analysis (80% overall). There were no statistically significant impacts on pupils’ writing ability or on teacher ratings of that ability. The authors claim that the impact may have been small because teachers were already teaching literacy frequently and at a ‘fairly high level’ anyway. Thus scope for improvement may have been limited. Training of teachers in the method was delayed in 13 of 17 schools until mid-October, so in cohort 1 the full programme was not implemented. Nevertheless, the evidence is that the programme is not effective in improving writing ability.

Rider (2010) involved 52 8th-grade pupils in a district-adopted, developmental reading course to improve the reading achievement of struggling middle school readers. The intervention emphasised direct strategy instruction, effective instructional principles embedded in content, diverse texts and intensive writing. Reading achievement was measured by the Gates MacGinitie Reading Test (GATES). Pupils completed a pre-GATES test (Form T) in September 2007 and the post-GATES test (Form S) in May 2008. The treatment group improved their reading comprehension. However, no comparison group is reported, and it is not clear how the participants were selected and assigned.

Guthrie et al. (2009) used a pre- and post-test quasi-experimental design to examine the effects of Concept-Oriented Reading Instruction (CORI) compared with traditional instruction (TI) on the reading performance of low-achieving readers in grade 5. The intervention lasted 12 weeks. Outcome measures were compared using the Gates-MacGinitie Reading Test. Results showed that intervention pupils scored higher on word recognition speed and reading comprehension than TI pupils. Unfortunately, CORI pupils (two schools) and TI pupils (one school) were drawn from different schools and CORI pupils had higher grades at the outset. Also, among the low achievers only 7% in the CORI group were identified as requiring special education, whereas in the TI group 22% were so designated.

De Corte et al. (2001) used an experimental design to evaluate a research-based and practically applicable learning environment for enhancing text comprehension strategies for pupils in upper primary schools in Leuven, Belgium. Participants were 5th grade pupils in four experimental classes (79) and eight control classes (149). Schools were ‘contacted’ at random (and it is not clear what this means). A pre-test was administered using the Standardised Reading Comprehension Test (RCT) and a Reading Strategy Test (RST). It was not clear how long the intervention was for. The intervention included instruction on learning environment with video-taped lessons. Results of the Reading Strategy Test, the Transfer Test and strategy use interviews showed that the experimental group outperformed the control group in the use of strategy and application during text reading (all process measures). The experimental group also scored slightly higher than the control group in the Reading Comprehension test, but the difference was not statistically significant. The suggestion is that a powerful learning environment may encourage pupils to use and transfer reading comprehension strategies, but this does not necessarily result in improvement in reading comprehension performance.

Wheldall (2000) reports on an Australian study to determine the effect of using the Rainbow Repeated Reading (RRP) programme to complement an already successful MULTILIT (Making Up for Lost Time in Literacy) curriculum for low-progress readers. The research focus was to decide whether there was added value to the existing programme. The RRP consists of a set of reading materials, graded for level, and associated teachers’ materials, designed to utilise repeated reading of short sections of text to increase accuracy and speed (fluency) in reading. The programme was added for a randomly allocated sample of half the pupils enrolling onto the original MULTILIT programme in two sites. The other half of pupils remained on the original MULTILIT programme, without additional enhancement. The sample consisted of 40 low-progress readers from Years 2 to 7, all of whom were assessed as being at least two years behind their peers in reading skills. All pupils were given daily instruction over a period of nine weeks. Pre-tests were conducted using the Neale Analysis of Reading (Revised), the Burt Word Reading Test, and the Wheldall Assessment of Reading Passages (WARP), by trained research assistants. Post-tests were conducted using the Burt and WARP tests in week 10. For Burt scores neither factor was found to produce a significant effect. For WARP scores, only site was found to have any significance, possibly as a result of inconsistency in administration by staff. Therefore, the RRP appeared to add nothing to the already intensive MULTILIT programme.

Conclusions and recommendations

Commentary on the research uncovered

It is notable again how poorly written much research is, and how much of the rest either does not use an appropriate design for its research questions or is otherwise fatally flawed from the outset. The poorest parts of reports tend to be the abstracts, the lack of methods detail, and the logic of the conclusions drawn. Where robust evaluation is more commonly attempted, as in the US and by doctoral researchers, there is often an apparent conflict of interest that goes unaddressed. Even where the originator of an intervention does not stand to gain financially, they can have prestige and pride invested in a successful evaluation. It would be better for evaluators to be genuinely curious about the intervention but not to care whether it works or not. Robust evaluations of the kind necessary to drive ethical policy and practice decisions are uncommon in the UK, which we see as an indictment of the funding regimes involved. In addition, new studies should be larger, as large as possible, with randomisation, control of diffusion, process evaluation as standard, and outcome measures (for pre- and post-tests) should be assessed using the same instruments, preferably one standardised test to avoid the temptation of ‘fishing’.

Substantive summary of findings

It is difficult to compare interventions where they used different instruments, tested different components, targeted different age groups, and varied in duration and implementation. Some were carried out by the researcher, some by trained teachers, some by partially trained teachers and some by trained specialists. There was also diffusion where some schools offered their own interventions on top of the ones on trial.

There is, apparently, little work of the kind sought that addresses the issue of ‘catch-up’ for those in mainstream settings struggling with literacy in the transition to secondary schools. Nothing was found that could be said to have solved the problem. This review looked also at wider interventions to improve literacy, especially for potentially disadvantaged pupils either in Year 6 or 7. In general, reading interventions have been found to be less effective with these older children. Studies have found that early deficits in reading practices and skills accumulate over time. It is difficult for poor readers to close the gap between themselves and more proficient readers from primary through to secondary (Kim et al., 2010). For example, an RCT by Torgesen et al. (2006, 2007) evaluated four reading interventions and found inconclusive results. The interventions appear to benefit younger pupils in grade 3 but not those in grade 5. Perhaps, if literacy for older children is to be improved, the intervention has to start early in the primary school years.

The review of evidence has uncovered no ‘magic bullet’. Some interventions have only one small or partial evaluation. This is not enough to take forward, even if the evaluation was positive. Some interventions have been evaluated several times but with genuinely conflicting results. And some have been shown not to work. We found no intervention that had been evaluated more than once successfully without also finding an equivalent or greater body of evidence that it does not work. The interventions that have been tried repeatedly either clearly do not work, or the picture is unclear. There are some promising individual evaluations of specific bespoke programmes that are worth pursuing.

It is not the timing or medium of instruction that matters, it seems. The use of technology or summer schools sui generis are of no help. Presumably their success, if it is possible, would depend on the content and delivery more than the timing and the technology itself. There are several attempts at specific classroom programmes and interventions. In general, they find it hard to shift patterns of low literacy by age 10 or 11. There is no evidence that interventions are more or less effective with pupils in year 6 or year 7. Similarly, several promising interventions took place during standard curriculum time. There does not have to be additional time involved.

It is not motivation alone that matters. Motivation without competence does not make sense. Therefore providing incentives for inputs is more effective than for outcomes.

There were some complex packages that have been quite heavily evaluated, usually with mixed results. Some of these could repay an investigation for the preferred age group in a UK context. Perhaps the most promising is the programme entitled ‘Response to Intervention’. There may also be some merit in cross-age peer-assisted learning, and somewhat less in the programmes ‘READ180’ and ‘Project CRISS’.

The common elements of promising interventions

The pattern is far from clear. However, there are some elements that appear in more than one promising intervention. They may be neither necessary to nor sufficient for success but they would repay consideration. In this way they are unlike issues of specific timing, additional teaching, medium of delivery like ICT or otherwise, or overt motivational programmes, none of which are common to successful interventions.

Generally the individual interventions that reported success tended to be single-issue, clearer and simpler in approach, and further removed from normal practice than the less successful ones (even where this novel practice took place during routine lesson times). Examples of themes included emphasis on grammar, on written and oral comprehension, and on how to ask and respond to appropriate questions. There appears to be some advantage in specifically targeting those pupils clearly below the expected level of competence for their age, rather than making the intervention more widely applicable. Possibly related to this is the suggestion that individual case analysis of the problems and achievements of each pupil should be undertaken at the outset and then used as a genuine basis for tailoring the intervention. This would make the intervention a template rather than a fixed product, making it a challenge to evaluate. Ongoing support could be an advantage rather than a simple ‘treatment’, and this might include specially prepared learning materials. A few interventions emphasise the role of teacher development in their implementation. Clearly, this is particularly important where the teachers are responsible for day-to-day delivery of the intervention.


It is also worth noting that there must be a wide range of as yet untested or as yet undeveloped interventions that could be effective. It is important that new work is not constrained overmuch by ideas that have been tested before. However, any such new ideas must start evaluation at an earlier phase in the design cycle than those that have already shown promise.

