What is sufficient progress?

In the crazy world of competitive shin kicking, combatants apparently shout “Sufficient!” when they’ve had enough. I know how they feel. This whole issue of defining ‘sufficient progress’ feels like being kicked in the shins repeatedly. So, I thought I’d try to explain what it is and why attempting to set end-of-key-stage targets right now is a fairly futile exercise.

The Key Stage 2 Assessment & Reporting Arrangements for 2016 were published last week and the key changes section contained this brief explanation of the new progress measures:

2.9 Progress Measures

Progress measures in 2016 will work in a similar way to current primary value-added measures or Progress 8 in secondary schools. A school’s score will be calculated by comparing their pupils’ KS2 results against those of all pupils nationally who had similar starting points.

Pupils will be assigned to prior attainment groups based on their KS1 results.

The department will confirm what score a school would need to get to have made ‘sufficient progress’ after the tests have been sat next summer.


More detailed guidance on how the new measures will be constructed is expected to be published early in 2016.

After years of ‘expected’ and ‘better than expected’ levels of progress measures, this seems new and daunting, but it is exactly the same method that the VA measures in RAISE and FFT reports have used for years. Essentially, it involves comparing a pupil’s KS2 attainment against the national average attainment of pupils in the same cohort with the same starting point (this average is known as an estimate or benchmark). So, for example, we compare the KS2 scaled score of a pupil who was 2c, L1, 2c at KS1 against the national average scaled score for pupils who were 2c, L1, 2c at KS1. I produced this hypothetical example to illustrate this:



In this example, a pupil has fallen short of the expected standard but has made ‘sufficient’ progress and achieves a positive VA score because their scaled score is higher than the national average result for pupils with the same prior attainment. Conversely, it is possible for a pupil to achieve the expected standard but not make ‘sufficient’ progress because, nationally, pupils with the same prior attainment achieved a higher score on average. The differences between pupils’ actual and estimated results are then averaged for the whole cohort to arrive at a school VA score. It is most likely that the ‘sufficient progress’ threshold will be negative, and perhaps based on percentile rank, so that we don’t end up with 50% of schools below floor.
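To make the mechanics concrete, here is a minimal sketch of the VA calculation described above. All the figures are invented for illustration: real estimates are only known retrospectively, once the national results are in.

```python
# Hypothetical national average KS2 scaled score for each KS1 prior
# attainment group (invented figures -- real estimates are calculated
# retrospectively from the whole cohort's results).
national_estimates = {
    ("2c", "L1", "2c"): 98.4,
    ("2b", "2b", "2b"): 101.7,
    ("2a", "2a", "3"): 106.2,
}

# Each pupil: (KS1 prior attainment group, actual KS2 scaled score).
pupils = [
    (("2c", "L1", "2c"), 99),   # below the expected standard (100) but above estimate
    (("2b", "2b", "2b"), 100),  # at the expected standard but below estimate
    (("2a", "2a", "3"), 108),
]

# Per-pupil VA: actual score minus the estimate for the same starting point.
va_scores = [score - national_estimates[group] for group, score in pupils]

# School VA score: the mean of the pupil-level differences.
school_va = sum(va_scores) / len(va_scores)
print(round(school_va, 2))  # → 0.23
```

Note how the first pupil misses the expected standard yet contributes positively to the school’s VA score, while the second meets it yet contributes negatively, exactly the situation described above.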

The key thing here, with regard to our attempts to second-guess what constitutes sufficient progress, is that pupils’ individual benchmarks are calculated retrospectively. In other words, because a pupil is compared against the average attainment of pupils nationally with the same starting point in the same year, we have to wait until all the 2016 results are in before we know the line they have to cross. This means we are stumbling around in the dark right now, and any attempt to set targets is like shooting into the night whilst wearing a blindfold. At best it’s distracting; at worst it’s a danger to pupils’ learning. Even FFT – and those guys know a thing or two about target setting – are being cautious this year by providing broad estimates in the form of ARE bands. But that’s not stopping some people from having a crack at it.

Things will improve in 2017 once we have some actual data in the bank, but considering that more and more pupils will reach the expected standard each year, and that average scores will rise, any estimate derived from the previous year’s data is likely to be too low. Right now, though, we have absolutely no idea what a likely outcome will be from any particular starting point because no one has sat the tests. We certainly shouldn’t be applying some spurious methodology such as adding two whole levels of progress to the KS1 result and then attempting to convert the outcome to a scaled score: that fails to understand the difference between VA and expected progress. Many pupils in primary schools made so-called expected progress yet fell short of their VA estimate (the opposite was common in secondary schools), and we could unwittingly repeat this through an ill-conceived approach. And whilst the DfE claim that the new expected standard is equivalent to a 4b, we know this is a very broad approximation, so performing a conversion to scaled scores on that basis is likely to be inaccurate and misleading.

The concerns are twofold: 1) that schools will attempt to teach to the test, and 2) that schools will be held to account for targets set on flawed methodology. My fear is that schools, having set scaled score targets for pupils based on a ‘sufficient progress’ model, will then test pupils to see how close to these targets the pupils are. The sample tests don’t have a raw score to scaled score conversion, so schools might attempt to do it themselves using the rough criteria for meeting the expected standard contained in the frameworks. Highly dubious.

Alternatively, they might use a commercial standardised test, which produces scores in a similar format. Again, this is very risky. Schools must understand that a score of 100 in these tests indicates the average level for pupils taking that particular test, and therefore cannot be linked to the key stage 2 expected standard. It might be that pupils find that particular test hard, so 100 will be linked to a low raw score. Or they might find it easier, so 100 will be linked to a higher raw score. No matter the difficulty of the test, around half of pupils will be below 100 and the other half above. The expected standard, on the other hand, will always be pitched at around the same level of difficulty, and the DfE want to move towards 85% of pupils achieving it (the floor standard being kept at 65% in 2016). This means that a scaled score of 100 is not the average and is therefore a different entity to a score of 100 in a commercial test.
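The point about 100 meaning “average for that test” can be shown with a quick sketch. Commercial standardised scores are typically scaled to a mean of 100 (often with a standard deviation of 15); the cohorts and raw scores below are invented to show the same raw score landing on opposite sides of 100 depending on how the cohort found the test.

```python
import statistics

def standardise(raw_scores, mean_target=100, sd_target=15):
    """Rescale raw scores to standardised scores (mean 100, SD 15),
    the convention commercial tests typically use. Illustrative only."""
    mu = statistics.mean(raw_scores)
    sigma = statistics.pstdev(raw_scores)
    return [mean_target + sd_target * (x - mu) / sigma for x in raw_scores]

raw = 30  # the same raw score, sat in two different (hypothetical) tests

hard_cohort = [18, 22, 25, 28, 32]  # pupils found this test hard
easy_cohort = [30, 33, 35, 38, 42]  # pupils found this test easier

hard_std = standardise(hard_cohort + [raw])
easy_std = standardise(easy_cohort + [raw])

print(hard_std[-1] > 100)  # True: raw 30 is above this cohort's average
print(easy_std[-1] < 100)  # True: raw 30 is below this cohort's average
```

Whatever the raw scores, the rescaling forces roughly half the cohort below 100 and half above, which is precisely why a standardised 100 cannot stand in for the fixed expected standard.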

So, take note when trying to set targets for 2016. It’s a pointless exercise. The data will have no validity – it’s a stab in the dark – and could even be dangerous. The best advice, for this year at least, is to block out the noise and concentrate on the teaching. Then the results will hopefully take care of themselves.



