“And we would go on as though nothing was wrong” Joy Division, Transmission
I recently got into a discussion on Twitter about the differences between tracking and assessment, and whether there is any such thing as an assessment system. I’d suggested that many systems auto-assess (I referred to them as Assessbots); that the teacher ticks a few boxes and waits to see what comes out the other end. That was wide of the mark and, with hindsight, actually rather patronising, so I apologise for my glib remark. Obviously systems don’t teach, question, mark work, feed back and decide whether pupils have achieved or not (well, not yet anyway). But do systems have tools that guide assessment? Yes. And can these tools actually influence assessment in some way?
Probably.
That was the point I was trying to make in my rather clumsy way and it’s certainly something worth exploring further.
If we take a look at most of the popular systems in use in schools, they will have some sort of assessment tool built in. This will involve an APP-style series of statements or objectives against which a teacher will tick an appropriate box or enter a code to signify the pupil’s level of understanding. Obviously, here the teacher is carrying out the assessment, but perhaps there are some grey areas regarding who’s the master and who’s the servant:
1) Some systems are inflexible, offering a set list of objectives – the provider’s interpretation of what is and is not important – and so map out what is to be taught, perhaps in a particular order. Moreover, some systems may have just a few key objectives (e.g. NAHT KPIs) to guide assessment whilst others have many more. The former relies more on a teacher’s judgement within a broad framework whereas the latter is more prescriptive and definitive. Clearly some schools are happy with just a handful of key indicators to guide assessment whilst others seem to find security in more exhaustive lists of objectives. Whilst there is no right approach, perhaps the latter risks sidelining the teacher’s professional judgement, reducing assessment to a ‘painting by numbers’ exercise. It’s possible that systems are providing a crutch that users become dependent on but which does little to develop a teacher’s skills in assessment.
2) The ticks, codes or scores entered against each statement are then converted into numerical values, weighted, aggregated and compared against a series of thresholds to arrive at an overall judgement (e.g. an ARE band) used for the purposes of tracking pupil progress. The teacher has assessed the pupil to be developing in some areas and secure in others, but it is the system that decides whether the pupil is Year 4 Developing or Year 4 Secure overall. Many schools may use these labels as a guide and manually adjust them as appropriate, but some clearly don’t, choosing instead to accept the category the system assigns pupils to. And even if you don’t agree with the system and change the outcome accordingly, any adjusted judgement is still constrained by the system’s pre-defined parameters. In other words, you may change outcomes to something more realistic and find that the tracking sheet turns bright red, causing you to question the changes you’ve just made. It might then be tempting to go back and tweak a few assessments at the objective level to alter the end result. We then succumb to Goldilocks Assessment Syndrome: testing the input until the output is just right. Believe me, it happens. I’ve had a few teachers admit that this is going on in their school because the system they use wasn’t producing the data they’d expect. So, just go back and tweak it until it does. Unfortunately, what you are left with is not formative assessment. It does not provide an accurate record of a pupil’s strengths and weaknesses. It does not help identify gaps in a pupil’s learning. Instead it is useless data, gerrymandered to achieve the desired outcome.
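To make that concrete, here is a minimal sketch in Python of the kind of conversion described above. Everything in it – the codes, the weights, the thresholds, the band labels – is invented for illustration and not drawn from any particular system.

```python
# A hypothetical sketch of how a tracking system might convert
# per-objective teacher judgements into one headline band.
# All codes, weights and thresholds are invented for illustration.

SCORES = {"not yet": 0, "emerging": 1, "developing": 2, "secure": 3}

# Invented weighting: pretend the system counts one 'key' objective double.
WEIGHTS = {"obj1": 2, "obj2": 1, "obj3": 1, "obj4": 1, "obj5": 1}

# Invented thresholds: the proportion of the maximum weighted score
# needed to reach each overall band.
THRESHOLDS = [(0.80, "Year 4 Secure"),
              (0.55, "Year 4 Developing"),
              (0.00, "Year 4 Emerging")]

def overall_band(judgements):
    """Aggregate per-objective codes into a single headline band."""
    total = sum(SCORES[code] * WEIGHTS[obj] for obj, code in judgements.items())
    maximum = max(SCORES.values()) * sum(WEIGHTS.values())
    proportion = total / maximum
    for cutoff, band in THRESHOLDS:
        if proportion >= cutoff:
            return band

pupil = {"obj1": "secure", "obj2": "developing", "obj3": "developing",
         "obj4": "emerging", "obj5": "developing"}
print(overall_band(pupil))  # Year 4 Developing (13/18 = 0.72)

# Goldilocks Assessment Syndrome: nudge one input and re-run until the
# headline judgement is "just right".
pupil["obj4"] = "secure"
print(overall_band(pupil))  # Year 4 Secure (15/18 = 0.83): one tweaked tick, new band
```

The detail that matters is that the headline band falls out of arbitrary cutoffs, so a single changed tick near a threshold can flip the overall judgement.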
All of the data produced by these systems are based on the observations made by teachers, but there is a concern that the important detail of assessment risks being lost or supplanted by clumsy, auto-generated judgements, which can influence next steps, reinforce misconceptions, have a bearing on future teaching and assessment, and even cause recent assessments to be re-evaluated in light of the outcome.
Perhaps the issue is best illustrated by the following statements. Ask yourself if you’ve ever heard anything like these being uttered in your school:
“They can’t be secure because they were only emerging last term”
“I ticked all these objectives and they’ve only made one step”
“They need to achieve at least 33% of the objectives to move up a band”
“They have to make at least 3 steps per year”
All of the above are warning signs that the system is exerting an influence over assessment in a school; that there may be a temptation to allow what we want out of the system to affect what we put in. A classic case of the tail wagging the dog.
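For illustration, a hard-coded rule like the last two statements might look something like this inside a system. The expected-steps figure and the RAG colouring are, again, invented rather than taken from any real product.

```python
# A hypothetical sketch of a fixed progress rule ("at least 3 steps per
# year") driving a tracking sheet's colour coding. Numbers are invented.

EXPECTED_STEPS_PER_YEAR = 3  # assumed: the system's built-in expectation

def rag_rating(steps_this_year):
    """Colour a pupil's progress against the system's fixed expectation."""
    if steps_this_year >= EXPECTED_STEPS_PER_YEAR:
        return "green"
    if steps_this_year == EXPECTED_STEPS_PER_YEAR - 1:
        return "amber"
    return "red"

# The colour reflects the rule, not the pupil: an honest set of
# assessments adding up to two steps still turns the sheet red, and that
# red cell is precisely the pressure to go back and tweak the inputs.
for steps in (1, 2, 3):
    print(steps, rag_rating(steps))
```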
System breakdown
A few months ago I helped a school extricate itself from its existing system and set up a new one. The previous headteacher had left and the new head was keen to implement something more user-friendly. The old system was not popular with staff – they found it clunky, overcomplicated and unintuitive – and everyone wanted to try something new.
So I began the process of extracting the data and getting it into a format that I could import into the new system. Whilst going through the spreadsheet I’d put together, I noticed more and more spurious data and weird anomalies: pupils that hadn’t changed level for over a year, some that had jumped a level in the space of a term, and others that had spiked and then gone dramatically backwards. After much head-scratching I turned to the new head:
Me: “I don’t think I can use this. It’s all over the place. It just looks all wrong”
HT: “well, what are we going to do then? We have to have something”
Me: “Yes, but not this. This is nuts. Do teachers have anything else?”
HT: “I doubt it – can’t see why they would – but I’ll ask”
Within an hour, teachers came to the office bearing gifts: Excel files on memory sticks, Word tables, hand-drawn grids on A3. A complete, alternative set of assessment data. The headteacher looked fairly stunned:
HT: “Where has this come from? Why have you got all this?”
Teacher: “We all made our own assessments. You don’t think we trust the crap that system churns out do you?”
It turns out that teachers had been expected to use the system’s APP tool despite their serious misgivings and lack of faith in the data it generated. It was felt that the system provided assurance and made teachers less accountable for outcomes; that it was in some way better to allow a system to categorise pupils, regardless of accuracy, than to let teachers use their professional judgement. This was evidently a big mistake.
I’ve encountered similar situations in other schools. Due to the massive pressures of accountability and the need for evidence there is, quite understandably, a growing desire for systems that don’t just lessen the administrative burden, but also reduce risk, deflect blame, and devolve responsibility. A system we can point at and say “it wasn’t me, it was him”. Such systems thrive in a culture of fear and high stakes.
So, no, systems don’t assess, teachers do; but systems can certainly influence the assessment process. Through its metrics, algorithms and parameters, a system converts assessment data into tracking data, and there is therefore a very real risk of allowing the desired outcome to dictate the input. A couple more ‘exceedings’ here, a few more ‘secures’ there and – voilà! – just right.
Hopefully none of this will ring true with you, in which case there’s nothing to see here, move along. However, if there’s a grain of truth in any of the above then maybe it’s time to take a good, hard look at your system and do some soul-searching about your rationale for assessment.
Again it comes back to this:
Your tracking system must be tailored to fit the way you assess your curriculum, not the other way round.
And perhaps it’s time we went cold turkey and weaned ourselves off our dependence on systems that dictate what we do.