Sunday, 18 October 2015

The Thing About Tests - Part 2

Some frank conversation about testing is desperately needed.  I'll start:  I loved test time in my "traditional teaching" life.  More than any other reason, a test period represented an extra prep.

There.  I admitted it.

I'm not happy about that.  I'm not proud to say it.  But if I'm being honest, I have to tell you:  I never had a problem handing out a test with 40 multiple-choice questions and 4 written-response questions.  That's going to take 60 to 90 minutes, and for that time period every kid is going to be quiet and focused.  Ahhhhhh.  Sweet relief.  Get a little marking done, maybe sort out the next day's lesson.  Oh! My friend just updated his status on Facebook?  Chimichangas are great.  Right on bud!  Maybe I'll just scroll down a little more... Haha!  Cats are so funny. <90 minutes later> "All right everyone, turn in your exams!"

You elementary teachers reading this are probably rolling your eyes right now.  Don't get that down in Grade 1 hey?  Sorry folks.  I still don't know how y'all do it.

What's my point?  The scenario above exemplifies the main issue that I presently have with the ways tests are traditionally used: too often we are conflating interests that have no business being put together when we test students, and in doing so we have the potential to do our kids a disservice.  The extra "motivation" I had to test kids likely played a role (even if subconsciously) in things like frequency of testing and test length.

Is a 90-minute test always the best approach for every set of learning objectives?  I'd be lying if I said that question factored into my decisions very much back when I created tests that, to this day, still sit in filing cabinets back at my school.  Instead, the structure of my school's bell times dictated how long my tests were:  enough questions that the kids who take the longest need most of the class, but not so many that a significant number of students are left unfinished at the end of the period.  The "sweet spot" would see the last kid hand their paper in right before the bell.  See what's wrong with that design?

It serves the adult in the room, and not the students.

And there's the criticism that the anti-testers have a right to hang their hats on:  Testing is all too often not about learning.  It's not that tests are invalid, or that they can't test for certain things, or that they put too much pressure on kids.  All those issues are hotly debatable.  The main issue with much of traditional testing is that too many aspects of the testing aren't designed for the students.  The testing is mostly for us, the adults.  It exists so we know if we're doing our jobs properly.  It exists so we can assign numbers to kids that are culturally adopted and easy to understand.  It helps us rank and sort them.  It helps us more than it helps them, at least insofar as learning is concerned.  None of those things are necessarily bad or wrong (as I pointed out in an earlier blog), but "crossing the streams" so to speak is exactly what's got testing in all the hot water it's in right now.

I recently read an article from 2014 by Linda Flanagan that described this phenomenon in the greater context of grades as a whole.  In the piece, she highlights our societal obsession with measurement.  I would like to distinguish measurement from assessment in much the same way Flanagan does in her article: To me, assessment is what I do in my class with individual students to get a barometer on their learning.  Measurement is what is done to large groups of students, whole schools, entire jurisdictions, etc.  Measurement is the stuff of standardized government exams, and it typically occurs in conjunction with assessment.  We find out what our kids know at the same time that we evaluate them and make decisions about their futures.  A lot of "traditional" testing in classrooms (including my classic 90-minute "40 & 4" tests) is actually just bad measurement disguised as assessment.  My suggestion here is that conflating the objectives of measurement and assessment is what causes a lot of the problems that surround the "to test or not to test" debate.

So do I think we should throw tests out the window altogether?  Absolutely not.  I don't even think it's necessary to throw the measurement brand of testing out the window.  Let's just start taking steps toward giving testing for learning greater emphasis than testing of learning.  These aren't new concepts by any stretch.  I remember a vigorous discussion with my fellow student teachers on the topic when I was still in teachers' college in the late 90s.  Unfortunately the conclusions reached in our discussions (mainly that testing for learning is better) got quickly set aside in the cultural milieu of real-world teaching.

And testing for learning is not the only aspect of the testing paradigm that needs an overhaul.  To illustrate, I'm going to attempt to describe one class in particular that I think much more closely approaches the way good testing should look:

Alicia is a high school science teacher with an 11th grade Physics class.  Her class has a strong emphasis on inquiry and project-based learning.  Currently, her students are attempting to navigate a design challenge where they must construct a machine that launches a spherical projectile.  Most students are choosing to construct either catapults or trebuchets.  One group works on a ballista, while another is trying to craft an unusual hybrid of a baseball pitching machine and a skeet launcher.  Students must use physics principles to accurately predict the landing sites of their projectiles, determine launch and land velocities, and describe energy changes.

The project challenges students to investigate and understand important concepts in Physics, such as Kinematics (the study of motion) in two dimensions, Dynamics (the study of forces), and Energy.  The students must work collaboratively and creatively to construct and refine their devices.  In order to fully complete the task, the students must learn which data is relevant, how to gather it, and how to calculate the expected trajectory and landing positions of their projectiles.
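
For readers who want to see what that calculation looks like, here is a minimal sketch of the kind of prediction the students are asked to make.  It is not Alicia's material; the function name, the choice of Python, and the assumption of no air resistance and a constant g of 9.81 m/s^2 are all mine, for illustration only.

```python
import math

G = 9.81  # m/s^2; assumed constant gravitational acceleration


def predict_landing(speed, angle_deg, launch_height=0.0, g=G):
    """Return (flight_time, horizontal_range, landing_speed).

    Idealized projectile: no air resistance, level ground at height 0,
    launched from launch_height metres above that ground.
    """
    theta = math.radians(angle_deg)
    vx = speed * math.cos(theta)  # horizontal velocity stays constant
    vy = speed * math.sin(theta)  # initial vertical velocity

    # Positive root of: launch_height + vy*t - 0.5*g*t^2 = 0
    flight_time = (vy + math.sqrt(vy ** 2 + 2 * g * launch_height)) / g

    horizontal_range = vx * flight_time
    vy_landing = vy - g * flight_time  # negative: moving downward
    landing_speed = math.hypot(vx, vy_landing)
    return flight_time, horizontal_range, landing_speed


if __name__ == "__main__":
    t, x, v = predict_landing(speed=12.0, angle_deg=40.0, launch_height=0.5)
    print(f"time of flight: {t:.2f} s, range: {x:.2f} m, landing speed: {v:.2f} m/s")
```

The landing speed also doubles as an energy check: with no air resistance, its square should equal the launch speed squared plus 2 × g × launch height, which ties the kinematics back to the energy changes the students have to describe.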



Alicia has scaffolded this learning into her project calendar.  She provides the students with resources to investigate these concepts independently, but also builds a lesson schedule into her project calendar.  Students have the option of attending her lesson on projectile motion calculations where she demonstrates and works through examples with students.  There is formative work for students to practice the concepts.  Students may choose formative work that meets basic expectations for competency in the course or more challenging practice material, depending on their aspirations and aptitude. To ensure that students are meeting the learning objectives, she has also created a summative waypoint exercise.  Students are not allowed to perform the first set of formal trials with their device until all members have demonstrated at least a minimal understanding of the concepts -- in other words: The Test.

The test consists of three questions, all written response.  The nature of these questions is well known by the students in advance of the test.  The questions are constructed to hit on the fundamental concepts in the topic, not necessarily every single potential learning objective.  Students write the test when they are ready, and may write it as many times as necessary to meet the standard.  While it sounds like it would be pretty easy for students to cheat, the kicker here is that no two tests are precisely the same.  Alicia has a skeleton form that acts as the backbone of the test, but handwrites subtle alterations into each exam so that no two tests will ever be identical.  They all test the same concepts, but with slightly different questions.
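
Alicia fills in her skeleton by hand, but the same idea can be sketched in a few lines of code for anyone who would rather generate variants.  Everything below is hypothetical: the question wording, the word list, the number ranges, and the function name are invented for illustration and are not taken from her actual test.

```python
import random

# Hypothetical skeleton: three written-response questions, each with two
# numeric blanks and one word blank.  Each entry is
# (template, range for n1, range for n2); all values are invented.
SKELETON = [
    ("A {word} is launched at {n1} m/s, {n2} degrees above the horizontal. "
     "Predict where it lands.", (5, 25), (20, 70)),
    ("A {word} leaves a platform {n1} m high with a horizontal speed of "
     "{n2} m/s. Determine its landing velocity.", (1, 10), (2, 15)),
    ("A {word} of mass {n1} g is released from rest at a height of {n2} m. "
     "Describe its energy changes.", (20, 200), (1, 10)),
]

WORDS = ["golf ball", "marble", "tennis ball", "steel bearing"]


def make_test(seed):
    """Build one test variant; the same seed always yields the same test."""
    rng = random.Random(seed)  # private generator, so variants are reproducible
    questions = []
    for template, n1_range, n2_range in SKELETON:
        questions.append(template.format(
            word=rng.choice(WORDS),
            n1=rng.randint(*n1_range),
            n2=rng.randint(*n2_range),
        ))
    return questions


if __name__ == "__main__":
    for number, question in enumerate(make_test(seed=7), start=1):
        print(f"{number}. {question}")
```

Seeding by student or by attempt number would let a teacher regenerate any given variant later, though random draws alone do not guarantee that two variants differ; a pen and a stack of photocopied skeletons does the same job.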

That may sound like a lot of work for Alicia, but each test only takes her a few seconds to prepare (she jots down 6 numbers and three words into blanks on a templated page).  Construction of her tests takes far less time than the full period quizzes and tests of her previously more traditional teaching practices.  Her grading here is also much different than the typical testing paradigm.  Using protocols for communication that she established early in the course, she can tell at a glance how proficient the student is in a variety of competencies.  She doesn't need to know what the correct answer is.  She only needs to see how the student is tackling the question to know if they are "getting it."

Did they apply the appropriate principles?  Are they setting up their formulas correctly?  Is their flow of ideas logical?  Are they communicating their learning clearly?  Are they using proper conventions and notation?  She can rate each student in their competencies with just a quick scan of the test.  Rather than marking it in detail for right/wrong, she is assessing the student on what they can do and what they can't yet (granted many kids want to know if they "got it right," but that doesn't take too long to figure out either).  

Here "failure" is simply a waypoint in the process.  Where there are problems, she can address each student at their level individually.  She can also leverage the collaborative needs of the groups, as there is a degree of motivation for students to have everyone on the team understanding the material (teammates can tutor one another in their respective difficulties).  Students are also asked to be metacognitive regarding their testing.  They rate their own difficulties in tackling the questions, their tendency to guess, and their confidence in their process.  All of this provides valuable information to the teacher with one-glance efficiency.  She can structure follow-up activities in response to individual and overall performances.

Creating and grading these tests are time savers for Alicia.  The tests take a fraction of the time to write that a lengthy full-period exam does.  She's far more likely to adjust and modify them from year to year rather than reuse them unchanged for decades.  She spends less time assessing them, and students take less time to actually write them.  The tests are sufficient to give her clear roadmaps on how to direct each student's learning, and the extra time she and the students have can be spent shoring up problems and mastering content, rather than simply moving on to the next topic with holes in some students' understanding.

Here is a laundry list of features in Alicia's class that I think are particularly critical with regard to testing:
  • Testing serves learning.  It doesn't mark the end of the learning cycle, but rather a midpoint.  It is assessment first and measurement (as I described the term earlier) either second or not at all.
  • The opportunity to correct learning deficiencies identified through testing is built into the process.
  • Testing can be personalized, tailored to the abilities and needs of individual students.  (And before you discount this point as being pedagogically unsound, we should have a discussion on the critical differences between "equal" and "fair.")
  • It is efficient and agile.  It serves its purpose without dominating the agenda.
  • Testing doesn't monopolize the raison d'être for learning.  Students learn for reasons other than "you'll need to know this for your test."
  • Testing is neither the backbone nor the default for assessment in this class.  It is one tool among many that is used only when it makes sense to use.  Tested objectives are purposefully selected for their alignment with the nature of testing and the benefits it confers to the learning.  A great many other learning objectives are assessed and evaluated using other methods of assessment.

It is worth noting that there are other components of Alicia's teaching that make her classes vitally effective beyond just how and why she tests.  Those components complement her design choices with regard to testing.  Her class is student-centered, emphasizing projects, problem solving, and critical thinking.  But it's worth noting that she still teaches -- Alicia is a skilled facilitator and students genuinely enjoy her lessons, whether it's 4 students partaking or her whole class of 34.

It's also worth noting that this course happens to be highly academic.  It is part of the springboard that will launch many students into upper academia where testing will be the norm.  Alicia also teaches a vocational science class to 10th graders, and tests are almost completely absent from that course.

What's absent in the scenario above is the role of the measurement brand of tests.  Where's the government exam in the equation above or the high-stakes final exam?  Well that, my friends, is its own can of worms, and (given the existing length of this post) demands I save it for next time.

TL;DR: Testing isn't education's Great White Whale.  If it's used primarily to support learning and is one amongst a number of useful assessment tools, it can have a lot of value in any classroom, even the predominantly student-centered ones.

Vive la révolution!

