Creating Assessments: Three Types of Standards

March 23, 2013

I’m proud to be on the ground floor of a possible ‘better assessments’ movement here on the math Blogotwittersphere. I’m excited to see people talking about their assessment process and admitting that our assessments could be better than they are – which is reassuring, because I’ve been thinking that for a long time. See this and this for some background of where my head is at. This post is a reflection on the things I choose to assess and how I choose to assess them.

One thing I’ve really begun to understand is how much assessment is guided by curriculum, and how choices about assessment can have amazing impacts on curriculum choices. I’m a believer that the things we assess and the way we assess them are how we send a message to our students: “HEY! This stuff is important! And you need to be able to do it if you want to be successful in this class!”

The organization of these individual skills and knowledge usually falls under the header of a ‘standard’. The way I assess, each page of my test is a separate standard that is graded independently from the other pages on the test. My underlying philosophy of these assessments is:

- It should be clear both to me and to my students the standard that I am assessing.
- It should be clear both to me and to my students what the expectations of ‘mastery’ are for that standard.
- Assessments should make it clear both to me and to my students where their gaps in knowledge are, as well as their strengths in understanding.
- Assessments should promote student-directed remediation.
- Assessments should provide accurate data for a teacher about the level of understanding of his or her students.

That’s a lot of pressure for an assessment.

This means it’s a big deal when we choose to assess something, and it’s a big deal when we choose not to assess something. I take this choice seriously, which means I need to examine the curriculum for each of my units and decide what it is that I want to assess and how I want to assess it. After doing this for a year, I’ve come to the following realization: Not all standards are created equal, which means not all standards should be assessed equally. When I look through my units and decide what I want to assess or how I want to assess it, I’ve started to group skills and concepts into three types of standards: Procedural Standards, Conceptual Standards, and Synthesis Standards.

Optional Reading: The terms Procedural and Conceptual come from the article Adding it Up: Helping Children Learn Mathematics. This was an article I read in college when becoming a teacher – ‘Procedural Fluency’ and ‘Conceptual Understanding’ are two of the 5 strands of mathematical proficiency. As I was searching for words to describe what I was noticing in my curriculum, I reread the two sections on procedural fluency and conceptual understanding and felt they matched up pretty well to what I was observing in my standards.

Procedural Skills

As I looked through my curriculum for individual standards to assess, I realized there are certain skills that are simply foundational for everything we will do for the rest of the unit (or, in some cases, for the rest of the year). I need to make sure my students understand these things at the level of ‘consistently computationally correct answers’. These are usually skills from previous courses or the foundation skill for a particular unit. My assessments for these skills are bare-bones: here are several problems you need to know, all at roughly the same level of difficulty, and you need to be able to do them consistently. I grade these harshly because they’re the foundation – there’s no question on here higher than the level of ‘identify’ or ‘apply a procedure correctly’. These are the skills that usually appear embedded in my Conceptual and Synthesis skills later on. They’re the skills where, once a student understands them, their success in the other skills suddenly skyrockets.

Outside Influence: Kelly O’Shea’s post about A and B objectives in Standards-Based Grading. I’ve taken her ideas about A objectives (what I’m calling Procedural) and B objectives (what I’m calling Conceptual) and added a third one: Synthesis. This post really helped cement a lot of the ideas I presented above, so I’m really grateful that I found it. If you don’t read her post now, I highly recommend returning to it and reading her bullets at the bottom of the post describing the benefit of separating objectives.

Procedural Skills that come from Algebra: Solving Algebra Equations, Integer Arithmetic (Assessment Below), Graphing Lines

[Embedded Scribd document: Integer Arithmetic assessment]

Procedural Skills that come from Geometry: Applying the Pythagorean Theorem, Angle Identification (Assessment Below), Trigonometry Ratio Identification

[Embedded Scribd document: Angle Identification assessment]

Outside Influence: Sam Shah does something similar with his Algebra Boot Camp.

For these skills, I don’t trust one solitary question to let a student demonstrate understanding – I make sure I have several so students can demonstrate consistency. I also grade these pages very harshly. On the integer test – if a student misses any more than 2 problems, they’ve failed that page. Most of my students aren’t used to this – they’re used to ‘slipping by’ on tests from other classes because it’s several skills collected together, so their little mistakes get lost in the mess of the 100-pt test that they’re taking. This system doesn’t let them hide anymore – I’m making a statement with my assessment: consistency and computational correctness are important. The entire point of this page is to get these very specific problems correct. If you miss them on a different skill later in the year, I’ll cut you some slack – but for this procedural skill, I’ve purposefully created very little gray area: you either know it or you don’t.
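To make that cutoff concrete, here is a minimal sketch of the pass/fail rule for a procedural page. The function name and the `max_missed` parameter are hypothetical; the only thing taken from the integer test is the "no more than 2 missed" threshold, and other pages may well use a different number.

```python
def passes_procedural_page(num_missed: int, max_missed: int = 2) -> bool:
    """Return True if a procedural page counts as passed.

    Assumption: a page is failed as soon as the student misses more than
    `max_missed` problems (2 on the integer test described above).
    """
    if num_missed < 0:
        raise ValueError("num_missed cannot be negative")
    return num_missed <= max_missed


# Two small mistakes still pass; a third one fails the page.
print(passes_procedural_page(2))  # True
print(passes_procedural_page(3))  # False
```

The point of writing it this way is that there is no partial credit hiding in the rule: the page is either passed or it is a candidate for reassessment.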

This was one of the toughest things for me at the beginning of the year – explaining to students that they failed one page of the test because of several small mistakes. They’re not used to this, so they’re not happy about this, so it created some friction early in the year. But I’ve gotten better at this conversation as the year’s progressed and as my students have started to understand how high I’ve set my standards for these procedural skills. The conversation I’ve started having is: “Let’s say I asked you to spell your name 8 times. Should be easy, right? You know your name – no big deal. So you go to spell it and you give it to me, but I tell you that on the 5th line, you misspelled your name. Even though it’s just one time, what am I supposed to think? Spelling your name is something I would expect everyone to do no matter how many times – if you can’t, we need to have a serious talk, or you better do it 8 more times and prove to me that you really do know it. That’s what integers are for me: you need to be able to do it every single time. And if you can’t, either we need to talk, or you need to try again and prove to me that you can”.

Teaching Note: Procedural Skills almost assume that students will need to reassess. Multiple Times. And the work they do to reassess is not at the same level as the more conceptual skills in your course – some students with very low skills will need these basics from scratch, but many students will just need lots and lots of practice. I’ve solved this problem with my Wall of Remediation and by having assessment templates that I can use to create reassessments quickly and on the fly. But, in my experience, my high standards make earning a 100% on these pages sooooooo satisfying.

Conceptual Skills

These are the meat of my unit – the central conceptual understanding that I need my students to walk away with. These are skills that I imagine as scaffolded – there is a basic understanding, a strong understanding, and a mastery understanding. These are skills that usually have a problem-solving component or an ‘explain’/‘justify’/‘analyze’/‘sketch’ component embedded in them. They’re the ones where I really spend time trying to think about how to assess: “What’s the right question to ask so that I can tell that they truly understand what they’re doing? How do I know they’re not mindlessly applying a procedure?”. When I think of a ‘bad’ test question I’ve written, it’s usually a question trying to assess one of these standards.

Outside Influence: I think this post by Jason Buell does a great job of emphasizing part of what I’m talking about: on a traditional test, how do you handle a student who nails all the trivial application of skills/vocabulary questions, but falls short on the application and synthesis questions? The resulting conversation about grading is worth reading too.

I’ve started creating Tiered Assessments for these skills, which I first read about at the It’s All Math blog but have since rediscovered a few other places. The basic idea is: You make a decision about what types of problems/prompts demonstrate ‘Mastery’ versus ‘Strong Understanding’ versus ‘Weak Understanding’ versus ‘No Understanding’. Or, for students who think purely in terms of grades, what an “A” student can do, what a “B” student can do, a “C” student, and a “D” student (and if you miss all of them, you’re an “F” student). I use numbers to communicate these ideas:

1 = Weak Understanding, 2 = Basic Understanding, 3 = Strong Understanding, 4 = Mastery with Small Mistakes, 5 = Mastery

This satisfies pretty much all of my goals for an assessment: it clearly communicates my expectations, it informs students about the remediation that they need, and it helps me collect data about my class. Creating one of these assessments involves deciding what my level 2, 3, and 5 problems look like.
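Here is a rough sketch of how scores on this scale could roll up into an assessment average. It assumes each page goes into the gradebook out of 5 and the pages are simply averaged together; the names `LEVEL_LABELS` and `assessment_average` are hypothetical, and the 0 entry is an assumption for a page that shows no understanding at all.

```python
from statistics import mean

# Labels for the 5-point scale above. The 0 entry is an assumption for a
# page where no understanding is shown.
LEVEL_LABELS = {
    0: "No Understanding",
    1: "Weak Understanding",
    2: "Basic Understanding",
    3: "Strong Understanding",
    4: "Mastery with Small Mistakes",
    5: "Mastery",
}

MAX_SCORE = 5


def assessment_average(scores):
    """Return the average of per-standard scores as a fraction of 5."""
    if not scores:
        raise ValueError("need at least one score")
    for score in scores:
        if score not in LEVEL_LABELS:
            raise ValueError(f"score must be an integer from 0 to 5, got {score}")
    return mean(scores) / MAX_SCORE


# Hypothetical student who earns a 3 on every standard.
print(assessment_average([3, 3, 3]))  # 0.6
```

Under this averaging assumption, a student with a 3 on every page sits at 60%, so where the level 3 problems are pitched decides what a passing grade actually means.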

I design my level 2 problems with the same philosophy as my Procedural standards: “These are problems everyone should be able to do consistently. Low-level Blooms. If you can’t get this right, we need to have a serious talk about these ideas”.

I design level 3 problems with the idea “What questions can I ask that require you to make a choice about how to apply what you know? That may be multi-step or rely on some foundational procedural skill in addition to the current conceptual skill?” I usually use released items from the state assessment to gauge where these problems should be.

I design level 5 problems with the idea “Okay – prove to me that you really know what you’re doing. You’ll either have to apply this skill to a slightly new context, or decide how to apply it multiple times, or explain your thoughts in a way that proves to me that you know what you’re doing”. I’ve told my students: “my assessments are like an argument from you to me: it’s your job to convince me that you really understand what you’re doing. You can do this with your scratch work, with your explanations, or with your pictures – but whatever you do, it’s your job to be clear and correct so I believe you”. I think this especially applies with level 5 problems: I want to design a problem that really requires a student to do some leg-work to show me that they understand what they’re doing. For really conceptual problems, I want them to really explain what they know for me to judge. For procedural problems, I want there to be some sort of problem-solving or ‘habits of mind’ aspect to the problem that they’ll need to apply. When considering these problems, I usually look at Common Core resources, the Park Math curriculum, or any set of problems grounded in problem-solving strategies or habits of mind.

Outside Influence: Jason Buell also has a guide on how to create these Tiered Assessments which I think is definitely worth a read.

Here are some tiered assessments I’ve made that I’m proud of:

[Embedded Scribd document: tiered assessment on geometric probability]

Analysis: I feel like the level 2 question gives me an immediate entrance into how the student thinks – the shapes are simple and the questions are simple. If a student misses this, we definitely need to have a talk, although I’ve debated giving them the areas as well rather than having them calculate them. The level 3 questions are straightforward if you know what you’re doing and build on a foundational skill (Calculating Area). But the level 5 question requires explanation, analysis and reasoning, and it really gets to the heart of how a student understands probability and how it relates to area.
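For readers who want the arithmetic behind these questions, here is a small sketch of the area-to-probability relationship. All the numbers are hypothetical (they are not the shapes on the assessment), and the inverse-proportion rule for point values is only one possible way to turn "smaller area, more points" into numbers, not the required solution path.

```python
def hit_probability(region_area: float, board_area: float) -> float:
    """Geometric probability that a random point on the board lands in the region."""
    if board_area <= 0 or not 0 <= region_area <= board_area:
        raise ValueError("areas must satisfy 0 <= region_area <= board_area, board_area > 0")
    return region_area / board_area


def inverse_points(reference_points: float, reference_area: float, region_area: float) -> float:
    """One possible scoring rule (an assumption): points inversely proportional to area."""
    if reference_area <= 0 or region_area <= 0:
        raise ValueError("areas must be positive")
    return reference_points * reference_area / region_area


# Hypothetical board of area 100 with a region of area 10 worth 20 points.
print(hit_probability(10, 100))    # 0.1
# A region with twice the area is twice as likely to be hit, so half the points.
print(inverse_points(20, 10, 20))  # 10.0
```

A student does not need to compute exact values like these to answer well; comparing relative areas and explaining why the points should go up or down is the understanding the question is after.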

[Embedded Scribd document: tiered assessment on parallel line angle relationships]

Analysis: For the Level 5 Problem: I’ve written about Parallel Line Mazes before, but the gist is: a student has to ‘jump’ from angle to angle using the different parallel line relationships (Alternate Interior, Vertical, Corresponding, etc.) while meeting a certain set of criteria. This problem challenges the student to know more than just the names of the relationships: they must know how to apply those relationships in a novel situation, and they must be comfortable with certain problem-solving strategies and perseverance.

See Also: Sam Shah’s Favorite Test Question is a Level 5 question – novel, gets to the heart of a student’s understanding, and requires explanation.

I can tell already – whenever I make an assessment I’m proud of, it’ll be when I’ve found the perfect Level 5 Question and the right transition from Level 2 to Level 3 to Level 5. I’m not there yet with all my assessments, but I think this is a good start. I feel extremely confident about the labels of ‘Mastery’ vs ‘Strong Understanding’ vs ‘Weak Understanding’ with the way this test is broken up, especially since I haven’t padded my test with extra questions just to hide what they do or don’t know.

Creating Better Assessments

Michael Fenton has written about his frustration with SBG assessments being purely application of skills. This is something I can absolutely relate to – when I first started implementing SBG and following the guides that I read online, I began feeling that the only type of assessment I could write was one that acted as a checklist of skills for my students to do. I struggled trying to find a way to keep that balance – of promoting problem-solving skills and ‘habits of mind’ while still holding students accountable for basic application of skills. This is the struggle that led to this blog post and my curiosity about assessments – I haven’t had very long to implement these types of assessments, but I feel pretty good about the direction this is going.

I’m still curious how other people write assessments. Michael Fenton is leading the charge and I highly recommend reading and responding to his post over at his blog. Tina C has written about her process and Lisa Henry is asking for feedback on a test question of her own. I think this endeavor is related to the question of “How do we create opportunities for our students to exceed our expectations”, and I’m excited to see these conversations continue and grow so that we’re all searching for these Level 5 Questions to give our students.

Some Parting Words from Sam Shah: (If there’s one thing I’m good at, it’s aggregating posts from the Blogotwittersphere with a similar theme, even if they’re from ages ago). Here’s Sam from when he gave a test that really asked students to express their thoughts:

“For me the obvious corollary is that: we need to start rethinking what our assessments ought to look like. If we want kids to truly understand concepts deeply, why don’t we actually make assessments that require students to demonstrate deep understanding of concepts?”


18 Comments

malcolm

This is a really helpful post! I’ve been struggling with this a lot this year. It’s my first time teaching Algebra, after several years of teaching Geometry, and I’m deeply frustrated with how much my assessments feel like checklists of skills, and repetitive skills at that.

You probably have this somewhere else on your blog, but do you mind if I ask how you calculate grades from this 5 point scale? It looks to me like anyone getting all 3s should definitely be passing, so I assume you don’t just enter grades on a scale of 0 to 5 and take an average, right?

- mathymcmatherson
  
  Malcolm,
  
  Each test goes into the gradebook out of 5 points. So you’re right – someone who gets all 3’s has earned a passing grade in my class. And I’m okay with that based on how I’ve defined my level 3 problems. And yes – their assessment average is their assessment scores (out of 5) all averaged together.
  
  How I handle the problem of ‘how do grades accurately represent my students’ knowledge?’ is another blog post of its own, and is related to what you’ve asked about.
  
  - malcolm
    
    Where I teach, the grade scale is mandated and different from the usual so 60% is still failing. I find I have to fudge it a little; one year I did the scale from 1-6 instead of 0-5, where everyone got a free point on every goal, but that confused the message somehow.
    
    I also work at a school where we’re required to have every 8th grader take Algebra I, which seems absurd to me, but it means it’s hard to know where to set the mastery level. I love the idea of separating procedural from conceptual and being incredibly strict on the procedures; so many of my students come to me not having almost any of the procedural prerequisites to succeed in Algebra, and making it easier for them to scrape by at the beginning hasn’t made them any more successful in the long run.
Mr.Atkinson

I love the breakdown between procedural and conceptual skills but I’m still struggling with the synthesis skills. I can’t tell if it’s the ability to synthesize altogether or the ability to synthesize within a certain skill. You have habits of mind mixed in with the procedural skills but I would separate them out or put them with synthesis skills.

Do you give procedural quizzes and conceptual quizzes at different times or just on different pages of the same assessment?

I know this doesn’t exactly fit in with the blog, but have you talked with teachers in other disciplines about how this would work in their English or social studies or science class?

- mathymcmatherson
  
  Mr Atkinson,
  
  You’re right that a ‘synthesis skill’ is still unclear – I’m preparing another post devoted entirely to what I mean by those. I decided this post was already long enough. There’s another one coming that’s just my ideas about Synthesis skills.
  
  I haven’t been giving quizzes this year, but I’m going to start next year. Instead, all of the items I’ve posted would appear as single pages as part of a larger assessment. There could be procedural, conceptual, and synthesis skills as pages in the same assessment. I have certain procedural skills show up multiple times throughout the year, especially if they’re foundational for something else we’ve been working on. So, it’s possible for me to give a ‘test’ (what I call an assessment) that has all 4 of the pages in the post above.
  
  I’m still fleshing out my ideas about the best ways to implement these assessments and supplement them. And there’s also something to be said for how I grade these assessments. I’m learning that writing about assessment takes a lot of work.
  
  - malcolm
    
    Thank you for taking the time! I’m really feeling the need to revamp my assessments for next year and this post & discussion has really set me pondering.
    
    I know the boring logistical stuff isn’t as interesting as the deeper quality-questions part of the assessment process, but just so I can get a whole picture – how often do you give assessments this year, and how many times would each conceptual skill appear on an assessment? And do you feel like you’ll keep that system next year?
    
    I ask because I, so far, have given weekly overlapping quizzes in the Dan Meyer mold, and I find that they become really long if I make the questions deeper. It might be unsustainable if every topic has questions to your (obviously outstanding!) level of rigor. I want to move more in that direction and I’m trying to figure out how that looks in my head.
mathymcmatherson

Malcolm,

Your ‘boring logistical’ questions are important too – I’m developing a whole post as part of that conversation too. I’m realizing that writing about assessment is all sorts of tricky because of all the details surrounding it.

I started the semester just like you describe – quiz every two weeks with overlapping skills. And I became frustrated based on what you described – that I felt like I couldn’t ask the ‘deeper’ questions that I wanted because I was constrained by the format. Again – another reason I wanted to see what people’s assessments looked like: I was wondering if I was asking the wrong questions, or if that particular format was fundamentally limiting in the types of questions I could ask.

So – next year – I’m not doing assessments every few weeks, but I want to do separate quizzes every few weeks that serve this focus – lower-level questions meant to continually assess and refresh students about their understanding.

I also imagine my units going in this fashion: start with the foundational ideas, which are probably procedural in nature. Halfway through the unit, assessment on these procedural standards so students know where they stand. As the unit progresses, we begin to delve into the more conceptually rich standards. Once we’ve reached the level of rigor that I want, we assess over the conceptual and procedural skills. So, the procedural skills show up again, but it’s the first time for the conceptual skills. If everyone bombs the conceptual skills, then I put it on the bellwork for a week or so and retest (this is something I did this year with geometric probability to a fair amount of success).

Also: last semester, my final was 17 pages long: one page for each skill. So every student had a chance, on the final, to raise their grade one last time. This, to me, is something _powerful_ you can do with a final exam if you assess this way – it truly is the ‘final’ chance to show that you know something. It also means if a student has worked hard all year and is passing at the final, then they are guaranteed to pass the class – something that some students found very motivating halfway through the semester. Anyway – I mention this to answer your question: most skills show up at least twice – once on an initial assessment and again on the final. This semester, I don’t think I’ll assess every skill (it was a pain in the butt to grade under a deadline). Instead, I was thinking of picking all of the conceptual standards I have and testing just on those – which would be something like 9-10 for 2nd semester.

Anyway – those are some brief thoughts. More writing on all of this to come.

Jason Roy (@roybot)

I can’t help but ask about the level 5 question on the Geometric Probability quiz since I’ve seen you post it twice. I think you are wanting the kids to create a proportion to decide the point value of shape B? If I took this test, this is probably what I would do.
I wonder though, is this what the kids did? Did any question the question? All the question says is that Mr. Schneider is making a game and A is worth 20 points. It doesn’t say that points are proportional to the probability that they will be hit. And of course if you look at a real dartboard for more than a few seconds it is obvious that the points are not divided in such a way. And it’s a game, right, so much of the fun is figuring out the loopholes we can use to win!
I’m not sure any of my kids would go down these roads, but if they did they would certainly exceed my expectations.

- mathymcmatherson
  
  Jason,
  
  One of the ‘big ideas’ for my Geo Probability unit was that point values and probability have an inverse relationship – what I told my kids: the _smaller_ the area, the _more_ points it should be worth. I like my level 5 question because it’s one of the first times I found a way to really assess this big conceptual idea. And it all hinges on explanations.
  
  When I gave the test, I wasn’t expecting students to use proportions to determine exact answers (although some did, which is one instance where students exceeded my expectations). Instead, I expected them to use the relative sizes of the areas to pick their points. They should have noticed that shape B is bigger than shape A, and thus should be worth fewer points. They should have noticed that shape C and shape A have the same area, so they should be worth the same number of points. Then they should know that in order to have a shape worth more points than shape A, it should have less area than shape A. Overall – I was really satisfied with the explanations my students gave and how these ideas all connected together.
  
  One thing I like about this question is that there are many possible answers, and the validity of each depends on the explanation that the student gives. If they can provide an argument/explanation that shows me they understand the conceptual underpinnings of geometric probability, then I will give them the points. I don’t think I want to design level 5 questions with a specific solution path in mind – I want it to be slightly open, but requiring an explanation to demonstrate understanding.
  
  - Jason Roy (@roybot)
    
    Daniel,
    Wow, your answer to my question (which I was really hoping did not come off as snarky, although I had been worrying about that) was really great. I love that your level 5 questions don’t have right or wrong answers as well.
    
    I am going to try to post more assessment stuff on my fledgling blog. I gave a quiz last week and was able to snap some responses to the trig proof I included on it. I just posted this http://crispymath.com/news/2013/3/28/a-trig-assessment – before I noticed your reply even.
  - mrdardy
    
    Mathy
    I took the liberty of ‘borrowing’ those three geometric probability questions for my Calc BC class. I give them a take home problem set each week as a way to give them practice at both sharing ideas (which sometimes, unfortunately, means copying work) and at looking at novel problems. They all got it ‘right’ and presented some interesting arguments. I was surprised though that a few expressed a bit of annoyance with those problems. I take that as a sign of thinking going on.
Joshua Bowman (@Thalesdisciple)

I’m going to have to switch to your 5-point scheme next time I use SBG instead of the 4-point scheme I used this semester. Several reasons, the main one being that I don’t know how to score a problem that clearly demonstrates mastery of the topic but has a small error.
