Assessment and Learning Study Guide for the Teacher Certification Exam

Study assessment and learning for the teacher certification exam. Covers formative vs summative assessment, validity, reliability, and differentiation.

Topic Overview

Formative assessment is assessment for learning: it occurs during the instructional process to provide feedback that the teacher and student use to adjust learning and teaching while there is still time to act on the information. Formative assessment does not have to be formal; it includes observation, questioning, exit tickets, whiteboards, think-pair-share, and brief quizzes. The defining characteristic is that the results are used to inform instruction -- if a teacher gives a quiz but never reviews the results to adjust teaching, it is not truly formative. Research by Black and Wiliam shows that well-implemented formative assessment produces some of the largest gains in student learning of any instructional intervention.

Summative assessment is assessment of learning: it occurs at the end of an instructional unit to measure student achievement against a standard. Examples include end-of-unit tests, final exams, standardized state assessments, and performance projects. Summative assessments are typically used for grading and accountability rather than for adjusting ongoing instruction. Both formative and summative assessments are necessary; the error is treating summative assessment as the only measure of learning.

Validity is the degree to which an assessment measures what it is intended to measure. A test with high content validity covers a representative sample of the content from the instructional domain (the test measures what was taught). Construct validity is the extent to which a test measures the theoretical construct it is supposed to measure. Predictive validity is the degree to which test scores predict future performance (for example, SAT scores predicting college GPA). An assessment can be reliable without being valid (consistently measuring the wrong thing), but it cannot be valid without being reliable.

Reliability is the consistency of assessment results across time, items, raters, and test forms. Types of reliability include: test-retest reliability (same test given twice yields similar scores), inter-rater reliability (two different scorers give consistent scores for the same performance), internal consistency (all items on the test measure the same construct), and alternate forms reliability (two versions of the same test yield similar scores). Reliability is a necessary but not sufficient condition for validity.
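
Test-retest reliability is often summarized as a correlation between the two sets of scores. The sketch below is one way to compute it, assuming a Pearson correlation is the chosen statistic; the function name `pearson_r` and the scores are hypothetical and exist only for illustration.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length lists with at least 2 scores")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of cross-products of deviations from each mean.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when all scores are identical")
    return cov / sqrt(var_x * var_y)


# Hypothetical scores for five students on the same quiz, given two weeks apart.
first_attempt = [12, 15, 9, 18, 14]
second_attempt = [13, 14, 10, 17, 15]

r = pearson_r(first_attempt, second_attempt)
print(f"test-retest r = {r:.2f}")
```

A value close to 1 means students kept roughly the same relative standing across the two administrations. It says nothing about whether the quiz measures the intended content, which is a validity question.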

Norm-referenced vs. criterion-referenced assessment: a norm-referenced test (NRT) compares a student's performance to a normative sample (other test-takers) and produces scores such as percentile ranks and stanines. NRTs are designed to spread scores across a range and are used when ranking is important (such as college admissions tests). A criterion-referenced test (CRT) measures whether a student has achieved a defined standard of performance, regardless of how others performed. Most state standards-based assessments and classroom tests measuring mastery are criterion-referenced. Teacher certification exams are criterion-referenced: there is a passing score based on required competency, not on how other candidates perform.
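
The same raw score can be read in both ways. The sketch below is illustrative only: the norm group, the 50-item test length, and the cut score of 40 are made-up values, and `percentile_rank` counts only scores strictly below the student's score (some publishers also count ties, so reported values can differ slightly).

```python
def percentile_rank(score, norm_group):
    """Percent of the norm group scoring strictly below `score`."""
    below = sum(1 for s in norm_group if s < score)
    return 100 * below / len(norm_group)


def meets_criterion(score, cut_score):
    """Criterion-referenced decision: compare to a fixed standard only."""
    return score >= cut_score


# Hypothetical raw scores (out of 50 items) for a ten-person norm group.
norm_group = [22, 25, 27, 30, 31, 33, 35, 36, 38, 40]
total_items = 50
cut_score = 40  # assumed passing standard, for illustration
student_score = 34

pr = percentile_rank(student_score, norm_group)
pct_correct = 100 * student_score / total_items
passed = meets_criterion(student_score, cut_score)

print(f"percentile rank: {pr:.0f}")       # standing relative to the norm group
print(f"percent correct: {pct_correct:.0f}")  # share of items answered correctly
print(f"met the standard: {passed}")      # independent of other test-takers
```

With these numbers the student outscored 60 percent of the norm group, answered 68 percent of items correctly, and still did not meet the standard. The three figures answer three different questions.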

Differentiated instruction is a framework for responding to student variance in readiness, interest, and learning profile by adjusting content (what students learn), process (how they learn it), and product (how they demonstrate learning). Differentiation is not giving different students different content standards; it is providing varied paths to the same rigorous learning goals. Data from formative assessment is the primary driver of differentiation decisions -- teachers use what they learn from ongoing assessment to form flexible groups, re-teach, extend, or provide additional scaffolding.

Common Mistakes to Avoid

Confusing formative and summative assessment: formative assessment happens during instruction and is used to adjust teaching and learning. Summative assessment happens at the end of a unit or course and measures final achievement. The distinction is about purpose and timing, not format -- a quiz can be either formative or summative depending on how it is used.
Confusing validity and reliability: reliability is consistency (the test gives similar results each time it is administered). Validity is accuracy (the test measures what it is supposed to measure). A reliable test is not automatically valid, but a valid test must be reliable. On the exam, remember: reliability is necessary but not sufficient for validity.
Treating norm-referenced and criterion-referenced scores as equivalent: a percentile rank (NRT) tells you how a student performed relative to other test-takers, not whether the student has mastered specific content. A criterion-referenced score (CRT) tells you whether the student met a defined performance standard.
Forgetting that standardized test scores such as percentile ranks and stanines are norm-referenced: a percentile rank of 70 means the student scored higher than 70% of the norm group, not that the student answered 70% of questions correctly.
Assuming differentiated instruction means different content standards for different students: differentiation adjusts the path to learning (content, process, or product) but not the learning goal itself. All students work toward the same standards at appropriate levels of challenge.
Overlooking the use of data in instructional planning: assessment data (from formative checks, benchmark tests, and state assessments) should directly inform instructional decisions such as grouping, re-teaching, pacing, and enrichment, rather than simply being collected and filed.
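
As an illustration of using assessment data rather than filing it, the sketch below breaks quiz results down by topic and flags topics that may need re-teaching. Everything in it is hypothetical: the student labels, the item-to-topic mapping, and the 60 percent threshold are assumptions chosen for the example, not a recommended standard.

```python
from collections import defaultdict


def percent_correct_by_topic(responses, item_topics):
    """Return {topic: percent of responses answered correctly}.

    responses:   {student: [1 or 0 per item]}, where 1 means correct
    item_topics: list giving the topic of each item, in item order
    """
    correct = defaultdict(int)
    attempts = defaultdict(int)
    for answers in responses.values():
        for topic, is_correct in zip(item_topics, answers):
            attempts[topic] += 1
            correct[topic] += is_correct
    return {t: 100 * correct[t] / attempts[t] for t in attempts}


def topics_to_reteach(topic_pct, threshold=60):
    """Topics whose class-wide percent correct falls below the threshold."""
    return sorted(t for t, pct in topic_pct.items() if pct < threshold)


# Hypothetical four-item quiz: two fraction items, two decimal items.
item_topics = ["fractions", "fractions", "decimals", "decimals"]
responses = {
    "student_a": [0, 0, 1, 1],
    "student_b": [0, 1, 1, 1],
    "student_c": [1, 0, 1, 0],
    "student_d": [0, 0, 1, 1],
}

topic_pct = percent_correct_by_topic(responses, item_topics)
reteach = topics_to_reteach(topic_pct)
print(topic_pct)
print("re-teach:", reteach)
```

A breakdown like this only points to where to look. Deciding whether to re-teach the whole class, pull a small group, or adjust pacing remains a professional judgment.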

Checkpoint Quiz

Test your understanding of Assessment and Learning

These questions are for study practice only and are not official exam questions.

1. Which assessment strategy BEST promotes student metacognition?
2. A teacher administers the same math quiz twice in two weeks without re-teaching. Students score similarly both times. This suggests the quiz has high:
3. Which of the following is the BEST example of a formative assessment?
4. A teacher wants to ensure that an assessment has strong content validity. The MOST important step is to:
5. A 504 Plan differs from an IEP primarily in that a 504 Plan:
6. After analyzing class test data, a teacher notices that 80% of students missed every question about fractions but performed well on all other topics. The MOST appropriate instructional response is to:
7. A student with a reading disability receives extended time and a human reader on a standardized science assessment per their IEP. A colleague argues that these accommodations give the student an unfair advantage. The MOST accurate response to this concern is:
8. Which of the following BEST describes criterion-referenced assessment?
9. A teacher uses pre-assessment data to form three flexible reading groups before starting a new novel unit. Which term BEST describes the purpose of this pre-assessment?
10. A district notices that students from lower-income households consistently score significantly lower on a standardized assessment even after controlling for instructional quality. This pattern most likely signals a need to examine:

Frequently Asked Questions

What is the difference between formative and summative assessment?

Formative assessment occurs during instruction and provides feedback used to adjust learning and teaching in real time. Examples include exit tickets, questioning, and brief checks for understanding. Summative assessment occurs at the end of a learning period to measure final achievement against a standard. Examples include end-of-unit tests and state assessments. The key difference is purpose: formative is for learning, summative is of learning.

What is the difference between validity and reliability in testing?

Reliability is the consistency of a test -- it gives similar results under similar conditions. Validity is the accuracy of a test -- it measures what it is intended to measure. A test can be reliable (consistent) without being valid (if it consistently measures the wrong thing). A valid test must also be reliable. Think of reliability as a necessary but not sufficient condition for validity.

What is a norm-referenced test and how does it differ from a criterion-referenced test?

A norm-referenced test compares a student's performance to a normative sample and produces relative scores (percentile ranks, stanines). It is designed to spread students across a distribution and is used when ranking is the goal. A criterion-referenced test measures whether a student has achieved a specific performance standard, regardless of how others performed. Most classroom mastery tests and teacher certification exams are criterion-referenced.

What does a percentile rank of 65 mean?

A percentile rank of 65 means the student scored as well as or better than 65 percent of the students in the norm group. It does not mean the student answered 65 percent of questions correctly. Percentile ranks are norm-referenced scores and indicate relative standing within a comparison group.

What is differentiated instruction?

Differentiated instruction is a teaching approach that responds to the diverse readiness levels, interests, and learning profiles of students by adjusting content (what is taught), process (how students access and work with ideas), and product (how students demonstrate learning). Differentiation is not different standards for different students; all students work toward the same learning goals through varied pathways. Formative assessment data drives grouping and pacing decisions in a differentiated classroom.