Assessment & Evaluation

Designing assessment systems that actually predict job performance

A four-project + capstone sequence applying instructional design and assessment theory to a real-world corporate training problem — from diagnosing what's broken to building a system that stays effective over time.

Corporate L&D / Call Center Training

~200 customer service reps, 3 call centers

Lead Instructional Designer

EDUC 42595 — Assessment & Evaluation

TechFlow Solutions: when training "works" on paper and fails in practice

TechFlow Solutions runs a four-week, fully online Customer Service Excellence Program for approximately 200 new hires and tenured reps across three call centers. Annual investment: around $150,000. On paper, the program was delivering — completion rates were high, assessment scores were strong.

On the floor, none of it translated. Escalation rates for recent graduates were running 35% above baseline. Supervisors were describing the training scenarios as "nothing like what we actually deal with" and spending hours re-training people who had just passed. Learners were completing the program and still reporting they didn't feel ready to handle an angry customer.

The question wasn't whether the training had problems. The question was whether the right fix was patching what existed or building something structurally different. These four projects work through that question systematically.

"High assessment scores do not reliably predict on-the-job performance — which means the assessments are measuring the wrong things, not that learners aren't learning."

Four projects, one continuous design arc

Each project builds directly on the last. The progression moves from diagnosing the problem, to designing a solution, to implementing it with integrity, to ensuring it keeps working — which is also how a real L&D initiative would unfold.

Project Foundation & Assessment Diagnosis

Defining the context, establishing learning objectives, and identifying what the existing system gets wrong

  • Before designing anything, I needed to understand what the current system was actually measuring — and why it was producing results that looked good on paper but translated poorly on the floor. This project establishes the TechFlow context in detail, defines five performance-focused learning objectives, and diagnoses three structural problems with the existing assessment architecture: lack of authenticity, surface-level handling of academic integrity risks, and misalignment between scores and business outcomes.

    • Grounded learning objectives in job performance criteria, not content recall

    • Identified construct validity as the root problem, not learner motivation

    • Established that criterion-referenced assessment is appropriate for this context — and that the criteria themselves were wrong

    • Set up the case for DMADV over incremental improvement

Constructive alignment

Stakeholder analysis

Criterion-referenced assessment

Performance-based objectives

Assessment validity

Assessment Blueprint

Designing the formative and summative assessment architecture before a single tool is selected

  • With the diagnosis complete, this project lays out the assessment philosophy and strategic architecture. The central question I kept coming back to: who gets to define what effective performance looks like? I grounded the answer in supervisory expectations and frontline practice — not executive-level metrics. The blueprint prioritizes formative assessment early and often, followed by a smaller set of high-impact summative tasks designed to generate evidence of actual readiness, not completion.

    • Assessment philosophy rooted in supervisory and frontline expertise, not executive reporting

    • Formative-heavy design supports learner confidence-building before high-stakes evaluation

    • Each assessment given a distinct, non-overlapping function — avoiding the blur between formative checkpoints and summative judgment

    • UDL principles embedded across multiple means of expression

Assessment philosophy

Peer feedback

Formative / summative balance

Learner agency

UDL

Rubric design

Implementation Plan & Six Sigma Process Application

Rebuilding the assessment system from the ground up using DMADV, GoReact, and behaviorally anchored rubrics

  • This is where the design gets operational. After determining that DMAIC — iterative improvement of what exists — was the wrong framework for a system with structural validity problems, I applied DMADV to design a replacement assessment architecture end to end. The centerpiece is a communication performance task evaluated through GoReact, which handles AI-assisted pre-sorting at scale before human evaluators make final readiness determinations. Integrity is addressed through personalization and process documentation rather than surveillance.

    • DMADV selected over DMAIC — structural misalignment requires redesign, not patch

    • GoReact deployed as an AI-assisted pre-sort layer, reducing evaluator load while preserving human judgment at the decision point

    • Behaviorally anchored rubric criteria calibrated to actual call types documented by supervisors, not generic scripts

    • Academic integrity framework treats the old system's design as the problem, not the learners

    • Implementation timeline and resource requirements stress-tested against the 200-learner scale (a rough sizing sketch follows this list)
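
To make that stress-test concrete, here is a minimal sizing sketch. The review times and the pre-sort flag rate are illustrative assumptions, not figures from the TechFlow plan; what matters is the shape of the math at 200 learners — all-human review versus an AI-assisted pre-sort with human judgment kept at the decision point.

```python
# Rough sizing sketch for one 200-learner cohort. All parameters are
# illustrative assumptions, not figures from the TechFlow plan.

LEARNERS = 200
MINUTES_PER_FULL_REVIEW = 15   # assumed human review time per recorded performance task
PRESORT_FLAG_RATE = 0.40       # assumed share of submissions flagged for full human review
MINUTES_PER_SPOT_CHECK = 4     # assumed quick verification of submissions the pre-sort clears

def evaluator_hours(learners: int, with_presort: bool) -> float:
    """Estimate total human evaluator hours for one cohort."""
    if not with_presort:
        return learners * MINUTES_PER_FULL_REVIEW / 60
    flagged = learners * PRESORT_FLAG_RATE
    cleared = learners - flagged
    return (flagged * MINUTES_PER_FULL_REVIEW + cleared * MINUTES_PER_SPOT_CHECK) / 60

baseline = evaluator_hours(LEARNERS, with_presort=False)
assisted = evaluator_hours(LEARNERS, with_presort=True)
print(f"All-human review:  {baseline:.0f} evaluator hours per cohort")
print(f"With AI pre-sort:  {assisted:.0f} evaluator hours per cohort")
print(f"Reduction:         {100 * (1 - assisted / baseline):.0f}%")
```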

DMADV / Six Sigma

Academic integrity

Artificial Intelligence

Inter-rater reliability

Behaviorally anchored rubrics

Scalability planning

Continuous Improvement Plan

Defining how the redesigned system stays accurate, equitable, and effective over time

  • A well-designed assessment system still calcifies if no one tends to it. This project defines the infrastructure for keeping the system honest: PDCA cycles as the operational tool within a broader Continuous Improvement culture, three-interval data collection (post-cohort, 90-day, annual), and a tiered reporting structure designed so that each audience — the L&D team, supervisors, leadership — gets the signal that's actually useful to their decision-making. The success metrics are deliberately pointed: not whether improvement cycles are running, but whether the system's predictive validity is actually improving over time.

    • PDCA cycles nested within a broader Continuous Improvement Model — fixes compound when they run inside a data-driven culture rather than in isolation

    • 90-day predictive validity check as the primary test of whether the assessment is doing its job (see the sketch after this list)

    • Quarterly Dashboard designed to a "readable in five minutes" standard — editorial discipline over data exhaustiveness

    • Equity review built into the annual cycle, not treated as a separate initiative

    • Improvement success defined by sustained outcomes, not by volume of cycles run
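
As an illustration of what the 90-day predictive validity check could look like operationally, the sketch below correlates summative assessment scores with 90-day escalation rates. The file name, column names, and the choice of a simple Pearson correlation are assumptions made for this example; the plan itself defines the actual data sources and thresholds.

```python
# Minimal sketch of the 90-day predictive validity check. The file name,
# column names, and the use of a simple Pearson correlation are assumptions
# made for this example; the plan defines the actual data sources.

import csv
from statistics import correlation  # available in Python 3.10+

def predictive_validity(path: str) -> float:
    """Correlate summative assessment scores with 90-day escalation rates.

    If the assessment is doing its job, the value should be clearly negative:
    higher scores should go with fewer escalations on the floor.
    """
    scores, escalations = [], []
    with open(path, newline="") as f:
        for row in csv.DictReader(f):
            scores.append(float(row["assessment_score"]))
            escalations.append(float(row["escalation_rate_90d"]))
    return correlation(scores, escalations)

print(f"Score vs. 90-day escalations: r = {predictive_validity('cohort_export.csv'):+.2f}")
```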

PDCA

Stakeholder reporting

Continuous improvement

Equity review

Predictive validity

KPI design

Complete Assessment Strategy

The Final Project synthesizes everything from the four Mini Projects into a single, implementation-ready document addressed to TechFlow's Training Director. Where the Mini Projects developed individual components of the assessment system, this strategy presents the full architecture — refined, integrated, and pressure-tested — alongside the professional recommendations and reflective practice documentation that would accompany a real proposal to organizational leadership.

Complete Assessment Strategy

A full redesign of TechFlow's assessment architecture, from structural diagnosis to implementation-ready strategy

  • This document is written as a professional deliverable to TechFlow's Training Director — not as academic coursework. It opens with an executive summary that names the original system's core failure plainly: a 98% completion rate and an 87% average assessment score coexisting with a post-training escalation rate 35% above baseline. Every design decision that follows connects back to one standard: assessment should predict job performance, not generate completion records.

    The strategy covers the full arc — a refined three-component assessment architecture (LMS branching simulations, GoReact-supported communication performance task, and structured reflection protocol), a CLEAR-framework technology integration rationale, an adaptability and scalability plan, data governance and ethical use framework, and a continuous improvement infrastructure built to evolve with each cohort. It closes with a professional reflection on the design process itself, naming the areas of growth — scalability thinking, ethical data use, rubric calibration at scale — with the same directness applied to the TechFlow diagnosis.

    • Executive-level communication: presenting a complex system to a non-specialist decision-maker without losing rigor

    • Ethical data use framework for mandatory-participation workplace training — where the stakes on transparency are higher than in optional contexts

    • Three-year structural review cadence with specific early-trigger signals, not just a calendar reminder

    • Explicit professional reflection on limitations and continued development areas — what a confident designer looks like when they're being honest

    • Appendices designed for actual use: rubric ready to hand to evaluators, reporting structure ready to hand to leadership
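
The sketch below illustrates the tiering idea behind that reporting structure: each audience sees only the signals tied to its decisions. The metric names and groupings are invented for the example, not the strategy's actual dashboard specification.

```python
# Illustrative sketch of the tiered reporting idea: each audience gets only
# the signals tied to its decisions. Metric names and groupings are invented
# for the example, not the strategy's actual dashboard specification.

TIERED_REPORT = {
    "L&D team": [      # full diagnostic detail for the people maintaining the system
        "item-level rubric score distributions",
        "inter-rater agreement by evaluator pair",
        "formative checkpoint completion patterns",
    ],
    "Supervisors": [   # cohort readiness, framed around the call floor
        "readiness determinations by call type",
        "90-day escalation rate vs. baseline",
        "re-training hours logged per graduate",
    ],
    "Leadership": [    # quarterly, held to the readable-in-five-minutes standard
        "predictive validity trend (scores vs. 90-day performance)",
        "training cost per verified-ready graduate",
        "equity review flags, if any",
    ],
}

for audience, metrics in TIERED_REPORT.items():
    print(audience)
    for metric in metrics:
        print(f"  - {metric}")
```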

Full system integration

Executive communication

Risk mitigation

Data governance

Professional reflection

CLEAR framework

Adaptability planning

What this work shows

These projects move from theory to practice across the full assessment design lifecycle. Each competency is visible in the work itself — not just claimed.

Assessment validity & alignment

Designing assessments that measure what jobs actually require, not what's easy to score

Technology integration

GoReact deployment at scale, with a clear rationale for AI-assisted pre-sorting vs. human evaluation

Equity & inclusion in design

UDL principles, bias analysis, and equity review built into the system — not added on

System-level thinking

Treating formative, summative, and continuous improvement as a connected architecture

Scalability planning

Pressure-testing resource-intensive designs with real math before committing to implementation

Data-driven improvement

Defining success metrics that distinguish genuine improvement from activity

Process improvement frameworks

Applying DMADV and PDCA appropriately — and knowing when each is the right tool

Stakeholder communication

Tiered reporting designed around what each audience actually needs to make decisions

Rubric design

Translating supervisory expertise into behaviorally anchored criteria — avoiding style bias and centering decision-making over polish
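
As a closing illustration of that last point, the sketch below contrasts a generic criterion with a behaviorally anchored version of the same competency. The anchor language is invented for the example, not TechFlow's rubric; the point is that each level names an observable decision rather than a judgment about polish.

```python
# Illustrative contrast between a generic criterion and a behaviorally
# anchored version of the same competency. The anchor language is invented
# for the example, not TechFlow's rubric; each level names an observable
# decision, not a judgment about polish or style.

GENERIC_CRITERION = "Communicates professionally with upset customers"

ANCHORED_CRITERION = {
    "criterion": "De-escalation decision-making",
    "anchors": {
        4: "Acknowledges the specific complaint, states what can be done now, "
           "and offers the correct escalation path unprompted",
        3: "Acknowledges the complaint and resolves it; escalation path "
           "offered only after the customer asks",
        2: "Follows the script but does not address the stated complaint; "
           "resolution requires supervisor intervention",
        1: "Responds to tone rather than content; the call ends unresolved "
           "or escalates avoidably",
    },
}
```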