
The OnTarget question analysis report is considerably different from what most other software systems offer. Question analysis is a method of statistically reviewing the items on a test, after it has been administered, to make sure every question meets minimum quality-control criteria for difficulty and quality.
There are three ways to dig deeper and analyze the questions:

1. By scope: all questions, or by quality (for example, poor questions only)

2. By Question Number, Question Difficulty, or Question Quality

3. In ascending or descending order

After deciding how to analyze, it is time to dig deeper into each question, whether as a team, in a PLC, or individually. On the left is a section for input while working through the question analysis, identifying what action was taken, or keeping notes for future reference.
Sample Question Analysis Report

This Question Analysis report provides three key metrics that indicate how well this particular test question performed. Here’s what each measurement means:
Question Difficulty (P-Value)
A value of 0.93 means 93% of students answered this question correctly (a quick sample calculation follows this list). In general:
- P-values range from 0.00 to 1.00
- 0.50 (50% correct) is considered ideal difficulty
- 0.30-0.70 is the optimum range for most classroom assessments
- A value of 0.93 indicates this question was too easy: almost everyone got it right
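For anyone who wants to reproduce this number, here is a minimal Python sketch; the scores below are hypothetical, not taken from this report. The P-value is simply the mean of a 0/1 correctness vector.

```python
import numpy as np

# Hypothetical scored responses for one item: 1 = correct, 0 = incorrect.
item_scores = np.array([1, 1, 1, 0, 1, 1, 1, 1, 1, 0, 1, 1, 1, 1, 1])

# The P-value is the proportion of students who answered correctly.
p_value = item_scores.mean()
print(f"P-value: {p_value:.2f}")  # 13 of 15 correct -> 0.87
```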
Question Difficulty (Rasch)
A value of -1.37 is another way to measure difficulty, using a different scale (the model behind it is sketched after this list):
- The scale typically runs from -4 to +4
- Negative numbers = easier questions
- Positive numbers = harder questions
- 0 = moderate difficulty
- The -1.37 confirms the question was quite easy
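For reference, the standard Rasch model behind this kind of scale gives the probability that a student of ability θ answers an item of difficulty b correctly (assuming the report uses the usual logit parameterization):

$$P(\text{correct} \mid \theta, b) = \frac{e^{\theta - b}}{1 + e^{\theta - b}}$$

When θ = b, the probability is exactly 0.50, which is why 0 marks moderate difficulty. For this item (b = -1.37), a student of average ability (θ = 0) would be expected to answer correctly about 80% of the time, consistent with the high P-value above.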
Question Quality (Point Biserial)
A value of 0.35 measures how well the question distinguishes between high- and low-performing students (a worked calculation follows this list):
- Values range from -1.00 to +1.00
- Higher positive values are better
- 0.30+ is generally considered “good”
- The 0.35 means students who scored well overall were more likely to get this question right, which is what you want
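As a concrete illustration, here is a small Python sketch with hypothetical scores (not data from this report). It computes the point biserial with the textbook formula and checks the result against scipy.stats.pointbiserialr:

```python
import numpy as np
from scipy import stats

# Hypothetical data: correctness on one item (0/1) and each
# student's total test score.
item = np.array([1, 0, 1, 1, 0, 1, 1, 0, 1, 1])
total = np.array([88, 54, 91, 75, 60, 82, 95, 49, 70, 85])

# Textbook formula: r_pb = (M1 - M0) / s * sqrt(p * q), where
#   M1, M0 = mean total score of students who got the item right/wrong
#   s      = population standard deviation of all total scores
#   p      = proportion who answered correctly, q = 1 - p
m1 = total[item == 1].mean()
m0 = total[item == 0].mean()
s = total.std()          # ddof=0 (population SD) matches this formula
p = item.mean()
r_manual = (m1 - m0) / s * np.sqrt(p * (1 - p))

# The point biserial is just the Pearson correlation between a
# 0/1 variable and a continuous one, so SciPy agrees.
r_scipy, _ = stats.pointbiserialr(item, total)
print(f"manual: {r_manual:.2f}  scipy: {r_scipy:.2f}")
```

A positive result means students with higher total scores tended to get the item right, which is the pattern a 0.35 reflects.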
What this means
This question worked well to identify student understanding (good quality), but it was too easy to provide meaningful assessment data. Consider making future similar questions slightly more challenging while maintaining the clear distinction between concepts that made this one high-quality.
Evidence of Validity

This is a validity review checklist used to systematically evaluate test questions to ensure they’re fair, accurate, and appropriately constructed. Here’s what each section helps you examine:
Standards Alignment
- Verifies the question actually measures what it’s supposed to measure
- Ensures it aligns with specific learning standards or objectives
- Confirms the question matches the intended grade level and content area
- State Standards: Does it match your curriculum requirements?
- Depth of Knowledge (DOK): Does the cognitive demand match your learning objective?
- DOK 1: Recall facts (identify, list)
- DOK 2: Apply skills/concepts (describe, compare)
- DOK 3: Strategic thinking (analyze, evaluate)
- DOK 4: Extended thinking (synthesize, create)
Example issue: Teaching students to “analyze” but asking them only to “identify” creates a mismatch.
Bias and Sensitivity
- Checks for cultural, gender, socioeconomic, or other biases
- Ensures the question doesn’t favor certain groups of students
- Reviews for potentially offensive or insensitive content
- Looks for assumptions about student backgrounds or experiences
This protects all students from unfair disadvantages:
- Cultural bias: Avoid references that favor certain backgrounds (e.g., assuming all students know about skiing)
- Stereotypes: Don’t reinforce harmful assumptions about groups
- Sensitive topics: Avoid emotionally charged content unrelated to learning objectives
Example issue: A math word problem about expensive vacations may disadvantage students from low-income families.
Language and Vocabulary
- Evaluates whether the reading level is appropriate
- Checks for unnecessarily complex or confusing language
- Ensures vocabulary doesn’t create barriers unrelated to the content being tested
- Reviews for clear, concise wording
This ensures language doesn’t become a barrier to demonstrating knowledge:
- Grade-appropriate vocabulary: Use words students at this level should know
- Clear, concise writing: Eliminate unnecessary complexity
- Consistent terminology: Don’t switch between “rectangle” and “quadrilateral” randomly
- Active voice: “Students conducted the experiment” vs. “The experiment was conducted by students”
Example issue: Using “precipitation” instead of “rain” in a 2nd-grade science question.
Structure and Context
- Examines the overall organization and flow of the question
- Checks if the context/scenario is realistic and relevant
- Ensures the question format supports the learning objective
- Reviews for logical progression of information
This ensures the question format supports learning assessment:
- Clear instructions: Students understand what they’re being asked to do
- Appropriate context: Scenarios are realistic and relevant
- No unintended clues: The structure doesn’t give away the answer
- Parallel structure: All answer choices follow the same format
Example issue: Making the correct answer noticeably longer than the distractors.
Answer Choices (for multiple choice questions)
- Evaluates whether distractors are plausible but clearly incorrect
- Checks that the correct answer is definitively right
- Ensures choices are similar in length and complexity
- Reviews for “all of the above” or “none of the above” issues
This ensures multiple-choice questions work effectively:
- Plausible distractors: Wrong answers should reflect common misconceptions
- Grade-level appropriate: Distractors based on content students should know
- Avoid “gotcha” answers: Don’t trick students who actually know the material
- One clearly correct answer: No ambiguity about which choice is right
Example issue: Having distractors like “purple elephant” that no student would reasonably choose.
Visuals
- Checks charts, graphs, images, or diagrams for clarity
- Ensures visuals support rather than confuse the question
- Reviews for appropriate size, resolution, and labeling
- Confirms visuals are accessible to all students
This ensures graphics support rather than hinder assessment:
- Purpose-driven: Charts/images directly support the question’s objective
- Complete information: Everything needed to answer is provided
- Clear and legible: Students can easily read and interpret visuals
- Accessible: Works for students with visual processing differences
Example issue: Including a blurry graph where students can’t read the numbers on the axes.
How to Use This Checklist
- Review each question systematically through all six lenses
- Mark “No” for any area that needs improvement
- Revise the question before using it with students
- Document your changes for future reference
- Build a bank of validated questions over time
This process might seem time-consuming initially, but it prevents the much larger problem of making instructional decisions based on invalid assessment data. Start with the most important assessments (unit tests, benchmarks) and gradually work through each question bank.
Here’s why this matters so much
Analyzing assessments for validity is critical for ensuring tests actually do what they’re supposed to do.
Validity ensures the assessment actually measures the intended learning objectives, not unrelated factors like reading ability when testing math concepts, or cultural knowledge when testing science skills. Without validity, one might think a student doesn’t understand fractions when they actually just struggled with complex vocabulary.