Behaviorally Anchored Rating Scale: The Interview Scoring Method That Actually Works
Most interview scorecards ask interviewers to rate candidates on a 1-5 scale with no explanation of what a 3 means versus a 4. The result is gut-feel scores dressed up as data. Behaviorally anchored rating scales fix that by replacing vague labels with specific descriptions of what good and bad actually look like.
The problem with standard interview scoring is not the number scale. It is that every interviewer brings a different mental model to the same label. One interviewer's "4 for communication" is another interviewer's "3" for the same answer. Without a shared definition, you are not comparing candidates. You are averaging personal preferences. This is one reason studies consistently show that unstructured interviews predict job performance only slightly better than chance.
A behaviorally anchored rating scale (BARS) solves this by describing observable behavior at each score level. The descriptions come from real examples: what your top performers actually did, not what someone imagines an ideal candidate should do. When two interviewers score the same candidate answer against the same behavioral anchor, their scores converge. That consistency is the whole point.
BARS originated in industrial-organizational psychology in the 1960s. Smith and Kendall (1963) developed the framework to address rating errors in performance appraisals. The same logic applies directly to interview evaluation: if you define what each score means in behavioral terms, interviewers rate what actually happened in the conversation rather than how they feel about the candidate. Research published in the Journal of Applied Psychology found structured interviews with behavioral anchors reduce between-interviewer variance by over 40%.
This guide covers how BARS works, how to build one from scratch, four ready-to-use competency examples, and the mistakes that make well-intentioned rating scales useless. If you already use an interview scorecard or run structured interviews, BARS is the next layer that makes your scoring defensible.
The Core Problem
Why traditional rating scales produce unreliable scores
Ask three interviewers to score the same candidate on a 1-5 scale for "communication." You will get three different numbers. Not because they watched different conversations, but because nobody told them what a 4 looks like. BARS gives every score point a behavioral definition grounded in real job performance.
Traditional scale (left): Interviewer A's "4 = Above Average" means something different to Interviewer B.
BARS (right): every interviewer sees exactly what a 4 looks like. No guessing.
The traditional 1-5 scale shown on the left is technically measuring something. The problem is that each interviewer maps their own experience onto those labels. Someone who has worked with a truly exceptional communicator sets their 5 bar much higher than someone who has not. The scores look quantitative but they are mostly just opinion.
The BARS version on the right describes actual behavior. A 5 is not "excellent communication." It is "tailored message to audience, used data and story, confirmed understanding." Any interviewer reading that anchor can decide whether what they heard in the interview matched it. The behavioral description does the normalization work that labels cannot.
The Framework
What BARS actually is and how it works
A behaviorally anchored rating scale has three components:
A defined competency
The skill or behavior you are evaluating: problem solving, stakeholder management, resilience, communication. One competency per scale. Not a mix.
A numeric scale
Typically 3-7 points. Five points is the most common. The numbers themselves do not matter; what matters is that each one has a behavioral anchor.
Behavioral anchors at each level
A short description of what observable behavior looks like at that score point. Written in past tense, specific, with no vague adjectives. Derived from real examples, not theory.
The key word is "behavioral." A behavioral anchor describes what someone did, not what they are. "Stayed calm under pressure" is not behavioral. "When the deployment failed at 11pm, identified the rollback path, communicated status to three stakeholders, and had the system back up within 90 minutes without escalating" is behavioral. The specificity is what makes it useful.
BARS is most often used in three hiring contexts: panel interview evaluation, phone screen scoring, and post-interview calibration discussions. When your interviewers each have a copy of the anchors, disagreements about a candidate's score become conversations about specific behaviors rather than personality impressions. That is a much more productive conversation. For the broader context, see our guide on interview training for hiring managers.
Building Your Own
How to build a BARS from scratch in 5 steps
Building BARS takes 4-6 hours total for a single role. The upfront investment pays off quickly. Once built, you reuse the same anchors for every hire. Here is the exact process.
Define the competency
Pick a specific skill or behavior relevant to the role. 'Communication' is fine. A vague label like 'professionalism' is not.
Collect critical incidents
Ask your best performers: what did they actually do when this competency mattered? Gather 10-15 real stories.
Sort into performance levels
Group the stories from worst to best. 3-7 points on the scale. 5 is usually enough.
Write behavioral anchors
Describe the observable behavior at each level. Past tense, specific, zero adjectives like 'excellent.'
Calibrate with interviewers
Run the same test answer through three interviewers. If scores diverge by more than 1 point, refine the anchors.
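The calibration check in the final step can be expressed as a quick script. This is a minimal sketch, not part of any standard BARS tooling: it assumes you collect each interviewer's score for the same test answer in a list, and it applies the 1-point divergence threshold described above.

```python
def needs_refinement(scores, max_divergence=1):
    """Return True if interviewer scores for the same test answer
    diverge by more than the allowed number of points."""
    return max(scores) - min(scores) > max_divergence

# Three interviewers score the same recorded answer against the anchors.
print(needs_refinement([4, 4, 3]))  # spread of 1 point: anchors are working
print(needs_refinement([5, 3, 2]))  # spread of 3 points: refine the anchors
```

Run this after every calibration round; a persistent spread above 1 point usually means an anchor is ambiguous, not that an interviewer is wrong.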
What makes good behavioral anchors
The single most important rule: write anchors in past tense, in specific behavioral terms. "Communicated clearly" is not an anchor. "Sent a written summary to all stakeholders within 24 hours of the decision, with context, rationale, and next steps" is an anchor. The test is simple: could two people read the anchor and agree on whether a specific candidate response met it or not?
Good anchors come from the critical incident technique. Ask your best performers to describe situations where their behavior in this competency area directly affected the outcome: one where they performed well and one where they could have done better. Strip out the context until you have just the behavior. Group those behaviors into levels. That is your scale.
The competencies you build anchors for should map directly to the role's actual demands. A customer success manager role might prioritize stakeholder communication, conflict resolution, and handling ambiguity. A software engineering role might prioritize debugging methodology, technical communication, and handling scope changes. Match the competencies to the job, not to a generic HR template. For the underlying job analysis, our hiring plan guide covers competency mapping in more detail.
Ready-to-Use Examples
4 BARS examples you can adapt today
These four competencies cover most professional roles. Use them as a starting point and refine the anchors based on your specific job requirements and what your best performers actually look like.
Problem Solving
5: Identified root cause through data, proposed two solutions with trade-offs, gained buy-in
4: Broke problem into parts, found a workable solution, checked with stakeholders
3: Solved the surface issue; root cause analysis incomplete
2: Applied a previous solution without assessing fit
1: Escalated without attempting any analysis
Stakeholder Management
5: Mapped stakeholders proactively, tailored updates by role, anticipated concerns before they arose
4: Regular communication, responded to concerns promptly, kept key people informed
3: Informed main stakeholders but some were surprised by decisions
2: Communicated only when required; reactive, not proactive
1: Key stakeholders unaware of progress or blockers
Handling Ambiguity
5: Set own success criteria, made progress without full information, aligned team on direction
4: Moved forward on partial data, flagged assumptions, course-corrected when new info arrived
3: Completed task but waited longer than necessary for clarity
2: Requested detailed instructions before starting; minimal independent judgment
1: Stalled on unclear tasks; needed full specification before acting
Conflict Resolution
5: Addressed conflict early, found shared interest, restored working relationship afterward
4: Raised disagreement directly, listened to other side, reached workable agreement
3: Resolved conflict but delayed the conversation longer than necessary
2: Avoided direct conversation; involved manager prematurely
1: Let conflict escalate or left it unresolved
Notice that each anchor uses past tense and describes what the person did, not what they are. A 5 on Problem Solving is not "exceptional analytical thinker." It is "identified root cause through data, proposed two solutions with trade-offs, gained buy-in." The first is an adjective. The second is a behavior you can match against a candidate's answer.
When using these examples, the single most important customization is the 5-anchor. What does elite performance look like for your best current employee in this competency? Start there and work backward. The bottom anchor (1) is easier to write: it is the behavior that would have led to a clear failure in a real situation on the job. The middle anchors (2-4) describe the gradient between those extremes.
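If you keep these scales in a shared tool or scorecard template, one competency is just a mapping from score to anchor text. A minimal sketch, assuming the Problem Solving wording from the examples above; the function name is illustrative, not from any particular ATS:

```python
# One BARS competency stored as a mapping from score to behavioral anchor.
# Anchor wording is taken from the Problem Solving example above.
problem_solving = {
    5: "Identified root cause through data, proposed two solutions with trade-offs, gained buy-in",
    4: "Broke problem into parts, found a workable solution, checked with stakeholders",
    3: "Solved the surface issue; root cause analysis incomplete",
    2: "Applied a previous solution without assessing fit",
    1: "Escalated without attempting any analysis",
}

def anchor_for(scale, score):
    """Look up the behavioral anchor an interviewer should match against."""
    return scale[score]

print(anchor_for(problem_solving, 3))
```

Keeping the anchors in a structure like this makes it trivial to render them next to each rating box, which is what keeps interviewers reading the standard instead of guessing.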
Implementation
How to integrate BARS into your interview process
The mechanics are simple. Before the interview, each interviewer receives a copy of the BARS for the competencies they are evaluating. Not all interviewers need to evaluate all competencies. In a panel format, it is more effective to assign each interviewer two or three competencies and let them focus. This reduces the cognitive load and produces deeper evaluation per competency.
During the interview, the interviewer asks a behavioral question designed to surface evidence for their assigned competency. "Tell me about a time you had to communicate a complicated decision to a non-technical audience" produces the behavioral data needed to score against the Communication BARS. After the interview, the interviewer scores immediately while the conversation is fresh, matching what they heard against the anchor descriptions.
The calibration discussion is where BARS earns its value. Instead of "I thought she was great," interviewers bring their scores with behavioral evidence. "I gave her a 4 on stakeholder management because she described sending written summaries proactively, but she did not mention checking for understanding or anticipating concerns, which would be a 5." That is a real conversation about performance, not a personality contest.
Google's re:Work research on structured interviewing found that work sample tests and structured interviews with clear criteria were the two most predictive hiring methods. BARS is the mechanism that makes structured interviews actually structured. Without it, "structured" just means you asked the same questions without comparing answers on common terms.
One practical note: keep the anchors visible during the debrief. Interviewers who score without referring back to the anchors drift toward their original impression of the candidate. The value of BARS comes from anchoring decisions to the written standard repeatedly, not just at the moment of scoring. When you use an applicant tracking system that supports structured evaluation, you can embed the anchors directly into the scorecard so interviewers reference them at the point of scoring.
What Goes Wrong
5 mistakes that make BARS useless
BARS done poorly is worse than no scale at all. It gives the appearance of rigor while still producing garbage data. These are the mistakes I see most often.
Using adjectives as anchors
Replace 'excellent communicator' with what that actually looks like: 'explained technical concept without jargon, confirmed audience understanding before moving on.'
Building BARS without top performers
The anchors come from real stories. Build them with 3-5 people who have already done this job well, not just the hiring manager's theory of what good looks like.
Too many scale points
Seven points sounds precise. It is not. Interviewers cannot reliably distinguish a 4 from a 5 from a 6. Five points with clear anchors beats seven with vague ones.
Building BARS once, using forever
The job changes. The team changes. Review anchors at least annually and after any significant performance issue that the scale failed to predict.
Skipping calibration sessions
Even perfect anchors drift in interpretation over time. Run a calibration session each quarter where interviewers score the same sample response and compare.
The most damaging mistake is building BARS in an HR meeting without input from people who have done the job. HR professionals know process. They do not know what a 5-out-of-5 problem solver looks like on the operations floor or in a client call. That knowledge lives with your best performers. A BARS built without them is a theory of good performance, not a description of it.
Context
Where BARS fits alongside other evaluation methods
BARS is one component of a complete interview evaluation framework, not a standalone solution. The strongest hiring processes combine multiple methods, each designed to test something the others cannot.
BARS works well for:
- Behavioral competencies that show up in conversation
- Panel interviews where multiple raters need to align
- Senior roles where judgment and soft skills drive performance
- Reducing inconsistency across a large or distributed hiring team
BARS works less well for:
- Pure technical skills where a code test or case study is more direct
- Roles with very small sample sizes (built once, never reused)
- Competencies that are hard to surface in a single interview conversation
- Hiring teams that will not commit to calibration sessions
My view is that BARS should be table stakes for any organization hiring more than 10 people a year. At that volume, the inconsistency between interviewers becomes a real problem, not just for fairness, but for your ability to learn which candidates succeed and which do not. Without consistent scoring, your hiring data is noise.
BARS also provides legal defensibility that vague scorecards do not. If a rejected candidate challenges a hiring decision, "we compared their responses against written behavioral criteria developed from job analysis" is a much stronger position than "our interviewers felt they were not the right fit." The EEOC's Uniform Guidelines on Employee Selection Procedures explicitly favor selection criteria tied to job-relevant behaviors over subjective judgment. For teams working on bias reduction alongside this, our guide on reducing unconscious bias in hiring covers the broader framework.
Getting Started
How to introduce BARS to your team without resistance
The biggest obstacle to BARS adoption is not skepticism about whether it works. It is the time investment to build it. Most hiring managers already feel like the interview process takes too long. Asking them to spend half a day developing rating scales before they can screen anyone feels like adding process, not removing it.
The honest answer is that the upfront time is real. Four to six hours to build BARS for a single role is not trivial. But the math works out quickly: if you hire for the same role twice a year, you save far more time than you spent by shortening debrief disagreements, preventing halo-driven advancement decisions, and making faster calibration calls. A hiring team that argues for 45 minutes about whether a candidate's communication was "strong enough" every time it comes up is burning far more time than a 5-hour BARS build.
Start with one role. Pick your most common hire, the one you do three or more times a year. Build BARS for that role first, run it through one hiring cycle, then measure whether your inter-rater reliability (how often two interviewers score the same answer within 1 point) improved. That data is far more persuasive to skeptical hiring managers than any argument.
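The inter-rater reliability measure described above takes only a few lines to compute. A minimal sketch, assuming you record each pair of interviewer scores for the same candidate answer; the function name is illustrative:

```python
def within_one_point_rate(score_pairs):
    """Fraction of answers where two interviewers scored within 1 point
    of each other -- the agreement metric described above."""
    agreements = sum(1 for a, b in score_pairs if abs(a - b) <= 1)
    return agreements / len(score_pairs)

# (interviewer_1, interviewer_2) scores for the same answers, one hiring cycle.
pairs = [(4, 4), (3, 5), (5, 4), (2, 2), (4, 2)]
print(f"{within_one_point_rate(pairs):.0%} of answers scored within 1 point")
```

Track this number before and after introducing BARS; a rising within-1-point rate is the concrete evidence that wins over skeptical hiring managers.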
For teams using structured interviewing already, adding BARS to your existing interview scorecard is a straightforward upgrade. Replace the open-ended rating boxes with anchor descriptions. Run one calibration session to align on interpretation. The process change is small; the improvement in scoring consistency is significant.
Frequently Asked Questions
What is a behaviorally anchored rating scale (BARS)?
A behaviorally anchored rating scale is an interview evaluation tool that replaces vague labels like 'excellent' or 'meets expectations' with specific, observable behavior descriptions at each score level. Instead of rating someone a 4 out of 5 for communication, you rate them against written descriptions of what a 4, 3, 2, or 1 actually looks like, based on real examples from people who have done that job well.
How is BARS different from a standard interview scorecard?
A standard scorecard lists competencies and asks interviewers to rate them. BARS does the same thing but gives each score point a behavioral definition. A scorecard without BARS asks 'how good is their communication?' BARS asks 'does their example match the description for a 4 or a 5?' The second question is far less subjective and produces much more consistent scores across interviewers.
How many competencies should I include in a BARS?
Four to six competencies per role is the practical ceiling. Beyond that, interviewers spend so much time scoring that they stop listening. Pick the competencies that actually predict performance in that specific role, not a generic list of 10 virtues. A senior sales role might need just four: stakeholder management, resilience, discovery skills, and handling objections.
Can you use BARS for all roles?
BARS works best for roles where you can describe observable behavior clearly: management, sales, customer success, operations, and most professional roles. It is harder to use for highly technical roles where performance looks like code quality or system design rather than interpersonal behavior. For those roles, structured technical assessments and work samples tend to produce better signal than behavioral anchors.
How long does it take to build a BARS from scratch?
A well-built BARS for one role takes 4-6 hours total: about an hour gathering critical incidents from high performers, two hours drafting and grouping anchors, and another hour or two calibrating with your interview panel. The output is reusable for every hire in that role, which makes the upfront time worthwhile if you hire for that position more than twice a year.
Does using BARS eliminate interviewer bias?
It reduces bias significantly but does not eliminate it. BARS removes the ambiguity that lets bias sneak in through vague labels. Two interviewers arguing about whether a candidate is a '4 or 5' on communication are often expressing personal preferences, not performance data. Anchoring both of them to the same behavioral description cuts that variation. But confirmation bias, similar-to-me bias, and halo effects still require structured interviewer training to address fully.
Resources & Further Reading
Related Guides
- Interview Scorecard: How to Build One That Predicts Performance
Build the scorecard that BARS anchors plug into
- Structured Interviews: A Complete Guide for Hiring Managers
The broader framework that makes BARS most effective
- 60 Behavioral Interview Questions
Questions designed to surface behavioral evidence for BARS scoring
- Interview Training for Hiring Managers
Train your team to use BARS effectively
External Sources
- Google re:Work: Structured Interviewing
Research on what actually makes interviews predictive
- SHRM: Talent Acquisition Resources
HR industry research on structured evaluation methods
- EEOC: Uniform Guidelines on Employee Selection
Legal framework for defensible hiring criteria
- HBR: How to Take the Bias Out of Interviews
Research on structured evaluation and bias reduction
Structured scoring built into your hiring workflow
Prepzo's interview module includes structured scorecards with behavioral anchors, so your team scores candidates consistently, every time. Free to start.
Try Prepzo free