Re-labelling 78,000 mis-annotated dashcam frames for a Sydney Pyrmont AV startup — AB7 vision pod, 11-day turn
The head of perception at an 18-person autonomous-vehicle startup in Pyrmont, two blocks from the Sydney Fish Market, had a training set that was quietly poisoning the model. A prior offshore vendor had labelled 78,000 dashcam frames, but a spot-audit found a 14% error rate on pedestrian and cyclist bounding boxes. The team’s mAP had failed to improve for three sprints in a row. AB7’s Mohali vision pod, working on IST (4.5 hours behind AEST), re-labelled all 78,000 frames in 11 calendar days at $0.19 per frame, $14,820 in total, and the corrected set moved cyclist recall up 9 points. This is what that job actually involved.
If you have a vision dataset that needs auditing or re-labelling, this is one of the ten categories on the AB7 AI & Robotics Services hub. The rest of this post is the work behind the number.
A bad label is worse than a missing label
A missing annotation costs you one training example. A wrong annotation actively teaches the model the wrong thing. The Pyrmont team’s problem was the second kind. Boxes were drawn loosely around cyclists, occluded pedestrians were skipped, and “person on scooter” was inconsistently tagged as pedestrian on some frames and cyclist on others. The model learned the noise.
The first job was not labelling. It was measuring how bad the existing set was. AB7’s pod sampled 2,000 frames at random and scored every box against a written rubric. The audit confirmed 14% box-level error, concentrated in three classes: cyclist, occluded pedestrian, and scooter-rider. Those are exactly the classes a Sydney AV stack cannot get wrong.
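A minimal sketch of how an audit like this could be tallied, assuming each reviewed box is recorded as a (class, correct-or-not) pair. The function names, the record format, and the toy data are illustrative assumptions, not AB7’s actual tooling.

```python
import random
from collections import defaultdict


def sample_frames(frame_ids, k, seed=0):
    """Draw k distinct frame ids uniformly at random (seeded so the audit is repeatable)."""
    rng = random.Random(seed)
    return rng.sample(list(frame_ids), k)


def error_rates(scored_boxes):
    """Return (overall_error, per_class_error) from (class_name, is_correct) pairs."""
    totals = defaultdict(int)
    wrong = defaultdict(int)
    for cls, is_correct in scored_boxes:
        totals[cls] += 1
        if not is_correct:
            wrong[cls] += 1
    if not totals:
        raise ValueError("no boxes were scored")
    per_class = {cls: wrong[cls] / totals[cls] for cls in totals}
    overall = sum(wrong.values()) / sum(totals.values())
    return overall, per_class


# Toy data only: four scored boxes, not the real audit results.
audit_sample = sample_frames(range(78_000), 2_000)
overall, per_class = error_rates([
    ("cyclist", False),
    ("cyclist", True),
    ("car", True),
    ("car", True),
])
print(len(audit_sample), overall, per_class)
```

Reporting the per-class rates alongside the overall rate is what shows whether the error is concentrated in a few classes or spread evenly across the set.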
The rubric came before the first box
Most annotation goes wrong because the spec is verbal. We wrote a 9-page labelling guide with the customer’s perception lead first. It pinned the hard cases: a cyclist dismounted and walking the bike is a pedestrian; a person more than 60% occluded by a parked car still gets a box with the occluded attribute; a delivery e-scooter is its own class, not a cyclist.
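The three pinned cases above can be written down as an executable rule, which is one way to keep a written guide and a QA check in step. This is a hedged sketch: the class names, the `Observation` fields, and the choice to apply the occluded attribute only above the 60% mark are assumptions made for illustration, not the contents of the actual 9-page guide.

```python
from dataclasses import dataclass

# Assumption: the attribute is applied only above this fraction.
OCCLUSION_THRESHOLD = 0.60


@dataclass
class Observation:
    kind: str                      # "person", "person_with_bicycle", or "delivery_escooter"
    riding: bool = False           # True if the person is mounted and riding
    occluded_fraction: float = 0.0 # share of the object hidden, from 0.0 to 1.0


def resolve_label(obs):
    """Map an observation to (class_name, attributes) under the hypothetical rules above."""
    if obs.kind == "delivery_escooter":
        cls = "delivery_escooter"   # its own class, never a cyclist
    elif obs.kind == "person_with_bicycle":
        # A dismounted rider walking the bike counts as a pedestrian.
        cls = "cyclist" if obs.riding else "pedestrian"
    else:
        cls = "pedestrian"
    attributes = []
    if obs.occluded_fraction > OCCLUSION_THRESHOLD:
        attributes.append("occluded")  # heavily occluded objects still get a box
    return cls, attributes


print(resolve_label(Observation("person_with_bicycle", riding=False)))
print(resolve_label(Observation("person", occluded_fraction=0.7)))
print(resolve_label(Observation("delivery_escooter", riding=True)))
```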
Every edge case got a reference frame. The guide is the asset that makes a re-label repeatable, and the customer kept it for their next vendor or in-house team.
Who did the work
The pod was 6 trained annotators plus 1 QA reviewer, all in AB7’s Mohali Phase 8B vision cell. They labelled in CVAT, the open-source annotation tool, with the customer’s class schema imported directly. Two annotators were assigned only to the three error-prone classes so judgement stayed consistent across the set.
The 4.5-hour IST-to-AEST gap worked in the pod’s favour. The Sydney lead reviewed flagged frames at 09:00 his time, which was 04:30 in Mohali, so the pod’s questions from the prior shift were answered before the Mohali workday even started. No blocking question waited on a 24-hour round trip.
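The clock arithmetic can be checked with fixed UTC offsets. This is a small sketch: the calendar date is an arbitrary placeholder, and fixed offsets deliberately ignore daylight saving, so during AEDT the gap would be 5.5 hours rather than 4.5.

```python
from datetime import datetime, timedelta, timezone

# Fixed offsets: AEST is UTC+10:00, IST is UTC+05:30.
AEST = timezone(timedelta(hours=10), "AEST")
IST = timezone(timedelta(hours=5, minutes=30), "IST")


def sydney_to_mohali(dt_sydney):
    """Convert an AEST-aware datetime to the same instant in IST."""
    return dt_sydney.astimezone(IST)


# Placeholder date; only the 09:00 review time comes from the text.
review_sydney = datetime(2024, 7, 1, 9, 0, tzinfo=AEST)
review_mohali = sydney_to_mohali(review_sydney)
gap = AEST.utcoffset(None) - IST.utcoffset(None)

print(review_mohali.strftime("%H:%M"))  # 04:30
print(gap)                              # 4:30:00
```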
The numbers a perception lead actually budgets against
Here is the cost comparison the Pyrmont head of perception ran. Scale AI’s managed video annotation quoted roughly $0.41 per frame for this class mix, around $32,000 for the full set. A US-based boutique came in at $0.55 per frame. MTurk was cheaper per frame but failed the pilot, showing the same inconsistency that created the original mess.
AB7’s number was $0.19 per frame, $14,820 all-in, including the audit and the rubric build. That is 54% under the Scale quote. The 11-day turn mattered as much as the price, because the next training run was already scheduled.
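The figures above can be reproduced in a few lines. This sketch uses only the per-frame rates and the frame count quoted in the text. Rates are held in integer cents to avoid floating-point drift, and the vendor keys are just labels.

```python
FRAMES = 78_000

# Per-frame quotes in US cents, as stated in the text.
quotes_cents = {"AB7": 19, "Scale AI": 41, "US boutique": 55}

# Total cost in dollars for the full 78,000-frame set.
totals = {vendor: cents * FRAMES / 100 for vendor, cents in quotes_cents.items()}

# Fractional saving of the AB7 rate relative to the Scale AI quote.
saving_vs_scale = 1 - quotes_cents["AB7"] / quotes_cents["Scale AI"]

for vendor, total in totals.items():
    print(f"{vendor}: ${total:,.2f}")
print(f"AB7 vs Scale AI: {saving_vs_scale:.0%} lower")
```

The Scale AI total works out to $31,980, which the text rounds to about $32,000, and the saving is 53.7%, which rounds to the quoted 54%.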
What changed after the re-label
The corrected set went into the next fine-tune unchanged. Cyclist recall moved up 9 points and the false-positive rate on parked-car occlusions dropped by roughly a third. The perception lead’s own words, paraphrased with consent: the model finally started learning the hard classes instead of memorising the vendor’s mistakes.
The re-label is now a quarterly line item, not a one-off rescue. AB7 holds the rubric, so each new batch ships against the same written standard.
When this kind of engagement makes sense
Re-labelling pays for itself when your error rate is structural, not random — when a whole class is mis-specified rather than a few boxes being sloppy. If your mAP has plateaued and a spot-audit shows double-digit error on your safety-critical classes, the dataset is the bottleneck, not the architecture. Start with the audit. The 2,000-frame sample tells you whether you have a $15,000 re-label or a $2,000 touch-up.
Talk to the vision pod that did this
If you have a dashcam, LiDAR, or sensor-fusion set that needs auditing or re-labelling, AB7 will run a paid 2,000-frame audit first and give you a fixed per-frame number before you commit to the full set. See the ten categories on the AB7 AI & Robotics Services hub, or go straight to a conversation. Call +1 (321) 341-7733, email director@ab7solutions.com, or book a 30-minute call with Ashok.
Written by
AB7 Solutions Editorial Team
Content & Research Division
The AB7 Solutions editorial team combines expertise across healthcare operations, IT staffing, cybersecurity, and workforce management to deliver actionable insights for business leaders.