Carrier Risk Scoring, Rebuilt: From Rules to Reasoning | What's New

For the past year we've watched the broker-fraud market reshape itself. The classic chameleon-carrier playbook (set up an LLC, get an MC, run it into the ground, dissolve, repeat) has been displaced by something subtler: ring operators buying aged MC numbers with clean histories, swapping out contact info, and onboarding to brokers under an identity the carrier graph still considers legitimate. Highway's Q1 2026 Freight Fraud Index found roughly half of theft incidents now involve carriers with previously clean operating histories.

That shift broke the assumption a lot of risk-scoring systems were built on. We rebuilt ours.

What we changed

AlphaLoops Risk Signals now combines six independent evidence streams for each carrier:

Authority for sale: listings detected across broker-facing marketplaces, scored for confidence and price/age plausibility.
Equipment for sale: fleet VIN matches against equipment-marketplace listings.
Entity relationships (Neo4j graph): shared officers, addresses, phones, emails, equipment VINs, and surnames across DOTs.
News and media: Perigon-sourced news mentions, sentiment-scored.
LLM-pre-scored fraud / financial signals: long-running classifications from our research pipeline.
NEW: Historical authority, insurance, and change events: deterministic, structural signals computed directly from FMCSA's insurance, insurance_history, carrier_auth_hist, and carrier_change_events tables.

The sixth source is the biggest addition this quarter and the one this post is about.

What's in the historical bucket

Four FMCSA tables, four different things they tell us:

insurance: FMCSA's live coverage snapshot. One row per active policy. This is the authoritative answer to "is this carrier currently insured."
insurance_history: the audit log of cancellations, replacements, and policy lifecycle events. Not a live view. A carrier with continuous coverage simply has no recent rows here.
carrier_auth_hist: authority grants, revocations, and dispositions.
carrier_change_events: FMCSA census diffs (officer changes, address changes, fleet size changes, etc.).

For each carrier we compute a structured JSON of derived signals: BIPD lapse gaps per insurance branch, cancel-vs-replace ratios, distinct insurers in the last five years, involuntary revocation counts at multiple time horizons (36-month and 6-year, per FMCSA's codified scrutiny window in 77 FR 45728), short/moderate/long-gap revocation buckets, officer/address/phone/email change counts in 90-day and 12-month windows, fleet size swings, and combination flags that fire when chameleon-like activity clusters in tight windows.
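As a rough illustration of one of these derived signals, the sketch below computes coverage lapse gaps from a list of policy intervals. The Policy record, its field names, and the lapse_gaps helper are hypothetical stand-ins for illustration; they are not the FMCSA schema or our production code. A caller would run it once per insurance branch.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Policy:
    """Hypothetical policy interval; field names are illustrative only."""
    effective: date
    cancelled: Optional[date]  # None means the policy is still active


def lapse_gaps(policies: List[Policy], as_of: date) -> List[Tuple[date, date, int]]:
    """Return (gap_start, gap_end, days) for each stretch with no coverage."""
    gaps = []
    covered_until = None  # latest coverage end date seen so far
    for p in sorted(policies, key=lambda p: p.effective):
        # A gap exists when the next policy starts after all earlier ones ended.
        if covered_until is not None and p.effective > covered_until:
            gaps.append((covered_until, p.effective, (p.effective - covered_until).days))
        end = p.cancelled or as_of  # active policies cover through as_of
        if covered_until is None or end > covered_until:
            covered_until = end
    return gaps
```

Overlapping policies are merged implicitly, so a replacement filed before the old policy ends produces no gap.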

The combinations matter, not the individual events

This is where the GAO ARCHI literature finally became useful operational guidance for us. The 2013 Report to Congress published the exact formula FMCSA uses internally:

score = (name_match × officer_match)
      + 2 × EIN_match
      + 2 × SSN_match
      + 2 × D&B_match
      + 1 × phone_match
      + 0.5 × address_match

Threshold for flagging: 1.5. A single attribute match alone (officer, phone, address) is below threshold. That's intentional. GAO found 251,337 raw officer-name pair matches in MCMIS and 10,032,429 address pair matches. Single overlaps are mostly false positives.

What crosses the threshold is the combination: officer + address + phone within a tight time window. That's the pattern with documented 2.2-2.6× lift in correctly identifying carriers with adverse outcomes.
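To make the arithmetic concrete, here is a minimal sketch of the scoring formula above. The PairMatch record and function names are our own illustrative choices, and treating a score exactly equal to 1.5 as flagged is an assumption; the formula as quoted does not say whether the threshold is inclusive.

```python
from dataclasses import dataclass

FLAG_THRESHOLD = 1.5


@dataclass(frozen=True)
class PairMatch:
    """Which attributes match between two carriers (illustrative field names)."""
    name: bool = False
    officer: bool = False
    ein: bool = False
    ssn: bool = False
    dnb: bool = False
    phone: bool = False
    address: bool = False


def archi_score(m: PairMatch) -> float:
    """Weighted match score, term by term as in the formula above."""
    return (
        (int(m.name) * int(m.officer))
        + 2 * int(m.ein)
        + 2 * int(m.ssn)
        + 2 * int(m.dnb)
        + 1 * int(m.phone)
        + 0.5 * int(m.address)
    )


def is_flagged(m: PairMatch) -> bool:
    # Assumption: a score equal to the threshold counts as flagged.
    return archi_score(m) >= FLAG_THRESHOLD
```

A phone match alone scores 1.0 and an address match alone scores 0.5, so neither flags. A pair matching on name, officer, phone, and address scores 2.5 and flags under either reading of the threshold.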

We mirrored this for temporal patterns on a single DOT. When a carrier's officer changes, then its physical address changes, then its phone changes, all within 90 days, that combination triggers. Single field changes don't. Mailing-address-only changes don't. MCS-150 biennial refiles don't.
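A simplified sketch of that single-DOT check follows. The event shape and the field labels ("officer", "physical_address", "phone") are hypothetical, and allowing same-day changes to count as "in order" is an assumption rather than a description of the production logic.

```python
from datetime import date, timedelta
from typing import List, Tuple

ChangeEvent = Tuple[date, str]  # (event_date, field); labels are hypothetical

SEQUENCE = ("officer", "physical_address", "phone")


def officer_address_phone_combo(events: List[ChangeEvent], window_days: int = 90) -> bool:
    """True if officer, physical-address, and phone changes occur in that order within the window."""
    window = timedelta(days=window_days)
    dates = {f: sorted(d for d, name in events if name == f) for f in SEQUENCE}
    for start in dates["officer"]:
        current = start
        for field in SEQUENCE[1:]:
            # Earliest change to this field on or after the previous step.
            nxt = next((d for d in dates[field] if d >= current), None)
            if nxt is None:
                break
            current = nxt
        else:
            # All three steps were found; check the span from first to last.
            if current - start <= window:
                return True
    return False
```

Because only the three labels in SEQUENCE are considered, a mailing-address change or an MCS-150 refile never contributes to the combo in this sketch.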

We also added a separate "Sold-MC signature" detector for the post-2023 fraud vector, which ARCHI predates and was never designed to catch. It fires when contact info changes alongside a new insurance carrier, without an officer change. That's the modern playbook: ring operators stay on file as the officer of record while transferring operations.
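The detector's core condition can be sketched roughly as below. Which fields count as "contact info", the event shape, and the function name are all assumptions made for illustration; the 30-day default mirrors the example window used later in this post.

```python
from datetime import date, timedelta
from typing import List, Tuple

ChangeEvent = Tuple[date, str]  # (event_date, field); labels are hypothetical

# Assumption: these are the fields treated as "contact info".
CONTACT_FIELDS = {"phone", "email", "physical_address"}


def sold_mc_signature(
    changes: List[ChangeEvent],
    new_insurer_dates: List[date],
    window_days: int = 30,
) -> bool:
    """True if a contact change lands near a new-insurer event with no officer change nearby."""
    window = timedelta(days=window_days)
    for insurer_date in new_insurer_dates:
        # Fields that changed within the window on either side of the insurer switch.
        nearby = [f for d, f in changes if abs(d - insurer_date) <= window]
        has_contact_change = any(f in CONTACT_FIELDS for f in nearby)
        has_officer_change = "officer" in nearby
        if has_contact_change and not has_officer_change:
            return True
    return False
```

Requiring the absence of an officer change is what separates this from the officer + address + phone combo: the two checks are meant to catch different playbooks.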

What we deliberately don't flag

The research is as useful for what it tells you to ignore as for what it tells you to weigh.

A single revocation discontinued within two weeks. FMCSA L&I procedural docs describe this as the dominant pattern of insurer BMC-91X clerical errors: an agent files a cancellation, the carrier produces a replacement policy within days, the revocation is discontinued. One of these in a carrier's history carries near-zero predictive signal.
Standalone MCS-150 filing-date updates. Per the National Academies' 2017 review, this is administrative compliance with no safety nexus.
Voluntary revocations. FMCSA explicitly excludes these from monitoring.
BOC-3 process-agent lapses during mass-purge events. Overdrive documented a 2023 incident where ~1,700 carriers were revoked the same week because their shared blanket-agent service shut down. Those are agent-driven, not carrier-driven.
Driver Fitness and Controlled Substances BASIC percentiles as crash-risk predictors. ATRI's 2014 analysis of 471,306 carriers found inverse relationships: the higher the "bad" percentile, the lower the crash rate, likely due to inspection-selection bias.

But aggregate patterns can flip

This was the calibration insight that shaped our final prompt design. A single short-gap revocation is noise. But ten of them across five years isn't noise. It means the carrier's underwriter is consistently dropping coverage and the agent is consistently scrambling to replace it. The single event tells you nothing; the rate tells you the relationship between carrier and underwriter is broken.

We could have encoded this as rigid thresholds. We didn't. Hardcoded suppression loses the aggregate signal, and hardcoded thresholds force the wrong abstraction. Instead, the LLM receives the research framework (the empirical anchors from GAO ARCHI, NAS 2017, ATRI 2018, Highway Q1 2026, and the relevant Federal Register codifications) and is asked to reason about totality of evidence.

It can see one short-gap revocation in 2017 and treat it correctly as noise. It can see fifteen short-gap revocations spread across the last five years and recognize that as structural insurer instability. We don't need a rule for that distinction. We just need the reasoning surface to make the distinction visible.
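One way to picture the "reasoning surface" is as a summary that counts short-gap revocations per horizon and hands the counts over unsuppressed. The sketch below assumes a hypothetical (revoked_on, discontinued_on) tuple shape, reads "within two weeks" as 14 days or fewer, and approximates the 36-month and 6-year horizons in days; none of these details are confirmed production values.

```python
from datetime import date, timedelta
from typing import Dict, List, Optional, Tuple

Revocation = Tuple[date, Optional[date]]  # (revoked_on, discontinued_on or None)

SHORT_GAP_DAYS = 14  # assumption: "within two weeks" read as <= 14 days
HORIZONS = {"36_month": 3 * 365, "6_year": 6 * 365}  # approximate day counts


def short_gap_summary(revocations: List[Revocation], as_of: date) -> Dict[str, int]:
    """Count short-gap revocations overall and per horizon, without suppressing any."""
    short = [
        revoked
        for revoked, cured in revocations
        if cured is not None and (cured - revoked).days <= SHORT_GAP_DAYS
    ]
    summary = {"all_time": len(short)}
    for label, days in HORIZONS.items():
        cutoff = as_of - timedelta(days=days)
        summary[label] = sum(1 for d in short if d >= cutoff)
    return summary
```

The function deliberately returns counts rather than a verdict, so one event and fifteen events stay distinguishable to whatever reasons over the summary.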

What this looks like in practice

For a carrier like Old Dominion Freight Line (10,974 power units, 49 years of tenure, continuous BMC-82 coverage with Fidelity & Deposit), the historical bucket now correctly reads from the live insurance table and reports "Active." No more false "currently uninsured" flag from looking only at the cancellation history.

For a carrier with one 2-day revocation cure in 2022 and no other patterns, the historical bucket reports it in the short_gap_revocations counter but doesn't fire any combo. The LLM has the research context to call it what it is: clerical noise, not operational instability.

For a carrier where contact info changes alongside a new insurer within 30 days and without an officer change, the historical bucket fires the Sold-MC signature combo. Even if the carrier's history is otherwise clean, that pattern alone justifies elevated scrutiny, because that's the fraud vector that defeats the carrier graph.

What we still don't do

We don't maintain a virtual-address blocklist. The MOTUS PPOB enforcement initiative (active since January 2025) is reducing this pattern at the FMCSA registration gate, but commercial-mailbox addresses still appear in some authority filings.
We don't score safety rating transitions as standalone signals. The data is too sparse to be reliable, per NAS 2017 and the FMCSA SMS Federal Register notice.

What's next

The structured historical bucket is the foundation for cross-referencing temporal signals on a focal carrier against the static identity-overlap signals already in the carrier graph. The next iteration unifies them: when officer changes on Carrier A coincide with the appearance of those officers on a newly-granted Carrier B in the same corridor, we want a single combo to fire on both DOTs simultaneously. The pieces are in place; the join is the work.

We're also adding insurer-level aggregation. Some underwriters cancel coverage on far more carriers than others (Britto 2010 documented this), and the identity of the cancelling insurer is itself a signal. A CANCEL event from Progressive looks different from a CANCEL event from a small RRG that's been winding down its trucking book.
