Traditional carrier search is broken — not because the data is bad, but because it treats every data point as if it exists in isolation. Fleet size tells you nothing without knowing where a carrier runs, what they haul, and how they're equipped. Two carriers with identical truck counts can be completely different businesses serving completely different needs.
Today, we're releasing Carrier Embeddings, a foundational AI capability that changes how AlphaLoop understands, compares, and surfaces carriers. It's the engine powering our new CarrierMatch feature, and it's built on the same class of transformer architecture behind ChatGPT.
The Problem: Numbers Don't Tell the Whole Story
Consider two carriers, both with 3 trucks. Are they similar? To a traditional database query, yes. To anyone who works in transportation, obviously not.
| Carrier | Trucks | States served | Sleepers | Routes |
| --- | --- | --- | --- | --- |
| Carrier A: Local Delivery Co. | 3 | 2 | None | Day routes |
| Carrier B: Cross-Country Hauler | 3 | 15 | All | Long-haul |
Traditional search would call these identical. Any experienced broker knows they're completely different businesses. Our AI now knows that too.
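To make the gap concrete, here is a minimal Python sketch of the comparison above. The `Carrier` class, its field names, and both helper functions are hypothetical, chosen only to mirror the two example carriers; they are not AlphaLoop's actual schema or query logic.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Carrier:
    # Hypothetical fields mirroring the example above, not a real schema.
    name: str
    trucks: int
    states_served: int
    has_sleepers: bool
    route_type: str


carrier_a = Carrier("Local Delivery Co.", 3, 2, False, "day routes")
carrier_b = Carrier("Cross-Country Hauler", 3, 15, True, "long-haul")


def matches_on_fleet_size(a: Carrier, b: Carrier) -> bool:
    """Single-column filter: the kind of match a plain fleet-size query makes."""
    return a.trucks == b.trucks


def differing_fields(a: Carrier, b: Carrier) -> list[str]:
    """List the operational fields on which two carriers disagree."""
    fields = ("trucks", "states_served", "has_sleepers", "route_type")
    return [f for f in fields if getattr(a, f) != getattr(b, f)]


print(matches_on_fleet_size(carrier_a, carrier_b))  # True
print(differing_fields(carrier_a, carrier_b))
# ['states_served', 'has_sleepers', 'route_type']
```

The fleet-size filter reports a match even though the two records disagree on every other field, which is the blind spot described above.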
Our Solution: Teaching AI to See the Whole Picture
We built a transformer model trained on every registered carrier in the United States — 2.3 million carriers, 60+ data points each. The model doesn't look at data points one at a time. It understands how all the pieces fit together, the same way an experienced industry professional would.
Step 1: We Collect the Full Picture
For each carrier, we ingest over 60 distinct data points, including:
- Fleet size and equipment types
- States and regions served
- Safety scores and inspection history
- Cargo types and specializations
Step 2: Data Points Talk to Each Other
This is where the real intelligence happens. Each data point doesn't just sit in a column — it interacts with every other data point to understand what it actually means in context.
The model learns that "3 trucks + 2 states + no sleepers" describes a fundamentally different business than "3 trucks + 15 states + all sleepers" — even though the first number is identical. Context is everything.
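The idea that the same value takes on a different meaning in a different context can be illustrated with a toy version of self-attention. This is a sketch under loud assumptions: the token vectors are hand-made rather than learned, and the query, key, and value projections are the identity, whereas a trained transformer learns all of these.

```python
import math


def softmax(scores):
    """Turn raw scores into weights that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(tokens):
    """Scaled dot-product self-attention with identity Q/K/V projections.

    A trained transformer learns separate projection matrices; using the raw
    token vectors directly is a simplification for illustration.
    """
    dim = len(tokens[0])
    contextualized = []
    for query in tokens:
        scores = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
            for key in tokens
        ]
        weights = softmax(scores)
        mixed = [
            sum(w * value[j] for w, value in zip(weights, tokens))
            for j in range(dim)
        ]
        contextualized.append(mixed)
    return contextualized


# Hypothetical hand-made vectors, one per data point (not learned values).
TRUCKS_3 = [1.0, 0.0, 0.0, 0.0]
STATES_2 = [0.0, 1.0, 0.0, 0.0]
STATES_15 = [0.0, 0.0, 1.0, 0.0]
NO_SLEEPERS = [0.0, 0.5, 0.0, 1.0]
ALL_SLEEPERS = [0.0, 0.0, 0.5, -1.0]

local = self_attention([TRUCKS_3, STATES_2, NO_SLEEPERS])
long_haul = self_attention([TRUCKS_3, STATES_15, ALL_SLEEPERS])

# Position 0 is the "3 trucks" token in both carriers.
print(local[0])
print(long_haul[0])
```

Both carriers start with the identical "3 trucks" vector, but after one attention pass that position carries a blend of its neighbors, so the two outputs differ.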
Step 3: Create a Carrier Fingerprint
All of that contextual understanding gets compressed into a 128-dimensional embedding — a unique mathematical fingerprint for every carrier. Think of it as a carrier's operational DNA.
Two carriers with similar fingerprints operate similarly — even if their raw numbers look different on paper. This is what enables true similarity matching.
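As a rough illustration of what "similar fingerprints" means, the sketch below computes cosine similarity between toy vectors. The 4-dimensional vectors and their values are invented for readability; the real fingerprints are 128-dimensional and come from the model.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


# Made-up fingerprints: two local carriers and one long-haul carrier.
local_one = [0.9, 0.1, 0.0, 0.2]
local_two = [0.8, 0.2, 0.1, 0.1]
long_haul = [0.1, 0.9, 0.8, 0.0]

sim_locals = cosine_similarity(local_one, local_two)
sim_mixed = cosine_similarity(local_one, long_haul)

print(f"local vs local:     {sim_locals:.3f}")
print(f"local vs long-haul: {sim_mixed:.3f}")
```

Cosine similarity depends on the direction of a vector rather than its length, so carriers whose fingerprints point the same way score close to 1 and the local and long-haul pair scores much lower.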
Step 4: Find Your Match at Scale
When you search for a carrier or run a lookalike query, we compare its fingerprint against all 2.3 million carriers in our database using cosine similarity and vector indexing. The results:
- Are ranked by how similarly carriers actually operate, not just how similar their stats look
- Surface hidden similarities that even seasoned brokers might miss
- Return in under 100 milliseconds
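The ranking step can be sketched as a nearest-neighbor lookup. The version below is an exhaustive scan standing in for the vector index, which this post does not specify; the carrier IDs, the toy 4-dimensional vectors, and the `top_k_similar` helper are all hypothetical. At millions of carriers an index would replace the scan, but the ranking logic is the same.

```python
import heapq
import math


def normalize(vector):
    """Scale a vector to unit length so a dot product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector]


def top_k_similar(query, index, k=2):
    """Return the k (score, carrier_id) pairs most similar to the query.

    `index` maps carrier_id -> unit-length vector. This is a brute-force scan,
    used here only as a stand-in for a real vector index.
    """
    unit_query = normalize(query)
    scored = (
        (sum(a * b for a, b in zip(unit_query, vec)), carrier_id)
        for carrier_id, vec in index.items()
    )
    return heapq.nlargest(k, scored)


# Made-up fingerprints keyed by hypothetical carrier IDs.
raw_fingerprints = {
    "local_1": [0.9, 0.1, 0.0, 0.2],
    "local_2": [0.8, 0.2, 0.1, 0.1],
    "long_haul_1": [0.1, 0.9, 0.8, 0.0],
    "long_haul_2": [0.2, 0.8, 0.9, 0.1],
}
index = {cid: normalize(vec) for cid, vec in raw_fingerprints.items()}

matches = top_k_similar([0.85, 0.15, 0.05, 0.15], index, k=2)
for score, carrier_id in matches:
    print(f"{carrier_id}: {score:.3f}")
```

Normalizing once at index time means each comparison is a single dot product, which is a common way to keep cosine ranking cheap.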
Trained on the Entire Industry
Our model didn't learn from a sample or a subset. It analyzed every registered carrier in the country — learning the full spectrum of how fleets are structured, from owner-operators to large regional carriers.
| Carriers analyzed | Data points per carrier | Search time |
| --- | --- | --- |
| 2.3M+ | 60+ | <100ms |
For the Technically Curious
Under the hood, we use a transformer architecture with self-attention — the same class of model behind ChatGPT, but purpose-built for structured tabular carrier data rather than language.
The model is trained using masked column reconstruction: we randomly hide data points and ask the model to predict them from context. This forces it to learn deep interdependencies between all carrier attributes — not just surface correlations.
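The data-preparation half of masked column reconstruction can be sketched as follows. Everything here is an assumption made for illustration: the `[MASK]` placeholder, the 30% masking rate, the column names, and the `mask_columns` helper are not taken from the actual training pipeline.

```python
import random

MASK = "[MASK]"  # hypothetical placeholder for a hidden data point


def mask_columns(row, mask_fraction=0.3, rng=None):
    """Hide a random subset of columns and return (masked_input, targets).

    The model would see `masked_input` and be trained to predict `targets`.
    """
    rng = rng or random.Random()
    n_masked = max(1, round(len(row) * mask_fraction))
    hidden = set(rng.sample(sorted(row), n_masked))
    masked_input = {
        col: (MASK if col in hidden else value) for col, value in row.items()
    }
    targets = {col: row[col] for col in hidden}
    return masked_input, targets


# Hypothetical carrier record with a handful of columns.
carrier_row = {
    "trucks": 3,
    "states_served": 15,
    "has_sleepers": True,
    "cargo_type": "general freight",
    "route_type": "long-haul",
}

masked_input, targets = mask_columns(carrier_row, rng=random.Random(0))
print(masked_input)
print(targets)
```

Because different columns are hidden on every pass, the model cannot rely on any single field and has to learn how the fields predict one another.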
The result is a 128-dimensional embedding per carrier. Similarity between embeddings is measured with cosine distance. GPU-accelerated HDBSCAN clustering and vector indexing across 500K+ carriers keep search under 100ms at scale.
What This Means for Your Team
Carrier Embeddings is the foundation for CarrierMatch — our new lookalike search tool that lets you drop in a carrier you already work with and instantly surface others that operate the same way. No more manual filtering across dozens of fields.
For GTM teams, this means:
- Prospect lists built from operational similarity, not just demographic filters
- Discovery of carriers that look different on paper but behave like your best accounts
- A smarter signal for segmentation, prioritization, and outreach sequencing
