Carrier Vector Embeddings: 800K+ Fleet Profiles

Traditional carrier search is broken — not because the data is bad, but because it treats every data point as if it exists in isolation. Fleet size tells you nothing without knowing where a carrier runs, what they haul, and how they're equipped. Two carriers with identical truck counts can be completely different businesses serving completely different needs.

Today, we're releasing Carrier Embeddings, a foundational AI capability that changes how AlphaLoop understands, compares, and surfaces carriers. It's the engine powering our new CarrierMatch feature, and it's built on the same class of transformer architecture behind ChatGPT.

The Problem: Numbers Don't Tell the Whole Story

Consider two carriers, both with 3 trucks. Are they similar? To a traditional database query, yes. To anyone who works in transportation, obviously not.

Carrier A: Local Delivery Co.

3 trucks, 2 states, no sleepers, day routes

Carrier B: Cross-Country Hauler

3 trucks, 15 states, all sleepers, long-haul

Traditional search would call these identical. Any experienced broker knows they're completely different businesses. Our AI now knows that too.

Our Solution: Teaching AI to See the Whole Picture

We built a transformer model trained on every registered carrier in the United States — 2.3 million carriers, 60+ data points each. The model doesn't look at data points one at a time. It understands how all the pieces fit together, the same way an experienced industry professional would.

Step 1: We Collect the Full Picture

For each carrier, we ingest over 60 distinct data points:

Fleet size and equipment types
States and regions served
Safety scores and inspection history
Cargo types and specializations
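
To make the idea of a multi-field profile concrete, here is a minimal sketch of what one carrier record might look like. The field names are illustrative assumptions, not AlphaLoop's actual schema, and only a handful of the 60+ data points are shown. The two records mirror the Carrier A and Carrier B example from earlier.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class CarrierProfile:
    """Hypothetical subset of a carrier record (the real schema is not public)."""
    name: str
    truck_count: int
    states_served: int
    sleeper_share: float  # fraction of trucks with sleeper cabs, 0.0 to 1.0
    long_haul: bool


local = CarrierProfile("Local Delivery Co.", 3, 2, 0.0, False)
cross = CarrierProfile("Cross-Country Hauler", 3, 15, 1.0, True)


def same_fleet_size(a, b):
    """What a single-column filter sees: truck count only."""
    return a.truck_count == b.truck_count


def differing_fields(a, b):
    """Names of the operational fields on which two carriers disagree."""
    return [
        f.name
        for f in fields(a)
        if f.name != "name" and getattr(a, f.name) != getattr(b, f.name)
    ]


print(same_fleet_size(local, cross))
print(differing_fields(local, cross))
```

A filter on fleet size alone treats the two records as a match, while every other operational field shown here disagrees.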

Step 2: Data Points Talk to Each Other

This is where the real intelligence happens. A data point doesn't just sit in a column: the model weighs it against every other data point to work out what it actually means in context.

The model learns that "3 trucks + 2 states + no sleepers" describes a fundamentally different business than "3 trucks + 15 states + all sleepers" — even though the first number is identical. Context is everything.
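
One common way to let data points interact like this is self-attention, the mechanism the technical section below names. The sketch that follows is a toy, single-head version in plain Python with no learned weights. The three-number encodings are made up for illustration and are not how the production model represents carriers.

```python
import math


def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def self_attention(tokens):
    """Toy self-attention without learned projections.

    Each token (one data point) is replaced by a weighted mix of all
    tokens, so its output depends on the rest of the record.
    """
    scale = math.sqrt(len(tokens[0]))
    mixed_tokens = []
    for query in tokens:
        weights = softmax([dot(query, key) / scale for key in tokens])
        mixed = [
            sum(w * value[i] for w, value in zip(weights, tokens))
            for i in range(len(query))
        ]
        mixed_tokens.append(mixed)
    return mixed_tokens


# Hypothetical encodings, one token per data point:
# [truck count, states served, sleeper share], each scaled to roughly 0..1.
truck_token = [0.3, 0.0, 0.0]  # "3 trucks", identical for both carriers
local = [truck_token, [0.0, 0.1, 0.0], [0.0, 0.0, 0.0]]
cross = [truck_token, [0.0, 0.75, 0.0], [0.0, 0.0, 1.0]]

# Contextual version of the truck-count token for each carrier.
local_ctx = self_attention(local)[0]
cross_ctx = self_attention(cross)[0]
print(local_ctx)
print(cross_ctx)
```

The truck-count token goes in identical for both carriers and comes out different, because its output mixes in the states-served and sleeper tokens around it.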

Step 3: Create a Carrier Fingerprint

All of that contextual understanding gets compressed into a 128-dimensional embedding — a unique mathematical fingerprint for every carrier. Think of it as a carrier's operational DNA.

Two carriers with similar fingerprints operate similarly — even if their raw numbers look different on paper. This is what enables true similarity matching.

Step 4: Find Your Match at Scale

When you search for a carrier or run a lookalike query, we compare its fingerprint against all 2.3 million carriers in our database using cosine similarity and vector indexing. Results are:

Ranked by how similarly they truly operate, not just how similar their stats look
Able to surface hidden similarities that even seasoned brokers might miss
Returned in under 100 milliseconds
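
The ranking step can be sketched as a cosine-similarity search over fingerprints. This is a minimal brute-force version under stated assumptions: the carrier IDs and 4-dimensional vectors are invented for illustration (the real fingerprints are 128-dimensional), and a production system would use a vector index rather than scanning every carrier.

```python
import heapq
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k_similar(query, fingerprints, k=2):
    """Return the k (carrier_id, score) pairs most similar to `query`.

    Brute-force scan for clarity; not how search at scale would be done.
    """
    scored = (
        (carrier_id, cosine_similarity(query, vector))
        for carrier_id, vector in fingerprints.items()
    )
    return heapq.nlargest(k, scored, key=lambda pair: pair[1])


# Made-up fingerprints for four hypothetical carriers.
fingerprints = {
    "local_a": [0.9, 0.1, 0.0, 0.2],
    "local_b": [0.8, 0.2, 0.1, 0.1],
    "long_haul_a": [0.1, 0.9, 0.8, 0.0],
    "long_haul_b": [0.0, 0.8, 0.9, 0.1],
}
query = [0.85, 0.15, 0.05, 0.15]  # fingerprint of a carrier you already work with

matches = top_k_similar(query, fingerprints, k=2)
for carrier_id, score in matches:
    print(carrier_id, round(score, 3))
```

Cosine similarity compares the direction of two vectors rather than their magnitude, which is why it is a common choice for comparing embeddings.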

Trained on the Entire Industry

Our model didn't learn from a sample or a subset. It analyzed every registered carrier in the country — learning the full spectrum of how fleets are structured, from owner-operators to large regional carriers.

2.3M+ Carriers Analyzed
60+ Data Points Per Carrier
<100ms Search Time

For the Technically Curious

Under the hood, we use a transformer architecture with self-attention — the same class of model behind ChatGPT, but purpose-built for structured tabular carrier data rather than language.

The model is trained using masked column reconstruction: we randomly hide data points and ask the model to predict them from context. This forces it to learn deep interdependencies between all carrier attributes — not just surface correlations.
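
The masking half of that training setup can be sketched in a few lines. This is an illustration only: the column names, the `mask_rate` value, and the use of `None` as a stand-in for a mask token are all assumptions, and the model that would predict the hidden values is omitted.

```python
import random

MASK = None  # placeholder standing in for a learned mask token (assumption)


def mask_columns(record, mask_rate=0.15, rng=random):
    """Hide a random subset of columns.

    Returns (masked_record, targets): the model would see `masked_record`
    and be scored on how well it predicts the values in `targets`.
    """
    columns = list(record)
    n_hidden = max(1, round(len(columns) * mask_rate))
    hidden = set(rng.sample(columns, n_hidden))
    masked = {col: (MASK if col in hidden else val) for col, val in record.items()}
    targets = {col: record[col] for col in hidden}
    return masked, targets


# Hypothetical four-column record; real profiles have 60+ data points.
record = {
    "truck_count": 3,
    "states_served": 15,
    "sleeper_share": 1.0,
    "hazmat": False,
}

masked, targets = mask_columns(record, mask_rate=0.25, rng=random.Random(0))
print(masked)
print(targets)
```

Because a different subset is hidden each time, the model cannot rely on any single column always being present and has to learn how the columns predict one another.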

The result is a 128-dimensional embedding per carrier. Similarity is measured by cosine distance. GPU-accelerated HDBSCAN clustering groups carriers with similar embeddings, and vector indexing across 500K+ carriers enables sub-100ms search at scale.

What This Means for Your Team

Carrier Embeddings is the foundation for CarrierMatch — our new lookalike search tool that lets you drop in a carrier you already work with and instantly surface others that operate the same way. No more manual filtering across dozens of fields.

For GTM teams, this means:

Prospect lists built from operational similarity, not just demographic filters
Discovery of carriers that look different on paper but behave like your best accounts
A smarter signal for segmentation, prioritization, and outreach sequencing