We have seen hundreds of lead scoring models across B2B SaaS companies, and the vast majority share one fatal flaw: they are built on assumptions instead of data. Marketing teams assign arbitrary point values ("downloaded a whitepaper? +10 points. Visited the pricing page? +20 points") and then wonder why sales ignores their MQLs. At GTM11, we build lead scoring systems that actually predict conversion, using Claude AI, HubSpot, and real behavioral signals.
Why Traditional Lead Scoring Fails
Traditional lead scoring relies on manual point assignments that reflect what marketers think matters rather than what actually predicts a sale. The typical setup in HubSpot or Salesforce involves a committee meeting where marketing and sales debate whether attending a webinar is worth 15 or 25 points. This approach has three fundamental problems:
- Static weights decay fast: Buyer behavior changes quarterly. A scoring model built in January is stale by March.
- No negative signals: Most models only add points. They never subtract for disqualifying behaviors like visiting the careers page (they are job hunting, not buying).
- Firmographic blindness: Behavioral scoring without firmographic context means a student downloading your ebook scores the same as a VP at a target account.
The GTM11 Lead Scoring Framework
Our framework combines three scoring dimensions into a single composite score that updates dynamically. We call it the BFI model: Behavioral, Firmographic, and Intent.
Dimension 1: Behavioral Scoring
Instead of guessing which actions matter, we analyze your closed-won deals from the past 12 months. Using Claude AI, we process the full activity timeline of every converted lead and identify the behavioral patterns that statistically correlate with conversion. Common high-signal behaviors we discover include:
- Visiting the pricing page more than twice within 7 days
- Returning to the site after receiving a sales email
- Viewing case studies in the same industry vertical
- Engaging with bottom-of-funnel content like ROI calculators or comparison pages
We also identify negative signals that most scoring models miss: visiting only the blog without ever exploring product pages, unsubscribing from nurture sequences, or bouncing from the pricing page in under 10 seconds.
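To make the mix of positive and negative signals concrete, here is a minimal Python sketch. The event names, the weights, and the 0 to 100 clamp are all hypothetical and chosen for illustration; in a real build the weights would come out of the closed-won analysis, not from this table.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

# Hypothetical event names and weights, for illustration only.
POSITIVE_WEIGHTS = {
    "return_after_sales_email": 20,
    "case_study_view_same_vertical": 15,
    "bofu_content_view": 15,
}
NEGATIVE_WEIGHTS = {
    "nurture_unsubscribe": -25,
    "pricing_bounce_under_10s": -10,
}
PRICING_REPEAT_BONUS = 25  # more than two pricing views within 7 days
BLOG_ONLY_PENALTY = -15    # blog views but never a product page


@dataclass
class Event:
    name: str
    at: datetime


def pricing_views_in_window(events, days=7):
    """True if at least three pricing views fall inside any `days`-day window."""
    times = sorted(e.at for e in events if e.name == "pricing_view")
    window = timedelta(days=days)
    return any(times[i + 2] - times[i] <= window for i in range(len(times) - 2))


def behavioral_score(events):
    """Sum positive and negative signals, then clamp to the 0-100 range."""
    score = 0
    for e in events:
        score += POSITIVE_WEIGHTS.get(e.name, 0)
        score += NEGATIVE_WEIGHTS.get(e.name, 0)
    if pricing_views_in_window(events):
        score += PRICING_REPEAT_BONUS
    names = {e.name for e in events}
    # Simplified stand-in for "visits only the blog".
    if "blog_view" in names and "product_page_view" not in names:
        score += BLOG_ONLY_PENALTY
    return max(0, min(100, score))
```

The point of the sketch is that negative signals subtract from the same total that positive signals add to, so a lead can lose ground as well as gain it.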
Dimension 2: Firmographic Scoring
We enrich every lead with firmographic data using Clay and Apollo. The enrichment pipeline pulls company size, industry, tech stack, funding stage, and hiring velocity. Claude AI then compares each lead's firmographic profile against your ideal customer profile and assigns a fit score from 0 to 100.
The key insight here is weighting. If 80% of your closed-won deals come from Series B+ SaaS companies with 50-500 employees, a lead matching that profile should start with a significant score advantage regardless of their behavioral activity.
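One simple way to turn that weighting idea into a number is to score each attribute by the share of closed-won deals that had the same value. The sketch below assumes hypothetical field names (`industry`, `size_band`, `funding_stage`) and an equal-weight average across attributes; it is an illustration of the idea, not the scoring logic Claude AI applies.

```python
from collections import Counter


def attribute_shares(won_deals, attribute):
    """Share of closed-won deals for each value of one attribute."""
    counts = Counter(deal[attribute] for deal in won_deals)
    total = sum(counts.values())
    return {value: n / total for value, n in counts.items()}


def fit_score(lead, won_deals,
              attributes=("industry", "size_band", "funding_stage")):
    """Average closed-won share across attributes, scaled to 0-100."""
    parts = []
    for attr in attributes:
        shares = attribute_shares(won_deals, attr)
        # A value never seen in closed-won deals contributes nothing.
        parts.append(shares.get(lead.get(attr), 0.0))
    return round(100 * sum(parts) / len(parts))


# Hypothetical history: 4 of 5 wins are Series B SaaS with 50-500 employees.
won = [
    {"industry": "saas", "size_band": "50-500", "funding_stage": "series_b"}
    for _ in range(4)
] + [{"industry": "fintech", "size_band": "1-49", "funding_stage": "seed"}]

ideal = {"industry": "saas", "size_band": "50-500", "funding_stage": "series_b"}
print(fit_score(ideal, won))
```

With this toy history, a lead matching the dominant profile starts well ahead of one that matches nothing, before any behavioral activity is counted.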
Dimension 3: Intent Scoring
Intent data from sources like Bombora, G2, or even LinkedIn ad engagement reveals when a prospect is actively researching solutions in your category. We pipe this data into HubSpot via N8N workflows that run every 6 hours, so lead scores update within hours of a target account showing buying intent.
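With all three dimensions scored on the same 0 to 100 scale, the composite can be as simple as a weighted average. The weights below (40/35/25) are an assumption made for this sketch; the right split depends on what the closed-won analysis shows for your pipeline.

```python
# Hypothetical weights, expressed as integers that sum to 100.
WEIGHTS = {"behavioral": 40, "firmographic": 35, "intent": 25}


def composite_score(scores, weights=WEIGHTS):
    """Weighted average of per-dimension scores (each expected in 0-100)."""
    missing = set(weights) - set(scores)
    if missing:
        raise ValueError(f"missing dimension scores: {sorted(missing)}")
    total = sum(weights.values())
    return sum(scores[name] * w for name, w in weights.items()) / total


print(composite_score({"behavioral": 80, "firmographic": 90, "intent": 40}))
```

Keeping the weights in one place makes them easy to adjust when the balance between dimensions needs to change.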
Building the Pipeline in N8N
The technical implementation connects HubSpot, Clay, and Claude AI through an N8N automation workflow. Here is the architecture:
- Trigger: New lead enters HubSpot (form fill, import, or API)
- Enrichment: N8N calls Clay to enrich firmographic data and Apollo for contact details
- AI Analysis: Claude AI receives the enriched profile plus behavioral history and returns a structured score with reasoning
- Score Update: N8N writes the composite score and reasoning back to custom HubSpot properties
- Routing: Leads above threshold are auto-assigned to sales reps; leads below enter nurture sequences
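The steps above can be sketched as a single function. This is plain Python standing in for the N8N workflow: `enrich`, `score`, and `write_back` are hypothetical callables representing the Clay/Apollo, Claude AI, and HubSpot steps, and the threshold of 70 is an assumed value, not a recommendation.

```python
def process_lead(lead, enrich, score, write_back, threshold=70):
    """Run one lead through enrich -> score -> write-back -> routing.

    The three callables are injected so the flow can be exercised
    without calling any external service.
    """
    profile = enrich(lead)                 # Enrichment step
    result = score(profile)                # AI Analysis step
    write_back(lead["id"], result)         # Score Update step
    if result["composite"] >= threshold:   # Routing step
        return "assign_to_sales"
    return "nurture"


# Usage with stubbed steps (all values hypothetical).
written = {}
route = process_lead(
    {"id": "lead-1", "email": "vp@example.com"},
    enrich=lambda lead: {**lead, "industry": "saas"},
    score=lambda profile: {"composite": 82, "explanation": "Strong ICP fit."},
    write_back=lambda lead_id, result: written.update({lead_id: result}),
)
print(route, written["lead-1"]["composite"])
```

Injecting the steps as functions keeps the routing rule testable on its own, which helps later when thresholds are being compared.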
The Claude AI prompt is critical. We instruct it to return a JSON object with scores for each dimension, a composite score, a confidence level, and a one-sentence explanation of why the lead scored the way it did. This explanation gets written to a custom HubSpot field so sales reps understand why a lead is hot, not just that it is.
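Because the response gets written straight into HubSpot, it is worth validating it before the write-back step. Here is a minimal sketch of that check. The field names and the `low`/`medium`/`high` confidence values are assumptions for illustration, since the exact schema depends on how the prompt is written.

```python
import json
from dataclasses import dataclass, fields

SCORE_FIELDS = ("behavioral", "firmographic", "intent", "composite")
CONFIDENCE_LEVELS = {"low", "medium", "high"}  # hypothetical label set


@dataclass
class LeadScore:
    behavioral: float
    firmographic: float
    intent: float
    composite: float
    confidence: str
    explanation: str


def parse_score(raw: str) -> LeadScore:
    """Parse the model's JSON reply and reject anything malformed."""
    data = json.loads(raw)  # raises ValueError on invalid JSON
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    names = [f.name for f in fields(LeadScore)]
    missing = [n for n in names if n not in data]
    if missing:
        raise ValueError(f"missing fields: {missing}")
    result = LeadScore(**{n: data[n] for n in names})
    for name in SCORE_FIELDS:
        value = getattr(result, name)
        if not isinstance(value, (int, float)) or not 0 <= value <= 100:
            raise ValueError(f"{name} must be a number from 0 to 100")
    if result.confidence not in CONFIDENCE_LEVELS:
        raise ValueError("unexpected confidence level")
    if not isinstance(result.explanation, str) or not result.explanation.strip():
        raise ValueError("explanation must be a non-empty string")
    return result
```

A reply that fails validation can be retried or flagged for review instead of overwriting a good score with a broken one.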
Calibration and Continuous Improvement
A scoring model is only as good as its feedback loop. We set up a monthly calibration process:
- Pull all leads that scored above 70 but did not convert — analyze why the model was wrong
- Pull all closed-won deals that scored below 50 — identify signals the model missed
- Feed these exceptions back into the Claude AI prompt to refine the scoring logic
- A/B test scoring thresholds: does routing at 65 vs 75 produce better conversion rates?
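The first two pulls and the threshold comparison can be sketched in a few lines of Python. The record shape (`score`, `won`) is hypothetical, and the threshold comparison here is a backtest on historical leads, which is a cheaper first check than a live A/B test and not a replacement for one.

```python
def calibration_exceptions(leads, high=70, low=50):
    """Split out the two kinds of misses reviewed each month."""
    false_positives = [l for l in leads if l["score"] > high and not l["won"]]
    false_negatives = [l for l in leads if l["score"] < low and l["won"]]
    return false_positives, false_negatives


def conversion_rate_at(leads, threshold):
    """Share of leads at or above `threshold` that ended up closed-won."""
    routed = [l for l in leads if l["score"] >= threshold]
    if not routed:
        return 0.0
    return sum(1 for l in routed if l["won"]) / len(routed)


# Hypothetical historical leads.
history = [
    {"id": 1, "score": 85, "won": True},
    {"id": 2, "score": 72, "won": False},
    {"id": 3, "score": 68, "won": True},
    {"id": 4, "score": 45, "won": True},
    {"id": 5, "score": 30, "won": False},
]

fp, fn = calibration_exceptions(history)
print([l["id"] for l in fp], [l["id"] for l in fn])
for threshold in (65, 75):
    print(threshold, round(conversion_rate_at(history, threshold), 2))
```

A higher threshold usually raises the conversion rate of routed leads while shrinking how many get routed, so both numbers need to be read together.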
After three calibration cycles, our clients typically see MQL-to-SQL conversion rates improve by 40-60%. The sales team starts trusting the scores because they reflect reality, not marketing's wishful thinking.
Implementation Checklist
If you want to build this system for your own organization, here is what you need:
- HubSpot Professional or Enterprise (for custom properties and workflows)
- Clay account for firmographic enrichment
- N8N instance (self-hosted or cloud) for orchestration
- Claude AI API access for intelligent scoring
- 12 months of CRM data with clear won/lost outcomes
We typically implement the full system in 2-3 weeks. The first week is data analysis and model design, the second is technical implementation, and the third is calibration against historical data. By week four, you are routing leads with a model that is already smarter than anything a human committee could design.
Stop guessing which leads matter. Build a scoring model that learns from your actual conversion data and gets smarter every month. That is how modern GTM teams win.
