Bot Insights — crawler governance

Crawler Governance — www.example.com, AI category health queue

Top crawler entities ranked by triggered governance signals for the current window.

Comparison window
Current: 2026-04-07 00:00 → 2026-04-14 00:00 UTC
Baseline: 2026-03-31 00:00 → 2026-04-07 00:00 UTC
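
The two windows above are the same length and sit back to back, which is what a week-over-week comparison would be expected to produce. A minimal sketch of that derivation follows, assuming the baseline is simply the current window shifted back by its own length; `prior_window` is a hypothetical helper name, not something taken from the report tooling.

```python
from datetime import datetime, timedelta, timezone


def prior_window(start, end):
    """Return the window of equal length that ends where the given one starts."""
    length = end - start
    return start - length, start


# Current window as printed in the report (UTC).
current_start = datetime(2026, 4, 7, tzinfo=timezone.utc)
current_end = datetime(2026, 4, 14, tzinfo=timezone.utc)

baseline_start, baseline_end = prior_window(current_start, current_end)
print(baseline_start.isoformat(), "->", baseline_end.isoformat())
```

Under that assumption the result matches the baseline window shown in the report (2026-03-31 to 2026-04-07).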
Executive summary

What this report says

2 of 2 AI categories need analyst attention — start with AI Training (80 governance-surface failures).

Recommended action: Check good crawler rate limits, 5xx exposure, robots.txt, llms.txt, and sitemap availability.

Coverage is thin — 72% of rule evaluations had missing inputs. Real risk may be higher than the score implies.

Assign: 2 · Watch: 0 · Insufficient data: 0 · Close — expected: 0

2 AI categories need analyst attention (out of 2).

Crawler governance evidence

2 entities need analyst attention

AI Training #1
Assign · Score 64 · Crawler governance · Low confidence: 12 of 17 rules missing inputs
Crawler-governance signals
  • Policy Surface Failure Present +16 pts

    Governance surfaces have 80 failed requests.

  • Good Bot 429 Present +14 pts

    Good bot traffic has 120 429 responses.

  • Good Bot Error Rate High +12 pts

    Good bot error rate is 6.50%.

  • AI Crawler Growth High +10 pts

    AI crawler metric increased by 344.44%.

Supporting signals
  • Volume Delta High Movement +12 pts

    Request volume increased by 31000 (344.44%).
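
The report gives the absolute increase and the percent increase but not the underlying request counts. As a rough cross-check, the sketch below backs out the implied baseline, assuming the percentage is the usual `delta / baseline * 100`; the derived counts are an inference from rounded figures, not values printed in the report, and `implied_baseline` is a hypothetical helper.

```python
def implied_baseline(delta, pct_change):
    """Solve pct_change = delta / baseline * 100 for baseline."""
    return delta / (pct_change / 100.0)


delta = 31000   # absolute increase in requests, from the report
pct = 344.44    # percent increase, from the report (already rounded)

baseline = implied_baseline(delta, pct)
current = baseline + delta
print(round(baseline), round(current))
```

With the rounded inputs this lands near 9,000 baseline requests and near 40,000 current requests.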

Search Crawler #2
Assign · Score 50 · Crawler governance · Low confidence: 11 of 15 rules missing inputs
Crawler-governance signals
  • Policy Surface Failure Present +16 pts

    Governance surfaces have 120 failed requests.

  • Good Bot 429 Present +14 pts

    Good bot traffic has 7400 429 responses.

  • Good Bot Error Rate High +12 pts

    Good bot error rate is 8.20%.

  • Rate 429 Delta High +8 pts

    429 rate increased by 6.3 percentage points.

Triage queue

2 AI categories, ordered by what to do

Verdict | AI Category    | Score | Δ  | Primary domain     | Top evidence
Assign  | AI Training    | 64    | ±0 | Crawler governance | Governance surfaces have 80 failed requests.
Assign  | Search Crawler | 50    | ±0 | Crawler governance | Governance surfaces have 120 failed requests.
Hosts in scope

Scorecard rollup

Per-host scoring detail lives in the scorecard brief. This rollup only cross-references which hosts the movement applies to.

Host           | Score | Band          | Verdict
AI Training    | 64    | High review   | Assign
Search Crawler | 50    | Medium review | Assign
Domain score matrix

How risk points distribute across domains

AI Category    | Score | Crawler governance | Movement
AI Training    | 64    | 52                 | 12
Search Crawler | 50    | 50                 | -
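
The per-domain figures in this matrix are consistent with a plain sum of the signal points listed under "Crawler governance evidence". The sketch below reproduces that roll-up. It assumes the score is an uncapped sum of triggered points and that "Supporting signals" map to the Movement domain; neither is stated by the report, and `rollup` is a hypothetical name.

```python
from collections import defaultdict

# (entity, domain, signal, points) copied from the evidence section.
SIGNALS = [
    ("AI Training", "Crawler governance", "Policy Surface Failure Present", 16),
    ("AI Training", "Crawler governance", "Good Bot 429 Present", 14),
    ("AI Training", "Crawler governance", "Good Bot Error Rate High", 12),
    ("AI Training", "Crawler governance", "AI Crawler Growth High", 10),
    ("AI Training", "Movement", "Volume Delta High Movement", 12),
    ("Search Crawler", "Crawler governance", "Policy Surface Failure Present", 16),
    ("Search Crawler", "Crawler governance", "Good Bot 429 Present", 14),
    ("Search Crawler", "Crawler governance", "Good Bot Error Rate High", 12),
    ("Search Crawler", "Crawler governance", "Rate 429 Delta High", 8),
]


def rollup(signals):
    """Sum points per entity and domain, then per entity overall."""
    by_domain = defaultdict(lambda: defaultdict(int))
    for entity, domain, _signal, points in signals:
        by_domain[entity][domain] += points
    totals = {entity: sum(domains.values()) for entity, domains in by_domain.items()}
    return by_domain, totals


by_domain, totals = rollup(SIGNALS)
for entity, score in totals.items():
    print(entity, score, dict(by_domain[entity]))
```

Search Crawler has no Movement entry because no movement signal triggered for it, which is why that cell is empty in the matrix.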
Recommended next steps

Investigations to queue from this report

  1. Check good crawler rate limits, 5xx exposure, robots.txt, llms.txt, and sitemap availability.
    2 AI categories · AI Training, Search Crawler
  2. Review mover attribution for the same scope and confirm comparable current/baseline windows.
    1 AI category · AI Training
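
For the first step, a quick availability probe of the governance surfaces could look like the sketch below. It is only an illustration: the paths are assumptions (in particular `/sitemap.xml`, since the real sitemap location may differ), the `surface-check/0.1` User-Agent is made up, and `http_status` and `check_surfaces` are hypothetical helpers. The fetch function is injectable so the logic can be exercised without network access.

```python
import urllib.error
import urllib.request

# Assumed locations of the governance surfaces at the site root.
SURFACE_PATHS = ("/robots.txt", "/llms.txt", "/sitemap.xml")


def http_status(url, timeout=10):
    """Return the HTTP status code for a GET on url, or None on network failure."""
    request = urllib.request.Request(url, headers={"User-Agent": "surface-check/0.1"})
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # 4xx/5xx responses arrive as exceptions; the code is still useful.
        return err.code
    except OSError:
        # Covers URLError, DNS failures, refused connections, and timeouts.
        return None


def check_surfaces(base_url, fetch=http_status):
    """Map each surface path to (status, ok), where ok means a 2xx response."""
    results = {}
    for path in SURFACE_PATHS:
        status = fetch(base_url.rstrip("/") + path)
        results[path] = (status, status is not None and 200 <= status < 300)
    return results
```

Calling `check_surfaces("https://www.example.com")` would issue three live GET requests. Run from a single client, it says nothing about what rate-limited crawlers actually receive, so the 429 and 5xx checks still need log data.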
Coverage

Which rules could be evaluated

Rule coverage by domain — 5 domains evaluated, 71.88% of rule evaluations had missing inputs
Domain             | Triggered | Below threshold | Inputs missing
Crawler governance | 8         | 0               | 1
Cache busting      | 0         | 0               | 8
Movement           | 1         | 0               | 4
Origin impact      | 0         | 0               | 4
Security evidence  | 0         | 0               | 6
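
The headline missing-input share can be recomputed from the rows above. A minimal sketch, assuming each row counts rule evaluations across both ranked entities and that the share is missing evaluations divided by all evaluations:

```python
# domain -> (triggered, below threshold, inputs missing), copied from the table.
COVERAGE = {
    "Crawler governance": (8, 0, 1),
    "Cache busting": (0, 0, 8),
    "Movement": (1, 0, 4),
    "Origin impact": (0, 0, 4),
    "Security evidence": (0, 0, 6),
}


def missing_share(coverage):
    """Return (missing, total, percent missing) over all rule evaluations."""
    total = sum(sum(row) for row in coverage.values())
    missing = sum(row[2] for row in coverage.values())
    return missing, total, 100.0 * missing / total


missing, total, pct = missing_share(COVERAGE)
print(f"{missing} of {total} rule evaluations had missing inputs ({pct:.2f}%)")
```

This gives 23 of 32 evaluations. That agrees with the per-entity counts earlier in the report (12 of 17 plus 11 of 15) and with the 71.88% headline figure.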

Confidence: Medium · 2. Reasons: Baseline window has enough rows, Current window has enough rows, Some feature inputs missing, Dimensions fit retained schema, SIEM data unavailable, Summary table used.

Method & caveats

What this report is and isn't

Rule-based scorecard built from mechanical features only, comparing the current window against the prior week (week over week). It reports what was measured, not why. Missing feature inputs are reported as missing; they are not scored as safe.
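
One way to keep "missing" distinct from "safe" is to give each rule evaluation three possible outcomes, mirroring the Triggered / Below threshold / Inputs missing columns in the coverage table. The sketch below is illustrative only: the `Outcome` enum, the `evaluate` function, and the 5.0 threshold are hypothetical and not taken from the scorecard's actual rules.

```python
from enum import Enum
from typing import Optional


class Outcome(Enum):
    TRIGGERED = "triggered"
    BELOW_THRESHOLD = "below threshold"
    INPUTS_MISSING = "inputs missing"


def evaluate(value: Optional[float], threshold: float) -> Outcome:
    """Evaluate one rule; a missing input is never folded into 'below threshold'."""
    if value is None:
        return Outcome.INPUTS_MISSING
    if value >= threshold:
        return Outcome.TRIGGERED
    return Outcome.BELOW_THRESHOLD


# Hypothetical threshold of 5.0% for a good-bot error-rate rule.
print(evaluate(6.5, 5.0).value)   # measured and over the threshold
print(evaluate(None, 5.0).value)  # input not available
```

Keeping the third state explicit is what lets a report say that coverage is thin instead of silently lowering the score.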

Schema, source table, and constraints
Schema: bot_scorecard_artifacts.v1
Comparison: Week over week
Producer limit: 5 (returned 2, truncated: false)
Tenant / database: ·
Table: bi_summary_hour
Constraints: Rule-based scorecard; Mechanical features only; No causal claim; LLM may summarize structured evidence only
Confidence reasons: Baseline window has enough rows; Current window has enough rows; Some feature inputs missing; Dimensions fit retained schema; SIEM data unavailable; Summary table used
Orientation — what this report measures
What this measures

A health score for each ranked crawler entity (AI category, bot class, or request host) on a 0–100 scale. Higher scores reflect more triggered crawler-governance signals — good-bot 429 / error rate, AI-crawler growth, governance surface failures — plus rate delta context when the rowset population is crawler-specific.

How to read the score

Higher score = more triggered crawler-governance rules. Bands: escalate, monitor, observe.

  • escalate · 0–40
  • monitor · 40–70
  • observe · 70–100
What this can't say

Not a confirmed-malicious-crawler call. Missing inputs are reported as missing — they are not scored as safe.