Six data-analytics platforms, the same six-stage visibility engine, and a result that should reset how you think about authority: Domo, the lowest-authority brand here at DR 70, posts 1,084 aggregate AI citations, within 5% of DR-91 Splunk's category-high 1,131 and well ahead of DR-88 Databricks (629), which blocks crawlers at the access layer. Authority and AI visibility are correlated but not the same engine, and the gap between them is where the opportunity sits.

The divergence that drives this teardown

Domo, the lowest-DR brand (70), posts 1,084 aggregate AI citations, second only to Splunk. Databricks, the highest-traffic brand, under-converts at 629. The cause is GEO hygiene, not authority, and it builds directly on the authority paradox we documented in proptech.

01 How does the visibility engine work?

Six stages gate demand; a failure low in the stack caps everything above. Every brand here runs the same six-stage engine. Demand is captured only if all six stages hold, which is why a DR-70 brand can out-cite a DR-88 brand that blocks crawlers at the access layer.
Access: bot can reach the content
Discovery: llms.txt, sitemap, robots
Structured data: JSON-LD entity
Content: quotable non-branded material
Authority: trusted referring domains
Citation: surfaced & quoted by AI
Figure 1 - the GEO citation funnel. Each stage gates the next; a brand fails at the first broken link, and access/discovery gate everything above them.
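The gating logic in Figure 1 can be sketched in a few lines of Python. This is a minimal illustration, not the audit tooling behind the teardown: the stage keys, the `first_broken_stage` helper, and the sample audit result are all hypothetical names chosen for the example.

```python
from typing import Optional

# Stage order follows the funnel in Figure 1, lowest gate first.
STAGES = ["access", "discovery", "structured_data", "content", "authority", "citation"]


def first_broken_stage(checks: dict[str, bool]) -> Optional[str]:
    """Return the first stage that fails, or None if every stage holds."""
    for stage in STAGES:
        # A stage that was never checked is treated as failing.
        if not checks.get(stage, False):
            return stage
    return None


# Hypothetical audit result: strong content and authority, but bots are blocked.
blocked_brand = {
    "access": False,
    "discovery": False,
    "structured_data": True,
    "content": True,
    "authority": True,
    "citation": True,
}

print(first_broken_stage(blocked_brand))  # access
```

Because the loop stops at the first failure, strength in the later stages never shows up in the result, which is the point the funnel makes.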

02 What does the category scoreboard show?

Authority and AI-citation share diverge sharply across the six. Every brand normalized on one row. Traffic value is the monthly PPC-equivalent of organic visits; AI citations sum responses across six engines.
The category scoreboard
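| Company | DR | Traffic/mo | Value/mo | Ref dom | DR70+ | Spam | AI cites |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Databricks | 88 | 1.3M | $2.3M | 31.9K | 2.8K | 24.8% | 629 |
| Splunk | 91 | 847K | $1.41M | 30.4K | 3.2K | 22.2% | 1,131 |
| Domo | 70 | 163K | $292K | 14.6K | 1.7K | 38.4% | 1,084 |
| Alteryx | 79 | 153K | $363K | 20.4K | 1.6K | 52.8% | 139 |
| Baremetrics | 77 | 34.8K | $31.2K | 4.2K | 605 | 23% | 118 |
| Sisense | 78 | 28.1K | $42.8K | 8.6K | 921 | 43.2% | 75 |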
Figure 2 - aggregate AI citations across six engines. Domo and Splunk lead; Databricks is third despite far more traffic. Source: Ahrefs Brand Radar, June 2026
Figure 3 - Domain Rating benchmark. Domo (signal) is the lowest-authority brand yet ranks second on citations, the divergence in one chart.
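One way to make "under-converts" concrete is to divide citations by traffic. The sketch below uses the scoreboard figures as published (rounded); the citations-per-100K-visits ratio is an illustrative metric introduced here, not one the source data reports, and the function names are hypothetical.

```python
# (DR, monthly organic visits, aggregate AI citations), copied from the scoreboard.
SCOREBOARD = {
    "Databricks": (88, 1_300_000, 629),
    "Splunk": (91, 847_000, 1131),
    "Domo": (70, 163_000, 1084),
    "Alteryx": (79, 153_000, 139),
    "Baremetrics": (77, 34_800, 118),
    "Sisense": (78, 28_100, 75),
}


def cites_per_100k_visits(visits: int, cites: int) -> float:
    """AI citations earned per 100,000 monthly organic visits."""
    return cites / visits * 100_000


def rank_by_conversion(board: dict[str, tuple[int, int, int]]) -> list[str]:
    """Brands ordered from highest to lowest citations per visit."""
    return sorted(
        board,
        key=lambda brand: cites_per_100k_visits(board[brand][1], board[brand][2]),
        reverse=True,
    )


for brand in rank_by_conversion(SCOREBOARD):
    dr, visits, cites = SCOREBOARD[brand]
    ratio = cites_per_100k_visits(visits, cites)
    print(f"{brand:<12} DR {dr}  {ratio:7.1f} cites per 100K visits")
```

On these figures Domo ranks first and Databricks last, the same divergence the charts show, expressed as a single ratio.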

03 The six engines, torn down

One forensic card per brand: how it works, what to copy, where to attack.

Databricks - Category Leader

The engine. A definitional-content flywheel on an enormous, clean backlink base, partly throttled by a hostile WAF.

By the numbers
| DR | Traffic/mo | Value/mo | AI cites | Branded | Spam |
| --- | --- | --- | --- | --- | --- |
| 88 | 1.3M | $2.3M | 629 | 58.6% | 24.8% |

Content engine. The deepest definitional library in the set. It owns the canonical 'what is X' queries (data warehouse, vector database, data lake) and pairs them with product-led pages. 41% non-branded across 31.2K keywords, 13.3K in the top three, the exact content shape AI engines summarize.

Link engine. 31.9K referring domains, 2,844 DR70+, at a clean 25% spam ratio. The differentiator is depth in developer ecosystems: GitHub links from over 2,000 pages, a co-citation moat no marketing campaign can fake.

GEO configuration. Strong on-page signals but two GEO holes: a WAF that returns 403 to non-browser agents on robots.txt and sitemap.xml, and no fetchable llms.txt. The result is an AI-citation share that under-indexes its traffic dominance.
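A quick way to reproduce this kind of finding is to request the same path with a browser-like and a bot-like User-Agent and compare status codes. The following is a sketch using only the standard library; the User-Agent strings are simplified placeholders (real crawlers send longer strings), and `probe` and `blocked_agents` are hypothetical helper names.

```python
import urllib.error
import urllib.request

# Simplified, assumed User-Agent strings; consult each vendor's docs for the real ones.
AGENTS = {
    "browser": "Mozilla/5.0 (X11; Linux x86_64)",
    "gptbot": "GPTBot",
    "perplexitybot": "PerplexityBot",
}


def fetch_status(url: str, user_agent: str, timeout: float = 10.0) -> int:
    """Return the final HTTP status code for a GET with the given User-Agent."""
    request = urllib.request.Request(url, headers={"User-Agent": user_agent})
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code  # 4xx/5xx responses arrive as exceptions


def probe(base_url: str, path: str = "/robots.txt") -> dict[str, int]:
    """Fetch one path once per agent and collect the status codes."""
    return {name: fetch_status(base_url + path, ua) for name, ua in AGENTS.items()}


def blocked_agents(statuses: dict[str, int]) -> list[str]:
    """Agents refused with 401/403 while the browser-like agent got 200."""
    if statuses.get("browser") != 200:
        return []  # the path is broken for everyone, not selectively blocked
    return sorted(
        name for name, code in statuses.items()
        if name != "browser" and code in (401, 403)
    )
```

Calling `blocked_agents(probe("https://www.example.com"))` would list the agents a WAF turns away; an empty list means the path is either open to all tested agents or unavailable to everyone.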

Evidence: top organic keywords
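| Keyword | Volume | Traffic | Pos |
| --- | --- | --- | --- |
| databricks | 485,660 | 455,720 | 1 |
| what is a data warehouse | 297,290 | 23,653 | 4.4 |
| what is a vector database | 320,000 | 19,228 | 4.2 |
| machine learning models | 161,610 | 20,221 | 2.6 |
| databricks careers | 20,350 | 16,947 | 1 |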

What's working: Definitional 'what is' pages that map 1:1 to AI-answer queries; A developer co-citation moat (2,000+ GitHub linking pages); The largest commercial footprint in the set, at a clean 25% spam ratio.

Playbook, what to copy: Build a canonical 'what is X' library for every category concept; Pursue developer-ecosystem links for durable technical authority; Treat certification and community pages as link magnets.

Attack angle, where to win

Databricks taxes itself at the access layer. Where it blocks GPTBot/PerplexityBot and skips llms.txt, a fully-open competitor shipping FAQ schema over the same definitional terms can intercept the long-tail AI citations it leaves unconverted. Hard to beat on authority, easy to out-cite on hygiene.

Splunk - Authority Leader

The engine. A breadth-of-education content machine on the cleanest, highest-authority link base in the set.

By the numbers
| DR | Traffic/mo | Value/mo | AI cites | Branded | Spam |
| --- | --- | --- | --- | --- | --- |
| 91 | 847K | $1.41M | 1,131 | 43.5% | 22.2% |

Content engine. Publishes across the entire security and observability syllabus: risk frameworks, distributed systems, SIEM, hash functions. 56% non-branded, 38.4K keywords with 20.8K in positions 4-10, feeding a long tail of AI eligibility.

Link engine. The strongest profile here: ~30.4K referring domains, 3,193 DR70+, and the lowest spam exposure at 22%. Authority compounds because the profile stays clean.

GEO configuration. Technically the benchmark: valid robots.txt and sitemap, canonical, full Open Graph and Twitter, three JSON-LD blocks including VideoObject. Two blemishes: no llms.txt, and a templating bug leaking a broken hreflang token into the markup.

Evidence: top organic keywords
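| Keyword | Volume | Traffic | Pos |
| --- | --- | --- | --- |
| splunk | 212,030 | 149,834 | 1.1 |
| risk management frameworks | 267,300 | 35,671 | 2.8 |
| software testing basics | 335,540 | 23,675 | 28.3 |
| what is a distributed system | 223,040 | 14,836 | 5 |
| hash functions | 165,490 | 10,876 | 5.4 |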

What's working: Syllabus-wide content covering the whole category vocabulary; The cleanest link base in the set (22% spam); Benchmark technical hygiene.

Playbook, what to copy: Own the full category glossary, not just bottom-funnel pages; Protect link hygiene as a KPI; Add VideoObject and rich JSON-LD to multiply entity signals.

Attack angle, where to win

Splunk is hard to displace on authority; the opening is GEO hygiene. No llms.txt and no FAQ schema over a massive glossary. A challenger that wraps the same definitional content in FAQPage markup and ships llms.txt can win definitional citations before Splunk closes the gap.

Domo - GEO Overperformer

The engine. A disciplined GEO configuration that converts modest authority into category-leading AI citations.

By the numbers
| DR | Traffic/mo | Value/mo | AI cites | Branded | Spam |
| --- | --- | --- | --- | --- | --- |
| 70 | 163K | $292K | 1,084 | 31.3% | 38.4% |

Content engine. Markets directly at the AI-agent buyer ('Governed Data for AI Agents') and ranks across BI terms plus a quirky high-traffic cluster (strip chart). 69% non-branded with a strong 6.6K Top-3 footprint on 12.6K keywords.

Link engine. 14.6K referring domains, 1,686 DR70+, but a 38% spam ratio hinting at historic low-quality accumulation. Authority is the lowest here (DR 70), which makes its citation lead all the more notable.

GEO configuration. The reason it over-performs: the richest llms.txt in the set (16KB of curated description), which tracks with category-leading citations (432 Grok, 336 AI Overviews, 302 AI Mode). The irony is broken basic metadata: no canonical, no meta description, only og:type on the homepage, and a sitemap served as application/rss+xml.
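Metadata gaps like these are easy to detect mechanically. Below is a minimal sketch with Python's built-in HTML parser; the `REQUIRED` list is an assumption about what counts as basic head hygiene (the teardown does not define "full Open Graph" tag by tag), and `HeadAudit` and `missing_head_tags` are hypothetical names.

```python
from html.parser import HTMLParser

# Assumed minimum set of head signals; adjust to your own definition.
REQUIRED = ["canonical", "description", "og:title", "og:type", "og:image"]


class HeadAudit(HTMLParser):
    """Collects the canonical link and any meta name/property that has content."""

    def __init__(self) -> None:
        super().__init__()
        self.found: set[str] = set()

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        if tag == "link" and (attr.get("rel") or "").lower() == "canonical":
            if attr.get("href"):
                self.found.add("canonical")
        elif tag == "meta":
            key = (attr.get("name") or attr.get("property") or "").lower()
            if key and attr.get("content"):
                self.found.add(key)


def missing_head_tags(html: str) -> list[str]:
    """Return the required signals that the page does not declare."""
    audit = HeadAudit()
    audit.feed(html)
    return [tag for tag in REQUIRED if tag not in audit.found]


# Hypothetical homepage that only declares og:type.
sample = '<head><title>Example</title><meta property="og:type" content="website"></head>'
print(missing_head_tags(sample))
# ['canonical', 'description', 'og:title', 'og:image']
```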

Evidence: top organic keywords
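| Keyword | Volume | Traffic | Pos |
| --- | --- | --- | --- |
| domo | 136,440 | 34,947 | 2.9 |
| domo ai | 51,860 | 9,527 | 2.5 |
| strip chart data | 26,000 | 5,816 | 1 |
| business intelligence tools | 261,090 | 2,233 | 4 |
| strip chart meaning | 41,350 | 5,478 | 1.3 |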

What's working: The richest llms.txt in the set, the clearest cause of its over-performance; Buyer-aligned positioning (AI agents, BI category); A strong Top-3 footprint for a mid-authority domain.

Playbook, what to copy: Copy the llms.txt discipline, the single highest-ROI GEO move; Align positioning to the emerging buyer and build content around it; Use Domo as the internal case study: low DR, high citations.

Attack angle, where to win

Do not fight Domo on GEO mechanics; fight it on authority. With only DR 70 and 38% spam, a competitor with cleaner DR70+ links can out-rank it on commercial BI queries while matching its schema discipline. Its broken homepage metadata is a quick credibility wedge.

Alteryx - Branded-Dependent

The engine. A powerful brand-SERP harvester with almost no non-branded discovery engine behind it.

By the numbers
| DR | Traffic/mo | Value/mo | AI cites | Branded | Spam |
| --- | --- | --- | --- | --- | --- |
| 79 | 153K | $363K | 139 | 73.9% | 52.8% |

Content engine. 74% of traffic is branded (alteryx, certification, community, designer, download). The non-branded layer is thin and softening, a demand-harvesting engine, not a demand-generation one.

Link engine. Third-largest footprint (20.4K referring domains, 1,555 DR70+, behind Databricks and Splunk) but the worst spam exposure in the set at 53%; more than half its referring domains add risk rather than authority. A disavow program is overdue.

GEO configuration. WordPress with full Open Graph and clean headings, but two faults: no llms.txt, and the conventional /sitemap.xml returns 404 (the real index hides at /sitemap_index.xml), so crawlers probing standard paths can miss it.
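Crawlers that do not find /sitemap.xml can still locate a non-standard index if robots.txt declares it with a `Sitemap:` line. Whether Alteryx does so is not stated here, so the sketch below only shows how such a check could work; `sitemap_urls` is a hypothetical helper and the sample robots.txt is invented for illustration.

```python
def sitemap_urls(robots_txt: str) -> list[str]:
    """Extract the URLs declared in Sitemap: lines of a robots.txt body."""
    urls = []
    for raw_line in robots_txt.splitlines():
        line = raw_line.split("#", 1)[0].strip()  # drop trailing comments
        field, sep, value = line.partition(":")   # split on the first colon only
        if sep and field.strip().lower() == "sitemap" and value.strip():
            urls.append(value.strip())
    return urls


# Invented robots.txt pointing at a non-standard sitemap index path.
example = """\
User-agent: *
Disallow: /wp-admin/
Sitemap: https://www.example.com/sitemap_index.xml
"""

print(sitemap_urls(example))
# ['https://www.example.com/sitemap_index.xml']
```

An empty result combined with a 404 on /sitemap.xml would mean a crawler has no declared path to the index at all.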

Evidence: top organic keywords
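| Keyword | Volume | Traffic | Pos |
| --- | --- | --- | --- |
| alteryx | 94,720 | 75,390 | 1.2 |
| alteryx certification | 3,910 | 3,588 | 1 |
| alteryx community | 2,900 | 2,638 | 1 |
| trifacta | 2,080 | 1,706 | 1.2 |
| alteryx designer | 2,780 | 1,589 | 1.1 |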

What's working: Total ownership of its brand SERP; Healthy traffic value ($363K/mo) and full Open Graph; A large raw link footprint to build on once cleaned.

Playbook, what to copy: Treat brand-SERP completeness as table stakes, then layer non-branded content; Use full Open Graph everywhere for rich entity signals.

Attack angle, where to win

Alteryx is wide open on non-branded discovery. A competitor publishing strong how-to and comparison content for analytics-automation queries intercepts buyers Alteryx never reaches. Combine that with 53% spam and a sitemap missing from the standard /sitemap.xml path, and it is the most structurally exposed enterprise brand in the set.

Baremetrics - Niche Content Specialist

The engine. A pure editorial content engine that ranks enormous finance terms but lacks the authority to fully cash them in.

By the numbers
| DR | Traffic/mo | Value/mo | AI cites | Branded | Spam |
| --- | --- | --- | --- | --- | --- |
| 77 | 34.8K | $31.2K | 118 | 4.6% | 23% |

Content engine. The most content-led model here: 95% non-branded. It ranks for genuinely huge educational terms (churn rate analysis at 263K, what is a burn rate at 180K, MRR, cohort analysis). Editorial depth is real and category-relevant.

Link engine. The smallest authority base: DR 77 but only 4.2K referring domains, 605 DR70+, at a clean 23% spam. Stripe, Shopify and Medium links lend ecosystem relevance, but the base is too thin to lift it past more authoritative sources.

GEO configuration. Clean foundations: valid llms.txt, sitemap, canonical, and JSON-LD including SoftwareApplication. The gap is a missing og:image, so widely-shared finance guides render without a preview card, suppressing the social amplification that builds AI authority.

Evidence: top organic keywords
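| Keyword | Volume | Traffic | Pos |
| --- | --- | --- | --- |
| churn rate analysis | 263,200 | 10,392 | 4.3 |
| what is a burn rate | 180,200 | 4,856 | 10.4 |
| baremetrics | 1,290 | 1,357 | 1 |
| startup financial modeling | 38,000 | 742 | 12 |
| mrr | 37,930 | 444 | 8.8 |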

What's working: The purest content engine: ranks 263K-volume terms on a tiny domain; A clean technical base with llms.txt and commerce schema; Topically relevant ecosystem links (Stripe, Shopify).

Playbook, what to copy: Target enormous-volume educational terms adjacent to the product; Add FAQ/HowTo schema to calculators and guides; Earn topically-relevant ecosystem links, not just high-DR links.

Attack angle, where to win

Baremetrics wins on editorial depth and loses on authority. A competitor with more DR70+ links can out-rank its finance guides on the exact high-volume terms it depends on. Its missing og:image is a free amplification gap to exploit.

Sisense - Contracting Challenger

The engine. A decent SQL-tutorial niche engine that is actively decaying because its crawl infrastructure is broken.

By the numbers
| DR | Traffic/mo | Value/mo | AI cites | Branded | Spam |
| --- | --- | --- | --- | --- | --- |
| 78 | 28.1K | $42.8K | 75 | 28.5% | 43.2% |

Content engine. Owns a focused SQL-tutorial niche (order of execution in sql, group by in sql, python data analysis) with a 72% non-branded mix. But the footprint is shrinking to just 2.1K keywords, the smallest active library in the set, and traffic is contracting.

Link engine. 8.6K referring domains, 921 DR70+, but a 43% spam ratio (second-worst). DR 78 is respectable; the problem is decay, not raw authority.

GEO configuration. Has an llms.txt, but the rest of discovery is broken: /sitemap.xml returns a 301 to an empty 0-byte response, and there is no hreflang. Crawlers cannot enumerate the site, the mechanical reason the keyword footprint is collapsing.
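A sitemap that resolves but returns nothing usable is easy to miss if only the status code is checked. Here is a small sketch of a body-level check, assuming the response bytes have already been fetched; `sitemap_entries` is a hypothetical helper, not part of any crawler's API.

```python
import xml.etree.ElementTree as ET


def sitemap_entries(body: bytes) -> list[str]:
    """Return the <loc> URLs in a sitemap or sitemap index.

    Raises ValueError when the body is empty, malformed, or lists no URLs.
    """
    if not body.strip():
        raise ValueError("empty sitemap response")
    try:
        root = ET.fromstring(body)
    except ET.ParseError as err:
        raise ValueError(f"sitemap is not well-formed XML: {err}") from err
    # Compare local tag names so the sitemap namespace does not matter.
    locs = [
        el.text.strip()
        for el in root.iter()
        if el.tag.rsplit("}", 1)[-1] == "loc" and el.text and el.text.strip()
    ]
    if not locs:
        raise ValueError("sitemap lists no URLs")
    return locs


valid = (
    b'<?xml version="1.0" encoding="UTF-8"?>'
    b'<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">'
    b"<url><loc>https://www.example.com/</loc></url>"
    b"</urlset>"
)
print(sitemap_entries(valid))  # ['https://www.example.com/']
```

A 0-byte body like the one described above would raise `ValueError("empty sitemap response")` rather than passing as a successful fetch.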

Evidence: top organic keywords
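| Keyword | Volume | Traffic | Pos |
| --- | --- | --- | --- |
| sisense | 9,730 | 6,944 | 1.1 |
| order of execution in sql | 2,880 | 1,820 | 1.5 |
| python data analysis | 92,100 | 1,051 | 13.3 |
| group by in sql | 12,390 | 616 | 5.1 |
| mtd full form | 8,340 | 814 | 1.3 |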

What's working: A focused SQL-tutorial niche with a good non-branded mix; A respectable DR 78 to rebuild from; It has published an llms.txt.

Playbook, what to copy: Even a struggling brand should ship llms.txt, discovery hygiene is cheap insurance; Defend a focused niche rather than spreading thin.

Attack angle, where to win

Sisense is the most beatable peer: a declining curve plus a broken sitemap means crawlers cannot index it properly. Consistent publishing on a working crawl stack would overtake it on embedded-analytics queries within two quarters, the lowest-effort share grab in the set.

04 What do the cross-cutting patterns reveal?

GEO hygiene predicts citations more than authority, and the best combo is unclaimed. AI engines select sources, they do not rank pages, and the clearest predictor of selection here is GEO hygiene rather than raw authority. Grok and Google AI Overviews carry most citation volume; Gemini barely cites anyone. The llms.txt divide is the sharpest line in the data: only Domo, Sisense and Baremetrics publish one, and Domo's richest-in-set file tracks with its citation lead.
Figure 4 - spam exposure by brand. Alteryx (53%) and Sisense (43%) carry the dirtiest profiles, inviting algorithmic discounting.
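The spam percentages become easier to compare once they are applied to each brand's referring-domain count. The sketch below treats the Spam column as the share of referring domains that are spam and uses the rounded scoreboard figures, so the outputs are estimates; "clean referring domains" is a derived number introduced here, not one reported in the data.

```python
# (referring domains, spam share), copied from the rounded scoreboard figures.
LINK_PROFILES = {
    "Databricks": (31_900, 0.248),
    "Splunk": (30_400, 0.222),
    "Domo": (14_600, 0.384),
    "Alteryx": (20_400, 0.528),
    "Baremetrics": (4_200, 0.23),
    "Sisense": (8_600, 0.432),
}


def clean_ref_domains(ref_domains: int, spam_share: float) -> float:
    """Estimated referring domains left after discounting the spam share."""
    return ref_domains * (1.0 - spam_share)


clean = {
    brand: clean_ref_domains(domains, spam)
    for brand, (domains, spam) in LINK_PROFILES.items()
}

for brand, estimate in sorted(clean.items(), key=lambda item: item[1], reverse=True):
    raw = LINK_PROFILES[brand][0]
    print(f"{brand:<12} raw {raw:>6,}  clean ~{estimate:>8,.0f}")
```

Under this assumption Alteryx keeps fewer than half of its 20.4K referring domains, which narrows its lead over Domo considerably.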
Figure 5 - non-branded share of traffic. Alteryx harvests existing demand; Baremetrics generates it; Splunk and Databricks own the GEO-optimal definitional middle.

Two opposite strategies bracket the set. Alteryx (74% branded) harvests existing demand; Baremetrics (95% non-branded) generates demand but cannot fully cash it without authority. The GEO-optimal middle is the definitional libraries of Splunk and Databricks. Nobody has yet layered FAQPage schema over those glossaries, the single softest spot in the category.

The technical stack, scored
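| Element | Databricks | Splunk | Domo | Alteryx | Baremetrics | Sisense |
| --- | --- | --- | --- | --- | --- | --- |
| llms.txt | N | N | Y | N | Y | Y |
| Valid sitemap.xml | ~ | Y | Y | ~ | Y | N |
| AI-crawler access | N | Y | Y | Y | Y | Y |
| JSON-LD schema | Y | Y | Y | Y | Y | Y |
| Full Open Graph | Y | Y | N | Y | ~ | Y |
| Canonical tag | Y | Y | N | Y | Y | Y |
| hreflang | ~ | ~ | Y | Y | Y | N |
| Meta description | Y | Y | N | Y | Y | Y |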

Legend: Y implemented, ~ partial or misconfigured, N missing or blocked. The three highest-leverage fixes across the set: add a canonical and meta description (Domo), allowlist AI crawlers at the WAF (Databricks), and wrap glossaries in FAQPage schema (everyone).
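The FAQPage fix lends itself to generation, since a glossary is already a list of question-and-answer pairs. The following is a minimal sketch that builds the markup with the standard `json` module; `faq_jsonld` is a hypothetical helper, and the sample question and answer are placeholders rather than content from any of the six sites.

```python
import json


def faq_jsonld(entries: dict[str, str]) -> str:
    """Render question/answer pairs as a schema.org FAQPage script tag."""
    data = {
        "@context": "https://schema.org",
        "@type": "FAQPage",
        "mainEntity": [
            {
                "@type": "Question",
                "name": question,
                "acceptedAnswer": {"@type": "Answer", "text": answer},
            }
            for question, answer in entries.items()
        ],
    }
    # Escape "</" so answer text can never close the script element early.
    payload = json.dumps(data, indent=2).replace("</", "<\\/")
    return f'<script type="application/ld+json">\n{payload}\n</script>'


glossary = {
    "What is a data warehouse?": "Placeholder answer taken from the glossary page.",
}
print(faq_jsonld(glossary))
```

Generating the block from the same source as the visible glossary text keeps the markup and the on-page answer from drifting apart.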

High-leverage fixes, copy-paste ready
html
<!-- 1. Canonical and meta description in the page <head> (Domo) -->
<link rel="canonical" href="https://www.example.com/" />
<meta name="description" content="One-sentence summary of the page." />

robots.txt
# 2. AI-crawler allowlist (Databricks). Only takes effect once the WAF
#    stops returning 403 on robots.txt itself.
User-agent: GPTBot
Allow: /

User-agent: PerplexityBot
Allow: /

html
<!-- 3. FAQPage schema over glossary (everyone) -->
<script type="application/ld+json">{"@context":"https://schema.org",
 "@type":"FAQPage","mainEntity":[{"@type":"Question", ... }]}</script>

05 What's the synthesized playbook?

Eight reusable plays, ordered by leverage-to-effort. Pulled from what the leaders do and the gaps the laggards leave.
The synthesized playbook
| Play | Why it works | Effort |
| --- | --- | --- |
| Ship a rich llms.txt | Half the market has none; Domo proves it converts authority into citations | ~2h |
| Wrap glossaries in FAQPage schema | No one has done it; it is what engines extract for 'what is' answers | ~3h |
| Build a definitional 'what is X' library | Maps 1:1 to AI-answer queries; Splunk and Databricks win this way | Ongoing |
| Allowlist GPTBot / PerplexityBot | Crawler blocking silently caps citation eligibility | ~3h |
| Fix canonical + meta + Open Graph | Cheap entity-signal hygiene | ~2h |
| Pursue developer-ecosystem links | GitHub/docs co-citation is a durable technical-trust moat | Ongoing |
| Guard link hygiene as a KPI | Clean profiles compound; 22% beats 53% over time | Quarterly |
| Track AI share-of-voice separately from rankings | Authority and citation share diverge | Monthly |
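The last play, tracking share-of-voice apart from rankings, can start as a very small calculation. This sketch uses the DR and citation figures from the category scoreboard; the rank-gap measure (DR rank minus citation rank) is an illustrative choice made here, and all function names are hypothetical.

```python
DR = {"Databricks": 88, "Splunk": 91, "Domo": 70, "Alteryx": 79, "Baremetrics": 77, "Sisense": 78}
AI_CITES = {"Databricks": 629, "Splunk": 1131, "Domo": 1084, "Alteryx": 139, "Baremetrics": 118, "Sisense": 75}


def share_of_voice(cites: dict[str, int]) -> dict[str, float]:
    """Each brand's fraction of all AI citations in the set."""
    total = sum(cites.values())
    return {brand: count / total for brand, count in cites.items()}


def ranks(values: dict[str, int]) -> dict[str, int]:
    """Rank brands from 1 (highest value) downward."""
    ordered = sorted(values, key=values.get, reverse=True)
    return {brand: position for position, brand in enumerate(ordered, start=1)}


def rank_gap(dr: dict[str, int], cites: dict[str, int]) -> dict[str, int]:
    """Positive gap: the brand ranks better on citations than on DR."""
    dr_rank, cite_rank = ranks(dr), ranks(cites)
    return {brand: dr_rank[brand] - cite_rank[brand] for brand in dr}


shares = share_of_voice(AI_CITES)
gaps = rank_gap(DR, AI_CITES)
for brand in sorted(shares, key=shares.get, reverse=True):
    print(f"{brand:<12} share {shares[brand]:6.1%}  rank gap {gaps[brand]:+d}")
```

Run monthly, a widening positive gap signals that hygiene work is paying off ahead of authority, and a negative gap flags a brand whose authority is not converting.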
llms.txt, the GEO file half this market is missing
llms.txt
# <Company> - <one-line positioning>
> A concise description of what the company does and who it serves.

## Products
- [Platform](https://example.com/product): overview and core capabilities

## Learn
- [Glossary](https://example.com/glossary): definitions engines can quote
- [Guides](https://example.com/guides): how-to content for category queries
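A file in this shape can be rendered from structured data so it stays in sync with the site. The sketch below follows the template above exactly; `render_llms_txt` is a hypothetical helper, and the company name, URLs, and notes are placeholders.

```python
def render_llms_txt(
    company: str,
    positioning: str,
    summary: str,
    sections: dict[str, list[tuple[str, str, str]]],
) -> str:
    """Render an llms.txt body from a title, a summary, and link sections.

    Each section maps a heading to (link title, URL, short note) tuples.
    """
    lines = [f"# {company} - {positioning}", f"> {summary}"]
    for heading, links in sections.items():
        lines += ["", f"## {heading}"]
        lines += [f"- [{title}]({url}): {note}" for title, url, note in links]
    return "\n".join(lines) + "\n"


text = render_llms_txt(
    company="Example Co",
    positioning="governed analytics for data teams",
    summary="A concise description of what the company does and who it serves.",
    sections={
        "Products": [
            ("Platform", "https://example.com/product", "overview and core capabilities"),
        ],
        "Learn": [
            ("Glossary", "https://example.com/glossary", "definitions engines can quote"),
            ("Guides", "https://example.com/guides", "how-to content for category queries"),
        ],
    },
)
print(text)
```

Writing the result to the site root as /llms.txt is the publishing step; the generator only guarantees a consistent structure, not the quality of the descriptions.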

06 Where's the fastest share? (the attack plan)

Sisense and Alteryx are the cheapest share; Databricks and Splunk are hardest. Each competitor ranked by how cheaply share can be taken, with the specific weakness and the move that exploits it.
The attack plan
| Target | Exposed weakness | The move | Ease |
| --- | --- | --- | --- |
| Sisense | Broken sitemap, contracting 2.1K-keyword footprint, 43% spam | Publish consistently on a working crawl stack; target SQL/embedded-analytics queries | Easiest |
| Alteryx | 74% branded, no llms.txt, sitemap 404s, 53% spam | Own non-branded analytics-automation how-to and comparison content | Easy |
| Baremetrics | Smallest authority base; missing og:image | Out-authority it on the high-volume finance terms it ranks | Moderate |
| Domo | DR 70 (lowest), 38% spam, broken homepage metadata | Out-rank on commercial BI queries with cleaner DR70+ links | Moderate |
| Splunk | No llms.txt, no FAQ schema, broken hreflang | Win definitional security citations with FAQ content + llms.txt | Hard |
| Databricks | WAF blocks crawlers; no llms.txt; under-converts | Be fully crawler-open and FAQ-structured to capture long-tail citations | Hard |
Bottom line

Easiest share: Sisense (broken sitemap, contracting) and Alteryx (74% branded, dirtiest links, no llms.txt).

Hardest to displace: Databricks and Splunk on authority, but both wide open on GEO hygiene.

The universal opening: nobody pairs a definitional glossary with FAQPage schema and llms.txt. That combination is unclaimed, and non-branded definitional content plus flawless crawl infrastructure beats raw authority for AI citations.

Why does Domo out-cite Databricks despite far lower authority?

Because AI citation is gated by GEO hygiene, not just Domain Rating. Domo publishes the richest llms.txt in the set (16KB) and markets directly at the AI-agent buyer, while Databricks blocks simple crawlers at its WAF and ships no fetchable llms.txt. Domo (DR 70) posts 1,084 aggregate AI citations versus Databricks' 629 (DR 88), because access and discovery gate everything above them.

What is the single highest-ROI GEO move for a data-analytics brand?

Ship a rich llms.txt. Only three of the six brands publish one, and Domo's detailed file tracks directly with its citation lead. It is roughly two hours of work and converts existing authority into citations, especially for mid-authority domains that can't win on Domain Rating alone.

Which brand is the most exposed to competitive attack?

Sisense. Its /sitemap.xml returns a 301 to an empty response so crawlers can't enumerate the site, its keyword footprint is contracting to 2.1K terms, and 43% of its referring domains are spam. Consistent publishing on a working crawl stack could overtake it on embedded-analytics queries within one to two quarters.

What's the unclaimed opportunity across the whole category?

Pairing a definitional 'what is X' glossary with FAQPage schema and a working llms.txt. The leaders (Splunk, Databricks) own the definitional content but have no FAQ schema or llms.txt; the GEO-disciplined brands (Domo) lack the authority. No one has combined all three, so non-branded definitional content plus flawless crawl infrastructure is the fastest path to AI citations.

About rawmktg.

rawmktg. publishes data-driven teardowns of B2B verticals and brands, pulling AI-citation and SEO data to show exactly where the visibility gaps are. Method: same data, same lens, every time. Contact: vinayak@rawmktg.com

Data source: Ahrefs (organic keywords, referring domains, Brand Radar AI citations) plus a live technical crawl of all six domains, captured June 2026.