35%
reduction in ad spend wastage
Multi-Touch AI Attribution Engine Cutting Ad Wastage by 35% for Pixis
Pixis
35%
Ad wastage reduction
5M+
Events processed daily
6wks
Full platform delivery
3
Ad platforms integrated
Context
The business context
Attribution is one of marketing's oldest unsolved problems. Every platform claims credit. Last-click models give all credit to whoever closes the conversion. First-click models over-reward awareness channels. Multi-touch models exist in theory but most implementations are correlation machines - they tell you what happened, not why. Pixis was building AI marketing products for enterprise clients and needed an attribution engine that was both accurate and explainable - because marketing directors need to defend budget decisions in board meetings, not just in dashboards.
The problem
5 specific problems that needed solving
Last-click attribution gave 100% conversion credit to retargeting and branded search - channels that were intercepting organic conversions rather than creating them
Marketing teams were cutting upper-funnel spend based on attribution data that made those channels look like wasted budget
No existing model could disentangle channels that were genuinely driving incremental conversions from those cannibalising organic
Attribution reports were black boxes - clients trusted the numbers but couldn't explain the methodology to internal stakeholders
Each ad platform's native attribution tool used different windows, different counting logic, and different attribution models - making cross-platform comparison meaningless
Our approach
Causal inference, not just correlation.
Standard multi-touch attribution assigns fractional credit based on touchpoint position or frequency - but position and frequency aren't causation. A user who clicks a branded search ad two minutes before purchasing wasn't converted by that ad; they were already going to buy. We used Shapley values - a method from cooperative game theory for fairly distributing credit among contributing players - to model what each touchpoint genuinely contributed to the probability of conversion, controlling for what would have happened without it. This gave Pixis a model they could explain mathematically to clients, not just show as a black-box score.
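For reference, this is the standard Shapley value definition the bullets below apply to attribution - where N is the set of touchpoints in a journey and v(S) is read as the modelled conversion probability given only the touchpoints in coalition S (that reading of v is our assumption about how the value function is defined here):

```latex
\phi_i(v) = \sum_{S \subseteq N \setminus \{i\}} \frac{|S|!\,(|N| - |S| - 1)!}{|N|!} \bigl( v(S \cup \{i\}) - v(S) \bigr)
```

Each channel's credit is its marginal lift, v(S ∪ {i}) − v(S), averaged over every order in which the other touchpoints could have arrived.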
Shapley value attribution: each channel's credit is its marginal contribution to conversion probability, averaged across all possible orderings of the customer journey
Holdout experiment framework: regular geo-split tests validate the model's incremental contribution estimates against real-world holdout groups
Per-client model training: each Pixis client gets a model fine-tuned on their own conversion data, not a generic industry model
Weekly retraining pipeline: campaign creative, audience, and seasonal effects shift constantly - the model retrains on a rolling 90-day window to stay calibrated
What we built
A real-time attribution pipeline processing millions of events per day
The platform consists of an event ingestion layer, an attribution computation engine, and a Next.js reporting dashboard. The event ingestion layer connects to the Google Ads API, Meta Marketing API, and TikTok Ads API via a custom ETL pipeline built on Apache Spark and Airflow. Raw conversion events are unified into a single event schema (solving cross-platform discrepancies in event naming and timing). The attribution engine applies the Shapley-value model to each completed conversion journey, producing per-channel credit scores that feed into the reporting dashboard. Every model version is tracked in MLflow, with automatic retraining triggered every seven days.
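As an illustration of the unified event schema, here is a minimal sketch of what the canonical record could look like - the field names and types are our assumptions, not Pixis's actual schema:

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional

@dataclass(frozen=True)
class CanonicalEvent:
    """One ad interaction or conversion, unified across source platforms."""
    event_id: str
    user_id: str                        # resolved cross-platform identity
    platform: str                       # "google_ads" | "meta" | "tiktok"
    channel: str                        # normalised channel label, e.g. "paid_search"
    event_type: str                     # "impression" | "click" | "view_through" | "conversion"
    occurred_at: datetime               # normalised to UTC
    conversion_value: Optional[float] = None
    attribution_window_days: int = 30   # harmonised window applied downstream
```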
Event unification layer
Custom ETL normalises events from Google, Meta, and TikTok into a single canonical event schema - resolving discrepancies in click attribution windows, view-through counting logic, and conversion event naming across platforms. This alone eliminated the 'attribution gap' that made cross-platform reporting meaningless.
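A sketch of what one platform adapter in that layer might look like, mapping a raw Meta record onto the CanonicalEvent sketch above - the raw field names and the action-type mapping are illustrative assumptions, not the real Meta payload contract:

```python
from datetime import datetime, timezone

# Reuses the CanonicalEvent dataclass from the schema sketch above.

_EVENT_TYPE_MAP = {
    "offsite_conversion.fb_pixel_purchase": "conversion",   # illustrative action types
    "link_click": "click",
    "view_content": "view_through",
}

def normalise_meta_event(raw: dict) -> CanonicalEvent:
    """Map one raw Meta Marketing API record onto the canonical schema.

    The raw field names below are assumptions; the real payload shape depends
    on the reporting endpoint and action breakdowns requested.
    """
    return CanonicalEvent(
        event_id=raw["id"],
        user_id=raw["resolved_user_id"],
        platform="meta",
        channel="paid_social",
        event_type=_EVENT_TYPE_MAP.get(raw["action_type"], "impression"),
        occurred_at=datetime.fromisoformat(raw["event_time"]).astimezone(timezone.utc),
        conversion_value=raw.get("value"),
    )
```

Equivalent adapters for Google and TikTok would map onto the same schema, which is what makes cross-platform credit comparable downstream.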
Shapley attribution engine
For each completed conversion journey, the engine computes each touchpoint's contribution using Shapley values - a mathematically fair credit distribution from cooperative game theory. Runs on Apache Spark to process 5M+ events per day within a 4-hour computation window.
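A simplified, single-journey sketch of that computation, assuming the trained model exposes a conversion-probability function over sets of channels (conversion_prob is a hypothetical callable; exhaustive enumeration is only viable because the number of distinct channels in one journey is small):

```python
from collections import defaultdict
from itertools import permutations
from typing import Callable, Dict, FrozenSet, Sequence

def shapley_credit(
    channels: Sequence[str],
    conversion_prob: Callable[[FrozenSet[str]], float],
) -> Dict[str, float]:
    """Average each channel's marginal lift in conversion probability
    over every possible ordering of the journey's touchpoints."""
    credit: Dict[str, float] = defaultdict(float)
    orderings = list(permutations(channels))
    for ordering in orderings:
        seen = set()
        prev = conversion_prob(frozenset())      # baseline: no paid touchpoints
        for channel in ordering:
            seen.add(channel)
            current = conversion_prob(frozenset(seen))
            credit[channel] += current - prev    # marginal contribution in this ordering
            prev = current
    return {channel: total / len(orderings) for channel, total in credit.items()}
```

In production this sits inside the Spark job over completed journeys; Monte Carlo sampling of orderings is the usual fallback if the channel count grows beyond what exhaustive enumeration can handle.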
Incremental validation framework
Monthly geo-split holdout tests measure the actual incremental impact of each major channel. Results are fed back into the model as calibration signals, ensuring the Shapley estimates track real-world incrementality rather than drifting over time.
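A minimal sketch of the lift estimate a geo-split test produces, which can then be compared against the Shapley credit the model assigned to the same channel over the same window - the column names are assumptions:

```python
import pandas as pd

def geo_split_lift(geo_results: pd.DataFrame) -> float:
    """Estimate a channel's incremental conversion rate from a geo-split holdout test.

    Expects one row per geo with (hypothetical) columns:
      group        -- "exposed" (channel live) or "holdout" (channel paused)
      conversions  -- conversions observed during the test window
      population   -- addressable audience size in that geo
    """
    exposed = geo_results[geo_results["group"] == "exposed"]
    holdout = geo_results[geo_results["group"] == "holdout"]
    exposed_rate = exposed["conversions"].sum() / exposed["population"].sum()
    holdout_rate = holdout["conversions"].sum() / holdout["population"].sum()
    # The channel's incremental contribution over the organic baseline.
    return float(exposed_rate - holdout_rate)
```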
Explainability layer
Every attribution report includes a methodology note, channel-level confidence intervals, and a 'what changed' comparison to the previous model version - so marketing directors can defend the numbers to CFOs and media agencies.
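A sketch of the per-channel payload that could back such a report - the field names are illustrative, not the actual dashboard contract:

```python
from dataclasses import dataclass

@dataclass
class ChannelAttributionReport:
    """One channel's entry in an explainable attribution report."""
    channel: str
    credit_share: float            # Shapley credit as a share of attributed conversions
    ci_low: float                  # lower bound of the confidence interval
    ci_high: float                 # upper bound of the confidence interval
    previous_credit_share: float   # same figure from the prior model version
    model_version: str             # model run that produced this estimate
    methodology_note: str          # plain-language explanation for stakeholders
```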
Automated retraining pipeline
An Airflow DAG triggers weekly model retraining on a rolling 90-day conversion window. MLflow tracks every model version, enabling rollback if a retrain degrades accuracy on the validation dataset.
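A sketch of how that trigger could be wired up, assuming Airflow 2.x - the DAG id, experiment name, and the body of the retraining callable are assumptions; only the rolling 90-day window and weekly schedule come from the build described above:

```python
from datetime import datetime, timedelta

import mlflow
from airflow import DAG
from airflow.operators.python import PythonOperator

def retrain_attribution_model(**context):
    """Retrain on a rolling 90-day conversion window and log the run to MLflow."""
    window_end = context["logical_date"]
    window_start = window_end - timedelta(days=90)
    mlflow.set_experiment("attribution-shapley")   # hypothetical experiment name
    with mlflow.start_run():
        mlflow.log_param("window_start", window_start.isoformat())
        mlflow.log_param("window_end", window_end.isoformat())
        # ... load unified events for the window, refit the conversion-probability
        # model, evaluate on the held-out validation set, and log the artefacts ...

with DAG(
    dag_id="attribution_weekly_retrain",           # hypothetical DAG id
    start_date=datetime(2024, 1, 1),
    schedule="@weekly",
    catchup=False,
) as dag:
    PythonOperator(
        task_id="retrain_model",
        python_callable=retrain_attribution_model,
    )
```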
Impact
What changed in production
The 35% wastage reduction was measured against actual client spend reallocation following the first 90 days of model output - not a theoretical estimate.
35% reduction in ad spend wastage in 90 days. Platform now processes 5M+ events per day. Used as a core differentiator in Pixis's client-facing product.
“We finally have an attribution model we can defend in a budget meeting. It changed how our clients think about performance marketing entirely - and it's now a core part of our product differentiation.”
Head of Data Science - Pixis
Learnings
What we took away from this project
Explainability is a commercial requirement, not an engineering nice-to-have
Every technically accurate attribution model we'd seen in the market was a black box. Pixis's clients needed to walk into budget reviews and explain why they were reallocating spend away from channels that appeared to be 'working' under last-click. The Shapley value framing gave them a mathematical narrative: 'This channel's marginal contribution, when we account for what would have happened without it, is X.' That made the model commercially viable in a way that a higher-accuracy black box would not have been.
Cross-platform event unification is underestimated
We scoped the event unification layer as two weeks of work. It took four. Every platform uses different attribution windows, different conversion event schemas, and different logic for view-through vs click-through. Building a truly unified event model - one where a conversion in Google and a conversion in Meta mean the same thing - required significantly more mapping work than expected. This is the unsexy part that makes everything else work.
Weekly retraining is necessary, but needs guardrails
Campaign creative, audiences, and seasonality shift constantly. A model trained in January will be miscalibrated by March. But automated retraining without quality gates can also introduce degradation silently. We built an automated validation gate that compares new model performance on a held-out test set against the previous version - and blocks deployment if accuracy declines, alerting the team for manual review instead.
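A sketch of what that gate could reduce to in code - the metric, threshold, and the promote/alert helpers are assumptions for illustration:

```python
def should_deploy(candidate_auc: float, production_auc: float,
                  tolerance: float = 0.005) -> bool:
    """Block deployment if the retrained model is meaningfully worse
    than the current production model on the held-out test set."""
    return candidate_auc >= production_auc - tolerance

# Usage in the retraining pipeline (illustrative helpers):
# if should_deploy(new_metrics["holdout_auc"], prod_metrics["holdout_auc"]):
#     promote(new_model_version)
# else:
#     alert_team("Retrain blocked: holdout accuracy regressed", new_metrics)
```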
Build something similar?
We've solved this category of problem before. Let's scope yours.
Start a conversation · View related service