Thinkscoop
AI Reconciliation Agent Cuts FinTech Processing Time by 78%
Financial Services, 10 weeks, AI Agents, AI Integration

Global FinTech (Confidential)

78% processing time reduction
<5% escalation rate
35 hours saved per week
200k+ cases processed without incident

Context

The business context

In fintech operations, reconciliation is the work that never ends. Every transaction that doesn't match across systems becomes a discrepancy. Every discrepancy becomes a manual investigation. Every manual investigation takes 20–45 minutes of an ops analyst's time - pulling data from three systems, cross-referencing timestamps, checking counterparty records, and drafting a resolution action for compliance sign-off. At this company, 40+ hours per week of senior analyst time was consumed by a process that was fundamentally pattern-matching: most discrepancies had a known root cause, a known resolution, and a known compliance requirement. The team had tried rule-based automation once before. It had broken under the weight of edge cases and been abandoned.

The problem

Five specific problems that needed solving

40+ hours per week of senior ops analyst time spent on manual reconciliation - work that was largely pattern-matching on known discrepancy types

Previous rule-based automation abandoned after six months: edge cases (unusual counterparty formats, timing discrepancies, partial settlements) broke rules constantly

Three separate source systems with inconsistent data schemas - cross-referencing required manual translation that rules couldn't encode reliably

Compliance requirement for human sign-off on all resolutions: full automation wasn't viable, but the preparation work could be automated

Growing transaction volume meant the manual workload was increasing every quarter - the team needed to scale resolution capacity without scaling headcount

Our approach

Automate the reasoning, not just the rules.

The previous rule-based system had failed because rules can't handle ambiguity. Real-world reconciliation discrepancies don't follow a fixed taxonomy: they're edge cases, partial matches, timing anomalies, and data format inconsistencies that require judgment.

We proposed a different framing: instead of trying to automate the resolution decision, automate the investigation and evidence assembly, and keep the resolution decision with a human reviewer. A well-designed system would reduce the time to resolution from 30 minutes to 2 minutes, not by eliminating the human, but by doing all the preparatory work the human was doing manually. That reframing unlocked the project: the compliance team approved it because human sign-off was preserved.

Automate the investigation, not the decision: human reviewers approve every resolution, not only the exceptions, which keeps compliance requirements intact

LangGraph multi-step agent: each step in the investigation has an explicit state and can be inspected, replayed, or overridden by a reviewer

Evaluation harness built before the agent: defined what 'correctly resolved' meant across 15 discrepancy categories before writing production code

Weekly accuracy and escalation rate reporting to the ops team - building trust incrementally rather than asking for full adoption on day one
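To make "explicit state that can be inspected, replayed, or overridden" concrete, here is a minimal plain-Python sketch. It is a stand-in for the idea, not the LangGraph graph used in the project; CaseState, run_steps, replay_from, and the two toy steps are hypothetical names invented for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class CaseState:
    case_id: str
    data: dict = field(default_factory=dict)
    # trace[0] is the input; trace[i + 1] is the snapshot after steps[i] ran
    trace: list = field(default_factory=list)


def start_case(case_id, data):
    return CaseState(case_id, dict(data), [("input", dict(data))])


def run_steps(state, steps, start=0):
    """Run steps in order from index `start`, snapshotting the data after each."""
    for name, step in steps[start:]:
        state.data = {**state.data, **step(state.data)}
        state.trace.append((name, dict(state.data)))
    return state


def replay_from(state, steps, index, override=None):
    """Re-run from steps[index], optionally with reviewer-supplied overrides."""
    # trace[index] is the snapshot taken just before steps[index] ran
    base = {**state.trace[index][1], **(override or {})}
    fresh = CaseState(state.case_id, base, list(state.trace[: index + 1]))
    return run_steps(fresh, steps, start=index)


# Two toy steps (hypothetical logic, for illustration only)
def classify(d):
    same = d["amount_a"] == d["amount_b"]
    return {"category": "timing_mismatch" if same else "partial_settlement"}


def draft(d):
    return {"draft": f"Resolve as {d['category']}"}


STEPS = [("classify", classify), ("draft", draft)]
```

A reviewer who disagrees with the classification could call replay_from(state, STEPS, 1, {"category": "partial_settlement"}) to regenerate the draft from that point, while the original state and its trace stay untouched.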

What we built

A reasoning agent for financial discrepancy resolution

The system is a LangGraph-based multi-step agent with direct API access to all three source systems. When a reconciliation discrepancy is flagged (by existing batch jobs or real-time triggers), the agent begins a structured investigation: it retrieves the transaction record from each of the three systems, normalises the data into a canonical schema, identifies the discrepancy type, queries relevant counterparty and timing records, and generates a draft resolution action with supporting evidence. The completed package - discrepancy summary, investigation trace, recommended resolution, and compliance documentation - is routed to a human reviewer who can approve in a single click or override with a note. Every decision is logged to an immutable audit trail in PostgreSQL.
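One way to approximate an append-only audit trail is to reject updates and deletes at the database level. The sketch below uses SQLite from the Python standard library purely so it runs anywhere; the production system logs to PostgreSQL, and the table name, columns, and trigger approach here are assumptions, not the project's actual schema.

```python
import sqlite3


def open_audit_log(path=":memory:"):
    """Create a hypothetical audit_log table that rejects UPDATE and DELETE."""
    conn = sqlite3.connect(path)
    conn.executescript(
        """
        CREATE TABLE IF NOT EXISTS audit_log (
            id        INTEGER PRIMARY KEY AUTOINCREMENT,
            case_id   TEXT NOT NULL,
            reviewer  TEXT NOT NULL,
            action    TEXT NOT NULL CHECK (action IN ('approve', 'override')),
            note      TEXT,
            logged_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
        );
        CREATE TRIGGER IF NOT EXISTS audit_no_update
        BEFORE UPDATE ON audit_log
        BEGIN SELECT RAISE(ABORT, 'audit_log is append-only'); END;
        CREATE TRIGGER IF NOT EXISTS audit_no_delete
        BEFORE DELETE ON audit_log
        BEGIN SELECT RAISE(ABORT, 'audit_log is append-only'); END;
        """
    )
    return conn


def log_decision(conn, case_id, reviewer, action, note=None):
    """Append one reviewer decision; an override must carry a note."""
    if action == "override" and not (note and note.strip()):
        raise ValueError("an override must carry a note")
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT INTO audit_log (case_id, reviewer, action, note) "
            "VALUES (?, ?, ?, ?)",
            (case_id, reviewer, action, note),
        )
```

Triggers only guard against ordinary application writes. A database administrator could still drop them, so a real deployment would likely pair this with restricted database permissions.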

1. Multi-source data retrieval

Secure API connectors to all three source systems with schema normalisation. The agent translates each system's native data format into a canonical transaction schema before any reasoning step - eliminating the manual translation that made rule-based approaches brittle.
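A rough sketch of what normalising into a canonical schema can look like. The two source formats, their field names, and the CanonicalTxn fields are invented for illustration, since the real systems and schemas are confidential. A third adapter would follow the same pattern.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from decimal import Decimal


@dataclass(frozen=True)
class CanonicalTxn:
    txn_id: str
    amount: Decimal
    currency: str
    counterparty: str
    booked_at: datetime  # always timezone-aware UTC


def from_ledger(rec):
    """Hypothetical system A: integer minor units, epoch seconds."""
    return CanonicalTxn(
        txn_id=rec["id"],
        # Assumes a two-decimal currency; others need a per-currency exponent
        amount=Decimal(rec["amount_minor"]) / 100,
        currency=rec["ccy"].upper(),
        counterparty=rec["cpty"].strip().upper(),
        booked_at=datetime.fromtimestamp(rec["ts"], tz=timezone.utc),
    )


def from_gateway(rec):
    """Hypothetical system B: decimal string, ISO 8601 timestamp with offset."""
    return CanonicalTxn(
        txn_id=rec["reference"],
        amount=Decimal(rec["amount"]),
        currency=rec["currency"].upper(),
        counterparty=rec["counterparty_name"].strip().upper(),
        booked_at=datetime.fromisoformat(rec["booked_at"]).astimezone(timezone.utc),
    )


ADAPTERS = {"ledger": from_ledger, "gateway": from_gateway}


def normalise(system, rec):
    """Translate one native record into the canonical schema."""
    return ADAPTERS[system](rec)
```

Once both records are in the same shape, a comparison is a plain equality check on the canonical fields, and every later reasoning step can ignore which system a record came from.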

2. Discrepancy classification

The agent's first reasoning step classifies the discrepancy into one of 15 defined categories (timing mismatch, partial settlement, counterparty format error, currency conversion delta, etc.) using a combination of rule-based classifiers and GPT-4o for ambiguous cases. Classification drives the investigation path - different categories trigger different evidence-gathering steps.
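The rules-first, model-for-ambiguity pattern can be sketched as follows. The rules themselves and the field names are hypothetical, and the model call is passed in as a plain function (llm_classify) so the example runs without any API access. In the real system that function would wrap a GPT-4o request.

```python
from typing import Callable, Optional, Tuple

# The four categories named in the text; the remaining 11 are omitted here.
CATEGORIES = {
    "timing_mismatch",
    "partial_settlement",
    "counterparty_format_error",
    "currency_conversion_delta",
}


def rule_classify(a: dict, b: dict) -> Optional[str]:
    """Return a category when a rule is decisive, else None (illustrative rules)."""
    if a["currency"] != b["currency"]:
        return "currency_conversion_delta"
    same_party = a["counterparty"] == b["counterparty"]
    if same_party and a["amount"] == b["amount"] and a["booked_on"] != b["booked_on"]:
        return "timing_mismatch"
    if same_party and 0 < b["amount"] < a["amount"]:
        return "partial_settlement"
    return None


def classify(
    a: dict, b: dict, llm_classify: Callable[[dict, dict], str]
) -> Tuple[str, str]:
    """Try the rules first; fall back to the model only for ambiguous cases."""
    category = rule_classify(a, b)
    if category is not None:
        return category, "rule"
    category = llm_classify(a, b)
    if category in CATEGORIES:
        return category, "llm"
    # Anything outside the known taxonomy goes to a human instead of being guessed
    return "unclassified", "escalate"
```

Returning the source of the decision ("rule", "llm", or "escalate") alongside the category is a design choice in this sketch: it lets the investigation trace show reviewers which path produced the classification.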

3. Evidence assembly

For each discrepancy category, the agent follows a defined investigation protocol: which counterparty records to retrieve, which timing windows to check, which reconciliation rules apply. The assembled evidence package includes all retrieved data, the agent's reasoning trace, and confidence scores for each conclusion.
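A category-driven investigation protocol can be expressed as a lookup table that names which evidence to gather. This is an illustrative sketch only: the protocol contents, the fetcher interface, and the EvidencePackage fields are assumptions, not the project's actual definitions.

```python
from dataclasses import dataclass, field

# Hypothetical protocols: which evidence sources each category requires
PROTOCOLS = {
    "timing_mismatch": ["settlement_calendar", "cutoff_times"],
    "partial_settlement": ["counterparty_record", "open_balances"],
}


@dataclass
class EvidencePackage:
    case_id: str
    category: str
    items: dict = field(default_factory=dict)        # source name -> retrieved data
    confidence: dict = field(default_factory=dict)   # source name -> score in [0, 1]
    reasoning_trace: list = field(default_factory=list)


def assemble_evidence(case_id, category, fetchers):
    """Follow the protocol for `category`, calling one fetcher per source.

    Each fetcher takes a case id and returns a (data, confidence) pair.
    """
    if category not in PROTOCOLS:
        raise KeyError(f"no investigation protocol for category {category!r}")
    package = EvidencePackage(case_id, category)
    for source in PROTOCOLS[category]:
        data, score = fetchers[source](case_id)
        package.items[source] = data
        package.confidence[source] = score
        package.reasoning_trace.append(f"retrieved {source} (confidence {score:.2f})")
    return package
```

Keeping the protocol as data instead of branching code means adding a category is a table entry plus any new fetchers, and the trace records exactly which sources were consulted.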

4. Draft resolution generation

GPT-4o generates a structured resolution recommendation in the format required by the compliance team - including the root cause classification, the resolution action, the regulatory basis for the resolution, and any follow-up requirements. Reviewers see the full evidence chain and can approve with one click or override with a free-text note.
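Because the recommendation has a fixed structure, model output can be validated before a reviewer ever sees it. The sketch below assumes the model returns JSON with four keys that mirror the fields listed above. The key names, the status values, and both function names are hypothetical.

```python
import json

# Assumed key names, mirroring the four fields the compliance format requires
REQUIRED_FIELDS = ("root_cause", "resolution_action", "regulatory_basis", "follow_up")


def parse_draft(raw_json):
    """Reject model output that is not a JSON object with every required field."""
    draft = json.loads(raw_json)  # raises ValueError on malformed JSON
    if not isinstance(draft, dict):
        raise ValueError("draft must be a JSON object")
    missing = [name for name in REQUIRED_FIELDS if name not in draft]
    if missing:
        raise ValueError(f"draft is missing fields: {', '.join(missing)}")
    return draft


def apply_review(draft, approve, note=None):
    """Record the reviewer's decision; an override needs a free-text note."""
    if approve:
        return {**draft, "status": "approved", "review_note": note}
    if not (note and note.strip()):
        raise ValueError("an override requires a note")
    return {**draft, "status": "overridden", "review_note": note.strip()}
```

A draft that fails validation never reaches the approval screen in this design. It could be regenerated or escalated instead, so the reviewer's one click always applies to a complete package.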

5. Evaluation harness

A suite of 500+ test cases (built from 6 months of historical reconciliation records) validates agent performance on every deployment. The harness measures accuracy by discrepancy category, escalation rate, and false negative rate. Any model update that degrades accuracy on the test suite is blocked from production.
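The deployment gate can be sketched as a comparison of per-category accuracy between the current baseline and a candidate. The test-case shape and function names are hypothetical, and gating on each category separately (instead of on overall accuracy) is an assumption of this sketch. Escalation rate and false negative rate would be checked the same way.

```python
from collections import defaultdict


def accuracy_by_category(cases, predict):
    """cases: dicts with 'category', 'input', and 'expected' keys (assumed shape)."""
    hits, totals = defaultdict(int), defaultdict(int)
    for case in cases:
        totals[case["category"]] += 1
        if predict(case["input"]) == case["expected"]:
            hits[case["category"]] += 1
    return {category: hits[category] / totals[category] for category in totals}


def deployment_allowed(baseline, candidate, tolerance=0.0):
    """Block the release if any category falls below baseline by more than tolerance.

    Returns (allowed, regressions), where regressions maps each failing
    category to its (baseline_accuracy, candidate_accuracy) pair.
    """
    regressions = {
        category: (score, candidate.get(category, 0.0))
        for category, score in baseline.items()
        if candidate.get(category, 0.0) < score - tolerance
    }
    return not regressions, regressions
```

Returning the regressions alongside the verdict gives the ops team a concrete list of which discrepancy categories got worse, which is more useful for review than a bare pass or fail.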

Impact

What changed in production

The 78% reduction in processing time wasn't achieved by replacing human judgment - it was achieved by eliminating the 90% of each case that was data gathering and formatting, not judgment.

Processing time dropped 78%. Escalation rate held below 5%. 35 hours per week returned to the team. 200,000+ cases processed without a data incident.

Learnings

What we took away from this project

Framing for compliance teams is a design decision

The first version of our proposal described the system as 'automated reconciliation.' The compliance team rejected it immediately - their mandate required human sign-off on every resolution. We reframed the system as 'automated investigation with human-approved resolution' and the same compliance team approved it the next day. The technical architecture was identical. The framing determined whether the project happened. Understanding what compliance teams can and cannot approve is as important as understanding what the technology can do.

Test suite quality determines production confidence

The evaluation harness was built from 6 months of historical cases - including every case the team could recall where a rule-based system had failed or where a human had overridden an automated suggestion. Building the test suite from real failure modes, not invented edge cases, gave the ops team something concrete to review. When they could see the agent handling the specific cases that had previously defeated automation, adoption followed naturally.

Escalation rate is a leading indicator of agent health

We tracked escalation rate weekly from day one, not as a failure metric but as a health signal. A rising escalation rate indicates data drift (source systems have changed in ways the agent hasn't adapted to) or scope creep (new discrepancy types are appearing that the agent wasn't built to handle). Keeping escalation rate below 5% as a weekly SLA gave the team a clear, actionable signal to trigger investigation before accuracy degraded in production.
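A weekly check along these lines is enough to surface both a breach of the 5% threshold and a rising trend. The input format and function names are hypothetical, and the counts in the usage note are made up for illustration, not project data.

```python
def escalation_rate(escalated, total):
    """Share of cases the agent handed to a human without a recommendation."""
    if total <= 0:
        raise ValueError("total cases must be positive")
    return escalated / total


def weekly_health(weeks, threshold=0.05):
    """weeks: list of (label, escalated_count, total_count) in chronological order.

    Flags a breach when the rate is not below the threshold, and marks a week
    as rising when its rate is higher than the previous week's.
    """
    report = []
    previous = None
    for label, escalated, total in weeks:
        rate = escalation_rate(escalated, total)
        report.append(
            {
                "week": label,
                "rate": rate,
                "breach": rate >= threshold,
                "rising": previous is not None and rate > previous,
            }
        )
        previous = rate
    return report
```

For example, weekly_health([("W1", 30, 1000), ("W2", 42, 1000), ("W3", 55, 1000)]) marks W2 as rising but within the SLA, and W3 as both rising and in breach. Investigation can then start at the rising flag, before the threshold is crossed.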

At a glance

Client: Global FinTech (Confidential)
Industry: Financial Services
Timeline: 10 weeks

Tech stack

GPT-4o, LangGraph, PostgreSQL, FastAPI, Datadog

Capabilities

AI Agents
AI Integration
