I design the systems that banks trust with 10 million customers.

Lead Software Architect  ·  16 years  ·  London
Currently Publicis Sapient · London
Shipping Zero Trust identity · UK banking
Recognition Zero Trust Innovator · 2024
Auth Plane · Target SLOs
Delivered
Customers · Scale
10M+
Banking · UK FTSE
Deploy · Lead time
↓ 40%
vs. pre-GitOps baseline
SLA · production
99.9%
Records processed · per day
50M+
Failure scenarios automated
90+
Infrastructure spend reduction
↓ 25%
Mesh mTLS coverage
100%
Customers secured
10M+
Zero Trust
Records / day
50M+
Spring Batch
SLA delivered
99.9%
GKE · HPA/VPA
Faster deploys
40%
GitOps · 5+ squads
Selected Work · 2015 — 2026

Four systems that needed to not fail.

Case 01 / 04

Identity & Auth Platform

Lead Architect · Publicis Sapient · UK FTSE-listed bank · 2022 – 2024

"10 million banking customers authenticated under a Zero Trust contract — adopted as the enterprise security standard."

End-to-end OAuth2 / OIDC service with signed JWTs, Istio-enforced mTLS, and Apigee + ForgeRock IAM. Every inter-service call policy-gated; tokens minted against HSM-backed keys in GCP KMS. Adopted as the bank-wide standard.

Java 17 · Spring Boot · GKE · Istio · Apigee · ForgeRock · OIDC · GCP KMS
Zero Trust Innovator Award · 2024
Fig. 01 · Auth Plane Topology
[Diagram: user; Apigee gateway; ForgeRock IdP / OIDC; Istio mesh with mTLS (auth, token, consent, mfa, audit); GCP KMS, HSM-backed]
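To make the "signed JWTs" idea above concrete, here is a minimal sketch of signing and verifying a compact RS256 token using only the JDK. It is an illustration under stated assumptions, not the platform's code: the class name `SignedTokenSketch` is hypothetical, and an in-memory RSA key pair stands in for an HSM-backed key held in a KMS.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.PrivateKey;
import java.security.PublicKey;
import java.security.Signature;
import java.util.Base64;

/** Hypothetical sketch: RS256-signed compact token, signed and verified locally. */
final class SignedTokenSketch {
    private static final Base64.Encoder ENC = Base64.getUrlEncoder().withoutPadding();

    /** Assumption: a local RSA key pair stands in for an HSM-backed signing key. */
    static KeyPair newKeyPair() {
        try {
            KeyPairGenerator gen = KeyPairGenerator.getInstance("RSA");
            gen.initialize(2048);
            return gen.generateKeyPair();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Builds header.payload.signature, signing with SHA256withRSA (JWS "RS256"). */
    static String sign(String payloadJson, PrivateKey key) {
        String header = "{\"alg\":\"RS256\",\"typ\":\"JWT\"}";
        String signingInput = encode(header) + "." + encode(payloadJson);
        try {
            Signature sig = Signature.getInstance("SHA256withRSA");
            sig.initSign(key);
            sig.update(signingInput.getBytes(StandardCharsets.US_ASCII));
            return signingInput + "." + ENC.encodeToString(sig.sign());
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Returns true only if the signature covers exactly this header and payload. */
    static boolean verify(String token, PublicKey key) {
        String[] parts = token.split("\\.");
        if (parts.length != 3) return false;
        try {
            Signature sig = Signature.getInstance("SHA256withRSA");
            sig.initVerify(key);
            sig.update((parts[0] + "." + parts[1]).getBytes(StandardCharsets.US_ASCII));
            return sig.verify(Base64.getUrlDecoder().decode(parts[2]));
        } catch (GeneralSecurityException | IllegalArgumentException e) {
            return false; // malformed signature or encoding is treated as invalid
        }
    }

    private static String encode(String s) {
        return ENC.encodeToString(s.getBytes(StandardCharsets.UTF_8));
    }
}
```

The sketch checks the signature only. A real verifier would also pin the expected algorithm and validate claims such as issuer, audience, and expiry before trusting the token.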
Case 02 / 04

GKE Cloud-Native Migration

Lead Engineer · UK FTSE-listed bank · 2021 – 2023

"Ten-plus microservices on GKE, 99.9% SLA held, deploy lead time cut by 40%, infra spend down 25%."

Owned service decomposition, NFR definition, and inter-service communication design for the bank's mission-critical GKE migration. Spring Boot services with HPA/VPA tuned to real traffic curves; autoscaling design drove a 25% infrastructure cost reduction.

Spring Boot · GKE · Terraform · Argo CD · Kafka · Datadog · HPA / VPA
Fig. 02 · GKE Workload Topology
[Diagram: GKE cluster, europe-west2, 3 AZ; namespaces trading, risk, platform; order-svc (HPA 2→8), risk-engine (HPA 2→6); kafka, redis, spanner, prometheus, datadog, argo-cd; HPA · VPA · GitOps]
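For readers less familiar with HPA tuning, the scaling rule Kubernetes documents for the Horizontal Pod Autoscaler is desiredReplicas = ceil(currentReplicas × currentMetric / targetMetric), clamped to the configured bounds. The sketch below applies that rule with the 2→8 bounds shown for order-svc. The class name and the 60% CPU target are illustrative assumptions, and the controller's tolerance band and stabilisation window are deliberately left out.

```java
/** Hypothetical sketch of the documented HPA replica calculation. */
final class HpaSketch {

    /**
     * desired = ceil(current * currentMetric / targetMetric), clamped to [min, max].
     * Omits the controller's tolerance and stabilisation behaviour.
     */
    static int desiredReplicas(int current, double currentMetric, double targetMetric,
                               int min, int max) {
        if (current < 1 || targetMetric <= 0 || min > max) {
            throw new IllegalArgumentException("invalid autoscaler inputs");
        }
        int desired = (int) Math.ceil(current * (currentMetric / targetMetric));
        return Math.max(min, Math.min(max, desired));
    }

    public static void main(String[] args) {
        // Assumed 60% CPU target; bounds 2..8 as in the order-svc figure label.
        System.out.println(desiredReplicas(2, 90.0, 60.0, 2, 8));
    }
}
```

Tuning "to real traffic curves" then comes down to choosing the target metric and bounds so that the clamp is rarely hit during normal peaks, which is where the cost saving comes from.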
Case 03 / 04

Core Banking Data Integration

Senior Engineer · UK FTSE-listed bank · 2019 – 2021

"50 million records every day, six microservices, zero critical incidents — and IBM IIB licences cancelled."

Replaced IBM Integration Bus with Spring Integration + Spring Batch on Oracle PL/SQL, eliminating six-figure enterprise licensing. Idempotent batch design, poison-message quarantine, automated regression across the full failure surface.

Spring Batch · Spring Integration · Oracle · PL/SQL · Kafka · JMS
Fig. 03 · Batch ↔ Stream Topology
[Diagram: upstreams (sftp, mq, api); Spring Integration (inbound-adapter, transform, route + enrich, dlq · quarantine); Spring Batch (reader, processor, chunked writer); Oracle PL/SQL; Kafka events; 50M rec/day; idempotent · replay-safe; IBM IIB decommissioned]
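The combination of idempotent batch design and poison-message quarantine can be sketched in plain Java. This is a simplified illustration, not the Spring Batch implementation: `IdempotentBatchSketch` and `Rec` are hypothetical names, and the in-memory set stands in for whatever durable uniqueness check (for example a database key) a real job would use.

```java
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.Consumer;

/** Hypothetical sketch: replay-safe chunk processing with a quarantine for bad records. */
final class IdempotentBatchSketch {
    record Rec(String id, String payload) {}

    // Assumption: in-memory state stands in for a durable store keyed by record id.
    private final Set<String> writtenIds = new HashSet<>();
    private final Map<String, Rec> quarantine = new LinkedHashMap<>();

    /** Processes a chunk; replaying the same chunk has no further effect. */
    void processChunk(List<Rec> chunk, Consumer<Rec> writer) {
        for (Rec rec : chunk) {
            if (writtenIds.contains(rec.id()) || quarantine.containsKey(rec.id())) {
                continue; // already handled on an earlier run
            }
            try {
                writer.accept(rec);
                writtenIds.add(rec.id());
            } catch (RuntimeException poison) {
                // A failing record is set aside so it cannot block the rest of the chunk.
                quarantine.put(rec.id(), rec);
            }
        }
    }

    List<String> quarantinedIds() {
        return List.copyOf(quarantine.keySet());
    }

    int writtenCount() {
        return writtenIds.size();
    }
}
```

Because a record is skipped once it has been written or quarantined, re-running a chunk after a crash is safe, and quarantined records can be inspected and replayed separately.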
Case 04 / 04

Cross-Border Payments · Card Controls

Service Owner · Publicis Sapient · Cards Platform · 2017 – 2019

"Ninety-plus failure scenarios regression-tested before go-live — every path a customer could take to break the system had a pinned test."

End-to-end ownership of the customer-facing card-controls service on a cross-border payments platform. Idempotency keys on every mutation, saga-based compensation on partial failure, full production delivery lifecycle automated through CI/CD. Governance output: org-wide OpenAPI standards and GitOps delivery across 5+ squads (+30% release velocity).

Java 11 · Spring Boot · Kafka · Saga Pattern · PACT · Cucumber · OpenAPI · GitOps
Fig. 04 · Card-Controls Saga
[Diagram: app; gateway (idempotency); card-controls saga orchestrator; saga participants: risk-check, ledger, notify, network, audit, fraud-fb; Kafka saga events; Idempotency-Key; 90+ failure scenarios; compensating transactions]
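"Idempotency keys on every mutation" means a retried request carrying the same key must not apply the change twice. A minimal sketch of that behaviour follows, assuming an in-memory map where a real service would use a shared, durable store with expiry. `IdempotencyKeySketch` is a hypothetical name.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

/** Hypothetical sketch: run a mutation at most once per idempotency key. */
final class IdempotencyKeySketch {
    // Assumption: in-memory map; a real service would persist keys and expire them.
    private final Map<String, String> responsesByKey = new ConcurrentHashMap<>();

    /**
     * Executes the mutation the first time a key is seen and stores its response.
     * Later calls with the same key return the stored response without re-running it.
     */
    String execute(String idempotencyKey, Supplier<String> mutation) {
        return responsesByKey.computeIfAbsent(idempotencyKey, key -> mutation.get());
    }
}
```

With this shape, a client that times out and retries a "freeze card" call with the same key gets the original result back rather than triggering a second mutation.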
Workshop · Open Source

Things I'm building in the open.

@sumitsr/retireiq
Shipped

Bank-grade agentic AI for retirement planning.

Conversational intelligence platform bridging complex financial data with natural-language interaction. Local-first for development, cloud-scale for production. Application Factory pattern, leak-proof PII sanitization, and a specialist-agent core (Knowledge, Portfolio, Transaction) orchestrated by a semantic router.

[Diagram: user; PII sanitiser and re-hydrate proxy; semantic router (intent classify); knowledge, portfolio, and transaction agents; pgvector RAG index; Vertex AI embeddings; Ollama local LLM; Docker Compose · local-first · Makefile]
Python (99.5%) · pgvector · Vertex AI · Ollama · 24 commits
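The sanitise-then-re-hydrate idea can be shown in a few lines: replace PII with opaque placeholders before text reaches a model, then restore the originals in the reply. The project itself is Python, so the Java below is only a language-neutral illustration. `PiiSanitiserSketch` is a hypothetical name, and matching email addresses alone is an assumption made to keep the example short.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical sketch: swap PII for placeholders, then restore it afterwards. */
final class PiiSanitiserSketch {
    // Assumption: only email addresses are treated as PII in this example.
    private static final Pattern EMAIL =
            Pattern.compile("[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\\.[A-Za-z]{2,}");

    private final Map<String, String> vault = new LinkedHashMap<>();

    /** Replaces each match with a placeholder and remembers the original value. */
    String sanitise(String text) {
        Matcher m = EMAIL.matcher(text);
        StringBuilder out = new StringBuilder();
        while (m.find()) {
            String placeholder = "<PII_" + vault.size() + ">";
            vault.put(placeholder, m.group());
            m.appendReplacement(out, Matcher.quoteReplacement(placeholder));
        }
        m.appendTail(out);
        return out.toString();
    }

    /** Puts the original values back into text that contains placeholders. */
    String rehydrate(String text) {
        String result = text;
        for (Map.Entry<String, String> entry : vault.entrySet()) {
            result = result.replace(entry.getKey(), entry.getValue());
        }
        return result;
    }
}
```

The model only ever sees the placeholder, so nothing sensitive leaves the process even if prompts are logged downstream.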
@sumitsr/sentinel
In development

Production-grade quota monitoring for LLM infrastructure.

An AI quota monitor dashboard built to watch real inference usage at scale — token rates, cost curves, rate-limit proximity. Spring Boot backend polling the Anthropic API, React frontend over REST + SSE, terminal-aesthetic dark UI. Honest status: React frontend complete with live gauges; Spring Boot backend wiring in progress.

[Diagram: React UI (:3000; token usage 62%, p95 1.4s) over SSE and REST to Spring Boot (:8080; poller every 30s, metrics-store, alert-engine); Anthropic API rate-limit headers; Postgres time-series; status levels NOMINAL · WARNING · CRITICAL; terminal aesthetic, CRT overlay]
Spring Boot · React · SSE · REST · Terminal UI · CRT overlay
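"Rate-limit proximity" can be reduced to a small classification step: given a quota limit and the amount remaining, decide which of the three status levels applies. The sketch below is illustrative only. `QuotaStatusSketch` is a hypothetical name, and the 70% and 90% thresholds are assumptions, not values taken from the project.

```java
/** Hypothetical sketch: map quota usage onto NOMINAL / WARNING / CRITICAL. */
final class QuotaStatusSketch {
    enum Status { NOMINAL, WARNING, CRITICAL }

    // Assumed thresholds; a real dashboard would make these configurable.
    private static final double WARNING_AT = 0.70;
    private static final double CRITICAL_AT = 0.90;

    /** Fraction of the quota already consumed, between 0.0 and 1.0. */
    static double usedFraction(long limit, long remaining) {
        if (limit <= 0 || remaining < 0 || remaining > limit) {
            throw new IllegalArgumentException("invalid quota values");
        }
        return (limit - remaining) / (double) limit;
    }

    static Status classify(long limit, long remaining) {
        double used = usedFraction(limit, remaining);
        if (used >= CRITICAL_AT) return Status.CRITICAL;
        if (used >= WARNING_AT) return Status.WARNING;
        return Status.NOMINAL;
    }
}
```

A poller would feed this with the limit and remaining values read on each cycle and push any status change to the frontend.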
Career · 2011 — Present

Sixteen years, one trajectory.

2011

SRM Techsol · India

First production systems. Multi-channel marketing platform on MongoDB, Cassandra, Redis. Learned what "at-scale" really means.
2013

Infogain · India

REST/OSB service architecture and Splunk observability. First time owning a subsystem end-to-end rather than shipping features inside one.
2014

Vodafone · India & Germany

Dynamic Dashboard middleware for mobile apps across millions of subscribers. Carrier-volume identity, caching, and A/B framework design.
2015

Publicis Sapient · India

Joined Sapient as Senior Engineer. Cross-border payments, card controls, 90+ automated failure scenarios. The shape of the architect role started to emerge here.
2019 — Present

Publicis Sapient · London

Moved to the London practice. Lead architect on banking-grade identity, GKE migration, governance across 5+ squads. Zero Trust Innovator Award, 2024.
Principle 01
Design for the failure mode, not the happy path.
Ninety-plus error scenarios automated on the cross-border payments card-controls platform. Every path a customer could take to break the system had a pinned test before go-live. The happy path is what customers expect; the failure modes are what they'll remember.
ADR · Compensating transactions over distributed locks
Context
Card-controls mutation touches five downstream services (risk, ledger, notify, network, audit). Strong distributed locking was on the table — operationally expensive, fragile at the boundaries.
Decision
Saga pattern with explicit compensating transactions per participant. Idempotency-Key on every write. Partial failures are modelled, not avoided.
Consequence
Engineers now write the rollback as they write the forward path. 90+ failure scenarios regression-tested. The ops team stopped carrying pagers for this service in month three.
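The decision in this ADR, a forward action paired with an explicit compensation per participant, can be sketched as a tiny orchestrator. This is a simplified illustration with hypothetical names (`SagaSketch`, `Step`). It runs in-process and does not handle a compensation that itself fails, which a production saga would need to retry or escalate.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.List;

/** Hypothetical sketch: run steps in order, compensate in reverse on failure. */
final class SagaSketch {
    interface Step {
        String name();
        void execute();     // forward action; throws RuntimeException on failure
        void compensate();  // undoes a forward action that already completed
    }

    /** Returns true if every step succeeded; otherwise rolls back and returns false. */
    static boolean run(List<Step> steps, List<String> log) {
        Deque<Step> completed = new ArrayDeque<>();
        for (Step step : steps) {
            try {
                step.execute();
                completed.push(step);
                log.add("done:" + step.name());
            } catch (RuntimeException failure) {
                log.add("failed:" + step.name());
                while (!completed.isEmpty()) {
                    Step done = completed.pop(); // most recently completed first
                    done.compensate();
                    log.add("compensated:" + done.name());
                }
                return false;
            }
        }
        return true;
    }

    /** Test helper: a participant whose forward action optionally fails. */
    static Step step(String name, boolean failsForward) {
        return new Step() {
            public String name() { return name; }
            public void execute() {
                if (failsForward) throw new IllegalStateException(name + " unavailable");
            }
            public void compensate() { /* a real participant reverses its side effect here */ }
        };
    }
}
```

If risk and ledger succeed and notify fails, the orchestrator compensates ledger and then risk, and never calls the remaining participants. Writing `compensate()` alongside `execute()` is what makes each partial-failure path something a regression test can pin.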
Principle 02
Prefer boring infrastructure and sharp abstractions.
Replaced IBM Integration Bus with Spring Integration — same job, six-figure licence gone, velocity up. The right abstraction is the one a junior can reason about at 2am; the right platform is the one that doesn't page you at 2am. Novelty is a cost, not a feature.
ADR · Retire IBM IIB, adopt Spring Integration
Context
Legacy IIB deployment was carrying annual licensing into six figures, a bottleneck vendor relationship, and a deploy story that couldn't live alongside GitOps.
Decision
Port integration flows to Spring Integration + Spring Batch on the existing Oracle estate. Same pipelines, in the language the team already owned.
Consequence
Licence line eliminated. 50M records/day continue to flow. Integration logic now lives in the same repos, tested with the same tools, deployed by the same pipelines as the application services.
Principle 03
Governance is velocity, not bureaucracy.
OpenAPI standards plus GitOps across five squads produced a 30% release-velocity gain — not in spite of the rules, but because of them. Good governance is the shared library teams choose to import; bad governance is the committee they have to clear. The architect's job is to ship the former.
ADR · OpenAPI-first contracts, PACT for consumer tests
Context
Five squads, dozens of services, integration defects were surfacing in staging rather than in review.
Decision
OpenAPI as the contract of record; consumer-driven contract tests (PACT) blocking producer merges that break a consumer. GitOps as the only path to prod.
Consequence
Release velocity up 30%. Integration defect class effectively closed at PR time rather than at deploy time. Squads coordinate less and ship more — the correct direction.
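The gating idea behind consumer-driven contracts can be shown without any tooling: each consumer states the fields it relies on, and a producer change is rejected if any of those expectations no longer holds. The sketch below is a conceptual illustration only and does not use or imitate the PACT API. `ContractCheckSketch`, `Expectation`, and the field names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch: check a producer response against consumer expectations. */
final class ContractCheckSketch {
    record Expectation(String consumer, String field, Class<?> type) {}

    /** Returns one message per broken expectation; an empty list means the merge may proceed. */
    static List<String> violations(Map<String, Object> producerResponse,
                                   List<Expectation> expectations) {
        List<String> out = new ArrayList<>();
        for (Expectation e : expectations) {
            Object value = producerResponse.get(e.field());
            if (value == null) {
                out.add(e.consumer() + ": missing field '" + e.field() + "'");
            } else if (!e.type().isInstance(value)) {
                out.add(e.consumer() + ": field '" + e.field() + "' is not "
                        + e.type().getSimpleName());
            }
        }
        return out;
    }
}
```

Run in CI on the producer's pull request, a non-empty result blocks the merge, which is how an integration defect surfaces at review time instead of in staging.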
Stack · Colophon

Tools I reach for, in order.

Languages & Frameworks

  • Java 11 / 17
  • Spring Boot · Batch · Integration
  • gRPC · REST
  • Python
  • PL/SQL

Cloud & Infrastructure

  • GCP · GKE · Pub/Sub · KMS
  • AWS · EKS · EC2 · S3 · IAM
  • Azure
  • Kubernetes · Docker · Helm
  • Terraform

Messaging & APIs

  • Apache Kafka
  • JMS · Pub/Sub
  • OpenAPI · Swagger
  • PACT contract testing
  • Apigee API Gateway

Data & Observability

  • Oracle · PL/SQL
  • Spanner · MongoDB · Cassandra
  • Redis
  • Datadog · Dynatrace
  • Splunk · ELK

Security & Identity

  • OAuth2 · OIDC
  • ForgeRock IAM
  • JWT · mTLS (Istio)
  • Open Policy Agent
  • Zero Trust architecture

Delivery & Testing

  • Jenkins · GitHub Actions
  • Argo CD · GitOps
  • JUnit · Cucumber · RSpec
  • TDD · BDD
  • Scrum · Spotify Squads
Recent Writing

Thinking in public.

Sumit Srivastava
The architect behind the work

I build with care, in public, for as long as it takes.

Sixteen years in, I still think of the job the same way: find the failure mode, design around it, ship the thing, teach the team. I've been lucky enough to do that at scale — for banks, for telcos, for payments — and lucky enough to keep getting harder problems as rewards.

London now; Mumbai and Gurgaon before. The accent has travelled; the method hasn't.

London · UK
CKAD · 2022
Zero Trust Innovator · 2024

Let's architect something.

Book a 30-min intro call