Databricks DAIS 2026 Hackathon
Real World Problem · Real Data · Real ML

The Gap Between
Deal and Delivery

"A deal closes in Salesforce on Friday. Delivery stalls in SAP on Monday. The customer churns by next quarter — and nobody saw it coming."

Deal2Delivery was built to close that gap. Permanently.

11 Problems. 11 Databricks Solutions.

Every problem we identified has a direct Databricks feature behind its solution.

01

Siloed CRM & ERP Data

The Problem

Sales teams live in Salesforce. Operations live in SAP. Neither system sees the other — causing mis-forecasted demand, stock shortages, and revenue leakage from deals that never get fulfilled.

Databricks Solution

Databricks Lakehouse ingests and joins both sources into a single Delta Lake. Unity Catalog governs access across dev, staging, and prod with fine-grained permissions.

Delta Lake, Unity Catalog, Lakeflow DLT

02

Reactive Churn Management

The Problem

Sales reps discover a customer has churned only after they've gone silent for months. There's no early warning system — churn signals from support cases, sentiment, and order behaviour are invisible.

Databricks Solution

XGBoost churn model trained on a composite behavioral label (inactivity + case burden + sentiment + revenue decline). Optuna-tuned with 5-fold CV. Predictions stored in Unity Catalog for governed consumption.

MLflow, Unity Catalog, XGBoost + Optuna
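A minimal sketch of how a composite behavioural label like the one above could be derived. The four signals come from the description; the field names, the thresholds, and the "at least two signals" rule are illustrative assumptions, not the project's actual definition.

```python
from dataclasses import dataclass


@dataclass
class CustomerSnapshot:
    """Hypothetical per-customer inputs; field names are assumptions."""
    days_since_last_order: int   # inactivity
    open_support_cases: int      # case burden
    avg_sentiment: float         # -1.0 (negative) to 1.0 (positive)
    revenue_change_pct: float    # period-over-period change, e.g. -30.0


def churn_label(c: CustomerSnapshot, min_signals: int = 2) -> int:
    """Return 1 if at least `min_signals` of the four signals fire, else 0."""
    signals = [
        c.days_since_last_order > 90,   # assumed inactivity threshold
        c.open_support_cases >= 3,      # assumed case-burden threshold
        c.avg_sentiment < 0.0,          # assumed sentiment threshold
        c.revenue_change_pct <= -25.0,  # assumed revenue-decline threshold
    ]
    return int(sum(signals) >= min_signals)
```

Requiring more than one signal keeps a single noisy input, such as one quiet quarter, from labelling a healthy account as churned.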

03

Gut-feel Demand Planning

The Problem

Demand plans are built on intuition and last month's spreadsheet. SKU-level trends, seasonality, and customer buying cycles are buried in SAP tables nobody queries.

Databricks Solution

XGBoost demand forecast model trained on 24 months of SAP order data with lag features (prev month, rolling 3 & 6 month avg). 6-month forward predictions written to a gold table and surfaced in the app.

MLflow, XGBoost, Gold Delta Tables
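The lag features named above (previous month, rolling 3-month and 6-month averages) can be sketched in plain Python. The function name and the row layout are assumptions for illustration; the real pipeline presumably builds these per SKU in Spark or SQL.

```python
from statistics import mean


def lag_features(monthly_units: list[float]) -> list[dict]:
    """Build one feature row per month that has at least 6 months of history."""
    rows = []
    for i in range(6, len(monthly_units)):
        history = monthly_units[:i]  # strictly before month i, so no target leakage
        rows.append({
            "prev_month": history[-1],
            "rolling_3m_avg": mean(history[-3:]),
            "rolling_6m_avg": mean(history[-6:]),
            "target": monthly_units[i],  # units sold in month i
        })
    return rows
```

With 24 months of history this yields 18 training rows per series, because the first six months only serve as lookback.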

04

AI Inference is a Black Box

The Problem

When an ML model flags a customer as high-risk, nobody can explain why. Trust in AI predictions is low because the chain from raw data to decision is invisible.

Databricks Solution

MLflow Tracing logs every customer scoring as a trace with nested spans — model input, churn probability, and a Databricks Foundation Model LLM explanation. The Traces tab shows the full chain per customer.

MLflow Tracing, Databricks Foundation Models, SpanType.CHAIN

05

Insights Locked Inside Databricks

The Problem

Business stakeholders — sales managers, account execs, finance — don't have Databricks access. Insights stay in internal dashboards that only data engineers can open.

Databricks Solution

A Next.js app on Vercel queries the Databricks SQL REST API from server-side API routes. Stakeholders get a public URL with curated KPIs, ML predictions, and OpenAI GPT-4o explanations, with no Databricks login required.

Databricks SQL REST API, SQL Result Cache, Next.js ISR
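As a rough illustration of the server-side call, the sketch below builds (but does not send) a request to the Databricks SQL Statement Execution endpoint. The app itself is described as Next.js; Python is used here only for illustration. The host, token, and warehouse ID are placeholders, and the helper name is hypothetical.

```python
import json
import urllib.request


def build_statement_request(host: str, token: str, warehouse_id: str,
                            statement: str) -> urllib.request.Request:
    """Build a POST request for the SQL Statement Execution API without sending it."""
    payload = {
        "warehouse_id": warehouse_id,
        "statement": statement,
        "wait_timeout": "30s",  # wait synchronously for up to 30 seconds
    }
    return urllib.request.Request(
        url=f"https://{host}/api/2.0/sql/statements",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Keeping the token in a server-side route is what lets stakeholders use a public URL without ever holding Databricks credentials themselves.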

06

Non-Technical Users Can't Query Data

The Problem

Business questions like 'Which healthcare customers haven't ordered in 60 days?' mean filing a Jira ticket, waiting for a data analyst to write the SQL, and getting an answer a week later.

Databricks Solution

Databricks Genie AI/BI space backed by all 8 gold views. Users type natural language — Genie generates and runs SQL instantly. A self-improving LLM-as-a-Judge evaluation loop keeps answer quality high using Claude Opus.

Genie AI/BI, LLM-as-a-Judge, Claude Opus

07

Every Query Hits the Warehouse

The Problem

Without caching, every page load or tab switch sends a new query to the SQL warehouse — adding seconds of latency and burning unnecessary DBUs on repeated identical queries.

Databricks Solution

Two-layer cache: Next.js ISR (5-min TTL at Vercel CDN) + Databricks SQL Warehouse Result Cache (24h TTL). Tab switching is instant after first load. Repeated identical queries cost zero additional DBUs.

SQL Result Cache, Delta Cache, Next.js ISR
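The two-layer idea can be modelled with a pair of TTL caches, where the outer layer falls through to the inner one on a miss. This is a simplified in-memory sketch, not how Next.js ISR or the warehouse result cache are implemented; the class and function names are hypothetical, and only the two TTL values come from the description above.

```python
import time
from typing import Callable


class TTLCache:
    """Tiny in-memory cache whose entries expire after `ttl_seconds`."""

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self._store: dict[str, tuple[float, object]] = {}

    def get_or_compute(self, key: str, compute: Callable[[], object]) -> object:
        now = self.clock()
        entry = self._store.get(key)
        if entry is not None and now - entry[0] < self.ttl:
            return entry[1]          # fresh hit
        value = compute()            # miss or expired: ask the next layer
        self._store[key] = (now, value)
        return value


def make_layered_fetch(run_on_warehouse: Callable[[str], object],
                       clock: Callable[[], float] = time.monotonic):
    edge = TTLCache(5 * 60, clock)           # stands in for Next.js ISR (5 min)
    result = TTLCache(24 * 60 * 60, clock)   # stands in for the result cache (24 h)

    def fetch(sql: str) -> object:
        return edge.get_or_compute(
            sql,
            lambda: result.get_or_compute(sql, lambda: run_on_warehouse(sql)),
        )

    return fetch
```

In this model a page reload within five minutes never leaves the edge layer, and a reload within 24 hours reaches the second layer but still avoids recomputing the query.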

08

No Model Governance or Lineage

The Problem

ML models trained in ad-hoc notebooks are deployed without version control. Nobody knows which model version is in production, what data it was trained on, or how its accuracy has changed over time.

Databricks Solution

Every training run is tracked in MLflow with full metrics, parameters, and artifacts. Models are registered in Unity Catalog with the @champion alias pattern — production always uses the best validated version.

MLflow Experiments, Unity Catalog Registry, @champion alias

09

No Inventory Visibility Against Forecast

The Problem

Demand is forecasted in isolation — nobody can see which SKUs are critically understocked relative to the 6-month ML-predicted demand until purchase orders are already overdue.

Databricks Solution

SAP MARD stock data is ingested to a bronze Delta table, joined with ML forecast predictions in a new gold view, and classified as Critical / Warning / OK per SKU. Surfaced in a dedicated Inventory page.

SAP MARD Bronze, gold_demand_vs_supply_gap, Delta Lake
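One plausible way to derive the Critical / Warning / OK status is to compare stock on hand with the 6-month forecast as a coverage ratio. The thresholds below (under 50% is Critical, under 100% is Warning) and the function name are assumptions; the source does not state the actual cut-offs used in the gold view.

```python
def classify_supply_gap(stock_on_hand: float, forecast_6m_units: float,
                        critical_below: float = 0.5,
                        warning_below: float = 1.0) -> str:
    """Classify a SKU by how much of its 6-month forecast current stock covers."""
    if forecast_6m_units <= 0:
        return "OK"  # no predicted demand, so nothing to cover
    coverage = stock_on_hand / forecast_6m_units
    if coverage < critical_below:   # assumed threshold
        return "Critical"
    if coverage < warning_below:    # assumed threshold
        return "Warning"
    return "OK"
```

For example, 12 units in stock against a forecast of 80 gives a coverage of 0.15, which this sketch labels Critical.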

10

Static Demand Plans — No Scenario Modelling

The Problem

Sales managers can't answer 'what if we run a 20% promo on Cloud products next quarter?' Demand planning is a fixed output with no interactivity — scenarios exist only in spreadsheets.

Databricks Solution

Scenario Simulator page lets users pick a product category and apply a % adjustment (−50% to +100%) against the XGBoost forecast in real time. Shows unit delta and estimated revenue impact instantly via SQL multiplier.

demand_forecast_predictions, SQL REST API, Next.js interactive UI
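The multiplier arithmetic behind the simulator is simple enough to show directly. This sketch assumes a single average unit price per category to estimate revenue impact; that parameter and the function name are illustrative, while the adjustment range matches the one described above.

```python
def simulate_scenario(forecast_units: float, adjustment_pct: float,
                      avg_unit_price: float) -> dict:
    """Apply a percentage adjustment to a forecast and report the deltas."""
    if not -50.0 <= adjustment_pct <= 100.0:
        raise ValueError("adjustment_pct must be between -50 and 100")
    multiplier = 1.0 + adjustment_pct / 100.0
    adjusted_units = forecast_units * multiplier
    unit_delta = adjusted_units - forecast_units
    return {
        "adjusted_units": adjusted_units,
        "unit_delta": unit_delta,
        # assumes one average price for the whole category
        "revenue_impact": unit_delta * avg_unit_price,
    }
```

A 20% uplift on a 1,000-unit forecast at an average price of $50 works out to 200 extra units and roughly $10,000 of additional revenue.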

11

All Customers Look the Same

The Problem

High/Medium/Low churn tiers reduce every customer to a single risk score. Upsell opportunities, win-back plays, and loyalty investment decisions are invisible because there's no richer segmentation.

Databricks Solution

K-Means RFM clustering groups customers into Champions, Loyal, At-Risk, Hibernating, and Prospects. Silhouette score and inertia tracked in MLflow. Segments appear as filters and colour-coded badges in the Customer Risk view.

K-Means · scikit-learn, MLflow tracking, customer_segments table
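Before any clustering, each customer needs a recency, frequency, and monetary (RFM) vector. The sketch below computes those three values from a flat list of orders; the tuple layout and function name are assumptions. The resulting vectors would typically be scaled and then passed to a K-Means implementation such as scikit-learn's, which is not shown here.

```python
from datetime import date


def rfm_features(orders: list[tuple[str, date, float]],
                 as_of: date) -> dict[str, tuple[int, int, float]]:
    """Map customer_id -> (recency_days, frequency, monetary)."""
    last_order: dict[str, date] = {}
    frequency: dict[str, int] = {}
    monetary: dict[str, float] = {}
    for customer_id, order_date, amount in orders:
        prev = last_order.get(customer_id)
        if prev is None or order_date > prev:
            last_order[customer_id] = order_date  # keep the most recent order date
        frequency[customer_id] = frequency.get(customer_id, 0) + 1
        monetary[customer_id] = monetary.get(customer_id, 0.0) + amount
    return {
        cid: ((as_of - last_order[cid]).days, frequency[cid], monetary[cid])
        for cid in last_order
    }
```

Low recency with high frequency and spend would point toward a Champions-style segment, while high recency with low frequency suggests Hibernating; the actual segment assignment is left to the clustering step.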

Meet NovaTech Electronics

A company that looks successful on paper — and is quietly losing customers it never noticed leaving.

70

Enterprise clients

$9.2M

Annual revenue

12

Electronics SKUs

24

Months of history

~18%

Silent churn rate

3

Systems, 0 bridges

NovaTech Electronics sells computing devices, mobile hardware, and accessories to 70 enterprise clients across tech, finance, and professional services. Their account managers live in Salesforce CRM. Warehouse stock, order fulfilment, and procurement run in SAP HANA. These two systems have never spoken to each other — and that silence is costing the business in ways nobody can even measure, because the data is split across systems that refuse to meet.

The Silence Between Systems

Each system knows half the truth. Neither knows enough to act.

Salesforce CRM

The customer-facing half

Knows

Who bought what and when
Open opportunities & pipeline
Support cases & sentiment scores
Customer lifetime value estimates

Blind to

Is the product actually in stock?
Has fulfilment already failed?
What's the procurement lead time?
Are we over-committed on inventory?

SAP HANA ERP

The operations half

Knows

Warehouse stock levels per SKU
Order execution & fulfilment status
Procurement orders & lead times
Product-level demand history

Blind to

Is this customer about to churn?
What deals are closing this quarter?
Which accounts are high risk?
Why did revenue drop last month?

The result: stock shortages appear only after deals fail. Churn is discovered only after customers go quiet. Demand plans are built on intuition. And when the AI flags a risk, nobody can explain why — so nobody acts. This is the deal-to-delivery gap.

What the Gap Costs, Week by Week

Four real scenarios from NovaTech's operations — none of them visible without unified data.

01

The Invisible Stockout

Monday, 9 AM. A sales rep closes a $200K laptop order in Salesforce.

The warehouse in SAP has 12 units. The order is for 80. Nobody knows this until the customer calls three weeks later asking where their devices are.

Fulfilment failure, expediting fees, damaged relationship.

02

The Silent Churn

A top-tier client hasn't placed an order in 94 days.

Their SAP interaction log shows 4 unresolved support tickets and a billing dispute. Their Salesforce account shows 'Active'. The account manager has no idea.

Full account lost. $180K ARR gone before anyone noticed.

03

The Gut-Feel Forecast

Q4 demand planning meeting. The ops team asks: what do we order?

SKU-level trend data sits in SAP tables. The analyst who knows the queries is on leave. The plan is based on last year's spreadsheet and instinct.

Over-stock on slow-moving accessories. Stockout on flagship laptops.

04

The Unexplained Flag

The ML model flags a customer as 87% churn risk.

The sales manager asks: why? The data scientist says: it's a combination of 12 features across inactivity, sentiment, and case burden. The manager doesn't act.

High-risk customer churns. The model was right. No one trusted it.

How Deal2Delivery Solves It

One lakehouse. Both systems. Every insight. No excuses.

Unified data

SAP HANA + Salesforce CRM ingested into a single Delta Lakehouse — one source of truth, governed by Unity Catalog.

Proactive churn detection

XGBoost model trained on 30+ behavioural signals flags at-risk customers weeks before they go silent.

Inventory intelligence

SAP MARD stock joined with 6-month ML demand forecast. Critical gaps visible before the sales call is even made.

Explainable AI

Every churn flag comes with a GPT-4o explanation in plain English. Sales reps trust it because they understand it.

Ready to see the live data?

Every chart and ML prediction below is live from Databricks.