Query Profiling & Spark UI for Databricks SQL
Even well-written queries can perform poorly, and without profiling you won't know why.
Databricks provides powerful tools to analyze SQL query execution, identify bottlenecks, and optimize performance, all through query profiling and the Spark UI.
This guide shows how to profile queries, interpret Spark UI metrics, and apply optimization techniques, illustrated with real-world examples.
A Real-World Story
Meet Sneha, a BI engineer.
- Dashboard queries started running slowly
- SQL queries scanned more data than necessary
- Cost and latency increased unexpectedly
By learning query profiling and the Spark UI, Sneha identifies the slow stages, optimizes her filters, and cuts query time in half.
1. Understanding Query Profiling
Query profiling provides:
- Execution time per query stage
- Bytes scanned and data shuffle sizes
- Resource utilization metrics
Benefits:
- Detects bottlenecks
- Highlights inefficient scans
- Guides optimization for SQL warehouses
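You can get a first look at these stages before a query even runs. A minimal sketch using Spark SQL's EXPLAIN command (sales_orders is the example table used throughout this guide):

-- Show the optimized physical plan, including scans and shuffle (Exchange) operators
EXPLAIN FORMATTED
SELECT customer_id, SUM(amount) AS total_spent
FROM sales_orders
GROUP BY customer_id;

Each Exchange node in the plan corresponds to a shuffle stage you will later see in the query profile and the Spark UI.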
2. Using Spark UI with Databricks SQL
Even with serverless compute or SQL warehouses, the Spark UI exposes execution details:
- Stages and tasks: see which steps are slow
- Shuffle read/write: detect expensive operations
- Skewed partitions: find uneven data distribution
Accessing Spark UI
- Run a SQL query
- Click Query Details → Spark UI
- Explore DAG visualization, task metrics, and SQL metrics
3. Key Metrics to Monitor
| Metric | What it Shows | Why it Matters |
|---|---|---|
| Task duration | Time per task | Identify slow stages |
| Shuffle read/write | Data movement | High shuffles = expensive queries |
| Input rows scanned | Rows read from storage | Helps reduce unnecessary scans |
| Spill to disk | Memory overflow | Optimize caching or partitioning |
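Spill usually means individual shuffle partitions are too large for the memory available to a task. On classic clusters you can experiment with shuffle parallelism; a sketch (the value 400 is an arbitrary assumption, and SQL warehouses manage this setting for you):

-- Spread shuffle data across more, smaller tasks to reduce spilling (clusters only)
SET spark.sql.shuffle.partitions = 400;

On recent runtimes, adaptive query execution coalesces shuffle partitions automatically, so tune this only when the Spark UI shows persistent spill.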
4. Query Optimization Tips
a) Reduce Data Scanned
- Use partition filters
- Select only required columns
SELECT order_id, amount
FROM sales_orders
WHERE order_date >= '2024-01-01';
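Note that the order_date predicate prunes whole partitions only if sales_orders is partitioned by that column; on Delta tables, per-file statistics can still skip files even without partitioning.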
b) Use Caching for Frequent Queries
CACHE TABLE silver_orders_summary;
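Note: CACHE TABLE is a Spark SQL command for notebooks and jobs on Databricks Runtime clusters. SQL warehouses instead rely on automatic result and local disk caching, so repeated dashboard queries are typically served from cache without any explicit command.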
c) Optimize Shuffles
- Repartition large tables carefully
- Avoid cross joins on huge datasets; when one side of a join is small, broadcasting it (sketched below) avoids the shuffle entirely
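A minimal sketch of that broadcast hint, using standard Spark SQL hint syntax (dim_customers is a hypothetical small dimension table):

-- Ship the small dimension table to every node instead of shuffling sales_orders
SELECT /*+ BROADCAST(c) */
       o.order_id, o.amount, c.region
FROM sales_orders o
JOIN dim_customers c
  ON o.customer_id = c.customer_id;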
d) Monitor Skew
- Identify skewed partitions in Spark UI
- Apply salting or repartitioning (see the salting sketch below)
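Salting spreads a hot key across several artificial sub-keys so that no single task receives all of its rows. A minimal two-phase aggregation sketch (the 16-way salt factor is an arbitrary assumption):

-- Phase 1: split each customer_id across 16 salted sub-keys and pre-aggregate
WITH salted AS (
  SELECT customer_id,
         CAST(FLOOR(RAND() * 16) AS INT) AS salt,
         amount
  FROM sales_orders
),
partial AS (
  SELECT customer_id, salt, SUM(amount) AS partial_spent
  FROM salted
  GROUP BY customer_id, salt
)
-- Phase 2: combine the partial sums, now at most 16 rows per customer
SELECT customer_id, SUM(partial_spent) AS total_spent
FROM partial
GROUP BY customer_id;

Salting matters most for skewed joins; on recent runtimes, adaptive query execution can also split skewed join partitions automatically.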
5. Using Query History for Profiling
- Access Databricks SQL Query History
- Analyze execution time, scanned bytes, and cluster usage
- Identify frequently slow queries for targeted optimization
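Query history can also be inspected with SQL. A sketch against the system.query.history system table (availability and exact column names depend on your workspace's system-tables release, so treat the fields below as assumptions):

-- Hypothetical: surface the slowest statements from the last day
SELECT statement_text,
       total_duration_ms,
       read_bytes
FROM system.query.history
WHERE start_time >= current_timestamp() - INTERVAL 1 DAY
ORDER BY total_duration_ms DESC
LIMIT 20;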
Input & Output Example
Input Query
SELECT customer_id, SUM(amount) AS total_spent
FROM sales_orders
GROUP BY customer_id;
Profiling Findings
- 80% of time spent in shuffle
- Some partitions had very high task duration
Optimization
- Partitioned the table by customer_id
- Cached intermediate results (sketched below)
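A sketch of the rewrite (the new table names are hypothetical, and partitioning by a high-cardinality key like customer_id can create many small files, so test before adopting it broadly):

-- Co-locate rows for the same customer so the aggregation shuffles less data
CREATE TABLE sales_orders_by_customer
USING DELTA
PARTITIONED BY (customer_id)
AS SELECT * FROM sales_orders;

-- Materialize the aggregate the dashboard hits repeatedly
CREATE TABLE customer_spend_summary AS
SELECT customer_id, SUM(amount) AS total_spent
FROM sales_orders_by_customer
GROUP BY customer_id;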
Result
- Query runtime reduced from 120s → 45s
- Shuffle bytes reduced by 60%
Summary
Query profiling and Spark UI are critical tools for Databricks SQL performance.
Key takeaways:
- Always profile queries to detect bottlenecks
- Use Spark UI to visualize execution, shuffle, and skew
- Reduce data scanned with partition filters and column pruning
- Cache frequently used intermediate tables
- Monitor Query History to identify slow queries
Following these practices ensures fast, cost-efficient, and reliable SQL analytics in Databricks.
📌 Next Article in This Series: Unity Catalog — Central Governance Explained