Databricks: Control Plane vs Data Plane

Control Plane

The Control Plane is where Databricks manages and orchestrates your workspace and infrastructure.
It contains all the metadata and configuration required to run workloads.

Key responsibilities:

  • Stores notebooks, jobs, cluster configurations, and workspace settings.
  • Handles authentication, authorization, and user management.
  • Manages job scheduling, monitoring, and logs.
  • Orchestrates cluster creation in the Data Plane, but does not store the business data held in your cloud storage.

Essentially, the Control Plane is Databricks-managed and ensures your workspace runs smoothly, without hosting your business data.
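To illustrate that a cluster configuration is only metadata handed to the Control Plane, here is a minimal sketch that builds (but does not send) a cluster-creation request. It assumes the Clusters REST endpoint /api/2.0/clusters/create; the workspace URL, token, runtime label, and node type are placeholder values chosen for illustration, not values from this document.

```python
import json
import urllib.request

# Hypothetical workspace URL and token: replace with real values before sending.
WORKSPACE_URL = "https://example-workspace.cloud.databricks.com"
TOKEN = "dapi-EXAMPLE"


def build_cluster_request(name: str, num_workers: int) -> urllib.request.Request:
    """Build, without sending, a cluster-creation request for the Control Plane."""
    payload = {
        "cluster_name": name,
        "spark_version": "13.3.x-scala2.12",  # assumed runtime label
        "node_type_id": "i3.xlarge",          # assumed AWS node type
        "num_workers": num_workers,
    }
    return urllib.request.Request(
        url=f"{WORKSPACE_URL}/api/2.0/clusters/create",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {TOKEN}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_cluster_request("demo-cluster", 2)
print(request.full_url)
print(request.data.decode("utf-8"))
```

The payload describes what to run and on what hardware. It contains no business data, which is the kind of information the Control Plane holds.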

Data Plane

The Data Plane is where your data is processed and stored.
It is typically located within your own cloud account (Azure, AWS, or GCP), which provides data isolation and security.

Key responsibilities:

  • Executes Spark jobs, notebooks, and SQL queries.
  • Reads and writes data, such as Delta Lake tables, in ADLS, S3, GCS, or other storage.
  • Performs data transformations and ML workloads.
  • Keeps data within your organization’s security boundary.

The Data Plane is customer-controlled, which supports compliance and governance requirements, especially in regulated industries.
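The split can be pictured with a toy model in plain Python. This is only a sketch and not Databricks code: JobDefinition, run_in_data_plane, the bucket path, and the sample rows are all hypothetical. The job definition carries a reference to the data, while the rows themselves live in, and are processed from, customer storage.

```python
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class JobDefinition:
    """Metadata of the kind the Control Plane keeps (all values hypothetical)."""
    name: str
    notebook_path: str
    input_path: str  # a reference to the data, not the data itself


def run_in_data_plane(job: JobDefinition, storage: dict) -> int:
    """Stand-in for cluster execution: resolve the path and process the rows."""
    rows = storage[job.input_path]
    return sum(row["amount"] for row in rows)


# Stand-in for customer-owned cloud storage (hypothetical bucket and rows).
customer_storage = {
    "s3://example-bucket/sales/": [{"amount": 10}, {"amount": 32}],
}

job = JobDefinition(
    name="daily-sales",
    notebook_path="/Repos/example/daily_sales",
    input_path="s3://example-bucket/sales/",
)

control_plane_record = asdict(job)                # only names and paths
total = run_in_data_plane(job, customer_storage)  # rows are read here
print(control_plane_record)
print(total)
```

In this model the Control Plane record holds only names and paths, and the row values appear only where the job executes.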

Summary

Plane          Managed by   Contains                             Key Purpose
Control Plane  Databricks   Metadata, notebooks, job configs     Management & Orchestration
Data Plane     Customer     Actual data, execution environment   Processing & Storage