Announced May 2025: Dataproc Serverless is now Google Cloud Serverless for Apache Spark

Google Cloud Serverless for Apache Spark

The new way for Apache Spark from development to production

On-demand Spark: quick startup, zero ops, improved query performance, and Gemini-powered productivity. Up to 60% lower TCO for Spark workloads.

Apache Spark is a trademark of The Apache Software Foundation.


Features

On-demand Spark: Focus on your code, not clusters

Eliminate the complexities of cluster management and avoid paying for idle, underutilized resources. Google Cloud Serverless for Apache Spark offers quick VM startup and dynamic autoscaling for your interactive, batch, and AI workloads. Spend your time building features, not managing infrastructure. There are no charges during VM startup and shutdown.

Boost performance with Lightning Engine

Experience industry-leading price-performance. Google Cloud Serverless for Apache Spark is powered by Lightning Engine, our next-generation native query engine, now in Preview. Lightning Engine runs Spark queries and data processing over 3.6x faster** than open source Apache Spark through advanced vectorized execution, built-in intelligent caching, and optimized storage I/O, helping you get insights faster and reduce costs.

** The queries are derived from the TPC-DS standard and TPC-H standard and as such are not comparable to published TPC-DS standard and TPC-H standard results, as these runs do not comply with all requirements of the TPC-DS standard and TPC-H standard specification.

Enterprise-ready security and configurations

Run your production Spark workloads with confidence. Google Cloud Serverless for Apache Spark optimizes resources, provides job isolation, and supports Google Cloud’s enterprise security capabilities (including VPC-SC, CMEK, personal authentication, and custom organization policies). It ensures a secure execution environment with capabilities like secure subnets, encryption by default for data at rest and in transit, and no direct VM or root access, minimizing your operational security burden. While built for automation, expert users retain full access to Spark configurations for fine-grained control.

Gemini-powered productivity at every step

Infuse generative AI into your Spark development life cycle. Leverage Gemini in notebooks to generate PySpark code that is aware of the context of your data. Get AI-assisted troubleshooting recommendations with Gemini Cloud Assist Investigate to quickly resolve issues, gain deeper operational insights, and optimize performance.

Easy distributed AI/ML

Seamlessly run distributed training or batch inference workloads. Google Cloud Serverless for Apache Spark offers built-in support for GPU acceleration and comes with pre-packaged popular ML libraries like XGBoost, PyTorch, and Transformers. This leads to significantly faster startup times for AI/ML environments and improves reliability since the images are Google-certified.

Open, flexible, and interoperable

Maintain full flexibility. Google Cloud Serverless for Apache Spark is fully OSS-compatible, so you can bring your existing Spark code and libraries without modification. Develop in your language of choice (Python, Java, Scala, R) using your preferred IDE (BigQuery Studio, Vertex AI Workbench, Jupyter, VS Code) and orchestrate with tools like Apache Airflow/Cloud Composer or BigQuery pipelines. Process data in both Google-native and open source formats, such as Apache Iceberg.

Unified BigQuery experience

Experience the power of Apache Spark directly within BigQuery. Write and run PySpark code alongside SQL in unified Colab Enterprise notebooks, with common metadata through BigLake Metastore, shared security, and consistent governance through Dataplex Universal Catalog.

How It Works

Effortless Spark from idea to production

Common Uses

Serverless pipelines

Lightning-fast serverless ETL/ELT

Rapidly ingest, transform, and load massive datasets from diverse sources into BigQuery or Google Cloud Storage. With the unmatched performance of the Lightning Engine and zero operational burden, streamline your data pipelines and ensure fresh data for analytics.

Interactive data science and analytics

Interactive analytics and rapid prototyping

Empower your data scientists and analysts with a flexible, high-performance serverless Spark environment. Whether you're performing ad hoc data exploration, rapid prototyping, or building sophisticated machine learning models, Google Cloud Serverless for Apache Spark provides the speed and tools you need. Develop PySpark and SQL code in BigQuery Studio for a unified experience, or connect from your preferred tools like Jupyter notebooks and VS Code with Google Cloud extensions. Leverage Gemini for code assistance and troubleshooting, the Lightning Engine for rapid query results, and Vertex AI integration for MLOps. From quick data discovery to training complex models with GPUs and pre-packaged libraries, accelerate your entire data science life cycle.

Pricing

Transparent, value-driven pricing

Google Cloud Serverless for Apache Spark pricing is based on per-second usage of compute (DCUs), GPUs, and shuffle storage.

Services and usage         Subscription type    Price (USD)
Data Compute Unit (DCU)    Standard             Starting at $0.06 per hour
Data Compute Unit (DCU)    Premium              Starting at $0.089 per hour
Shuffle storage            Standard             Starting at $0.04 per GB/month
Shuffle storage            Premium              Starting at $0.1 per GB/month
Accelerator pricing        A100 40 GB           Starting at $3.52069 per hour
Accelerator pricing        A100 80 GB           Starting at $4.713696 per hour
Accelerator pricing        L4                   Starting at $0.672048 per hour

View pricing details for Google Cloud Serverless for Apache Spark.
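
Because usage is metered per second while the rates are quoted per hour or per GB/month, a rough estimate for a single job takes a little arithmetic. The sketch below is illustrative only: it uses the "starting at" rates listed above, and the function name `estimate_job_cost`, the 30-day month used to prorate shuffle storage, and the single `tier` setting applied to both compute and shuffle storage are assumptions made for this example. It ignores regional rates, minimum charges, and how DCUs are derived from a workload's resources, so use the pricing details and the pricing calculator for real numbers.

```python
# Rough per-job cost estimate from the "starting at" rates (USD).
# All names here are hypothetical; this is not an official pricing formula.
DCU_RATE_PER_HOUR = {"standard": 0.06, "premium": 0.089}
SHUFFLE_RATE_PER_GB_MONTH = {"standard": 0.04, "premium": 0.1}
GPU_RATE_PER_HOUR = {"a100-40gb": 3.52069, "a100-80gb": 4.713696, "l4": 0.672048}

SECONDS_PER_HOUR = 3600
# Assumption: prorate monthly shuffle storage over a 30-day month.
SECONDS_PER_MONTH = 30 * 24 * 3600


def estimate_job_cost(runtime_seconds, dcus, tier="standard",
                      shuffle_gb=0.0, gpu_type=None, gpu_count=0):
    """Return a cost breakdown for one job billed per second."""
    hours = runtime_seconds / SECONDS_PER_HOUR
    compute = dcus * hours * DCU_RATE_PER_HOUR[tier]
    shuffle = (shuffle_gb * (runtime_seconds / SECONDS_PER_MONTH)
               * SHUFFLE_RATE_PER_GB_MONTH[tier])
    gpu = gpu_count * hours * GPU_RATE_PER_HOUR[gpu_type] if gpu_type else 0.0
    return {"compute": compute, "shuffle": shuffle, "gpu": gpu,
            "total": compute + shuffle + gpu}


if __name__ == "__main__":
    # A 15-minute job on 8 DCUs with 100 GB of shuffle data, Standard tier.
    # Compute portion: 8 DCUs x 0.25 h x $0.06 = $0.12.
    for item, usd in estimate_job_cost(900, dcus=8, shuffle_gb=100).items():
        print(f"{item}: ${usd:.4f}")
```

For a short job, compute dominates and shuffle storage contributes only a fraction of a cent under these assumptions; adding `gpu_type="l4", gpu_count=1` shows how quickly accelerator time changes the total.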

Pricing calculator

Calculate your monthly costs by region.

Custom quote

Connect with our sales team to get a custom quote for your organization.
