BQ ML Pricing

by: CloudBolt / January 26, 2024

BQ ML is a BigQuery feature that lets you create, test, and tune Machine Learning (ML) models using standard SQL queries. It supports widely used models such as linear regression for forecasting, K-means clustering for data segmentation, and Deep Neural Networks. Google Cloud’s AutoML and Vertex AI can create the same models. However, BQ ML is the typical choice when the training data already resides in BigQuery or when Data Analysts prefer a self-service approach to building models.

BQ ML democratizes ML by allowing Data Analysts without extensive training in Data Science to create and use models based on business data. Development time is also significantly reduced because the data used for model creation already exists within the same data warehouse.

For these reasons, many organizations migrate their analytic workloads exclusively to BigQuery. However, the most critical question remains: What will it cost?

There are two usage patterns: on-demand pricing and flat-rate pricing. Understanding these options is fundamental to creating a well-planned BQ ML cost strategy. Other aspects of Machine Learning Operations (MLOps), such as hyperparameter tuning, model deployment, evaluation, inference, and feature processing, can be conducted within BQ ML and are charged as queries relative to the storage used and the amount of data processed.

Executive Summary

This article will explain the following concepts, which we have summarized here for your easy reference:

Pricing Structure	Description
Free tier	The following parameters have free usage limits: storage, data processed, data inserted, and data processed by CREATE MODEL queries. Once the free usage limits have been surpassed, the user will start to incur costs according to on-demand pricing.
On-demand pricing	On-demand pricing bills each operation as it runs, with no commitment or discount. Processes such as model creation, evaluation, inspection, and prediction incur costs when run ad hoc. Additionally, costs differ based on the model type. Some examples of model types include logistic regression, K-means clustering, and DNN.
Flat-rate pricing	Flat-rate pricing is the primary savings vehicle used for BigQuery workloads. It works by purchasing commitments and assigning those commitments to Google Cloud projects. A commitment pertains to dedicated BigQuery slots. Slots describe a unit of query processing.

BQ ML pricing

Pricing usage patterns can be categorized into:

On-demand pricing
Flat-rate pricing

Note that BQ ML also has a free tier: usage up to certain monthly limits is not charged. On-demand pricing bills queries as-is with no special discounts. Flat-rate pricing can save money when the number of monthly queries is predictable. Separately, there are two model types:

Built-in models
External models

A built-in model is trained within BigQuery, while an external model uses other Google Cloud services such as Vertex AI and AutoML. The pricing for built-in models depends on the model and operation type. External models are also priced by model and operation type, plus the cost of any Google Cloud services used outside of BQ ML, such as Vertex AI and AutoML.

Free tier

There are four parameters of usage:

Storage
Model prediction, inspection, and evaluation queries
Use of BigQuery Storage Write API
Model creation queries

Parameter	Free usage limits
Storage	The first 10 GB per month is free
Model prediction, inspection, and evaluation queries	First 1 TB of query data processed per month is free
BigQuery Storage Write API	First 2 TB of data inserted into BigQuery via API per month is free
Model creation queries	First 10 TB of data processed by CREATE MODEL queries per month is free

If your usage is well below the free-tier threshold, there’s a good chance you’re not paying for BigQuery. However, once you start scaling your services, you will likely notice that your bill is increasing, which may result from on-demand pricing.
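
The free-tier limits in the table above can be turned into a quick check of how much usage is actually billable. The following is a minimal sketch, not an official calculator: the dictionary keys, the function name, and the example usage figures are all hypothetical, and only the limits themselves come from the table.

```python
# Free monthly allowances from the table above (units are part of each key).
FREE_LIMITS = {
    "storage_gb": 10,            # storage
    "query_tb": 1,               # prediction, inspection, and evaluation queries
    "storage_write_api_tb": 2,   # data inserted through the Storage Write API
    "create_model_tb": 10,       # data processed by CREATE MODEL queries
}


def billable_usage(usage):
    """Return the usage above the free limit for each parameter (never negative)."""
    return {
        name: max(0, usage.get(name, 0) - limit)
        for name, limit in FREE_LIMITS.items()
    }


# Hypothetical month of usage, in the same units as FREE_LIMITS.
example_usage = {
    "storage_gb": 8,
    "query_tb": 3,
    "storage_write_api_tb": 1,
    "create_model_tb": 12,
}
print(billable_usage(example_usage))
```

In this made-up month, only the query and model-creation usage exceed their limits, so only those two would be billed at on-demand rates.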

On-demand pricing

On-demand pricing bills usage as it occurs, so the actual cost depends on the operations you run. The available BQ ML operations are as follows:

Model creation
Model evaluation
Model inspection
Model prediction

Note that model creation for Matrix Factorization, the most commonly used model for recommender systems, is not supported by on-demand pricing and is available with flat-rate pricing only.

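Operation type	Model type	Pricing
Model creation	Logistic regression	$250 per TB
Model creation	Linear regression	$250 per TB
Model creation	K-means clustering	$250 per TB
Model creation	Time-series	$250 per TB
Model creation	AutoML tables	$5 per TB, plus Vertex AI training cost
Model creation	DNN	$5 per TB, plus Vertex AI training cost
Model creation	Boosted tree	$5 per TB, plus Vertex AI training cost
Model evaluation	All types	$5 per TB of data processed
Model inspection	All types	$5 per TB of data processed
Model prediction	All types	$5 per TB of data processed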
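
To make the on-demand table concrete, here is a rough estimator that applies its per-TB rates. It is a sketch under stated assumptions: the operation and model-type labels are hypothetical names chosen for this example (they are not BigQuery option values), the Vertex AI training cost is passed in by the caller because the table does not quantify it, and the free tier is ignored.

```python
# Model types grouped by the on-demand rates in the pricing table.
# These labels are illustrative names, not BigQuery option values.
CREATE_AT_250_PER_TB = {"logistic_regression", "linear_regression", "kmeans", "time_series"}
CREATE_AT_5_PER_TB_PLUS_VERTEX = {"automl_tables", "dnn", "boosted_tree"}
FLAT_5_PER_TB_OPERATIONS = {"evaluate", "inspect", "predict"}


def on_demand_cost(operation, tb_processed, model_type=None, vertex_ai_training_cost=0.0):
    """Estimate the on-demand cost in USD for one operation (free tier ignored)."""
    if operation == "create":
        if model_type in CREATE_AT_250_PER_TB:
            return 250.0 * tb_processed
        if model_type in CREATE_AT_5_PER_TB_PLUS_VERTEX:
            # The Vertex AI training cost is billed separately; the caller supplies it.
            return 5.0 * tb_processed + vertex_ai_training_cost
        # Covers matrix factorization, which is not offered on-demand.
        raise ValueError(f"no on-demand creation price for model type {model_type!r}")
    if operation in FLAT_5_PER_TB_OPERATIONS:
        return 5.0 * tb_processed
    raise ValueError(f"unknown operation {operation!r}")


print(on_demand_cost("create", 2, "linear_regression"))  # 2 TB at $250 per TB
print(on_demand_cost("predict", 0.5))                    # 0.5 TB at $5 per TB
```

Raising an error for unlisted model types keeps the estimator from silently pricing matrix factorization, which the article notes is flat-rate only.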

Flat-rate pricing

This type of pricing is best for clients with large-scale BigQuery model deployments. The predictability of monthly costs makes cost optimization simpler. It works by using reservations for both built-in and external models, and the general flow is as follows:

The user purchases “commitments”
The user assigns slots to reservations
The user assigns one or more projects to a reservation

Figure: The typical steps to use flat-rate pricing

Commitments

Clients purchase “commitments” or dedicated query processing capacity. Commitments are measured as BigQuery “slots.” A slot is a vCPU used to execute SQL queries. The number of slots needed depends on the size of the query and its complexity. A commitment also has a duration. There are three types of commitment:

Annual
Monthly
Flex

Annual commitments last for a minimum of 365 days. Monthly commitments last for a minimum of 30 days. Lastly, Flex slots can last for as little as 60 seconds and are typically used for testing and seasonal demands.

Reservations

After buying commitments, users assign them to different “reservations.” This resource-allocation system allows you to associate commitments with different workloads. For example, you could attach commitments to reservations called “prod” for production workloads, “dev” for development workloads, and so forth.

Once complete, one or more projects, folders, or organizations must be assigned to a reservation to make the slots usable. For example, if you assign a reservation to a project, then the datasets within that project can use the slots associated with the reservation.

Assignments

Assignments have two possible job types for BQ ML:

QUERY
ML_EXTERNAL

A QUERY job includes BQ ML queries such as model creation, inspection, evaluation, and prediction. Most queries for BQ ML will therefore fall under this type of job. An ML_EXTERNAL job applies to BQ ML queries that use external services such as Vertex AI and AutoML. Only external model reservations have this job type.

Cost Table

Type of commitment	Number of slots	Price
Monthly	100	$2,000 per month
Annual	100	$1,700 per month
Flex	100	$4 per hour or $2,920 per month
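
The arithmetic behind the cost table can be sketched as follows. This is an illustration rather than a billing tool, and it rests on a few assumptions: the Monthly and Annual figures are read as per-month prices for 100 slots (as the Flex row's per-month figure suggests), a month is taken as 730 hours (8,760 hours per year divided by 12), and the break-even uses the $5 per TB on-demand rate without checking whether 100 slots could actually process that volume. All function names are hypothetical.

```python
HOURS_PER_MONTH = 730  # 8,760 hours per year / 12

# Prices per 100 slots, taken from the cost table.
PRICE_PER_100_SLOTS_MONTH = {"monthly": 2000.0, "annual": 1700.0}
FLEX_PRICE_PER_100_SLOTS_HOUR = 4.0


def monthly_cost(commitment, slots, flex_hours=HOURS_PER_MONTH):
    """Estimate the monthly cost in USD of a commitment of the given size."""
    blocks = slots / 100
    if commitment == "flex":
        # Flex slots are billed only for the hours they are held.
        return blocks * FLEX_PRICE_PER_100_SLOTS_HOUR * flex_hours
    return blocks * PRICE_PER_100_SLOTS_MONTH[commitment]


def break_even_tb(monthly_commitment_cost, on_demand_price_per_tb=5.0):
    """TB per month at which on-demand spend equals the commitment cost."""
    return monthly_commitment_cost / on_demand_price_per_tb


print(monthly_cost("flex", 100))                 # Flex held for a full month
print(monthly_cost("flex", 100, flex_hours=40))  # Flex held for 40 hours of testing
print(break_even_tb(monthly_cost("monthly", 100)))
```

Under these assumptions, 100 Flex slots held for a full month reproduce the table's $2,920, and a monthly commitment of 100 slots costs the same as about 400 TB of on-demand queries at $5 per TB.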

Limitations

Flat-rate pricing has several limitations, including:

The inability to share reservations with other GCP organizations
An organization can only have a maximum of five projects with an active commitment in specific locations
Commitments are regional resources – you cannot move them between regions

Recommendations

The following are factors that allow you to create more realistic cost estimations:

Clear reservation allocation rules for complex organizations
An accurate estimate of how many slots are needed
Well-managed workloads across different reservations

To help you achieve these, we have provided several recommendations below.

Create an administration project solely for reservations

Google recommends creating an “administrative project” just for reservations, allowing you to centralize billing and management. Projects under the same organization as the administration project can use reservations. Additionally, these projects can share idle and unallocated slots, making slot management more flexible.

Use Flex slots to get a slot estimate

Monitoring tools like Cloud Logging allow you to monitor the average capacity your workloads consume. It’s best practice to use a small number of Flex slots and increase their number as you test your workloads. Examine your logs to identify what number of slots provides the best performance-to-value ratio.
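
One way to read "best performance-to-value ratio" is to pick the cheapest slot count that still meets a runtime target. The sketch below takes that interpretation as an assumption. The test-run measurements are invented purely for illustration, the function names are hypothetical, and the Flex price is the $4 per hour per 100 slots quoted in the cost table.

```python
FLEX_PRICE_PER_100_SLOTS_HOUR = 4.0  # from the article's cost table


def run_cost(slots, runtime_minutes):
    """Flex cost in USD of holding `slots` for the duration of one run."""
    return (slots / 100) * FLEX_PRICE_PER_100_SLOTS_HOUR * (runtime_minutes / 60)


def cheapest_within_target(test_runs, max_runtime_minutes):
    """Return the slot count with the lowest run cost that meets the runtime target.

    `test_runs` is a list of (slots, runtime_minutes) pairs read from your logs.
    Returns None when no tested configuration is fast enough.
    """
    candidates = [
        (run_cost(slots, minutes), slots)
        for slots, minutes in test_runs
        if minutes <= max_runtime_minutes
    ]
    if not candidates:
        return None
    return min(candidates)[1]


# Invented measurements: (slots, runtime in minutes) for the same workload.
test_runs = [(100, 60), (200, 32), (400, 20), (800, 15)]
print(cheapest_within_target(test_runs, max_runtime_minutes=35))
```

With these invented numbers, adding slots keeps shortening the run but raises the cost per run, so the smallest configuration that meets the target is selected.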

Define workloads clearly

Take advantage of reservations by defining workloads based on purpose or domain. Following the previously outlined methodology, estimate the number of slots required for each workstream.

For example, given 1000 committed slots and three functional areas, the assignment of resources could work as follows:

Assign 500 slots to the Data Science team. They usually get the most slots because they do the most intensive work
Assign 300 slots to ETL (Extract-Transform-Load). This function is responsible for cleaning and transforming business data before it arrives at the Data Science and Business Intelligence teams
Assign 200 slots to BI (Business Intelligence). BI creates reports and visualizations that provide meaningful insights, allowing stakeholders to make data-driven decisions
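
The allocation above can be checked with a few lines of code before any reservations are created. This is a planning sketch only: it does not call any BigQuery API, and the reservation names and the function name are hypothetical.

```python
def unallocated_slots(committed_slots, reservations):
    """Validate a reservation plan and return the slots left unallocated.

    `reservations` maps a reservation name to its planned slot count.
    """
    allocated = sum(reservations.values())
    if allocated > committed_slots:
        raise ValueError(
            f"plan allocates {allocated} slots but only {committed_slots} are committed"
        )
    return committed_slots - allocated


# The example from the text: 1000 committed slots split across three functional areas.
plan = {"data-science": 500, "etl": 300, "bi": 200}
print(unallocated_slots(1000, plan))
```

Here the plan uses all 1000 committed slots, so nothing is left unallocated. A plan that exceeds the commitment fails fast instead of being discovered at purchase time.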

Conclusion

BQ ML is a powerful tool that allows the entire ML Ops lifecycle to exist within a single Data Warehouse solution. It democratizes data and enables Data Analysts without formal Machine Learning training to use SQL to create, evaluate, and deploy powerful industry-standard models.

Regarding pricing, there are two usage patterns: on-demand and flat-rate pricing. On-demand charges on an “as-is” basis and is an efficient choice for one-off workloads. Flat-rate is ideal for enterprises that want predictability in their billing, especially if they run large numbers of monthly BQ ML queries.

Flat-rate pricing works by purchasing committed slots, assigning slots to reservations, and distributing reservations amongst projects. A sound BQ ML pricing strategy depends on the number of monthly queries and the estimated number of slots required if the client chooses flat-rate pricing.
