Cloud Spanner Cheatsheet
Cloud Spanner is Google Cloud's globally distributed, externally consistent, relational database service that offers high availability, strong consistency, and horizontal scalability.
Core Concepts
- Instance: A logical container for Spanner databases. Instances define geographic placement (regional or multi-region) and compute capacity (nodes or processing units).
- Regional: Data replicated within a single Google Cloud region for lower latency.
- Multi-Region: Data replicated across multiple regions for higher availability and global reads.
- Compute Capacity: Measured in Nodes (1000 Processing Units) or Processing Units (PUs, increments of 100). Determines compute/storage limits and performance.
- Database: Contains tables, indexes, and data. Resides within an instance. Uses Google Standard SQL dialect.
- Schema: Defined using Data Definition Language (DDL). Includes tables, columns, data types, indexes, and relationships (interleaving).
- Tables: Standard relational tables.
- Primary Key: Uniquely identifies rows. Crucial for data distribution and locality. Avoid monotonically increasing keys (like simple sequences or timestamps) for the first part of the key to prevent hotspots.
- Interleaving: Physically co-locates child table rows with their parent row using
INTERLEAVE IN PARENT
. Improves performance for common parent-child joins. - Secondary Index: Allows efficient querying on non-primary key columns. Can include (
STORING
) non-key columns to cover queries. Can be interleaved. - Transactions: Spanner provides ACID guarantees.
- Read-Write Transactions: Standard transactions for reads and writes. Use locking.
- Read-Only Transactions: Do not lock. Can perform strong reads (most up-to-date) or stale reads (read data from a specific timestamp in the past, lower latency).
- Mutations: API calls for inserting, updating, or deleting data (alternative to DML statements, often preferred programmatically for batching).
- Query Optimizer: Analyzes queries and generates execution plans. Use
EXPLAIN
or Query Insights to understand performance. - Google Standard SQL: Spanner uses this SQL dialect (ANSI 2011 compliant with extensions).
gcloud
CLI Commands (gcloud spanner ...
)
Replace [PLACEHOLDERS]
with your values.
Instances
-
List instances:
gcloud spanner instances list
-
Describe an instance:
gcloud spanner instances describe [INSTANCE_ID]
-
Create an instance:
# Regional (e.g., us-central1) gcloud spanner instances create [INSTANCE_ID] \ --config=regional-us-central1 \ --description="My Regional Instance" \ --processing-units=100 # Or --nodes=1 # Multi-Region (e.g., nam7) gcloud spanner instances create [INSTANCE_ID] \ --config=nam7 \ --description="My Multi-Region Instance" \ --nodes=3 # Or --processing-units=3000
-
Update instance capacity:
gcloud spanner instances update [INSTANCE_ID] --nodes=5 # OR gcloud spanner instances update [INSTANCE_ID] --processing-units=500
-
Delete an instance:
gcloud spanner instances delete [INSTANCE_ID]
Databases
- List databases in an instance:
gcloud spanner databases list --instance=[INSTANCE_ID]
- Describe a database:
gcloud spanner databases describe [DATABASE_ID] --instance=[INSTANCE_ID]
- Create a database (optionally with initial DDL):
gcloud spanner databases create [DATABASE_ID] --instance=[INSTANCE_ID] \ --ddl="CREATE TABLE Singers (SingerId INT64 NOT NULL, Name STRING(MAX)) PRIMARY KEY (SingerId)"
- Update database schema (DDL):
gcloud spanner databases ddl update [DATABASE_ID] --instance=[INSTANCE_ID] \ --ddl="ALTER TABLE Singers ADD COLUMN BirthDate DATE"
- Delete a database:
gcloud spanner databases delete [DATABASE_ID] --instance=[INSTANCE_ID]
Database Operations (DDL, Backups)
- List long-running operations:
gcloud spanner operations list --instance=[INSTANCE_ID]
- Describe an operation:
gcloud spanner operations describe [OPERATION_ID] --instance=[INSTANCE_ID]
IAM (Permissions)
- Get IAM policy for an instance:
gcloud spanner instances get-iam-policy [INSTANCE_ID]
- Set IAM policy for an instance (use with caution - replaces policy):
gcloud spanner instances set-iam-policy [INSTANCE_ID] [POLICY_FILE.json]
- Add IAM policy binding:
gcloud spanner instances add-iam-policy-binding [INSTANCE_ID] \ --member="user:[email protected]" \ --role="roles/spanner.databaseUser"
SQL (Google Standard SQL Dialect)
Data Definition Language (DDL)
-
Create Table:
CREATE TABLE Albums ( SingerId INT64 NOT NULL, AlbumId INT64 NOT NULL, AlbumTitle STRING(MAX), ReleaseDate DATE ) PRIMARY KEY (SingerId, AlbumId), INTERLEAVE IN PARENT Singers ON DELETE CASCADE; -- Example interleaving
-
Alter Table:
ALTER TABLE Albums ADD COLUMN MarketingBudget NUMERIC;
-
Drop Table:
DROP TABLE Albums;
-
Create Index:
CREATE INDEX AlbumsByAlbumTitle ON Albums(AlbumTitle); -- Index with stored columns (covering index) CREATE INDEX AlbumsByTitleStoringDate ON Albums(AlbumTitle) STORING (ReleaseDate);
-
Drop Index:
DROP INDEX AlbumsByAlbumTitle;
Data Manipulation Language (DML)
- Insert:
INSERT INTO Singers (SingerId, Name, BirthDate) VALUES (1, 'Alice', '1990-05-15'), (2, 'Bob', NULL);
- Update:
UPDATE Singers SET BirthDate = '1991-12-01' WHERE SingerId = 2;
- Delete:
DELETE FROM Singers WHERE SingerId = 1;
- Batch DML (Execute multiple DML statements atomically):
Note: Mutations API is often preferred over Batch DML for programmatic batching.START BATCH DML; INSERT INTO Singers (SingerId, Name) VALUES (3, 'Charlie'); UPDATE Albums SET MarketingBudget = 50000 WHERE SingerId = 2 AND AlbumId = 1; COMMIT BATCH; -- Or RUN BATCH
Querying (SELECT)
- Basic Select:
SELECT SingerId, Name FROM Singers WHERE BirthDate > '1990-01-01' ORDER BY Name;
- Joins:
SELECT s.Name, a.AlbumTitle FROM Singers AS s JOIN Albums AS a ON s.SingerId = a.SingerId WHERE a.ReleaseDate >= '2020-01-01';
- Query Hints (e.g., force index usage):
SELECT AlbumId, AlbumTitle FROM Albums@{FORCE_INDEX=AlbumsByAlbumTitle} WHERE AlbumTitle = 'My Great Album';
Transactions
- Read-Write Transaction (Conceptual - actual implementation depends on client library):
-- In client library code: Start ReadWrite Transaction SELECT ... ; UPDATE ... ; INSERT ... ; -- In client library code: Commit Transaction
- Read-Only Transaction (Conceptual):
-- In client library code: Start ReadOnly Transaction (can specify strong or staleness) SELECT ... ; SELECT ... ; -- In client library code: Close Transaction (no commit needed)
Key Data Types
BOOL
INT64
(64-bit integer)FLOAT64
(double precision floating point)NUMERIC
(lossless decimal, preferred for financial data)STRING(len | MAX)
BYTES(len | MAX)
DATE
TIMESTAMP
(nanosecond precision, UTC)JSON
ARRAY<type>
(Arrays of most other types, except ARRAY and STRUCT)STRUCT<field_name type, ...>
Performance & Best Practices
- Schema Design:
- Choose Primary Keys carefully to avoid hotspots (avoid sequential leading keys). Consider UUIDs, hash-based keys, or reversing sequential IDs.
- Use Interleaving for parent-child relationships frequently accessed together.
- Use Secondary Indexes for common query patterns on non-key columns. Use
STORING
to cover queries.
- Writes:
- Use the Mutations API for batching writes programmatically.
- Avoid excessively large transactions.
- Reads:
- Use Read-Only transactions when possible.
- Use Stale Reads (
exact_staleness
ormax_staleness
) for lower latency when slightly old data is acceptable.
- Monitoring:
- Monitor CPU utilization (keep below ~65% for regional, ~45% for multi-region).
- Watch latency, QPS, lock contention (
spanner.googleapis.com/transaction/lock_wait_seconds
). - Use Query Insights to identify slow or expensive queries.
- Query Optimization:
- Analyze query plans using
EXPLAIN
(via client libraries or Cloud Console). - Ensure queries use appropriate indexes.
- Avoid
SELECT *
if only specific columns are needed.
- Analyze query plans using
- Instance Sizing: Start with appropriate compute capacity and scale up/down based on monitoring.
Pricing Components
- Compute Capacity: Per node-hour or PU-hour (billed hourly).
- Database Storage: Per GB-month (includes tables, indexes, metadata).
- Backup Storage: Per GB-month.
- Network Egress: Per GB (egress between regions or out to internet).