YAML Cheatsheet

YAML is a human-friendly data serialization standard for all programming languages. It's often used for configuration files, data exchange between languages, and more. Its main benefits are readability and ease of writing.

Core Concepts

  • Human Readable: Designed to be easily read and written by humans.
  • Data Serialization: Represents data structures like lists, dictionaries (mappings), and scalars (strings, numbers, booleans).
  • Structure via Indentation: Uses Python-style indentation (spaces, not tabs) to indicate structure. Consistency is key (typically 2 or 4 spaces per level).
  • Key-Value Pairs: Forms the basis of mappings (dictionaries/objects).

Basic Syntax Rules

  1. Indentation: Defines nesting and structure. Tabs are forbidden. Use a consistent number of spaces (e.g., 2 spaces) for each level.
    parent:
      child: value # 'child' is nested under 'parent' (indented)
      sibling_child: value # Also nested under 'parent'
    sibling: value # 'sibling' is at the same level as 'parent' (same indentation)
    
  2. Colon (:) + Space: Separates keys from values in mappings. A space must follow the colon.
    key: value
    
  3. Hyphen (-) + Space: Denotes items in a sequence (list/array). A space must follow the hyphen.
    - item1
    - item2
    
  4. Case Sensitive: Keys and values are generally case-sensitive. (Key is different from key).
  5. Comments: Start with a hash symbol (#). Everything from # to the end of the line is ignored.
    # This is a full-line comment
    key: value # This is an inline comment
    

Data Types

Scalars (Single Values)

  • Strings:
    • Unquoted (Plain): Most strings need no quotes. A plain scalar ends at a colon followed by a space (: ) or at a hash preceded by a space ( #). Quote the value whenever it could be read as another type (boolean, number, null).
      my_string: This is a plain string
      ambiguous_string: 'true' # Quote if it could be misinterpreted as boolean/number
      version_string: '1.2.3' # Quotes often used for version numbers
      
    • Single Quoted ('): Prevents interpretation of most special characters (escape sequences like \n are not processed). To include a literal single quote, use two single quotes ('').
      single_quoted: 'This contains literal \n, but ignores it. It''s literal.'
      
    • Double Quoted ("): Allows interpretation of escape sequences (\n, \t, \x..., \u..., \\, \"). Use when the string needs escape sequences, such as control characters or Unicode code points.
      double_quoted: "Line 1\nLine 2\tTabbed, includes \"quotes\"."
      
  • Numbers:
    • Integers: 123, -45, 0
    • Floating-point: 3.14, -0.5, 1.23e+4, .inf (infinity), -.inf (-infinity), .nan (Not a Number)
    • Hexadecimal: 0xFF (prefix 0x)
    • Octal: 0o77 (prefix 0o in YAML 1.2; YAML 1.1 parsers use a leading 0 instead, as in 077)
    integer_value: 100
    float_value: 3.14159
    scientific: 6.022e23
    infinity: .inf
    hex_value: 0xDEADBEEF
    
  • Booleans:
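    • True values: true, True, TRUE
    • False values: false, False, FALSE
    • YAML 1.1 only: yes/no and on/off (in the same three casings) are also booleans for YAML 1.1 parsers. Under the YAML 1.2 core schema they are plain strings.
    • Convention: Prefer lowercase true and false for clarity and portability.
    is_enabled: true
    needs_update: false
    use_feature_x: yes # Boolean only for YAML 1.1 parsers; a string under YAML 1.2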
    
  • Nulls: Represents absence of value or null.
    • null, Null, NULL, ~ (tilde)
    • Leaving a value empty after the key also implies null.
    • Convention: Prefer lowercase null.
    optional_field: null
    empty_value: ~
    implied_null: # Equivalent to null
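
The typing rules above explain why values such as 'true' or '1.2.3' are often quoted. As a rough illustration, here is a Python sketch of how a parser following the YAML 1.2 core schema might resolve an unquoted scalar. The function resolve_plain_scalar and the pattern names are hypothetical, not part of any library, and real parsers (YAML 1.1 ones in particular) differ in the details.

```python
import math
import re

# Patterns modeled on the YAML 1.2 core schema (an approximation, not a parser).
NULL_RE = re.compile(r"null|Null|NULL|~|")
BOOL_RE = re.compile(r"true|True|TRUE|false|False|FALSE")
INT_RE = re.compile(r"[-+]?[0-9]+")
OCT_RE = re.compile(r"0o[0-7]+")
HEX_RE = re.compile(r"0x[0-9a-fA-F]+")
FLOAT_RE = re.compile(r"[-+]?(?:\.[0-9]+|[0-9]+(?:\.[0-9]*)?)(?:[eE][-+]?[0-9]+)?")
INF_RE = re.compile(r"[-+]?\.(?:inf|Inf|INF)")
NAN_RE = re.compile(r"\.(?:nan|NaN|NAN)")


def resolve_plain_scalar(text):
    """Guess the type of an unquoted scalar (hypothetical helper)."""
    if NULL_RE.fullmatch(text):
        return None
    if BOOL_RE.fullmatch(text):
        return text.lower() == "true"
    if INT_RE.fullmatch(text):
        return int(text)
    if OCT_RE.fullmatch(text):
        return int(text[2:], 8)
    if HEX_RE.fullmatch(text):
        return int(text[2:], 16)
    if FLOAT_RE.fullmatch(text):
        return float(text)
    if INF_RE.fullmatch(text):
        return -math.inf if text.startswith("-") else math.inf
    if NAN_RE.fullmatch(text):
        return math.nan
    return text  # anything else stays a string
```

Under these assumptions, true resolves to a boolean and 0xFF to the integer 255, while yes and 1.2.3 stay strings because they match none of the patterns.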
    

Collections

  • Sequences (Lists / Arrays): Ordered collection of items. Denoted by a hyphen-space (- ) at the beginning of each item line. Items must have the same indentation level.

    # List of strings
    fruits:
      - Apple
      - Orange
      - Banana
    
    # Inline list (Flow style)
    colors: [red, green, blue]
    
    # List containing mixed types and nested structures
    mixed_list:
      - item1
      - 200
      - key: value # Nested mapping
      - - nested_item1 # Nested sequence
        - nested_item2
    
  • Mappings (Dictionaries / Objects / Hashes): Unordered collection of key-value pairs.

    # Simple mapping
    person:
      name: John Doe
      age: 30
      city: New York
    
    # Inline mapping (Flow style)
    coordinates: { x: 10, y: 20 }
    
    # Nested mapping
    server_config:
      host: example.com
      port: 8080
      security:
        enabled: true
        protocol: TLSv1.2
    

Multi-line Strings

Control how newlines and indentation are handled in string blocks.

  • default (plain scalar, no indicator): newlines become spaces.
  • |: preserves newlines.
    • |+: keep extra newlines after the block.
    • |-: remove extra newlines after block.
  • >: newlines are converted to spaces.
    • >+: keep extra newlines after the block.
    • >-: remove extra newlines after block.

More details:

  • Literal Style (|): Preserves newlines exactly as they appear. Indentation relative to the first line is also preserved.

    • |: Keep newlines, keep a single trailing newline (extra trailing blank lines are dropped).
    • |+: Keep newlines, keep all trailing newlines.
    • |-: Keep newlines, strip all trailing newlines (including blanks).
    literal_style: |
      This is line one.
      This is line two.
        This line is indented.
    
    # Example with chomping indicators
    keep_trailing: |+
      Line A
      Line B
    
    strip_trailing: |-
      Line C
      Line D
    
  • Folded Style (>): Folds newlines within a paragraph into spaces. Blank lines indicate paragraph breaks (become single newlines). Indentation that's deeper than the first line is preserved (and introduces a newline).

    • >: Fold newlines, keep a single trailing newline (extra trailing blank lines are dropped).
    • >+: Fold newlines, keep all trailing newlines.
    • >-: Fold newlines, strip all trailing newlines.
    folded_style: >
      This is the first sentence
      which folds into a single line of text
      because it's not separated by a blank line.
    
      This starts a new paragraph (due to the blank line above).
        This indented line also starts a new paragraph (newline + indentation).
    
    # Example with chomping
    strip_folded: >-
      Some text.
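
The three chomping behaviours can be sketched in a few lines of Python. This is only an illustration of the rule, assuming the block text has already had its indentation removed and still carries whatever trailing newlines appeared in the source. The helper apply_chomping is hypothetical.

```python
def apply_chomping(block, indicator=""):
    """Apply a chomping indicator to block-scalar text (hypothetical helper).

    indicator is "" (clip, the default), "-" (strip) or "+" (keep).
    """
    content = block.rstrip("\n")
    if indicator == "-":
        # strip: drop every trailing newline
        return content
    if indicator == "+":
        # keep: leave all trailing newlines untouched
        return block
    # clip: exactly one trailing newline, provided there is any content
    return content + "\n" if content else ""


raw = "Line A\nLine B\n\n\n"  # two blank lines follow the text
clipped = apply_chomping(raw)
stripped = apply_chomping(raw, "-")
kept = apply_chomping(raw, "+")
```

Here clipped ends with one newline, stripped ends with none, and kept is identical to raw.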
    

More examples:

$ cat <<EOF | yq '.content'
content:
  a b
  c d e

  f g
EOF
# a b c d e
# f g
$ cat <<EOF | yq '.content'
content: |
  a b
  c d e

  f g
EOF
# a b
# c d e
#
# f g
#
$ cat <<EOF | yq '.content'
content: |+
  a b
  c d e

  f g
EOF
# a b
# c d e
#
# f g
#
$ cat <<EOF | yq '.content'
content: |-
  a b
  c d e

  f g
EOF
# a b
# c d e
#
# f g
$ cat <<EOF | yq '.content'
content: >
  a b
  c d e

  f g
EOF
# a b c d e
# f g
#
$ cat <<EOF | yq '.content'
content: >-
  a b
  c d e

  f g
EOF
# a b c d e
# f g
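
The folding seen in the > examples can be approximated in Python. This is a simplified sketch: it assumes the indentation has already been removed, and it ignores leading blank lines and the special handling of more-indented lines. The helper fold is hypothetical.

```python
import re


def fold(block):
    """Fold block text roughly the way '>' does (simplified, hypothetical helper)."""
    body = block.rstrip("\n")
    trailing = block[len(body):]  # trailing newlines are left for chomping to handle

    def fold_breaks(match):
        count = len(match.group())
        # One line break folds into a space; N > 1 breaks leave N - 1 newlines.
        return " " if count == 1 else "\n" * (count - 1)

    return re.sub(r"\n+", fold_breaks, body) + trailing


folded = fold("a b\nc d e\n\nf g\n")
```

With the input used in the examples above, folded is "a b c d e\nf g\n", which matches the output shown for >.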

Interactive examples: YAML Multi-line String Demo

Advanced Features

  • Anchors (&) and Aliases (*): For reusing parts of the structure (DRY - Don't Repeat Yourself).

    • &anchor_name: Defines an anchor on a node (mapping, sequence, or scalar).
    • *anchor_name: References (aliases) the node previously marked with that anchor.
    defaults: &defaults # Define anchor 'defaults'
      adapter: postgres
      host: localhost
      pool: 5
    
    development:
      database: myapp_dev
      <<: *defaults # Merge the 'defaults' mapping here
      host: dev-db # Override host from defaults
    
    test:
      database: myapp_test
      <<: *defaults # Reuse 'defaults' again
    
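    • <<: is the merge key. It merges the aliased mapping into the current one, and keys written explicitly in the current mapping take precedence. It is defined as a YAML 1.1 type and is not part of the YAML 1.2 core schema, so support depends on the parser.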
  • Multiple Documents: Separate documents within a single stream/file using triple hyphens (---). End a document optionally with triple dots (...). Useful for streaming or configuration sequences.

    # Document 1
    config_version: 1
    environment: development
    ---
    # Document 2
    config_version: 2
    environment: production
    feature_flags: [new_ui, faster_search]
    
  • Tags: Explicitly specify the type of a node. Usually inferred, but can be explicit for clarity or custom types.

    • !!str, !!int, !!float, !!bool, !!null, !!map, !!seq (Standard YAML core schema tags)
    • !customTag or !<!foo.com/bar> (Application-specific or custom tags)
    explicit_string: !!str 123 # Force interpretation as string
    explicit_float: !!float 1 # Force interpretation as float 1.0
    custom_object: !person # Custom tag (requires app support)
      name: Alice
      age: 25
    binary_data: !!binary | # Standard binary data tag
      R0lGODlhDAAMALMBAP8AAP///wAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
      ACH5BAUKAAEALAAAAAAMAAwAAARFAAQCSPQhQMqvAQzl1PyNklUAWLaB505s6yRguCTY
      REQY/EFx7QAADs=
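
The effect of the merge key in the anchors example above can be pictured with plain Python dictionaries. This is a sketch of the resulting data only, not of how any parser implements it. The helper name merge is hypothetical.

```python
defaults = {"adapter": "postgres", "host": "localhost", "pool": 5}


def merge(aliased, explicit):
    """Mimic '<<': start from the aliased mapping, then let explicit keys win."""
    return {**aliased, **explicit}


development = merge(defaults, {"database": "myapp_dev", "host": "dev-db"})
test = merge(defaults, {"database": "myapp_test"})
```

In this sketch development["host"] is "dev-db" while test["host"] is still "localhost", and defaults itself is left unchanged.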
    

Single-quoted vs Double-quoted

  • single: escape sequences are not processed (e.g. \n and \" have no effect). To include a single quote, double it: ''.
  • double: escape sequences are processed. To include a double quote, escape it: \".
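
The single-quote rule is simple enough to sketch in Python. The two helpers below are hypothetical and assume single-line text. They only show that doubling the quote is the one transformation involved and that backslashes pass through unchanged.

```python
def single_quote(text):
    """Wrap text in YAML single quotes, doubling any embedded single quote."""
    return "'" + text.replace("'", "''") + "'"


def unquote_single(scalar):
    """Reverse single_quote; backslash sequences are left as they are."""
    return scalar[1:-1].replace("''", "'")


quoted = single_quote("It's literal \\n")
restored = unquote_single(quoted)
```

Here quoted contains It''s between the outer quotes, and restored equals the original text with its backslash and n still two separate characters.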

By Language

Go

JSON is included in Go's standard library, but not YAML. There are a few third-party options:

  • go-yaml: https://github.com/go-yaml/yaml
    • its API mirrors the standard JSON library, except that it uses YAML struct tags instead of JSON struct tags, and MarshalYAML and UnmarshalYAML instead of MarshalJSON and UnmarshalJSON.
    • because of that, it is inconsistent with how JSON is handled in Go: a type that only defines JSON struct tags or JSON methods can serialize differently to YAML than to JSON.
  • sigs.k8s.io/yaml: https://pkg.go.dev/sigs.k8s.io/yaml
    • wraps go-yaml.
    • first converts YAML to JSON using go-yaml, then uses json.Marshal and json.Unmarshal to convert to or from the struct.
    • Benefit: unlike go-yaml, it reuses JSON struct tags and the custom JSON methods MarshalJSON and UnmarshalJSON.
    • As the name suggests, it is used in Kubernetes (k8s).

go-yaml

Install gopkg.in/yaml.v3:

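$ go get gopkg.in/yaml.v3

Usage (foo stands for any value you want to serialize):

import "gopkg.in/yaml.v3"

// Marshal: Go value to YAML bytes
data, err := yaml.Marshal(&foo)

// Unmarshal: YAML bytes back into a Go value
// (err is reused with =, since a second := would redeclare it)
err = yaml.Unmarshal(data, &foo)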

k8s yaml

API:

func Marshal(o interface{}) ([]byte, error)
func Unmarshal(y []byte, o interface{}, opts ...JSONOpt) error

Import:

import "sigs.k8s.io/yaml"

Python

The yaml module, provided by the PyYAML package, can be used to read and write YAML files.

Install:

$ pip install pyyaml

In code:

>>> import yaml

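# Read from file (safe_load does not construct arbitrary Python objects)
>>> with open("test.yml") as f:
...     obj = yaml.safe_load(f)
...

# Write to file (out.yml is an example name; without a stream, dump returns a string)
>>> with open("out.yml", "w") as f:
...     yaml.safe_dump(obj, f)
...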