Beyond Platform Thinking at Ritchie Brothers - Build Things No One Expects, in a Place No One Expect

Watch talk on YouTube

The story of how Thoughtworks built YY at Ritchie Bros (RB). Presented by the implementers at Thoughtworks (TW).

Background

  • RB is an auctioneer in the field of heavy machinery
  • Problem: They are old(ish) and own a bunch of other companies -> Duplicate Solutions
  • Goals
    • Get rid of duplicates
    • Scale without the need for more personnel

Platform creation principles

  • Platform is a product
  • Building it is an exercise in software engineering, not operations
  • Reduce dev friction

Platform overview

  • Platform provides self-service capabilities
  • Teams manage everything inside their namespace themselves
  • Multiple global locations that teams can opt into and out of

Principles and Solutions

Compliance at source of change

Developers own their pipelines

  • Dev teams are responsible for scanning, etc
  • Platform verifies that the compliance scans have been done (through admission control)
  • Examples:
    • OPA + Gatekeeper for admission -> Teams use Snyk for scanning and admission checks the scan results
    • Jira as admission hook for approval -> PO approves in Jira, admission only accepts if the webhook reports the approval
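
The "verify at admission" idea can be sketched as a pure decision function. This is a minimal illustration in Python, not the implementation from the talk (which used OPA + Gatekeeper). The two annotation keys are hypothetical, as is the idea that scan and approval results are recorded as annotations. Only the AdmissionReview v1 request/response shape comes from the Kubernetes API.

```python
# Hypothetical annotation keys; the talk does not name the real ones.
SCAN_ANNOTATION = "compliance.example.com/scan-status"
APPROVAL_ANNOTATION = "compliance.example.com/approval"


def review(admission_review: dict) -> dict:
    """Build an AdmissionReview response that allows the object only if its
    annotations record a passed scan and a product-owner approval."""
    request = admission_review["request"]
    metadata = request["object"].get("metadata", {})
    annotations = metadata.get("annotations") or {}

    problems = []
    if annotations.get(SCAN_ANNOTATION) != "passed":
        problems.append("no passed security scan recorded")
    if annotations.get(APPROVAL_ANNOTATION) != "approved":
        problems.append("no product-owner approval recorded")

    # The response must echo the request uid; a status message explains denials.
    response = {"uid": request["uid"], "allowed": not problems}
    if problems:
        response["status"] = {"code": 403, "message": "; ".join(problems)}
    return {
        "apiVersion": "admission.k8s.io/v1",
        "kind": "AdmissionReview",
        "response": response,
    }
```

The function only checks that the evidence exists; running the scan and obtaining the approval stay with the dev team's pipeline, which matches the split of responsibilities described above.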

Platform Operators

  • Implemented: S3 Operator, IAM Operator, DynamoDB Operator
  • Reasons:
    • Devs should not need access to AWS/GCP directly
    • Teams have full control without needing to submit tickets or write Terraform
  • Goals
    • Abstract specific details away
    • Make the results cloud-portable (AWS, GCP, Azure)
    • Still retain developer transparency
  • Example: DynamoDB Database
    1. User: Creates DynamoDB CR and ServiceRole CR
    2. K8s: Creates Pods, Secrets, Configs and ServiceAccount (related to an IAM role)
    3. User: Creates S3 Bucket CR and assigns ServiceRole
    4. K8s: Injects secrets and configs where needed
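
The flow above can be sketched as an idempotent reconcile step. This is a simplified Python illustration under stated assumptions, not the operators from the talk: the `Cluster` class is an in-memory stand-in for the Kubernetes API, and the CR fields (`spec.resources`) and child object names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Cluster:
    """In-memory stand-in for the Kubernetes API, keyed by (kind, name)."""
    objects: dict = field(default_factory=dict)

    def apply(self, kind: str, name: str, spec: dict) -> bool:
        """Create or update an object; return True if anything changed."""
        key = (kind, name)
        if self.objects.get(key) == spec:
            return False
        self.objects[key] = spec
        return True


def reconcile_service_role(cluster: Cluster, cr: dict) -> list:
    """Derive the child objects for one ServiceRole CR and apply them.

    Returns the keys of objects that were created or changed, so a second
    run with an unchanged CR returns an empty list.
    """
    name = cr["metadata"]["name"]
    resources = sorted(cr["spec"]["resources"])  # hypothetical CR field
    desired = {
        ("ServiceAccount", name): {"iamRole": f"{name}-role"},
        ("Secret", f"{name}-credentials"): {"grants": resources},
        ("ConfigMap", f"{name}-config"): {"resources": resources},
    }
    return [key for key, spec in desired.items() if cluster.apply(*key, spec)]


# Usage: steps 1-2 create the children, steps 3-4 update only what changed.
cluster = Cluster()
role = {"kind": "ServiceRole", "metadata": {"name": "orders"},
        "spec": {"resources": ["dynamodb/orders"]}}
created = reconcile_service_role(cluster, role)
role["spec"]["resources"].append("s3/invoices")
updated = reconcile_service_role(cluster, role)
```

Comparing desired and observed state before writing is what lets the user keep editing CRs (for example, attaching an S3 bucket later) without the operator recreating objects that are already correct.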

Observability

  • Tool: Honeycomb
  • Metrics: OpenTelemetry
    • Operator reconcile steps are exposed as traces

Q&A

  • Your teams are pretty autonomous; what about more classic teams? It is a multi-year journey, over which every team settles on the ownership and self-service approach
  • How do teams get access to stages? They just get themselves a stage namespace, attach to ingress and have fun (admission handles the rest)