Automation-first operations — how a lean team holds a broad scope

Orchestration, observability, human-in-the-loop control, and the discipline of 'done over described': a measured reading of an engineering method.

(Margaux Lefèvre: Chief Technology Officer)

1 June 2026 · 7 min

With contributions from:

Marek Nowak, SRE & DevOps Lead

Ilona Pavlenko, Scrum Master

Kateryna Koval, Product Manager IT

The observation. A small engineering team today runs infrastructure that, fifteen years ago, would have demanded a whole department. This is not a question of headcount: it is a question of method. The last decade of engineering has produced a body of practice (orchestration, observability, human-in-the-loop control) that lets a handful of engineers supervise a broad scope without being overwhelmed by it. DORA (DevOps Research and Assessment) documents it year after year: what predicts performance is neither tooling nor team size, but specific, measurable capabilities. It is this discipline, not a fad, that we call automation-first here.

From automating tasks to automated operations

Anyone can automate a task. Building a layer of automated operations is another trade. The distinction is the one Google formalised in Site Reliability Engineering (2016): repetitive manual work, which the book calls toil, is not merely costly, it is toxic to a team, because it grows linearly with the service while headcount cannot. The answer is not “work faster” but “let the system run itself where it is safe, and call a human only where it is necessary”.

An automated operations layer rests on three pillars that the literature treats separately but that only work together: orchestration (deciding what runs, when, in what order), observability (telling you what the system is actually doing), and human-in-the-loop control (keeping irreversible decisions in accountable hands).

Orchestration: describe intent, not gestures

Modern orchestration rests on a principle borrowed from infrastructure-as-code and popularised by tools like Kubernetes: you describe the desired state, and a controller continuously reconciles reality with that intent. This declarative model changes the nature of the work: the engineer no longer fires actions, they define invariants. Operations become idempotent (rerunning one breaks nothing) and auditable: the gap between intent and reality is itself data.
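As a rough illustration of the declarative model described above, here is a minimal Python sketch of a reconciliation step. It is not how Kubernetes or any particular orchestrator is implemented: the names Action, plan and reconcile, and the idea of representing state as a mapping from service name to replica count, are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Action:
    kind: str  # "create", "update" or "delete" (hypothetical action kinds)
    name: str


def plan(desired: dict[str, int], actual: dict[str, int]) -> list[Action]:
    """Compute the actions that would bring `actual` in line with `desired`."""
    actions = []
    for name in sorted(desired.keys() | actual.keys()):
        if name not in actual:
            actions.append(Action("create", name))
        elif name not in desired:
            actions.append(Action("delete", name))
        elif desired[name] != actual[name]:
            actions.append(Action("update", name))
    return actions


def reconcile(desired: dict[str, int], actual: dict[str, int]) -> dict[str, int]:
    """Apply the plan to a copy of `actual` and return the resulting state."""
    state = dict(actual)
    for action in plan(desired, actual):
        if action.kind == "delete":
            del state[action.name]
        else:
            state[action.name] = desired[action.name]
    return state


# Illustrative data: service name -> replica count.
desired = {"web": 3, "worker": 2}
actual = {"web": 1, "cron": 1}

drift = plan(desired, actual)     # the gap between intent and reality, as data
once = reconcile(desired, actual)
twice = reconcile(desired, once)  # rerunning changes nothing: idempotent
```

The list returned by plan is the auditable part: it can be logged or reviewed before anything is applied. Once the state matches the intent, the plan is empty and a second run is a no-op.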

Observability: you only steer what you measure

An automated layer without observability is a dangerous black box. The distinction between monitoring (watching known indicators) and observability (being able to ask new questions of a system without redeploying it) is now standard; the Cloud Native Computing Foundation's OpenTelemetry project made the underlying telemetry interoperable around three signals: traces, metrics and logs. The SRE notion of SLOs (Service Level Objectives) and the error budget follows: you define acceptable reliability in advance and accept spending an error budget rather than chasing an illusory 100%. This is what lets a small team sleep: the alert fires on a considered threshold, not on every twitch.
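The arithmetic behind an error budget is simple enough to sketch. The Python below assumes an availability SLO expressed as a fraction, a 30-day window, and a paging threshold of 10 times the sustainable burn rate; all three values and all function names are illustrative choices, not figures from our own systems or from any specific tool.

```python
def error_budget_minutes(slo: float, window_days: int = 30) -> float:
    """Minutes of allowed unavailability over the window for a given SLO."""
    if not 0.0 < slo < 1.0:
        raise ValueError("slo must be strictly between 0 and 1")
    return (1.0 - slo) * window_days * 24 * 60


def budget_remaining(slo: float, bad_minutes: float, window_days: int = 30) -> float:
    """Fraction of the budget still unspent (negative when overspent)."""
    return 1.0 - bad_minutes / error_budget_minutes(slo, window_days)


def burn_rate(slo: float, bad_events: int, total_events: int) -> float:
    """Observed error ratio divided by the ratio the SLO allows.

    A value of 1 spends the budget exactly over the window; higher is faster.
    """
    if total_events == 0:
        return 0.0
    return (bad_events / total_events) / (1.0 - slo)


def should_page(slo: float, bad_events: int, total_events: int,
                threshold: float = 10.0) -> bool:
    """Page a human only when the budget is burning well above the sustainable rate."""
    return burn_rate(slo, bad_events, total_events) >= threshold


budget = error_budget_minutes(0.999)                          # about 43.2 minutes
left = budget_remaining(0.999, bad_minutes=10.8)              # about 0.75
quiet = should_page(0.999, bad_events=2, total_events=1000)   # burn rate about 2
loud = should_page(0.999, bad_events=20, total_events=1000)   # burn rate about 20
```

With a 99.9% objective, a brief spike that doubles the error ratio stays below the threshold and wakes no one, while a twentyfold burn does. That is the "considered threshold" in practice.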

Human in the loop

Mature automation does not aim to remove the human, but to place them where their judgement matters. The classic human-factors distinction (human-in-the-loop, human-on-the-loop, human-out-of-the-loop) guides the design: reversible, frequent actions can run themselves; irreversible or ambiguous ones demand explicit approval. Lisanne Bainbridge's research on the ironies of automation (1983) remains strikingly current: the more automated a system, the more critical and difficult the human's residual role becomes, since they only step in on the cases the machine could not handle. Designing an automated layer therefore also means designing clean stop points, approvals and rollbacks.
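The rule that reversible actions run themselves while irreversible ones wait for approval can be sketched as a small gate. The Python below is a minimal illustration: the Operation type, the execute function, the operation names and the approver label are all hypothetical, and a real gate would also need to verify the approver's identity and record the decision.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class Operation:
    name: str
    reversible: bool


def execute(op: Operation, run: Callable[[], None],
            approved_by: Optional[str] = None) -> str:
    """Run reversible operations directly; require a named approver otherwise."""
    if not op.reversible and not approved_by:
        # Clean stop point: nothing has been executed yet.
        return f"BLOCKED {op.name}: approval required"
    run()
    actor = approved_by or "automation"
    return f"DONE {op.name} (by {actor})"


log: list[str] = []
restart = Operation("restart-worker", reversible=True)
drop = Operation("drop-database", reversible=False)

r1 = execute(restart, lambda: log.append("restart"))
r2 = execute(drop, lambda: log.append("drop"))
r3 = execute(drop, lambda: log.append("drop"), approved_by="on-call engineer")
```

The blocked call returns before anything runs, so the stop point is clean, and the approved call carries the name of the person accountable for it.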

The discipline of “done over described”

A lean organisation is defined less by what it plans than by what it finishes. Lean thinking, derived from the Toyota Production System and formalised by Womack and Jones, treats started-but-unfinished work as inventory: capital tied up, not progress. Undelivered work in progress creates no value; it creates risk and noise. The discipline of “done over described” favours a small increment that is shipped, verified and observable over a grand intention that is documented but never put into production. It is the operational counterpart of DORA's finding that working in small batches and deploying frequently go hand in hand with stability rather than trading against it.

How a lean team covers a broad scope

The answer is not individual heroism: it is design. Four levers, all documented: cut toil by automating recurring, safe tasks; make the system observable so any anomaly becomes an actionable signal rather than an investigation; keep humans on irreversible decisions through explicit approval gates; and ship in small increments so every change is verifiable and reversible. This is not doing less; it is concentrating effort on judgement and returning the repetitive to the machine.

Where we stand

Montandor Andorra is a young house with a deliberately lean engineering team. From the outset we chose an automated operations layer over headcount growth: declarative orchestration of our processes, end-to-end observability, and, as a non-negotiable rule, an accountable human on every irreversible action. We practise “done over described”: we prefer a small module in production, verified, to a grand plan left on paper. There is nothing exceptional in this; it is the methodical application of a public body of engineering practice at the scale of a small company.

“A small team does not hold a broad scope by working harder; it holds it by designing better. What we learn from the great houses of engineering (Google on reliability, Toyota on lean) is that discipline is measured in what you ship, not what you announce. And that a well-tuned machine must always leave the grave decision to an accountable person.”
— Wouter Meijboom, CEO, Montandor Andorra.

Sources

Beyer, Jones, Petoff & Murphy (eds.), Site Reliability Engineering, Google / O'Reilly, 2016 (toil, SLOs, error budget).
DORA, State of DevOps reports, 2014-2023, and Forsgren, Humble & Kim, Accelerate, 2018 (capabilities predictive of delivery performance).
Womack & Jones, Lean Thinking, 1996; Toyota Production System (work in progress as inventory, small batches).
Lisanne Bainbridge, Ironies of Automation, Automatica, 1983 (the residual, critical human role).
Cloud Native Computing Foundation: Kubernetes (declarative reconciliation) and OpenTelemetry (traces, metrics, logs).
Charity Majors et al., literature on observability and the monitoring / observability distinction.

Research led by Margaux Lefèvre (CTO), in collaboration with Marek Nowak (SRE & DevOps Lead), Ilona Pavlenko (Delivery / Scrum) and Kateryna Koval (Product Manager IT).