Understanding State Machines for DevOps-Style System Administration



  • One of the more unusual concepts in DevOps-style system administration is the state machine. Most system administrators have never worked with one and may find the idea foreign.

    State machines are powerful in system administration because they let us define the desired "state" of a machine (server, switch, router, desktop). The state machine takes that description and performs whatever tasks are needed to bring the machine into that state, or to return it to that state after it has been changed.

    For example, a state description might specify that the package "curl" is to be installed. The state machine checks whether curl is installed; if it is missing, it acquires and installs the package or, if it cannot do so, fails and raises an alert.
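    The curl example can be sketched in a few lines of Python. This is only an illustration of the check-then-act pattern, not how any particular tool implements it: FakeHost, StateError and ensure_installed are hypothetical names, and the "machine" is an in-memory stand-in rather than a real package manager.

```python
from dataclasses import dataclass, field


class StateError(Exception):
    """Raised when the desired state cannot be reached (the 'failure alert')."""


@dataclass
class FakeHost:
    """Hypothetical stand-in for a real machine; tracks packages in memory."""
    installed: set = field(default_factory=set)
    available: set = field(default_factory=lambda: {"curl", "git"})

    def install(self, name: str) -> None:
        # A real tool would call the system package manager here.
        if name not in self.available:
            raise StateError(f"package {name!r} could not be acquired")
        self.installed.add(name)


def ensure_installed(host: FakeHost, name: str) -> str:
    """Bring the host to the state 'package is installed'.

    Returns "unchanged" if the state already held and "changed" if the
    package had to be installed. Raises StateError if it cannot be installed.
    """
    if name in host.installed:
        return "unchanged"
    host.install(name)
    return "changed"


host = FakeHost()
first = ensure_installed(host, "curl")   # package missing, so it is installed
second = ensure_installed(host, "curl")  # state already holds, nothing to do
```

    Running the same description twice is safe: the second call finds the state already satisfied and does nothing.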

    By its nature, a state machine can be used either to keep an existing system from being modified or to build a new system from the ground up. Because a state machine works from a state description, that description can be version controlled to track changes over time. A single state description can also be applied to many machines at once.
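    As a rough sketch of applying one description to many machines, the snippet below treats the state description as plain data and runs it against three simulated hosts. The names DESIRED_STATE, apply_state and fleet are made up for illustration, and adding to a set stands in for a real install step.

```python
# Hypothetical state description; in practice this would live in a file
# tracked by version control so changes can be reviewed over time.
DESIRED_STATE = {"packages": ["curl", "git"]}


def apply_state(installed: set, desired: dict) -> list:
    """Install whatever is missing on one machine; return what changed."""
    changes = []
    for pkg in desired["packages"]:
        if pkg not in installed:
            installed.add(pkg)  # stands in for a real install step
            changes.append(pkg)
    return changes


# Three simulated machines in different starting conditions:
# a brand-new one, a partly configured one, and one already compliant.
fleet = {
    "web1": set(),
    "web2": {"curl"},
    "db1": {"curl", "git"},
}

# The same description is applied to every machine in the fleet.
report = {name: apply_state(pkgs, DESIRED_STATE) for name, pkgs in fleet.items()}
```

    The new machine is built from nothing, the partly configured one receives only what it lacks, and the compliant one is left alone, all from the same description.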

    Using a state machine, such as Ansible, SaltStack, Chef or Puppet, allows us to spend less time thinking about "how" to accomplish a task and instead focus on the goal to be achieved. It also allows systems to be self-healing and self-monitoring: healing by returning themselves to an established state if they are modified somehow, and monitoring by reporting any deviation from that state.
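    The difference between monitoring and healing can be sketched as two small functions over the same description. This is a simplified illustration with hypothetical names (find_drift, heal) and made-up settings, not the mechanism of any of the tools named above.

```python
def find_drift(actual: dict, desired: dict) -> dict:
    """Monitoring: return the settings whose actual value differs from the desired one."""
    return {
        key: {"desired": want, "actual": actual.get(key)}
        for key, want in desired.items()
        if actual.get(key) != want
    }


def heal(actual: dict, desired: dict) -> dict:
    """Healing: reset drifted settings to the desired values; return what was fixed."""
    drift = find_drift(actual, desired)
    for key in drift:
        actual[key] = desired[key]
    return drift


# Hypothetical settings, for illustration only.
desired = {"ssh_root_login": "no", "ntp_enabled": True}
machine = {"ssh_root_login": "yes", "ntp_enabled": True}  # someone changed a setting

drift = find_drift(machine, desired)      # report the deviation without touching anything
fixed = heal(machine, desired)            # return the machine to the described state
remaining = find_drift(machine, desired)  # empty once the state holds again
```

    Both behaviours fall out of the same comparison between described and actual state; healing simply acts on what monitoring finds.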

    State machines are one of the most important innovations in how systems can be maintained, taking the concept of automation to a completely different level.