We believe that in the future, software will be embedded in our workplaces, warehouses, shops, garages, hospitals and homes. It will connect to a variety of different software services. Cloud-based software, software as a service, the internet of things and the integration of software solutions are some of the significant drivers behind this software development revolution.
The promise is that with software we can sell more, serve more and ship more, using software services to integrate with and augment our businesses and our lives.
Creating the software of the future requires a specialised process that integrates refinement and quality control, one that breaks down significant challenges and eases them over time. DevOps is that strategy, and one persistent question has been why DevOps looks so different from one organisation to another. We think the answer is simple: most organisations are themselves very different. We actively discourage anyone from copying a DevOps model from another organisation, because copying the model means copying its pitfalls and issues too. Netflix has a model, and so do Amazon, Spotify and many other organisations. They may look successful, but if you break down the outcomes they are seeking, it is possible for your organisation to achieve something similar by its own route.
For example, it might seem everyone is doing continuous delivery. In practice, however, only very mature development organisations have this capability, and it usually comes after significant investment of capital and time in DevOps practices and tooling.
DevOps hardly needs a definition: it is a software development process built on a culture of continuous improvement, feedback and measurement. It promotes best practices that reduce the complexity of integrating software solutions. The DevOps Toolchain enables high levels of automation and integration while boosting collaboration and communication capabilities.
I first read ‘The Goal’ 25 years ago, and several times since, and I love this quote. Does it sound familiar?
Any manageable system is limited in achieving more of its goals by a very small number of constraints. Eliyahu M. Goldratt, The Goal
A good DevOps process should minimise the effects of an organisation’s main constraints.
An organisation has a fixed number of software development, test and infrastructure employees. Strategically, there are usually more opportunities than there are people to mobilise against them.
With regard to time, the only thing we ask is that you value it. We don’t want you to simply move faster; rather, we suggest looking for the bottlenecks that slow you down.
At Servana, we often say you can break down the most important aspects of DevOps by looking at the people, practices, tools and culture within the organisation.
DevOps organisations value their employees highly. The workplace should promote trust, respect and an excellent work-life balance, and employees should all be involved in meaningful activities. We discourage role-based work. For example, releases, tests and infrastructure should all be managed by the DevOps Toolchain; in practice, this means people work on the tooling configuration rather than directly on the infrastructure. The main outcome is enhanced mobility for your employees (i.e. they can move around the organisation freely and help out with new projects). It also makes it easier to onboard new engineers and helps you leverage your human capital more strategically. Our advice is to keep it simple: when you hire the best and smartest people, you shouldn’t need to tell them what to do.
Certain conditions are necessary for DevOps to thrive, while some cultural states extinguish many of the advantages a DevOps process provides.
Empirical Decision Making
Important decisions should be made based on actual data, metrics and consensus with teams, not on the ideas, feedback, blog posts or experience of a single consultant or a group of them. Remember: when you copy a DevOps model, you also copy its flaws, which can give you a few headaches. The culture should promote empirical decision making within teams and support the process of making decisions empirically.
- Programming language selection
- Application architecture (should not be a top-down activity)
- Tool selection
- Cloud architecture
- Testing solutions
- Pipeline stages & release automation
Collaborative Work Environment
Agile practices complement DevOps. They bring a team, or multiple teams, together. Standups and retrospectives are excellent communication events and collaborative strategies, as are peer reviews. Tools feature here as well, with Atlassian Jira, Trello, Slack and Microsoft Teams, and services like GitHub, GitLab and Jenkins.
Ideally, I want a complex problem to be a burden shared by the team, using the different skills within it to solve the problem through numerous attempts at an empirical proof.
Peer reviews are also essential. If developers can be open to review by other developers, and not necessarily senior developers, it will help consolidate shared knowledge of best practices. Pull requests are also an excellent way to share code changes and to collect reviews before a merge.
When has a bureaucratic decision overturned a better choice? Usually, this happens because the right level of monitoring and benchmarking isn’t in place to support better decisions. Reversion to the mean like this occurs when an organisation has learned safe habits that have been rewarded in the past.
It is easy to make the mistake of believing that using the tools encourages a good DevOps practice. However, from what we have seen, the presence of the tools doesn’t usually signal a good DevOps practice. The way we identify good DevOps practices is to look at how happy and productive the people are. This human-centred approach is a reminder that people do most of the valuable work; if the tools aren’t human-centred, we can still fail to create a good DevOps practice.
Quite a few practices are standard in organisations doing DevOps. The goal of these practices is to assist in the refinement and improvement of the software.
Examples: Git, Mercurial, Subversion
These tools handle source code logistics. They help merge changes from many developers and manage the history of each change, making it easy to see how source code has evolved. This history can also be used to triage regressions later in the development process.
These source-code managers are also useful when preparing releases and applying fixes with several strategies leveraging their architecture. For example, a strategy called Gitflow suggests that it is easier to manage the state of source code across many branches than in a single branch. While this is a subjective idea, Gitflow has traction with teams that prefer its procedural approach.
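As a minimal sketch of the branching mechanics Gitflow builds on, the following creates an integration branch, a feature branch, and a visible merge commit. The repository, file and branch names are all illustrative:

```shell
# Gitflow-style branching in miniature: 'develop' is the long-lived
# integration branch; features branch off it and merge back with --no-ff
# so the merge stays visible in the history.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "dev@example.com"   # throwaway identity for the demo
git config user.name "Dev"

echo "v1" > app.txt
git add app.txt
git commit -qm "initial commit"

git checkout -qb develop                  # integration branch
git checkout -qb feature/login            # short-lived feature branch
echo "login" >> app.txt
git commit -qam "add login form"

git checkout -q develop
git merge -q --no-ff -m "merge feature/login" feature/login
git log --oneline -1                      # shows the merge commit
```

The same history then makes it straightforward to cut a release branch from develop and to triage regressions commit by commit.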
A company doing DevOps should have a source code repository.
To develop software better today, we need to be able to capture requirements, analyse them, solve problems, create work items and do the work. We also need to improve the software incrementally, with an allowance for regular feedback and changes to priorities. Agile development is the clear leader of this approach, and we particularly like how it enables us to take big problems and break them down into smaller, manageable ones. In our development process, we often start with a short spike, usually less than a day’s work, followed by a proof of concept. Once done, higher-quality work enters the backlog, while the spike and proof of concept are strictly time-boxed, limiting their downside.
Another part of agile we think is critical is the standup ritual. It is an excellent opportunity to discuss work done and work to do. It also provides a venue for informing stakeholders, which is useful to prevent stakeholder updates being necessary at random times during the day.
In practice, we enjoy using Kanban because it represents the flow of work very well. The limits on swim lanes help cap the amount of work in progress and keep unplanned work from entering the lane. Your mileage may vary; I recommend you do your own research.
Continuous Integration & Continuous Delivery
Source code logistics can be a tedious part of the development process, and it is advisable to automate this as soon as possible. Any automation here will reduce the feedback loop significantly for your development teams, and improvements in this space can yield huge productivity improvements.
The purpose of the CICD platform is to manage the various processes that contribute to the refinement of source code from initial development through to release. There have been some massive innovations in this space, but the most important is the pipeline.
The pipeline is a workflow automation concept in use in many of the popular CICD platforms today. As a piece of configuration, the pipeline consolidates all the configuration that assists in refining source code. Previously, all of this automation lived in separate places, which led to numerous inefficiencies in the development process: a variant of the same project could require updating hundreds of different configurations. With a pipeline, the configuration can be copied and modified easily, cutting weeks off the development time.
Pipelines have numerous other benefits and are a vital contributor to scaling software development practices.
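To make the consolidation point concrete, here is a toy pipeline expressed as a single shell script; real CICD platforms declare the same ordered stages in one configuration file. The stage names and the "staging" target are invented for the example:

```shell
# One file holds every stage of the pipeline, in order. Creating a
# variant of the project means copying and editing this one definition,
# not hunting down dozens of separate job configurations.
set -e

checkout()  { echo "checkout: fetching source"; }
build()     { echo "build: compiling artefact"; }
run_tests() { echo "test: running the suite"; }
deploy()    { echo "deploy: shipping artefact to $1"; }

checkout
build
run_tests
deploy "staging"
```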
Software testing is vital. Refining software with tests means a development team can focus on delivering changes, while the team as a whole has confidence in the quality of the software as it passes the tests. The ability to automate tests is also a key evolution in release automation. The best place to execute tests is within the CICD platform.
Test-driven development is a practice in which the developer writes a test before, or while, writing the code it exercises. Here are some typical test strategies:
- Behaviour-driven development
- Smoke tests
- Regression tests
- API tests
- User Interface testing
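A smoke test, for instance, can be as small as a script of cheap liveness checks run after deployment. In this sketch the probes are stand-ins; in a real pipeline each `true` would be something like `curl -fsS https://your-service/health`:

```shell
# Run a named check and report PASS/FAIL; any failing check makes the
# stage fail, which keeps a broken release from being finalised.
check() {
  desc=$1; shift
  if "$@" >/dev/null 2>&1; then
    echo "PASS: $desc"
  else
    echo "FAIL: $desc"
    return 1
  fi
}

# Placeholder probes; replace with real ones (HTTP health endpoints,
# database pings, queue depth checks, and so on).
check "health endpoint responds" true
check "static assets served" true
```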
It can be difficult for software teams to budget time for tests within the sprints or project lifecycle. Robust cultural support for testing and test-driven development is required to bolster this capability.
Infrastructure, pipelines and application runtimes all require configuration, and the biggest time-waster is a lack of configuration management to automate these processes. If you want a server and an infrastructure engineer configures it by hand, you have a problem. With configuration management, a team can take advantage of cloud services and provide a level of self-service that supports fast-paced development processes.
There are several configuration management languages available.
Infrastructure-as-Code is part of this evolution and takes the configuration of infrastructure to a lower level. It enables the design and creation of hosting environments in a single configuration file. IaC is very common for cloud services. The main tools we see in this space are:
- Tools offered by the cloud vendors (e.g. CloudFormation for AWS)
The pattern we promote is one where we use pipelines to integrate and execute configuration changes, further accelerating the development process.
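As an illustration of that pattern, a pipeline stage for infrastructure changes usually boils down to a short, ordered sequence of IaC commands. The sketch below only prints the commands via a stub so it stays runnable anywhere; the flags shown are standard Terraform CLI options, and the directory name is an example:

```shell
# 'run' is a stub that prints each command instead of executing it,
# so the sequence is visible without a cloud account or credentials.
run() { echo "+ $*"; }

# A typical plan/apply stage for one environment directory.
run terraform -chdir=infra/staging init -input=false
run terraform -chdir=infra/staging plan -out=tfplan -input=false
run terraform -chdir=infra/staging apply -input=false tfplan
```

Saving the plan with `-out` and applying exactly that plan means the change reviewed in the pipeline is the change that ships.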
The pace of software development isn’t slowing anytime soon, and with all these new applications and services out there, the risks are only going to increase. One clear strategy to address this is to use the CICD platform and the testing patterns to reduce the security implications in the wild.
Security should be addressed as early in the development process as possible. Doing so reduces cost and, if prioritised, security will improve over time.
We advise that, at a minimum, there is an excellent level of dependency testing within the pipeline. If a project uses containers, we also recommend screening live containers daily. The purpose is to ensure that no exploits are currently possible.
We use tools like Clair and OWASP Dependency-Check within our CICD platform to scan builds and the containers that are currently live.
There is a simple maxim for security: the best protection is many layers of it. Consider a firewall: it secures a network by managing access between different network interfaces. Host-based firewalls manage access to individual ports on a local interface. Neither guarantees security on its own; however, each adds to any existing security strategy. Add the right level of access control to a server, limit permissions on the network, and at each point the security of the whole system improves.
Even with all this security, we are still just one dependency away from allowing access from the inside. The fundamental principles of a good security strategy are the ‘separation of concerns’ and the ‘principle of least privilege’.
The best time to add security is at the beginning; the costs of security work at this stage are at their lowest. Test source code with security in mind, add multiple layers of security and limit privileges inside your network.
You can’t manage what you can’t measure. Peter Drucker
Yes, this is a management quote, but it applies to software development too. I have strived from the very beginning to make the case for measuring the development process: it can unlock more investment and provide a report card for progress, helping organisations value their software development capability as an asset. In DevOps we use various kinds of monitoring tools, so monitoring is quite a big topic.
Having all the logs for different infrastructure and application components in one place will improve productivity and reduce the feedback loop for developers and support folks.
Using monitoring tools to benchmark success is an integral part of getting real about the kind of work you are doing. If something fails often, the pain of witnessing that continuous failure should motivate the right people to do something about it.
Popular monitoring tools we use are;
- Kibana (ELK)
- Pinger

Most monitoring tools provide alerting functionality; we use this often as well.
One productivity hack is to configure alerts to go to Slack. That way, everyone who cares can get them.
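A minimal version of that hack, assuming a Slack incoming webhook; the `SLACK_WEBHOOK_URL` variable is a placeholder you would point at a webhook created in your workspace, and when none is configured the function degrades to a local log line:

```shell
# Send an alert to Slack via an incoming webhook, or fall back to
# printing it when no webhook is configured.
notify() {
  msg=$1
  if [ -n "${SLACK_WEBHOOK_URL:-}" ]; then
    curl -fsS -X POST -H 'Content-Type: application/json' \
      -d "{\"text\": \"$msg\"}" "$SLACK_WEBHOOK_URL"
  else
    echo "ALERT (no webhook configured): $msg"
  fi
}

notify "pipeline failed: build 42 on main"
```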
Scenario: you work on an innovative e-commerce platform; it is highly successful, but you also want to make changes to it very often. What do you do? Today we have many new kinds of release strategies.
- Blue Green
- Rolling updates
- Canary
While I prefer updating services with the rolling-updates strategy (it’s simple), I have used Blue/Green and Canary deployments on many occasions. One solution usually fits better than the rest, and one of our roles as DevOps engineers is to establish which of these matches the business’s requirements.
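The mechanics of Blue/Green can be sketched in a few lines: two identical environments exist, and a pointer decides which one is live. In reality the pointer is a load balancer rule or DNS record; here it is just a temp file. The new version goes to the idle colour, is smoke-tested, and only then receives traffic:

```shell
# The 'router' is a file holding the live colour for this sketch.
live_file=$(mktemp)
echo "blue" > "$live_file"

release() {
  current=$(cat "$live_file")
  if [ "$current" = "blue" ]; then idle="green"; else idle="blue"; fi
  echo "deploying new version to $idle"
  # ... run smoke tests against the idle environment here ...
  echo "$idle" > "$live_file"               # the actual cut-over
  echo "traffic now on $idle"
}

release
```

Rolling back is then just flipping the pointer back, which is part of why the strategy is popular despite needing two environments.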
The deployment is only one part of the release process; there is also change management and testing. Once a release is complete, we usually run some of our tests against the production environment to ensure the deployment was successful, often called a smoke test. Smoke tests are an excellent way to finalise a release. We do this to avoid surprises later, as it can take a few hours for errors to be detected. The smoke test should be the last stage in the pipeline; once it completes, it should be safe to finalise the release confidently.
Many organisations want the ability to roll back. I am sceptical of the option, and it can be challenging to integrate rollbacks into a pipeline. If, for example, you have a pipeline with the following stages:
Checkout > Build > Deploy Dev > Test > Deploy QA > Test > Deploy PreProd > Sign Off > Deploy Prod > Smoke Test
In the above example, a rollback is managed by an alarm triggered by errors in the application logs. The issue here is that the pipeline is now out of sync with production. Communicating the rollback is vital, and support for a potential rollback is essential; however, all the work done in the pipeline will need to be repeated.
Checkout > Build > Deploy Dev > Test > Deploy QA > Test > Deploy PreProd > Sign Off > Deploy Prod > Smoke Test > Rollback
The above example represents the rollback visually, which solves the problem of communicating the rollback state. I don’t like rollbacks because we aren’t capturing the value of the pipeline, and the costs can quickly escalate. Instead, fix the causes in the pipeline that result in the most rollbacks.
There is a release pattern we’ve been using for over two years now. It came about because we didn’t like how our pipelines were getting longer and longer to accommodate all the different required stages, just as we began supporting the development of multi-cloud applications. We call this pattern ‘role-based pipelines’: rather than having one large pipeline per release process, we have a few smaller ones. It ensures that we capture the value of the work done in the pipelines at each stage.
> checkout > build > publish > event
> event > update infra > deploy > smoke tests
It may not be necessary to use a pipeline for deployments if you use a dedicated release management tool like CloudBees Flow. In this case, however, we use a deployment pipeline to manage multi-cloud deployments.
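A sketch of how the two role-based pipelines above hand off to each other. The event bus is reduced to a direct function call; in practice it would be a webhook, a queue message or a registry trigger, and the artefact name is invented:

```shell
# Second pipeline: reacts to a 'published' event and owns deployment.
deploy_pipeline() {
  artefact=$1
  echo "deploy: updating infrastructure for $artefact"
  echo "deploy: rolling out $artefact"
  echo "deploy: smoke tests passed"
}

# First pipeline: short checkout-build-publish, then emit the event.
build_pipeline() {
  echo "build: checkout and build"
  echo "build: published app-1.2.3"
  deploy_pipeline "app-1.2.3"   # stands in for emitting an event
}

build_pipeline
```

Because each pipeline finishes at a natural value boundary (an artefact published, an environment updated), a failure in one does not throw away the completed work of the other.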
devops === (people + practice + tools + culture)
After reading this, you may have many more questions. I’d be happy to discuss them with you at our office or over video chat; reach out via our contact form and I’ll be in touch.