Many businesses today are focused on managing their cloud environments more effectively, while also working to reduce errors and ensure the performance of their business-critical applications. Our SRE, Cloud, and Cybersecurity Studio is at the forefront of these efforts to improve the management of our clients’ infrastructure.
We wanted to share some of what the team has been working on: both tools that we’ve developed internally, and how we use open source and commercially available tools to best help our clients with their technology environments.
SRE Bootstrapper: Qubika’s tool to rapidly set up new cloud environments and more easily provision network instances
With our SRE Bootstrapper tool, we automate the cloud infrastructure setup when we start working with a new client. A typical manual process for setting up the infrastructure and provisioning the network instances takes about a day and a half, and of course carries the risk of human error. With the tool, we do it in minutes. Having an automated process means we save a significant amount of time right at the beginning of projects with clients. It also enables us to provision almost everything: instances, networks, secrets, logs, certificates, domains, scaling, and much more.
We decided to build the tool using Terraform as it enables us to use infrastructure-as-code. In plain English, that means we can simply write the code, “hit play”, and everything is automatically deployed.
We’re already working on future iterations of Bootstrapper. In particular, we want to ensure our teams have significant flexibility to adapt to specific or unique situations. For example, the next iteration will make it easier for us to create multiple environments (production, development, QA, staging, etc.) from the same infrastructure-as-code configuration.
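To make the idea of reusing one configuration across environments a little more concrete, here is a minimal sketch in Python. It is not how Bootstrapper is implemented: the variable names, values, and file naming are hypothetical, chosen only to show a shared base with small per-environment overrides. The output is JSON, which Terraform accepts as a format for variable files.

```python
import json

# Hypothetical shared settings; these names are illustrative only
# and do not come from Bootstrapper itself.
BASE = {"region": "us-east-1", "instance_type": "t3.small", "min_instances": 1}

# Only the values that differ from the base, per environment (also hypothetical).
OVERRIDES = {
    "development": {},
    "qa": {},
    "staging": {"min_instances": 2},
    "production": {"instance_type": "t3.large", "min_instances": 3},
}


def environment_vars(name: str) -> dict:
    """Merge the shared base with one environment's overrides."""
    if name not in OVERRIDES:
        raise KeyError(f"unknown environment: {name}")
    return {**BASE, **OVERRIDES[name], "environment": name}


def render_tfvars(name: str) -> str:
    """Render the merged variables as JSON suitable for a variables file."""
    return json.dumps(environment_vars(name), indent=2, sort_keys=True)


if __name__ == "__main__":
    for env in OVERRIDES:
        print(f"# {env}.tfvars.json")
        print(render_tfvars(env))
```

Keeping the differences in a small overrides table means a new environment is one extra entry, while everything else stays identical across environments.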
Tools for developers
Here at Qubika, we focus on building quality software. As part of this, many of our teams take advantage of Sentry, which provides open-source error tracking and performance monitoring.
We have also created a module in Terraform for error monitoring, which sets up alarms on CloudWatch for any AWS-hosted system. This provides notifications about aspects such as application downtime, 5XX HTTP errors, and resource utilization. Importantly, it automates the creation of alarms on the infrastructure, so that, for example, businesses can act on warning signs before their customers ever experience the system being down. Notifications are sent via email or Slack (or whichever channel is preferred) so that the team can take preventive action.
We also use GitHub Actions to implement continuous integration/continuous delivery (CI/CD) pipelines, which automate things such as running unit tests on code, linting on every merge, automatic image building and compilation, automatic deploys, and much more.
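As a rough illustration of the kind of CloudWatch alarms described above, the Python sketch below builds two alarm definitions: one for 5XX responses behind an Application Load Balancer and one for high CPU on an EC2 instance. Our actual module is written in Terraform; this is only a sketch under stated assumptions. The function name, alarm names, thresholds, and the use of a single SNS topic for notifications are all assumptions made for the example. The keys follow the parameters of CloudWatch's PutMetricAlarm operation.

```python
from typing import Any, Dict, List


def alarm_definitions(
    prefix: str, load_balancer: str, instance_id: str, topic_arn: str
) -> List[Dict[str, Any]]:
    """Build one dict of PutMetricAlarm parameters per alarm.

    Thresholds and periods are illustrative assumptions, not recommendations.
    """
    common = {
        "Period": 60,                # evaluate the metric in 60-second windows
        "EvaluationPeriods": 5,      # require 5 consecutive breaching windows
        "ComparisonOperator": "GreaterThanThreshold",
        "AlarmActions": [topic_arn],  # assumed SNS topic that fans out alerts
    }
    return [
        {
            **common,
            "AlarmName": f"{prefix}-5xx",
            "Namespace": "AWS/ApplicationELB",
            "MetricName": "HTTPCode_Target_5XX_Count",
            "Statistic": "Sum",
            "Threshold": 10,
            "Dimensions": [{"Name": "LoadBalancer", "Value": load_balancer}],
            # No traffic means no 5XX data points; treat that as healthy.
            "TreatMissingData": "notBreaching",
        },
        {
            **common,
            "AlarmName": f"{prefix}-cpu-high",
            "Namespace": "AWS/EC2",
            "MetricName": "CPUUtilization",
            "Statistic": "Average",
            "Threshold": 80,
            "Dimensions": [{"Name": "InstanceId", "Value": instance_id}],
        },
    ]


if __name__ == "__main__":
    # Placeholder identifiers for demonstration only.
    for alarm in alarm_definitions(
        "demo", "app/demo/0123456789abcdef", "i-0123456789abcdef0",
        "arn:aws:sns:us-east-1:123456789012:alerts",
    ):
        print(alarm["AlarmName"], alarm["MetricName"], alarm["Threshold"])
```

With boto3 installed and AWS credentials configured, each dictionary could be passed to `boto3.client("cloudwatch").put_metric_alarm(**definition)`. Building the definitions as plain data first keeps them easy to review and test without touching an AWS account.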