This article will be interesting to those who are moving their infrastructure to microservices, implementing DevOps on top of Kubernetes, and going cloud native in every possible way. I'll tell you about our path, and at the end of the article I'll share our groundwork for a microservices environment: a link to a template that will be convenient for developers and testers.

How we lived before Kubernetes: dev servers, bare metal and Ansible

We have always lived, and still live, in a mode of constant change and experimentation: A/B tests, trying out various hypotheses. We launch new services, and if something doesn't work, we cut it out.

Once upon a time we had a monolith in PHP, which brought a lot of pain and suffering. To achieve an acceptable time-to-market, we went the typical way: we started sawing the monolith into microservices. As a result, the large monolith turned into many small monoliths. That's normal; it happens to everyone who has faced a similar task.

Then we started trying other technologies; in particular, Golang appeared, and later it became our main development language. Questions arose: how do we develop, test and deploy all of this? The answer was obvious: you need a dev server. Each developer should have a dev server they can connect to and write high-quality, high-performance code on.

So the guys wrote a dev server: a web interface that controlled docker-compose on the servers. There was also a container with the source code, which was mounted into docker-compose. The developer could connect via SSH and write code. Testers worked there too, and everything worked perfectly.

But as the number of services grew, it became impossible to work this way. The moment came when all of this had to be deployed somewhere, not just spun up in containers. So we took bare metal and rolled Docker onto it. Then we took Ansible and wrote several roles. Each role was a service with its own docker-compose, which "landed" on one of the servers.

That's how we lived: in nginx we registered upstreams by hand, specifying which port to route to and where each service lived. There was even a yaml file listing all the ports so that applications wouldn't compete for them.
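Roughly, that manual wiring looked something like the sketch below. The service name, address and port are made up for illustration, and ports.yml is just a hypothetical name for the port-registry file mentioned above:

```nginx
# nginx.conf fragment, edited by hand for every new service
upstream mobile_api {
    server 10.0.0.12:8081;   # the port had to be reserved in ports.yml first
}

server {
    listen 80;
    server_name api.example.local;

    location / {
        proxy_pass http://mobile_api;
    }
}
```

Every new service meant another upstream block, another server block and another line in the port registry, all maintained by hand.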

How we came to Kubernetes and built infrastructure on it

Obviously, you can't live like this; orchestration was needed. We understood this in 2017-2018, but back then it wasn't clear where to get an orchestrator. Kubernetes was only getting started; there were also HashiCorp Nomad, Rancher and OpenShift. We tried Nomad, and it wasn't bad, but we didn't want to rewrite our docker-compose files into Nomad configs.

As for Kubernetes, we knew that colleagues abroad had tried to run it themselves and had not always succeeded, and we didn't have bearded admins who could build us a cluster. So we started thinking about how to implement it. Even then Kubernetes was available, for example, on Amazon, but we remembered the blocking incidents, after which we had to move out urgently. So that option was discarded right away, also because of the expensive traffic there.

And then Kubernetes appeared on the Mail.ru Cloud Solutions platform as the Mail.ru Cloud Containers service. We had already moved our S3 storage there from Amazon, so we decided to try K8s as well. We deployed a cluster in the cloud, and everything worked.

As a test, we decided to deploy some stateless service there. We took an API for mobile applications, deployed it – it worked. We sent 50% of the traffic there – it still worked. Yes, something crashed periodically, but the guys fixed it and everything was fine. As a result, the entire infrastructure was migrated and is now built around Kubernetes, mainly dev and stage servers.

Each developer has their own Minikube in VMware to work with. We launch new projects in Kubernetes in the MCS cloud, and we also deploy Managed MySQL, which comes right away with all the replicas, replication and backups to S3.

We still have legacy on bare metal, including a Docker cluster managed by Ansible, but someday we'll deal with it.

How to live with a technology zoo and not suffer

A technology zoo is not as scary now as it was, let's say, in 2011. It's even normal: you can take specialized tools and technologies from different places and use them however you like. For example, we use Golang for some things, but data scientists work in Python – you can't force them to write in Go or PHP.

In general, we have two rules:

  • dockerization: everything must run in containers;
  • observability: those containers must be observable.

To continue the zoo analogy: there are cages, and it doesn't really matter who sits in them. The main thing is that water and food arrive regularly, automatically and uniformly, and that the "waste products" of the services – the logs – are shipped off somewhere centrally.

For observability we have a standard stack: each application writes logs to stdout, from where everything is shipped centrally to EFK, so a developer can come and view the logs in Kibana. Application metrics go to Prometheus, with dashboards and alerts in Grafana as standard. Jaeger covers the OpenTracing story: it shows who calls which service when we don't know, or don't want to figure it out, any other way.
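On the application side this amounts to very little code. Here is a minimal sketch in Go (our main language), assuming the standard Prometheus client library; the metric name, endpoint and port are made up for illustration:

```go
package main

import (
	"log"
	"net/http"
	"os"

	"github.com/prometheus/client_golang/prometheus"
	"github.com/prometheus/client_golang/prometheus/promauto"
	"github.com/prometheus/client_golang/prometheus/promhttp"
)

// requestsTotal is a counter that Prometheus scrapes from /metrics.
var requestsTotal = promauto.NewCounterVec(
	prometheus.CounterOpts{
		Name: "app_http_requests_total",
		Help: "HTTP requests handled, by path.",
	},
	[]string{"path"},
)

func main() {
	// Logs go to stdout so the container runtime and EFK can pick them up.
	log.SetOutput(os.Stdout)

	http.HandleFunc("/ping", func(w http.ResponseWriter, r *http.Request) {
		requestsTotal.WithLabelValues(r.URL.Path).Inc()
		log.Printf("handled %s", r.URL.Path)
		w.Write([]byte("pong"))
	})

	// Prometheus scrapes this endpoint; Grafana dashboards sit on top of it.
	http.Handle("/metrics", promhttp.Handler())

	log.Fatal(http.ListenAndServe(":8080", nil))
}
```

As long as a service sticks to "logs to stdout, metrics on /metrics", it doesn't matter what language it is written in – the cage looks the same from the outside.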

How to develop and test with all of this

Let's say a new developer joins us and sees 100 services and 100 repositories. Questions immediately arise: how do I deploy these 100 services, and how do I configure them? Where are the databases? What accounts are there? And there are many more questions like these. Because of this, onboarding a new developer took an indecently long time – they could spend a week just setting everything up.

As a result, we developed a 1-click development environment. Each developer has their own Minikube with effectively unlimited cores and memory, deployed in the VMware cloud. Plus a database: it comes from production daily, gets obfuscated, compressed and put on ZFS. This is a personal invention of our admin. We have been focused on cost cutting for a long time; we needed to give every developer a database without going broke.

ZFS has snapshots, so via an API a developer can roll out a database straight from production in two seconds. The same goes for autotests: we start them via the API, and everything works.
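The two-second figure works because ZFS snapshots and clones are copy-on-write and therefore nearly instantaneous. A rough sketch of the underlying idea (dataset and snapshot names are made up; in reality this is wrapped in the admin's API mentioned above):

```bash
# snapshot the obfuscated production dump once a day
zfs snapshot tank/mysql-obfuscated@2020-01-15

# give a developer their own writable copy in ~2 seconds:
# a clone is copy-on-write, so it costs almost no disk space or time
zfs clone tank/mysql-obfuscated@2020-01-15 tank/dev-alice-mysql
```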

The developer is happy, DevOps and admins are happy because all processes are uniform, repeatable and unified. But there is one thing.

A multi-level system of layers

As Linus Torvalds said: "Talk is cheap. Show me the code." So, we use a multi-level system of layers. There are the obvious layers – dev, stage, prod – which come to mind for everyone who sets out to build CI/CD.

But there are also developers: they need their own domains and some user-specific settings, so we have a users values layer. Still, that's not enough – you also need to test. Suppose we made a branch, maybe touching several services, and we need to hand it to a tester so they can reproduce it. For that we have a layer with values per task, i.e. tasks values.

Another slightly flame-war-prone point: we don't use Tiller. We settled on Helm, but in fact we use it only as a template engine. That is, we use only helm template: it produces a yaml file as output, which you can then apply to Minikube or a cluster, and nothing else is needed.
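In practice that amounts to something like the sketch below. The chart path and values file names are illustrative, and the invocation shown is the Helm 2-style helm template:

```bash
# render the chart with the values for the chosen layers...
helm template ./k8s-helm \
    -f values/dev.yaml \
    -f values/users/alice.yaml \
    > rendered.yaml

# ...and apply the plain yaml to Minikube or the cluster – no Tiller involved
kubectl apply -f rendered.yaml
```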

How the K8s-helm repository works

As I said, we have the obvious layers dev, stage and prod, and there is a yaml file for each service; when we carve out a new service, we add a file for it.

In addition, there is a dev.server folder with the most interesting part. It contains bash scripts that help, for example, create a new user: instead of creating 100 services by hand and filling in yaml files, you simply run one command, and all those yaml files are generated for you.

In the same folder there is a tasks subfolder. If we need specific values for a deployment, we just create a folder named after the task number inside it and commit the branch. Then we tell the tester: "There is such-and-such a branch in the repository, take it and run it." They check it out, run the command that lives in bin, and everything works – no manual configuration needed. The miracle of DevOps: infrastructure as code.
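Putting the layers together, the repository layout looks roughly like this. This is a simplified sketch based on the description above; the exact file and folder names may differ from the actual repository:

```text
k8s-helm/
├── templates/            # Helm templates, one yaml per service
├── values/
│   ├── dev.yaml          # the obvious layers
│   ├── stage.yaml
│   └── prod.yaml
├── dev.server/
│   ├── bin/              # bash helpers: make_user, deploy.sh, ...
│   ├── users/            # per-developer values (domains, credentials)
│   └── tasks/
│       └── TASK-1234/    # per-task values handed to the tester
└── ...
```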

When a new developer arrives, they are given a Minikube, an archive with certificates and a domain. In general, all they need is kubectl and Helm. They clone the repository, point kubectl at their config, and run the make_user command with their name. Copies of all the services are created for them. And not just created: there is the database they were given – the developer writes its credentials into their values, and those credentials are propagated to the other services.

The user has been created – how do we deploy everything now? Nothing complicated here either: we run deploy.sh with their name, and everything lands in the default namespace of their Minikube and is immediately available on their domain.
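The whole onboarding then fits into a few commands. This is a sketch: the script paths, the username and the kubeconfig location are illustrative, while make_user and deploy.sh are the helpers described above:

```bash
# one-time setup on the developer's machine
git clone https://github.com/carprice-tech/k8s-helm.git
cd k8s-helm
export KUBECONFIG=~/minikube/alice.kubeconfig   # config from the welcome archive

# generate values (services, domain, DB credentials) for the new user...
./dev.server/bin/make_user alice

# ...and roll everything out into the default namespace of their Minikube
./dev.server/bin/deploy.sh alice
```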

When the developer has implemented something, they take the issue ID and hand it to the tester. The tester checks out that branch, runs a deploy, and an environment with the new feature appears in their cluster.

K8s-helm assembly

The repository itself is also built into an image and wired into the CI/CD processes. Nothing special here: just kubectl with certificates, plus Helm.

Suppose you are doing a deploy: there is a step where you first need to deploy to stage and then run the tests there using Jenkins. From the repository you get an image with Helm baked in. We run a command like deploy namespace service_stage run, and everything takes off.

Then CI kicks in – in our case it's a .drone.yml, but roughly the same thing would happen in GitLab CI or anywhere else.
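As a rough illustration, a stage-deploy step in such a pipeline might look like this. This is only a sketch in Drone 1.x syntax; the image name and commands are assumptions, not the actual pipeline from the repository:

```yaml
kind: pipeline
name: deploy

steps:
  - name: deploy-stage
    image: our-registry/k8s-helm:latest   # image with kubectl, certificates and Helm baked in
    commands:
      # render the stage layer and apply it to the cluster
      - helm template . -f values/stage.yaml > rendered.yaml
      - kubectl apply -f rendered.yaml
```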

Next, Jenkins starts and runs the tests on stage. If everything is OK, it kicks off almost the same deployment, but now to prod. In other words, this mechanism not only makes life easier for developers and testers, it is also used to deliver features to production.

We love open source and want to contribute to the development of DevOps, so we made a template you can use and uploaded it to GitHub (https://github.com/carprice-tech/k8s-helm). It contains everything I talked about: you can take it, look at it, test it and apply it. It will be useful to anyone who is implementing microservices, or suffering because their team is implementing microservices, or wants to build DevOps processes around all of this.