To micro or not to micro
An architect's story about the architecture of a B2B multitenant cloud system
Disclaimer: This article shares knowledge from a project carried out by one of Hycom's employees; however, the project was not delivered by Hycom.
Microservices are an architectural paradigm that has been known for over a decade. A large monolithic system is divided into smaller independent components. These components offer their services through an exposed API, usually REST. Similarly, one central data source, most often a relational database, is divided into separate schemas, each managed by a single microservice (the Database per Service pattern). As a result, each microservice becomes autonomous, independent, and decoupled from the others.
The concept of microservices brings many benefits but also a lot of challenges. Here, I describe how we defined the architecture of a project on which I was the lead architect, the experience gained along the way, and our conclusions.
Because the system was dedicated to financial transactions, the priorities were security and data consistency. Of course, the system also had to provide high availability, performance, and scalability. We estimated that the system would ultimately generate about 86 GB of data per day (ca. 30 TB per year). The system was to be available in the cloud, designed for businesses, and used by many different entities, so the platform had to provide multitenancy.
The architecture and the solution
Due to the above requirements, we opted for an architecture based on microservices. To make the microservices even more independent of each other and to maximize the system's resilience to failures and errors, we based communication between services on a message broker. Because of its rich routing configuration options, we implemented the broker with RabbitMQ. All important events in the system (domain events) flowed through the broker, where they were stored durably (data consistency). A managed PostgreSQL cluster service was used for data storage, while the log stream was stored in JSON format in MongoDB.
The technology stack was Java 11, Spring Boot, and Spring Cloud on the back end. On the front end, we used ReactJS for the administrators' web application and React Native for end users, who had a mobile interface. All kinds of files (JPG, GIF, PNG, PDF, etc.) were kept in S3. The system was deployed in the cloud as a set of Docker containers in a Kubernetes environment. To reduce the size of the Docker images, we used a multi-stage build and built a dedicated JRE.
We tried to apply good practices and recognized microservice patterns. In terms of data patterns, the already mentioned Database per Service pattern was used. The Transactional Outbox pattern was used to maintain data consistency and, most importantly, to make the write to the local PostgreSQL schema atomic with the publication of the event to the RabbitMQ broker. Sagas, in turn, ensured the correct rollback of distributed transactions.
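The Transactional Outbox idea described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the project's code: in-memory SQLite stands in for the service's PostgreSQL schema, a plain callback stands in for the RabbitMQ publisher, and the table names, the `PointsAdded` event, and the function names are all hypothetical.

```python
import json
import sqlite3


def open_db():
    # In-memory SQLite stands in for the service's own PostgreSQL schema (assumption).
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE accounts (id TEXT PRIMARY KEY, points INTEGER NOT NULL)")
    db.execute(
        "CREATE TABLE outbox ("
        " id INTEGER PRIMARY KEY AUTOINCREMENT,"
        " event_type TEXT NOT NULL,"
        " payload TEXT NOT NULL,"
        " published INTEGER NOT NULL DEFAULT 0)"
    )
    return db


def add_points(db, account_id, points):
    # The state change and the event row are written in ONE local transaction:
    # "with db" commits both on success and rolls both back on an exception.
    with db:
        cur = db.execute(
            "UPDATE accounts SET points = points + ? WHERE id = ?",
            (points, account_id),
        )
        if cur.rowcount == 0:
            db.execute(
                "INSERT INTO accounts (id, points) VALUES (?, ?)",
                (account_id, points),
            )
        db.execute(
            "INSERT INTO outbox (event_type, payload) VALUES (?, ?)",
            ("PointsAdded", json.dumps({"account_id": account_id, "points": points})),
        )


def relay_outbox(db, publish):
    # A separate relay reads unpublished rows and hands them to the broker.
    # A row is marked as published only after publish() returns, so delivery
    # is at-least-once and consumers have to tolerate duplicates.
    rows = db.execute(
        "SELECT id, event_type, payload FROM outbox WHERE published = 0 ORDER BY id"
    ).fetchall()
    for row_id, event_type, payload in rows:
        publish(event_type, json.loads(payload))
        with db:
            db.execute("UPDATE outbox SET published = 1 WHERE id = ?", (row_id,))
    return len(rows)
```

The point of the design is that the service never writes to the database and the broker in the same step; the broker only ever sees events that were already committed together with the data.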
The whole system, 14 microservices in total, was available via an API Gateway. Because the system also provided a public API, we launched an additional gateway (gatekeeper) intended only for external third-party users with appropriate access keys. Service discovery was provided by the Kubernetes runtime environment. Similarly, the entire configuration was based on the facilities available in Kubernetes.
In terms of infrastructure patterns, the orchestrator's built-in health checks (in Kubernetes, liveness and readiness probes) were used. Central log management was based on the EFK (Elasticsearch-Fluentd-Kibana) stack, and Spring Cloud Sleuth took care of distributed tracing, i.e., the ability to follow a request across services by a unique identifier.
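To illustrate what following a request by a unique identifier means in practice, here is a simplified sketch of trace propagation. It is not Sleuth itself: the header names follow the B3 convention that Sleuth is known to use, while the helper functions and the log format are assumptions made up for this example.

```python
import secrets

# B3-style header names (assumed here to mirror what Sleuth propagates).
TRACE_HEADER = "X-B3-TraceId"
SPAN_HEADER = "X-B3-SpanId"


def new_id():
    # 64-bit identifier rendered as 16 lowercase hex characters.
    return secrets.token_hex(8)


def incoming_context(headers):
    # Reuse the caller's trace id if there is one; otherwise this request
    # starts a new trace. Every service handling the request gets its own span.
    return {
        "trace_id": headers.get(TRACE_HEADER) or new_id(),
        "span_id": new_id(),
        "parent_id": headers.get(SPAN_HEADER),
    }


def outgoing_headers(ctx):
    # Headers to attach to a downstream call: the trace id stays the same,
    # and the current span becomes the parent of the callee's span.
    return {TRACE_HEADER: ctx["trace_id"], SPAN_HEADER: ctx["span_id"]}


def log_line(service, ctx, message):
    # Every log line carries the trace id, so central log search can
    # reassemble one request from the logs of many services.
    return f"[{service},{ctx['trace_id']},{ctx['span_id']}] {message}"
```

Searching the central log store for a single trace id then returns the lines of every microservice that took part in handling that request.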
Entry to the cluster was provided by an ingress implemented with Nginx, where the HTTPS connection was terminated. Certificates were managed by cert-manager, installed from its Helm chart and using the ACME protocol. Users were authenticated with JWT tokens signed with a private key (asymmetric cryptography), and all authorization and data access concerns were handled by a single microservice called auth.
Several security mechanisms built into the orchestrator were also used, such as a private Docker registry, dedicated service accounts, resource requests and limits (CPU cycles and RAM), role-based access control (RBAC), and network policies. Testing and the generation of artefacts (Docker images) were automated using Jenkins and GitLab CI.
Experience and conclusions
The project proved to be a technological success. Performance and chaos monkey tests conducted before the production launch showed that the applied architecture met the requirements for data consistency, system resiliency, and efficiency. This gave us a lot of satisfaction.
However, if we were to start this project again, we would build a modular monolith: a system implemented as one component, available via a REST API, but built in such a way that, if necessary, it could easily be divided into separate components such as microservices.
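One way to keep a monolith easy to split is to let modules talk to each other only through narrow interfaces. The sketch below is a hypothetical illustration of that idea, not the project's design; the module names, the `PointsApi` contract, and the `fetch_balance` callable are all made up for the example.

```python
from typing import Callable, Dict, Protocol


class PointsApi(Protocol):
    """The only thing other modules may know about the points module."""

    def balance(self, user_id: str) -> int: ...


class InProcessPoints:
    """Points module running inside the same process (the modular monolith case)."""

    def __init__(self) -> None:
        self._balances: Dict[str, int] = {}

    def add(self, user_id: str, points: int) -> None:
        self._balances[user_id] = self._balances.get(user_id, 0) + points

    def balance(self, user_id: str) -> int:
        return self._balances.get(user_id, 0)


class RemotePoints:
    """Same contract, backed by a call to a separate service after a split.

    fetch_balance is a hypothetical callable wrapping, e.g., a REST request.
    """

    def __init__(self, fetch_balance: Callable[[str], int]) -> None:
        self._fetch_balance = fetch_balance

    def balance(self, user_id: str) -> int:
        return self._fetch_balance(user_id)


class RewardsModule:
    """Depends on the PointsApi contract, never on the points module's tables."""

    def __init__(self, points: PointsApi) -> None:
        self._points = points

    def can_redeem(self, user_id: str, cost: int) -> bool:
        return self._points.balance(user_id) >= cost
```

With this shape, extracting the points module into its own service means swapping `InProcessPoints` for `RemotePoints` where the application is wired together; `RewardsModule` does not change.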
The project was implemented for a German company providing on-demand loyalty programs for companies. This B2B solution enabled business clients to launch their own loyalty programs, in which end users could exchange their points for any rewards available for purchase on the Internet, within the limit of the points they had collected.
Surprisingly, decomposing the entire system into independent domains turned out to be relatively simple. Perhaps we made some logical mistakes in this division that have never come to light. In any case, regardless of the architecture adopted, this effort always pays off.
The underlying pattern of the microservices paradigm, Database per Service, greatly increases the complexity of the system. In particular, it complicates the maintenance of data consistency. Since each microservice has its own database, we cannot change data in two microservices within a single ACID transaction. As a rule, it is therefore impossible to maintain transactional data consistency across services, and we have to settle for so-called eventual consistency. There are design patterns such as Saga or Transactional Outbox, but there is no doubt that the complexity of the system increases significantly.
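To show where the extra complexity comes from, here is a minimal, orchestration-style Saga sketch. It is an assumption-laden illustration rather than the project's implementation: in-memory dictionaries and lists stand in for two services' databases, and the step names and the `redeem_reward` scenario are hypothetical.

```python
def run_saga(steps):
    """Run (name, action, compensation) steps in order.

    If an action raises, the compensations of the steps that already
    completed are run in reverse order, and the failure is reported.
    """
    completed = []
    for name, action, compensate in steps:
        try:
            action()
        except Exception as exc:
            for undo in reversed(completed):
                undo()
            return ("rolled_back", name, str(exc))
        completed.append(compensate)
    return ("completed", None, None)


def redeem_reward(balances, orders, user_id, cost, in_stock):
    # Two "services": balances (points) and orders (rewards), each with its own data.
    def debit():
        if balances[user_id] < cost:
            raise ValueError("insufficient points")
        balances[user_id] -= cost

    def refund():
        balances[user_id] += cost

    def place_order():
        if not in_stock:
            raise RuntimeError("reward out of stock")
        orders.append(user_id)

    def cancel_order():
        orders.remove(user_id)

    return run_saga([
        ("debit_points", debit, refund),
        ("place_order", place_order, cancel_order),
    ])
```

Note that between the debit and the refund, other readers can observe the reduced balance. That window is exactly the eventual consistency mentioned above, and every step needs a hand-written compensation, which a single ACID transaction would give us for free.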
Likewise, for the same reason, we cannot aggregate data managed by two microservices with a simple SQL expression (JOIN). A simple table in the user interface presenting, for example, the system's users and the number of objects from a different domain generated by each of them becomes a challenge. Again, we have patterns (Aggregator, Backend-For-Frontend), but an SQL JOIN is much simpler.
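The table from the example above could be assembled by an aggregator roughly as follows. This is a sketch under assumptions: `fetch_users` and `fetch_object_counts` are hypothetical callables standing in for REST calls to two different microservices, and the field names are invented.

```python
def users_with_object_counts(fetch_users, fetch_object_counts):
    # In a monolith with one schema this would be a single query, e.g.:
    #   SELECT u.id, u.name, COUNT(o.id)
    #   FROM users u LEFT JOIN objects o ON o.owner_id = u.id
    #   GROUP BY u.id, u.name
    # Here the join is done in application code over two service responses.
    users = fetch_users()
    counts = fetch_object_counts([user["id"] for user in users])
    return [
        {
            "id": user["id"],
            "name": user["name"],
            # Users without any objects are missing from the counts response,
            # which mirrors the zero a LEFT JOIN with COUNT would produce.
            "objects": counts.get(user["id"], 0),
        }
        for user in users
    ]
```

Even this small version has to deal with questions the database used to answer for us: two network calls instead of one query, partial failures, and paging or sorting by a column that lives in the other service.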
Due to the number of components, the operational complexity of the overall system also increases significantly. You need an experienced operations team to set up the CI/CD pipelines well. The runtime environment itself, usually Kubernetes, is also not the easiest to configure and manage. Again, aspects that are simple in a monolith become a big challenge, e.g., log aggregation, distributed tracing, and monitoring.
The QA team doesn't have it any easier either. Testing the contracts of individual microservices is relatively simple. But microservices are not separate entities; they implement certain business requirements together, so when testing the whole system, we must consider all the dependencies. Mocking is the key technique here.
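As a small illustration of mocking a dependency on another microservice, the sketch below tests one service in isolation using Python's standard `unittest.mock`. The `RewardService` class and the `get_balance` and `debit` methods of its points client are hypothetical names invented for this example.

```python
from unittest.mock import Mock


class RewardService:
    """Service under test; it depends on a client of another microservice."""

    def __init__(self, points_client):
        self._points = points_client

    def redeem(self, user_id, cost):
        if self._points.get_balance(user_id) < cost:
            return "rejected"
        self._points.debit(user_id, cost)
        return "accepted"


def make_points_client(balance):
    # spec limits the mock to the methods of the agreed contract, so a typo
    # in a method name fails the test instead of silently passing.
    client = Mock(spec=["get_balance", "debit"])
    client.get_balance.return_value = balance
    return client


def test_redeem_debits_points_when_balance_is_sufficient():
    client = make_points_client(balance=100)
    assert RewardService(client).redeem("u1", 80) == "accepted"
    client.debit.assert_called_once_with("u1", 80)


def test_redeem_is_rejected_when_balance_is_too_low():
    client = make_points_client(balance=10)
    assert RewardService(client).redeem("u1", 80) == "rejected"
    client.debit.assert_not_called()
```

A test like this only proves that the service behaves correctly against the contract as we assumed it. Whether the real points service honors that contract still has to be verified separately, which is part of why testing the whole system is harder.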
Keep in mind that entering the world of microservices means entering the complex space of distributed systems. We gain enormous value at the cost of complexity and much higher effort. It is hard to say precisely how much higher, but we estimate it at roughly 2-3 times.
So, if we were to take up the challenge again, we would focus on the short term, which is what matters from a startup's point of view (validating the idea, getting feedback from the market), and implement a modular monolith, while keeping a potential migration to a microservice architecture in the back of our minds.