Failing at Microservices.
Please avoid our mistakes!
Microservices are the new fad in software architecture, and while I think they are generally the correct philosophy to take with service design and composition, the pattern can certainly lead you into trouble quickly. If you don't know what microservices are, I recommend reading the article written by James Lewis and Martin Fowler on the topic. In this post, I intend to share some observations from my recent experience implementing and maintaining a microservice architecture.
As the title of this post points out, my team struggled (I'd say failed) at implementing a microservice architecture. A number of factors led to this failure, and most were not related to technology or implementation practices. In the parts that were specific to developing microservices, however, we did fail.
Looking back, I would say our failure was a result of a number of factors:
1. Philosophical Differences between Developers.
The first issue related to our team. Our team members fell into one of three categories:
- Loved the idea of microservices.
- Hated the idea of microservices (in favor of more traditional monolithic stacks).
- Indifferent, but not well-equipped to build and maintain microservices.
In general, you can expect some disagreement among a team about how an architecture should be organized. Unfortunately, our management structure was set up to encourage the kind of democratic/egalitarian decision making Fred Brooks warns us about in The Mythical Man-Month. This meant every decision regarding the microservice implementation was needlessly debated by engineers in the second category, which sapped a lot of time and effort (not to mention causing a fair amount of emotional friction).
Category three engineers also presented a significant problem for our implementation. In many cases, these engineers implemented services incorrectly; in one example, an engineer literally wrapped and hosted one microservice within another because he didn't understand how the services were supposed to communicate if they ran in separate processes (or on separate machines). These engineers also had a tough time understanding how services should be tested, deployed, and monitored, because they were so used to the traditional "throw the service over the fence to an admin" approach to deployment. This led to a huge amount of churn and lost productivity.
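For illustration, here is a minimal sketch (service names, URLs, and ports are hypothetical) of what we expected instead: one service calling another over HTTP as a separate process, rather than hosting it inside its own.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

// Hypothetical example: the Order service asks the Inventory service
// (a separate process, possibly on a separate machine) for stock levels
// over HTTP, instead of wrapping the Inventory code inside its own JVM.
public class InventoryClient {

    private final HttpClient http = HttpClient.newHttpClient();
    private final URI baseUri;

    public InventoryClient(URI baseUri) {
        this.baseUri = baseUri; // e.g. http://inventory.internal:8080
    }

    public String stockLevel(String sku) throws Exception {
        HttpRequest request = HttpRequest.newBuilder(baseUri.resolve("/stock/" + sku))
                .GET()
                .build();
        HttpResponse<String> response = http.send(request, HttpResponse.BodyHandlers.ofString());
        return response.body(); // in practice you'd deserialize and handle errors
    }
}
```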
2. Barriers Created by Service Boundaries.
Engineers had a really hard time staying consistently productive, in many cases because the services became artificial barriers. In a more traditional system, the units of development tend to be the frontend, the backend, and sometimes the database. In our architecture, they were the individual services, which were far more numerous.
We decided to split the backend into 8 separate services, and made the bad decision of assigning services to people. This reinforced the notion that specific services were owned by specific developers. Because services were owned by people, developers began complaining that Service A was blocked by tasks on Service B. This should make no sense, since there are no compile-time dependencies between the services, but it happened nevertheless. Instead of offering to help the developer of Service B, or picking up backlog tasks related to another service, developers raised the drawbridges of their services as if they were castles (often gold-plating them or adding unimportant features) and waited for the sprint to conclude.
Separation of services and responsibilities also led to detachment. Developers lost sight of the system's goals in favor of their individual services' goals. This lack of concern became a very bad habit; we stopped trying to understand what our peers were doing, even though we were responsible for reviewing their pull requests.
3. Effective Service Separation.
From a technical perspective, we encountered three big issues when it came to defining or separating services:
Build Dependencies.
There was a lot of desire to share code, particularly common utility code, between services. In general, though, you want to be able to build and mature microservices independently. Balancing these desires is a significant tradeoff: you can either replicate functionality across services and increase the maintenance burden, or carefully construct shared libraries and deal with whatever dependency conflicts arise.
This doesn't sound like a big deal, but it caused us a shit ton of heartache. One developer defined a general abstraction over our database library, but it wasn't well encapsulated (in fact, it had a transitive dependency on an older version of our web server, Jetty). For reasons I won't go into, this led to a week of downtime while we refactored services and separated the client from our shared code base.
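In hindsight, one way to contain that kind of leak is to publish only a thin interface module for other services to depend on, and keep the concrete implementation (and its transitive dependencies, like Jetty) in a separate artifact owned by a single service. A rough sketch, with hypothetical names:

```java
// "api" module, file UserStore.java — nothing here pulls in Jetty or any
// other heavy transitive dependency, so every service can depend on it safely.
public interface UserStore {

    // Hypothetical shared value object, kept alongside the interface.
    record User(String id, String email) { }

    User findById(String id);

    void save(User user);
}

// A separate "impl" module (a different artifact, owned by one service)
// would contain:
//   public class JdbcUserStore implements UserStore { ... }
// That module is the only place the concrete database client and its
// transitive dependencies ever appear on a classpath.
```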
Service and Model Contracts.
We also didn't effectively specify our service APIs, particularly the models passed between services. Initially, developers tried creating shared library projects to specify the models and interfaces between services. This worked OK for Java but did nothing for our frontend developers, so we agreed on schema definitions using "JSON stubs". A stub/prototype/example JSON object is just not great at defining the expectations of a model.
We continuously encountered serialization issues between the frontend and backend (the UI sending an Array where Java expected a String, for example). While this isn't an issue specific to microservices, the problem is compounded when you increase the number of places these data representation mismatches can occur. We haven't solved this problem yet, but we're looking into cross-platform technologies like Avro and JSON Schema to let us perform model validation.
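As a rough illustration of where we're headed, the sketch below validates an incoming payload against a shared JSON Schema before deserialization. It assumes Jackson plus a JSON Schema validator library (here the networknt json-schema-validator); the schema and field names are made up for the example.

```java
import java.util.Set;

import com.fasterxml.jackson.databind.ObjectMapper;
import com.networknt.schema.JsonSchema;
import com.networknt.schema.JsonSchemaFactory;
import com.networknt.schema.SpecVersion;
import com.networknt.schema.ValidationMessage;

// Sketch: validate an incoming payload against a shared JSON Schema so that
// "Array where a String was expected" mismatches are caught at the service
// boundary instead of deep inside deserialization code.
public class ContractValidator {

    // Hypothetical contract: "name" must be a string, "tags" an array.
    private static final String USER_SCHEMA = """
            {
              "type": "object",
              "required": ["name", "tags"],
              "properties": {
                "name": { "type": "string" },
                "tags": { "type": "array", "items": { "type": "string" } }
              }
            }
            """;

    public static void main(String[] args) throws Exception {
        ObjectMapper mapper = new ObjectMapper();
        JsonSchema schema = JsonSchemaFactory
                .getInstance(SpecVersion.VersionFlag.V7)
                .getSchema(USER_SCHEMA);

        // The UI sent a string where the contract demands an array.
        String payload = "{\"name\": \"ada\", \"tags\": \"admin\"}";

        Set<ValidationMessage> errors = schema.validate(mapper.readTree(payload));
        errors.forEach(e -> System.out.println(e.getMessage()));
    }
}
```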
Modes of Communication.
Finally, the ways services communicate with each other need to be explicitly defined. This includes serialization, security, request options, error handling, and the list of expected responses. I would also recommend not trying to make one mechanism fit all use cases. We initially tried to use messaging (AMQP) as the sole form of communication between services, and it didn't work well. We eventually decided to adopt a CQRS communication pattern, where commands are sent asynchronously via messaging and queries are made synchronously via web services, but we did not have the time to fully implement the pattern.
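To make the split concrete, here is a rough sketch of what such a CQRS-style gateway might look like (all names are hypothetical): commands are published asynchronously through a messaging abstraction, while queries are plain synchronous HTTP calls.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

// Sketch of the CQRS split: commands are fire-and-forget messages,
// queries are synchronous HTTP calls. Names and routes are examples.
public class OrderGateway {

    // Commands go out asynchronously; an implementation of this interface
    // would publish to an AMQP exchange (e.g. via the RabbitMQ client).
    public interface CommandBus {
        void publish(String routingKey, byte[] body);
    }

    private final CommandBus commands;
    private final HttpClient http = HttpClient.newHttpClient();
    private final URI queryBase;

    public OrderGateway(CommandBus commands, URI queryBase) {
        this.commands = commands;
        this.queryBase = queryBase;
    }

    // Command: tell the order service to do something; no response expected.
    public void placeOrder(String orderJson) {
        commands.publish("orders.place", orderJson.getBytes());
    }

    // Query: ask the order service a question and wait for the answer.
    public String orderStatus(String orderId) throws Exception {
        HttpRequest request = HttpRequest
                .newBuilder(queryBase.resolve("/orders/" + orderId + "/status"))
                .GET()
                .build();
        return http.send(request, HttpResponse.BodyHandlers.ofString()).body();
    }
}
```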
4. Service Granularity.
Another lesson I learned was not to get too granular with microservices at the beginning of a project, but rather to progress slowly towards microservices as you encounter performance issues. Each microservice is its own configurable and deployable unit, which means you are adding more burden to the DevOps team. It's also a performance hit if traffic to your application is small and your services are needlessly incurring extra latency from remote communication.
Instead of doing what we did (starting with 8 services), try starting with two or three services of logically related functionality (they won't be micro, however). When you find that a section of one of those services needs to be scaled independently of the rest of the functionality, that's a good indicator that it should become its own microservice.
5. DevOps and CM Burden.
In their article on microservices, James Lewis and Martin Fowler mention the need for a good DevOps team to deal with the burden incurred by microservices. This is absolutely true and probably the area where we did best. We spent a considerable amount of our time designing and implementing the CI/CD pipeline for building, configuring, and deploying our microservice architecture, and it worked fairly well. In the process, we learned a lot of lessons:
Configuration.
The more dynamic service configuration is, the harder it will be to maintain. We learned that it was often easier and less error prone to change the environment than to dynamically configure services. For instance, it's easier to keep file references in configuration static (pki.pem) than to use a templating system to select an environment-specific file (production.pem). Instead, use the deployment process to select the correct set of artifacts a service might need; we maintained separate RPM repositories and S3 buckets for each environment and only needed to specify the location of that environment.
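As a sketch of what this looks like in code (paths are hypothetical), the service always reads fixed file names and leaves it to the per-environment deployment to place the right contents at those paths:

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Properties;

// Sketch: no template rendering or environment switch lives inside the
// service. The deployment process for each environment is responsible for
// putting the correct artifacts at these static locations.
public class ServiceConfig {

    private static final Path CONFIG_FILE = Path.of("/etc/myservice/service.properties");
    private static final Path PKI_FILE = Path.of("/etc/myservice/pki.pem");

    public static Properties load() throws IOException {
        Properties props = new Properties();
        try (InputStream in = Files.newInputStream(CONFIG_FILE)) {
            props.load(in);
        }
        // The certificate is referenced by its static name; production and
        // staging simply deploy different contents to the same path.
        props.setProperty("pki.path", PKI_FILE.toString());
        return props;
    }
}
```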
Continuous Integration and Deployment.
Microservices are supposed to enable more granular updates and deployments. This comes at a cost: your CI/CD process will need to be set up to handle builds and deployments of single services or groups of services. This can be tricky, especially since you will have many more moving parts in the process.
Monitoring.
No surprise here. You're no longer running one Tomcat instance with four WARs; you now have many independent processes and machines to manage (depending on your microservice deployment strategy). Each one of those processes needs to be monitored and maintained.
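At a minimum, each process should expose something a monitoring system can poll. A bare-bones sketch using only the JDK's built-in HTTP server (the port and payload are arbitrary examples):

```java
import java.io.OutputStream;
import java.net.InetSocketAddress;

import com.sun.net.httpserver.HttpServer;

// Minimal sketch: every service exposes a /health endpoint that monitoring
// can poll, since there is no longer a single Tomcat instance whose status
// tells you everything.
public class HealthEndpoint {

    public static void main(String[] args) throws Exception {
        HttpServer server = HttpServer.create(new InetSocketAddress(8081), 0);
        server.createContext("/health", exchange -> {
            byte[] body = "{\"status\":\"UP\"}".getBytes();
            exchange.getResponseHeaders().add("Content-Type", "application/json");
            exchange.sendResponseHeaders(200, body.length);
            try (OutputStream out = exchange.getResponseBody()) {
                out.write(body);
            }
        });
        server.start();
    }
}
```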
Maintenance Burden.
As the number of microservices in your architecture grows, your team will find itself increasingly locked into the processes you chose early on. Changes to those processes will generally be much larger than what you have experienced with monolithic applications. We found that small changes to configuration caused a cascade of changes to service deployments, which in turn forced us to do a significant amount of testing of the deployment process itself.
Conclusion
If you decide to go the route of microservices, here are my recommendations:
Ensure your team is up to the task.
If you have a team of engineers with any form of communication issues, I would not recommend this approach. This style of architecture takes a lot of coordination. It also takes a lot of technical skill, and not just the ability to write applications: engineers in all facets of the project need to be comfortable with DevOps tools and frameworks, because they are more critical in a microservices environment than in a monolithic one.
Ensure you have the resources.
Microservices (arguably) have a bigger upfront cost in terms of DevOps work than monolithic environments. The difference is that you choose to pay this cost upfront rather than paying it when your monolithic application starts to slow down and become a real nightmare in production. If your team has to learn the essential toolsets for microservices while trying to meet an ambitious deadline, you're probably not going to succeed (this was our case).
Start small and make adjustments as needed.
Don't try to get too granular with services until you need to. Wait until you need to scale or mature a section of an application independently before breaking it out into a microservice. If you work in a polyglot environment, try keeping your service granularity specific to languages/platform at first.
Keep It Simple Stupid.
Try the simplest implementation first and don't get too fancy. Microservices bring a lot of complexity to an architecture; you don't need to compound the problem by adding needless features or imposing architectural restrictions unless they are absolutely necessary.
Employ tools that help the process.
Probably the most relevant technology for supporting a microservice architecture is a Platform as a Service. There are a lot of existing PaaS implementations that will simplify the DevOps burden. Employ CI/CD infrastructure that supports your goals; Atlassian Bamboo worked well for us, particularly since we used EC2 as a deployment environment. Use automated provisioning (Puppet, Chef, Ansible, etc.). Basically, get comfortable with DevOps.
Take a pragmatic approach.
Use common sense. Do what's best for your team and your customer's or product's goals. This may even mean re-evaluating whether microservices are the right fit at all. At the end of the day, given our team composition, we should have gone with a more traditional approach. Personally, I preferred the microservice architecture; we just couldn't make it happen.