Serverless isn't Effortless
Lessons we learned in our first serious Serverless project.
I've been working on a fairly substantial Serverless project over the last week and I wanted to offer some thoughts to developers considering Lambda-style infrastructure.
Usually when I say "Serverless", I mean the architectural style of using Lambdas on hosted infrastructure and not the Serverless project. I will also use the terms "Lambdas" and "Serverless" interchangeably. When I mean the Serverless project, I will use a link to denote the difference.
1. Serverless projects won't be any smaller than microservices of similar complexity.
One thing we noticed immediately was that we were basically writing the same amount of code with our Lambdas as we were with our microservices. When you think about it, this makes sense.
Most microservice applications use a web framework. Developers tend to just write a bunch of handlers (functions) that get used by that framework. These applications also have shared models, utility code, etc. We basically found ourselves writing Hapi.js routes as Lambdas (we even modeled the common boilerplate on Hapi.js). This really surprised us. I think, subconsciously, we thought we would end up with less code.
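To make the "routes as Lambdas" point concrete, here is a minimal sketch of that kind of boilerplate. It is written in Python for brevity rather than taken from our project, and every name in it (`route`, `get_user`, `USERS`) is hypothetical. It assumes the function sits behind API Gateway and receives a proxy-style event.

```python
import json
from typing import Any, Callable, Dict

# Hypothetical names throughout: this illustrates the pattern, not our real code.
Handler = Callable[[Dict[str, Any]], Any]


def route(fn: Handler) -> Callable[[Dict[str, Any], Any], Dict[str, Any]]:
    """Wrap a route-style function so it can be deployed as a Lambda handler."""

    def lambda_handler(event: Dict[str, Any], context: Any) -> Dict[str, Any]:
        try:
            # Normalize the incoming event into a small request dict,
            # much like a web framework would do before calling a route.
            request = {
                "params": event.get("pathParameters") or {},
                "query": event.get("queryStringParameters") or {},
                "payload": json.loads(event["body"]) if event.get("body") else None,
            }
            status, payload = 200, fn(request)
        except KeyError as exc:
            status, payload = 404, {"error": f"not found: {exc.args[0]}"}
        except ValueError as exc:
            # json.JSONDecodeError is a ValueError, so bad bodies land here.
            status, payload = 400, {"error": str(exc)}
        return {
            "statusCode": status,
            "headers": {"Content-Type": "application/json"},
            "body": json.dumps(payload),
        }

    return lambda_handler


USERS = {"42": {"id": "42", "name": "Ada"}}  # stand-in for a real data store


@route
def get_user(request: Dict[str, Any]) -> Any:
    return USERS[request["params"]["id"]]
```

The route itself is one line, but the wrapper, error mapping, and shared models are the same code you would have written around a framework, which is why the line count barely moves.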
2. Lambdas are not appropriate for all use cases.
Lambdas (obviously) aren't long-lived processes. For 90% of software systems, this is probably a good thing. However, if you do need to keep application state in process between requests, Lambda is not going to work for you. You will instead need to delegate that storage to a database or cache. The need to be stateless also rules out other scenarios for Lambda, like applications that need to hold distributed locks or participate in leader elections. You probably won't use Lambdas for low-latency, high-throughput applications either. I'm not saying that existing Lambda infrastructures aren't fast, rather that they don't tend to suit the extreme conditions you would find in a high-frequency trading system.
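As a rough illustration of delegating state, the sketch below keeps a per-user visit counter outside the function. The `CounterStore` interface and `InMemoryStore` class are assumptions made for this example; in a real deployment the store would be a client for whatever database or cache you use, and the in-memory version only exists so the example runs locally.

```python
from typing import Any, Callable, Dict, Protocol


class CounterStore(Protocol):
    """Hypothetical interface for whatever external store holds the state."""

    def increment(self, key: str) -> int: ...


class InMemoryStore:
    """Local stand-in for a real cache or database client."""

    def __init__(self) -> None:
        self._counts: Dict[str, int] = {}

    def increment(self, key: str) -> int:
        self._counts[key] = self._counts.get(key, 0) + 1
        return self._counts[key]


def make_handler(store: CounterStore) -> Callable[[Dict[str, Any], Any], Dict[str, Any]]:
    def handler(event: Dict[str, Any], context: Any) -> Dict[str, Any]:
        # No counter lives in this process; the store owns the state,
        # so any instance of the function can serve the next request.
        visits = store.increment(event["user_id"])
        return {"user_id": event["user_id"], "visits": visits}

    return handler
```

Two handlers built from the same store behave like two separate function instances: each sees the other's updates because neither holds the count itself.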
3. Lambdas may still need infrastructure.
When you start playing with Lambda functions, you will probably rely on data streams or services provided by the cloud provider. For instance, pairing RDS and ElastiCache with AWS Lambda can be a real win. However, if you need to integrate databases and services not provided by Amazon, you're back to managing infrastructure. Our team uses MongoDB and InfluxDB for a lot of our workloads (not to mention that most of our architecture is microservices hosted on ECS). Since our team was already managing microservice infrastructure, Lambda wasn't much of an advantage operationally.
4. Testing and Troubleshooting Lambdas is not fun.
You've probably heard this before, but testing Lambdas is more effort than you might think. The first problem is that there are runtime disparities between running them locally and on AWS. The Serverless framework has the ability to serve your Lambdas over an HTTP interface, but this is not the same as proxying them with API Gateway. In fact, we immediately ran into issues mapping our HTTP responses with API Gateway even though they worked fine locally.
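One way to catch this class of problem before deploying is to check handler output against the shape API Gateway expects. The sketch below assumes the Lambda proxy integration, where the function must return an integer `statusCode` and a string `body`; it is not the check we used, and `proxy_response_problems` is a hypothetical helper. If you use custom mapping templates instead, the rules will differ.

```python
from typing import Any, List


def proxy_response_problems(response: Any) -> List[str]:
    """List the ways a handler result deviates from the proxy response shape."""
    if not isinstance(response, dict):
        return ["response must be a JSON object"]
    problems: List[str] = []
    status = response.get("statusCode")
    # bool is a subclass of int in Python, so exclude it explicitly.
    if not isinstance(status, int) or isinstance(status, bool):
        problems.append("statusCode must be an integer")
    body = response.get("body")
    if body is not None and not isinstance(body, str):
        problems.append("body must be a string (serialize JSON yourself)")
    if not isinstance(response.get("headers", {}), dict):
        problems.append("headers must be an object")
    return problems
```

Calling it from a local unit test on every handler result is cheap, although it only covers the response shape and cannot stand in for a deployed test.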
Second, given that testing Lambdas on AWS provides a more realistic environment, you will find yourself frequently changing the code and redeploying functions. Depending on how far you are from the region you are testing in, it can take some time to deploy the code. You are not going to get the immediate feedback a lot of people are accustomed to with their local environment. This was particularly painful for us because when we ran integration tests locally we had to interact with remote databases through a VPN, slowing down the overall process.
Finally, your options for troubleshooting Lambdas are limited. With AWS Lambda, you are left inspecting logs in CloudWatch. If you have a lot of executions of the same Lambda function, you will be crawling the logs looking for the specific invocation. Personally, I'm not a huge fan of CloudWatch; you might consider integrating a third-party logging provider for better filtering and traceability.
Another idea might include adding some monitoring/traceability into the application layer (like using a custom logger or enriching your models with events). An AWS-specific option is to use AWS Step Functions to coordinate tasks (Step Functions is literally a workflow engine for Lambdas). With Step Functions, you can decompose Lambdas into smaller coordinated steps and AWS will keep track of the overall state of the workflow. This may not help you debug a test, but it can give you insight into the overall flow of a complex Lambda process from a business process perspective.
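The custom-logger idea could look something like the following sketch, which stamps every log line with the invocation's request ID so a single execution can be filtered out of the noise. `make_logger` and its `emit` parameter are hypothetical names; the one AWS detail relied on is the `aws_request_id` attribute of the Python Lambda context object.

```python
import json
import time
from typing import Any, Callable


def make_logger(context: Any, emit: Callable[[str], Any] = print) -> Callable[..., None]:
    """Return a log function that tags each JSON line with the invocation ID."""
    # Falls back to "local" when run outside Lambda (e.g. context is None).
    request_id = getattr(context, "aws_request_id", "local")

    def log(message: str, **fields: Any) -> None:
        record = {
            "ts": time.time(),
            "request_id": request_id,
            "message": message,
            **fields,
        }
        # One JSON object per line keeps the output easy to filter later.
        emit(json.dumps(record))

    return log
```

Inside a handler you would call `log = make_logger(context)` once and then `log("order saved", order_id="o-1")`; searching the logs for a single `request_id` then returns only the lines from that invocation.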
Conclusion
Serverless is really cool. We really like not having to manage infrastructure, particularly when the applications we write are simple. Lambdas are also especially useful in AWS where they enjoy deep integration with other products.
However, there are some downsides to using Serverless. As I mentioned above, Lambdas do not necessarily lead to faster development or smaller code bases. They are also not a one-size-fits-all solution, particularly if you need stateful, low-latency, or long-lived processes.
Before you commit to the architecture you should consider whether the operational advantages are worth the extra complexity and effort in testing and traceability.
Special Thank You to the Serverless Team
Issues with developing and operationalizing Lambdas aside, we found the Serverless framework to be really fantastic for our purposes. If the framework didn't exist and we were left using the AWS APIs to manage the Lambda lifecycle, I think we probably would have given up on Lambda computing and gone back to a more traditional microservice. I want to personally thank the Serverless team and contributors for all their hard work.