By now, the benefits of microservices have been well established. Reliability. Scalability. Productivity. What you hear less about are the tradeoffs, challenges and complexities involved in transitioning from a monolith to microservices. The smaller components of a microservice architecture may make it easier to develop software, but instead of having only one application to manage, you now have hundreds or even thousands of services to worry about.

“Microservices often bring the same complexities that come with distributed computing. You have more things to build and deploy. You have more things to monitor. Concerns such as data consistency and handling multiple updates across multiple microservices is a tricky problem to solve,” said Nandan Sridhar, product manager at Google.

The keys to microservice management
Building a monolithic application is actually the easy part because you have all your teams working together in the same room, at the same time and on the same codebase, according to Lawrence Crowther, head of platform architecture at Pivotal.

The problem, however, is you don’t have the ability to upgrade, patch, redeploy or re-architect individual components of an application like you would in a microservice. In a monolith, any change you make can change the entire application, and that can cause negative impacts to the rest of the business, according to Ross Garrett, technology evangelist at Cloud Elements.

“Many companies fail to implement microservices simply because it is drastically clashing with the way they themselves are organized on a human level and a process level,” said Viktor Farcic, senior consultant at CloudBees. “Many companies don’t understand that you can’t have technology change without culture change, and the other way around.”

To successfully transition and maintain a microservice architecture, people and technology need to be aligned, Farcic explained. On a human level, companies need to convert their teams into autonomous small groups of people who can operate without much management. “That means if they want to successfully manage microservices, they need to have a single team in charge of it and that team can’t depend on anyone else within the company,” said Farcic. “Many companies still treat teams as children that need to be guided, ask for permission or be watched over.” If that dependency exists, then most of the benefits of microservices are gone because you introduce too much wasted time.

Those teams not only have to be autonomous, but they also have to be cross-functional. “It is almost unthinkable to deploy, create, design and operate a microservice-based app without an organization adopting a DevOps-centric approach to management where development and operations come together,” said Sandro Guglielmin, senior solution architect at Instana. Teams need to push code on a continuous basis into a production environment, and absorb and accelerate change on a continuous basis, and they can’t do that if they don’t work together, he explained.

On the technology side, Farcic noted companies can’t skip through time. If they have technology or processes that are, say, 50 years old, they can’t just jump to the present. “You need to go through all that history that others went before you much faster than real time, but you still need to go through all of that and learn what things like cloud, containers, and infrastructure as code mean before you get to the present tense, otherwise, the gap might be [too] big to jump through,” he said.

Having autonomous teams is also good when choosing a set of technologies, according to Instana’s Guglielmin, because it promotes a polyglot environment, allowing independent teams to implement different components, frameworks, and programming languages.

Once you deploy microservices, Guglielmin said, there are three principles for managing them successfully: observability, auditability and portability. Services should be observable, enabling anyone to inspect their state, health, and current performance. Services should be auditable, meaning you can obtain information about what a service has done in the past under certain circumstances in a certain business context. And services should be portable, meaning you can move a service from one environment to another without having to make changes at the code level, Guglielmin explained.
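
To make the three properties concrete, here is a minimal Python sketch. The OrderService class, its method names and the DATABASE_URL setting are all hypothetical, invented for illustration; they are not taken from Instana or any particular framework.

```python
import os
import time


class OrderService:
    """Hypothetical service illustrating the three properties."""

    def __init__(self, env=os.environ):
        # Portable: environment-specific settings come from outside the code,
        # so the same build can run in development, staging or production.
        self.database_url = env.get("DATABASE_URL", "sqlite:///local.db")
        self.started_at = time.monotonic()
        self.audit_log = []

    def place_order(self, customer_id, sku):
        # Auditable: record what was done, for whom, and when.
        self.audit_log.append(
            {"action": "place_order", "customer": customer_id, "sku": sku, "at": time.time()}
        )
        return {"customer": customer_id, "sku": sku, "status": "accepted"}

    def health(self):
        # Observable: expose current state and health on request.
        return {
            "status": "up",
            "uptime_s": time.monotonic() - self.started_at,
            "orders_handled": len(self.audit_log),
        }
```

In a real deployment the audit trail would go to durable storage and the health data would be served over an endpoint; both are kept in memory here to keep the sketch short.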

In addition, an artificial intelligence solution is necessary because it can become incredibly complex, if not impossible, to maintain control and keep track of all the microservices, not to mention all the small pieces running and executing in a microservice. “A machine learning-based system is capable of learning the platform, how it operates, and what different performance prerequisites are necessary for different times of the day or week. It can predict how a service will perform, anticipate resource utilization, and prevent possible outages,” said Guglielmin.

IBM approaches microservices management with five guiding principles: operations, monitoring, eventing and alerting, root cause analysis, and collaboration. “The principles assist the operations team to adopt microservice-based applications. They also help developers think about the operational facets of their application, as both developers and operations share a common goal of services that are robust and of high quality,” Ingo Averdunk, distinguished engineer at IBM, wrote in a post.

While a microservice doesn’t require the use of containers, IBM’s Daniel Berg, distinguished engineer and cloud container service architect, explained that containers fit nicely with microservices because they enable rapid development and delivery while standardizing operational aspects. Containers enable software to be packaged into isolated, lightweight bundles, and a container orchestration tool like Kubernetes provides the ability to deploy, scale, self-heal, load balance and roll back.

Things that should be monitored in a microservice include availability, performance, response time, latency, error rate, and application logs, according to Berg. An event-management system is needed to correlate all the data from feeds like service monitoring, log monitoring, and infrastructure monitoring, and provide actionable alerts when something happens. “You absolutely need to have monitoring pieces put in place from day one so you understand what is happening within that distributed environment, those distributed instances and how they work together,” said Berg.
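
As a rough illustration of turning raw monitoring data into actionable alerts, the Python sketch below computes an error rate and a 95th-percentile latency for a window of requests and compares them with thresholds. The Request record, the function names and the threshold values are assumptions made for this example, not part of any IBM tooling.

```python
from dataclasses import dataclass


@dataclass
class Request:
    latency_ms: float
    ok: bool


def summarize(requests):
    """Compute error rate and 95th-percentile latency for a window of requests."""
    if not requests:
        return {"error_rate": 0.0, "p95_latency_ms": 0.0}
    errors = sum(1 for r in requests if not r.ok)
    latencies = sorted(r.latency_ms for r in requests)
    # Nearest-rank percentile: rank = ceil(0.95 * n), done in integer arithmetic.
    rank = (95 * len(latencies) + 99) // 100
    return {
        "error_rate": errors / len(requests),
        "p95_latency_ms": latencies[rank - 1],
    }


def alerts(summary, max_error_rate=0.05, max_p95_ms=500.0):
    """Return a human-readable alert for each threshold that is exceeded."""
    found = []
    if summary["error_rate"] > max_error_rate:
        found.append(
            f"error rate {summary['error_rate']:.0%} exceeds {max_error_rate:.0%}"
        )
    if summary["p95_latency_ms"] > max_p95_ms:
        found.append(
            f"p95 latency {summary['p95_latency_ms']:.0f} ms exceeds {max_p95_ms:.0f} ms"
        )
    return found
```

A production event-management system would also correlate these figures with logs and infrastructure data; this sketch covers only the thresholding step.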

Berg also suggested a service mesh to help provide the overall visibility and insight into your microservices. According to IBM, a service mesh can be thought of as “network of interconnected devices with routers and switches, except in this case the network exists at the application layer, nodes are services, and routing, delivery, and other tasks are off-loaded to the service mesh.” It provides the visibility into how services are interacting, and enables a user to control those interaction models programmatically. “A service mesh moves a lot of the complexity that is typically in your application code and distributes it out into the mesh. It moves the complexity of routing, managing failures, retry logic, policy enforcement, security, and metric gathering to the mesh itself and distributes them across the microservice architecture,” said Berg.
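
To show the kind of code a mesh takes out of the application, here is a minimal Python sketch of retry logic with exponential backoff, which each service would otherwise have to carry itself. The function name and parameters are hypothetical; this is not how any particular service mesh implements retries.

```python
import time


def call_with_retries(func, attempts=3, base_delay=0.1, sleep=time.sleep):
    """Call func, retrying on any exception with exponential backoff."""
    for attempt in range(attempts):
        try:
            return func()
        except Exception:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error to the caller
            # Wait base_delay, then 2x, 4x, ... before the next attempt.
            sleep(base_delay * 2 ** attempt)
```

When this behavior lives in the mesh, the same policy applies to every service regardless of the language it is written in, and it can be changed without redeploying application code.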

Another key element to tracking and understanding everything that is happening within a microservice is collaboration. ChatOps platforms should be utilized and provide a central place where people can interact, and those interactions and communications can be logged. “The delivery pipeline process should be well documented and well understood because if changes are delivered into your production environment or all of your environments, everyone on the team should know how that system works and how to contribute to it,” said Berg.

Teams can also collaborate on root-cause analysis and prevent the incident from happening again. “This investigation must be operated in a blameless culture; only through that approach are people willing to share their insights and help others to learn from the experience,” Averdunk wrote.

Pivotal’s Crowther also noted microservice management practices can come from the design patterns of the architecture itself. Design patterns like service discovery can automatically find new services that become available in the platform so developers don’t have to worry about it. Centralized configuration allows developers to deploy across multiple different environments without having to worry about the infrastructure. And a circuit breaker pattern can safeguard against unpredictability. For instance, if you have a microservice that depends on another microservice, but can’t guarantee that microservice is always available, you can provide an alternate path to a different service. “A good example would be the Amazon homepage. If a customer goes to buy a book, but the recommendation service is currently down, it shouldn’t stop people from buying the book,” said Crowther. “If you have a graceful way of handling that outage on the recommendation engine without disrupting the rest of the flow, then that is a good design pattern to implement because at least you are giving 95 percent of the functionality back to the customer.”
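
The circuit breaker Crowther describes can be sketched in a few lines of Python. This is a simplified illustration, not Pivotal’s or Amazon’s implementation: the CircuitBreaker class, fetch_recommendations and no_recommendations are hypothetical names, and the failure threshold and cooldown values are arbitrary.

```python
import time


class CircuitBreaker:
    """After too many consecutive failures, skip the remote call for a
    cooldown period and serve the fallback instead."""

    def __init__(self, max_failures=3, reset_after=30.0, clock=time.monotonic):
        self.max_failures = max_failures
        self.reset_after = reset_after
        self.clock = clock
        self.failures = 0
        self.opened_at = None  # None means the circuit is closed

    def call(self, func, fallback):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_after:
                return fallback()  # circuit open: do not call the service
            self.opened_at = None  # cooldown elapsed: allow one trial call
        try:
            result = func()
        except Exception:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = self.clock()  # open (or reopen) the circuit
            return fallback()
        self.failures = 0  # success closes the circuit fully
        return result


def fetch_recommendations():
    # Hypothetical stand-in for a recommendation service that is down.
    raise ConnectionError("recommendation service unavailable")


def no_recommendations():
    # Graceful fallback: the page renders without recommendations.
    return []


breaker = CircuitBreaker(max_failures=2, reset_after=30.0)
page = {
    "book": "Example Title",
    "recommendations": breaker.call(fetch_recommendations, no_recommendations),
}
```

Here the page is still built when the recommendation call fails, which mirrors the idea of letting the customer buy the book even though one dependency is unavailable.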

Lastly, if you really want to successfully manage a microservice architecture, Cloud Elements’ Garrett warned against falling into the trap of what other people define as best practices. “Just because it says in some book it should be done this way doesn’t mean that is how it should be done in your organization,” he said. “It shouldn’t be an exercise of academia. It should be focused on business outcomes.”

Microservices management solutions
A lot of challenges that exist in microservices are around the fact that networking, security, monitoring, distributed tracing and resiliency are done piece by piece in every service rather than in a more consistent way, according to Varun Talwar, a product manager at Google Cloud.

To address this, Google, IBM and Lyft announced the open source project Istio in May of last year. Istio is an open platform designed to connect, secure, manage, and monitor microservices in a uniform way. “Writing reliable, loosely coupled, production-grade applications based on microservices can be challenging. As monolithic applications are decomposed into microservices, software teams have to worry about the challenges inherent in integrating services in distributed systems: they must account for service discovery, load balancing, fault tolerance, end-to-end monitoring, dynamic routing for feature experimentation, and perhaps most important of all, compliance and security,” the Istio team wrote in a post.

Istio is a service mesh that proxies service interactions and provides three sets of value-added features: security, observability, and traffic management and networking, according to Talwar. It ensures all service interactions are secure and encrypted no matter where those services are deployed, and provides service identity and access control. It monitors services in a consistent way, providing automatic metrics, logs and traces for all traffic within a cluster. And it enables fine-grained control of traffic behavior with routing rules, retries, failovers and fault injection.
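
Two of the traffic-management ideas mentioned above, weighted routing and fault injection, can be illustrated conceptually in Python. This sketch does not use Istio’s configuration format or API; the function names and the backend names such as reviews-v1 are hypothetical, and a real mesh makes these decisions in its proxies rather than in application code.

```python
import random


def pick_backend(routes, rng=random):
    """Choose a backend version according to relative weights.

    routes is a list of (backend_name, weight) pairs, for example
    [("reviews-v1", 90), ("reviews-v2", 10)] to send about 10% of
    requests to a new version.
    """
    names = [name for name, _ in routes]
    weights = [weight for _, weight in routes]
    return rng.choices(names, weights=weights, k=1)[0]


def maybe_inject_fault(abort_fraction, rng=random):
    """Return True when this request should be failed on purpose,
    so teams can test how callers cope with an unhealthy dependency."""
    return rng.random() < abort_fraction
```

Shifting the weights gradually from one version to another is one way to roll out a release to a small share of traffic before committing to it.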

“The service mesh empowers operators with policy control and decouples them from feature development and release processes, providing centralized management regardless of the scale and velocity of applications,” Google wrote in a post.

The team plans to release Istio 1.0 later this year. In addition, the team will work towards improving Istio’s performance and availability in various environments. The project can currently be deployed on Kubernetes, with plans to support additional platforms like Cloud Foundry and Apache Mesos.

“The main goal for Google is to enable developers and operators to operate in this microservices world much more easily,” said Talwar. “Our main goal is to make sure microservices is not just great on paper. We want to make it usable and operable at scale.”

In addition to Istio, Google has been working on the Open Service Broker, an API that tackles service delivery and consumption. “Through the Open Service Broker model CIOs can define a catalog of services which may be used within their enterprise and auditing tools to enforce compliance. All services powered by Istio will be able to seamlessly participate in the Service Broker ecosystem,” the company wrote.

For development teams building Internet of Things solutions, TIBCO Software recently introduced Project Flogo: an ultralight edge microservices framework aimed at integrating IoT devices. “We are a firm believer that you can’t take an existing technology stack or framework and retrofit it for IoT,” said Matt Ellis, product management and strategy architect at TIBCO.

Flogo started out solely as an integration solution for IoT, but TIBCO recently reintroduced the project with a focus on microservices. “While Flogo was designed for IoT, the same lightweight and performant characteristics also benefit cloud-based microservice deployment models. Its robust extension framework supports use cases like service discovery, circuit breaker, and other cloud-native microservice patterns―and even ‘nanoservices’ for serverless computing,” the company wrote.

Microservices are a good fit for the Internet of Things because they allow you to break out functions and pieces and push them to various edge devices without the need for an entire monolithic application running at the edge. “You embrace microservices because you want to build smaller, discrete units of work that are easily or more easily managed. Building microservices that you push out to the edge is something we see as an architecture paradigm,” said Ellis.

Edge computing makes it possible to add logic such as on-device aggregation and filtering, and reduces network dependencies. Similar to microservices, edge computing has two principles: applications need to be optimized, and they need to be able to run without dependencies.
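
On-device aggregation and filtering might look something like the Python sketch below, in which a device discards implausible sensor readings and sends one summary upstream instead of every raw value. The function name and the plausibility bounds are assumptions for illustration, and this is not Flogo code.

```python
from statistics import mean


def aggregate_on_device(readings, low=-40.0, high=85.0):
    """Filter out implausible readings, then reduce the rest to one summary."""
    valid = [r for r in readings if low <= r <= high]
    if not valid:
        return None  # nothing worth sending upstream
    return {
        "count": len(valid),
        "min": min(valid),
        "max": max(valid),
        "mean": mean(valid),
    }
```

Sending a single summary rather than each reading is what reduces the device’s dependence on the network.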

According to Ellis, the team is currently working on adding more serverless computing support to the project. Project Flogo announced support for AWS Lambda in October 2017. AWS Lambda enables developers to run code without having to worry about provisioning or managing servers. “You no longer focus on microservices, whereby you build small services that contain a few operations, rather, you focus on the development of functions and a function is exactly that, a single unit of business logic implementing value. That said, with microservices you may build and deploy 10s of microservices, however with functions you’re building 100s or 1000s. Another magical bit is the fact that you literally can scale infinitely, but also scale back to zero when your functions are not in use. That is, you don’t pay for idle time,” the Project Flogo team wrote.
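
For a sense of what “a single unit of business logic” means in practice, here is a minimal function written in the handler style that AWS Lambda uses for Python, where the platform passes in an event and a context object. The event shape, the discount rule and the function name are hypothetical, and the sketch is independent of Project Flogo.

```python
import json


def handler(event, context):
    """Single unit of business logic: price an order with a flat discount."""
    order = event.get("order", {})
    subtotal = sum(
        item["price"] * item["quantity"] for item in order.get("items", [])
    )
    # Assumed rule for illustration: 10% off orders of 100 or more.
    discount = 0.1 if subtotal >= 100 else 0.0
    total = round(subtotal * (1 - discount), 2)
    return {
        "statusCode": 200,
        "body": json.dumps({"subtotal": subtotal, "total": total}),
    }
```

The function holds no state between calls, which is what lets a platform run many copies at once or none at all when there is no traffic.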

Other microservices management solutions to consider include: Linkerd, a transparent proxy for discovery, routing, failure handling and visibility into modern apps; and Prometheus, an open source monitoring solution.