The benefits of microservices have been touted for years, and their popularity is clear when you consider the explosion in use of technologies such as Kubernetes over the last few years. Based on the number of successful implementations, that popularity appears to be deserved. 

For example, according to a 2020 survey by O’Reilly, 92% of respondents reported some success with microservices, with 54% describing their experience as “mostly successful” and under 10% describing a “complete success.” 

But building and managing all of these smaller units of code adds a lot of complexity to the equation, and it’s important to get that management right to achieve those successes. Developers can create as many microservices as they need, but those services require good governance, especially as their number increases. 

According to Mike Tria, head of platform engineering at Atlassian, there are two schools of thought when it comes to managing the proliferation of microservices. One is simply to keep the number of microservices to a minimum so that developers don’t have to think about things like scale and security.

“Every time they’re spinning up a new microservice they try to keep them small,” Tria said. “That works fine for a limited number of use cases and specific domains, because what will happen is those microservices will become large. You’ll end up with, as they say, a distributed monolith.”

The other option is to let developers spin up microservices whenever they want, which requires some additional considerations, according to Tria. Incorporating automation into the process is the key to ensuring this can be done successfully. 

“If every time you’re building some new microservice, you have to think about all of those concerns about security, where you’re going to host it, what’s the IAM user and role that you need access to, what other services can it talk to—If developers need to figure all that stuff out every time, then you’re going to have a real scaling challenge. So the key is through automating those capabilities away, make it such that you could spin up microservices without having to do all of those things,” said Tria.
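To make that concrete, here is a minimal, hypothetical sketch of what automating those concerns away might look like: a developer fills in a short declarative descriptor, and a provisioning step expands it into hosting, IAM, and service-to-service permissions. The descriptor fields, function names, and example services below are illustrative assumptions, not Atlassian’s actual tooling.

```python
# Hypothetical sketch of "automating those capabilities away": a declarative
# descriptor that a provisioning step expands into hosting, IAM, and
# service-to-service permissions. All names are illustrative.
from dataclasses import dataclass, field

@dataclass
class ServiceDescriptor:
    name: str
    team: str                                  # owning team, not an individual
    reliability_tier: int = 2                  # 1 = strictest uptime requirements
    allowed_upstreams: list[str] = field(default_factory=list)

def provision(desc: ServiceDescriptor) -> None:
    """Stand-in for an internal platform pipeline; a real system would call
    Terraform, a PaaS API, an IAM service, and so on."""
    print(f"scaffolding repo + CI for {desc.name} (owner: {desc.team})")
    print(f"creating IAM role svc-{desc.name}, callable upstreams: {desc.allowed_upstreams}")
    print(f"registering {desc.name} in the service catalog, tier {desc.reliability_tier}")
    print(f"applying default TLS/authn/authz policies to {desc.name}")

# A developer spinning up a new service only fills in the descriptor:
provision(ServiceDescriptor(
    name="invoice-renderer",
    team="billing-platform",
    allowed_upstreams=["customer-profile", "pdf-renderer"],
))
```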

According to Tria, the main benefits of automation are scalability, reliability, and speed. First, automation makes it possible to scale because new microservices can be created without burdening developers. Second, reliability is encapsulated in each microservice, so the system as a whole becomes more reliable. Finally, teams gain nimbleness and speed because each can build microservices at its own pace. 

At Atlassian, the team built its own tool for managing microservices, but Tria recommends starting small with an off-the-shelf tool. That lets you get to know your microservices and figure out your needs, rather than trying to predict those needs up front and buying an expensive solution that has features you don’t need or lacks features you do. 

“It’s way too easy with microservices to overdo it right at the start,” Tria said. “Honestly, I think that’s the mistake more companies make getting started. They go too heavy on microservices, and right at the start they throw too much on the compute layer, too much service mesh, Kubernetes, proxy, etc. People go too, too far. And so what happens is they get bogged down in process, in bureaucracy, in too much configuration when people just want to build features really, really fast.”

In addition to incorporating automation, there are a number of other ways to ensure success with scaling microservices.

  1. Incorporate security 

By their nature, microservices raise additional security concerns, according to Tzury Bar Yochay, CTO and co-founder of application security company Reblaze. Traditional software architectures use a castle-and-moat approach with a limited number of ingress points, which makes it possible to defend the application simply by securing the perimeter. 

Microservices, however, are each independent entities that are Internet-facing. “Every microservice that can accept incoming connections from the outside world is potentially exposed to threats within the incoming traffic stream, and it has other security requirements as well (such as integrating with authentication and authorization services). These requirements are much more challenging than the ones typically faced by traditional applications,” said Bar Yochay. 

According to Bar Yochay, new and better approaches to securing cloud-native architectures are constantly being invented. Service meshes are one example: an addition to microservices architectures that enables services to communicate with each other, a mesh can build traffic filtering right into itself and block hostile requests before the microservice receives them. Beyond added security, service meshes offer benefits like load balancing, discovery, failure recovery, metrics, and more.
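As an illustration of filtering traffic before it reaches the service, here is a small sketch written as Python WSGI middleware. In a real service mesh this logic would live in the sidecar proxy (for example, an Envoy filter) rather than in application code; the rules and limits below are assumptions chosen for the example.

```python
# Illustrative only: the kind of filtering a mesh sidecar performs before
# traffic reaches the microservice, sketched as WSGI middleware.
BLOCKED_PATH_PREFIXES = ("/admin", "/.git")   # hypothetical deny rules
MAX_BODY_BYTES = 1_000_000

class TrafficFilter:
    def __init__(self, app):
        self.app = app  # the microservice itself

    def __call__(self, environ, start_response):
        path = environ.get("PATH_INFO", "")
        length = int(environ.get("CONTENT_LENGTH") or 0)
        if path.startswith(BLOCKED_PATH_PREFIXES) or length > MAX_BODY_BYTES:
            # The hostile request is rejected here; the service never sees it.
            start_response("403 Forbidden", [("Content-Type", "text/plain")])
            return [b"blocked by traffic filter"]
        return self.app(environ, start_response)

def microservice(environ, start_response):
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"hello from the service"]

app = TrafficFilter(microservice)  # wrap the service, mesh-style
# e.g. wsgiref.simple_server.make_server("", 8000, app).serve_forever()
```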

These advantages of service meshes are more pronounced when they are deployed across a larger number of microservices, but smaller architectures can benefit from them as well, according to Bar Yochay. 

Of course, the developers in charge of these microservices are also responsible for security, but there are a lot of challenges in their way. For example, there can often be friction between developers and security teams because developers want to add new features, while security wants to slow things down and be more cautious. “As more apps and services are being maintained, there are more opportunities for these cultural issues to arise,” Bar Yochay said. 

In order to alleviate the friction between developers and security, Bar Yochay recommends investing in developer-friendly security tools for microservices. According to him, there are many solutions on the market today that allow for security to be built directly into containers or into service meshes. In addition, security vendors are also advancing their use of technology, such as by applying machine learning to behavioral analysis and threat detection. 

  2. Make sure your microservices don’t get too big

“We’ve seen microservices turn into monolithic microservices and you get kind of a macroservice pretty quickly if you don’t keep and maintain it and keep on top of those things,” said Bob Quillin, chief ecosystem officer at vFunction, a company that helps migrate applications to microservices architectures. 

Dead code is one thing that can quickly lead to microservices that are bigger than they need to be.  “There is a lot of software where you’re not quite sure what it does,” said Quillin. “You and your team are maintaining it because it’s safer to keep it than to get rid of it. And that’s what I think that eventually creates these larger and larger microservices that become almost like monoliths themselves.” 
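One way to gather evidence about suspected dead code is simply to record which handlers are ever called in production. The sketch below is a hypothetical illustration of that approach, not a tool Quillin or vFunction describes; all names are made up.

```python
# A small, hypothetical way to gather evidence about code you're "not quite
# sure what it does": record when each handler is actually called, so routes
# that never show up become candidates for removal.
import time
from functools import wraps

last_called: dict[str, float] = {}   # handler name -> last invocation time

def track_usage(fn):
    @wraps(fn)
    def wrapper(*args, **kwargs):
        last_called[fn.__name__] = time.time()
        return fn(*args, **kwargs)
    return wrapper

@track_usage
def render_invoice(invoice_id: str) -> str:
    return f"invoice {invoice_id}"

@track_usage
def legacy_export(invoice_id: str) -> str:   # suspected dead code
    return f"legacy export {invoice_id}"

render_invoice("42")
unused = [name for name in ("render_invoice", "legacy_export") if name not in last_called]
print("never called since deploy:", unused)   # -> ['legacy_export']
```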

  3. Be clear about ownership

Rather than having an individual own a microservice, Tria recommends having a team own it. 

“Like in the equivalent of ‘it takes a village,’ it takes a team to keep a microservice healthy: to upgrade it, to make sure it’s checking in on its dependencies, on its rituals around things like reliability and SLOs. So I think the good practice is to have a team on it,” said Tria. 

For example, Atlassian has about 3,000 developers and roughly 1,400 microservices. Assuming teams of five to 10 developers, this works out to every team owning two or three microservices, on average, Tria explained.

  4. Don’t get too excited about the polyglot nature of microservices

One of the benefits of microservices—being polyglot—is also one of the downsides. According to Tria, one of Atlassian’s initial attractions to microservices was that they could be written using any language. 

“We had services written in Go, Kotlin, Java, Python, Scala, you name it. There’s languages I’ve never even heard of that we had microservices written in, which from an autonomy perspective and letting those teams run was really great. Individual teams could all run off on their own and go and build their services,” said Tria.

However, this flexibility created a transferability problem: languages and services could not easily move across teams, and microservices written in a particular language needed developers familiar with that language to maintain them. Eventually, Tria’s team realized they needed to standardize on two or three languages.

Another recommendation Tria has, based on his team’s experience, is to understand how much you are asking the network to do for you. He recommends investing in things like service discovery early on. “[At the start] all of our services found each other just through DNS. You would reach another service through a domain name. What that did is it put a lot of pressure on our own internal networking systems, specifically DNS,” said Tria. 
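The sketch below illustrates, under assumed names, why a discovery layer with client-side caching takes pressure off DNS: the client resolves a service name through a cached registry lookup instead of issuing a DNS query on every call.

```python
# Hypothetical sketch of a discovery layer that spares DNS: the client
# resolves a service name through a local cache backed by a registry
# instead of a per-call DNS lookup. Names and addresses are illustrative.
import time

REGISTRY = {  # stand-in for a real registry (Consul, Eureka, an internal catalog)
    "customer-profile": ["10.0.4.12:8443", "10.0.4.13:8443"],
}

_cache: dict[str, tuple[float, list[str]]] = {}
CACHE_TTL_SECONDS = 30.0

def resolve(service: str) -> list[str]:
    """Return endpoints for a service, hitting the registry at most once
    per TTL instead of resolving a DNS name on every request."""
    now = time.time()
    cached = _cache.get(service)
    if cached and now - cached[0] < CACHE_TTL_SECONDS:
        return cached[1]
    endpoints = REGISTRY[service]          # real code: registry API call
    _cache[service] = (now, endpoints)
    return endpoints

print(resolve("customer-profile"))  # first call hits the "registry"
print(resolve("customer-profile"))  # second call is served from cache
```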

Figuring out a plan for microservices automation at Atlassian 

Atlassian’s Tria is a proponent of incorporating automation into microservices management, but his team had to learn that the hard way. 

According to Tria, when Atlassian first started using microservices back in early 2016, it had about 50 to 60 of them in total, and all of them were tracked on a Confluence page. That page listed every microservice, who owned it, whether it had passed SOC2 compliance yet, and the on-call contact for that microservice.

“I remember at that time we had this long table, and we kept adding columns to the table and the columns were things like when was the last time a performance test was run against it, or another column was what are all the services that depend on it? What are all the services it depends on? What reliability tier is it for uptime? Is it tier one where it needs very high uptime, tier two where it needs less? And we just kept expanding those columns.”

Once the table hit 100 columns, the team realized that wouldn’t be maintainable for very long. Instead, they created a new project to take the capabilities they had in Confluence and turn them into a tool.

“The idea was we would have a system where when you build a microservice, the system essentially registers it into a central repository that we have,” said Tria. “That repository has a list of all of our services. It has the owners, it has the reliability tiers, and anyone within the company can just search and look up a service. And we made the tool pretty pluggable so that we can plug in new capabilities as we add them to our services.”
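As a rough illustration of the kind of registry Tria describes, here is a minimal sketch in which each microservice registers a record carrying the attributes that used to live as columns on the Confluence page. The field names and example values are assumptions, not Atlassian’s actual schema.

```python
# Minimal, hypothetical model of a central service catalog: each service
# registers a record with owner, compliance, on-call, tier, and dependencies,
# plus an open-ended map for "pluggable" capabilities added later.
from dataclasses import dataclass, field

@dataclass
class ServiceRecord:
    name: str
    owner_team: str
    on_call: str
    soc2_compliant: bool
    reliability_tier: int               # 1 = highest uptime requirements
    depends_on: list[str] = field(default_factory=list)
    extra: dict[str, str] = field(default_factory=dict)   # pluggable capabilities

catalog: dict[str, ServiceRecord] = {}

def register(record: ServiceRecord) -> None:
    catalog[record.name] = record       # real system: durable store behind an API

register(ServiceRecord(
    name="invoice-renderer",
    owner_team="billing-platform",
    on_call="billing-platform-oncall",
    soc2_compliant=True,
    reliability_tier=1,
    depends_on=["customer-profile", "pdf-renderer"],
    extra={"last_perf_test": "2016-03-01"},
))

# Anyone in the company can search the catalog, e.g. all tier-1 services:
print([s.name for s in catalog.values() if s.reliability_tier == 1])
```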