Guest View: Move fast and fix things: It’s time for an API audit

Published: April 12th, 2018

If you run an open API program, the current controversy surrounding Cambridge Analytica’s use of Facebook data to create psychographic profiles of millions of Facebook users should concern you, and not just because of how your profile data may have been used.

I recall being very surprised at how much data I could access through Facebook’s application programming interface (API) back when they first released it. I could easily navigate through a specific user’s news feed and friends list and all but replicate that user’s web of social interactivity with only a handful of calls. Facebook opened this data to allow developers to create games and applications that enhanced the core purpose of Facebook at the time — connecting people and allowing them to share their lives with their friends online. While the terms of service made it clear that data was not intended to be captured and stored, there was also nothing stopping a developer from breaking those rules — and nothing Facebook could do to easily tell if the rules had been violated.

Subsequent updates to the Facebook API limited the access to much of that data, but the genie was already out of the bottle. It appears the data Cambridge Analytica used may have been gathered some time prior to 2015, before those limits were put in place.

It isn’t just Facebook
Facebook is taking a big hit on all this controversy, but there’s a part of me that feels it’s somewhat undeserved. The same data that may have been used to target specific audiences with messages of questionable veracity also allowed companies like Zynga to flourish, and helped Facebook evolve from a simple social bulletin board to a genuine social platform. I don’t believe any of this was malicious on Facebook’s part. I think it’s the unintended consequences of a drive toward radical openness marred by a culture of “move fast and break things.”

Now it’s time to move fast and fix things. If you run an API program that is open to the public, you should take this as a warning to audit your APIs now to understand exactly what data you’re exposing, who has access to it, and how that data is connected to other API endpoints in your system.

Why an API audit is important
As an example, the early Facebook API allowed a fair amount of a user’s friend data to be exposed as part of the user’s profile data. This meant that if your friend granted a third-party app access to their data, that app would also gain some limited access to your data, even if you never granted access to that app yourself. I’m aware of at least one other social network API that not only returned a user’s profile data in a single call, it also returned every one of their followers. Aside from the fat payload that created, it meant giving more data to the application than it actually required or requested.

Proper normalization of RESTful endpoints combined with endpoint-level access restrictions is one of the best ways to avoid this type of situation. For example, a user’s profile may be accessible from the endpoint ‘/users/rzazueta’. Rather than list all connected friends as part of that response, the data should contain a link to the friends list, e.g. ‘/users/rzazueta/friends’. When endpoint-level access controls are applied in the code, only those with the ability to read the friends endpoint would gain access to that information.
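
A minimal sketch of that idea follows, assuming an in-memory data store and two hypothetical scope names (`profile:read` and `friends:read`) that are not part of any real API. The profile handler returns only a link to the friends list, and the friends handler performs its own access check.

```python
# Hypothetical in-memory data; a real API would read from a datastore.
PROFILES = {"rzazueta": {"username": "rzazueta"}}
FRIENDS = {"rzazueta": ["alice", "bob"]}

# Hypothetical scope names, one per endpoint.
PROFILE_READ = "profile:read"
FRIENDS_READ = "friends:read"


class Forbidden(Exception):
    """Raised when the caller lacks the scope an endpoint requires."""


def require_scope(granted: set[str], needed: str) -> None:
    if needed not in granted:
        raise Forbidden(needed)


def get_user(username: str, granted: set[str]) -> dict:
    """Handler for /users/<username>: profile fields plus a link, no friend data."""
    require_scope(granted, PROFILE_READ)
    profile = dict(PROFILES[username])
    profile["links"] = {"friends": f"/users/{username}/friends"}
    return profile


def get_friends(username: str, granted: set[str]) -> list[str]:
    """Handler for /users/<username>/friends: access is checked separately."""
    require_scope(granted, FRIENDS_READ)
    return list(FRIENDS[username])
```

An app granted only `profile:read` can call `get_user` and see where the friends list lives, but `get_friends` raises `Forbidden` until the friends scope is granted as well.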

This, of course, means you need to set up your API packages, user roles, and endpoints in a way that allows for that level of control. Most API management systems make this relatively easy, but can only help if you’ve designed your API correctly.

If you have never done so, now is the time to audit your API: map what data is accessible to which users and ensure you’re not exposing more than you intend to, even if your API is internal only.

API Audit 101
Start by creating a map connecting which users and user roles have access to which endpoints. Ideally, all of your users will have consistent access through a set of roles rather than individual custom access. If that’s not the case, consider creating new roles that will suit those customers’ needs.
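
The mapping step could be sketched roughly as below. The role names, app IDs, and endpoint paths are hypothetical placeholders; in practice this inventory would likely be exported from your API management system.

```python
from collections import defaultdict

# Hypothetical inventory of roles and the endpoints each one may call.
ROLE_ENDPOINTS = {
    "public_app": {"/users/{id}"},
    "partner_app": {"/users/{id}", "/users/{id}/friends"},
}
USER_ROLES = {"app-123": {"public_app"}, "app-456": {"partner_app"}}
# Hypothetical per-user grants made outside any role.
CUSTOM_GRANTS = {"app-123": {"/users/{id}/friends"}}


def endpoint_access_map() -> dict[str, set[str]]:
    """Invert the role map: which roles can reach each endpoint?"""
    access = defaultdict(set)
    for role, endpoints in ROLE_ENDPOINTS.items():
        for endpoint in endpoints:
            access[endpoint].add(role)
    return dict(access)


def users_with_custom_access() -> dict[str, set[str]]:
    """Users holding endpoint access that none of their roles explains."""
    findings = {}
    for user, endpoints in CUSTOM_GRANTS.items():
        via_roles = set()
        for role in USER_ROLES.get(user, set()):
            via_roles |= ROLE_ENDPOINTS.get(role, set())
        extra = endpoints - via_roles
        if extra:
            findings[user] = extra
    return findings
```

Here `users_with_custom_access()` flags `app-123`, which reaches the friends endpoint through a one-off grant rather than a role. That is the kind of case where a new role may be the cleaner fix.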

Next, look at the data in each of those endpoints. If you’re applying content filtering to limit what data is returned to a specific user or role, make sure you mark that down. In a well-designed RESTful API, your endpoints would return only the data they are responsible for. Any data related to other endpoints should only be accessible through those endpoints, referenced through a hyperlink, as in my user profile and friend list example above. It’s tempting to provide all of that data in a single response to cut down on the number of API requests, but it also opens the door to exposing more data than intended.
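
One way to check this during an audit, sketched here under the assumption that you keep a simple record of which fields each endpoint owns (the `OWNED_FIELDS` table and its field names are hypothetical), is to compare sample responses against that record:

```python
# Hypothetical record of which top-level fields each endpoint is responsible for.
OWNED_FIELDS = {
    "/users/{id}": {"username", "display_name", "links"},
    "/users/{id}/friends": {"friends", "links"},
}


def unexpected_fields(endpoint: str, sample_response: dict) -> set[str]:
    """Return top-level fields in a sample response that the endpoint does not own."""
    return set(sample_response) - OWNED_FIELDS[endpoint]
```

A profile response that embeds the friends list, such as `{"username": "rzazueta", "friends": ["alice"]}`, would be reported with `friends` as an unexpected field for `/users/{id}`.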

If your API is designed to return more data in fewer calls, you should consider moving that logic from the core API code to a layer that calls on the core APIs to consolidate and respond to those requests as a separate function. This pattern, called “Backend for Frontend” or “BFF,” has been adopted by companies such as Netflix to make it easier to create APIs that serve specific client needs. BFFs allow for an extra level of access control, as they should be limited by the same access levels as the customers using them.
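
As a rough illustration, not a description of how Netflix or any other company implements it, a BFF handler can consolidate several core calls while passing the caller’s own access through unchanged. The function names and scope strings below are hypothetical.

```python
class Forbidden(Exception):
    """Raised when the caller lacks the scope a core endpoint requires."""


# Hypothetical core API handlers, each enforcing its own scope.
def core_get_user(username: str, scopes: set[str]) -> dict:
    if "profile:read" not in scopes:
        raise Forbidden("profile:read")
    return {"username": username}


def core_get_friends(username: str, scopes: set[str]) -> list[str]:
    if "friends:read" not in scopes:
        raise Forbidden("friends:read")
    return ["alice", "bob"]


def bff_profile_page(username: str, scopes: set[str]) -> dict:
    """BFF handler: one response for a client screen, built from core calls.

    The caller's scopes are passed through unchanged, so the BFF cannot
    return anything the caller could not fetch directly.
    """
    page = {"user": core_get_user(username, scopes)}
    try:
        page["friends"] = core_get_friends(username, scopes)
    except Forbidden:
        page["friends"] = None  # left empty rather than leaked
    return page
```

Because the consolidation lives outside the core handlers, the core endpoints stay narrowly scoped, and the single-call convenience exists only for clients that already hold every scope involved.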

Over the years, I’ve spoken with a number of companies that have hesitated to move forward with an API program for fear it could be a vector of attack for hackers. The Cambridge Analytica case would seem to confirm those fears, though perhaps not in the ways once imagined. The situation serves as a clear warning to API providers that data security must go beyond basic access controls and firewalls. Good API management systems can significantly improve the security of your APIs. Those designing the APIs, however, must keep in mind the potential unintended consequences of their design decisions.

Article Tags

API, APIs, Cambridge Analytica, data, developers, Facebook

About Rob Zazueta

Rob Zazueta is the director of digital strategy at TIBCO.
