Machine learning - Getting to deployment

Published: November 10th, 2020

The benefits of machine learning (ML) are becoming increasingly clear in virtually all fields of research and business, and a growing array of tools is available to help people get started. Hang-ups can and do occur, though, so this guide aims to help practitioners find their footing on AWS, using PyTorch specifically.

From data collection to cleaning and analysis, the amount of work required to prepare data for an ML model is extensive. Getting there is no easy feat, and once the data is ready, getting from data to deployed models can seem like rocket science. For many data scientists, it can feel like planning extensively for a big trip, getting every detail in order, and then showing up at the airport and being escorted to the cockpit to fly the plane. Where do you even begin?

While many ML models are run on machines on premises, not everyone has access to capable workstations that can crunch large amounts of data in acceptable time frames. Many researchers are turning to AWS and its NVIDIA GPU-capable instances to run their workloads. But logging into these systems can be confusing for people who are new to AWS and aren’t sure where to begin.

Deployment can be incredibly challenging, but as with any skill, a good guide can show you the right path and give you real-world experience so you can work more efficiently. At Six Nines, we’ve developed a guide to help practitioners who are just starting out understand the decisions needed to get their models from concept to a working ML training deployment, and then scale those deployments into clusters. The guide, titled “Getting started with a ML training model using AWS & PyTorch,” walks practitioners through choosing the AWS instances that are right for the ML model they’re trying to train, and the steps to take to get started. Everyone from beginners to skilled practitioners looking for a shortcut to getting their models into the right cloud environment can benefit from this tutorial.

The guide examines three of the major machine learning instance types using NVIDIA GPUs available through AWS, from single GPU to multi-GPU deployments. These include the following:

Amazon EC2 G4 Instances – The G4 instances are the most cost-effective option for small-scale training and inference. They suit early proof-of-concept work and situations where time is not a limiting factor.
Amazon EC2 P3 Instances – Accelerate your machine learning with high performance computing in the cloud using the P3 Instances. Use these instances to speed up your training and iteration time so that you can do more with your ML models.
Amazon EC2 P3dn Instances – Explore larger and more complex machine learning algorithms with twice the power of the P3 Instances. Choose this instance when you are ready for fast turnaround on your model training, or when you need distributed ML training.
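
As a rough illustration, the trade-offs described above can be written as a small decision helper. This is only a sketch: the function name and its two yes/no inputs are hypothetical simplifications, not something taken from the guide, and a real choice would also weigh budget, dataset size, and model size.

```python
def suggest_instance_family(time_sensitive: bool, distributed: bool) -> str:
    """Map two coarse requirements to one of the three instance families.

    Hypothetical helper: the rules only restate the descriptions above.
    """
    if distributed:
        # P3dn is described as the choice when distributed training is needed.
        return "P3dn"
    if time_sensitive:
        # P3 is described as the way to speed up training and iteration time.
        return "P3"
    # G4 is described as the cost-effective option for small-scale work
    # where time sensitivity is not a limiting factor.
    return "G4"


if __name__ == "__main__":
    # An early proof of concept with no deadline and a single machine.
    print(suggest_instance_family(time_sensitive=False, distributed=False))
```

The order of the checks is a design choice in this sketch: the distributed requirement is tested first because it is the most restrictive of the three descriptions.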

Once you’ve selected the instance that is right for your purpose, the guide provides walkthroughs of specific training workloads, showing the steps needed for some of the most popular types of ML applications.

These include:

Training a ResNet-50 ImageNet Model using PyTorch on a single AWS G4 or P3 Instance
Training a ResNet-50 ImageNet Model using PyTorch on multiple AWS G4 or P3 Instances
Training a BERT Fine Tuning Model using PyTorch on a single AWS P3 Instance
Training a BERT Fine Tuning Model using PyTorch on multiple AWS P3 Instances
Object Detection Training using Mask R-CNN on AWS P3dn Instances
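
For the multi-instance walkthroughs, one common way to start a PyTorch job is to run the same launcher command on every instance, changing only the node rank. The sketch below only assembles such a command and does not run it. It assumes PyTorch's `torchrun` launcher, which postdates this article, so the guide itself may use a different one. The function name, the script name `train.py`, and the address in the example are placeholders.

```python
from typing import List


def build_torchrun_command(
    nnodes: int,
    gpus_per_node: int,
    node_rank: int,
    master_addr: str,
    master_port: int = 29500,
    script: str = "train.py",  # placeholder name for the training script
) -> List[str]:
    """Assemble a torchrun command line for one node of a multi-node job."""
    if not 0 <= node_rank < nnodes:
        raise ValueError("node_rank must be in the range [0, nnodes)")
    return [
        "torchrun",
        f"--nnodes={nnodes}",                 # number of instances in the job
        f"--nproc_per_node={gpus_per_node}",  # one worker process per GPU
        f"--node_rank={node_rank}",           # index of this instance
        f"--master_addr={master_addr}",       # address of the rank-0 instance
        f"--master_port={master_port}",       # port used for the rendezvous
        script,
    ]


if __name__ == "__main__":
    # Two instances with 8 GPUs each; this is the command for the second one.
    # The IP address is a made-up private address used for illustration.
    print(" ".join(build_torchrun_command(2, 8, 1, "10.0.0.1")))
```

Returning a list of arguments rather than a single string keeps the result usable with `subprocess.run` without shell quoting.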

Machine learning is becoming a critical tool for organizations of all types, but one of the biggest challenges is knowing where to start. There are many considerations and factors to manage when deploying a machine learning model, or a fleet of them. Six Nines is glad to help by providing resources, and even manpower, to get it done.

To download the “Getting started with a ML training model using AWS & PyTorch” guide, please click the link. Please feel free to use the page as a resource for feedback and conversation with our community about your process, and anything that can be done to help you along.

Article Tags

artificial intelligence, cloud, deployment, machine learning

About Matthew Brucker

Matthew Brucker is a director and solutions architect at Six Nines.


