Assisting development with AI tools can be quite a divisive topic. Some people feel they’re going to replace developers entirely, some feel they can’t produce good enough code to be useful at all, and a lot of people fall somewhere in the middle. Given the interest in these types of tools over the last few years, we spoke with Phillip Carter, principal product manager at Honeycomb, on the latest episode of our podcast to get his thoughts on them.

He believes that overall these tools can be beneficial, but only if you can narrow down your use case, have the right level of expertise to verify the output, and set realistic expectations for what they can do for you.

The following is an abridged version of the conversation.

SD Times: Do you believe that these AI tools are good or bad for development teams?

Phillip Carter: I would say I lean towards good, and trending better over time. It depends on a couple of different factors. I think the first factor is seniority. The tools that we have today are sort of like the worst versions of these tools that we’re going to be using in the next decade or so. It’s kind of like when cloud services came out around 2010, 2011. There were clear advantages to using them, but for a lot of use cases, these services were just not actually solving a lot of problems that people had. And so over a number of years, there was a lot of “hey, this might be really helpful,” and they eventually sort of lived up to those aspirations. But it wasn’t there at that point in time.

I think for aiding developers, these AI models are kind of at that point right now, where there are some more targeted use cases where they do quite well, and then many other use cases where they don’t do very well at all, and they can be actively misleading. And so what you do about that depends very heavily on what kind of developer you are, right? If you’re fresh out of college, or you’re still learning how to program and you’re not really an expert in software development, the misleading nature of these tools can be quite harmful, because you don’t really have a whole lot of experience, or a gut feel for what’s right or wrong, to compare that against. Whereas if you are a more senior engineer, you can say, okay, well, I’ve kind of seen this shape of problem before, and this code that it spat out looks like it’s mostly right.

And there are all sorts of ways to use it, such as creating a few tests and making sure those tests are good, and it is a time saver in that regard. But if you don’t have that sense of, okay, this is how I’m going to verify that it’s actually correct, this is how I’m going to compare what I see with what I have seen in the past, then that can be really difficult. And we have seen cases where some junior engineers in particular have struggled with actually solving problems, because they sort of try it and it doesn’t quite do it, they try it again, it doesn’t quite do it, and they spend more time doing that than just sitting down and thinking through the problem.

One of the more junior engineers at our company leaned on these tools at first, realized that they were a little bit misleading, and stepped away to build up some of their own expertise. And then they actually came back to using some of those tools, because they found that the tools still were useful, and now that they had more of an instinct for what was good and bad, they could actually use them a little bit more.

It’s great when you know how to use it, and you know how to compare it against things that you know are good or bad. But if you don’t, then you’ve basically added more chaos into the system than there should have been.

SDT: At what point in their career would a developer be experienced enough to use these tools effectively?

PC: The most obvious example that comes to mind for me is writing test cases. There’s this understanding that that’s a domain you can apply this to even when you’re a little bit more junior in your career. Stuff is going to either pass or fail, and you can take a look at that and ask, should this have passed? Or should this have failed? It’s a very clear signal.
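
To illustrate the kind of clear pass-or-fail signal Carter is describing, here is a minimal sketch of the sort of test an assistant might suggest. The slugify function and its expected behavior are hypothetical examples for illustration, not code discussed in the conversation.

```python
# A minimal, hypothetical example of an AI-suggested unit test.
# The slugify() function and its expected behavior are illustrative
# assumptions, not something from the interview.

def slugify(title: str) -> str:
    """Convert a title into a lowercase, hyphen-separated slug."""
    return "-".join(title.lower().split())

# Tests like these give the clear signal Carter describes: each one
# either passes or fails, and even a junior engineer can judge
# whether that outcome is the right one.
def test_slugify_basic():
    assert slugify("Hello World") == "hello-world"

def test_slugify_collapses_extra_whitespace():
    assert slugify("  Spaced   Out  ") == "spaced-out"
```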

Whereas if you’re using it to edit more sophisticated code inside of your code base, it’s like, well, I’m not really sure if this is doing the right thing, especially if I don’t have a good test harness that validates that it should be doing the right thing. And that’s where that seniority, and just more life experience building software, really comes into play, because you can sort of have that sense as you’re building it, and you don’t need to fall back on having a robust test suite that really checks if you’re doing the right thing.

The other thing that I’ll say is that I have observed several junior engineers thrive with these tools quite a bit. Because it’s not really about being junior; it’s just that some engineers are better at reading and understanding code than they are at writing it. Or maybe they’re good at both, but their superpower is looking at code and analyzing it, and seeing if it’s going to do the job that it should do. And this really pushes the bottleneck in that direction. Because if you imagine for a moment, let’s say these tools were perfect at generating code. Well, now the bottleneck is entirely on understanding that code; it really has nothing to do with writing the code itself. And a lot of people who are more junior in their career can thrive in that environment, if the writing of the code is more of a bottleneck for them. But if they’re really good at understanding stuff and reading it, then they can say, this thing actually does make things faster. And they can almost use it to generate different variations of things, read through the output, and see if it actually does what it should be doing.

And so I don’t know if this is necessarily something that is universal across all engineers and junior engineers, but if you have that mindset where you’re really good at reading and understanding code, you can actually use these tools to a significant advantage today, and I suspect that will get better over time.

SDT: So even for more senior developers (or junior devs who have a special skill at reading and understanding code), are there ways in which these tools could be overused in a negative way? What best practices should teams put in place to make sure they’re not relying too heavily on these AI tools?

PC: So there are a couple of things that can happen. I’ve done this before, and I’ve had other people on the team do this as well, where they’ve used it and cycled through the suggestions and so on, and then they’ve been like, wait a minute, this would have been faster if I just wrote it myself. That does happen from time to time. It actually doesn’t happen that often, but it can.

And there are some cases where the code that you need to write is just, for whatever reason, too complicated for the model. It may not necessarily be super conceptually complicated code; it’s just that it might be something that the model right now is not particularly good at. And so if you recognize that it’s outputting something where you’re scratching your head and going, I don’t really agree with that suggestion, that’s usually a pretty good signal that you should not be relying on it too heavily at this moment in time.

There’s the ChatGPT model, where you say you want something and it outputs a whole block of code, and you copy and paste it and do something with it. That’s one model. The other model, the one that I think is more effective, that people lean on more, and that, frankly, is more helpful, is the completions model, where you’re actually still writing the code, but on a line-by-line basis it makes a suggestion. Sometimes that suggestion is bonkers, but usually it’s actually pretty good. And you’re still a little bit more in control, and you’re not just blindly copying and pasting large blocks of code without ever reading it.

And so I think in terms of tool selection, the ones that are deeply ingrained in you actually writing the code are going to lead to a lot more actual understanding of what’s going on, when you compare that to the tools that just output whole big blocks of code that you copy and paste and sort of hope will work. I think organizations should focus on those, rather than the AI coding tools that barely even work. And maybe those will get better over time, but that’s definitely not something organizations should really depend on.

There’s another model of working with these tools that’s being developed right now, by GitHub as well, that I think could show promise. It’s through their product called GitHub Copilot Workspace. Basically, you start with a natural language task, and then it produces an interpretation of that task in natural language. And it asks you to validate it: “hey, is this the right interpretation of what I should be doing?” And then you can add more steps and more sub-interpretations and edit it. And then it takes the next step and generates a specification of work. And then you say, okay, do I agree with this specification of work or not? And you can’t really continue unless you either modify it or you say, “yes, this looks good.” And then it says, “Okay, I’ve analyzed your codebase, and these are the files that I want to touch. Are these the right places to look? Am I missing something?” At every step of the way, you intervene, and you have this opportunity to disagree with it and ask it to generate something new. And eventually it outputs a block of code as a diff. So it’ll say, “hey, this is what we think the changes should be.”

What I love about that model, in theory (and I have used it in practice; it works), is that it really says software development is not just about code. It’s about understanding tasks. It’s about interpreting things. It’s about revising plans. It’s about creating a formal spec of things. Sometimes it’s about understanding where you need to work.

Because if I’m being honest, I don’t think these automated agents are going to get anywhere anytime soon, because the space that they’re trying to operate in is so complicated. They might have a place for tiny tasks that people today shunt off to places like Upwork, but for replacing teams of engineers actually solving real business problems that are complicated and nuanced, I just don’t see it. And so I feel like it’s almost a distraction to focus on that. The AI-powered stuff can really be helpful, but it has to be centered in keeping your development team engaged the entire time, and letting them use their brains to really drive this stuff effectively.

SDT: Any final thoughts or takeaways from this episode?

PC: I would say that the tools are not magic, so do not believe the hype. The marketing is way overblown for what these things can do. But when you get past all that, and especially if you narrow your tasks to very concrete, small things, these tools can actually be wonderful for helping you save time and sometimes even consider approaches that you may not have considered in the past. And so focus on that, cut through the hype, and just see it as a good tool. And if it’s not a good tool for you, discard it, because it’s not going to be helpful. That’s probably how I would advise anyone, in any capacity, to frame up these things.