
AI and Coding: How ChatGPT and Other NLPs Can Get You Most of the Way There
While it may feel as though Natural Language Processors (NLPs) such as ChatGPT emerged overnight, in actuality, they have been around for quite some time.
What once felt like a novelty has quickly become embedded in workflows across the globe, and software development is no exception. In the past year, tools like GitHub Copilot and ChatGPT have gone from experimental curiosities to everyday companions for many developers. These NLPs are touted as making our lives easier, but how much can they actually (accurately) accomplish at this time? What are their limitations?
Many stories have emerged about how AI could one day replace many jobs. On the list of purportedly at-risk fields are coders and software engineers, but can these programs actually produce the same caliber of work as a trained and experienced developer? Cloudberry’s Senior Developer, Mikhail Kornienko, sought to find out just how advanced these offerings are at performing the craft he has honed over his 20+ years in the field. The following details Mikhail’s experience in testing the capabilities and shortcomings of NLPs in writing (meaningful) code.
What tools are currently available?
The AI landscape for coding is rich—and evolving fast. But two tools have set themselves apart. The following have been reviewed and are being used in Cloudberry development:
- GitHub Copilot: $10 per month subscription
- OpenAI’s ChatGPT: Limited free version available, $20 per month subscription, or pay-per-token API pricing
GitHub Copilot was originally created as a tool to support programming. Embedded directly into the Integrated Development Environment (IDE), Copilot can tackle the mundane and repetitive: it anticipates patterns, auto-completes functions, and can even scaffold modules. Some IDEs, such as VS Code and PhpStorm, support this integration; others, such as Apple’s Xcode, do not (at least not without jumping through hoops).
How can these tools best be leveraged as they exist today?
Copilot
We primarily allow Copilot to generate code as it sees fit. In many cases, the code served up is in pretty good shape, saving time on grunt work. Letting AI handle the grunt work, always under strict developer supervision, is our recommended best practice for using AI at the moment.
So what constitutes “grunt work”?
- Creating initial chunks of code based on the context it has access to (usually the source code, plus additional context that accumulates during coding, such as the programmer correcting things Copilot initially generated)
- Creating chunks of code on a “per-request” basis, where Copilot writes code from a written prompt provided by the programmer that describes the desired functionality (a function, class, etc.)
- Creating predictable structures (for example, sequences of objects whose contents change predictably, such as incrementing IDs, inferred from the first few items in the sequence)
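To make the last case concrete, here is a minimal sketch of what “predictable structures” completion looks like in practice. The data and field names are invented for illustration; the point is that a programmer types the first couple of entries and the tool continues the pattern, which the programmer then reviews:

```python
# The programmer writes the first two entries by hand...
menu_items = [
    {"id": 1, "slug": "home", "label": "Home"},
    {"id": 2, "slug": "about", "label": "About"},
    # ...and a tool like Copilot, spotting the pattern,
    # proposes continuations such as:
    {"id": 3, "slug": "services", "label": "Services"},
    {"id": 4, "slug": "contact", "label": "Contact"},
]

# The IDs increment predictably: exactly the kind of repetition
# worth delegating, and then double-checking.
assert [item["id"] for item in menu_items] == [1, 2, 3, 4]
```

The review step matters even here: an off-by-one ID or a misspelled slug in a generated entry is precisely the sort of small, easy-to-miss bug discussed below.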
While passing off grunt work can help lighten a programmer’s workload, strict developer control is necessary because the AI can generate code that simply does not work. Although this is rare, it does happen. More often, AI generates code with small, easy-to-miss bugs or gotchas, which can result in extensive collateral damage. Perhaps most hilariously, the AI can also generate “dreamt-up” code, which appears correct but calls functions or APIs that do not actually exist in the language or its libraries.
In our workflow, Copilot runs alongside the programmer, offering chunks of code; the programmer quickly reviews each proposal and either accepts it as-is, accepts it on the condition that the code be modified before use, or rejects it altogether.
Even when Copilot provides unsatisfactory code, the rejected suggestion can still be helpful: it may contain “ideas” or “hints” of its thought process that a programmer can pick up on. In such cases, we fuse the programmer’s thinking with the input provided by Copilot to arrive at a functional solution. Copilot can often interpret the programmer’s revisions on its own, and with a few “manual” hints (i.e., the programmer writing a few lines of correct code), it may start providing much better suggestions.
ChatGPT
While Copilot is used for general-flow coding, ChatGPT can be used for even more.
Here is how we approached this experiment: we set up a local server that talks to the ChatGPT API, using its on-demand, pay-per-token pricing rather than the $20/month plan. This route is much more affordable. We did not use the free plan because it tends to be throttled or altogether unavailable at times (even though the free tier can occasionally be faster, or could even be running a higher iteration of the GPT engine entirely, such as GPT-4 vs. GPT-3.5). So far, for our use cases, GPT-3.5 has worked well enough.
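A setup like this needs very little plumbing. The following is a minimal sketch (not the exact server described above) of calling the chat completions endpoint with pay-per-token billing, using only the Python standard library; the endpoint URL, payload shape, and model name follow OpenAI’s API as of this writing, and the API key is assumed to live in an environment variable:

```python
import json
import os
import urllib.request

API_URL = "https://api.openai.com/v1/chat/completions"

def build_payload(user_message, model="gpt-3.5-turbo"):
    """Assemble the JSON body for one chat completion request."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": user_message}],
    }

def ask(user_message):
    """Send one question to the API; billing is per token, no subscription."""
    request = urllib.request.Request(
        API_URL,
        data=json.dumps(build_payload(user_message)).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {os.environ['OPENAI_API_KEY']}",
        },
    )
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    # The generated text lives in the first choice's message.
    return body["choices"][0]["message"]["content"]
```

Wrapping this in a small local server then gives the whole team on-demand access without each person paying for a monthly plan.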
The primary use of ChatGPT is for “research”, or writing code on demand, followed by the programmer talking to the system, asking it to make specific changes to the code, or providing feedback.
Using ChatGPT for research is like having documentation for any programming language, concept, or algorithm at hand, but in an extremely fast and flexible form. All of this information is already available through general Google searches or in the docs, but ChatGPT makes the process easier and, in most cases, exponentially faster. Depending on the level of information provided, it is possible to jump right into coding, or to ask ChatGPT to expand on a specific topic, digging deeper into certain concepts. This can be very useful when writing code in a new language or for an unfamiliar framework; Copilot is less useful in that case, because it has very little context for how things should actually be coded. ChatGPT can be fed natural-language requests and will output code. That code, in turn, can be processed by the programmer: analyzed, learned from, and checked for potential issues. It can then be iteratively and interactively improved or modified in human-readable form, simply by chatting with the ChatGPT system.
When it comes to programming (though this also holds for most professional topics), ChatGPT works best when a user limits or channels its focus to something specific, rather than letting it fetch from a wide array of available sources. This “channeling” is normally done using “prompts”. A prompt is given to the system at the beginning of an interaction (prompts for GPT systems are a fairly complex topic in general, and we have yet to understand them completely). The following is an example prompt that could be used when writing code with the PHP/Laravel framework:
“You are a Laravel programmer helper. You help the person who asks you different programming-related questions. The system you are helping with is built using Laravel. You must use Laravel 8 or later to generate responses. Be brief and concise. When unsure about something, mention it.”
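In the chat API, a prompt like this is conventionally supplied as the first message with the `system` role, and the conversation then grows turn by turn. The sketch below shows that wiring; the helper function names are ours, invented for illustration:

```python
# The article's example prompt, sent once with the "system" role so it
# "channels" every subsequent answer in the conversation.
SYSTEM_PROMPT = (
    "You are a Laravel programmer helper. You help the person who asks you "
    "different programming-related questions. The system you are helping "
    "with is built using Laravel. You must use Laravel 8 or later to "
    "generate responses. Be brief and concise. When unsure about "
    "something, mention it."
)

def start_conversation():
    """A fresh message history, channeled by the system prompt."""
    return [{"role": "system", "content": SYSTEM_PROMPT}]

def add_turn(messages, role, content):
    """Append one exchange; role is 'user' or 'assistant'."""
    messages.append({"role": role, "content": content})
    return messages

# Each user question and each model reply gets appended, so the model
# keeps both the system prompt and the running context in view.
conversation = start_conversation()
add_turn(conversation, "user", "How do I define a one-to-many relation?")
```

Because the whole history is resent on every request, keeping the system prompt brief (as the example above does) also keeps per-token costs down.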
Overall, AI systems are helpful and can be pretty big time savers, but, as with everything else, making these systems serve you well takes using them and learning them. In the case of AI systems, learning to control and limit them is key: always be ready to be deceived, or to get a generated response that will kill your database… or worse.
A developer’s risk of being replaced: Mikhail’s take
As a developer, embracing change and adapting has always been one of the most important things, not only to stay relevant but to remain in the field at all; so whether or not to embrace AI is not really a difficult choice to make.
The concern about making developers obsolete has been hotly debated. Some developers have tried (and continue to try) not to touch the AI-assisted approach with a ten-foot pole. They could have the last laugh, who knows? Throughout the history of the field, developers have gone from no documentation at all, to documentation in the form of Xerox copies, to piles of books full of documentation and code snippets, to the very early days of online communities and rough search engines, to Google and Stack Overflow, and finally, now, to the ChatGPTs of the world (!), and they have not yet become obsolete. Does this mean all of the talk of being replaced is simply the latest exaggeration? If we choose to stay relevant and embrace it, we must also embrace the inherent consequences that accompany this scary and exciting advancement.