How I use GitHub Copilot

Image (DALL-E 3 prompt): "Cyberpunk Software Engineer who has long hair with a parrot on his shoulder, the parrot has a sign hanging around his neck that says 'Github'. The developer is surrounded by computer screens"

Introduction

Nothing has revolutionized my daily life in software development quite like GitHub Copilot. The original version of Copilot is basically a code autocomplete system on steroids. Although I haven't yet explored newer features like Copilot for CLI or the other projects in GitHub Next, after using Copilot in VS Code for a year it's hard to imagine life without it.

In college, I learned basic C and C++. Then, when I decided to transition from electrical engineering to software, I taught myself JavaScript/HTML/CSS (thanks, Codecademy!). After that, I jumped into Python, advanced C++, Java, Rust, Go, and a few others. Most languages share the same core concepts (every language has some notion of a for loop, an if statement, and so on), but the syntax diverges quickly and it can be hard to remember which language uses which. For example, if you want to add an element to an array, is it arr.push() or arr.append()? That is hard to keep straight if you switch between languages regularly, or come back to a language you haven't used in a while. The problem is even more pronounced when it comes to libraries.
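As a quick illustration of how the same operation is spelled differently across languages, here is the Python version, with rough equivalents from a few other languages noted in comments. The variable name is only for illustration.

```python
# Appending to a growable array looks different in each language.
numbers = [1, 2, 3]

# Python lists use append(); a Python list has no push() method.
numbers.append(4)

# Rough equivalents elsewhere:
#   JavaScript: numbers.push(4)
#   C++:        numbers.push_back(4)          (std::vector)
#   Java:       numbers.add(4)                (ArrayList)
#   Rust:       numbers.push(4)               (Vec)
#   Go:         numbers = append(numbers, 4)  (slice)

print(numbers)  # [1, 2, 3, 4]
```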

Last year I moved from a team that worked primarily in C++ to a team that works primarily in Python. In Python we use popular data science libraries such as pandas, numpy, and transformers, to name a few. Although I had worked with all of these libraries a little in grad school, I quickly found that my knowledge of their APIs was severely limited.

A few years ago, my approach to learning a new library would have looked something like:

1. Find the library documentation on the web.
2. Read through the docs, searching for the relevant information.
3. Try out the library function that I think solves my problem.
4. When it doesn't work, look for a different function or read the library source code to figure out what is going wrong.

With GitHub Copilot, my approach now mostly looks like this:

1. Type a comment into my code that describes what I want to do, e.g. "Load the parquet file from s3://x/y/z.parquet, filter for only the items that satisfy x constraint, and print a histogram using 300 bins".
2. Copilot fills in the code to complete the task.
3. If it's right, I read the suggested code and now know which library function accomplishes the task.
4. If it's wrong, I jump to the library's source code and work out what was wrong with the way the function was called.
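To make this workflow concrete, here is a sketch of the kind of code such a comment might produce. It is not a captured Copilot suggestion, and real suggestions will vary. The column name, the threshold, and both function names are hypothetical stand-ins for "x constraint". Reading from S3 with pandas also assumes a parquet engine and the optional s3fs package are installed. The histogram is computed with numpy and printed as text; a plot would work just as well.

```python
import numpy as np
import pandas as pd


def filter_and_histogram(df: pd.DataFrame, column: str, minimum: float, bins: int = 300):
    """Keep rows where `column` >= `minimum` and bin the surviving values."""
    filtered = df[df[column] >= minimum]
    counts, edges = np.histogram(filtered[column].to_numpy(), bins=bins)
    return counts, edges


def load_filter_and_print_histogram(path: str, column: str, minimum: float) -> None:
    # Load the parquet file, filter it, and print a histogram using 300 bins.
    df = pd.read_parquet(path)
    counts, edges = filter_and_histogram(df, column, minimum, bins=300)
    # edges has one more entry than counts; zip pairs each count with its left edge.
    for count, left_edge in zip(counts, edges):
        print(f"{left_edge:10.3f} | {count}")
```

With the path from the comment, a call would look like load_filter_and_print_histogram("s3://x/y/z.parquet", column="score", minimum=0.5), where "score" and 0.5 are made-up values.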

The key difference is that I usually no longer have to figure out the best function call for a task. Copilot is powered by a Large Language Model (LLM) trained on a large amount of existing code, so it already knows which function is most commonly used to solve the task being described. In effect, I'm trusting that the majority of developers are using the correct logic. That trust has limits: there's a famous story about a buggy Stack Overflow answer that ended up in codebases all over the place because everyone copied and pasted it into their code 🤣. It's not unreasonable to assume that bugs like this exist in the code that Copilot suggests.

If you are using a library that isn't public, Copilot may not have a great understanding of how it is used (though even that is changing as Copilot's growing context length lets it include local library source code in its context). But if the library is something very common like pandas, Copilot knows it better than most people do. I trust Copilot to tell me which function is best, instead of trying to figure it out for myself.

Although it's not uncommon for Copilot to suggest code that isn't quite right, in my experience about 95% of the time it is exactly what I need, and it saves me a massive amount of time typing and reading. It gives me more time to focus on the algorithm or application that I'm building, and lets me avoid the headache of reading documentation that can't always be trusted. If something doesn't work, I now normally bypass the documentation and go straight to the library's source code. The documentation might be wrong, but the source code is exactly what the library is really doing.
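Most editors can jump to a definition, but the same check can be done from Python itself. The sketch below uses the standard library's inspect module to locate and print a function's source. The show_source helper is hypothetical, and json.dumps is just a stand-in for whatever library function is misbehaving. This only works for functions implemented in Python, not for compiled extensions.

```python
import inspect
import json


def show_source(func, max_lines: int = 15) -> str:
    """Return the file path, starting line, and first few source lines of `func`."""
    path = inspect.getsourcefile(func)
    lines, start = inspect.getsourcelines(func)
    header = f"{path}:{start}"
    return header + "\n" + "".join(lines[:max_lines])


print(show_source(json.dumps))
```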

General paradigms for how I use Copilot

Let's go over a few high-level techniques that I use to interact with Copilot:

Write a Python comment on a new line, hit Enter to move to the next line, and let GitHub Copilot fill in the code.
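A hypothetical illustration of this pattern, not a captured Copilot session: the comment is what gets typed by hand, and the lines beneath it are the kind of completion one might expect. Actual suggestions vary.

```python
from collections import Counter

text = "the quick brown fox jumps over the lazy dog the end"

# Count how often each word appears and print the three most common words
word_counts = Counter(text.split())
for word, count in word_counts.most_common(3):
    print(word, count)
```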


Write a function with a descriptive name, and have Copilot complete the function body.
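A hypothetical illustration: only the signature is typed by hand, and the descriptive name carries enough intent for a completion along these lines. The function is made up for this sketch and real suggestions may differ.

```python
def is_palindrome_ignoring_case_and_punctuation(text: str) -> bool:
    # Everything below the signature is the part a completion would fill in.
    cleaned = [ch.lower() for ch in text if ch.isalnum()]
    return cleaned == cleaned[::-1]


print(is_palindrome_ignoring_case_and_punctuation("A man, a plan, a canal: Panama"))
```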


Convert text to the desired format for copying and pasting
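A hypothetical illustration: some text is pasted into a comment, a second comment says what format is wanted, and the completion rewrites the text in that format. The pasted content and the variable name are invented for this sketch.

```python
# Pasted text:
#   pandas - data frames
#   numpy - arrays
#   transformers - pretrained models
#
# Convert the pasted text above into a Python dictionary
library_descriptions = {
    "pandas": "data frames",
    "numpy": "arrays",
    "transformers": "pretrained models",
}

print(library_descriptions)
```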


These were just simple examples, but once you've used it for a bit and started asking Copilot to perform more complicated operations, you really start to appreciate the time it saves.