Getting Started With Open Source Development

Jamie Taylor
Monday, Dec 26, 2022

The hands of two separate people holding three sticky notes, arranged in a cascading manner. Each sticky note is a different colour and has a legend in blue ink; the text for each reads To Do (on a lilac coloured sticky note), Doing (on a yellow sticky note), and Done (on a pink sticky note). Of the people holding the sticky notes, only their hands and forearms are in the image.

The cover image for this post is by Tudor Baciu

This blog post was written by Jamie.

You’ve likely been told that the best way to get started in software development/engineering/coding/whatever is to contribute to an open source project. And that’s great advice; the world runs on open source software, and we need more and more people to support it. In fact, most of the people who contribute to open source software do so in their own time

unless we’re talking about things like Angular (funded by Google), React (looked after by Facebook), .NET (a Microsoft open source thing), or most Linux stuff

If you’ll forgive a little generalising, that is.

So how should an aspiring developer build up an open source presence, specifically if they have the hopes of getting a job in the industry?

First Things First

The first thing to remember is that, for whatever reason, a hiring manager or team lead is almost never going to look through your GitHub/GitLab/BitBucket profile. The recruiters that you talk to will say that they will, but they most certainly will not. Most folks will hand-wave with something along the lines of:

I don’t have the time to look through a GitHub repo
- some team lead

And that will likely be true, but it won’t stop them from sending you the dreaded Tech Test: something that they’ll expect you to spend anywhere from 45 minutes to several hours on.

It’s my personal opinion that Tech Tests are used to weed out the folks who don’t want to take Tech Tests, and as such are only useful in getting the candidates who have the time to spend on taking them. In my view, this makes them exclusionary as you’re automatically discounting anyone who (for one reason or another) is unable to spend the time required to pass the Tech Test.

Another thing about the Tech Test is that it won’t have anything to do with the work you’ll actually be doing. Here are a few examples of Tech Tests I’ve been asked to take in the past:

write some code which sorts T-shirts by size in 30 minutes
calculate the tax on toll bridge fares in 60 minutes
write an entire, functional, bug tracker in two hours

You could argue that these are designed to make you show off how you think. But that can only be achieved in an in-person Tech Test where you can literally talk through your thinking process.

Even people like Guido van Rossum (creator of the Python programming language) have been asked to sit Tech Tests in the past. However, the good companies and recruiters will want to see your work as a collection of repos. So it’s worth having a bunch of them, anyway.

And once you have a set of repos in a GitHub/GitLab/BitBucket profile, remember to link through to your favourite ones on your CV/Resume. If I don’t know about them, I might not go looking.

Actually Contributing

So the first thing you’ll want to do is take a look at some open source repos. You’ll want to choose a repo which is in a technology that you know, in a language that you are comfortable with. I wouldn’t recommend contributing to a large open source repo if you’re hoping to learn the language at the same time, as the other contributors might not have the time to help give pointers on the things you need to learn.

You’ve picked out a repo, and it’s in a technology and language that you understand and have worked with in the past. What now?

Go take a look at the issues tab:

A screenshot of the facebook/react repo's Issues tab showing 891 open issues.

Each issue will have a conversation about what the problem is, whether it’s a legitimate problem with the repo or whether it’s a misunderstanding or misconfiguration, an example repo showing off the problem, and perhaps even a starting point for how to solve it.

Read through this description and get a handle on what the problem is and how you could solve it.

When you’re happy with your understanding, leave a comment saying that you’d like to pick this issue up. In your comment, talk about how you might go about fixing the issue, what tests you can envision writing to verify that your fix has worked, and ask for any pointers for any related areas of the code.

Now you’re ready to fork the project.

Forking The Project

You should almost never work in the original repo (also known as “upstream”). Unless you’re told otherwise, you should fork the repo (which creates a copy, just for you) and work there.

A screenshot of the facebook/react repo's fork button, showing that it has been forked over 41,500 times

Some repos will have a CONTRIBUTING.md file; you should always check that file before you do anything. It will tell you how you can contribute to the repo, as different repos have different rules and processes for contributing.

The CONTRIBUTING.md file in the facebook/react repo points to React’s ‘How to contribute’ page

Clone the forked version of the repo down to your computer and start working on the changes.

It’s a good idea to run any and all tests in the repo before starting your work. That way, you’ll know whether your changes have caused any of them to fail as you work on your fix. It’s also worth making sure that the pre-existing tests always pass.

It might also be a good idea to keep in touch with the repo maintainers by posting about your progress in the issue that you picked up. That way, they can get a feel for your progress, and you can also ask questions and talk about how you’re fixing the issue.

What’s A Pull Request?

When you’re done making the changes in your forked version of the repo, you’ll need to create a pull request. This is a way of asking the upstream repo (the one you forked from) to pull in your changes and accept them.

Before you create the pull request, it will be worth doing some (if not all) of the following:

Running any applicable linters across your changes
Checking for typos (and fixing them) or debugger statements (and removing them)
Running all of the tests
Merging the latest from upstream into your forked repo

This last one catches people out all of the time. In the time since you started working on your changes, the upstream will have moved on. So you’ll need to get the latest from the upstream and merge it into your branch. This avoids a lot of merge conflicts, but not all.
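If you find yourself forgetting the linter and test steps from that list, a tiny script can run them for you. Here’s a minimal sketch, assuming each check is just a command that exits with a non-zero code on failure. The function name `run_checks` and the example commands are my own invention for illustration; use whatever commands the repo’s CONTRIBUTING.md actually asks for.

```python
import subprocess
import sys


def run_checks(checks):
    """Run each (name, command) pair and return the names of the checks that failed."""
    failed = []
    for name, command in checks:
        # A non-zero exit code is the usual signal that a linter or test run failed.
        result = subprocess.run(command, capture_output=True, text=True)
        if result.returncode != 0:
            failed.append(name)
    return failed


# Placeholder list (an assumption): swap in the repo's real lint and test commands.
EXAMPLE_CHECKS = [
    ("tests", [sys.executable, "-m", "unittest", "discover"]),
]
```

Calling `run_checks(EXAMPLE_CHECKS)` returns an empty list when everything passes, which is a reasonable point at which to open the pull request.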

It will also be worth double checking through the CONTRIBUTING.md file, and ensuring that you have been adhering to the code of conduct

in fact, you’ll need to be adhering to the code of conduct from the get-go

If there isn’t a code of conduct, do your work and interact with the maintainers as if there is one, and as if it’s based on v1.4 (the most popular version) of the Contributor Covenant.

Now you’re ready to create a request to pull your changes into the code base. This will likely kick off a code review: some of the long-time contributors will look at your proposed solution and see if they can find any problems with it.

They’re not doing this to be mean-spirited. Once they’ve pulled your changes into the upstream repo it’s no longer your code; it’s theirs, and they have to maintain it. They’re also looking for issues that you might not know about (most likely because of your inexperience with the code base).

You’ll likely go through a few rounds of the reviewers suggesting changes. But your reward for doing that will be your changes being merged into the upstream repo.

Congratulations: you’ve just contributed to an open source repo.

Just Code?

Of course, there are other ways to contribute to an open source repo. You can write documentation, help with translations, or pitch in in a myriad of other ways. You’ll have to read through the CONTRIBUTING.md file for the repo you’re interested in, as every repo is different.

What If I Don’t Want To Work On Other People’s Code?

That’s a totally fair question, and a valid point. For one reason or another, you may not want to contribute to some other repo. So why not start one of your own? But what should you make?

I’ve always given the advice that you should find something that you’re interested in, or something from your life that you want to automate, and work on that. It doesn’t have to be the greatest app or package ever either. And it doesn’t matter if there are applications out there that already do what yours does.

Here are some ideas:

Todo list
Habit tracker
Basic maths library
Static site builder
A 2D game using HTML canvas and JavaScript

These might not be the most exciting ideas, but in putting them together you’ll be using the same techniques as more complex application ideas. It might be worth starting with a Todo app (or any of the other ideas) first, to make sure that you have the personal bandwidth to see it through to a satisfying point.

Once you’re happy with your progress on these app ideas (even if they never end up being finished), you can move on to something else.

What I Want To See In An Open Source Portfolio

Suppose I was hiring, and I was looking through your repos. Here’s what I would want to see:

a number of projects which are important to you

These can be your own personal projects, or they can be contributions to some other open source repo. What I’m looking for is a drive to create something and to solve problems. I also don’t mind if they’re not in languages and frameworks that I’m used to using. One key point (if it’s your own repo) is that I want clear steps for getting it up and running after cloning the code.

analytical thinking through the code

When I’m reading through some code, I want to be able to see if I can understand WHY it’s doing something. I don’t care about WHAT the code is doing, because I can read the code. I want to know why you wrote it the way you did. Are you dealing with some esoteric file format? Is this your own spin on the A* search algorithm? Why did you choose to write the code the way that you did? Did you stay away from something in the framework, and why?
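To make the WHAT versus WHY distinction concrete, here’s a small sketch. The function `toll_with_tax` and the rounding rule are invented for this example (loosely borrowing the toll bridge Tech Test from earlier); they aren’t taken from any real repo.

```python
from decimal import Decimal, ROUND_HALF_UP


def toll_with_tax(fare: str, tax_rate: str) -> Decimal:
    """Return the fare plus tax, rounded to two decimal places."""
    # A WHAT comment would say "multiply the fare by the tax rate".
    # The code already says that, so the comment would add nothing.

    # WHY: Decimal rather than float, because binary floats cannot
    # represent values such as 0.1 exactly and money needs exact amounts.
    total = Decimal(fare) * (Decimal("1") + Decimal(tax_rate))

    # WHY: round half up, on the (assumed) basis that this is the rule the
    # business uses; Python's built-in round() rounds half to even instead.
    return total.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)
```

The two WHY comments are the parts I can’t work out from reading the code alone, and they’re what I’d be looking for.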

documentation

Getting the repo running is half the battle. Are there any secret keys that I need? Do I need to have a database somewhere? Am I expected to use an external API? Tell me what I need and where to configure everything.
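A README is the main place to answer those questions, but the code can back it up by failing with a helpful message when something is missing. Here’s a minimal sketch, assuming the settings come from environment variables. The setting names (`DATABASE_URL`, `WEATHER_API_KEY`) and the function `load_settings` are hypothetical, made up for this example.

```python
import os

# Hypothetical setting names and descriptions, invented for this sketch.
REQUIRED_SETTINGS = {
    "DATABASE_URL": "connection string for the database",
    "WEATHER_API_KEY": "key for the external API",
}


def load_settings(environ=os.environ) -> dict:
    """Return the required settings, or explain exactly which ones are missing."""
    missing = [name for name in REQUIRED_SETTINGS if not environ.get(name)]
    if missing:
        details = ", ".join(f"{name} ({REQUIRED_SETTINGS[name]})" for name in missing)
        raise RuntimeError(f"Missing settings: {details}. See the README for setup steps.")
    return {name: environ[name] for name in REQUIRED_SETTINGS}
```

Someone who clones the repo and runs it without any setup gets told what they need, rather than a stack trace from deep inside a database driver.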

tests

It doesn’t matter if you write the tests before or after you write the code. What I’m looking for is the thought and design work that goes into writing composable, testable code. Of course, the tests have to pass

and they have to pass by asserting something useful, too
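Here’s a short sketch of what I mean by a useful assertion, using Python’s built-in unittest module. The function `sort_by_size` and the size order are invented for this example (a nod to the T-shirt Tech Test from earlier).

```python
import unittest

# Hypothetical size order, invented for this sketch.
SIZE_ORDER = ["XS", "S", "M", "L", "XL"]


def sort_by_size(sizes):
    """Sort T-shirt sizes from smallest to largest."""
    return sorted(sizes, key=SIZE_ORDER.index)


class SortBySizeTests(unittest.TestCase):
    def test_passes_but_proves_little(self):
        # Passes even if the order is wrong: it only checks that a list comes back.
        self.assertIsInstance(sort_by_size(["L", "S"]), list)

    def test_asserts_the_behaviour_we_care_about(self):
        # Fails if the function ever returns the sizes in the wrong order.
        self.assertEqual(sort_by_size(["XL", "S", "M", "XS"]), ["XS", "S", "M", "XL"])
```

Both tests pass, but only the second one would catch a bug such as sorting the sizes alphabetically. Saved in a test file, they can be run with `python -m unittest`.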

the main branch needs to work

The main branch needs to work when I hit the run button in my IDE. Other branches can represent work-in-progress, but I want to see main working, and the barrier to entry for getting it working needs to be as low as possible.

In Closing

Getting into open source development is way easier than it was back in the early ’00s (when I first started), and it’s almost trivial to contribute something meaningful to a popular repo these days. But contributing to other people’s repos isn’t the only way to do it; you can build up a portfolio of applications and libraries which show off your skill and thinking.

Just be sure to list your favourites in your CV/Resume, otherwise I might not ever know.