The Newbie’s Bioinformatics Toolkit


Picture it: you have just agreed to a research project with a strong bioinformatics foundation, but you have never coded before. Where do you even start? You could spend hours looking through packages and programs to see which of them offer the utility you need on a day-to-day basis, only to find that some are poorly supported while others crash every time you use them.

In order to save you time (and headaches!), we have put together a list of the most useful applications that every researcher, no matter what stage of career they are at, should have in their bioinformatics toolkit. All of the recommended software is completely free to use.

Statistical Analysis with R and RStudio

R wasn’t widely used when I first began my bioinformatics journey, but it has fast become one of the most practical scripting environments for those in the biological sciences. Having quickly seen how versatile it is, I started delivering workshops on R basics to my colleagues. R is a statistical programming language and environment that supports scripting, data analysis and visualisation. Even better, there are tons of packages available to help you perform the tasks you need to do each day.

We have a guide on using R and RStudio which you can find here. We have also put together a guide on how to make your code more reproducible by using R Markdown.

Text Editing with Atom

Almost every time you’ve wanted to write out some text, you’ve probably opened Microsoft Word. For bioinformatics, however, you need a different type of text editor. Word saves hidden formatting alongside your text and substitutes characters as you type (curly quotes for straight ones, for example), and both will break code. Your operating system might come with a plain text editor, such as Notepad. This will let you write code, but it is not very programmer friendly.
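To make the problem concrete, here is a minimal Python sketch that scans a piece of text for characters a word processor commonly substitutes for plain ones. The list of characters and the function names `find_suspect_characters` and `clean` are illustrative assumptions, not a complete catalogue or an established tool.

```python
import unicodedata

# Illustrative (not exhaustive) map of characters that word processors
# often substitute, and the plain ASCII character code usually expects.
SUSPECT = {
    "\u2018": "'",   # left single quotation mark
    "\u2019": "'",   # right single quotation mark
    "\u201c": '"',   # left double quotation mark
    "\u201d": '"',   # right double quotation mark
    "\u2013": "-",   # en dash
    "\u00a0": " ",   # no-break space
}


def find_suspect_characters(text):
    """Return (line, column, Unicode name) for each suspect character."""
    hits = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for col_no, ch in enumerate(line, start=1):
            if ch in SUSPECT:
                hits.append((line_no, col_no, unicodedata.name(ch)))
    return hits


def clean(text):
    """Replace each suspect character with its plain ASCII equivalent."""
    for bad, good in SUSPECT.items():
        text = text.replace(bad, good)
    return text


# A made-up snippet as it might look after a round trip through a word processor.
pasted = "print(\u201chello\u201d)\nx\u00a0= 1"

for line_no, col_no, name in find_suspect_characters(pasted):
    print(f"line {line_no}, column {col_no}: {name}")

print(clean(pasted))
```

Running this on your own script before you execute it can save you from a confusing syntax error caused by a character that looks perfectly normal on screen.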

While there are many, many text editor choices for every operating system, the one that I have found to be the best is Atom. Atom is a hackable text editor that offers versatility through packages that make your programming life easier. One of the add-ons I love is the PlatformIO IDE Terminal, which gives you a command-line terminal inside the editor so you can test your scripts without leaving the window. Atom also has syntax highlighting for a range of programming languages, and you can customise how your editor looks with the range of available themes.

The Atom text editor with the PlatformIO IDE Terminal package add-on.

Python and Markdown with Jupyter Notebook

We talked about keeping your work organised with R Markdown in our previous post, but you are going to be using Python a lot as a bioinformatician, so finding a similar solution is crucial. Enter the Jupyter Notebook application. While Project Jupyter offers a ton of tools as part of their JupyterLab software, the basic notebook software offers everything you need to keep your code organised and reproducible.

The notebooks offer excellent productivity through the use of code blocks, called cells, that you can run independently of each other. This makes it really easy to test certain aspects of your code without having to run all of it. If you are stuck on a certain command, you can quickly and easily debug that code alone, which saves you time and computing resources. All of the visualisations you produce will be output within the notebook document, too, so there’s no need to go and open files from your file browser.

Jupyter Notebooks also offer the ability to install add-on packages called nbextensions that make your work easier. Another awesome function is the ability to choose which Python version each notebook uses by selecting its kernel.
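If you ever want to check which Python a notebook is set up to use, the notebook file itself can tell you: it is plain JSON, and the chosen kernel is recorded in its metadata. The sketch below assumes the version 4 notebook layout with a `kernelspec` entry; the helper name `kernel_info` and the example notebook are made up for illustration.

```python
import json


def kernel_info(notebook_json):
    """Return (kernel name, display name) recorded in a notebook's metadata.

    Either value is None if the notebook does not record a kernelspec.
    """
    notebook = json.loads(notebook_json)
    spec = notebook.get("metadata", {}).get("kernelspec", {})
    return spec.get("name"), spec.get("display_name")


# A made-up, empty notebook in the version 4 layout (an assumption here).
example = json.dumps({
    "cells": [],
    "metadata": {
        "kernelspec": {
            "display_name": "Python 3",
            "language": "python",
            "name": "python3",
        }
    },
    "nbformat": 4,
    "nbformat_minor": 5,
})

print(kernel_info(example))
```

For a real notebook you would read the contents of the `.ipynb` file into a string and pass that to the function instead of the made-up example.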

A simple Jupyter notebook with markdown and code.

If you are going to be working with more and more notebooks, then we strongly recommend you install the full JupyterLab software. We will be bringing you tutorials on how to install and use Jupyter notebooks in the future.

Happy coding!

I’ve given you the rundown of the applications I feel are an essential part of the Newbie’s Bioinformatics Toolkit. Though I am no longer a newbie bioinformatician myself, I still use every one of these applications each day. Let us know what you think, and sign up for our newsletter below for updates on our new tutorials and how-to guides, as well as all of our tips and tricks!