Set up an Ubuntu machine

System Choice

I chose Ubuntu 18.04 LTS. I did not pick an earlier version because, to my knowledge, most if not all bioinformatics software supports 18.04. One exception is Nanopore’s Albacore base caller, but it is being replaced by Guppy anyway. I did not choose 18.10 (the newest release at the time of writing) because its new features do not seem to justify moving away from an LTS release.

Hardware

GPU

Ubuntu seems to dislike some high-spec GPUs. First, the GPU can cause the installation to get stuck. Even after a successful installation, the machine often cannot boot into the system.

To resolve the boot issue, highlight Ubuntu at the boot selection screen and press E, then type nomodeset at the end of the line starting with linux. Then press F10 to boot.

To resolve the issue completely, the GPU driver needs to be installed. This tutorial is a good guide to installing it via the command line. One caveat is to install via sudo apt install nvidia-driver-410 instead of ... nvidia-410, according to this Stack Exchange post. I have also just found this guide, which seems to support GUI installation.
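As a rough sketch of the command-line route, the snippet below wraps the install in two small shell functions. The function names are made up for illustration. Pulling the driver from the graphics-drivers PPA is my assumption about a typical setup, not necessarily what the linked tutorial does. Nothing runs until you call install_nvidia_driver yourself.

```shell
# Package name on 18.04 follows nvidia-driver-<version>, not nvidia-<version>.
# (nvidia_package is a hypothetical helper name.)
nvidia_package() {
    printf 'nvidia-driver-%s\n' "$1"
}

# Not called automatically: needs sudo and a network connection.
install_nvidia_driver() {
    # Assumption: the driver is taken from the graphics-drivers PPA.
    sudo add-apt-repository -y ppa:graphics-drivers/ppa
    sudo apt update
    sudo apt install -y "$(nvidia_package "$1")"
}
```

For example, install_nvidia_driver 410 would install nvidia-driver-410. Running ubuntu-drivers devices first should list the drivers Ubuntu recommends for the detected card. Reboot afterwards.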

Hard disk

To rename a hard disk on 18.04 LTS, follow the instructions here. The AskUbuntu answers returned by search engines may be outdated.

Software

Browser

My current default web browser across all platforms is Firefox, for its privacy policy and plug-ins. It is also Ubuntu’s default browser, so there is no need to install it. However, I do need to sign in to my account and sync my settings and plug-ins.

IDEs

I am currently using PyCharm Professional for Python and RStudio for R. RStudio was downloaded through the web browser and PyCharm Professional was installed via the Ubuntu software manager. I found installing PyCharm from the browser download a bit troublesome because I had to enter a command to make it available across the system.

PyCharm

When setting up PyCharm, I chose to omit Vi support but selected .md and R support.

RStudio

  1. Turn off saving .RData on exit. For reproducibility, all data should be regenerable from a script
  2. install.packages(c("tidyverse","rmarkdown")) for data science
  3. install.packages(c("blogdown", "bookdown", "pagedown")) on my personal computer

A trick I often use when errors appear while installing R packages is to restart the R session. This makes sure the packages are installed into a fresh session with no packages already loaded.
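The same idea can be tried from the terminal: every Rscript call starts a brand-new R process, so installing through it is always a "fresh session" install. The sketch below is only an illustration. The helper names are made up, and the CRAN mirror URL is my choice (Rscript needs a repos value because it cannot ask interactively).

```shell
# Build an install.packages() call from a list of package names.
# (r_install_expr is a hypothetical helper name.)
r_install_expr() {
    pkgs=""
    for p in "$@"; do
        pkgs="${pkgs:+$pkgs, }\"$p\""
    done
    printf 'install.packages(c(%s), repos = "https://cloud.r-project.org")\n' "$pkgs"
}

# Not called automatically: runs the install in a new R process.
# --vanilla skips saved workspaces and profile files.
install_fresh() {
    Rscript --vanilla -e "$(r_install_expr "$@")"
}
```

For example, install_fresh tidyverse rmarkdown installs both packages in a clean R process.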

Text Editor

Of course, sometimes I do not need a full-fledged IDE for a task, e.g. viewing a .txt file. My current text editor on most machines is Atom. It is pretty awesome because of the number of packages available. However, one big shortcoming is its start-up time: it takes around 3 seconds whenever I open Atom from scratch (the Ubuntu version seems to be faster, though). Thus, I am also testing Sublime Text to see if it fits my needs.

Git, GitHub and SSH

I recommend using GitHub with SSH so that you do not have to key in your username and password every time you access your repositories. The tutorial by GitHub is pretty awesome; follow this first step and then this second step.
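For orientation, here is a minimal sketch of what the setup roughly looks like in the terminal. The function names are hypothetical, and the ed25519 key type and default key path are my assumptions; follow the GitHub tutorial for the authoritative steps. The functions are only defined here and do nothing until called.

```shell
# Not called automatically: ssh-keygen will prompt for a file path and passphrase.
# $1 is the email address used as the key comment.
setup_github_key() {
    ssh-keygen -t ed25519 -C "$1"
    eval "$(ssh-agent -s)"               # start the agent for this shell
    ssh-add "$HOME/.ssh/id_ed25519"      # assumes the default key path was kept
    cat "$HOME/.ssh/id_ed25519.pub"      # paste this into GitHub's SSH key settings
}

# Convert an HTTPS GitHub remote into its SSH form.
# URLs that do not match are printed unchanged.
to_ssh_url() {
    printf '%s\n' "$1" | sed -E 's#^https://github\.com/([^/]+)/(.+)$#git@github.com:\1/\2#'
}
```

Once the public key has been added on GitHub, ssh -T git@github.com checks the connection. An existing clone that still uses HTTPS can be switched with git remote set-url origin "$(to_ssh_url "$(git remote get-url origin)")".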

To set up the Ubuntu machine itself as a server, follow the advice here.

Office Software

Data analysis is arguably easier with office software installed, IMVPO (in my very personal opinion). First of all, not all the data you get will be in a standard format such as CSV or plain text, since Excel files are still commonly used for small-scale data analysis. Having software that can open Excel files therefore definitely eases the process of getting the data out. This is especially so if the Excel data is not in a tidy format (e.g. there are merged rows and columns). Rather than using a programming language to parse such a file, it is much easier to use Excel’s native functions to turn merged cells into individual cells with repeated values.

Word, PowerPoint and PDF software also helps when you just want to view these formats for reference. It is certainly not necessary to install Microsoft Office (it is not available on Ubuntu anyway), and there are many open-source or free office suites available. So far I have been using WPS on my Windows machine for more than a week and could not really find much difference between it and Microsoft Office.

Other Software

Remote Desktop

I am currently using AnyDesk because it supports both Windows and Ubuntu. The server computer seems to lag after the connection has been cut, but at least it does not randomly drop my connection like TeamViewer does.

Programming Languages

Python

Anaconda is used to manage Python. It makes it easy to set up virtual environments with different versions of Python and to install software. The installation instructions can be found here.
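A small sketch of how one environment per Python version might be created once conda is installed. The helper names and the py36-style naming scheme are just conventions I am assuming for illustration. The functions are only defined here and do nothing until called.

```shell
# Derive an environment name such as py36 from a version such as 3.6.
# (The naming scheme is an assumed convention, not something conda requires.)
env_name() {
    printf 'py%s\n' "$(printf '%s' "$1" | tr -d '.')"
}

# Not called automatically: requires conda on the PATH.
make_env() {
    conda create --yes --name "$(env_name "$1")" "python=$1"
}
```

For example, make_env 3.6 creates an environment called py36. After conda activate py36, packages can be installed into it with conda install followed by the package name.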

R

The R that comes with 18.04 LTS is not up to date. To update R, run the following in a terminal:

  1. sudo apt-key adv --keyserver keyserver.ubuntu.com --recv-keys E298A3A825C0D65DFD57CBB651716619E084DAB9. This adds the signing key that apt needs to verify packages downloaded from the CRAN server.

  2. echo "deb https://cloud.r-project.org/bin/linux/ubuntu bionic-cran35/" | sudo tee -a /etc/apt/sources.list. This adds the CRAN server to apt’s source list. The line is piped into sudo tee -a because a plain sudo echo with >> does not work: the redirection is performed by your own shell, without root privileges. bionic is the codename for 18.04; use trusty etc. for earlier Ubuntu versions. If the command does not work, use an editor such as nano to edit the sources.list file manually, adding deb https://cloud.r-project.org/bin/linux/ubuntu bionic-cran35/ on a new line.

  3. sudo apt-get update to pick up the changes made to the sources.list file.

  4. sudo apt-get install r-base r-base-dev to install/update R.
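The four steps above could be collected into one script, sketched below. The function names are made up for illustration. The one addition is a check that the CRAN line is not already in the file, so re-running the script does not create duplicate entries. The functions are only defined here; update_r has to be called as root.

```shell
# Print the apt source line for a given Ubuntu codename (e.g. bionic).
cran_source_line() {
    printf 'deb https://cloud.r-project.org/bin/linux/ubuntu %s-cran35/\n' "$1"
}

# Append the line to the given sources file unless it is already present.
# $1 = codename, $2 = path to the sources file.
add_cran_source() {
    line=$(cran_source_line "$1")
    grep -qxF "$line" "$2" 2>/dev/null || printf '%s\n' "$line" >> "$2"
}

# Not called automatically: must be run as root.
update_r() {
    apt-key adv --keyserver keyserver.ubuntu.com \
        --recv-keys E298A3A825C0D65DFD57CBB651716619E084DAB9
    # lsb_release -cs prints the codename of the running release.
    add_cran_source "$(lsb_release -cs)" /etc/apt/sources.list
    apt-get update
    apt-get install -y r-base r-base-dev
}
```

Saved as, say, update_r.sh (a hypothetical file name), it could be run with sudo sh -c '. ./update_r.sh && update_r'.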

Timing Liu
Computational Biologist & Medical Student

Personalizing medicine