3 Computational resources

3.1 Wynton

Wynton is a large, shared high performance computer cluster that is available to all UCSF members. If your work is too large to run on your own laptop or would benefit from a highly parallelized workload, Wynton should be used.

3.1.1 Wynton Best Practices

If you plan to use wynton, you should know about the following.

3.1.1.1 Login vs Development vs Data Transfer Nodes

The login (log) node is where users can submit jobs or transfer to other nodes, such as development nodes. While submitting jobs to the job scheduler is okay, running software on these nodes is not advised.

The development (dev) node is where users may run quick tests. This is also the node where you may access prepackaged software already available on wynton, or install your own software locally. You can access this node by signing into the login node, and then enter:

ssh dev1

to log onto the dev node. There are multiple dev nodes so if dev1 is not available, there may be an alternative.

The data transfer (dt) node should be used if transferring files to and from wynton.

3.1.1.2 Scratch Drive

There is local (/scratch) and global (/wynton/scratch) scratch space.

3.1.1.2.1 Local

Local scratch should be used to hold intermediate files produced by your scripts. This is a much faster drive space that is local to the node you are using to run your script.

3.1.1.2.2 Global

Global scratch is accessible by all nodes and is where you should keep your large files or store results to be shared with others. To use this space, it is recommended to use the following command to create a folder with your username so that your work can be isolated from others:

mkdir "/wynton/scratch/$(whoami)"

As an example, if your username is jonsnow, you would now have a folder for your work under /wynton/scratch/jonsnow.

3.2 Aragorn

Aragorn is a moderately powerful desktop computer housed in our lab (built in 2014) that can be used for small workloads. Generally, if you can run it on your relatively new laptop but don’t want to tie up using your computer, Aragorn is a good alternative. However, it is limited in its ability, and should only really be used for short term usage. Longer term jobs (up to 2 weeks) should be run on Wynton.

  • How to access Aragorn/theMines Who to ask for account R Studio?
    • At least one member should be responsible (ie. an admin) for Aragorn, in charge of adding users.
    • For now ask our Bioinformatician, Brian Palmer

Specs:

Component spec
Processor Intel i7-4770
RAM 16gb DDR3
Storage 250gb SSD + 1tb HDD
Power Supply 430W

3.3 Arwen

3.4 The Mines ⛰️

The Mines are our network attached storage (NAS), essentially a small server with a very large amount of storage (~60 tb). Used for long term storage of raw data. Data is backed up nightly to an offsite location.

3.5 Nextflow

Nextflow is a workflow language that can generate job submissions for many HPC scheduler platforms. Recommended for production level pipelines. An example is Sun Grid Engine, the scheduler currently used by Wynton HPC (Jan 2023). More information about Nextflow on their website. Example pipelines can be found under the EPPIcenter Organization on GitHub or at nf-co.re.

3.6 Installing software

If you are using a shared resource, you will find that there are issues with insufficient privileges to install at the site level. You should instead install binaries locally. One package manager that is recommended is conda or miniconda3. Conda environment/ best practices (link somewhere)

3.7 Github

A code repository where members of the EPPIcenter store pipelines, tools and analysis. GitHub enables collaboration and sharing projects under the EPPIcenter organization. Members are recommended to create a GitHub account and request access from one of the organization admins.

3.8 Docker Hub

A repository for docker images, to store and share custom images. A user that needs access to the image can easily pull the image from the repository. Another use case is configuring nextflow to use docker, which will automatically pull a specified image. To learn more, please refer to the docker docs.

3.9 Stack Overflow for Teams

A knowledge sharing platform for EPPIcenter questions and answers. Users are encouraged to post questions, tutorials, and answers. Additionally, when a user is stuck on a particular problem, they can search through the EPPIcenter Stack Overflow to seek a solution to their particular problem.