Bioinformatics@HSL: BBT Workshop Descriptions

Descriptions of Basic Bioinformatics Tools Workshops

The Basic Bioinformatics Tools Workshop Series offers seminars and hands-on computer workshops on bioinformatics tools and resources to faculty, staff, and students of UNC-Chapel Hill. Class sessions are held at the Health Sciences Library in the Biogen-Idec Classroom (307; 3rd floor). For some classes, registrants have the option to attend online via Blackboard Collaborate.

Below are descriptions of workshops that have been regularly offered in past semesters or are scheduled for the current semester.


UNIX for Biologists, Part 1: Introduction

Time:  Monday, 3/5/2018, 10:00am - 12:00pm

Instructor: Hemant Kelkar, Ph.D., UNC Center for Bioinformatics
Location: Streamed to HSL 307 OR online via Blackboard Collaborate

The ability to operate in a command line environment is becoming increasingly valuable in the genomics era. This workshop and the subsequent "UNIX for Biologists, Part 2: Longleaf" workshop are designed for experimental biologists who want to familiarize themselves with UNIX. This workshop will provide a very basic introduction to the UNIX operating system. Participants will learn how to log in (software required/procedure) and interact with a UNIX server. We will cover basic UNIX system commands for interacting with the server and navigating the file system. We will also learn about UNIX directory structure organization and file permissions.

Account Requirement: You will need an account on the “Longleaf” Linux cluster managed by ITS-Research Computing. Instructions for requesting a “Longleaf” account via the ONYEN Subscribe to Services utility are available at this link: http://help.unc.edu/help/how-do-i-get-an-account-on-killdevil-research-computing-cluster/. Please submit your account request at least 3 days before class. Account creation generally takes 24-48 hours.

Prerequisites: No prior knowledge of UNIX is required. NOTE: For those unfamiliar with UNIX and Longleaf/UNC computer cluster usage, this class and UNIX for Biologists, Part 2: Longleaf are prerequisites for the next generation sequencing classes that follow them in the BBT workshop series: "Introduction to Next Generation Sequencing Data and Quality Control", "BBTools: A Toolbox for High Throughput Sequencing Data Analysis", and "Differential Expression Analysis of RNAseq Data."


UNIX for Biologists, Part 2: Longleaf

Time:  Thursday, 3/8/2018, 10:00am - 12:00pm

Instructor: Hemant Kelkar, Ph.D., UNC Center for Bioinformatics
Location: HSL 307 OR online via Blackboard Collaborate

The ability to operate in a command line environment is becoming increasingly valuable in the genomics era. This workshop and the earlier "UNIX for Biologists, Part 1: Introduction" workshop are designed for experimental biologists who want to familiarize themselves with UNIX. This workshop will introduce UNC-Chapel Hill resources available to bioscience researchers and will focus primarily on resources on the "Longleaf" compute cluster operated by UNC ITS-Research Computing. After a quick recap of basic file/folder operations used in a UNIX environment, we will cover how to move files between the server and a local computer. We will introduce the “Longleaf” compute cluster and the module-based software used for managing jobs on the cluster. Finally, participants will learn to use two basic bioinformatics tools (NCBI BLAST and UCSC BLAT) on the command line on the Longleaf cluster and how to access UNIX/Linux systems via a graphical user interface (X-Win32, XQuartz).

Account Requirement: You will need an account on the “Longleaf” Linux cluster managed by ITS-Research Computing. Instructions for requesting a “Longleaf” account via the ONYEN Subscribe to Services utility are available at this link: http://help.unc.edu/help/how-do-i-get-an-account-on-killdevil-research-computing-cluster/. Please submit your account request at least 3 days before class. Account creation generally takes 24-48 hours.

Prerequisites: If you are not familiar with UNIX, attending UNIX for Biologists, Part 1: Introduction is strongly recommended. If you are already somewhat familiar with UNIX, you may be able to skip Part 1. NOTE: For those unfamiliar with UNIX and Longleaf/UNC computer cluster usage, this class and UNIX for Biologists, Part 1: Introduction (from this series or prior offerings) are prerequisites for the next generation sequencing classes that follow them in the BBT workshop series: "Introduction to Next Generation Sequencing Data and Quality Control", "BBTools: A Toolbox for High Throughput Sequencing Data Analysis", and "Differential Expression Analysis of RNAseq Data."


Introduction to Next Generation Sequencing Data and Quality Control    

Time: Thursday, 3/22/2018, 1:00pm - 3:00pm

Instructor:  Hemant Kelkar, Ph.D., UNC Center for Bioinformatics
Location: HSL 307

Genome-wide analysis of gene expression has been a popular technique since the advent of microarrays. Next Generation Sequencing (NGS) technology was applied for this purpose early in its life cycle, and expression analysis has since become one of the most popular uses of NGS. We will start with a general overview of NGS technologies and consider commonly used data formats. Because of time/space constraints, we will be using a small subset of sequences (derived from a real dataset). I will introduce FastQC, a popular application used for quality control (QC) of NGS data. We will then look at examples of FastQC results (in addition to the ones we generate from our test data). This will be followed by an overview of data scanning/trimming programs. These programs look for contaminating sequences (e.g. adapters) and can also trim data based on various other criteria.
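To give a rough sense of what scanning/trimming programs do, here is a minimal Python sketch. It is not how FastQC or any particular trimming tool works internally: it assumes Phred+33 quality encoding, uses a made-up read, takes a commonly seen Illumina adapter prefix as an example adapter, and only finds exact adapter matches. The function names are hypothetical.

```python
# Illustrative sketch only; real QC and trimming tools handle mismatches,
# partial adapters, paired reads, and much more.

ADAPTER = "AGATCGGAAGAGC"  # example adapter prefix, assumed for illustration


def phred_scores(quality_line, offset=33):
    """Convert a FASTQ quality string to integer Phred scores (Phred+33 assumed)."""
    return [ord(ch) - offset for ch in quality_line]


def trim_adapter(sequence, quality_line, adapter=ADAPTER):
    """Cut the read at the first exact occurrence of the adapter, if present."""
    position = sequence.find(adapter)
    if position == -1:
        return sequence, quality_line
    return sequence[:position], quality_line[:position]


# A made-up four-line FASTQ record: header, bases, separator, qualities.
bases = "ACGTACGT" + ADAPTER + "TTTT"
record = ["@read1", bases, "+", "I" * 8 + "#" * (len(bases) - 8)]

# Trim the adapter (and everything after it), then summarize base quality.
sequence, quality = trim_adapter(record[1], record[3])
scores = phred_scores(quality)
mean_quality = sum(scores) / len(scores)

print(sequence, mean_quality)
```

Here the adapter and the low-quality bases after it are removed, leaving only the first eight bases of the read.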

Account Requirement: You will need an account on the “Longleaf” Linux cluster managed by ITS-Research Computing. Instructions for requesting a “Longleaf” account via the ONYEN Subscribe to Services utility are available at this link: http://help.unc.edu/help/how-do-i-get-an-account-on-killdevil-research-computing-cluster/. Please submit your account request at least 3 days before class. Account creation generally takes 24-48 hours.

Prerequisites: For those unfamiliar with UNIX and Longleaf/UNC computer cluster usage, UNIX for Biologists, Part 1 and UNIX for Biologists, Part 2 (from this series or prior offerings) are prerequisites for this class.


BBTools: A Toolbox for High Throughput Sequencing Data Analysis    

Time:  Tuesday, 3/27/2018, 2:30pm - 4:00pm

Instructor: Tristan De Buysscher, Bioinformatics Scientist, UNC Center for Bioinformatics
Location: HSL 307

The BBTools suite contains tools that assist with multiple types of high-throughput (HT) sequencing data analysis. After an overview of the tools included in the BBTools toolbox, we will learn how to use “bbduk” (decontamination using k-mers) for scanning/trimming data to remove adapter contamination. This will be followed by “bbmap” (sequence data aligner), which we will use to align the test sequence data.

Account Requirement: You will need an account on the “Longleaf” Linux cluster managed by ITS-Research Computing. Instructions for requesting a “Longleaf” account via the ONYEN Subscribe to Services utility are available at this link: http://help.unc.edu/help/how-do-i-get-an-account-on-killdevil-research-computing-cluster/. Please submit your account request at least 3 days before class. Account creation generally takes 24-48 hours.

Prerequisites: For those unfamiliar with UNIX and Longleaf/UNC computer cluster usage, UNIX for Biologists, Part 1 and UNIX for Biologists, Part 2 (from this series or prior offerings) are prerequisites for this class.


Differential Expression Analysis of RNAseq Data    

Time:  Tuesday, 4/10/2018, 2:00pm - 4:00pm

Instructor: Tristan De Buysscher, Bioinformatics Scientist
Location: HSL 307

RNAseq provides a snapshot of the transcription state of the cells from which RNA was harvested. One of the most common applications of this sequencing data is to look at changes in the transcriptome state between two or more conditions the cells were exposed to. We will discuss the various considerations in analyzing RNAseq data and run through a practical example using DESeq2.
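As a rough illustration of what "changes between conditions" means numerically, the Python sketch below computes a log2 fold change from mean read counts. This is not the DESeq2 method, which also normalizes for library size, models count dispersion, and tests for statistical significance. The gene names, counts, and pseudocount are made up for illustration.

```python
import math


def log2_fold_change(control_counts, treated_counts, pseudocount=1.0):
    """Log2 ratio of mean counts (treated over control).

    The pseudocount (an assumption of this sketch) avoids division by zero
    for genes with no reads in one condition.
    """
    mean_control = sum(control_counts) / len(control_counts)
    mean_treated = sum(treated_counts) / len(treated_counts)
    return math.log2((mean_treated + pseudocount) / (mean_control + pseudocount))


# Hypothetical read counts: three replicates per condition for two genes.
counts = {
    "geneA": ([7, 7, 7], [31, 31, 31]),      # higher in treated samples
    "geneB": ([15, 15, 15], [15, 15, 15]),   # unchanged
}

changes = {
    gene: log2_fold_change(control, treated)
    for gene, (control, treated) in counts.items()
}

for gene, change in changes.items():
    print(gene, change)
```

A value of 2 corresponds to a four-fold increase and 0 to no change. Deciding whether a change is statistically meaningful requires replicate-aware tools such as DESeq2.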

Account Requirement: You will need an account on the “Longleaf” Linux cluster managed by ITS-Research Computing. Instructions for requesting a “Longleaf” account via the ONYEN Subscribe to Services utility are available at this link: http://help.unc.edu/help/how-do-i-get-an-account-on-killdevil-research-computing-cluster/. Please submit your account request at least 3 days before class. Account creation generally takes 24-48 hours.

Prerequisites: For those unfamiliar with UNIX and Longleaf/UNC computer cluster usage, UNIX for Biologists, Part 1 and UNIX for Biologists, Part 2 (from this series or prior offerings) are prerequisites for this class.


Introduction to NCBI Biomolecular Database Searching

Time:  Not currently scheduled

Instructor: Barrie Hayes, Bioinformatics & Research Data Librarian
Location: HSL 307

Bioinformatics and biomolecular databases can be powerful tools to support biomedical and life sciences research. While the National Center for Biotechnology Information (NCBI) provides free access to millions of biomolecular records, learning how to leverage these tools can be daunting. This 90-minute workshop will provide an overview of how to use NCBI databases and introduce strategies for using Entrez to conduct text searches.

Prerequisites: No prior knowledge of NCBI database searching is required.


Introduction to BLAST Searching

Time:  Not currently scheduled

Instructor: Barrie Hayes, Bioinformatics & Research Data Librarian
Location: HSL 307

This workshop will introduce the basics of sequence similarity searching using the NCBI Basic Local Alignment Search Tool (BLAST). Participants will conduct BLAST searches and examine the type of information retrieved.

Prerequisites: No prior experience with BLAST searching is required.


Browsing Genes and Genomes with Ensembl and Ensembl Genomes (Registration closes Monday 6/4/2018)

Time:  Tuesday, 6/5/2018, 9:00am - 5:00pm
Note: There will be a 1-hour break for lunch and 15-minute breaks in the morning and afternoon. The workshop is scheduled to conclude at 5:00pm.

Instructor: Astrid Gall, PhD, EMBL-EBI

Location: HSL 307
Registration Fee: $20.00
Sponsorship: This workshop is sponsored by the Health Sciences Library Research Hub with support from the School of Medicine Office of Research.

The Ensembl (www.ensembl.org) and Ensembl Genomes (www.ensemblgenomes.org) projects offer integrated genome, variation, gene regulation and comparative genomics data of vertebrate and non-vertebrate genome sequences, respectively, on open access web browser platforms.

This one-day browser workshop gives participants extensive hands-on experience with the Ensembl genome browser, along with the necessary background information. The workshop is primarily targeted at wetlab researchers.

Content

The workshop consists of a series of modules, listed below. Most modules consist of a presentation and a demonstration of the tools, followed by the opportunity to do exercises. Participants are encouraged to bring problems/questions about their research and we will try to tackle these during the workshop using Ensembl. The exact mix of modules can be varied, depending on the preferences of the participants.

  • Introduction to Ensembl: origin, goals and organization of the Ensembl project
  • Genebuild: how are Ensembl gene and transcript predictions made?
  • Data export with BioMart: retrieving genomic information using a web interface (no programming required).
  • Comparative genomics and proteomics: orthologues, protein families, whole genome alignments and syntenic regions
  • Variation: SNPs and other polymorphisms, haplotypes, linkage disequilibrium, structural variants like CNVs
  • Regulation: Sequences that may be involved in gene regulation, and integration of ENCODE data

A typical one-day workshop will consist of the first three of these modules, plus two other modules of your choice. Participants' interests and preferences for the modules will be determined through a pre-course survey of registrants. 

Prerequisites: The only prerequisite for this workshop is general knowledge of molecular biology and genomics and a familiarity with web browsers.


New  Introduction to R    

Time:  Tuesday, 4/5/2018, 9:00am - 11:00am

Instructor: Yucheng Yang, Postdoctoral Fellow, Department of Genetics
Location: HSL 307

In genetics, sequencing technologies generate extensive amounts of data. Statistical analyses are increasingly used to analyze these data, especially single-cell RNA sequencing (scRNA-seq) data. In this 2-hour workshop, we will first take a brief look at R, a widely used statistical software environment, and then run through a practical example of scRNA-seq clustering analysis using R functions (class.R).

 

Prerequisites: No prior knowledge of R software is required. Additionally, the Introduction to R class does not require completion of any of the other workshops in the Basic Bioinformatics Tools workshop series.

