
Bioinformatics@HSL: BBT Workshop Descriptions

Descriptions of Basic Bioinformatics Tools Workshops

The Basic Bioinformatics Tools Forum series offers seminars and hands-on computer workshops on bioinformatics tools and resources to faculty, staff, and students of UNC-Chapel Hill. Class sessions are held at the Health Sciences Library in the Biogen-Idec Classroom (307; 3rd floor). For some classes, registrants have the option to attend online via Blackboard Collaborate.

Below are descriptions of workshops that have been regularly offered in past semesters.


UNIX for Biologists, Part 1: Introduction   

Time:  Not currently scheduled

Instructor: Hemant Kelkar, Ph.D., UNC Center for Bioinformatics
Location: HSL 307

The ability to operate in a command-line environment is becoming increasingly valuable in the genomics era. This workshop and the subsequent "UNIX for Biologists, Part 2: Kure" workshop are designed for experimental biologists who want to familiarize themselves with UNIX. This workshop will provide a very basic introduction to the UNIX operating system. Participants will learn how to log in (including the software required and the procedure) and interact with a UNIX server. We will cover basic UNIX system commands for interacting with the server and navigating the file system. We will also learn about UNIX directory structure and file permissions.
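As a small illustration of the file permissions topic, the sketch below converts numeric UNIX modes into the permission strings that a long directory listing displays. It is not workshop material; the helper name describe_mode and the example modes are assumptions chosen for illustration, and only Python's standard stat module is used.

```python
import stat


def describe_mode(mode: int) -> str:
    """Return the `ls -l` style permission string for a numeric file mode."""
    return stat.filemode(mode)


# A regular file: owner can read and write; group and others can only read.
print(describe_mode(stat.S_IFREG | 0o644))  # -rw-r--r--

# A directory that only its owner can list, enter, and modify.
print(describe_mode(stat.S_IFDIR | 0o700))  # drwx------
```

The first character marks the file type, and the next three groups of three characters give the read, write, and execute permissions for the owner, the group, and everyone else.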

Account Requirement: You will need an account on the “Kure” Linux cluster managed by ITS-Research Computing. Instructions for requesting a “Kure” account are available at this link: http://help.unc.edu/help/getting-started-on-kure/#P24_1521. Please submit your account request at least 3 days before class. It generally takes 24-48 hours for account creation.

Prerequisites: No prior knowledge of UNIX is required.  NOTE: For those unfamiliar with UNIX and Kure/UNC computer cluster usage, this class and UNIX for Biologists, Part 2: Kure (from this series or prior offerings) are prerequisites for the next generation sequencing classes that follow them in the BBT workshop series: "Next Generation Sequence Data: Introduction and Quality Control" and "RNAseq: The TopHat Suite."


UNIX for Biologists, Part 2: Kure   

Time:  Not currently scheduled

Instructor: Hemant Kelkar, Ph.D., UNC Center for Bioinformatics
Location: HSL 307

The ability to operate in a command line environment is becoming increasingly valuable in the genomics era. This workshop and the earlier "UNIX for Biologists, Part 1: Introduction" workshop are designed for experimental biologists who want to familiarize themselves with UNIX. This workshop will introduce UNC resources available to bioscience researchers and will focus primarily on resources on the "Kure" compute cluster operated by UNC ITS-Research Computing. After a quick recap of basic file/folder operations used in a UNIX environment, we will cover how to move files between the server and a local computer. We will introduce the “Kure” compute cluster and the software used for managing jobs on the cluster. Finally, participants will learn to use two basic bioinformatics tools (NCBI BLAST and UCSC BLAT) on the command line on the Kure cluster.

Account Requirement: You will need an account on the “Kure” Linux cluster managed by ITS-Research Computing. Instructions for requesting a “Kure” account are available at this link: http://help.unc.edu/help/getting-started-on-kure/#P24_1521. Please submit your account request at least 3 days before class. It generally takes 24-48 hours for account creation.

Prerequisites: If you are not familiar with UNIX, attending UNIX for Biologists, Part 1: Introduction is strongly recommended. If you are already somewhat familiar with UNIX, you may be able to skip Part 1.  NOTE: For those unfamiliar with UNIX and Kure/UNC computer cluster usage, this class and UNIX for Biologists, Part 1: Introduction (from this series or prior offerings) are prerequisites for the next generation sequencing classes that follow them in the BBT workshop series: "Next Generation Sequence Data: Introduction and Quality Control" and "RNAseq: The TopHat Suite."


Introduction to Next Generation Sequencing Data and Quality Control    

Time:  Not currently scheduled

Instructor: Hemant Kelkar, Ph.D., UNC Center for Bioinformatics
Location: HSL 307

Genome-wide analysis of gene expression has been a popular technique since the advent of microarrays. Next Generation Sequencing (NGS) technology was applied for this purpose early in its life cycle, and expression analysis has since become one of the most popular uses of NGS. We will start with a general overview of NGS technologies and consider commonly used data formats. Because of time and space constraints, we will use a small subset of sequences derived from a real dataset. I will introduce FastQC, a popular application for quality control (QC) of NGS data. We will then look at examples of FastQC results in addition to the one we will generate from our test data. This will be followed by an overview of data scanning/trimming programs, which look for contaminating sequences (e.g., adapters) and can also trim data based on various other criteria.
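To make the ideas of a sequence data format, quality scores, and adapter trimming more concrete, here is a minimal sketch in Python. It is not how FastQC or any particular trimming program works internally. It assumes four-line FASTQ records with Phred+33 quality encoding, and the two reads, the adapter string, and the function names are all made up for illustration.

```python
from statistics import mean

# Two made-up FASTQ records (header, sequence, separator, quality string).
EXAMPLE = """@read1
ACGTACGTAGATCGGAAGAGC
+
IIIIIIII#############
@read2
TTGGCCAA
+
IIII5555
"""

# Illustrative adapter sequence; a real run would use the kit's adapter.
ADAPTER = "AGATCGGAAGAGC"


def parse_fastq(text):
    """Yield (name, sequence, quality) from simple four-line FASTQ text."""
    lines = text.strip().splitlines()
    for i in range(0, len(lines), 4):
        header, seq, _plus, qual = lines[i:i + 4]
        yield header[1:], seq, qual


def phred_scores(quality, offset=33):
    """Decode a quality string into integer scores (Phred+33 assumed)."""
    return [ord(ch) - offset for ch in quality]


def trim_adapter(seq, qual, adapter):
    """Cut the read at the first exact adapter match, if there is one."""
    pos = seq.find(adapter)
    if pos == -1:
        return seq, qual
    return seq[:pos], qual[:pos]


for name, seq, qual in parse_fastq(EXAMPLE):
    trimmed_seq, trimmed_qual = trim_adapter(seq, qual, ADAPTER)
    print(name, trimmed_seq, mean(phred_scores(trimmed_qual)))
```

Real trimming tools also handle partial and mismatched adapter matches and quality-based trimming, which this exact-match sketch leaves out.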

Account Requirement: You will need an account on the “Kure” Linux cluster managed by ITS-Research Computing. Instructions for requesting a “Kure” account are available at this link: http://help.unc.edu/help/getting-started-on-kure/#P24_1521. Please submit your account request at least 3 days before class. It generally takes 24-48 hours for account creation.

Prerequisites: For those unfamiliar with UNIX and Kure/UNC computer cluster usage, UNIX for Biologists, Part 1 and UNIX for Biologists, Part 2 (from this series or prior offerings) are prerequisites for this class.


Using the Tuxedo Suite for Analysis of RNAseq Data   

Time:  Not currently scheduled

Instructor: Hemant Kelkar, Ph.D., UNC Center for Bioinformatics
Location: HSL 307

The Tuxedo suite consists of several programs (TopHat, Cufflinks, Cuffdiff). We will loosely base our exploration of the Tuxedo suite on this Nature Protocols paper (http://www.nature.com/nprot/journal/v7/n3/full/nprot.2012.016.html). We will also look at a read counting program (featureCounts) that generates summarized count data from alignment files; such counts are required as input by other packages such as DESeq2.
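The idea of summarizing alignments into per-gene counts can be sketched in a few lines of Python. This is a simplified illustration, not the featureCounts algorithm: the gene coordinates, the reads, and the function name count_reads are hypothetical, and real tools read alignment and annotation files and apply more careful rules for strand, multi-mapping, and overlap.

```python
# Hypothetical gene annotation: name -> (chromosome, start, end), inclusive.
GENES = {
    "geneA": ("chr1", 100, 500),
    "geneB": ("chr1", 800, 1200),
}

# Hypothetical aligned reads: (chromosome, start, end), inclusive.
READS = [
    ("chr1", 120, 170),   # inside geneA
    ("chr1", 480, 530),   # overlaps the end of geneA
    ("chr1", 600, 650),   # between genes, not counted
    ("chr1", 900, 950),   # inside geneB
    ("chr2", 900, 950),   # different chromosome, not counted
]


def count_reads(reads, genes):
    """Count reads that overlap exactly one gene; skip ambiguous or unassigned reads."""
    counts = {name: 0 for name in genes}
    for chrom, start, end in reads:
        hits = [
            name
            for name, (g_chrom, g_start, g_end) in genes.items()
            if chrom == g_chrom and start <= g_end and end >= g_start
        ]
        if len(hits) == 1:
            counts[hits[0]] += 1
    return counts


print(count_reads(READS, GENES))  # {'geneA': 2, 'geneB': 1}
```

A table of this shape, with one row per gene and one column per sample, is the kind of summarized count data that differential expression packages take as input.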

The agenda for this session will be fairly open. Depending on the progress we make, we may need to schedule one additional session.

Account Requirement: You will need an account on the “Kure” Linux cluster managed by ITS-Research Computing. Instructions for requesting a “Kure” account are available at this link: http://help.unc.edu/help/getting-started-on-kure/#P24_1521. Please submit your account request at least 3 days before class. It generally takes 24-48 hours for account creation.

Prerequisites: For those unfamiliar with UNIX and Kure/UNC computer cluster usage, UNIX for Biologists, Part 1 and UNIX for Biologists, Part 2 (from this series or prior offerings) are prerequisites for this class.


Introduction to the Protein Data Bank and PyMOL

Time: Not currently scheduled

Instructor: Brenda Temple, Ph.D., UNC Structural Bioinformatics Core Facility
Location: HSL 307 and online via Blackboard Collaborate

This workshop will introduce participants to the Protein Data Bank (PDB), a macromolecular structure database for crystallographic and NMR experimental structures, and to PyMOL, a molecular visualization system for rendering 3D structures.


Introduction to Clinical Genomics (NIH class via videoconferencing)

Time: Not currently scheduled

Instructor: Medha Bhagwat, Ph.D., NIH Library
Location: HSL 307

This class describes how to access information about genes and their disease-associated variants, and about the impact of variants on drug response and dosing guidelines. The class also introduces ways to assess the impact of variants on function, pathogenicity, or deleteriousness.


Browsing Genes and Genomes with Ensembl and Ensembl Genomes (A full day workshop)   

Time: Not currently scheduled

Instructor: Denise Carvalho-Silva, PhD, EMBL-EBI
Location: HSL 307

The Ensembl and Ensembl Genomes projects provide a comprehensive and integrated source of annotation of vertebrate and non-vertebrate genome sequences, respectively. The browser workshop will include presentation and demonstration, and provide participants an opportunity to gain hands-on experience in the use of Ensembl genome browsers.

Depending on the preferences of the participants, more or less emphasis can be given to the Ensembl browser (vertebrates) or to the Ensembl Genomes browser (non-vertebrates). The Ensembl browser workshops typically consist of the three "core" modules below plus two additional modules. Both the emphasis and the two additional modules will be determined by a pre-course survey sent to workshop registrants:

Core modules:

  • Introduction to Ensembl: origin, goals, gene and transcript annotation, data organization and different access points (e.g. browser, BioMart, FTP site, APIs)
  • Website live demo: guided tour of the most important pages of the Ensembl website focusing on the Location tab, Gene tab, Transcript tab, and Variation tab
  • BioMart: retrieving datasets using the web tool BioMart (no programming skills required)

Two modules to be selected from list below based on a pre-course survey of registrants:

  • Variation: short-scale variation (SNPs, indels), large-scale variation (CNVs, deletions, etc.), phenotype and population genetics data, VEP (Variant Effect Predictor)
  • Comparative genomics: gene trees (protein-coding and non-coding genes), orthologues, protein families, whole genome alignments and synteny
  • Regulation: ENCODE data hub, and annotation of regulatory elements based on ChIP-Seq, DNase1-Seq, FAIRE-Seq, and TFBS

Prerequisites: The only prerequisite for this workshop is general knowledge of molecular biology and genomics and a familiarity with web browsers.