software design

Making use of R’s tryCatch function

R’s tryCatch function is a great tool for robust error handling. It lets you try to run a block of code and, if an error occurs, handle the exception in a customized manner (as opposed to halting the entire script). I have been using this design pattern pretty regularly, and there are two situations in which I’ve found tryCatch to be especially handy:
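As a minimal sketch of the general pattern described above (not the specific situations covered in the full post), the example below wraps a call that can fail and returns a fallback value instead of stopping the script. The helper name `safe_log` and the choice of `NA` as the fallback are assumptions made for illustration.

```r
# Hypothetical helper: take the natural log of x, but return NA
# (with a message) instead of halting when log() raises an error.
safe_log <- function(x) {
  tryCatch(
    {
      # The code we want to try.
      log(x)
    },
    error = function(e) {
      # Runs only if the block above signalled an error.
      message("Could not take the log: ", conditionMessage(e))
      NA_real_
    }
  )
}

safe_log(10)     # works as normal
safe_log("ten")  # log() errors on character input; the handler returns NA
```

Because the handler returns a value, the calling script keeps running and can decide later what to do with the `NA`.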

Introducing coil: an R package for DNA barcode data cleaning

As part of my current postdoctoral research I’ve built the R package coil, which is designed to aid users in DNA barcode data cleaning and analysis. The package is available now on CRAN or through my GitHub! Below I’ve included the package’s vignette, which explains how you can get coil up and running.

#downloading the package from CRAN:
#install.packages('coil')
library(coil)

Abstract

coil is an R package designed for the cleaning, contextualization and assessment of cytochrome c oxidase I DNA barcode data (COI-5P, or the five prime portion of COI).

An introduction to using functions in R

Note: Here you will find the raw RMarkdown file for this post, in case you want to follow along and execute the code yourself!

Introduction

R is considered to be a functional programming language. What this means is that the syntax and rules of the language are most effective when you write code built around the use of functions. Functions allow you to modularize code, thereby isolating different blocks in a way that makes your code more generalized, reusable, readable and easier to debug.
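To make the point about modularizing code concrete, here is a small sketch. The function `rescale01` and the example vectors are hypothetical and are not taken from the post itself; the idea is only that logic written once inside a function can be reused on different inputs.

```r
# Hypothetical helper: rescale a numeric vector to the range [0, 1].
rescale01 <- function(x) {
  rng <- range(x, na.rm = TRUE)
  (x - rng[1]) / (rng[2] - rng[1])
}

# Made-up example data.
heights <- c(150, 160, 170)
weights <- c(50, 75, 100)

# The same block of logic is reused rather than copied and pasted.
rescale01(heights)
rescale01(weights)
```

If the rescaling rule ever needs to change, it only has to be edited in one place.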

This thing I built: go-fasta

github.com/CNuge/go-fasta

I spend a lot of time working with fasta files, the standard format for storing biological sequence information (things like DNA sequences or protein sequences). For a little side project, I wanted to create a tool to streamline my ability to manipulate these files. Merging files is relatively simple, but splitting files, summarizing sequence lengths and other tasks require more complex UNIX commands that I would have to dig up every time I wanted to manipulate some fasta files.

Returning multiple values from a function in R

During a tutorial I gave for the University of Guelph R users group, we were going through how to generate summary stats & tidy dataframes from messy data sources. This involved working with text data, and the exercise called for us to process a series of sentences and answer 3 questions about each line:

Is the line dialogue? (presence of a quotation mark in the string)
Is the line a question?
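One common way to return several values from a single R function is to bundle them in a named list. The sketch below applies that idea to the two questions listed above. The function name `describe_line` is hypothetical, and treating any line containing a question mark as a question is an assumption for illustration, so this may differ from the approach taken in the full post.

```r
# Hypothetical helper: answer two yes/no questions about one line of text
# and return both answers together in a named list.
describe_line <- function(line) {
  list(
    # Dialogue: the line contains a double quotation mark.
    is_dialogue = grepl('"', line, fixed = TRUE),
    # Question (assumed rule): the line contains a question mark.
    is_question = grepl("?", line, fixed = TRUE)
  )
}

result <- describe_line('"Where are you going?" she asked.')
result$is_dialogue
result$is_question
```

Each element can then be pulled out by name, which makes it straightforward to turn the results for many lines into columns of a dataframe.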