Skills

Bioinformatics

DNA sequence analysis & software design

Statistics

Application and interpretation

Python

Data analytics & machine learning

R

Data cleaning, visualization & statistical analysis

Git and GitHub

Project management & version control

Data management

Large-scale, cloud-based data analysis on remote access servers

Computer Vision

Image-based machine learning algorithms

Scientific writing

Linux and Unix

Experience

Research Scientist

Fisheries and Oceans Canada | Bedford Institute of Oceanography

Oct 2021 – Present Halifax, NS
  • Bioinformatics: Created whole genome re-sequencing data analysis pipelines for efficient, parallel processing of terabytes of data from hundreds of fish samples, resulting in the identification of millions of genetic variants.
  • Machine Learning: Analyzed genetic and environmental data using supervised and unsupervised machine learning algorithms to characterize populations’ structure, evolutionary history, and susceptibility to climate change.
  • Data science and software engineering: Designed custom programs in Python and R to clean, visualize, and statistically analyze large datasets.
  • Cloud-based big data analysis: Made scientific discoveries by managing and analyzing terabytes of data on remote-access computing clusters, using the Microsoft Azure cloud computing environment and its resources.

Data Scientist

SomaDetect Inc

Jul 2020 – Aug 2021 Remote
  • Image Classification: Designed, tested, and deployed a quality control algorithm for selection of high-quality image data for subsequent biological predictions.
  • Time Series Analysis: Designed an image-time-series flow detection algorithm (bidirectional LSTM) capable of identifying when milk was flowing through sensors with high accuracy.
  • Statistics and Data Visualization: Developed statistical tests to validate the performance of a variety of machine learning models and facilitate the biological interpretation of results.
  • Software development: Packaged machine learning models into Python libraries for edge deployment on company’s sensors.
  • Scientific communication: Met with external stakeholders, including farmers, venture capitalists, and researchers, to clearly communicate the methods and efficacy of the machine learning algorithms used in the company’s products.
  • Maintained and documented software using Git and Bitbucket for version control.
  • Worked as a member of a team employing Agile project management and using tools such as Jira, Confluence, and Trello.

Postdoctoral Researcher

University of Guelph | Centre for Biodiversity Genomics

May 2019 – Jul 2020 Guelph, ON
  • Designed alfie, an alignment-free DNA classifier. The program uses a neural network to rapidly assign kingdom-level taxonomic classifications with greater than 99.5% accuracy, without sequence alignment.
  • Created coil, an R package for alignment, translation, and error evaluation of COI-5P barcode data. Available on CRAN.
  • Created debar, an R package for identifying and correcting technical errors in high-throughput DNA sequencer outputs for the COI-5P DNA barcode. Available on CRAN.


PhD Student

University of Guelph | Department of Integrative Biology

Sep 2014 – Apr 2019 Guelph, ON
Research projects:

  • Characterized the genetic basis of important Arctic charr aquaculture traits.
  • Designed a custom 87K SNP genotyping array for Arctic charr.
  • Contributed to the assembly of the Arctic charr reference genome.
  • Constructed a linkage map for the Arctic charr genome.

Publications

DNA barcoding and metabarcoding are now widely used to advance species discovery and biodiversity assessments. High-throughput…

Characterization of biodiversity from environmental DNA samples and bulk metabarcoding data is hampered by off-target sequences that …

Conference Talks

Recent Posts

R’s tryCatch function is a great tool that helps facilitate robust error handling. It lets you try to run a block of code and if an …

As part of my current postdoctoral research I’ve built the R package coil, which is designed to aid users in DNA barcode data cleaning …

Note: Here you will find the raw RMarkdown file for this post, in case you want to follow along and execute the code yourself! …

Older readers of this post may remember the boot screen from Windows XP. This featured a load bar that was there to essentially give a …

When applying a function to a vector, list or dataframe column, your first instinct may be to iterate across the series of inputs. By …