pambot.github.io


Google Summer of Code 2017 - cBioPortal

24 Aug 2017 | gsoc

Rejoining cBioPortal

Since I had a fun and productive summer last year, I decided to join cBioPortal again for GSoC in my last summer as a PhD student. This year’s project concerned ctDNA, a form of free-floating DNA in the bloodstream that contains tumor DNA in minute amounts. Because tumor DNA can be drawn relatively easily this way, one patient can now have a series of biopsies over time instead of just one or two whenever they have surgery. The challenge is to visualize this data in a way that makes sense of the information gathered. Working with JJ Gao and Ino De Bruijn, I wrote a proposal that centered on the potential use cases and the kinds of widget prototypes that would result.

Community Bonding

In a way, I had to get acquainted with cBioPortal afresh because, earlier in the year, they rolled out a new repository for their refactored front-end interface, which runs on ES6 + TypeScript + React. I would mainly be working with the Patient View page, which was the vanguard of their new look.

Patient View Prototypes

Since I had already bonded with the community through my work last year, this year I played around with the repository and then dived right into making the proposed widget prototypes. This was my introduction to React as well. I wrote a component for a new tab to contain the ctDNA widgets. After going through several potential React plotting node modules, we settled on Recharts for the line plot. The heatmap plot was trickier, because there aren’t many mature React plotting libraries that support heatmaps and also offer them without a paywall, so I cobbled one together using the ChartJS heatmap library. The React way of controlling component state made it easy to link the two plots to a gene list that could be entered by the user.

lineplot_heatmap

Commit log: https://github.com/pambot/cbioportal-frontend/commits/ctdna-tab?author=pambot
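The linking idea can be sketched without any React at all: keep one gene list as the single source of truth and derive the data for both plots from it. This is only an illustration, not the code from the commits. The VafPoint shape and both function names are hypothetical, and in the real tab the gene list would live in component state.

```typescript
// One measurement for a gene in a single biopsy (hypothetical shape).
interface VafPoint {
  gene: string;
  sampleId: string;
  vaf: number; // variant allele frequency, between 0 and 1
}

// Turn free-text user input ("tp53, KRAS\nEGFR") into a clean gene list:
// split on commas and whitespace, upper-case, drop blanks and duplicates.
function parseGeneList(input: string): string[] {
  const symbols = input
    .split(/[\s,]+/)
    .map((s) => s.toUpperCase())
    .filter((s) => s.length > 0);
  return symbols.filter((s, i) => symbols.indexOf(s) === i);
}

// Both plots would call this with the same gene list, so they stay in sync
// whenever the user edits the list.
function selectPoints(points: VafPoint[], genes: string[]): VafPoint[] {
  return points.filter((p) => genes.indexOf(p.gene) !== -1);
}
```

With this split, the line plot and the heatmap never hold their own copy of the selection, which is what makes the shared state easy to reason about.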

Patient View Oncoprint

cBioPortal is really proud of its Oncoprint library, which generates a matrix of rectangles that can be populated either by glyphs to represent discrete events or by a solid color to represent continuous phenomena. cBioPortal uses it to show genetic alterations and/or a heatmap of gene/protein expression in cancer patients (which was the subject of my work last year). JJ decided that, rather than using an external heatmap library, it might be more useful to make another use case for the Oncoprint: displaying the variant allele frequencies or cancer clonal fractions (both continuous values) of the cancer patients.

This Oncoprint was going to be transposed: the number of biopsies was potentially very variable, while the number of genes would be relatively constant, so it was decided that the genes should be columns and the biopsies should be rows. There were two major challenges. One was transforming the data feed from the web API to produce a transposed Oncoprint, and the other was fitting a library written mostly in jQuery for the old stack into the new TS + ES6 + React stack. The data transformations were done using lodash, but the bigger challenge was the latter, because the library wasn’t importable via Node and it had no typings. Eventually, it was solved with a bit of a hack: putting the library file into common-dist/ to load it via an HTML script tag, making a TS interface for Window, and wrapping the instantiation of Oncoprint in a function that took the Oncoprint object in as a parameter with an any type. The result, which was embedded into the existing Patient View page, is shown here with real ctDNA data:

ctdna_oncoprint
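To make the transposition concrete, here is a rough sketch of turning flat per-gene, per-biopsy records into one track per biopsy with one cell per gene. It is an illustration only: the record and track shapes are assumptions rather than the actual web API or Oncoprint formats, and it uses plain array methods where the real code used lodash.

```typescript
// Flat record as it might arrive from an API (hypothetical shape).
interface VafRecord {
  gene: string;
  sampleId: string;
  vaf: number;
}

// One cell of the transposed matrix; null means "no value for this gene".
interface Cell {
  gene: string;
  vaf: number | null;
}

// One row of the transposed matrix: a biopsy with a cell for every gene.
interface Track {
  sampleId: string;
  cells: Cell[];
}

// Keep the first occurrence of each value, preserving input order.
function uniqueInOrder(values: string[]): string[] {
  return values.filter((v, i) => values.indexOf(v) === i);
}

// Biopsies become rows and genes become columns. Every row gets the same
// gene order, with null filling in genes that were not measured.
function toTransposedTracks(records: VafRecord[]): Track[] {
  const genes = uniqueInOrder(records.map((r) => r.gene));
  const samples = uniqueInOrder(records.map((r) => r.sampleId));
  return samples.map((sampleId) => ({
    sampleId,
    cells: genes.map((gene) => {
      const hit = records.filter(
        (r) => r.sampleId === sampleId && r.gene === gene
      )[0];
      return { gene, vaf: hit ? hit.vaf : null };
    }),
  }));
}
```

Filling the gaps with null keeps every row the same length, so a patient with many biopsies only adds rows and never changes the column layout.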

Commit log: https://github.com/pambot/cbioportal-frontend/commits/patient-view-oncoprint?author=pambot
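The script-tag workaround described above can be sketched roughly as follows. This is not the code from the commits: the interface name, the function name, and the assumption that the constructor takes a container selector and a width are all illustrative.

```typescript
// Describes the global that the <script> tag is expected to define.
// The library ships no typings, so the constructor is typed as any.
interface WindowWithOncoprint {
  Oncoprint: any;
}

// The constructor is passed in as a parameter instead of being imported,
// which keeps the untyped global out of the rest of the code.
// Assumption: the constructor takes (containerSelector, width).
function createOncoprint(
  OncoprintCtor: any,
  containerSelector: string,
  width: number
): any {
  return new OncoprintCtor(containerSelector, width);
}

// In the browser, after the script tag has loaded, usage might look like:
//   const w = (window as any) as WindowWithOncoprint;
//   const oncoprint = createOncoprint(w.Oncoprint, "#ctdna-oncoprint", 800);
// ("#ctdna-oncoprint" and 800 are made-up values.)
```

The cost of this approach is that nothing about the library is type-checked, which is part of why making the package importable later was worthwhile.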

Updating the OncoprintJS Library

Though the basic Oncoprint could be viewed, a few things could be improved. The major issue was that Oncoprint was loaded via a script tag and not through require, and a minor one was that some links to assets in the library were now broken because they weren’t being directly served by Webpack anymore. The first thing I did, which I had intended to do last year anyway, was to write minimum working examples (MWEs) for Oncoprint and some documentation in the README. I made a very basic node server, one discrete data example, and one continuous data example, and here’s the result:

oncoprint_mwe

Commit log: https://github.com/pambot/oncoprintjs/commits/mwe?author=pambot

The examples above show Oncoprint working as it should, but some assets are actually missing if you just install it from NPM. I attempted a fix by dynamically creating the SVG for one of the major assets (the three grey dots), but this idea was discarded because cBioPortal would like a more general way of including image assets.

Commit log: https://github.com/pambot/oncoprintjs/commits/menudots-svg?author=pambot
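For illustration, generating a three-dot icon in code instead of shipping an image file could look something like this. It is a sketch under assumptions, not the discarded fix itself: the function names, the vertical layout, the size, and the shade of grey are all made up.

```typescript
// Build an SVG string with three dots stacked vertically.
// Size and color defaults are arbitrary illustrative values.
function menuDotsSvg(size: number = 16, color: string = "#888888"): string {
  const radius = size / 8;
  const cx = size / 2;
  const circles = [0.2, 0.5, 0.8]
    .map(
      (f) =>
        `<circle cx="${cx}" cy="${size * f}" r="${radius}" fill="${color}"/>`
    )
    .join("");
  return (
    `<svg xmlns="http://www.w3.org/2000/svg" width="${size}" ` +
    `height="${size}" viewBox="0 0 ${size} ${size}">${circles}</svg>`
  );
}

// Encode the SVG as a data URI so it can be used as an image source
// without a separate file having to be served.
function toDataUri(svg: string): string {
  return "data:image/svg+xml," + encodeURIComponent(svg);
}
```

This removes the broken asset link for one icon, but every other image would need its own generator, which is the generality problem mentioned above.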

Finally, I figured out how to make the package directly importable through require, instead of having to require a full path to the library within the node package. It was simply a matter of taking the gulped library file, putting it in the root of the repository, and renaming it index.js. Some configurations had to be changed to clean up the node package and make the build process automatic using Travis CI. Also, the README had to be updated so that new users of Oncoprint could get a running start.

Commit log: https://github.com/pambot/oncoprintjs/commits/node-package?author=pambot
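The reason the rename works is that, when a package is required by name, Node looks at the "main" field of its package.json and otherwise falls back to index.js in the package root. The function below is a deliberately simplified model of that rule, not Node's actual resolution algorithm, and the manifest type and function name are hypothetical.

```typescript
// The subset of package.json that matters for this rule.
interface PackageManifest {
  name: string;
  main?: string;
}

// Simplified model: use "main" when it is set, otherwise index.js.
function entryFile(manifest: PackageManifest): string {
  return manifest.main && manifest.main.length > 0
    ? manifest.main
    : "index.js";
}

// With no "main" field, a bundled file placed at the package root as
// index.js is what a plain require of the package name would load.
const defaultEntry = entryFile({ name: "oncoprintjs" });
```

Setting "main" to the built file's path would be another way to get the same effect without moving the file.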



Google Summer of Code 2016 - cBioPortal

How the Summer Began

My journey with GSoC began about a week before the proposal deadline when I found out I was eligible to apply. I looked through bioinformatics projects, since that’s my PhD field, and I found that one of my dataviz heroes, cBioPortal, was looking. I pinged their issue list and introduced myself just in case it wasn’t too late, and not only was it not too late, but they were glad I pinged them because I come from a proteomics lab that’s part of CPTAC, a cancer proteomics consortium, and they had this big data update that they had been meaning to get running. I spent a grueling two days writing and revising my proposal, which ended up getting accepted. From then on, JJ Gao became my primary mentor and was heavily involved in directing all of the subprojects.