This past weekend, all of us in Titus Brown’s Data Intensive Biology (DIB) lab went to Yosemite Bug, which is just outside Yosemite National Park, for our first (annual?) lab retreat. We had a great time! I personally found it inspiring to gather thoughts on the direction of research in the lab, ask questions about what everyone else is working on, and think about how my research goals fit into the larger picture of the lab.
Here are some notes from the weekend in case anyone is interested. Please comment and ask questions. Further discussion is welcome!
Photos by Harriet Alexander (left – Camille Scott looking far yonder) and Lisa Cohen (right)
In October, about four months prior, we agreed on a location and date and began planning (gathering info and booking rooms and conference space at the resort). About one week prior, we had a brainstorming meeting about the schedule and what we would discuss.
Everyone drove up (~3.5 hrs from Davis) to Yosemite Bug on Friday. We discussed lab business on Saturday and Sunday, with some time in the middle to enjoy the outdoors and talk with each other in an informal setting. The idea was to stimulate discussions about the lab (e.g. research, culture and career development) in a context outside the lab. We wanted it to be different from a conference: a retreat would be just our group, broader and more casual than regular lab meetings, discussing the big picture of the lab’s research direction.
Saturday morning, presentations
To open, Titus identified major themes in the lab right now:
* Expect many big samples arriving continuously
* Sketch data structures and online/streaming algorithms are good
* Pre-filtering is good, especially when each step has a low false-negative rate
* Decentralized is good
Throughout the morning, there were presentations from the major projects in the lab. Presentations were 10 min each with 5 min discussions. Some hot topics bled over to be ~30 min each. These were informal talks with markers and flip chart only (no slides or projector allowed). The internet at the resort was patchy, but luckily the goal of the retreat was not to work on anything requiring an internet connection.
People gave a broad outline of what they are doing, followed by one or two things they’re excited about (enables X and Y, or Z is an opportunity), then 5 minutes of questions.
* Camille Scott / Streaming the RNAseq
* Luiz Irber / Architecture of all the buzzwords (amazing basic-level explanation of the internet for those of us who are unfamiliar)
* Taylor Reiter / sourmash RNAseq
* Daniel Standage / kevlar
* Harriet Alexander and Lisa Cohen / MMETSP and challenges of multi-species data analysis
* Tamer Mansour / Progress and opportunities in vet genetics
Saturday afternoon, free time!
The weather was great and sunny! We had anticipated not-so-great weather with just-above-freezing rain, but this was not the case. In the afternoon, we all piled into cars to explore Yosemite National Park! Shannon Joslin did an amazing job of summarizing available social activities into this list:
Saturday evening, social time!
Jessica Mizzi, who takes fun VERY seriously, coordinated games and activities:
I participated in a few heated games of Settlers of Catan and Pictionary. It turns out that several members of our lab are relentless resource emperors and that there are varying degrees of artistic ability. 🙂
Photos by Camille Scott (left) and Daniel Standage (right)
Two postdoc lab members recently attended the Moore DDD early career workshop and brought back suggestions for continued discussion on the field of ‘data science’. We found it useful to discuss the larger context of how we market ourselves, develop our careers, and fit ourselves into biological research. Data-intensive biology is a large field. In our lab alone, we represent diverse disciplines, e.g. Software Engineering, Genomics, Biological Oceanography, Comparative Physiology, Medicine, Mathematics, just to name a few. We cannot each have a deep understanding of all of these peripherally-related topics. Yet, our collective knowledge is great. How can we better extract overlapping skills from each other to solve hard problems?
We broke out into 3 groups of 6 people at a mixture of career levels (e.g. beginning grad students, mid-level grad students, postdocs, industry-bound post-PhDs) to address these specific questions:
* How does Person x learn y topic?
* What works?
* How do we teach the Davis community about y topics? (Especially when we might not necessarily know these things ourselves.)
The following is an approach I’m trying based on some helpful blogging advice: choosing words and phrases explaining what has worked for us (or me, specifically) rather than telling people who read this what they should be doing. This is because I am more apt to listen to someone else’s wisdom gained from their own experiences.
– Learning topics has depended on why we want to learn.
– It is up to the learner to find motivation, rather than follow a list of what others think you need to know. That said, we acknowledged that it is hard to figure out what you need to know if you don’t know it yet. Some base-level knowledge is required.
– We have been told that skills in bioinformatics are required for successful future careers. However, there are no institutional-level plans for how to disseminate these skills to learners.
– Beginning learners can feel overwhelmed because of the interdisciplinary nature of bioinformatics, sometimes requiring a combination of knowledge and skills in computer programming, statistics, cell and molecular biology, etc.
– It has helped many of us to take a project-based learning approach.
– Three motivating scenarios were identified for developing a working knowledge of bioinformatics skills:
- Biologist generating data, e.g. RNAseq for differential expression. In the long-term it doesn’t seem to make sense to rely on a sequencing facility to analyze data because decisions made during analysis affect the results. Making these decisions requires revisiting the question of why the data were generated in the first place, which is not necessarily within the scope of an independently contracted analyst to be familiar with. It has been our experience that data are best analyzed by people who know the projects very well.
- Data analyst understanding many projects simultaneously and advising those generating and analyzing their own data on the best way to approach analysis, based on the analyst’s own experience, consensus in the field and benchmark testing.
- Data Scientist at a senior level guiding the direction of research and training programs, and developing new methods for processing data.
– Our lab has representation from all three of these categories.
– Internet-learning, a buddy system and participating in a community are all key aspects of learning bioinformatics skills, and some combination of them seems to work for all of us.
– Buddy system. Many of us have found that forming connections with a person or a community of people at a knowledgeable level to answer questions has been necessary for our learning process. Community and personal connections can be fostered via workshops, classes, conferences, social events and friendships.
– We have had good luck using opportunities to collaborate and asking for advice from experts when we meet them. The great thing about this lab and knowing Titus is being able to take advantage of his far-reaching network of collaborators.
– Internet-learning by Google searching. Stack Overflow is our friend.
– Some of us have chosen a good book, e.g. Practical Computing for Biologists by Haddock and Dunn, to read and go through the exercises on a regular basis together with a group of people.
– We’ve found it helpful to join a community to ask/answer questions. We are actively working towards fostering such a community at UC Davis via the DIB lab! See our website for the training workshop schedule and to sign up for the email list: http://dib-training.readthedocs.io/en/pub/
– It has been our experience that significant investment of personal time is required to learn.
Here are our flip chart notes from this discussion:
The last afternoon discussion centered around lab culture hacking, i.e. what we are doing well and what needs improvement. A motivational speech from Titus: there are always going to be various things the lab can do better, but generally, we’re in a good place! The lab is a set of opportunities. Choose your own adventure. If we’re not doing something, we can provide resources to accomplish goals. Overall, his expectations are for us to do wonderful and unexpected things. Preferably multiple wonderful things!
Then Titus left for an hour and a half while we discussed the lab. Topics included a more frequent journal club, more frequent project reporting and scrum at every lab meeting (rather than one designated presenter showing slides on their own research the whole time), and lab communication on Slack vs. email vs. Google Calendar for scheduling. The common theme was that while our projects are all very different, we are all connected and the onus is on us to take more initiative to communicate with one another. We talked about positive and negative aspects of the lab, but generally concluded that our lab is awesome because of our strong community and the diverse backgrounds of our lab members. The meeting adjourned, with some of us returning to watch the Super Bowl while others stayed on to play more Settlers of Catan and Pictionary!
Thank you to Yosemite Bug for the quiet, cozy, accommodating place for our group to stay and be productive this weekend. It was a perfect, small venue for this retreat.
Thank you, Titus, for bringing us on this retreat. Thank you, everyone in the DIB lab, for being fun people. And thank you, Moore Foundation, for funding!
Photo by James Word
Photo by Shannon Joslin