Amelia shares how innovative data science fuels science innovation
Staff Data Scientist
At Zymergen, math-professor-turned-data-scientist Amelia Taylor initially focused on plate model development, which is the practice of designing materials, processes, and measurements at low volume such that they’re predictive of strain performance at higher volume. It’s a crucial piece of developing microbes that work reliably at scale.
After leading the data science team’s work in this area for several years, she recently moved over to the Performance Modeling group. On her new team, she works with fermentation data, determining how best to use measurements captured over the course of multi-day fermentation experiments.
“Data comes in all different shapes.”
What constitutes Big Data is in the eye of the beholder. One of the things that is particularly challenging about the data we have, and working with biological data in general, is that ‘big’ to us actually means ‘wide.’
For example, a social media company might have a gazillion observations but only four features they care about. We have a good number of observations but not a gazillion. And the number of features we care about is large relative to the number of observations that we have. Having such ‘wide’ data is its own challenge in data science, one that I find particularly interesting and that we grapple with all the time here at Zymergen.
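To make the ‘wide’ data challenge concrete, here is a minimal sketch in Python with NumPy. It is an illustration only, not Zymergen’s data or method: the sizes (20 observations, 200 features) and the random data are assumptions chosen for the example. With more features than observations, ordinary least squares can reproduce pure noise exactly on the data it was fit to, yet that fit says nothing about new data.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical "wide" dataset: far more features than observations.
n_observations, n_features = 20, 200
X = rng.normal(size=(n_observations, n_features))
y = rng.normal(size=n_observations)  # pure noise, unrelated to X

# Least squares has enough free parameters to reproduce y exactly.
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
train_error = np.max(np.abs(X @ coef - y))

# The same coefficients applied to fresh noise do not fit it.
X_new = rng.normal(size=(n_observations, n_features))
y_new = rng.normal(size=n_observations)
new_error = np.max(np.abs(X_new @ coef - y_new))

print(f"largest error on fitted data: {train_error:.2e}")
print(f"largest error on new data:    {new_error:.2e}")
```

The error on the fitted data is at floating-point noise level even though there is no real signal, which is why wide data usually calls for regularization, feature selection, or careful validation.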
“Helping design the data-gathering processes is super cool.”
When we develop a plate model, we’re basically developing a manufacturing process that’s going to generate the data that we’ll then analyze and build predictive models on. It’s pretty rare, as a data scientist, to have any control over the data-generating process, and we do, which is super cool. We’re developing the best data-generating process we can to then build all of the rest of our data science on.
Our predictive models feed back into our strain recommendations: we’re developing the systems to determine which genetic changes we should make to which strains. And we have control over this core data-generating process. I think that is just amazing.
“We’re solving a critical problem for the company.”
What we care about is the strain performance. For each strain, our scientists take a sample and grow it in a well that’s in a plate, that plate is part of a group of plates, and that group is part of an experiment that’s being run this week. So there’s a lot of opportunity for statistical bias in that process. How do we know that the differences we observe reflect real differences in the performance of the strain?
We use techniques like normalization (we have a cutting-edge Bayesian approach) and unsupervised outlier detection to provide statistical rigor in determining which performance differences are valid and which may have arisen because of biases in the experiment. So our work here is key to Zymergen’s unique ability to predict the performance of strains at industrial volume based on their performance in plates.
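As a rough illustration of the two ideas named above, the sketch below uses Python with NumPy. It is not Zymergen’s Bayesian approach: it swaps in much simpler stand-ins, median-centering each plate for normalization and a robust z-score based on the median absolute deviation (MAD) for outlier detection. The function names, the 3.5 threshold, and the measurements are all hypothetical.

```python
import numpy as np

def normalize_by_plate(values, plate_ids):
    """Subtract each plate's median so all plates share a common baseline."""
    values = np.asarray(values, dtype=float)
    plate_ids = np.asarray(plate_ids)
    normalized = np.empty_like(values)
    for plate in np.unique(plate_ids):
        mask = plate_ids == plate
        normalized[mask] = values[mask] - np.median(values[mask])
    return normalized

def flag_outliers(values, threshold=3.5):
    """Flag values whose MAD-based robust z-score exceeds the threshold."""
    values = np.asarray(values, dtype=float)
    median = np.median(values)
    mad = np.median(np.abs(values - median))
    if mad == 0:
        # No spread to measure against, so flag nothing.
        return np.zeros(values.shape, dtype=bool)
    robust_z = 0.6745 * (values - median) / mad
    return np.abs(robust_z) > threshold

# Hypothetical well measurements: plate B reads higher overall,
# and its last well is suspiciously high.
values = [10.0, 10.2, 9.8, 10.1, 12.0, 12.3, 11.9, 18.0]
plates = ["A", "A", "A", "A", "B", "B", "B", "B"]

normalized = normalize_by_plate(values, plates)
outliers = flag_outliers(normalized)
print(outliers)
```

After centering, the general offset between plates A and B no longer looks like a performance difference, and only the last well stands out as a candidate outlier.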
Both scientists and data scientists use this performance data. We’re all solving hard problems together, so we all need the best information possible. Within the data science team, for instance, performance data is an input into the algorithms we’ve built that recommend specific genetic changes to improve strain performance. Those recommendations allow scientists to consider changes they wouldn’t or couldn’t have otherwise. So it ends up as an amazing collaboration between us.
“An opportunity to take my expertise into another realm.”
After having worked primarily on plate model development for several years, I recently moved over to our Performance Modeling team. On my new team, I’m working more with fermentation modeling, looking at how we can understand our fermentation processes better and feed that back into our plate models, and then improve our predictive ability. I can bring a lot of context and perspective across projects; it’s an opportunity to take my expertise into another realm. I’m super excited about using all these things to make better business decisions.
Astronaut, brain surgeon, and third grade teacher.
WHAT DID YOU STUDY:
SOMETHING THAT INSPIRED YOU RECENTLY:
My smart and thoughtful teammates inspire me every day to be my best self.
ONE THING MOST PEOPLE DON’T KNOW ABOUT YOU:
I traveled to Japan, as staff, with the first World Games Ultimate Frisbee team in 2001.
BEST ADVICE YOU’VE RECEIVED:
Decision-making can’t happen in a vacuum. Great ideas can come from anywhere, so make sure to hear from everyone. And when you make a decision, make sure you communicate how you arrived there, as this provides fuel for even greater ideas.