NLP in biology

Björn Runåker
Jan 23, 2021

Here is a use case that is a perfect fit for NLP, and one I plan to demonstrate once I have enough resources.

In the article linked below, the authors use language understanding to predict viral escape. If you look closely at how viral mutations and antibodies evolve, there is a close analogy between the grammar and meaning (syntax and semantics) of natural language and the sequence mutations of a viral protein.

https://science.sciencemag.org/content/371/6526/284

Abstract

“The ability for viruses to mutate and evade the human immune system and cause infection, called viral escape, remains an obstacle to antiviral and vaccine development. Understanding the complex rules that govern escape could inform therapeutic design. We modeled viral escape with machine learning algorithms originally developed for human natural language. We identified escape mutations as those that preserve viral infectivity but cause a virus to look different to the immune system, akin to word changes that preserve a sentence’s grammaticality but change its meaning. With this approach, language models of influenza hemagglutinin, HIV-1 envelope glycoprotein (HIV Env), and severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) Spike viral proteins can accurately predict structural escape patterns using sequence data alone. Our study represents a promising conceptual bridge between natural language and viral evolution.”
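To make the grammar-versus-meaning idea concrete, here is a minimal Python sketch of how candidate mutations could be ranked. It assumes a trained language model has already produced two numbers per mutation: a grammaticality score (how plausible the mutated sequence is) and a semantic change score (how far its embedding moves from the original). The mutation names, the numbers, and the function names rank_ascending and escape_priority are all made up for illustration, and the rank-sum rule with weight beta is a simplification in the spirit of the abstract, not the authors' implementation.

```python
from typing import Dict, List, Tuple


def rank_ascending(values: Dict[str, float]) -> Dict[str, int]:
    """Give the smallest value rank 1 and the largest value rank n."""
    ordered = sorted(values, key=values.get)
    return {name: position + 1 for position, name in enumerate(ordered)}


def escape_priority(
    grammaticality: Dict[str, float],
    semantic_change: Dict[str, float],
    beta: float = 1.0,
) -> List[Tuple[str, float]]:
    """Rank mutations: high semantic change AND high grammaticality come first."""
    g_rank = rank_ascending(grammaticality)
    s_rank = rank_ascending(semantic_change)
    scores = {m: s_rank[m] + beta * g_rank[m] for m in grammaticality}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical scores for four made-up mutations (not real model output).
grammaticality = {"A10T": 0.30, "G45D": 0.25, "K77E": 0.02, "L90P": 0.10}
semantic_change = {"A10T": 0.1, "G45D": 1.2, "K77E": 0.9, "L90P": 0.5}

ranking = escape_priority(grammaticality, semantic_change)
for mutation, score in ranking:
    print(mutation, score)
```

In this toy data, G45D comes out on top because it is both plausible (the "sentence" stays grammatical) and far from the original in meaning. K77E changes the meaning a lot but is unlikely to be viable, and A10T is viable but barely changes anything, so both rank lower.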

ProGen, a language model from Salesforce Research, demonstrates how an NLP model can learn the language of biology to generate medically useful proteins.

https://blog.einstein.ai/progen/

ProGen can generate protein sequences that appear structurally and functionally viable. This comes from applying the machine learning techniques behind NLP language models to protein sequences instead of text.
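The core idea of generating a sequence one token at a time can be shown with a far simpler stand-in. The sketch below is not ProGen: it is a toy bigram model over amino-acid letters, trained on three made-up sequences, and the names train_bigram and sample_sequence are hypothetical. It only illustrates the principle that a language model picks each next residue based on what came before.

```python
import random
from collections import defaultdict

START, END = "^", "$"  # markers for the beginning and end of a sequence


def train_bigram(sequences):
    """Count how often each residue follows another in the training data."""
    counts = defaultdict(lambda: defaultdict(int))
    for seq in sequences:
        tokens = [START, *seq, END]
        for prev, nxt in zip(tokens, tokens[1:]):
            counts[prev][nxt] += 1
    return counts


def sample_sequence(counts, rng, max_len=30):
    """Generate one sequence by repeatedly sampling the next residue."""
    residues = []
    current = START
    while len(residues) < max_len:
        choices = counts[current]
        nxt = rng.choices(list(choices), weights=list(choices.values()))[0]
        if nxt == END:
            break
        residues.append(nxt)
        current = nxt
    return "".join(residues)


# Made-up training sequences in one-letter amino-acid code (not real proteins).
toy_sequences = ["MKTAYIAKQR", "MKVLAAGIVG", "MKTLLLTLVV"]

model = train_bigram(toy_sequences)
generated = sample_sequence(model, random.Random(0))
print(generated)
```

A real model such as ProGen conditions on much longer context and is trained on a very large collection of sequences, which is what makes its output plausible as actual proteins.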

Those of us who lack the resources to research this field directly can still help by donating computing power to Folding@home.

https://foldingathome.org/

Björn Runåker

Software developer interested in deep learning in combination with big data and security