Cambridge Healthtech Institute's 2nd Annual

Machine Learning Approaches for Protein Engineering

Balancing Theory with Practice

May 18-19, 2023, EDT (Eastern Daylight Time)

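The advent of machine learning and AI tools is expected to have a vast impact on the field of protein engineering. The drug discovery and development process is fraught with inefficiencies caused by the lack of predictive tools. For machine learning and AI to truly transform methods of drug discovery, design, and optimization, we need to learn more about their use in antibody discovery, development of training sets, prediction, screening, simulation, and optimization. In the 2nd Annual Machine Learning Approaches for Protein Engineering track at PEGS Boston, join distinguished experts to learn how to transform the antibody development process and ultimately improve success rates.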

Scientific Advisory Board
M. Frank Erasmus, PhD, Head, Bioinformatics, Specifica, Inc.
Victor Greiff, PhD, Associate Professor, Oslo University Hospital
Maria Wendt, PhD, Head, Biologics Research US, Sanofi

Sunday, May 14

1:00 - 5:00 pm Main Conference Registration

2:00 pm Recommended Pre-Conference Short Course

SC3: In silico and Machine Learning Tools for Antibody Design and Developability Predictions

*Separate registration required. See short courses page for details.

Thursday, May 18

7:30 am Registration and Morning Coffee

NEXT-GENERATION IN SILICO PROTEIN ENGINEERING AND DE NOVO DESIGN

8:25 am

Chairperson's Remarks

Maria Wendt, PhD, Head, Biologics Research US & Global Head, Digital Biologics Platform (ML/AI), Large Molecule Research, Sanofi

8:30 am KEYNOTE PRESENTATION:

Recent Advances in Protein Engineering

Regina Barzilay, PhD, Delta Electronics Professor, Electrical Engineering & Computer Science, Massachusetts Institute of Technology

9:00 am

Surface ID: A Deep Learning-Based Molecular Descriptor and a Useful Tool for Drug Discovery

Yu Qiu, PhD, Senior Principal Scientist, Sanofi Genzyme R&D Center

“Surface ID” is a geometric deep learning system for high-throughput surface comparison based on geometric and chemical features. Surface ID offers a novel grouping and alignment algorithm useful for clustering proteins by function, visualization, and in silico screening of potential binding partners to a target molecule.

9:30 am Discovering antibodies from patient serum after vaccination and infection with SARS-CoV-2

Natalie Castellana, CEO, Abterra Biosciences

Serum antibodies from three individuals who had been fully vaccinated against SARS-CoV-2 and subsequently infected with the virus were analyzed by Alicanto. Serum antibodies were fractionated into those binding to the receptor-binding domain (RBD) and those binding to non-RBD sites on the spike protein. Memory B cells reactive to spike protein were enriched and sequenced via next-generation sequencing. A subset of the B cell sequences was identified among the serum antibodies.

10:00 am Coffee Break in the Exhibit Hall with Poster Viewing

10:40 am

Accelerating Therapeutics Discovery with Disruptive Digital Innovation

Peter Clark, PhD, Head of Computational Science & Engineering, Therapeutics Discovery, Janssen R&D

Significant advances in computational methods and hardware-accelerated scientific computing have enabled the dawn of a new era of medicine in which lifesaving therapeutic molecules can be designed and optimized with greater speed and precision than ever before. At Johnson & Johnson, we are leveraging data from across the pharmaceutical value chain, captured in a knowledge graph, to inform novel computational and deep learning models that drive innovation and disrupt the therapeutic research and development lifecycle. By building on our collective institutional knowledge across therapeutic programs and indications, we aim to accelerate the development of lifesaving therapies for patients across the globe.

11:10 am

Addressing Real-World Challenges in AI-Guided Design and Optimization of Biologics

Christopher J. Langmead, PhD, Director of Digital Biologics Discovery, Amgen

This presentation will provide an overview of the key challenges faced when using AI/ML to guide the design and optimization of biologics, including multi-specifics. We will then discuss some of the techniques used within Amgen to address these issues. Finally, we will argue that certain challenges are best solved through collaborative mechanisms, such as federated learning.

11:40 am

Designing Highly Stable Protein Libraries by Interpreting Deep Learning Models Trained on Flow Cytometry-Based Assays

Andrew Chang, PhD, CEO, DeepSeq.AI

The deep-learning model is no longer a black box. This talk will cover how we design a high-throughput assay to generate a protein-stability dataset for training the language model. In addition to predicting novel stable sequences using the trained model, such a model can educate us on what patterns or motifs make a particular sequence stable. Therefore, such protein stability knowledge extracted directly from the trained model becomes valuable for scientists redesigning a more stable protein library.

12:10 pm Luncheon in the Exhibit Hall and Last Chance for Poster Viewing

MACHINE LEARNING FOR ANTIBODY DISCOVERY

1:15 pm

Chairperson's Remarks 

Victor Greiff, PhD, Associate Professor, Immunology, University of Oslo

1:20 pm

Success and Challenges in AI-Driven Antibody Discovery: From Humanoid Antibodies to de novo Design

Joshua Smith, PhD, Molecular Design, Principal Scientist, Just - Evotec Biologics

Machine learning has become an integral part of antibody discovery and development. I will describe how we designed the J.HAL antibody discovery library with a generative machine learning method and share experimental results from recent discovery campaigns. I will also outline our computational approach to the problem of antigen-specific antibody design, provide in silico validation of the approach, and describe our plans for experimental validation.

1:50 pm

Applications of Geometric Deep Learning Model with a Novel Coarse-Grained Protein Structure Representation

Jae Hyeon Lee, PhD, Machine Learning Scientist, Prescient Design

I will discuss a new protein structure prediction model based on a novel coarse-grained protein structure representation. The model achieves atomic accuracy on antibody structure prediction and is orders of magnitude faster than other state-of-the-art models. In addition, I'll describe its application in various antibody property prediction and design tasks.

2:20 pm Talk Title to be Announced

Speaker to be Announced

2:35 pm Deep learning enables exploration of antibody space on unprecedented scale

Yi Li, VP of Strategic Development, Head of Antibody Discovery, XtalPi, Inc.

The theoretical antibody sequence space is immense and beyond interrogation by ordinary wet-lab means. Deep learning has established its superiority in fields that involve high-dimensional big data. We demonstrate the potential of deep learning to explore the whole antibody sequence space and find therapeutic candidates with superior efficacy and developability.

2:50 pm Networking Refreshment Break

3:20 pm

Predicting Disposition: Progress towards Relevant Preclinical Models for the Pharmacokinetics of Biologics

Vanita D. Sood, PhD, Senior Vice President, Head of Drug Discovery Research, Stealth Versant Ventures NewCo

Disposition properties of biologics, including clearance and immunogenicity, influence efficacy (no exposure, no effect); tolerability/safety (neutralizing or clearing anti-drug antibodies); and commercial success (route of administration, patient convenience). Compared to small molecules, there is a dearth of predictive preclinical models of clinical pharmacokinetics. I will discuss recent progress and challenges in predicting human PK.

3:50 pm

Antibody Profiling at Scale

H. Benjamin Larman, PhD, Associate Professor, Pathology, Johns Hopkins University

The Larman laboratory creates technologies for unbiased characterization of serum antibodies at cohort scale. This seminar will provide an overview of our current antibody profiling capabilities, recent findings, and ongoing developmental efforts that seek to overcome existing limitations of high-throughput antibody analyses.

4:20 pm Close of Day

Friday, May 19

7:00 am Registration Open

INTERACTIVE DISCUSSIONS

7:30 am Interactive Discussions with Continental Breakfast

Interactive Discussions are informal, moderated discussions, allowing participants to exchange ideas and experiences and develop future collaborations around a focused topic. Each discussion will be led by a facilitator who keeps the discussion on track and the group engaged. To get the most out of this format, please come prepared to share examples from your work, be a part of a collective, problem-solving session, and participate in active idea sharing. Please visit the Interactive Discussions page on the conference website for a complete listing of topics and descriptions.

TABLE 1: Meaningful Representation of Biologics for Machine Learning - IN-PERSON ONLY

Yu Qiu, PhD, Senior Principal Scientist, Sanofi Genzyme R&D Center

  • ML models do not understand proteins directly; a digital representation (numerical features) is needed as input
  • A meaningful representation (features) is key for ML models
  • A protein can be represented as a 1D sequence (one-hot or embedding), a 3D structure (a point cloud of Cartesian coordinates, or a graph with nodes and edges), or surface patches
  • Surface ID is a deep learning-derived representation, encoding geometric and chemical properties, that can be used for surface patch comparison
  • Applications of Surface ID include paratope clustering, PPI classification, and database mining
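
As a small illustration of the simplest option in the list above, the 1D one-hot representation, the following Python sketch turns an amino-acid sequence into numerical features. It is not Surface ID and not the facilitator's code; the function name one_hot_encode, the alphabet ordering, and the example fragment "EVQLV" are assumptions chosen only for illustration.

```python
# Minimal sketch of a one-hot sequence representation (illustrative only).
# Assumption: the 20 standard amino acids in alphabetical one-letter order.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
AA_INDEX = {aa: i for i, aa in enumerate(AMINO_ACIDS)}


def one_hot_encode(sequence):
    """Return one 20-element 0/1 row per residue in the sequence."""
    rows = []
    for aa in sequence.upper():
        if aa not in AA_INDEX:
            # Non-standard or ambiguous residues are rejected in this sketch.
            raise ValueError(f"Unknown residue: {aa!r}")
        row = [0] * len(AMINO_ACIDS)
        row[AA_INDEX[aa]] = 1  # mark the column for this residue
        rows.append(row)
    return rows


# Hypothetical five-residue fragment used only as example input.
encoded = one_hot_encode("EVQLV")
print(len(encoded), "residues x", len(encoded[0]), "features")
```

Each residue becomes a sparse 20-dimensional vector; learned embeddings, structure graphs, or surface patches replace these rows with richer features.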

TABLE 2: Implementation of Disruptive Digital Innovation & Deep Learning Models to Accelerate Therapeutics Discovery of Protein Therapeutics: Challenges & Opportunities - IN-PERSON ONLY

Peter Clark, PhD, Head of Computational Science & Engineering, Therapeutics Discovery, Janssen R&D

  • Explore common challenges for end-to-end integration and enterprise deployment of AI/ML models across the R&D product lifecycle   
  • How are organizations leveraging the growing suite of predictive models to inform and accelerate generative design and optimization of protein therapeutics?
  • How can we foster collaboration between different departments, including research, development, and CMC, to establish AI as a core organizational discipline? 
  • What are the opportunities & best practices for incorporating AI/ML models and integrated lab automation platforms from discovery to development?
  • How are advancements in computational hardware and infrastructure driving innovation in our digital platforms and business processes?

RULES FOR DEVELOPABILITY

8:25 am

Chairperson's Remarks

M. Frank Erasmus, PhD, Head, Bioinformatics, Specifica, Inc.

8:30 am

Identifying Sequence and Structure Features for mAb Developability Assessment

Christopher Negron, PhD, Principal Research Scientist, AbbVie, Inc.

With over 100 approved antibody-based therapeutics, antibodies are a well-established starting point for drug discovery. Despite this success, lead antibodies may lack desirable drug-like properties. Thus, we present the Therapeutic Antibody Developability Analysis (TA-DA), a computational tool built by testing hundreds of sequence- and structure-based descriptors for their ability to differentiate clinical antibodies from non-natively paired human repertoire antibodies.

9:00 am

Machine Learning Prediction of Methionine and Tryptophan Photooxidation Susceptibility

Jared Delmar, PhD, Associate Director, Biopharmaceutical Development, AstraZeneca

Photooxidation of methionine (Met) and tryptophan (Trp) residues is a common and major degradation pathway that often poses a serious threat to the success of therapeutic proteins. We applied the random forest machine learning algorithm to in-house liquid chromatography-tandem mass spectrometry (LC-MS/MS) datasets (Met, n = 421; Trp, n = 342) of tryptic therapeutic protein peptides to create computational models for Met and Trp photooxidation. We show that our machine learning models accurately predict Met and Trp photooxidation likelihood and further identify important physical, chemical, and formulation parameters that influence photooxidation.

9:30 am

Developability Profiling of Natural Antibody Repertoires

Victor Greiff, PhD, Associate Professor, Immunology, University of Oslo

Developability, the set of physicochemical properties of an antibody relevant for manufacturing and success in clinical trials, is a key determinant of success during clinical testing, and many developability parameters can be computed from the antibody sequence and structure. Although the distribution of developability parameters of natural antibody repertoires may provide guidance on the potential suitability of therapeutic antibody candidates, the sequence and structural distributional landscape of the natural antibody repertoire has not yet been described. We quantify the redundancy, sensitivity, and predictability of developability parameters in natural and clinical-stage antibodies. Exploiting the vast amount of available antibody high-throughput data will facilitate the derivation of the rules underlying developability profiles to guide antibody therapeutic discovery.


10:00 am Sequencing to Synthesis: How Machine Learning Maximizes Process Efficiency in Antibody Discovery

Crystal Richardson, PhD, Manager, Gene Synthesis, Gene Synthesis, Azenta Life Sciences

10:15 am Presentation to be Announced

10:30 am Networking Coffee Break

11:00 am

Predicting scFv Thermostability Using Machine Learning on Sequence and Structure Features

Kathy Y. Wei, PhD, Scientific Co-Founder, 310.ai

Multi-specific biologics are of interest due to the advantage of engaging distinct targets. One important component is the scFv, but its relatively poor thermostability often hampers development. As experimental methods are laborious and expensive, computational methods are an attractive alternative. Here, we show two machine learning approaches, one using pre-trained language models (PTLM) and the other a supervised convolutional neural network (CNN) trained with Rosetta energetics, to better classify thermostable scFv variants from sequence. On out-of-distribution sequences, we show that a simple CNN model outperforms a general PTLM trained on diverse protein sequences (Spearman ρ = 0.4 vs 0.15).

11:30 am

Development of Machine Learning Models for Prediction of Antibody Non-Specificity

Laila Sakhnini, PhD, Senior Research Scientist, Biophysics & Injectable Formulation, Novo Nordisk AS

Over the years, there has been an increased focus on decreasing non-specific binding during early-stage drug development. Non-specific binding has been recognized as a root cause of failure in many drug programs due to unexpected pharmacokinetics and elevated toxicity. From a computational design perspective, its prediction has remained a challenge. The proposed work describes the development of sequence-based machine learning models for prediction of this property with accuracy of up to 74%, enabling flagging and deselection of non-specific candidates at an early stage.

12:00 pm

Protein Design and Variant Prediction Using Autoregressive Generative Models

Debora S. Marks, PhD, Associate Professor, Systems Biology, Harvard Medical School

12:30 pm Close of Machine Learning Approaches for Protein Engineering Conference

* The program is subject to change without prior notice due to unforeseen circumstances.

