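Topic: Combining protein language model embeddings with a multi-window scanning deep learning model to decode the biological language in membrane protein sequences and identify their functions
Speaker: Prof. Yu-Yen Ou (歐昱言)
Time: 2024/09/24 (Tue) 14:00-16:00
Venue: Artificial Intelligence Research Center (11th floor, Management Building)
Livestream: https://gqr.sh/awHN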
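Prof. Yu-Yen Ou (歐昱言) received a Ph.D. from the Department of Computer Science and Information Engineering, National Taiwan University, in 2005 and joined the Department of Computer Science and Engineering at Yuan Ze University the same year, serving there ever since. Prof. Ou's main research areas are machine learning and bioinformatics. Prof. Ou has applied machine learning techniques to protein sequence analysis for many years, focusing on function prediction and identification for membrane-protein-related sequences. Since 2016, with the rapid development of deep learning and natural language technologies, Prof. Ou has led a team in applying these techniques to sequence analysis in membrane-protein-related fields, and the team is one of the leading groups using protein language models for membrane protein sequence analysis. In recent years the team's publications in major bioinformatics journals include 3 papers in Briefings in Bioinformatics (IF: 9.5, ranked 3/55), 4 papers in Computers in Biology and Medicine (IF: 7.7, ranked 4/55), 1 paper in Bioinformatics (IF: 5.8, ranked 6/55), and 2 papers in IEEE/ACM Transactions on Computational Biology and Bioinformatics (IF: 4.5, ranked 7/125), among others.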
This presentation explores how pre-trained protein language models can be combined with multi-window scanning deep learning techniques to decode and analyze the biological language embedded within membrane protein sequences. We first use protein language models such as ProtTrans or ESM-2 to transform protein sequences into high-dimensional vector embeddings that capture the biological language within these sequences. We then apply multi-window convolutional neural networks (MCNN) to extract features at several scales, so that membrane protein functions can be identified from these language features. Combining language models with multi-scale analysis enhances our understanding of membrane and transporter proteins and offers new perspectives and potential applications in the field of bioinformatics.
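
To make the multi-window scanning idea concrete, the following is a minimal sketch in plain Python, not the speaker's implementation. It assumes a protein has already been turned into an L x D matrix of per-residue embeddings. The helper `toy_embeddings` is a hypothetical stand-in that assigns one fixed random vector per amino-acid letter, whereas a real model such as ProtTrans or ESM-2 would produce context-dependent vectors. The function name `multi_window_features`, the window sizes, the filter counts, and the random (untrained) filters are all assumptions chosen for illustration.

```python
import random


def toy_embeddings(sequence, dim=8, seed=0):
    """Hypothetical placeholder for a protein language model.

    Maps each amino-acid letter to one fixed random vector, giving an
    L x dim matrix. Real ProtTrans / ESM-2 embeddings are contextual.
    """
    rng = random.Random(seed)
    table = {aa: [rng.gauss(0.0, 1.0) for _ in range(dim)]
             for aa in sorted(set(sequence))}
    return [table[aa] for aa in sequence]


def multi_window_features(embeddings, window_sizes=(2, 4, 8),
                          filters_per_window=3, seed=0):
    """Scan an L x D matrix with filters of several window widths.

    For each window size, every filter slides along the sequence, the
    activations pass through ReLU, and global max pooling keeps one
    value per filter. Filters are random here; a trained model would
    learn them.
    """
    rng = random.Random(seed)
    length, dim = len(embeddings), len(embeddings[0])
    features = []
    for width in window_sizes:
        if width > length:
            raise ValueError("window size exceeds sequence length")
        for _filter in range(filters_per_window):
            kernel = [[rng.uniform(-1.0, 1.0) for _d in range(dim)]
                      for _w in range(width)]
            best = 0.0  # ReLU followed by max pooling never goes below 0
            for start in range(length - width + 1):
                activation = sum(kernel[i][j] * embeddings[start + i][j]
                                 for i in range(width) for j in range(dim))
                best = max(best, activation)
            features.append(best)
    return features


emb = toy_embeddings("MKTLLVAGAL", dim=8)
feats = multi_window_features(emb)
print(len(feats))  # one pooled value per filter per window size
```

The pooled values from all window sizes are concatenated into one fixed-length vector, so sequences of different lengths yield inputs of the same size for a downstream classifier.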