Hi, this is Weizhe Lin. After graduating from high school, I studied at the University of Hong Kong for one year before transferring to the University of Cambridge to pursue a four-year BA/MEng program. During my undergraduate studies at the Department of Engineering, I was honored to receive the Trinity Overseas Bursary and the Cambridge Trust Scholarship. In June 2021, I obtained both my Bachelor of Arts and Master of Engineering degrees. On May 18, 2024, I received my Master of Arts degree from the University of Cambridge. On Nov 30, 2024, I received my Ph.D. degree from the University of Cambridge. My Ph.D. research focused on the intersection of vision and language. In collaboration with the Stranks Lab, University of Cambridge, I am also involved in vision technology that has the potential to advance materials science research. On Dec 30, 2024, I joined Huawei Advanced Computing and Storage Lab, where I develop cutting-edge technologies to improve the inference efficiency of AI models. Anyone interested in my team is welcome to send me a resume.

Keywords of my research experience: Inference System Optimization/Acceleration, Multi-modal Retrieval, Question Answering, Diffusion Models, Dialogue Systems, Recommender Systems, Graph Neural Networks, and Multi-robot Path Planning.

This page lists some of my projects and publications to help other researchers get to know me better.

Education

Trinity College, University of Cambridge 2021 - 2024
Ph.D. in Engineering (Supervised by Prof. Bill Byrne)
Visiting Ph.D. in Chemical Engineering and Biotechnology (Supervised by Prof. Samuel Stranks)
Trinity College, University of Cambridge 2017 - 2021
Master of Engineering & Bachelor of Arts in Information Engineering
Dissertation submitted for Master's degree:
- Graph Neural Networks in Multi-Domain Task-Oriented Systems (Distinction Class and Outstanding Project Prize)
University of Hong Kong 2016 - 2017
Bachelor of Engineering (Computer Science)

Work Experience

Research Scientist at Huawei Advanced Computing and Storage Lab (2024-2025)
- Optimizing AI inference systems
- Developing inference acceleration algorithms for LLMs
Group Lead at X-Intelligence Labs Research (2024)
- Developing machine learning algorithms for AI long-term contextual memory
CTO at To0space, Beijing (2023-2024)
- Led the research group of To0space
- Developed cutting-edge AIGC solutions for architectural design
- To0space Website
Intern Applied Scientist II at Amazon Development Center, Cambridge (2022)
- Developing advanced Table Question Answering models
Intern Researcher at Microsoft Software Technology Center, Beijing (2021)
- Developing cutting-edge recommender systems for movie recommendation
- Preparing up-to-date movie recommendation datasets
Remote Researcher at Computer Laboratory, University of Cambridge (2020)
- Proposed a decentralized deep learning framework that utilizes Graph Attention Networks (GATs) to address multi-agent path planning problems.
- The model, trained on simple problem instances, generalizes well to much larger and harder cases (100x in agent number and map size).
Researcher at Computer Laboratory, University of Cambridge (2019)
- Proposed a novel network-based multi-modal feature fusion framework that can be used to predict psychological disorders
- Developed a self-adaptor (fidgeting) detection system and applied it to investigate automated detection of psychological distress
Cloud Engineering Intern at Informetis Europe Ltd. (2018)
- Developed a Python Django-based website that tracks usage data from IoT devices in the power supply monitors the company manufactures. This data helps machine learning engineers develop more accurate algorithms for predicting how energy is used by various electrical appliances.
- Gained skills in database management, involving the use of MySQL, Google Bigtable and Redis Caching.

Publications and Presentations

[Technical Report] Huawei Pangu Team. Openpangu DeepDiver-v2: Multi-agent Learning for Deep Information Seeking. Read
[EMNLP 2025] Wenqi Zhou, Kai Cao, Hao Zheng, Xinyi Zheng, Miao Liu, Per Ola Kristensson, Walterio Mayol-Cuevas, Fan Zhang, Weizhe Lin (Corresponding author), Junxiao Shen. X-LeBench: A Benchmark for Extremely Long Egocentric Video Understanding. Read
[EMNLP 2025] Jingbiao Mei, Jinghong Chen, Guangyu Yang, Weizhe Lin, Bill Byrne. Robust Adaptation of Large Multimodal Models for Retrieval Augmented Hateful Meme Detection. Read
[NeurIPS 2025] Jinghong Chen, Guangyu Yang, Weizhe Lin, Jingbiao Mei, Bill Byrne. On Extending Direct Preference Optimization to Accommodate Ties. on arXiv. Read
[Technical Report] Huawei Pangu Team. Pangu Pro MoE: Mixture of Grouped Experts for Efficient Sparsity. Read
[arXiv] Weizhe Lin, Xing Li, Zhiyuan Yang, Xiaojin Fu, Hui-Ling Zhen, Yaoyuan Wang, Xianzhi Yu, Wulong Liu, Xiaosong Li, Mingxuan Yuan. TrimR: Verifier-based Training-Free Thinking Compression for Efficient Test-Time Scaling. Read
[ICCV 2025] Xinyi Zheng, Steve Zhang, Weizhe Lin (Corresponding author), Aaron Zhang, Walterio W Mayol-Cuevas, Junxiao Shen. CULTURE3D: Cultural Landmarks and Terrain Dataset for 3D Applications. Read
[arXiv] Zihong He, Weizhe Lin (Corresponding author), Hao Zheng, Fan Zhang, Matt Jones, Laurence Aitchison, Xuhai Xu, Miao Liu, Per Ola Kristensson, Junxiao Shen. Human-inspired Perspectives: A Survey on AI Long-term Memory. Read
[PhD Thesis] Weizhe Lin. Augmenting Multi-modal Question Answering Systems with Retrieval Methods. Apollo - University of Cambridge Repository. Read
[ACL 2024] Weizhe Lin, Jingbiao Mei, Jinghong Chen, Bill Byrne. PreFLMR: Scaling Up Fine-Grained Late-Interaction Multi-modal Retrievers. 2024. To appear at Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (ACL). Read
[ACL 2024] Jingbiao Mei, Jinghong Chen, Weizhe Lin, Bill Byrne, Marcus Tomalin. Improving hateful memes detection via learning hatefulness-aware embedding space through retrieval-guided contrastive learning. To appear at Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (ACL). Read
[NAACL 2024] Jinghong Chen, Weizhe Lin, Bill Byrne. CONTROL-DAG: Efficient Controlled Decoding for Directed Acyclic Non-Autoregressive Text Generation. 2024. Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. Read
[NAACL 2024] Guangyu Yang, Jinghong Chen, Weizhe Lin, Bill Byrne. Direct Preference Optimization for Neural Machine Translation with Minimum Bayes Risk Decoding. 2024. Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. Read
[NeurIPS 2023] Weizhe Lin, Jinghong Chen, Jingbiao Mei, Alexandru Coca, Bill Byrne. Finer-grained Late-interaction Multimodal Retrieval for Knowledge-based Visual Question Answering. 2023. Proceedings of Thirty-seventh Conference on Neural Information Processing Systems (NeurIPS). Read
[Nature Machine Intelligence] Kangyu Ji, Weizhe Lin, Yuqi Sun, Linsong Cui, Javad Shamsi, Yu-Hsien Chiang, Jiawei Chen, Elizabeth Tennyson, Linjie Dai, Qingbiao Li, Kyle Frohna, Miguel Anaya, Neil Greenham, Sam Stranks. Self-supervised deep learning for tracking degradation of perovskite LEDs with multispectral imaging. Nature Machine Intelligence. Read
[ACL 2023] Weizhe Lin, Rexhina Blloshmi, Bill Byrne, Adria de Gispert and Gonzalo Iglesias. An Inner Table Retriever for Robust Table Question Answering. 2023. Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (ACL). Read
[ACL 2023] Weizhe Lin, Rexhina Blloshmi, Bill Byrne, Adria de Gispert and Gonzalo Iglesias. LI-RAGE: Late Interaction Retrieval Augmented Generation with Explicit Signals for Open-Domain Table Question Answering. 2023. Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (ACL). Read
[SIGDIAL 2023 Best Long Paper] Alexandru Coca, Bo-Hsiang Tseng, Jinghong Chen, Weizhe Lin, Weixuan Zhang, Tisha Anders and Bill Byrne. Grounding Description-Driven Dialogue State Trackers with Knowledge-Seeking Turns. 2023. 24th Meeting of the Special Interest Group on Discourse and Dialogue (SIGDIAL). Read
[EACL 2023 Findings] Weizhe Lin, Zhilin Wang, and Bill Byrne. FVQA 2.0: Introducing Adversarial Samples for Fact-based Visual Question Answering. 2023. Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Findings (EACL). Read
[EACL 2023 Findings] Alexandru Coca, Bo-Hsiang Tseng, Weizhe Lin, Bill Byrne. More Robust Schema-Guided Dialogue State Tracking via Tree-Based Paraphrase Ranking. 2023. Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Findings (EACL). Read
[EMNLP 2022] Weizhe Lin, Bill Byrne. Retrieval Augmented Visual Question Answering with Outside Knowledge. 2022. In Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing (EMNLP). Read
[KaRS 2022] Weizhe Lin, Linjun Shou, Ming Gong, Jian Pei, Zhilin Wang, Bill Byrne, Daxin Jiang. Transformer-Empowered Content-Aware Collaborative Filtering. 2022. In Proceedings of the Fourth Knowledge-aware and Conversational Recommender Systems Workshop co-located with 16th ACM Conference on Recommender Systems (RecSys 2022). Read
[KaRS 2022] Weizhe Lin, Linjun Shou, Ming Gong, Jian Pei, Zhilin Wang, Bill Byrne, Daxin Jiang. Combining Unstructured Content and Knowledge Graphs into Recommendation Datasets (short paper). 2022. In Proceedings of the Fourth Knowledge-aware and Conversational Recommender Systems Workshop co-located with 16th ACM Conference on Recommender Systems (RecSys 2022). Read
[EMNLP 2021] Weizhe Lin, Bo-Hsiang Tseng and Bill Byrne. Knowledge-Aware Graph-Enhanced GPT-2 for Dialogue State Tracking. 2021. In Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing (EMNLP). Read
[IEEE Transactions on Affective Computing] Lin, W., Orton, I., Li, Q., Pavarini, G., & Mahmoud, M. (2021). Looking At The Body: Automatic Analysis of Body Gestures and Self-Adaptors in Psychological Distress. In IEEE Transactions on Affective Computing. Read
[NAACL 2021 Workshop WNU] Zhilin Wang, Weizhe Lin and Xiaodong Wu. Learning similarity between movie characters and its potential implications on understanding human experiences. 2021. In Proceedings of the 2021 NAACL Workshop WNU: 3rd Workshop on Narrative Understanding. Read
[IEEE Robotics and Automation Letters] Qingbiao Li, Weizhe Lin (*equal contribution), Zhe Liu and Amanda Prorok. Message-Aware Graph Attention Networks for Large Scale Multi-Robot Path Planning. 2020. IEEE Robotics and Automation Letters. Read
[FG 2020 Oral] Weizhe Lin, Indigo Orton, Mingyu Liu, Marwa Mahmoud. Automatic Detection of Self-Adaptors for Psychological Distress. 2020. In Proceedings of 2020 15th IEEE International Conference on Automatic Face & Gesture Recognition (FG 2020). Read
[FG 2020 Oral] Ziheng Zhang, Weizhe Lin, Mingyu Liu, Marwa Mahmoud. Multimodal Deep Learning Framework for Mental Disorder Recognition. 2020. In Proceedings of 2020 15th IEEE International Conference on Automatic Face & Gesture Recognition (FG 2020). Read
Xiaodong Wu, Weizhe Lin (*equal contribution), Zhilin Wang and Elena Rastorgueva. Author2Vec: A Novel Framework for Generating User Embedding. 2019. on arXiv.
[EMNLP 2019 Workshop W-NUT] Zhilin Wang, Elena Rastorgueva, Weizhe Lin and Xiaodong Wu. No, you’re not alone: A better way to find people with similar experiences on Reddit. 2019. In Proceedings of the 2019 EMNLP Workshop W-NUT: The 5th Workshop on Noisy User-generated Text. Read
Zhilin Wang, Xiaodong Wu, Weizhe Lin and Elena Rastorgueva. Detecting personal attributes through analyzing online forums. 2019. In Cambridge Language Sciences Early Careers Researchers Symposium. Read

Review For Venues

Major NLP venues (EMNLP, ACL, EACL, NAACL, COLING, etc.)
Transactions on Pattern Analysis and Machine Intelligence
The First Workshop on Interactive Technologies for AI in Healthcare: Diagnosis, Management, and Assistance
The First Workshop on Multimodal Data for Mental Disorder Recognition

Society

Director of Web Development of Cambridge Hercules Link
Data Analysis Mentor of Bridge for Enterprise Link
Member of Computer Vision Team in Robotics Society (Cambridge)

Projects

Supervised by Prof. Bill Byrne.
Developing multi-modal retrieval-augmented systems for knowledge-based visual question answering and table question answering.

Knowledge-aware multi-domain task-oriented dialogue systems (final year dissertation)

Final year project supervised by Prof. Bill Byrne (Head of Information Engineering).
Utilising neural forms of graph networks in dialogue systems

Multi-robot path planning

Supervised by Amanda Prorok.
Imitation learning with Graph Neural Networks that let agents communicate with each other.
Using graph attention networks to improve how well the agents move to their goals.

COVID-19 diagnosis assistance and CT denoising (AIXCOVNET Project Support Member)

Working with the Stranks Lab of Cavendish Laboratory, NHS (Addenbrooke’s Hospital), Department of Radiology. Supervised by Sam Stranks.
Performing CT denoising on datasets of COVID-19 and other common lung diseases.
Low-dose high-speed CT screening

Image reconstruction for hyperspectral microscopy using deep learning

Working with the Stranks Lab of Cavendish Laboratory, in collaboration with the VISION Laboratory of the Department of Physics. Supervised by Sam Stranks.
Using machine-learning-based methods to denoise and reconstruct physics-informed images obtained by specialized microscopy.
Greatly reduced the laser exposure time required to take images for physics/materials research.

Automatic fidgeting and self-adaptor detection for psychological distress from 2D videos Code

Developed a fully automated system to detect fidgeting behaviour (such as touching the face with a hand and rhythmic body motion). Supervised by Dr. Marwa Mahmoud.
Used Gaussian Mixture Models and Fisher Vectors for feature fusion and dimension reduction.
Performed classification based on multi-modal features extracted from interview videos.

LearnAh.uk

Worked as frontend and backend designer and database manager
Helps teachers make science fun to learn by recommending relevant popular science videos using machine-learning-based text analysis (Latent Semantic Indexing & Latent Dirichlet Allocation)
Deployed on a Python Django-based website and accessible to third parties through a REST API
UCL Institute of Education Knowledge Lab EDUCATE Graduate (with EU grant)
Y Combinator Startup School Graduate
Runner-up, Cambridge University Entrepreneurs competition (Social Enterprise)

Mars Lander Design and Programming Contest: Received a prize from Airbus Defense and Space Code

I came up with some good solutions to many of the extension exercises, including the effects of planetary rotation and wind. My autopilot could handle injection into arbitrary orbits with pre-specified apogees and perigees. I also performed some preliminary investigations into optimal control, tweaking the autopilot to minimize peak acceleration or descent time, and to deal with moderate levels of engine lag and delay. I was then invited to tour around the Airbus Stevenage Base.

EasyEye (HackCambridge2019) Code

We implemented an eye monitoring system with OpenCV to detect signs of eye fatigue (blinking frequency/squinting). The program runs on a laptop and uses the built-in camera to track and analyze the user's eyes. It uploads the data to an online database, where it is evaluated by a server. The server then creates alerts and sends them to the user's Fitbit watch, which warns the user of eye fatigue by vibration and sound.

Robotics Society Computer Vision Subteam Task: RMRC Motion Detection Code

Using OpenCV and Python, I developed a custom algorithm to detect the motion of several black objects moving on a white board and count them. Currently, the algorithm combines the results of hue detection and frame difference detection to generate a reasonable output.

Arm Hackathon Machine Learning Project: Wearable Mbed Code

Our Mbed board works as a wearable device that collects the user's acceleration data and applies a machine learning model to tell which state the user is currently in (idle, walking, running, typing on laptop…). The collected data can be recorded and used to provide an overview of the user’s daily activity. Users can see their daily pattern and notice how it changes over time. For example, steadily decreasing sleep time would warn them to adjust their habits at night.

WIFI Guard Code

Patent Number: 2016SR049201 (CHINA)
An Android application that monitors data usage and thereby prevents background data traffic abuse
3rd in Computer Science (category) of the 15th Awards Program for Future Scientists (National)

Virus Protection Assistant Code

10 modules, more than 100,000 lines of code
A complete anti-virus program that detects viruses based on “behavior analysis”
Gold Medal in 30th Fujian Adolescents Science Technology & Innovation Contest, China

Other Awards

Airbus Defense and Space Prize 2018 for Mars Lander Design and Programming Contest (Runner Up)
Centenary Prize for Top 3 Information, Electrical and Electronics Engineering Final Project
Chinese Physics Olympiad (Fujian Division) - Second Award
Hong Kong Pan Pearl River Delta Physics Olympics - Third Award
Fujian Adolescents Science & Technology Innovation Contest - Gold Medal (30th), Third Award (29th)
The 15th Awards Program for Future Scientists - Silver Medal

Weizhe Lin