Research Interests
Data Validation, Large Language Models, Graph Analytics, Compound AI Systems
Research Experience
- Feb. 2022 - May 2023: Machine Learning Research Intern, LinkedIn, Beijing, China.
- Professional Network Matters: Connections Empower Person-Job Fit
- Performed data collection, cleaning, and analysis with Spark and SQL on LinkedIn’s internal database.
- Reviewed the literature on Person-Job Fit and heterogeneous graph neural networks.
- Designed and implemented a heterogeneous graph neural network model for Person-Job Fit.
- The corresponding paper has been accepted by WSDM 2024.
- A Hierarchical Framework with Multitask Co-Pretraining on Semi-Structured Data towards Effective Person-Job Fit
- Participated in experiment design and implementation.
- The corresponding paper has been accepted by ICASSP 2024.
- Skill dependency graph in the LinkedIn scenario.
- Built a skill dependency graph using the commonsense knowledge of LLMs.
- June 2023 - Jan. 2024: Research Intern, Microsoft Research Asia, Beijing, China.
- Question Answering on Heterogeneous Information Networks
- Built a semantic parsing dataset to evaluate LLMs’ ability to run graph algorithms.
- The corresponding paper has been accepted by WWW 2025 as a short paper.
Publications
- Hao Chen, Lun Du, Yuxuan Lu, Qiang Fu, Xu Chen, Shi Han, Yanbin Kang, Guangming Lu, Zi Li, Professional Network Matters: Connections Empower Person-Job Fit (WSDM 2024)
- Yihan Cao, Xu Chen, Lun Du, Hao Chen, Qiang Fu, Shi Han, Yushu Du, Yanbin Kang, Guangming Lu, Zi Li, TAROT: A Hierarchical Framework with Multitask Co-Pretraining on Semi-Structured Data towards Effective Person-Job Fit (ICASSP 2024)
- Stefan Grafberger, Hao Chen, Olga Ovcharenko, Sebastian Schelter, Towards Regaining Control over Messy Machine Learning Pipelines (DAIS Workshop @ ICDE 2025)
- Hao Chen, Lun Du, Xu Chen, Xiaojun Ma, Jiang Zhang, LLM-powered Heterogeneous Information Network Analytics (WWW 2025, short paper)
- Hao Chen, Sebastian Schelter, Towards Automated Task-Aware Data Validation (DEEM Workshop @ SIGMOD 2025)
- Pierre Lubitzsch, Olga Ovcharenko, Hao Chen, Maarten de Rijke, Sebastian Schelter, Towards a Real-World Aligned Benchmark for Unlearning in Recommender Systems (FAccTRec Workshop @ RecSys 2025)
- Yuchen Tian, Kaixin Li, Hao Chen, Ziyang Luo, Hongzhan Lin, Sebastian Schelter, Lun Du, Jing Ma, AmbiGraph-Eval: Can LLMs Effectively Handle Ambiguous Graph Queries? (arXiv 2025)
Open Source Projects
- Maintainer of TADV
- A framework for task-aware data validation that leverages language models to generate data validation rules.
- Maintainer of DescKGC
- A Python package for knowledge graph completion that highlights the importance of entity descriptions.
- Made small contributions to open-source projects such as LangChain and SuperAGI.
Education
- Ph.D. in Computer Science, BIFOLD & TU Berlin, 2024-present
- M.S. in System Science, Beijing Normal University, 2021-2024
- B.S. in Applied Physics, Beijing University of Posts and Telecommunications, 2017-2021