Hao Zhang
Research Scientist at ByteDance
I am a Research Scientist at ByteDance, where I develop data infrastructure and orchestration frameworks for LLM-based agents. More broadly, my research spans agentic systems, general-purpose data systems such as vector and graph databases, and hardware-accelerated query processing for SQL and other data-intensive workloads, with an emphasis on efficient foundations for modern AI applications.
I received my Ph.D. from the Chinese University of Hong Kong in 2022 under the supervision of Prof. Jeffrey Xu Yu and Prof. Hong Cheng. Before that, I earned my B.S. in Computer Science from the Hongyi Honor School at Wuhan University in 2017.
I have authored or co-authored 20+ papers in top-tier database venues including SIGMOD, VLDB, ICDE, and TKDE. Beyond publications, I build systems: I have designed and architected multiple industry and research platforms, and my graph database work has achieved world-record benchmark results multiple times.
Current Research
- Agentic systems for LLM orchestration, including [SEMA, arXiv’26].
- Vector Database + Graph Database for agents, including [SQLVec, ICDE’26], [RapidStore, VLDB’25], [GES, SIGMOD’25], [Graph Storage Benchmark, SIGMOD’25], and [Aquila, VLDB’26].
- Hardware-Accelerated Query Engine for structured data (e.g., SQL) and unstructured data (e.g., graphs), including [TQEX (SQL), SIGMOD’26], [TenGraph, VLDB’24], and [TGraph, SIGMOD’25].
Prior Research
- Distributed Query Engine: [Secco, SIGMOD’22], [DISC, VLDB’20], [Crystal, VLDB’18]
- Graph Algorithms and GNN: [Graph Classification Attack, DASFAA’25], [Time-Dependent Reachability, ICDE’24], [Streaming Embedding, ICDE’24], [Triangle Listing, BigData’16], [Triangle Listing, DPD’17], [Local Clustering Coefficient, DASFAA’17]
- AI4DB: [ALSS, SIGMOD’21], [Learned Sketch for Subgraph Counting, VLDBJ’23], [NNGP-Card, SIGMOD’22], [Learned Multiway Join, ICDE’21]
Highlights
- Authored or co-authored 20+ papers in top-tier database venues, with recent work spanning vector search, tensor-centric execution, graph systems, and Data+AI infrastructure.
- Designed and architected systems such as SEMA, TDB, GES, SeccoSQL, DISC, and Crystal; the projects page summarizes these systems from research prototypes to production infrastructure.
- Achieved LDBC SNB Interactive world-record results in both the declarative track (2024) and the imperative track (2025), two of the strongest public signals for graph database performance.
Collaboration and Internships
- I welcome collaboration on technically ambitious systems problems in database infrastructure, retrieval, vector search, graph systems, and multimodal analytics. Remote collaboration is also welcome.
- We are looking for strong interns in Shenzhen, Beijing, and Shanghai who are excited about Data+AI systems research and development. If that fits your profile, you can apply through this.
News
| Date | News |
|---|---|
| 11/2025 | Set a new world record for a graph database workload again, this time on a Chinese-made chip (during my time at Huawei), ranking #1 in the LDBC SNB benchmark, imperative track, the most authoritative ranking for graph databases. Congrats to all my former team members and collaborators at Huawei and SJTU! |
| 10/2025 | One paper accepted by SIGMOD’26. Congrats to Junchao Ma and Prof. Yuanyuan Zhu! |
| 09/2025 | One paper accepted by VLDB’26. Congrats to Ziqi Zhou and Prof. Zhiwei Zhang! |
| 07/2025 | One paper accepted by SIGMOD’26. Congrats to Haitao Zhang and Prof. Yuanyuan Zhu! |
| 05/2025 | One paper accepted by VLDB’25. Congrats to Chiyu Hao and Prof. Shixuan Sun! |
Recent Publications
^ indicates first author or corresponding author.