Hao Zhang

Research Scientist at ByteDance

Data+AI systems

zhanghao.ai@bytedance.com

I am an AI-native data systems researcher and builder focused on agentic workflows, retrieval, and accelerator-native execution. At ByteDance, I develop data infrastructure and orchestration frameworks for LLM-based agents. More broadly, my research spans agentic systems, vector and graph databases, and hardware-accelerated query processing for SQL and other data-intensive workloads, with an emphasis on building efficient foundations for modern AI applications.

I received my Ph.D. from the Chinese University of Hong Kong in 2022 under the supervision of Prof. Jeffrey Xu Yu and Prof. Hong Cheng. Before that, I earned my B.S. in Computer Science from the Hongyi Honor School at Wuhan University in 2017.

I have authored and co-authored 20+ papers in top-tier database venues including SIGMOD, VLDB, ICDE, and TKDE. Beyond publications, I build systems: I have designed and architected multiple industry and research platforms, and my graph database work has achieved world-record benchmark results multiple times.

Collaboration and Internships

  • I welcome collaboration, including remote collaboration, on technically ambitious systems challenges in database infrastructure, retrieval, vector search, graph systems, and multimodal analytics.
  • We are seeking strong interns in Shenzhen who are excited about research and development in Data+AI systems. If this fits your interests and background, please apply by email and prefix the subject line with [Intern].

Current Research

Prior Research

  1. Distributed Query Engine: [Secco, SIGMOD’22], [DISC, VLDB’20], [Crystal, VLDB’18]
  2. Graph Algorithms and GNN: [Graph Classification Attack, DASFAA’25], [Time-Dependent Reachability, ICDE’24], [Streaming Embedding, ICDE’24], [Triangle Listing, BigData’16], [Triangle Listing, DPD’17], [Local Clustering Coefficient, DASFAA’17]
  3. AI4DB: [ALSS, SIGMOD’21], [Learned Sketch for Subgraph Counting, VLDBJ’23], [NNGP-Card, SIGMOD’22], [Learned Multiway Join, ICDE’21]

Highlights

  • Authored and co-authored 20+ papers in top-tier database venues, with recent work spanning vector search, tensor-centric execution, graph systems, and Data+AI infrastructure.
  • Designed and architected systems such as SEMA, TDB, GES, SeccoSQL, DISC, and Crystal; the projects page summarizes these systems from research prototypes to production infrastructure.
  • Achieved LDBC SNB Interactive world-record results in both the declarative track (2024) and the imperative track (2025), two of the strongest public signals for graph database performance.

News

11/2025 Set another world record for a graph database workload, this time on a Chinese-made chip (work done during my time at Huawei), ranking #1 in the LDBC SNB benchmark, Imperative Track, the most authoritative ranking for graph databases. Congrats to all my former team members and collaborators at Huawei and SJTU!
10/2025 One paper accepted by SIGMOD’26. Congrats to Junchao Ma and Prof. Yuanyuan Zhu!
09/2025 One paper accepted by VLDB’26. Congrats to Ziqi Zhou and Prof. Zhiwei Zhang!
07/2025 One paper accepted by SIGMOD’26. Congrats to Haitao Zhang and Prof. Yuanyuan Zhu!
05/2025 One paper accepted by VLDB’25. Congrats to Chiyu Hao and Prof. Shixuan Sun!