Hao Zhang

Research Scientist at ByteDance

Data+AI systems

zhanghao.ai@bytedance.com

I am a Research Scientist at ByteDance, where I develop data infrastructure and orchestration frameworks for LLM-based agents. More broadly, my research spans agentic systems, general-purpose data systems such as vector and graph databases, and hardware-accelerated query processing for SQL and other data-intensive workloads, with an emphasis on efficient foundations for modern AI applications.

I received my Ph.D. from the Chinese University of Hong Kong in 2022 under the supervision of Prof. Jeffrey Xu Yu and Prof. Hong Cheng. Before that, I earned my B.S. in Computer Science from the Hongyi Honor School at Wuhan University in 2017.

I have authored or co-authored 20+ papers in top-tier database venues including SIGMOD, VLDB, ICDE, and TKDE. Beyond publications, I build systems: I have designed and architected multiple industry and research platforms, and my graph database work has achieved world-record benchmark results multiple times.

Current Research

Prior Research

  1. Distributed Query Engine: [Secco, SIGMOD’22], [DISC, VLDB’20], [Crystal, VLDB’18]
  2. Graph Algorithms and GNN: [Graph Classification Attack, DASFAA’25], [Time-Dependent Reachability, ICDE’24], [Streaming Embedding, ICDE’24], [Triangle Listing, BigData’16], [Triangle Listing, DPD’17], [Local Clustering Coefficient, DASFAA’17]
  3. AI4DB: [ALSS, SIGMOD’21], [Learned Sketch for Subgraph Counting, VLDBJ’23], [NNGP-Card, SIGMOD’22], [Learned Multiway Join, ICDE’21]

Highlights

  • Authored or co-authored 20+ papers in top-tier database venues, with recent work spanning vector search, tensor-centric execution, graph systems, and Data+AI infrastructure.
  • Designed and architected systems such as SEMA, TDB, GES, SeccoSQL, DISC, and Crystal; the projects page summarizes these systems from research prototypes to production infrastructure.
  • Achieved LDBC SNB Interactive world-record results in both the declarative track (2024) and the imperative track (2025), two of the strongest public signals for graph database performance.

Collaboration and Internships

  • I welcome collaboration on technically ambitious systems problems in database infrastructure, retrieval, vector search, graph systems, and multimodal analytics. Remote collaboration is also welcome.
  • We are looking for strong interns in Shenzhen, Beijing, and Shanghai who are excited about Data+AI systems research and development. If that fits your profile, you can apply through this.

News

11/2025 Set a new world record for graph database workloads again, this time on a Chinese-made chip (during my time at Huawei), ranking #1 on the LDBC SNB benchmark (Imperative Track), the most authoritative ranking for graph databases. Congrats to all my former team members and collaborators at Huawei and SJTU!
10/2025 One paper accepted by SIGMOD’26. Congrats to Junchao Ma and Prof. Yuanyuan Zhu!
09/2025 One paper accepted by VLDB’26. Congrats to Ziqi Zhou and Prof. Zhiwei Zhang!
07/2025 One paper accepted by SIGMOD’26. Congrats to Haitao Zhang and Prof. Yuanyuan Zhu!
05/2025 One paper accepted by VLDB’25. Congrats to Chiyu Hao and Prof. Shixuan Sun!

Recent Publications

^ indicates first author or corresponding author.

  1. ICDE
    SQLVec: SQL-Based Vector Similarity Search
    Zequn Zhang, Yuanyuan Zhu, Hao Zhang, and Jeffrey Xu Yu
    In 42nd IEEE International Conference on Data Engineering (ICDE), 2026
    (To appear)
  2. SIGMOD
    Accelerating Triangle-Connected Truss Community Search Across Heterogeneous Hardware
    Junchao Ma, Xin Yan, Yuanyuan Zhu, Guojing Li, Hao Zhang, and Jeffrey Xu Yu
    In ACM SIGMOD/PODS International Conference on Management of Data (SIGMOD), 2026
    (To appear)
  3. SIGMOD
    TQEX(SQL): Tensor-based Query Engine Enhanced by Bridging the Gap
    Haitao Zhang, Ran Pang, Yuanyuan Zhu, Hao Zhang^, Congli Gao, Ming Zhong, Jiawei Jiang, Tieyun Qian, and Jeffrey Xu Yu
    In ACM SIGMOD/PODS International Conference on Management of Data (SIGMOD), 2026
  4. VLDB
    Aquila: A High-Concurrency System for Incremental Graph Query
    Ziqi Zhou, Hao Zhang^, Jiaxin Yao, Kangfei Zhao, Zhiwei Zhang, Sen Gao, Jingpeng Hao, Ye Yuan, and Guoren Wang
    In 52nd International Conference on Very Large Data Bases (VLDB), 2026
    (To appear)
  5. arXiv
    SEMA: A Unified Agentic System for Multimodal Data Workflows
    2026
    ArXiv preprint
  6. DASFAA
    Breaking Free from Label Limitations: A Novel Unsupervised Attack Method for Graph Classification
    Yadong Wang, Zhiwei Zhang, Pengpeng Qiao, Ye Yuan, Hao Zhang, and Guoren Wang
    In 30th International Conference on Database Systems for Advanced Applications (DASFAA), 2025
  7. SIGMOD
    GES: High-Performance Graph Processing Engine and Service in Huawei
    Sen Gao, Jianwen Zhao, Hao Zhang^, Shixuan Sun, Chen Liang, Gongye Chen, et al.
    In ACM SIGMOD/PODS International Conference on Management of Data (SIGMOD), 2025
  8. SIGMOD
    TGraph: A Tensor-centric Graph Processing Framework
    Yongliang Zhang, Yuanyuan Zhu, Hao Zhang^, Congli Gao, Yuyang Wang, et al.
    In ACM SIGMOD/PODS International Conference on Management of Data (SIGMOD), 2025
  9. SIGMOD
    Revisiting the Design of In-Memory Dynamic Graph Storage
    Jixian Su, Chiyu Hao, Shixuan Sun, Hao Zhang, Sen Gao, et al.
    In ACM SIGMOD/PODS International Conference on Management of Data (SIGMOD), 2025
  10. VLDB
    RapidStore: An Efficient Dynamic Graph Storage System for Concurrent Queries
    Chiyu Hao, Jixian Su, Shixuan Sun, Hao Zhang, Sen Gao, Jianwen Zhao, Chenyi Zhang, et al.
    In 51st International Conference on Very Large Data Bases (VLDB), 2025
  11. ICDE
    Label Constrained Reachability Queries on Time Dependent Graphs
    Yishu Wang, Jinlong Chu, Ye Yuan, Yu Gu, Hangxu Ji, and Hao Zhang
    In 40th IEEE International Conference on Data Engineering (ICDE), 2024
  12. ICDE
    Attributed Network Embedding in Streaming Style
    Anbiao Wu, Ye Yuan, Changsheng Li, Yuliang Ma, and Hao Zhang
    In 40th IEEE International Conference on Data Engineering (ICDE), 2024
  13. VLDB
    TenGraph: A Tensor-Based Graph Query Engine
    Guanghua Li, Hao Zhang, Xibo Sun, Qiong Luo, and Yuanyuan Zhu
    In 50th International Conference on Very Large Data Bases (VLDB), 2024