publications | Siqiao Xue

2024

VLDB
Demonstration of DB-GPT: Next Generation Data Interaction System Empowered by Large Language Models

Siqiao Xue, Danrui Qi, Caigao Jiang, Wenhui Shi, Fangyin Cheng, and 12 more authors

In VLDB, 2024

The recent breakthroughs in large language models (LLMs) are positioned to transition many areas of software. The technologies of interacting with data particularly have an important entanglement with LLMs as efficient and intuitive data interactions are paramount. In this paper, we present DB-GPT, a revolutionary and product-ready Python library that integrates LLMs into traditional data interaction tasks to enhance user experience and accessibility. DB-GPT is designed to understand data interaction tasks described by natural language and provide context-aware responses powered by LLMs, making it an indispensable tool for users ranging from novice to expert. Its system design supports deployment across local, distributed, and cloud environments. Beyond handling basic data interaction tasks like Text-to-SQL with LLMs, it can handle complex tasks like generative data analysis through a Multi-Agents framework and the Agentic Workflow Expression Language (AWEL). The Service-oriented Multi-model Management Framework (SMMF) ensures data privacy and security, enabling users to employ DB-GPT with private LLMs. Additionally, DB-GPT offers a series of product-ready features designed to enable users to integrate DB-GPT within their product environments easily. The code of DB-GPT is available on GitHub (this https URL), which already has over 10.7k stars. Please install DB-GPT for your own usage with the instructions (this https URL) and watch a 5-minute introduction video on YouTube (this https URL) to further investigate DB-GPT.
@inproceedings{xue2024dbgptdemo,
  title = {Demonstration of DB-GPT: Next Generation Data Interaction System Empowered by Large Language Models},
  author = {Xue, Siqiao and Qi, Danrui and Jiang, Caigao and Shi, Wenhui and Cheng, Fangyin and Chen, Keting and Yang, Hongjun and Zhang, Zhiping and He, Jianshan and Zhang, Hongyang and Wei, Ganglin and Zhao, Wang and Zhou, Fan and Yi, Hong and Liu, Shaodong and Yang, Hongjun and Chen, Faqiang},
  year = {2024},
  booktitle = {VLDB}
}
ACL
CoCA: Fusing Position Embedding with Collinear Constrained Attention in Transformers for Long Context Window Extending

Shiyi Zhu, Jing Ye, Wei Jiang, Siqiao Xue, Qi Zhang, and 2 more authors

In ACL, 2024

Self-attention and position embedding are two key modules in transformer-based Large Language Models (LLMs). However, the potential relationship between them is far from well studied, especially for long context window extending. In fact, anomalous behaviors harming long context extrapolation exist between Rotary Position Embedding (RoPE) and vanilla self-attention unveiled by our work. To address this issue, we propose a novel attention mechanism, CoCA (Collinear Constrained Attention). Specifically, we enforce a collinear constraint between Q and K to seamlessly integrate RoPE and self-attention. While only adding minimal computational and spatial complexity, this integration significantly enhances long context window extrapolation ability. We provide an optimized implementation, making it a drop-in replacement for any existing transformer-based models. Extensive experiments show that CoCA performs extraordinarily well in extending context windows. A CoCA-based GPT model, trained with a context length of 512, can seamlessly extend the context window up to 32K (60×), without any fine-tuning. Additionally, by dropping CoCA in LLaMA-7B, we achieve extrapolation up to 32K within only 2K training length. Our code is publicly available at: this https URL.
@inproceedings{zhu2024coca,
  title = {CoCA: Fusing Position Embedding with Collinear Constrained Attention in Transformers for Long Context Window Extending},
  author = {Zhu, Shiyi and Ye, Jing and Jiang, Wei and Xue, Siqiao and Zhang, Qi and Wu, Yifan and Li, Jianguo},
  year = {2024},
  booktitle = {ACL}
}
arXiv
Sora Detector: A Unified Hallucination Detection for Large Text-to-Video Models

Zhixuan Chu, Lei Zhang, Yichen Sun, Siqiao Xue, Zhibo Wang, and 2 more authors

arXiv preprint, 2024

The rapid advancement in text-to-video (T2V) generative models has enabled the synthesis of high-fidelity video content guided by textual descriptions. Despite this significant progress, these models are often susceptible to hallucination, generating contents that contradict the input text, which poses a challenge to their reliability and practical deployment. To address this critical issue, we introduce the SoraDetector, a novel unified framework designed to detect hallucinations across diverse large T2V models, including the cutting-edge Sora model. Our framework is built upon a comprehensive analysis of hallucination phenomena, categorizing them based on their manifestation in the video content. Leveraging the state-of-the-art keyframe extraction techniques and multimodal large language models, SoraDetector first evaluates the consistency between extracted video content summary and textual prompts, then constructs static and dynamic knowledge graphs (KGs) from frames to detect hallucination both in single frames and across frames. Sora Detector provides a robust and quantifiable measure of consistency, static and dynamic hallucination. In addition, we have developed the Sora Detector Agent to automate the hallucination detection process and generate a complete video quality report for each input video. Lastly, we present a novel meta-evaluation benchmark, T2VHaluBench, meticulously crafted to facilitate the evaluation of advancements in T2V hallucination detection. Through extensive experiments on videos generated by Sora and other large T2V models, we demonstrate the efficacy of our approach in accurately detecting hallucinations. The code and dataset can be accessed via GitHub.
@article{chu2024sora,
  journal = {arXiv preprint},
  title = {Sora Detector: A Unified Hallucination Detection for Large Text-to-Video Models},
  author = {Chu, Zhixuan and Zhang, Lei and Sun, Yichen and Xue, Siqiao and Wang, Zhibo and Qin, Zhan and Ren, Kui},
  year = {2024}
}
arXiv
DB-GPT: Empowering Database Interactions with Private Large Language Models

Siqiao Xue, Caigao Jiang, Wenhui Shi, Fangyin Cheng, Keting Chen, and 11 more authors

arXiv preprint, 2024

The recent breakthroughs in large language models (LLMs) are positioned to transition many areas of software. Database technologies particularly have an important entanglement with LLMs as efficient and intuitive database interactions are paramount. In this paper, we present DB-GPT, a revolutionary and production-ready project that integrates LLMs with traditional database systems to enhance user experience and accessibility. DB-GPT is designed to understand natural language queries, provide context-aware responses, and generate complex SQL queries with high accuracy, making it an indispensable tool for users ranging from novice to expert. The core innovation in DB-GPT lies in its private LLM technology, which is fine-tuned on domain-specific corpora to maintain user privacy and ensure data security while offering the benefits of state-of-the-art LLMs. We detail the architecture of DB-GPT, which includes a novel retrieval augmented generation (RAG) knowledge system, an adaptive learning mechanism to continuously improve performance based on user feedback and a service-oriented multi-model framework (SMMF) with powerful data-driven agents. Our extensive experiments and user studies confirm that DB-GPT represents a paradigm shift in database interactions, offering a more natural, efficient, and secure way to engage with data repositories. The paper concludes with a discussion of the implications of DB-GPT framework on the future of human-database interaction and outlines potential avenues for further enhancements and applications in the field. The project code is available at this https URL. Experience DB-GPT for yourself by installing it with the instructions this https URL and view a concise 10-minute video at this https URL.
@article{xue2024dbgpt,
  journal = {arXiv preprint},
  title = {DB-GPT: Empowering Database Interactions with Private Large Language Models},
  author = {Xue, Siqiao and Jiang, Caigao and Shi, Wenhui and Cheng, Fangyin and Chen, Keting and Yang, Hongjun and Zhang, Zhiping and He, Jianshan and Zhang, Hongyang and Wei, Ganglin and Zhao, Wang and Zhou, Fan and Qi, Danrui and Yi, Hong and Liu, Shaodong and Chen, Faqiang},
  year = {2024}
}
AAAI
Enhancing Recommender Systems with Large Language Model Reasoning Graphs

Yan Wang, Zhixuan Chu, Xin Ouyang, Simeng Wang, Hongyan Hao, and 6 more authors

In AAAI, 2024

Recommendation systems aim to provide users with relevant suggestions, but often lack interpretability and fail to capture higher-level semantic relationships between user behaviors and profiles. In this paper, we propose a novel approach that leverages large language models (LLMs) to construct personalized reasoning graphs. These graphs link a user’s profile and behavioral sequences through causal and logical inferences, representing the user’s interests in an interpretable way. Our approach, LLM reasoning graphs (LLMRG), has four components: chained graph reasoning, divergent extension, self-verification and scoring, and knowledge base self-improvement. The resulting reasoning graph is encoded using graph neural networks, which serves as additional input to improve conventional recommender systems, without requiring extra user or item information. Our approach demonstrates how LLMs can enable more logical and interpretable recommender systems through personalized reasoning graphs. LLMRG allows recommendations to benefit from both engineered recommendation systems and LLM-derived reasoning graphs. We demonstrate the effectiveness of LLMRG on benchmarks and real-world scenarios in enhancing base recommendation models.
@inproceedings{wang2023enhancing,
  title = {Enhancing Recommender Systems with Large Language Model Reasoning Graphs},
  author = {Wang, Yan and Chu, Zhixuan and Ouyang, Xin and Wang, Simeng and Hao, Hongyan and Shen, Yue and Gu, Jinjie and Xue, Siqiao and Zhang, James Y and Cui, Qing and others},
  booktitle = {AAAI},
  year = {2024}
}
ICLR
EasyTPP: Towards Open Benchmarking Temporal Point Processes

Siqiao Xue, Xiaoming Shi, Zhixuan Chu, Yan Wang, Hongyan Hao, and 7 more authors

In ICLR, 2024

Continuous-time event sequences play a vital role in real-world domains such as healthcare, finance, online shopping, social networks, and so on. To model such data, temporal point processes (TPPs) have emerged as the most natural and competitive models, making a significant impact in both academic and application communities. Despite the emergence of many powerful models in recent years, there hasn’t been a central benchmark for these models and future research endeavors. This lack of standardization impedes researchers and practitioners from comparing methods and reproducing results, potentially slowing down progress in this field. In this paper, we present EasyTPP, the first central repository of research assets (e.g., data, models, evaluation programs, documentations) in the area of event sequence modeling. Our EasyTPP makes several unique contributions to this area: a unified interface of using existing datasets and adding new datasets; a wide range of evaluation programs that are easy to use and extend as well as facilitate reproducible research; implementations of popular neural TPPs, together with a rich library of modules by composing which one could quickly build complex models. All the data and implementation can be found at this https URL. We will actively maintain this benchmark and welcome contributions from other researchers and practitioners. Our benchmark will help promote reproducible research in this field, thus accelerating research progress as well as making more significant real-world impacts.
@inproceedings{xue2024easytpp,
  title = {EasyTPP: Towards Open Benchmarking Temporal Point Processes},
  author = {Xue, Siqiao and Shi, Xiaoming and Chu, Zhixuan and Wang, Yan and Hao, Hongyan and Zhou, Fan and Jiang, Caigao and Pan, Chen and Zhang, James Y and Wen, Qingsong and Zhou, Jun and Mei, Hongyuan},
  booktitle = {ICLR},
  year = {2024}
}

2023

EMNLP
Towards Anytime Fine-tuning: Continually Pre-trained Language Models with Hypernetwork Prompt

Gangwei Jiang, Caigao Jiang, Siqiao Xue, James Y. Zhang, Jun Zhou, and 2 more authors

In Findings of EMNLP, 2023

Continual pre-training has been urgent for adapting a pre-trained model to a multitude of domains and tasks in the fast-evolving world. In practice, a continually pre-trained model is expected to demonstrate not only greater capacity when fine-tuned on pre-trained domains but also a non-decreasing performance on unseen ones. In this work, we first investigate such anytime fine-tuning effectiveness of existing continual pre-training approaches, concluding with unanimously decreased performance on unseen domains. To this end, we propose a prompt-guided continual pre-training method, where we train a hypernetwork to generate domain-specific prompts by both agreement and disagreement losses. The agreement loss maximally preserves the generalization of a pre-trained model to new domains, and the disagreement one guards the exclusiveness of the generated hidden states for each domain. Remarkably, prompts by the hypernetwork alleviate the domain identity when fine-tuning and promote knowledge transfer across domains. Our method achieved improvements of 3.57% and 3.4% on two real-world datasets (including domain shift and temporal shift), respectively, demonstrating its efficacy.
@inproceedings{jiang2023anytime,
  title = {Towards Anytime Fine-tuning: Continually Pre-trained Language Models with Hypernetwork Prompt},
  author = {Jiang, Gangwei and Jiang, Caigao and Xue, Siqiao and Zhang, James Y. and Zhou, Jun and Lian, Defu and Wei, Ying},
  year = {2023},
  booktitle = {Findings of EMNLP}
}
arXiv
WeaverBird: Empowering Financial Decision-Making with Large Language Model, Knowledge Base, and Search Engine

Siqiao Xue*, Fan Zhou*, Yi Xu, Hongyu Zhao, Shuo Xie, and 5 more authors

arXiv preprint, 2023

We present WeaverBird, an intelligent dialogue system designed specifically for the finance domain. Our system harnesses a large language model of GPT architecture that has been tuned using extensive corpora of finance-related text. As a result, our system possesses the capability to understand complex financial queries, such as "How should I manage my investments during inflation?", and provide informed responses. Furthermore, our system incorporates a local knowledge base and a search engine to retrieve relevant information. The final responses are conditioned on the search results and include proper citations to the sources, thus enjoying an enhanced credibility. Through a range of finance-related questions, we have demonstrated the superior performance of our system compared to other models. To experience our system firsthand, users can interact with our live demo at this https URL, as well as watch our 2-min video illustration at this https URL.
@article{xue2023weaverbird,
  title = {WeaverBird: Empowering Financial Decision-Making with Large Language Model, Knowledge Base, and Search Engine},
  author = {Xue*, Siqiao and Zhou*, Fan and Xu, Yi and Zhao, Hongyu and Xie, Shuo and Jiang, Caigao and Zhang, James and Zhou, Jun and Xiu, Dacheng and Mei, Hongyuan},
  journal = {arXiv preprint},
  year = {2023}
}
NeurIPS
Prompt-augmented Temporal Point Process for Streaming Event Sequence

Siqiao Xue*, Yan Wang*, Zhixuan Chu, Xiaoming Shi, Caigao Jiang, and 5 more authors

In NeurIPS, 2023

Neural Temporal Point Processes (TPPs) are the prevalent paradigm for modeling continuous-time event sequences, such as user activities on the web and financial transactions. In real-world applications, event data is typically received in a streaming manner, where the distribution of patterns may shift over time. Additionally, privacy and memory constraints are commonly observed in practical scenarios, further compounding the challenges. Therefore, the continuous monitoring of a TPP to learn the streaming event sequence is an important yet under-explored problem. Our work addresses this challenge by adopting Continual Learning (CL), which makes the model capable of continuously learning a sequence of tasks without catastrophic forgetting under realistic constraints. Correspondingly, we propose a simple yet effective framework, PromptTPP (our code is available at this https URL), by integrating the base TPP with a continuous-time retrieval prompt pool. The prompts, small learnable parameters, are stored in a memory space and jointly optimized with the base TPP, ensuring that the model learns event streams sequentially without buffering past examples or task-specific attributes. We present a novel and realistic experimental setup for modeling event streams, where PromptTPP consistently achieves state-of-the-art performance across three real user behavior datasets.
@inproceedings{xue2023prompttpp,
  title = {Prompt-augmented Temporal Point Process for Streaming Event Sequence},
  booktitle = {NeurIPS},
  author = {Xue*, Siqiao and Wang*, Yan and Chu, Zhixuan and Shi, Xiaoming and Jiang, Caigao and Hao, Hongyan and Jiang, Gangwei and Feng, Xiaoyun and Zhang, James and Zhou, Jun},
  year = {2023}
}
NeurIPS
Language Models Can Improve Event Prediction by Few-Shot Abductive Reasoning

Xiaoming Shi, Siqiao Xue, Kangrui Wang, Fan Zhou, James Y. Zhang, and 3 more authors

In NeurIPS, 2023

Large language models have shown astonishing performance on a wide range of reasoning tasks. In this paper, we investigate whether they could reason about real-world events and help improve the prediction performance of event sequence models. We design LAMP, a framework that integrates a large language model in event prediction. Particularly, the language model performs abductive reasoning to assist an event sequence model: the event model proposes predictions on future events given the past; instructed by a few expert-annotated demonstrations, the language model learns to suggest possible causes for each proposal; a search module finds out the previous events that match the causes; a scoring function learns to examine whether the retrieved events could actually cause the proposal. Through extensive experiments on several challenging real-world datasets, we demonstrate that our framework—thanks to the reasoning capabilities of large language models—could significantly outperform the state-of-the-art event sequence models.
@inproceedings{shi2023abductive,
  title = {Language Models Can Improve Event Prediction by Few-Shot Abductive Reasoning},
  author = {Shi, Xiaoming and Xue, Siqiao and Wang, Kangrui and Zhou, Fan and Zhang, James Y. and Zhou, Jun and Tan, Chenhao and Mei, Hongyuan},
  booktitle = {NeurIPS},
  year = {2023}
}
AAAI
Bellman Meets Hawkes: Model-Based Reinforcement Learning via Temporal Point Processes

Chao Qu, Xiaoyu Tan, Siqiao Xue, Xiaoming Shi, James Zhang, and 1 more author

In AAAI, 2023

We consider a sequential decision making problem where the agent faces the environment characterized by the stochastic discrete events and seeks an optimal intervention policy such that its long-term reward is maximized. This problem exists ubiquitously in social media, finance and health informatics but is rarely investigated by the conventional research in reinforcement learning. To this end, we present a novel framework of the model-based reinforcement learning where the agent’s actions and observations are asynchronous stochastic discrete events occurring in continuous-time. We model the dynamics of the environment by Hawkes process with external intervention control term and develop an algorithm to embed such process in the Bellman equation which guides the direction of the value gradient. We demonstrate the superiority of our method in both synthetic simulator and real-world problem.
@inproceedings{qu2023bellman,
  title = {Bellman Meets {H}awkes: Model-Based Reinforcement Learning via Temporal Point Processes},
  author = {Qu, Chao and Tan, Xiaoyu and Xue, Siqiao and Shi, Xiaoming and Zhang, James and Mei, Hongyuan},
  booktitle = {AAAI},
  year = {2023}
}

2022

NeurIPS
HYPRO: A Hybridly Normalized Probabilistic Model for Long-Horizon Prediction of Event Sequences

Siqiao Xue, Xiaoming Shi, James Y Zhang, and Hongyuan Mei

In NeurIPS, 2022

In this paper, we tackle the important yet under-investigated problem of making long-horizon prediction of event sequences. Existing state-of-the-art models do not perform well at this task due to their autoregressive structure. We propose HYPRO, a hybridly normalized probabilistic model that naturally fits this task: its first part is an autoregressive base model that learns to propose predictions; its second part is an energy function that learns to reweight the proposals such that more realistic predictions end up with higher probabilities. We also propose efficient training and inference algorithms for this model. Experiments on multiple real-world datasets demonstrate that our proposed HYPRO model can significantly outperform previous models at making long-horizon predictions of future events. We also conduct a range of ablation studies to investigate the effectiveness of each component of our proposed methods.
@inproceedings{xue2022hypro,
  author = {Xue, Siqiao and Shi, Xiaoming and Zhang, James Y and Mei, Hongyuan},
  title = {HYPRO: A Hybridly Normalized Probabilistic Model for Long-Horizon Prediction of Event Sequences},
  booktitle = {NeurIPS},
  year = {2022}
}
KDD
A Meta Reinforcement Learning Approach for Predictive Autoscaling in the Cloud

Siqiao Xue*, Chao Qu*, Xiaoming Shi, Cong Liao, Shiyi Zhu, and 9 more authors

In KDD, 2022

Predictive autoscaling (autoscaling with workload forecasting) is an important mechanism that supports autonomous adjustment of computing resources in accordance with fluctuating workload demands in the Cloud. In recent works, Reinforcement Learning (RL) has been introduced as a promising approach to learn the resource management policies to guide the scaling actions under the dynamic and uncertain cloud environment. However, RL methods face the following challenges in steering predictive autoscaling, such as lack of accuracy in decision-making, inefficient sampling and significant variability in workload patterns that may cause policies to fail at test time. To this end, we propose an end-to-end predictive meta model-based RL algorithm, aiming to optimally allocate resource to maintain a stable CPU utilization level, which incorporates a specially-designed deep periodic workload prediction model as the input and embeds the Neural Process to guide the learning of the optimal scaling actions over numerous application services in the Cloud. Our algorithm not only ensures the predictability and accuracy of the scaling strategy, but also enables the scaling decisions to adapt to the changing workloads with high sample efficiency. Our method has achieved significant performance improvement compared to the existing algorithms and has been deployed online at Alipay, supporting the autoscaling of applications for the world-leading payment platform.
@inproceedings{xue_meta_2022,
  author = {Xue*, Siqiao and Qu*, Chao and Shi, Xiaoming and Liao, Cong and Zhu, Shiyi and Tan, Xiaoyu and Ma, Lintao and Wang, Shiyu and Wang, Shijun and Hu, Yun and Lei, Lei and Zheng, Yangfei and Li, Jianguo and Zhang, James},
  title = {A Meta Reinforcement Learning Approach for Predictive Autoscaling in the Cloud},
  booktitle = {KDD},
  year = {2022}
}