Lijing Wang

Research Interests

Machine Learning, Artificial Intelligence, and Natural Language Processing
- Theory-Guided Machine Learning
- Deep Ensemble Algorithms
- Efficient Learning Algorithms
- Model Pruning Algorithms
- Spatial-Temporal Data Analysis
- Generalization of Large Language Models (LLMs)
Applications and Domains
- Recommender Systems
- User Behavior and Network Analysis in Social Media
- Epidemiological Modeling and Disease Forecasting
- Healthcare Analytics with EHR (Electronic Health Records)

Grants

Grace Hopper Artificial Intelligence Research Institute Grant Award
- Principal Investigator: Lijing Wang
- Project Title: AI for Reliable Epidemic Forecasting
- Funded Amount: $10,000
Grace Hopper Artificial Intelligence Research Institute Grant Award
- Principal Investigator: Zhifeng Kou, Bioinformatics, NJIT
- Co-Principal Investigator: Lijing Wang, Data Science, NJIT
- Project Title: Large Language Model-Driven AI Platform for Next-Generation Surgical Planning and Navigation
- Funded Amount: $25,000
FY25 Faculty Seed Grant Award
- Principal Investigator: Shuai Zhang
- Department: Data Science
- Project Title: Provable Efficient Learning with Foundation Models
- Co-Principal Investigator(s): Lijing Wang
- Funded Amount: $10,000
FY24 Faculty Seed Grant Award
- Principal Investigator: Lijing Wang
- Department: Data Science
- Project Title: Towards Improving the Generalization and Robustness of Large Pretrained Language Models
- Co-Principal Investigator(s): Mengnan Du (Data Science)
- Funded Amount: $10,000

Current Research Projects

Improving the Generalization, Consistency, and Robustness of Large Pretrained Language Models.
Key words: generalization, consistency, robustness, LLMs
In this project, we focus on designing robust and consistent fine-tuning methods for LLMs in general domains. This includes examining the effects of random seed initialization on performance variability and addressing challenges related to domain adaptation. The goal is to make LLMs more reliable and versatile for diverse applications.
- Hao Zhou, Guergana Savova, and Lijing Wang. Assessing the Micro and Macro Effects of Random Seeds on Fine-Tuning Large Language Models. Under review at ACL 2025 (short paper).
- Lijing Wang*, Yingya Li*, Timothy A. Miller, Steven Bethard, and Guergana Savova. Two-Stage Fine-Tuning for Improved Bias and Variance for Large Pretrained Language Models. The 61st Annual Meeting of the Association for Computational Linguistics (ACL'23), Toronto, Canada, July 9-14th, 2023. To appear. [ pdf | bibtex | code ]

Resource-Efficient Deep Recommendation Systems.
Key words: deep learning, model pruning, feature selection, efficient learning
In this project, we are developing efficient deep neural network (DNN)-based recommendation systems by employing model pruning and feature selection techniques. This research addresses computational challenges, enhancing scalability while maintaining performance in large-scale recommendation frameworks.
- Ching-Hao Fan, Yue Ning, and Lijing Wang. Optimizing Recommender Systems: A Structured Pruning Approach with Pretraining for Enhanced Efficiency and Accuracy. Under review at ACM KDD 2025.

Integrating GNNs and NLP for Social Science Applications.
Key words: deep learning, GNN, NLP, social science, network analysis, behavior analysis
In this project, we explore the application of GNNs and NLP to analyze user behavior and network dynamics in crowdsourcing platforms. By modeling textual data and network interactions, this work seeks to uncover patterns in collaboration and knowledge-sharing, with the goal of designing adaptive systems that foster productivity and engagement.
- Ching-Hao Fan, Hao Zhou, Yao Sun, Geovanny Palomino Roldan, Olga Kokshagina, Marc Santolini, and Lijing Wang. Incorporating Knowledge Sharing in Graph Learning for User Behavior Prediction in Crowd-Empowered Online Communities. Under review at the 15th ACM International Conference on Multimedia Retrieval (ICMR 2025).

AI in Medical Imaging and Surgical Navigation.
Key words: NLP, LLM, image processing
In collaboration with faculty in bioinformatics at NJIT and experts in medical imaging and neural sciences, we are developing an AI-based platform for automated surgical planning and navigation, with a focus on brain tumor surgeries. This work leverages LLMs, advanced ML algorithms, and cutting-edge imaging techniques to enhance the interaction between clinicians and imaging systems, aiming to replace manual tasks while ensuring precision and efficiency.

Previous Research Projects

Improving consistency of deep learning models via ensemble techniques.

Deep Learning Theory and Algorithm
Key words: deep learning, consistency, correct-consistency, snapshot ensemble
Deep learning models assist humans in making decisions, so users' trust in these models is of paramount importance. Trust is often a function of constant behavior. From an AI model perspective, this means that given the same input the user expects the same output, especially for correct outputs; in other words, consistently correct outputs. We study model behavior in the context of periodic retraining of deployed models, where the outputs from successive generations of a model might not agree on the correct labels assigned to the same input. We formally define consistency and correct-consistency of a learning model. We prove that the consistency and correct-consistency of an ensemble learner are not less than the average consistency and correct-consistency of the individual learners, and that correct-consistency can be improved with a probability by combining learners whose accuracy is not less than the average accuracy of the ensemble component learners. To validate the theory, we propose an efficient dynamic snapshot ensemble method and demonstrate its value using three datasets and two state-of-the-art deep learning classifiers.
- Lijing Wang, Dipanjan Ghosh, Maria Gonzalez Diaz, Ahmed Farahat, Mahbubul Alam, Chetan Gupta, Jiangzhuo Chen, Madhav Marathe. Wisdom of the Ensemble: Improving Consistency of Deep Learning Models. Advances in Neural Information Processing Systems 33 (NeurIPS'20), online, Dec 06-12, 2020. Acceptance rate: 20.1%. [ pdf | bibtex | code | poster]
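The two quantities described above can be illustrated with a minimal sketch. It assumes that consistency is measured as the fraction of inputs on which two model generations agree, and correct-consistency as the fraction on which they agree and are also correct. The function names and the plain majority-vote ensemble are illustrative assumptions, not the formal definitions or the dynamic snapshot ensemble method from the paper.

```python
from collections import Counter


def consistency(preds_a, preds_b):
    """Fraction of inputs on which two model generations give the same output."""
    agree = sum(a == b for a, b in zip(preds_a, preds_b))
    return agree / len(preds_a)


def correct_consistency(preds_a, preds_b, labels):
    """Fraction of inputs on which both generations agree AND match the label."""
    agree_correct = sum(a == b == y for a, b, y in zip(preds_a, preds_b, labels))
    return agree_correct / len(labels)


def majority_vote(member_preds):
    """Combine per-member prediction lists into one ensemble prediction per input.

    Assumption: a plain majority vote stands in for the ensemble combiner.
    """
    return [Counter(votes).most_common(1)[0][0] for votes in zip(*member_preds)]


if __name__ == "__main__":
    labels = [0, 1, 1, 0, 1, 0]
    generation_1 = [0, 1, 0, 0, 1, 1]  # predictions before retraining (toy data)
    generation_2 = [0, 0, 1, 0, 1, 0]  # predictions after retraining (toy data)
    print(consistency(generation_1, generation_2))
    print(correct_consistency(generation_1, generation_2, labels))
```

In this toy setting, each generation could itself be the output of `majority_vote` over several learners, which is one way to compare single-model and ensemble consistency empirically.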

Epidemic forecasting with recurrent neural networks and graph neural networks.

Epidemic Forecasting and Simulating
Key words: RNN, GNN, dynamic networks, mobility map
Forecasting the spatial and temporal evolution of epidemics has been an area of active research over the past couple of decades. Purely data-driven methods employ statistical and time-series-based methodologies to learn patterns in historical epidemic data and leverage those patterns for forecasting. Recurrent neural networks (RNNs) are widely used for time series forecasting since they can capture temporal dynamics. Graph neural networks (GNNs) are known for their ability to capture cross-spatial effects in dynamic environments. We propose novel frameworks that use RNNs and GNNs for spatio-temporal epidemic forecasting. Extensive experiments on seasonal influenza-like-illness (ILI) datasets and COVID-19 case datasets demonstrate the value of the proposed methods.
- Lijing Wang*, Xue Ben*, Aniruddha Adiga*, Adam Sadilek, Ashish Tendulkar, Srinivasan Venkatramanan, Anil Vullikanti, Gaurav Aggarwal, Alok Talekar, Jiangzhuo Chen, Bryan Lewis, Samarth Swarup, Amol Kapoor, Milind Tambe, Madhav Marathe. Using Mobility Data to Understand and Forecast COVID19 Dynamics. The 29th International Joint Conference on Artificial Intelligence Workshop on AI for Social Good (IJCAI AI4SG'21), online, Jan 07-15, 2021. Long talk. Acceptance rate: 28%. [ pdf | bibtex ]
- Lijing Wang, Aniruddha Adiga, Srinivasan Venkatramanan, Jiangzhuo Chen, Bryan Lewis, Madhav Marathe. Examining Deep Learning Models with Multiple Data Sources for COVID-19 Forecasting. IEEE BigData 2020 Workshop on Data Science in Medicine and Healthcare (IEEE BigData DSMH'20), online, Dec 10-13, 2020. [ pdf | bibtex ]
- Songgaojun Deng, Shusen Wang, Huzefa Rangwala, Lijing Wang, Yue Ning. Cola-GNN: Cross-location Attention based Graph Neural Networks for Long-term ILI Prediction. The 20th ACM International Conference on Information and Knowledge Management (CIKM'20), online, Oct 19-23, 2020. Research Track. Acceptance rate: 20%. [ pdf | bibtex | code ]

Combining theory and deep learning for epidemic forecasting.

Epidemic Forecasting and Simulating
Key words: DNN, theory-based causal models, synthetic data
Deep learning methods have gained popularity in the epidemic forecasting domain due to their advances in computer vision, natural language processing, and many other domains. A drawback of deep learning models is their black-box nature: while they are capable of providing correct inferences, they lack explanatory power for the underlying phenomena. We are the first to propose combining mechanistic causal methods with deep learning based methods, leading to explainable AI. The proposed methods are able to provide correct inference as well as a better understanding of the learned models.
- Lijing Wang, Jiangzhuo Chen, Madhav Marathe. TDEFSI: Theory Guided Deep Learning Based Epidemic Forecasting with Synthetic Information. ACM Transactions on Spatial Algorithms and Systems (TSAS'20), Deep Learning for Spatial Algorithms and Systems, 2020 May;6(3):1-39. Impact factor: 1.69. [ pdf | bibtex | poster ]
- Lijing Wang, Jiangzhuo Chen, Madhav Marathe. DEFSI: Deep Epidemic Forecasting with Synthetic Information. The 32nd Innovative Applications of Artificial Intelligence (IAAI'19), Hawaii, USA, Jan 27-Feb 01, 2019. Acceptance rate: 35%. [ pdf | bibtex ]

Epidemic forecasting with mobility data

Epidemic Forecasting and Simulating
Key words: human mobility, GNN, agent-based SEIR models, metapopulation SEIR models
Human mobility is a primary driver of infectious disease spread. Thus, disease dynamics are heavily affected by human mobility behaviours. In this research work, we propose new models (metapopulation models, agent-based models, and graph neural network models) that leverage a large-scale anonymized mobility map aggregated over hundreds of millions of smartphones, and we evaluate its utility in forecasting epidemics. On the one hand, we factor the mobility map into a metapopulation model to retrospectively forecast influenza in the USA and Australia. On the other hand, we use mobility information to build graph neural networks for COVID-19 confirmed case forecasting at the US state level. Our work takes the first step towards timely infectious disease forecasting at a global scale and opens new possibilities in studying human mobility and its applications to infectious disease epidemiology.
- Lijing Wang*, Xue Ben*, Aniruddha Adiga*, Adam Sadilek, Ashish Tendulkar, Srinivasan Venkatramanan, Anil Vullikanti, Gaurav Aggarwal, Alok Talekar, Jiangzhuo Chen, Bryan Lewis, Samarth Swarup, Amol Kapoor, Milind Tambe, Madhav Marathe. Using Mobility Data to Understand and Forecast COVID19 Dynamics. The 29th International Joint Conference on Artificial Intelligence Workshop on AI for Social Good (IJCAI AI4SG'21), online, Jan 07-15, 2021. Long talk. Acceptance rate: 28%. [ pdf | bibtex ]
- Alok Talekar, Nidhin Vaidhiyan, Sharad Shriram, Gaurav Aggarwal, Jiangzhuo Chen, Srini Venkatramanan, Lijing Wang, Aniruddha Adiga, Adam Sadilek, Ashish Tendulkar, Madhav Marathe, Rajesh Sundaresan and Milind Tambe. Cohorting to Isolate Asymptomatic Spreaders: An Agent-based Simulation Study on the Mumbai Suburban Railway. The 20th International Conference on Autonomous Agents and MultiAgent Systems (AAMAS'21), online, May 03-07, 2021. Acceptance rate: 24.8%. [ pdf | bibtex ]
- Srinivasan Venkatramanan, Adam Sadilek, Arindam Fadikar, Christopher L. Barrett, Matthew Biggerstaff, Jiangzhuo Chen, Xerxes Dotiwalla, Paul Eastham, Bryant Gipson, Dave Higdon, Onur Kucuktunc, Allison Lieber, Bryan L Lewis, Zane Reynolds, Anil K Vullikanti, Lijing Wang, Madhav Marathe. Forecasting Influenza Activity Using Machine-Learned Mobility Map. Nature Communications (NatureComms), 2021 Feb 09;12(1): 1-12. Impact factor: 12.121. [ pdf | bibtex ]
- Aniruddha Adiga*, Lijing Wang*, Adam Sadilek*, Ashish Tendulkar*, Srinivasan Venkatramanan, Anil Vullikanti, Gaurav Aggarwal, Alok Talekar, Xue Ben, Jiangzhuo Chen, Bryan Lewis, Samarth Swarup, Milind Tambe, Madhav Marathe. Interplay of global multi-scale human mobility, social distancing, government interventions, and COVID-19 dynamics. medRxiv (DOI). [ pdf | bibtex]
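As a rough illustration of how a mobility map can enter a metapopulation model, the sketch below couples per-region SEIR compartments through a row-normalized mobility matrix. This is a generic discrete-time textbook formulation written for illustration. The coupling rule, the parameter values, and all names are assumptions rather than the model used in the papers listed above.

```python
def step(state, mobility, beta, sigma, gamma):
    """Advance every region by one day.

    state:    list of (S, E, I, R) tuples, one per region.
    mobility: mobility[i][j] is the assumed fraction of contacts that residents
              of region i make with region j (each row sums to 1).
    """
    n = len(state)
    prevalence = [I / (S + E + I + R) for S, E, I, R in state]
    new_state = []
    for i, (S, E, I, R) in enumerate(state):
        # Force of infection mixes local and imported prevalence via mobility.
        force = beta * sum(mobility[i][j] * prevalence[j] for j in range(n))
        new_exposed = force * S
        new_infectious = sigma * E
        new_recovered = gamma * I
        new_state.append((
            S - new_exposed,
            E + new_exposed - new_infectious,
            I + new_infectious - new_recovered,
            R + new_recovered,
        ))
    return new_state


def simulate(state, mobility, beta=0.3, sigma=0.2, gamma=0.1, days=60):
    """Return the list of states for day 0 through day `days`."""
    history = [state]
    for _ in range(days):
        state = step(state, mobility, beta, sigma, gamma)
        history.append(state)
    return history


if __name__ == "__main__":
    # Toy setup: region 0 is seeded with infections, region 1 starts disease-free.
    initial = [(990.0, 0.0, 10.0, 0.0), (1000.0, 0.0, 0.0, 0.0)]
    coupled = [[0.9, 0.1], [0.1, 0.9]]
    final = simulate(initial, coupled)[-1]
    print("Region 1 after 60 days (S, E, I, R):", final[1])
```

With an identity mobility matrix the second region is never infected, which makes the role of the off-diagonal mobility terms easy to see in this toy setup.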

Epidemic forecasting with social media data

Epidemic Forecasting and Simulating
Key words: twitter posts, topic modeling, agent-based SEIR models
Traditional compartmental epidemiology models are able to capture disease spreading trends through a contact network, but are unable to provide timely updates via real-world data. In contrast, techniques focusing on emerging social media platforms can collect and monitor real-time disease data, but do not provide an understanding of the underlying dynamics of ailment propagation. To achieve efficient and accurate real-time disease prediction, the framework proposed in this paper combines the strengths of social media mining and computational epidemiology. Specifically, individual health status is first learned from users' online posts through Bayesian inference; disease parameters are then extracted for the computational models at the population level; and the outputs of the computational epidemiology model are fed back into the mining of social media data for further performance improvement.
- Ting Hua, Chandan K Reddy, Lei Zhang, Lijing Wang, Liang Zhao, Chang-Tien Lu, Naren Ramakrishnan. Social Media based Simulation Models for Understanding Disease Dynamics. The 27th International Joint Conference on Artificial Intelligence (IJCAI'18), Stockholm, Sweden, Jul 13-19, 2018. Acceptance rate: 20.5%. [ pdf | bibtex ]

Health disparity analysis in infectious disease via agent-based SEIR simulations

Epidemic Forecasting and Simulating
Key words: health disparity, agent-based SEIR models, net return, vaccination strategy
Infectious diseases such as influenza and Ebola pose a serious threat to everyone, but certain demographics and cohorts face a higher risk of infection than others. This research provides a computational framework for studying health disparities among cohorts based on individual-level features, such as age, gender, and income. We apply this framework to find health disparities among subpopulations in an influenza epidemic and to evaluate vaccination prioritization strategies for achieving specific objectives. The results, framework, and methodology developed here can assist public health policy makers in efficiently allocating limited pharmaceutical resources.
- Lijing Wang, Jiangzhuo Chen, Achla Marathe. A framework for discovering health disparities among cohorts in an influenza epidemic. World Wide Web Journal Special Issue on Social computing and big data applications (WWWJ'19), 2019 Nov;22(6):2997-3020. Impact factor: 2.892. [ pdf | bibtex | poster]

Identity reconciliation via graph matching

Network Science
Key words: social network, percolation-based graph matching
Linking multiple accounts owned by the same user across different online social networks (OSNs) is an important problem in social networks, known as identity reconciliation. Graph matching is one of the popular techniques for solving this problem: it identifies a map that matches a set of vertices across different OSNs. Among graph matching techniques, percolation-based graph matching (PGM) has been explored to identify entities belonging to the same user across two different networks based on a set of initial pre-matched seed nodes and graph structural information. However, existing PGM algorithms have been applied only to undirected networks, while many OSNs are represented by directional relationships (e.g., followers or followees in Twitter or Facebook). For PGM to be applicable to real-world OSNs represented by directed networks with a small set of overlapping vertices, we propose a percolation-based directed graph matching algorithm, PDGM, that considers two key features: (1) the similarity of two nodes based on directional relationships (i.e., outgoing edges vs. incoming edges); and (2) a celebrity penalty, such as a penalty given to nodes with a high in-degree. Extensive simulation experiments show that the proposed PDGM outperforms the baseline PGM counterpart that does not consider either directional relationships or the celebrity penalty.
- Lijing Wang, Jin-Hee Cho, Ing-Ray Chen, Jiangzhuo Chen. PDGM: Percolation-based directed graph matching in social networks. 2017 IEEE International Conference on Communications (IEEE ICC'17), Paris, France, May 21-25, 2017. Acceptance rate: 36.1%. [ pdf | bibtex ]
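The percolation idea described above can be sketched as follows. This is a simplified illustration, not the PDGM algorithm itself. It assumes each matched pair sends one "mark" to candidate pairs formed from its out-neighbors, and separately from its in-neighbors, and that a candidate pair is matched once it collects a threshold number of marks. The celebrity penalty is omitted, and all names and the threshold value are hypothetical.

```python
from collections import defaultdict


def invert(out_edges):
    """Build the incoming-edge map from an outgoing-edge map."""
    incoming = defaultdict(set)
    for src, dsts in out_edges.items():
        for dst in dsts:
            incoming[dst].add(src)
    return incoming


def percolation_match(out1, out2, seeds, threshold=2):
    """Grow a node mapping between two directed graphs from seed pairs.

    out1, out2: dict mapping each node to the set of its successors.
    seeds:      list of (node_in_graph1, node_in_graph2) pre-matched pairs.
    """
    in1, in2 = invert(out1), invert(out2)
    matched1 = {u: v for u, v in seeds}
    matched2 = {v: u for u, v in seeds}
    marks = defaultdict(int)
    queue = list(seeds)
    while queue:
        u, v = queue.pop()
        # Marks spread only between neighbors reached in the same direction:
        # out-neighbors pair with out-neighbors, in-neighbors with in-neighbors.
        directions = (
            (out1.get(u, ()), out2.get(v, ())),
            (in1.get(u, ()), in2.get(v, ())),
        )
        for nbrs1, nbrs2 in directions:
            for a in nbrs1:
                for b in nbrs2:
                    if a in matched1 or b in matched2:
                        continue
                    marks[(a, b)] += 1
                    if marks[(a, b)] >= threshold:
                        matched1[a] = b
                        matched2[b] = a
                        queue.append((a, b))
    return matched1


if __name__ == "__main__":
    g1 = {1: {4}, 2: {4, 5}, 3: {5}, 4: {6}, 5: {6}}
    g2 = {"A": {"D"}, "B": {"D", "E"}, "C": {"E"}, "D": {"F"}, "E": {"F"}}
    print(percolation_match(g1, g2, [(1, "A"), (2, "B"), (3, "C")]))
```

Keeping the two directions separate means that a pair of nodes connected to matched seeds by edges of opposite orientation collects no marks, which is the intuition behind using directional relationships when matching follower-style networks.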

Lijing Wang

New Jersey Institute of Technology
