Hi! My name is Sinjoni Mukhopadhyay King. I graduated with a Ph.D. from the CSE Department at UC Santa Cruz in March 2023. I have been a part of INRG since 2021, with Professor Katia Obraczka as my primary advisor and Professor Faisal Nawab of UC Irvine as my co-advisor. My research interests include storage security, workload characterization, ML feature extraction and selection, synthetic workload generation, and applications of generative AI, with a focus on Generative Adversarial Networks (GANs) and Large Language Models (LLMs). Prior to UC Santa Cruz, I graduated with my B.S. in Electronics and Telecommunications from Symbiosis International University in 2015, where I worked on autonomous war robots, piezoelectric staircase tiles, and computed tomography machine firmware development.

Since 2016, I have worked on numerous projects in the areas of secure distribution of long-term storage, optimization of the feature selection process for storage workloads, and analysis of storage workloads. I also collaborated with IBM Zurich and CERN on data collection in Geneva in October 2019. For my dissertation, I worked on the analysis of open source user mobility and activity (uMA) datasets and on applying GANs to generate these uMA traces. During my internship at Apple in Summer 2020, I worked on router placement optimizations. I started working full-time at Apple in the Fall of 2020, where I have since worked on localization algorithm development (VIO/SLAM), context-aware conversation grouping, and ML-based performance tuning for audio product modules.

For any questions related to my background or research, please feel free to message me on LinkedIn or email me at simukhop@ucsc.edu.

Research

Name: UMAD: CLASSIFICATION, GENERATION AND ANALYSIS OF USER MOBILITY AND ACTIVITY DATA

Keywords: user activity, mobility, trace generation, GANs, analysis

Introduction: Access to user mobility and activity data (uMAD) is crucial for researchers and practitioners in various areas of technology and infrastructure planning. It reveals aspects of user behavior and trends at different spatio-temporal scales, which in turn provide invaluable information to guide the design, operation, and management of critical infrastructure, services, and applications. However, previous academic and industry efforts to collect user mobility and activity (uMA) information face important challenges, such as uMA data diversity and privacy and data protection concerns. Consequently, even if uMA data is collected successfully, it cannot be generalized or shared publicly. To address these challenges, there has been significant work on the generation of synthetic uMA datasets as well as on data anonymization. Prior work in these areas, however, targets specific applications and datasets, which makes it hard to generalize for use across different scenarios.

Our aim is to fill these gaps by providing a uMA ecosystem that manages classification, generation, evaluation, and analysis. As part of this goal, our pipeline, uMAD, aims to include the following features: Classification: enabling existing or new uMA data to be classified into our proposed taxonomy buckets; Generation: allowing users to capture patterns and generate realistic uMA datasets by leveraging well-known machine learning generation models such as Generative Adversarial Networks (GANs); Trace Analysis: helping users analyze and visualize patterns in existing and new uMA datasets; and Model Analysis: providing users with a broad understanding of the ML model's resource consumption and parameters. uMAD's open-source command line interface (CLI) is eventually meant to generate realistic synthetic uMA datasets that mimic existing traces for a range of user-configurable parameters, and to provide users with existing datasets that can be selected based on their specific needs.

Publications

  • Sinjoni Mukhopadhyay King, Faisal Nawab, Katia Obraczka, “A Survey of Open Source User Activity Traces with Applications to User Mobility Characterization and Modeling”, 2021 preprint on arXiv. PDF.
  • Oceane Bel; Sinjoni Mukhopadhyay; Nathan Tallent; Faisal Nawab; Darrell Long, “WinnowML: Stable feature selection for maximizing prediction accuracy of time-based system modeling”, Published in: 2021 IEEE International Conference on Big Data (Big Data). PDF.
  • Sinjoni Mukhopadhyay, “Secure Distributed Storage for the Internet of Things”, Published in Women Securing the Future with TIPPSS for IoT, 2019. PDF.
  • Daniel Bittman; Matthew Gray; Justin Raizes; Sinjoni Mukhopadhyay; Matt Bryson; Peter Alvaro; Darrell D.E. Long; Ethan L. Miller, “Designing Data Structures to Minimize Bit Flips on NVM”, Published in: 2018 IEEE 7th Non-Volatile Memory Systems and Applications Symposium (NVMSA). PDF.
  • Sinjoni Mukhopadhyay; Joel Frank; Justin King; Daniel Bittman; Darrell Long; Ethan Miller, “Efficient Reconstruction Techniques for Disaster Recovery in Secret-Split Datastores”, Published in: 2018 IEEE 26th International Symposium on Modeling, Analysis, and Simulation of Computer and Telecommunication Systems (MASCOTS). PDF.