My interests revolve around developing, applying, and scaling probabilistic machine learning algorithms for highly relational network data (e.g., social networks, power grids). More specifically, I am interested in graphical models and deep learning-based approaches to structured prediction with relational data, and in applying these methods to large-scale predictive tasks.
Master of Science,
University of California, Santa Cruz
Bachelor of Science,
University of Delaware
Bachelor of Mechanical
University of Delaware
I am currently completing a Master of Science in Computer Science at the University of California, Santa Cruz. My studies center on machine learning, specifically graphical models and structured prediction for relational data. While at the University of Delaware, I studied automated natural language analysis of code to augment and improve software engineering tasks. Previously, I interned at Adobe Research, where I researched probabilistic entity resolution for anonymous web activity, and at Xerox PARC, where I worked on context-aware probabilistic models for predicting mobile device usage.
Web service users often provide feedback in the form of partial preferences by submitting reviews or likes for a small number of items (e.g., movies, products). Unfortunately, users give feedback infrequently, so the resulting data is extremely sparse, and recommending new items is challenging when relying solely on item feedback. Despite the lack of feedback, real-world data is often rich, heterogeneous, and interlinked, which motivates the use of graphical models to exploit dependencies. For example, we often have information about a person's social network (with each individual providing their own set of preferences).
This project explored an approach to a structured preference problem: training graphical models to optimize ranking metrics while modeling the latent (unobserved) preferences of users. Specifically, we modeled network interactions with hinge-loss Markov random fields (HL-MRFs) and trained them with latent structural SVM learning algorithms to optimize a ranking metric known as normalized discounted cumulative gain (NDCG).
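As a rough illustration of the ranking metric mentioned above, here is a minimal sketch of NDCG in Python. It assumes the common exponential-gain formulation with a log2 rank discount; the project may have used a different variant. The function names `dcg` and `ndcg` are illustrative and are not taken from the project code.

```python
import math


def dcg(relevances, k=None):
    """Discounted cumulative gain of relevance scores listed in ranked order.

    Assumption: exponential gain (2^rel - 1) with a log2(rank + 1) discount,
    where rank is 1-based. Other formulations use a linear gain instead.
    """
    if k is not None:
        relevances = relevances[:k]
    # enumerate() is 0-based, so the 1-based rank r maps to log2(r + 1) = log2(i + 2).
    return sum((2 ** rel - 1) / math.log2(i + 2) for i, rel in enumerate(relevances))


def ndcg(relevances, k=None):
    """DCG of the given ranking divided by the DCG of the ideal ranking."""
    ideal = dcg(sorted(relevances, reverse=True), k)
    # If no item is relevant, the metric is defined here as 0.
    return dcg(relevances, k) / ideal if ideal > 0 else 0.0


# True relevance of each recommended item, in the order the model ranked them.
perfect_ranking = [3, 2, 1, 0]
flawed_ranking = [0, 1, 2, 3]
print(ndcg(perfect_ranking))
print(ndcg(flawed_ranking))
```

A ranking that places the most relevant items first scores 1.0, and any ranking that misorders them scores lower, which is why the metric suits recommendation settings where only the top of the list matters.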
For more info, see: poster
Modern software systems often consist of millions of lines of code, with complex components and many users contributing to the same application. The complexity of today's software necessitates the use of production and maintenance tools, such as those designed for code search, code comprehension, and bug identification.
In this project, we explored how natural language analysis techniques could assist in software engineering tasks by extracting useful semantic relationships from code and its corresponding documentation. Specifically, we built a system that extracts developer comments and their corresponding code snippets, then parses the primary action (verb) described in both the comment and the method name to form a semantic pair.
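The pairing step described above can be sketched roughly as follows. This is a simplified illustration, not the system we built: it assumes that the primary action is the first word of the comment and the first token of the method name, whereas a real pipeline would likely rely on part-of-speech tagging or parsing. The helper names `split_identifier` and `semantic_pair` are hypothetical.

```python
import re


def split_identifier(name):
    """Split a camelCase or snake_case identifier into lowercase words."""
    spaced = re.sub(r"([a-z0-9])([A-Z])", r"\1 \2", name).replace("_", " ")
    return [word.lower() for word in spaced.split()]


def semantic_pair(comment, method_name):
    """Pair the leading word of a comment with the leading token of a method name.

    Assumption: both the comment and the method name start with their verb,
    as in "Delete the cached entry" and "removeCacheEntry".
    Returns None when either side has no words.
    """
    comment_words = re.findall(r"[a-z]+", comment.lower())
    method_words = split_identifier(method_name)
    if not comment_words or not method_words:
        return None
    return (comment_words[0], method_words[0])


print(semantic_pair("// Delete the cached entry", "removeCacheEntry"))
```

Pairs such as ("delete", "remove") are the kind of related-verb evidence that could then feed tools like code search, where a query verb may differ from the verb used in the code.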