GHC hosts a mind-expanding poster session for students, faculty members, and industry professionals. We are honored to be joined by these poster presenters, some of whom are also competing in the distinguished ACM Student Research Competition. Learn about the innovative technology the next generation is dreaming up.
Wednesday, September 26, 11:30 a.m. to 2 p.m.
Wednesday, September 26, 2:30 to 5 p.m.
Thursday, September 27, 11:30 a.m. to 2 p.m.
Thursday, September 27, 2:30 to 5 p.m.
Friday, September 28, 9 to 11:30 a.m.
Wednesday, September 26, 11:30 a.m. — 2 p.m.
The Promise of Premise: Question Premises in Visual Question Answering
Questions about images contain premises – objects and relationships implied by the question. We find that reasoning about premises can help Visual Question Answering models respond more intelligently to irrelevant questions. For this, we construct a dataset for Question Relevance Prediction and train novel models. We show that models that reason about premises consistently outperform models that do not.
The Structural Affinity Method for Solving Geometrical Intelligence Tests
In his works on heuristic reasoning, the Hungarian mathematician G. Polya wrote extensively on analogy and generalization for solving geometrical problems. An analogy between two systems is quantified in terms of the agreement between their respective parts. We propose a method that uses this definition to find structures in geometrical tests of human intelligence.
Deep Learning Based Text Anomaly Detection
Data is a vital part of Bloomberg’s business. To provide the best financial services to our clients, we must ensure that the data in our datastores is clean and that anomalous data is caught and corrected before it is introduced into the system. We present Zero-Boundary LSTM, an unsupervised deep learning approach for anomaly detection in unstructured text data using LSTM autoencoders and a one-class SVM.
Composable Learning in Multi-Agent Systems
We present a reinforcement learning algorithm for learning sparse non-parametric controllers. This representation of the policy enables efficiently composing multiple learned models. We demonstrate the performance of this algorithm on learning obstacle-avoidance policies in multiple simulations and on a physical robot equipped with a laser scanner while navigating in a 2D environment.
Facial Recognition Application
In recent years, facial recognition has become a trending topic. This poster introduces a facial recognition application built using open source resources, in addition to a pipeline utilizing a hybrid knowledge base.
A Realistic Face Simulator for Deep Learning
Facial analysis via deep learning requires large labeled datasets. Annotating visual data is expensive and tedious. We propose a novel, highly realistic computer graphics face simulator for generating synthetic facial data with many ground truth labels. We generate a large synthetic dataset with varied head poses and show how it helps to achieve state-of-the-art accuracy for head pose estimation.
Uncovering Scene Context for Predicting Images’ Privacy
With the exponential increase in the number of online images, the development of image privacy prediction systems has become crucial. Prior works have used object tags derived from visual content and user tags as features. We propose that adding the scene context obtained from visual content using convolutional neural networks to object and user tags can further improve performance.
IRMA: Information Referral Matching Assistant for Social Good
The IRMA chatbot implements 211LA County’s state-of-the-art, industry-standard taxonomy model to provide three referrals for different services, including housing, food, utility, and senior needs. In this proposal, we describe its implementation and point to current results and future work in the area.
Does Size Matter? Evaluating Commercial Computer Vision Algorithms for Evidence of Disparate Treatment by Body Type
In this study, we investigate algorithmic bias in commercial computer vision systems with respect to a previously unexamined source of bias: body type. We present a new dataset of images labeled by body type and use these images as input to three commercial computer vision APIs. The output of the APIs is then examined for evidence of disparate treatment between two different body types.
Diffusion Convolutional Recurrent Neural Network: Data-Driven Traffic Forecasting
Traffic forecasting is challenging due to (1) complex spatial dependency and (2) nonlinear temporal dynamics. We introduce the Diffusion Convolutional Recurrent Neural Network (DCRNN), a deep learning framework for traffic forecasting. Evaluated on two real-world, large-scale road network traffic datasets, DCRNN shows a consistent improvement of 12% to 15% over state-of-the-art baselines.
Fast Computing on GPGPU
The Likelihood Ratio Test (LRT) is a method for identifying hotspots or anomalous regions in a spatial data grid drawn from an arbitrary distribution. LRT has widespread applications, from social media applications powered by mobile phones to medical image processing. Unfortunately, a naive implementation of LRT exhibits a worst-case time complexity of O(n^4) for a two-dimensional grid.
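One way to see where a fourth-power cost can come from: if the candidate regions are axis-aligned rectangles (an assumption here, not something the abstract states), an n × n grid contains (n(n+1)/2)^2 of them, so scoring each one takes on the order of n^4 steps. The sketch below only counts candidate regions; the function names are hypothetical, no likelihood ratio is computed, and it is not the poster's implementation.

```python
def count_rectangles(n: int) -> int:
    """Count axis-aligned rectangular regions in an n x n grid by brute force."""
    count = 0
    for top in range(n):
        for bottom in range(top, n):          # bottom edge at or below top edge
            for left in range(n):
                for right in range(left, n):  # right edge at or after left edge
                    count += 1                # one candidate region to score
    return count


def closed_form(n: int) -> int:
    """Closed-form count: choose a row interval and a column interval."""
    return (n * (n + 1) // 2) ** 2


if __name__ == "__main__":
    for n in (1, 2, 4, 8):
        print(n, count_rectangles(n), closed_form(n))
```

The four nested loops mirror the four free coordinates of a rectangle, which is why the count grows with the fourth power of the grid side.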
Building a Highly Performant and Cost Efficient Ad-Hoc Query Platform
This study presents the lessons learnt while building a production ad-hoc query platform for one of the globally renowned gaming companies in the U.S. By utilizing novel machine learning techniques and advanced resource allocation strategies, the ad-hoc query platform achieved a significant improvement in the execution time while reducing the overall infrastructure/operation cost by 30%.
Efficiently Scaling the Performance of Big Data Applications by Using Memory Slices
Dense, compute-intensive applications, such as training neural networks, demand scalability: reaching a desired performance with as few modules as possible. Our proposed memory system, in which performance scales with data size, maintains a balance between bandwidth and compute rate to efficiently meet the requirements of dense applications, a key requirement of scalability missed by previous work.
Towards a Cost-Effective Tactile Reader for the Visually-Impaired
Commercially available refreshable braille readers are very expensive due to the number of actuators needed to actuate each dot of the braille cell, forcing many visually impaired people to use screen readers. Yet hearing text is a different experience from reading it. Here, words are presented with just a single actuator. This method of reading is evaluated through a proof-of-concept prototype.
Auntie-Tuna Data-Sharing: Enhancing User Experience
AuntieTuna is an anti-phishing Chrome plugin built by Calvin Ardi and John Heidemann that protects against phishing sites that lure users into sharing important data. AuntieTuna records known websites’ elements and watches if they reappear. I’ve improved data-sharing by adding manual import and export of known websites and creating a website where users can upload and share data with others.
Towards Understanding the Effects of Social Networking on Postpartum Depression in Women: An Analysis in the Context of Bangladesh
Nowadays, many women use social networking sites (SNS) as a platform for receiving social support during their postpartum period. In this paper, I quantitatively analyze the effects of SNS usage on postpartum depression (PPD) in Bangladeshi women based on a survey (N = 93). I also discuss several design implications for effectively supporting women at risk of PPD in developing countries.
Adding a Third and Crucial Step, the 'Follow Me Home' Session
To optimize the UI design, the design process should be tailored to meet the customers’ needs. To this end, consumer research, customer surveys, and focus groups are all extremely valuable, yet limited in their ability to provide a full, authentic real-life representation of the use of the product. This poster describes the added value of the “follow-me-home” step designed to bridge this gap.
A Study for Designing of a Cooperative Self-Management Program on a Smartphone for Adolescents with Autism Spectrum Disorder
Adolescents with Autism Spectrum Disorder (ASD) present a variety of behavioral challenges. Our aim is to develop a chatbot-based self-management tool that can support a smooth transition to adulthood. We conducted a formative study to identify difficulties and current practices of adolescents with ASD, designed a cooperative self-management tool, and examined the feasibility of the system.
How to Make Empathetic Bots with Emotion Tech?
Conversational agents and chat bots are becoming smarter by the day. A key component these intelligent bots lack is the ability to understand human emotions and react appropriately. This poster will explore the emotion detection and emotion synthesis approaches to make bots ‘emotionally intelligent,’ enabling more friendly, empathetic, and human-relatable conversations.
Transition from Paper to Electronic Documentation in Pediatric Emergency Medical Settings
Introduction of an electronic flowsheet in a time-critical medical setting provided us with a unique opportunity to study the time of transition from paper to electronic documentation. We discuss the advantages of electronic charting, challenges of use, and workarounds to overcome these challenges, and provide design recommendations to support and improve the use of the electronic flowsheet.
Connection: An Autism-Focused Dating App
Dating apps are some of the most popular means to build romantic relationships, with 30% of U.S. adults aged 18-29 using them. Meanwhile, only 9% of autistic adults are married, and there is a lack of accessible and inclusive dating apps supporting their needs. We propose Connection, a dating app for autistic adults guided by literature and inspired by our target users’ needs.
Increasing the Salience of Data Use Opt-Outs Online
Prior work shows users are concerned about their privacy but feel resigned about their ability to manage it. They may not be aware of the choices available to them. We propose a way to make privacy choices salient to users. Based on an approach to identify and classify opt-out choices in privacy policies, we designed an extension to notify users about available choices.
Towards Faster and Better Responses: Incremental Speech Processing in Socially-Aware Virtual Agents
Speech-based interaction systems often use complex architectures that result in processing delays. We eliminate these delays in one such system, a socially-aware virtual agent, through incremental speech processing, prompting the agent to talk as a task is being completed. Our approach not only eliminated latencies, but also resulted in higher ratings regarding the performance of the agent’s task.
Using System Design to Stop an Epidemic
The recent rise in deadly viral infections like the flu has left many people concerned. In my presentation, I will cover how technology, through system design, can minimize infection risks worldwide by improving the flow of information and ground-level involvement in preventing the spread of infectious diseases.
Dress-Yourself: A Web-based Application to Manage Clothes for People with Visual Impairments
This research proposes a web-based application that helps people with visual impairments manage their clothes without the help of others. It can be important and challenging for them to identify features such as clothing category and color. With this application, they can search for information about clothes using the braille code generated by the system and find out which clothes are good pairs.
Real-Time Identification of Movement Similarity: a Semantic-Aware Approach
The identification of trajectory outliers can lead to the discovery of useful and meaningful knowledge and has a number of practical applications (e.g., transportation, public safety). We incorporate semantic annotations into the raw trajectory data to discover various movement relationships between sub-trajectories of mobile nodes in real time. Experimental results indicate promising outcomes.
Cache Network Management Using “BIG” Cache Abstraction
For a hierarchical network of caches in a Content Delivery Network, we use timer-based caches to solve the resource allocation and object placement problems with the objective of utility maximization, considering both regular and heavy-tailed request traffic. We propose an optimization decomposition framework for cache management under the “BIG” cache abstraction to enhance network performance.
DisDroneNet: A Flying Ad Hoc Network (FANET) Providing Connectivity in Disaster Areas
The recent hurricane in Puerto Rico, which demolished most of the cell towers and network infrastructure across the island, motivated us to design a system that can help rescue trapped people and put disaster areas back on their feet quickly. We aim to provide an ad hoc infrastructure in disaster areas as the communication network for the first 24 hours and beyond.
OData – A Standard for RESTful APIs
Good APIs are an important quality attribute of successful software products. This poster presentation gives an overview of the OData protocol, a standard for the creation and consumption of web APIs that can be used to exchange data between systems. It explains the advantages over previous API technologies and shares our own experiences using OData in practice.
Building Contextually Aware Products
Software of the future must increasingly be intelligent, personalized, and deeply human in order to deliver value to users. Contextual awareness is a critical component in creating these new engaging experiences. This poster will focus on how to create contextually aware products using signals readily available today.
Wednesday, September 26, 2:30 — 5 p.m.
FarmBytes: Using Computer Vision to Automate Plant Growth in Vertical Farming
To overcome the dependence of vertical farming systems on the eyes of their users, and thus make them more widely accessible, this project aims to develop software that can autonomously recognize and correct deficiencies in plant growth and growing conditions using computer vision algorithms and a microcontroller.
Predictive Analytics in the Criminal Justice System: Media Depictions and Framing
Artificial intelligence is becoming commonplace in crime-fighting efforts. For instance, predictive policing uses software to predetermine criminals and areas where crime is likely to happen. This study evaluates frames in news media and how they shape public opinion on two fronts: (a) the use of predictive analytics in the justice system, and (b) the integration of AI in everyday life.
AID: An Index for Interactive Multilevel Visualization
Visualization is an integral part of big data management. This paper introduces AID, the first adaptive index that combines both data and images to provide scalable interactive visualization. The proposed index classifies spatial input, based on its visualization cost, into image, data, shallow, or empty tiles. The index is accessible to end users through a standard web interface like Google Maps.
Finding Optimal Dielectric Boundary for Practical Continuum Solvent Calculations
Atomistic modeling and simulation methods facilitate biomedical research in many respects. The ability of these methods is determined by the treatment of complex solvation effects in target biomolecules surrounded by water. We propose a novel approach to constructing an optimal dielectric boundary, in terms of calculating binding free energies, which is central to many continuum solvation models.
ASAP-MMP: An Antibody Sequence Analysis Pipeline Based on Antibody Targeting Matrix Metalloproteinases
Matrix Metalloproteinases (MMPs) facilitate cancer progression under pathologic conditions, so computational design of antibodies that inhibit MMPs is in high demand. We build a pipeline, ASAP-MMP, which computationally analyzes and identifies key features of MMP-targeting antibody protein sequences. These features can be used to design MMP-targeting antibody sequences with high specificity.
Protein Sequence Alignment and Replication of Computational Results on Intel Xeon Phi Coprocessors
This study replicates previous work comparing given protein sequences against a database of other sequences. We found that the original study inaccurately coded the Smith-Waterman algorithm for protein sequence database search, resulting in incorrect sequence alignment scores. The aim of our work is to highlight the importance of replicating computational research.
Scratch Hackathon for Teenagers
At computer science outreach programs, participants usually learn programming fundamentals, but not software engineering, which is a crucial part of software development. We developed a simple architectural pattern to implement animations with Scratch at a hackathon for teenagers. The teen participants grasped the software quickly and were able to construct projects with the proposed pattern.
A Brain Computer Interface Approach to Examine Changes in Anxiety While Walking in a Virtually Infinite World
The goal of our work is understanding and mitigating fear of falling, particularly among the elderly. In our setup, an EEG cap monitors a subject’s neural activity while the subject is immersed in a virtual world and walking on an instrumented treadmill. Based on this data, we can dynamically alter the landscape in the virtual world.
Does Open Source Open Doors? Gendered Success in Software Development Careers
The field of software development is an area with especially strong male dominance. In this article, we analyze a large dataset of open source software developers to answer the question: are women at a disadvantage because of who they are, or because of what they do? Results show that gendered behavior is related to outcomes, and that female-like behavior has a negative effect on success and survival.
Student Trajectories and School Choice in the New York City Public School System
The NYC Department of Education (DOE) granted us access to student-level data, which allowed us to explore student trajectories through the public school system and the recently implemented high school choice system. These explorations gave us insights into which students are more likely to leave the public school system, and the chance that a student is accepted to his or her top-choice high school.
Meal and Speech Assistance Device Prototyping for Neuromuscular Disease
Children with cerebral palsy commonly have feeding disorders and swallowing problems that in many instances place them at risk for aspiration with oral feeding, with potential pulmonary consequences. By using targeted vibration therapy on the maxillofacial region, patients may be able to regain control of their facial muscles. This would allow them to eat and speak independently.
A Message-Passing Parallel Algorithm for the Steiner Forest Problem
We propose a message-passing parallel algorithm for the Steiner Forest Problem which guarantees the same approximation as its sequential counterpart. This is the first parallel approximation algorithm proposed in the literature for solving the problem on message-passing systems. We implement and run the algorithm on an MPI system and perform an extensive experimental analysis.
Understanding Password Use by People with Vision Impairment: Initial Results of a Survey
While remembering passwords and avoiding attacks pose challenges to all users, they are particularly challenging for individuals who are blind or have low vision. We conducted an online survey with 325 vision-impaired users about their mobile digital security. We compare our survey results to answers from sighted users and share our insights about password accessibility on mobile devices.
LapseAnalyzer: A Web-Based Visual Analytics Tool for Aiding Smoking Cessation
We developed LapseAnalyzer, a web-based visual analytics tool to support complex analysis of large physiological datasets collected from smokers via wearable devices. We believe visualizations offered by LapseAnalyzer will advance understanding of the complexities surrounding design of intelligent predictive models to aid smoking cessation.
Combating Imbalanced Data with Generative Adversarial Networks
Large, labeled, and balanced datasets are essential for optimal performance in Deep Learning. However, creating such datasets is tedious. This work attempts to combat one such issue: imbalanced data across classes of a dataset. Generative Adversarial Networks and Transfer Learning have been combined and used on an imbalanced dataset to create synthetic data for the less-represented classes.
Predicting Short Term Water Consumption for Multi-Family Residences
Smart water meters have been installed across Abbotsford, British Columbia, Canada, to measure the hourly water consumption of consumers in the area. Using this water consumption data, we develop machine learning models to predict water consumption for multi-family residences in the city of Abbotsford.
Towards Detecting 'Fake News' by Propagating Bias Through a Network of Linked Assertions
We aimed to create a tool to aid users in identifying reliable news. This study involves creating software that processes a collection of articles and identifies assertions made in each piece that entail or contradict assertions made in others. We obtain an assertion network by linking documents according to the assertion pairs. From this, our system will report whether an article is trustworthy.
Intelligent Selection of Features Using Reinforcement Learning
To enhance molecular dynamics simulations on supercomputers, we present a reinforcement learning based sampling algorithm that aims to sample protein landscapes faster than conventional simulations by identifying features that are relevant for sampling. As the simulation evolves, the algorithm rewards sampling along important features and disregards others that do not facilitate exploration.
A Language Learning App Empowered by Deep-Learning-Based Speech and Handwriting Recognition
Motivated by an interest in machine learning and language learning, we developed an Android application called DeepLang that uses deep learning to help users learn a language. The user can test their Japanese or Chinese writing skills in the handwriting activity, and can test their pronunciation in over twenty different languages in the speech activity.
Performance-Driven Path Selection
RouteScout is a system that facilitates performance-aware routing in the Internet. RouteScout’s novel data plane enables an ISP to monitor performance across the route choices for each destination at line rate and with a low memory footprint. RouteScout’s control plane provides responsiveness to congestion incidents and failures by exploring and exploiting the most performant routes.
Towards a Resilient SDN Control Plane
Network management is a complex task given the growing scale of today’s networks. SDN was introduced to simplify this management by separating the control and data planes. However, with a centralized control plane, the resilience of the SDN control network becomes critical. We therefore propose a data plane resilient routing mechanism to maintain the connectivity of SDN controllers and switches.
Robotic and Neurosurgical Instrument Segmentation for Development of Intelligent Surgical Assistant
As a step towards development of an Intelligent Surgical Assistant, surgical instruments need to be tracked continuously. For this, an instrument tracking algorithm was developed and tested on two datasets: robotic and neurosurgical instruments. Instrument segmentation was performed using color filtering and border constraints in combination with features including blurriness, shape and disparity.
Experimental Multi-Party Computation on Real Data Using SPDZ
Secure Multi-Party Computation has the power to revolutionize computation between parties holding private data. However, theoretical protocols are not scalable. This project evaluates the arithmetic circuit based SPDZ protocol against other MPC protocols on machine learning algorithms in a realistic distributed setting, and demonstrates its potential to be used in real-world scenarios.
Predict Privacy and Security Risk for Product Features
Privacy and Security (PrivSec) incidents cost companies several billion dollars each year. With the increasing number of software products being developed in an agile and continuous release cycle, it is difficult to comply with all PrivSec standards. We adopt a machine learning approach to predict PrivSec risk for product features in development using engineering project tracking data.
Thematic Mapping of Cyber Security and Cyber Security Risk: Expert Elicitation of Researchers and Practitioners
Given the interdisciplinary nature of cyber security research, it is important that researchers from disparate disciplines share a common understanding of what is meant by cyber security and cyber security risk. In an attempt to identify this common understanding, researchers applied thematic analysis on interviews with cyber security experts in academia and the U.S. Army.
Accelerating Poly1305 Cryptographic Message Authentication on the z14
In this paper, we examine the implementation of the Poly1305 authentication algorithm on the IBM z14 computer. By restructuring the algorithm, we improve the performance of multiplication and reduction, two basic building blocks of many cryptographic operations. These techniques can be applied beyond Poly1305, in other algorithms, such as in the signature algorithms used in HyperLedger Blockchain.
HexCFI: Context Aware Precise Dynamic Control-Flow Integrity
Control-Flow Hijacking is one of the main attack vectors in the modern world. Control-Flow Integrity (CFI) is a defense mechanism which can mitigate such attacks. However, due to their dependency on static analysis, state-of-the-art CFI implementations cannot guarantee complete security. We present HexCFI, a novel technique that avoids the dependency on static information and can guarantee better security.
Beyond Blocking: Instrumenting Simple Risk Communication for Safer Browsing
User-centered solutions to phishing have included education, warnings, and blocking. We have developed a tool which integrates personalized blocking and risk communication. We report on a four-week in situ trial showing acceptability and usability of the tool, as well as a laboratory experiment showing efficacy. Participants in our study stated that the tool was simple, non-intrusive, and intuitive.
Security Metrics that Matter
As cost centers, information security departments struggle to show their value via traditional metrics because standard measurements generally do not resonate well with the C-suite, who rely on hard facts to make their decisions. Effective and clear security metrics that matter provide leadership with tangible evidence that their security program reduces risks and is aligned with business goals.
Hardening IoT Cybersecurity Through Device-Side Solutions
With the proliferation of IoT devices comes a new set of security risks and challenges. We will discuss how technology-driven, device-side solutions can be beneficial for hardening IoT security. We will cover IoT security risks from device, user, and connectivity standpoints and how a tiered security framework can cater to the diverse IoT market to meet its varying cost and security needs.
Thursday, September 27, 11:30 a.m.—2 p.m.
Analyzing Billboard Music Trends in Twitter Using Spotify, Amazon AWS and Big Data
Social media platforms are omnipresent in today’s world. But how can this vast wealth of information on social media be used to improve users’ well-being? Our team explored how music trends on Twitter can be analyzed to understand users’ sentiment and build better recommendation systems that improve users’ mood using Spotify, Billboard, and AWS. #TechnologyForGood
SamBaTen: Sampling-based Batch Incremental Tensor Decomposition
Tensor decompositions are invaluable tools in analyzing multimodal datasets. In real-world scenarios, datasets tend to grow over time. How can we maintain a valid and accurate tensor decomposition without having to recompute it? We introduce SamBaTen, which incrementally maintains the decomposition given new updates to the tensor. We extensively evaluate SamBaTen and show that it achieves comparable accuracy.
DeepFP: A Deep Learning Framework for User Fingerprinting via Mobile Motion Sensors
In this paper, we propose DeepFP, a deep learning framework for user fingerprinting via mobile motion sensors, which can identify and track users based on their behavioral patterns while interacting with the smartphone. It exploits metric learning techniques and does not need to be retrained to identify new users, which makes it feasible to use in real-world scenarios with a huge number of users.
How to Become an Open Source Contributor With Any Skill Set
As software matures through combined efforts of passionate contributors worldwide, the open-source community continues to thrive on the principle of using great minds to solve great problems. The most empowering part of the open-source revolution is that anyone can contribute. This workshop will cover the tools and resources you need to become an open-source contributor, with any skills you bring.
Getting Started With the Camel REST DSL
REST services are becoming an increasingly popular way of connecting devices with the cloud as well as systems to each other. With Apache Camel, you can write REST services more easily and quickly using the REST DSL. This poster will walk attendees through developing their first Camel route using the REST DSL.
Rapidly Exploring Random Search Explorer
Motion planning for robots has generally worked with a single goal, but real-world problems most often involve multiple goal positions. In this work, we propose the Rapidly Exploring Random Search Explorer (RESE) algorithm, which works for multi-goal scenarios and yields better and more realistic robot performance in scenarios where multiple goal positions are essential.
Recognition of Human Object Manipulation Actions in Human-Robot Interaction
Understanding human action is a key component in human-robot interaction. The robot should be able to comprehend the human action and respond appropriately. Here, we focus on recognition of manipulation actions and propose a method to decompose their signals into a set of primitives. The sequence of primitives within the actions is used to train a sequential model for recognition.
An Analysis of Heat Indices Used for Quantifying Heat-Health Risk
Extreme heat is the deadliest weather-related condition in the United States. It is important to investigate the relationship between heat and human health as climate change will increase the frequency, duration, and intensity of heatwaves. To assess the usefulness of heat stress indices in identifying episodes of hot weather that are detrimental to human health, we examined the relationship between Heat-Related Dispatch calls and heat indices in three U.S. cities.
Developing Expressive User-Interfaces for Socially-Assistive Telepresence Robots
Developing an effective user interface is essential in the design of socially-assistive telepresence robots. This research focuses on creating interfaces for connecting human input to robots that can help remote students learn and socialize with classmates during extended absences. The findings of this research may be useful to others in the field of human-robot interaction.
Investigating Human Comfort with Unmanned Aerial Vehicles in Different Sized Rooms
This project investigates how people react to unmanned aerial vehicles (UAVs) when approached. Prior work has looked at UAVs at different heights to assess comfort levels. We extended this work with different room sizes to see how these variables would affect the participants’ comfort. These results will contribute to understanding of social, collaborative, and assistive robots, allowing people to be more comfortable integrating drones into society.
Investigating Physiological Synchrony in Paramedic Trainee Dyads
The goal of this research is to investigate the physiological synchrony between paramedic trainee pairs in training situations. Understanding physiological synchrony (i.e., the unconscious, dynamic linking of physiological responses such as heart rate and electrodermal activity) in working dyads is important because it can affect the dyad’s fatigue and stress levels and its performance in carrying out life-saving tasks. While moderate stress can improve cognitive performance, severe stress can reduce fine motor performance and attention. Individuals’ physiological responses have been linked to several affective and mental states, such as arousal and cognitive load. The use of physiological signals, particularly electrodermal activity (EDA), to monitor the stress response in human behavior has several additional advantages. For instance, physiological responses provide us with a continuous measure, and the emerging generation of wearables allows us to collect data non-invasively and unobtrusively. In the context of teams or pairs, physiological synchrony has given us insight into human-human interaction, such as level of social engagement, coordinated behavior, and team performance. We propose to extend this line of work by looking at what EDA data can tell us about pairs working in real-life, high-stakes situations, as opposed to social situations or manufactured, low-stakes activities. Our study monitors the EDA of paramedic trainees as they work in realistic simulated emergency situations. Using the Empatica E4 wristband, we have collected EDA data from eight team-training simulations (n=16). We plan to extract features that will give us insight into the physiological synchrony of the pairs. Based on prior work, we will look at individual features of the EDA data (e.g., minimum, maximum, average, STD, number of peaks, and peak amplitude) as well as team-level features (differences in numbers of peaks, and canonical correlation).
From initial analysis, we have already noticed a level of correlation between team members’ EDA in a significant number of the sessions. We hope that this study will give us better insight into how the physiological responses of two people working together can affect both their own and their teammates’ stress level and task performance.
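The individual and team-level features listed in this abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the function names, the simple local-maximum peak rule, and the made-up traces are not from the study, and a plain Pearson correlation stands in for canonical correlation (for two one-dimensional signals the canonical correlation equals the absolute Pearson correlation).

```python
from statistics import mean, pstdev


def find_peaks(signal):
    """Indices of simple local maxima (strictly above both neighbours)."""
    return [i for i in range(1, len(signal) - 1)
            if signal[i - 1] < signal[i] > signal[i + 1]]


def individual_features(signal):
    """Per-person summary features of one EDA trace."""
    peaks = find_peaks(signal)
    peak_values = [signal[i] for i in peaks]
    return {
        "min": min(signal),
        "max": max(signal),
        "mean": mean(signal),
        "std": pstdev(signal),
        "n_peaks": len(peaks),
        "mean_peak_amplitude": mean(peak_values) if peak_values else 0.0,
    }


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def team_features(sig_a, sig_b):
    """Pair-level features comparing two teammates' traces."""
    fa, fb = individual_features(sig_a), individual_features(sig_b)
    return {
        "peak_count_diff": abs(fa["n_peaks"] - fb["n_peaks"]),
        "correlation": pearson(sig_a, sig_b),
    }


# Made-up EDA traces (hypothetical values, for illustration only)
trainee_a = [0.2, 0.5, 0.3, 0.6, 0.9, 0.4, 0.3]
trainee_b = [0.1, 0.4, 0.2, 0.5, 0.8, 0.3, 0.2]
features = team_features(trainee_a, trainee_b)
```

Real EDA pipelines usually filter the signal and apply amplitude thresholds before counting peaks; the sketch omits both.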
Eye Tracking to Verify Users Based on Eye Movements
Each person’s eyes move in a unique set of movements called saccades and fixations. Saccades are uncontrollable movements as a person’s eyes move between two points and each person has a unique saccadic movement. Eye movements can be used as a biometric security feature because people can be identified and verified using eye tracking technology and machine learning.
Semantic Segmentation for Melanoma Detection
There are over 5,000,000 newly diagnosed cases of skin cancer in the United States every year. Using a novel convolutional neural network architecture coupled with semantic segmentation methods, we make automated predictions of lesion segmentation boundaries within dermoscopic images. This work will be submitted to ISIC 2018, with the ultimate goal of enabling the automated diagnosis of melanoma.
Metrics for Efficient Exploration of Narrow Passages in Robot Motion Planning
Motion planning algorithms find safe paths for robots, but state-of-the-art sampling-based planning methods have difficulty in narrow regions. In response, this poster presents metrics to increase the efficiency of exploration in narrow passages. In particular, the metrics are applied to dynamic region sampling, a method that encodes the environment topology to guide exploration.
Parents’ Perspectives on the Burdens and Benefits of Digital Home Assistant Use by Children
Parents receive conflicting opinions about benefits and burdens of children’s technology use, especially new technologies such as digital home assistants. To understand parents’ views, we analyze product reviews, deploy home assistants to families, and conduct surveys and interviews. This study contributes a family-centered understanding of and design opportunities for digital home assistants.
How to Write Effective Programming Questions for Automated Grading Tools
We are studying whether scaling question difficulty in automated grading tools will lead novice programmers to learn more effectively and efficiently. We expect that scaling complexity from lower to higher orders of Bloom’s taxonomy will produce the most effective and efficient learning and that students will perceive the autograding tool more positively.
Spoken Dialogue Systems for Effective Medication Management
Over the past few years, interest in spoken dialogue systems has grown rapidly, including in the field of health care. There is a growing need for automated systems that can not only interview patients about their current health and medications using naturalistic interaction, but also help health providers care for their patients. After all, a well-crafted dialogue system with state management can allow the distribution of tailored messages to users based on the inferred goals and beliefs of the user in the present moment. In this project, we explore the use of a custom dialogue system, built with PyDial, that provides drug product and prescription information to users.
Ms. PacMan vs. Ghosts
In the summer of 2017, my team and I were unbelievably lucky to be offered a spot in the CREU program. Our goal was to create an artificially intelligent controller that would select an optimal path, ensuring a win, using current knowledge of the game state in a game of Ms. PacMan. We achieved this by implementing and testing several AI techniques such as neural nets, genetic algorithms, and more!
Computer Scientists have expressed their enthusiasm for the aesthetic possibilities of fractals, cellular automata, and genetic programming. Both L-Systems and cellular automata use a system of rules or mathematical functions that can be interpreted visually. These functions can be manipulated and altered with a genetic algorithm to form individual representations of abstract data. Our research aims to set standards for artistic merit by including a larger audience in order to establish itself as genuine.
Our project explored the possibility of generating art through algorithms with visual outputs, and whether they can be interpreted by the general populace as holding artistic merit. Two main algorithms were explored: cellular automata and L-system fractals. Rulesets were formed for each and used to generate randomized images that could be considered to be created entirely by the algorithms. These images were displayed in a survey to participants designed to help quantify their aesthetic sense. Using these responses and concepts from machine vision, two aesthetic metrics were formed for cellular automata and L-system fractals. The aesthetic metrics use a modified chi-square metric to rate images from most aesthetically pleasing, 1.0, to least aesthetically pleasing, 0.0. These scores were used in a genetic algorithm to produce unique images that scored well with our aesthetic metric. Sixteen images created like this were displayed in the Hendrix Library for participants to view, and included write-ups of the process behind them. Ultimately, our poster will lead the viewer through the pieces of our project and display visual representations of the results. Key aspects include: introduction to cellular automata and L-systems, the rulesets used to create them, the aesthetic metric used to judge them, the machine vision concepts used to see them, responses from an audience of our results, our modified aesthetic metric, and our ultimate genetic algorithm used to create “the most aesthetically pleasing” results.
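For readers unfamiliar with the two rule-based systems named above, here is a minimal sketch of how a ruleset drives each one. The function names are hypothetical, and the example rules (elementary rule 90 and the textbook "algae" L-system) are standard illustrations, not the rulesets used in the project.

```python
def ca_step(cells, rule):
    """One step of an elementary (1-D, two-state) cellular automaton.

    `rule` is the Wolfram rule number (0-255); the row wraps around.
    """
    n = len(cells)
    out = []
    for i in range(n):
        left, centre, right = cells[(i - 1) % n], cells[i], cells[(i + 1) % n]
        neighbourhood = (left << 2) | (centre << 1) | right  # value 0..7
        out.append((rule >> neighbourhood) & 1)  # look up that bit of the rule
    return out


def lsystem(axiom, rules, iterations):
    """Rewrite every symbol in parallel `iterations` times."""
    s = axiom
    for _ in range(iterations):
        s = "".join(rules.get(ch, ch) for ch in s)
    return s


row = ca_step([0, 0, 1, 0, 0], rule=90)            # rule 90: XOR of neighbours
word = lsystem("A", {"A": "AB", "B": "A"}, 3)      # A -> AB -> ABA -> ABAAB
```

Stacking successive rows of the automaton, or interpreting the L-system string as drawing commands, yields the kind of image an aesthetic metric could then score.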
Driving Interactive Graph Exploration Using 0-Dimensional Persistent Homology Features
This poster was presented at the IEEE Visualization 2017 conference, the most competitive conference in my field, where it was nominated in the top 6 posters for InfoVis. Here is a PDF: https://bit.ly/2Lb3LPa Although presented in 2017, we have done further work and tested our software on several new datasets and submitted a paper for this project to the 2018 Visualization conference. Our poster briefly explains Persistent Homology, a tool from Topological Data Analysis, which we use to study the 0-dimensional features (connected components) of a graph. With our technique, we reveal new structures found in 15+ common real-world and synthetic datasets through an interactive approach using force-directed layouts. Our software provides a “persistence barcode”, where a user can select a single or multiple bars to either “repulse” or “contract” substructures of the graph, which are ranked in order of lowest to highest “persistence”. The next step in our project will include an additional tool to identify and rank the 1-dimensional features (holes, or cycles) of a graph as part of our suite of interactive design tools using Persistent Homology. I am continuing this project during my first year of graduate school, where I am pursuing a PhD at Tufts University. Thank you for your consideration!
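As a rough illustration of what a 0-dimensional persistence barcode of a graph records, the sketch below uses a union-find structure. It assumes a filtration in which every vertex is present from value 0 and edges enter in order of increasing weight; the poster's actual filtration and software may differ, and the function name is hypothetical.

```python
def zero_dim_barcode(num_vertices, weighted_edges):
    """Return (birth, death) bars for connected components.

    weighted_edges: iterable of (u, v, weight) with vertices 0..num_vertices-1.
    """
    parent = list(range(num_vertices))

    def find(v):
        # Find the component representative, compressing the path as we go.
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    bars = []
    for u, v, w in sorted(weighted_edges, key=lambda e: e[2]):
        ru, rv = find(u), find(v)
        if ru != rv:
            # Two components merge at weight w, so one of them dies here.
            parent[ru] = rv
            bars.append((0.0, w))
    # Components that never merge persist forever.
    survivors = len({find(v) for v in range(num_vertices)})
    bars.extend((0.0, float("inf")) for _ in range(survivors))
    return bars


edges = [(0, 1, 1.0), (1, 2, 2.0), (0, 2, 3.0), (2, 3, 5.0)]
bars = zero_dim_barcode(4, edges)
```

The edge (0, 2) creates a cycle rather than merging components, so it contributes no 0-dimensional bar; cycles are the 1-dimensional features mentioned as future work.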
Towards Detecting 'Fake News' by Propagating Bias through a Network of Linked Assertions
This study involves creating software that processes articles and identifies assertions made in each piece that entail or contradict assertions made in others. We obtain an unbiased assertion network by linking the documents according to the assertion pairs. We allow a user to label stories as biased positively or negatively. Our system will ultimately report which articles are trustworthy or not.
Acoustic Analysis of American English Child-Directed Speech
In this project we assess the acoustic properties of child-directed speech (CDS) and normative adult speech using the long-term average speech spectrum to help inform training and testing practices for automatic speech recognition systems. Normative speech data was collected from a subset of the VoxForge speech corpus, while CDS data was obtained from a subset of the American English CHILDES database.
Identifying and Correcting for Bias in Ranking
Ranking is a popular method used to compare complex items such as hospitals, universities, and job candidates. However, ranking sometimes demonstrates evidence of discriminatory bias. We use recent work in algorithmic fairness to identify bias in ranking and consider post-processing approaches to correct an unfair model.
LLE Based Perceptual Image Hashing of Video
We investigated perceptual image hashing on video data by looking at video information that is visually similar in appearance. A perceptual image hash is a digital fingerprint obtained from an image or video data based on its visual appearance. We created a new method of perceptual image hashing that incorporates Locally Linear Embedding (LLE) and Discrete Cosine Transform (DCT).
Revisiting the Effectiveness of Competition (Programming Contests) in Introductory Programming Courses
The improvement of academic performance in introductory programming classes is an ongoing mission of the curriculum committee in the Computer Science Department at Winston-Salem State University. Across the country every year a high volume of students fail their introductory programming courses, or complete them with substandard knowledge. This project seeks to address that problem by revisiting an intervention that was previously introduced but not quantitatively studied for significant results. This intervention introduces competition via programming contests in the courses as a means of assessing students’ problem-solving and programming skills and its influence on academic achievement. Why revisit the intervention now? During the last academic year, while actively involved in another CREU research project, we noticed on several in-class assignments that the students started to compete over who wrote the best program. There were no tangible incentives involved. Later, we observed that the test scores of students who had previously done poorly had improved. At first it was thought that this was just a coincidence, but it happened again in another section of the course. Therefore, we believe the concept is worth revisiting. These observations are the motivation for this project.
A Gaze-Based Exploratory Study on the Information Seeking Behavior of Developers on Stack Overflow
We present an exploratory study on how developers seek information on Stack Overflow. We use iTrace, an Eclipse plugin, to collect eye-tracking data. We chose the task of code summarization because our goal is to learn how developers summarize code. Participants completed the tasks correctly when they spent more time reviewing both the source code and Stack Overflow.
Thursday, September 27, 2:30—5 p.m.
Atom Mapping of Chemical Reactions: A Machine Learning Approach
The atom mapping of a chemical reaction encodes the underlying reaction mechanism, information essential in drug design studies. We present a computational method for automatically deriving the atom mapping which uses machine learning in order to incorporate the chemical context in the computational problem. We show that compared to existing tools, our method computes more accurate mappings.
Analysis of Population Regulatory T Cells in Tumor Tissue
Regulatory T cells are cells whose main role is to suppress the immune response. We analyze in depth different cells obtained through single-cell RNA-seq to identify new biological phenomena. Applying machine learning methods, we distinguish six major cell types and obtain a gene interaction network with the best candidate genes for therapy, the targets being those that are highly connected.
Dental care is the most expensive health service in the U.S. We extracted Tweets about dental care affordability and used NLP based feature engineering to train ML models which predict a proxy for a given user’s level of dental insurance. To improve health outcomes, in the future we will use our classifier to supply people in need with locations of nearby low cost resources and clinics.
REXTAL: Regional Extension of Assemblies Using Linked-Reads
It is currently impossible to get a complete de novo assembly of segmentally duplicated genome regions using genome-wide short-read datasets. We devise a new computational method called REXTAL for improved region-specific assembly of segmental duplication regions, leveraging genomic short-read datasets generated from large DNA molecules partitioned and barcoded using Gel Bead Emulsion microfluidics.
ViroLog: Viral Genome Analysis to Uncover New Battlefronts in Immunity
ViroLog is a comprehensive virus genome database and interactive user-interface focused on viral copies of host genes and interactions. Following the extensive data analysis and interrogation of ~7500 viral genomes, our tool presents new insights to any viral protein related to host proteins integrated with gene expression data, cellular localization, and disease-associations for respective genes.
Computing Regional Myocardial Radiation-Dose Response in Left-Sided Breast Cancer Patients
Breast cancer treatment often involves radiation therapy (RT) to reduce risk of recurrence and improve survival. Inadvertent radiation exposure to the heart often occurs, resulting in increased risk for cardiac events at five to 15 years post-RT. Our study aims to correlate regional radiation dose with sub-clinical, regional changes in heart function at early time points to assess cardiac tissue dose-response.
Beyond Basics: Coding Workshops for Middle School Girls
Coding skills are not taught in middle schools in rural upper Michigan. This combines with traditional stereotypes to keep girls from learning about or developing an interest in computer science. We developed a curriculum to teach middle school girls programming skills through AspireIT workshops. Through these workshops middle school girls eagerly learned challenging topics as well as basics.
The Role of Culture in Elite Tech Companies
This poster presents cultural fit as a key hiring criterion for elite tech companies. Drawing from 30 interviews with employers and intern candidates, I show that employers seek interns who are not only technically competent, but also fit culturally in terms of communication styles. I conclude by discussing the implications for scholarship on inequality and labor markets.
Inclusive Leadership: Building a Nucleus for Change at Intel
How does a company the size of Intel bring inclusion to life for every employee? Equipping managers to lead inclusively is a critical component – often we look to senior leaders to change a culture but middle managers make change stick. We’ll share insights gained from the program & lessons learned along the way to enable others to accelerate sustainable change in their own organizations.
Using a Game to Teach Basics of DDoS Attacks
This poster presents a 3D game that aims to help students understand the basics of Distributed Denial of Service (DDoS) attacks. The game is designed to be played in less than ten minutes per topic. A pre-test and a post-test are built into the game. Students will take an online survey to share their feedback after playing the game. The game will be used in several computing classes.
Bringing Computer Science to High School Students and Teachers: Assessing the Impact of CyberPDX
Too often high schools fail to integrate computer science. There are many reasons, including lack of teacher preparation, unequal access to computers, and inconsistencies in curriculum. CyberPDX, a camp hosted by Portland State University, aims to address these issues. I present my work to assess long-term impacts of CyberPDX on attendees’ college and career decisions.
A Guideline for the Effective Development of Applications for Higher Mobile Learning
This research project aims to redefine m-learning and create a design guideline for its effectiveness. We offer the community a new perspective: m-learning is not just an interface for learning but an interactive teaching tool. Our design guideline compiles the aspects that must be included in a higher m-learning application for the best results in students’ learning experiences.
Enforcing Positive Pair Programming by Shattering Student Perceptions
Students may dread intro course pair programming despite it resulting in valued skills, better code, and higher productivity. We explore theories on why intro students do not always buy in to pairing, and discuss differing goals pair programming has from other group work. We look at ideas for showing its positive value to students, and methods that can measure whether these favorably impact views.
The Relationship Between Implicit Gender-Science Bias and Plan to Major in Computer Science
We investigated the effect of gender-science implicit bias on students’ likelihood of majoring in computer science. In our preliminary study, undergraduates in Intro to Computer Science classes took the Implicit Association Test for gender-science and rated their likelihood of majoring in CS. We found that females with lower male-science implicit bias were more likely to major in computer science.
STEM Apps: Developing Mobile Applications for Learning
This research project focuses on developing mobile apps to enhance the learning experiences in STEM courses. Our first app, Biology Dogma, is a cross-platform DNA matching game with random quizzes that received positive results from UPRA students and faculty. Our plans include testing in other universities, to continue developing the project and inspire new apps for the STEM academic community.
Information Extraction from Document Images Using Synthetic Data and Conditional Random Fields
We present an end-to-end framework for field-type extraction in tax forms using synthetic data and conditional random fields (CRF). The synthetic data generator learns historical data distributions and generates data by sampling, while the CRF learns to extract field types. Our trained CRF achieves state-of-the-art test set performance (> 95% F1) and respectable results (70-90% F1) on production images.
Surfacing Hidden User Data: Multi-Step Data-Science Approaches Using NLP Methods
Missing data is a critical problem for data scientists. Although a common problem, it can lead to invalid analysis and predictions, as well as a degraded user experience. The following case study of how we used data science to address this unsupervised problem at Intuit offers insights on dealing with the larger issue of missing data for other applications in the industry.
Peer to Peer Hate: Perceived Demographics of Hate Instigators and Their Targets
While social media has become an empowering agent to individual voices, it also facilitates anti-social behaviors including hate speech. In this poster, we present the first comparative study of hate speech instigators and target users on Twitter in terms of their profile demographic and online visibility. Our results advance the state of the art of understanding online hate speech engagement.
A Joint Sparse Classification and Clustering Approach with Applications to Healthcare Predictive Analytics
This work develops a novel method for a type of classification problem in which the positive class is a mixture of multiple clusters and the negative class is drawn from a single cluster. We employ an alternating optimization approach with double regularization. The experimental results on a real dataset demonstrate successful cluster detection and more accurate classification than alternatives.
FFWD: Delegation is (Much) Faster Than You Think
We revisit the question of delegation vs. synchronized access to shared memory, and show that delegation can be much faster than locking. We propose fast, fly-weight delegation (ffwd), a highly optimized design of delegation that allows it to significantly outperform prior work, while retaining the scalability advantage. In micro-benchmarks, we see improvement in the 5–10× range.
Universal Support for Scoped Memory Access Instrumentation
To use memory features like persistence, transactions, and encryption, program loads and stores require custom instrumentation. Implementing instrumentation by hand or through feature-specific compiler extensions leads to many known problems. We introduce an abstract memory instrumentation system for code regions in C/C++ that is modular, fast, and easy for both programmers and researchers to use.
Using Discrete-Event Simulation for Evaluating Distributed Systems
Simulation is a useful tool for the evaluation of distributed system performance. We explore strategies to use when designing and developing a simulator for a multi-tier distributed system. We illustrate this by showing a distributed storage system simulated in SimPy, an open-source Python library for discrete-event simulation.
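The core idea of discrete-event simulation, advancing a clock from one scheduled event to the next, can be sketched with only the Python standard library. This is not SimPy and not the poster's simulator: the single-server FIFO storage node, the fixed service time, and the function name are all assumptions chosen to keep the example small.

```python
import heapq


def simulate(arrival_times, service_time):
    """Simulate one FIFO storage node; return completion time per request."""
    # The event queue is a heap of (time, request_id, kind) tuples.
    events = [(t, i, "arrive") for i, t in enumerate(arrival_times)]
    heapq.heapify(events)
    server_free_at = 0.0
    completions = {}
    while events:
        now, req, kind = heapq.heappop(events)  # jump to the next event
        if kind == "arrive":
            # Service starts when the request arrives or the server frees up.
            start = max(now, server_free_at)
            server_free_at = start + service_time
            heapq.heappush(events, (server_free_at, req, "done"))
        else:
            completions[req] = now
    return completions


# Three requests arriving at t = 0, 1, 1.5 with a 2-unit service time
done = simulate([0.0, 1.0, 1.5], service_time=2.0)
```

A library such as SimPy wraps this same event loop in generator-based processes and shared resources, which makes multi-tier systems easier to express.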
Environmental Engineering and Environmental Information Technology, including Renewable Energy and Green Technology
Efficient Computations of Flooding Scenarios for the Coast of Maine
This research aims to provide more efficient modeling of flooding scenarios by algorithm development utilizing FZG and DEM LiDAR geospatial data sets to automate the NOAA protocol. The research was able to produce results in a fraction of the time required by ArcGIS geospatial platform with increased facility to manipulate variables.
A Pareto Max Flow Based Solution for Urban Pollution Control
Long-term exposure to air pollution has proven to have adverse effects on human health. To improve the air quality on highly congested roads in cities, we develop a Pareto max flow algorithm which maintains maximum vehicular flow while distributing the flow through the city, based on existing levels of pollution. Experiments were performed using the traffic simulator SUMO on the New York map.
SPEED: Self-Serve Prepaid Emissions-Free Electricity for Developing Regions
Today more than 1 billion people have limited access to electricity. To address this critical issue, we have developed an accounting system that enables micro-entrepreneurs to store, sell, and account for small amounts of energy at affordable prices, recouping their initial investment over a months-long payback period. The system also automates fault diagnostics.
Improving Recurrence Prediction in Breast Cancer by Coupling Machine Learning Algorithms
Breast cancer is the most common cancer in women worldwide, with nearly 1.7 million new cases diagnosed every year. While many women are treated with chemotherapy to reduce the risk of recurrence, Deep Learning can better predict women at risk. Based on our results, we conclude that Deep Learning should be used in biomedical research even with training sets of moderate size.
Predictive Modeling of Longitudinal Data for Alzheimer Disease Diagnosis with the Application of the RNNs (LSTM and GRU)
Early diagnosis and accurate prognosis of Alzheimer’s Disease (AD) play a significant role in patient care. However, mining this data effectively is a challenging task, owing to heterogeneous measurements, varying sample lengths, and missing modalities. This work outlines a pipeline to employ RNN models to distinguish AD samples by analyzing longitudinal data instead of cross-sectional data.
Image Processing for Intelligent Traffic Lights
The project aims to work with a part of this system in order to develop an autonomous and integrated traffic light system capable of detecting and processing traffic flow, controlling the traffic lights using an image processing algorithm that counts the vehicles.
MonoLoco: WiFi Localization and Orientation with a Single Access Point
Decimeter-level WiFi localization has become a reality due to the ability to eliminate the effects of multipath interference. We demonstrate the ability to use reflections to enhance localization rather than throwing them away. We present MonoLoco, the first decimeter-level WiFi localization system that requires only a single receiver and does not impose any overhead beyond standard WiFi protocols.
Room Occupancy Detection for Study Spaces and Client/Server Support for a Mobile Application
A Raspberry Pi Zero W is augmented with a passive infrared sensor to detect movement. Each Raspberry Pi unit is installed in a study room ceiling with an exposed sensor. RSA-encrypted messages are sent between the Pi and a custom database located on a Linux server. A SQL database maintains a table of room numbers and occupancy statuses to update a virtual map displaying study room statuses.
Friday, September 28, 9—11:30 a.m.
Overview of VESA Display ID 2.0: Enumerating the Future Display Monitors/TVs Beyond HDR & 4K capabilities
The Extended Display Identification Data (EDID) has been an industry standard defined by VESA since 1994, enumerating 128 bytes of metadata about the Display Panel. In the year 2000, it was enhanced by E-EDID. The “Display-ID” is a new design created to replace EDID and E-EDID to suit both consumer electronics and PC monitors and aimed at futuristic displays beyond 4K resolution & HDR colors.
Sharing is Caring: Scalability and Shared Tech in Games
What do cross-platform games, web mega-hits like FarmVille, and social activity on PlayStation Network have in common? They have all faced fascinating challenges of scalability and code reuse! This presentation will cover examples from the games industry of massive scalability challenges — what worked and what went wrong — from the perspective of a shared technology owner.
Data Science for Brain Computer Interfaces
Data Science is an exponentially growing field in today’s technological arena. Its capabilities are seen in commercial settings such as big data and data mining. It holds tremendous untapped potential for use in the biomedical arena for social causes. This work deals with using Data Science to create Brain Computer Interfaces for training autistic people through neurofeedback using EEG datasets.
Improving Diversity in Recommendations Using Submodular Function Optimization
Recommendation systems have to balance between relevance and diversity to deliver a good user experience. While relevance is somewhat well understood, diversity is harder to quantify. We address the issue by formulating personalized recommendation as a submodular function optimization problem. Specifically, we use the submodular framework to re-rank and select from the set of recommendable items.
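One common way to make the relevance and diversity trade-off concrete is greedy selection under a submodular objective. The sketch below assumes an objective of summed relevance plus a weighted count of distinct categories covered, which is monotone submodular; the objective, the function name, and the toy items are illustrative assumptions, not the poster's formulation.

```python
def greedy_rerank(items, k, diversity_weight=1.0):
    """Pick up to k item ids by greatest marginal gain.

    items: list of (item_id, relevance, set_of_categories).
    """
    selected, covered = [], set()
    remaining = list(items)
    while remaining and len(selected) < k:

        def gain(item):
            # Marginal gain: relevance plus newly covered categories.
            _, relevance, cats = item
            return relevance + diversity_weight * len(cats - covered)

        best = max(remaining, key=gain)
        selected.append(best[0])
        covered |= best[2]
        remaining.remove(best)
    return selected


# Hypothetical catalogue: two highly relevant items share one category
catalogue = [
    ("a", 0.9, {"rock"}),
    ("b", 0.8, {"rock"}),
    ("c", 0.5, {"jazz"}),
    ("d", 0.4, {"folk"}),
]
diverse = greedy_rerank(catalogue, k=3, diversity_weight=1.0)
relevance_only = greedy_rerank(catalogue, k=3, diversity_weight=0.0)
```

With the diversity term switched off the list is sorted purely by relevance; with it on, the second "rock" item is displaced by items from uncovered categories.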
Analyzing Bias in Object Detection Data Sets
This poster presents the current bias among popular object detection datasets and explains the increasingly pertinent research behind decreasing bias while maintaining the precision of machine learning models. The goal is to create models that can learn to predict with smaller data sets, which greatly increases the adoption of the models while decreasing the time to market.
A Data-Driven Approach for Detecting Autism Spectrum Disorders
Our project aims at creating new protocols and developing data-driven approaches for early and efficient detection of Autism Spectrum Disorders (ASD) in children by devising machine learning-based prediction models using data from multiple sensors that capture a subject’s response to eight stimuli as opposed to current methods that are subjective or focus on responses to a single stimulus.
End-to-End Trained CNN Encoder-Decoder Networks For Image Steganography
All the existing image steganography methods use manually crafted features to hide binary payloads in cover images. This leads to small payload capacity and image distortion. Here we propose a convolutional neural network based encoder-decoder architecture for embedding images as payload. We perform an extensive empirical evaluation of the proposed architecture on publicly available datasets.
Scalable Model Parallelism with Deep Reinforcement Learning
We propose a hierarchical model for efficient placement of computational graphs onto hardware devices, especially in heterogeneous environments with a mixture of CPUs and GPUs. Our algorithm can find optimized, non-trivial placements for computational graphs with over 80,000 operations. We achieve runtime reductions of up to 60.6% when applied to models such as Neural Machine Translation.
Application-Aware Network Traffic Classification Using Machine Learning Techniques
Traffic classification plays an important role in network security and management where large numbers of known and unknown applications are handled, such as in cloud environments. Traditional traffic classification techniques are either non-scalable or create privacy and legal concerns. Our research applies unsupervised Machine Learning (ML) techniques to flow-feature-based classification methods.
Watch, Listen, Read, and Learn: Deep Multimodal Representation Learning for Video Classification
This research presents a new multimodal deep learning framework for event detection from videos by leveraging recent advances in deep neural networks. First, it automatically generates deep features from each modality (visual, audio, textual). Then, a two-stage fusion approach is proposed to combine the scores from each modality. This framework is applied to a new natural disaster video dataset.
Multi-Modal Sequential Representation Learning for Disaster-Related Video Classification
Videos serve to convey complex semantic information and ease the understanding of new knowledge. However, when mixed semantic meanings are involved, it is more difficult for a computer model to detect and classify the concepts. A multimodal deep learning framework is proposed to improve video concept classification by leveraging recent advances in transfer learning and sequential deep learning models.
Deep Multi-View Models for Glitch Classification
Non-cosmic disturbances, known as “glitches”, show up in gravitational-wave data. We propose a deep neural network to classify glitches automatically. The primary purpose of classifying glitches is to understand their origin, which facilitates their removal from the detectors. The suggested classifier is a multi-view deep neural network that exploits four different views for classification.
Tensor Dictionary Learning: Feature Learning for Multidimensional Data
This work analyzes the dictionary learning (DL) problem for sparse representation of tensor data. To account for the tensor structure, the dictionary underlying the vectorized version of data samples is Kronecker structured (KS). While there have been algorithms developed for KS-DL, their convergence has not been studied. This work provides theoretical limits on the sample complexity of KS-DL.
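As a small illustration of the Kronecker structure mentioned above (not the poster's algorithm or its analysis), the sketch below checks the standard identity vec(A X Bᵀ) = (B ⊗ A) vec(X) for a second-order tensor, using column-major vectorization. The matrices `A`, `B`, and `X` and all helper names are assumptions chosen only for this example.

```python
# Minimal pure-Python sketch: a Kronecker-structured dictionary acting on
# vectorized matrix data. All names and values here are illustrative.

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(m):
    return [list(row) for row in zip(*m)]

def kron(a, b):
    """Kronecker product: block (i, j) of the result is a[i][j] * b."""
    return [[a_ij * b_kl for a_ij in a_row for b_kl in b_row]
            for a_row in a for b_row in b]

def vec(m):
    """Column-major vectorization (stack the columns of m)."""
    return [m[i][j] for j in range(len(m[0])) for i in range(len(m))]

def matvec(m, v):
    return [sum(m_ij * v_j for m_ij, v_j in zip(row, v)) for row in m]

# Hypothetical coordinate dictionaries (one per tensor mode) and a sparse
# coefficient matrix X.
A = [[1, 2], [3, 4]]          # 2 x 2
B = [[1, 0], [0, 1], [1, 1]]  # 3 x 2
X = [[1, 0], [0, 2]]          # 2 x 2

# Data sample in matrix form, then vectorized.
Y = matmul(matmul(A, X), transpose(B))
lhs = vec(Y)

# Same sample produced by the Kronecker-structured dictionary D = B kron A.
D = kron(B, A)
rhs = matvec(D, vec(X))

print(lhs == rhs)
```

The point of the structure is that `D` has 6 x 4 = 24 entries but is determined by the 4 + 6 = 10 entries of `A` and `B`, which is the kind of reduction in free parameters that motivates studying sample complexity for this model class.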
CSI Based Device Free Indoor Localization Using Machine Learning Based Classification Approach
In a typical indoor positioning system, the subject is required to carry a physical device. In this study, we present a novel, machine-learning-based, device-free technique for indoor positioning. We exploit the Channel State Information obtained from Multiple Input Multiple Output Orthogonal Frequency-Division Multiplexing as location-specific features for machine-learning-based location estimation.
Nonlinear Unmixing of Hyperspectral Paintings by Integrating Deep Neural Networks and Kubelka-Munk Theory
Here, the problem of nonlinear unmixing of hyperspectral reflectance data using works of art is described. This is accomplished through deep neural networks used to qualitatively identify the constituent pigments in an unknown spectrum. Based on the pigment(s) present in a given pixel, Kubelka-Munk theory is employed to estimate its semi-quantitative concentration.
SafeRoute: Learning to Navigate Streets Safely in an Urban Environment
Recent studies show 85% of women have changed their traveled route to avoid harassment. We propose SafeRoute, a novel solution to the problem of navigating cities while avoiding harassment and crime using deep reinforcement learning. Our agent learns to pick favorable streets to create safe and short paths and improves over state-of-the-art methods by up to 17% in local average crime distance.
Machine Learned Spam Detection in E-Commerce
eBay's flagship online shopping products serve hundreds of millions of sellers and buyers. Some sellers have been observed engaging in improper practices and spamming on eBay: they create duplicate listings or create a false impression of their products' popularity. This paper discusses how eBay detects these spamming behaviors using the latest machine learning techniques.
Public Perception on Autonomous Vehicles: Understanding Patterns from YouTube Videos
Content analysis and text mining on comments related to top-viewed YouTube videos of autonomous vehicles can provide insight into potential end-user feedback. This study used two natural language processing (NLP) tools to perform knowledge discovery. The key issues are mostly associated with efficiency, performance, trust, comfort, and safety.
Affinity Propagation Hybrid Clustering Approach for Named-Entity Recognition
Named-entity recognition is a common NLP problem with several existing solutions, such as AWS Comprehend and the Stanford Named Entity Tagger. However, these solutions are best suited for situations where natural language is used and entities are known. In this poster, we present a novel method using affinity propagation and clustering to identify people's names where standard NER may fail.
Learning Feature Representations for Keyphrase Extraction
Keyphrase extraction models encode each phrase with a set of hand-crafted features, and ML algorithms are trained to predict keyphrases. While these features have been shown to work well in practice, feature engineering is a tedious process that normally does not generalize well. To address this, we present a feature learning framework that exploits the text to discover patterns that keyphrases express.
Speeding up Reinforcement Learning-Based Information Extraction Training Using Asynchronous Methods
A recently proposed Reinforcement Learning-based Information Extraction (RLIE) technique is able to incorporate external evidence during the extraction process. RLIE trains a single agent sequentially, training on one instance at a time. We leverage recent advances in parallel RL training using asynchronous methods and propose RLIE-A3C. RLIE-A3C is able to achieve a 5x training speedup over RLIE.
Detecting Anxiety on Reddit
We study anxiety disorders through personal narratives collected through the popular social media website, Reddit. We build a substantial data set of typical and anxiety-related posts, and we apply N-gram language modeling, vector embeddings, topic analysis, and emotional norms to generate features that accurately classify posts related to anxiety.
Towards a Virtual Assistant Health Coach: Corpus Collection and Annotations
In this paper, we present our data collection process, annotation schemas and agreement results for extracting health goals from SMS conversations between a health coach and the patients. This is our first step towards building an autonomous virtual assistant health coach that learns from expert demonstration to interact with patients via SMS.
Scheduling Coflows in Large-Scale Distributed Systems
To maintain efficiency and quality of service of applications running in datacenters, we need efficient algorithms to make better use of both network and server resources. In this paper, I give an overview of some of the challenges and research problems in allocating shared resources in datacenters, and briefly discuss my preliminary research results on the coflow scheduling problem.
A Gaze-Based Exploratory Study on the Information Seeking Behavior of Developers on Stack Overflow
We present an exploratory study on how developers seek information on Stack Overflow. We use iTrace, an Eclipse plugin, to collect eye-tracking data on code and Stack Overflow posts. Participants completed four code summarization tasks. Developers who spent more time on relevant Stack Overflow pages gave complete and accurate summaries and also focused on pages that had code snippets on them.
Chaos Engineering is the principle of finding weaknesses in distributed systems by testing real-world outage scenarios on production systems, or as close to production as possible. It can also be used to learn about your existing system behavior by conducting experiments to test for a reaction. The mechanism generates new information about how the system reacts when individual components fail.
The CATDroid Framework for Testing Context-Sensitive Mobile Applications
Mobile apps react to GUI and context events (e.g., Internet, GPS) to provide context-sensitive functionality. Our framework, CATDroid, uses a combinatorial approach to generate test suites that systematically execute GUI events in multiple contexts. Experiments show that the approach increases code coverage and rate of fault detection for the context-sensitive Android apps in our study.
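To make the idea of executing GUI events in multiple contexts concrete, here is a minimal sketch that pairs every GUI event sequence with every combination of context values. It is not CATDroid's implementation: the abstract does not say which combinatorial strategy is used, and the exhaustive product shown here is simply the most basic one (a real tool might prefer something like pairwise coverage to keep suites small). The function name, context variables, and event names are all hypothetical.

```python
from itertools import product

def generate_test_suite(gui_sequences, context_variables):
    """Pair each GUI event sequence with every combination of context values.

    gui_sequences: list of event-name lists (hypothetical GUI events).
    context_variables: dict mapping a context name to its possible values.
    Returns a list of test cases, each with a context and an event sequence.
    """
    names = sorted(context_variables)  # fixed order for reproducibility
    contexts = [
        dict(zip(names, values))
        for values in product(*(context_variables[n] for n in names))
    ]
    return [
        {"context": ctx, "events": list(seq)}
        for ctx in contexts
        for seq in gui_sequences
    ]

# Hypothetical inputs: two context variables and two GUI event sequences.
context_variables = {"internet": ["on", "off"], "gps": ["on", "off"]}
gui_sequences = [["open_map", "tap_locate"], ["open_settings"]]

suite = generate_test_suite(gui_sequences, context_variables)
for case in suite:
    print(case["context"], case["events"])
```

With two values for each of two context variables, the same event sequence is exercised under four contexts, so a fault that only appears when, say, GPS is off and the network is on has a test case that reaches it.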
Addressing Challenges in Developing Virtual Beamline (VBL): A Large-Scale, High-Energy Parallel Laser Simulation Code
Modeling of advanced laser systems requires massively parallel simulations with multi-order increases in time/space resolution compared to Virtual Beamline’s (VBL) current capabilities. In developing the new version of VBL we address two key challenges: heterogeneous parallel computing using MPI and RAJA, and coordinating a large-scale scientific software project with a Scrum Agile approach.
Metaprogramming In A Functional-Programming-Based Web Framework