
SkillBot: Identifying Risky Content for Children in Alexa Skills

 

Tu Le (University of Virginia) Danny Yuxing Huang (New York University) Noah Apthorpe (Colgate University) Yuan Tian (University of Virginia)




Abstract

Many households include children who use voice personal assistants (VPAs) such as Amazon Alexa. Children benefit from the rich functionalities of VPAs and third-party apps but are also exposed to new risks in the VPA ecosystem (e.g., inappropriate content or information collection). To study the risks VPAs pose to children, the authors built a Natural Language Processing (NLP)-based system that automatically interacts with VPA apps and analyzes the resulting conversations to identify content risky to children. They identified 28 child-directed apps with risky content and maintain a growing dataset of 31,966 non-overlapping app behaviors collected from 3,434 Alexa apps. The findings suggest that although voice apps designed for children are subject to more policy requirements and intensive vetting, children are still vulnerable to risky content. A user study shows that parents are more concerned about VPA apps with inappropriate content than about those that ask for personal information, but many parents are not aware that risky apps of either type exist. Finally, the authors identified a new threat to users of VPA apps: confounding utterances, or voice commands shared by multiple apps that may cause a user to invoke or interact with a different app than intended. They identified 4,487 confounding utterances, including 581 shared by child-directed and non-child-directed apps.


Risks to Children from VPAs

Researchers have found that 91% of children between ages 4 and 11 in the U.S. have access to VPAs, 26% of children are exposed to a VPA between 2 and 4 hours a week, and 20% talk to VPA devices for more than 5 hours a week. The lack of robust authentication on commercial VPAs makes it challenging to regulate children’s use of skills, especially as anyone in the same physical vicinity of a VPA can interact with the device. As a result, children may have access to risky skills that deliver inappropriate content (e.g., expletives) or collect personal information through voice interactions.

The 1998 Children’s Online Privacy Protection Act (COPPA) regulates the information collected from children under 13 online, but widespread COPPA violations have been shown in the mobile application market and compliance in the VPA space is far from guaranteed. Additionally, parental control modes provided by VPAs (e.g., Amazon FreeTime and Google Family App) often place a burden on parents during setup and receive complaints from parents due to their limitations.



Purpose

Protecting children in the era of voice devices, therefore, raises several pressing questions: 

• Can we automate the analysis of VPA skills to identify content risky for children without requiring manual human voice interactions? 

• Are VPA skills targeted to children that claim to follow additional content requirements – hereafter referred to as “kid skills” – actually safe for child users? 

• What are parents’ attitudes and awareness of the risks posed by VPAs to children? 

• How likely is it for children to be exposed to risky skills through confounding utterances (voice commands shared by multiple skills, which could cause a child to accidentally invoke or interact with a different skill than intended)?

Contributions

Automated System for Skill Analysis: The authors present SkillBot, a system that automatically interacts with Alexa skills and collects their contents at scale. The system can be run longitudinally to identify new conversations and new conversation branches in previously analyzed skills.

Identification of Risks to Children: The authors analyzed 31,966 conversations collected from 3,434 Alexa kid skills to detect potentially risky skills directed at children. They found 8 skills that contain inappropriate content for children and 20 skills that ask for personal information through voice interaction.

User Study of Parents’ Awareness and Experiences: A user study demonstrates that a majority of parents express concern about the content of the risky kid skills identified by SkillBot, tempered by disbelief that these skills are actually available for Alexa VPAs. This lack of risk awareness is compounded by findings that many parents do not use VPA parental controls and allow their children to use VPA versions that do not have parental controls enabled by default.

Confounding Utterances: The authors identify confounding utterances as a novel threat to VPA users. The SkillBot analysis reveals 4,487 confounding utterances shared between two or more skills and highlights those that place child users at risk by invoking a non-kid skill instead of an expected kid skill.


Alexa Parental Control

Amazon FreeTime is a parental control feature that allows parents to manage what content their children can access on their Amazon devices. FreeTime on Alexa provides a Parent Dashboard user interface for parents to set daily time limits, monitor activities, and manage allowed content. If FreeTime is enabled, users can only use the skills in the kids category by default. To use other skills, parents need to manually add them to the whitelist. FreeTime Unlimited is a subscription that offers thousands of pieces of kid-friendly content, including a list of kid skills available on compatible Echo devices, for children under 13. Parents can purchase this subscription via their Amazon account and use it across all compatible Amazon devices.

Children can potentially access an Amazon Echo device located in a shared space and invoke such “risky” skills when child-protection features are absent on the device. In particular, FreeTime is turned off by default on the regular version of Amazon Echo.


Skill Information Extractor: This is the key component of the system; information on the other components can be found in the paper. Amazon provides an online repository of skills via the Alexa Skills Store. Each skill is an individual product, which has its own product info page and an Amazon Standard Identification Number (ASIN) that can be used to search for the skill in Amazon’s catalogue. The URL to a skill’s info page can be constructed from its ASIN. The skill information extractor includes a web scraper that systematically accesses the Alexa website and downloads each skill’s info page as HTML based on its ASIN (i.e., skill ID). It then reads the HTML files and constructs a JSON dictionary structure using the BeautifulSoup library. For each skill, it extracts any information available on the info page, such as the ASIN, icon, sample utterances, invocation name, description, reviews, permission list, and category (e.g., kids, education, smart home).
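The extraction step can be sketched roughly as follows. This is a minimal illustration, not the authors' code: it swaps BeautifulSoup for Python's standard-library `html.parser` to stay dependency-free, the `/dp/<ASIN>` URL pattern is an assumption about how the info page URL is built, and the CSS class names, sample page, and placeholder ASIN are all hypothetical.

```python
import json
from html.parser import HTMLParser


def skill_url(asin):
    # Assumption: the common Amazon product-page pattern /dp/<ASIN>.
    return f"https://www.amazon.com/dp/{asin}"


class SkillPageParser(HTMLParser):
    """Collects text from elements tagged with hypothetical CSS class names."""

    # Hypothetical class names; the real page markup is not given in the text.
    FIELDS = {
        "skill-title": "title",
        "skill-utterance": "sample_utterances",
        "skill-category": "category",
    }

    def __init__(self):
        super().__init__()
        self.info = {"title": None, "sample_utterances": [], "category": None}
        self._current = None  # field that the next text node belongs to

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        self._current = next(
            (self.FIELDS[c] for c in classes if c in self.FIELDS), None
        )

    def handle_endtag(self, tag):
        self._current = None

    def handle_data(self, data):
        text = data.strip()
        if self._current is None or not text:
            return
        if self._current == "sample_utterances":
            self.info["sample_utterances"].append(text)
        else:
            self.info[self._current] = text


def extract_skill_info(asin, html):
    """Parse one downloaded info page into a JSON-serializable dictionary."""
    parser = SkillPageParser()
    parser.feed(html)
    parser.close()
    return {"asin": asin, "url": skill_url(asin), **parser.info}


# Made-up page and placeholder ASIN, for illustration only.
SAMPLE_HTML = """
<html><body>
  <h1 class="skill-title">Animal Sounds</h1>
  <ul>
    <li class="skill-utterance">Alexa, open animal sounds</li>
    <li class="skill-utterance">Alexa, ask animal sounds for a cow</li>
  </ul>
  <span class="skill-category">Kids</span>
</body></html>
"""

info = extract_skill_info("B000000000", SAMPLE_HTML)
print(json.dumps(info, indent=2))
```

In a real scraper the HTML would come from the downloaded info page rather than an inline string, and the field selectors would have to match the actual page layout.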

Exploring and Classifying Utterances: Amazon allows developers to list up to three sample utterances in the sample utterances section of their skill’s information page. The system first extracts these sample utterances. 

Detecting Questions in Skill Responses: To extend the conversation, the system first classifies each response collected from the skill into one of three categories: Yes/No question, WH question, or non-question statement.

For this classification task, the authors employ spaCy and Stanford CoreNLP, which are popular tools for NLP tasks. They first tokenize the skill’s response into sentences and each sentence into words, then annotate each sentence using part-of-speech (POS) tagging, utilizing both TreeBank POS tags and Universal POS tags. With the POS tagging, they identify the role of each word in the sentence, such as auxiliary, subject, or object, based on its tag.
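To make the three-way classification concrete, here is a deliberately simplified sketch. It does not use spaCy or Stanford CoreNLP: instead of POS tags it checks the first word of each sentence against hand-written word lists, which is only a rough stand-in for identifying a leading auxiliary or WH word. The word lists, the function names, and the rule that a response takes the label of its last question sentence are all assumptions made for illustration.

```python
import re

# Hand-written stand-ins for what POS tagging would identify.
WH_WORDS = {"what", "when", "where", "which", "who", "whom", "whose", "why", "how"}
AUXILIARIES = {
    "am", "is", "are", "was", "were", "do", "does", "did", "have", "has", "had",
    "can", "could", "will", "would", "shall", "should", "may", "might", "must",
}


def split_sentences(response):
    """Split a response into sentences on '.', '!' and '?'."""
    parts = re.findall(r"[^.!?]+[.!?]?", response)
    return [p.strip() for p in parts if p.strip()]


def classify_sentence(sentence):
    """Label one sentence by its first word."""
    words = re.findall(r"[a-z']+", sentence.lower())
    if not words:
        return "statement"
    if words[0] in WH_WORDS:
        return "wh_question"
    if words[0] in AUXILIARIES:
        return "yes_no_question"
    return "statement"


def classify_response(response):
    """Label a whole response by its last question sentence, if any."""
    labels = [classify_sentence(s) for s in split_sentences(response)]
    questions = [label for label in labels if label != "statement"]
    return questions[-1] if questions else "statement"


print(classify_response("Welcome to the quiz. Do you want to play?"))
print(classify_response("Great! What is your name?"))
print(classify_response("Here is a fun fact about cats."))
```

A first-word heuristic like this misfires on sentences such as "What a great day!", which is one reason a POS-based approach is the more robust choice.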



Exploring Conversation Trees 

For each skill, SkillBot runs multiple rounds to explore different paths within the conversation trees. Each node in this tree is a unique response from Alexa. There is an edge between nodes i and j if there exists an interaction where Alexa says i, the user (i.e., SkillBot) says something, and then Alexa says j. We call the progression from i to j a path in the tree. Furthermore, multiple paths of interactions could exist for a skill. For instance, node i could have two edges: one with j and another one with k. Effectively, two paths lead from i. In one path, the user says something after hearing i, and Alexa responds with j. In another path, the user says something else after hearing i, and Alexa responds with k.
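The tree described above can be modelled with a small data structure. The sketch below is an illustration under assumptions, not SkillBot's implementation: the class and method names are hypothetical, each edge is keyed by what the user said, and a guard against revisiting a response is added in case a skill loops back to an earlier prompt.

```python
from collections import defaultdict


class ConversationTree:
    """Nodes are Alexa responses; edges are labelled with the user's reply."""

    def __init__(self):
        # edges[i][user_says] = j means: Alexa said i, user replied, Alexa said j.
        self.edges = defaultdict(dict)

    def add_interaction(self, response_i, user_says, response_j):
        self.edges[response_i][user_says] = response_j

    def paths_from(self, root, _seen=frozenset()):
        """Return every chain of responses from root to a leaf."""
        seen = _seen | {root}
        # Skip children already on the current chain to avoid infinite loops.
        children = [c for c in self.edges.get(root, {}).values() if c not in seen]
        if not children:
            return [[root]]
        paths = []
        for child in children:
            for tail in self.paths_from(child, seen):
                paths.append([root] + tail)
        return paths


# Made-up interactions for illustration only.
tree = ConversationTree()
start = "Welcome. Do you want to play?"
tree.add_interaction(start, "yes", "Great, here is the first question.")
tree.add_interaction(start, "no", "Goodbye.")

for path in tree.paths_from(start):
    print(" -> ".join(path))
```

Running several rounds against the same skill then amounts to calling `add_interaction` for each newly observed exchange, so unexplored branches show up as nodes with fewer outgoing edges than candidate replies.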




Kids Only: The authors found that 64 (58.2%) out of 110 utterances of this type invoked an irrelevant skill that was not in the list of skills associated with the utterance itself. The remaining 46 utterances (41.8%) invoked a relevant skill within the list of associated skills.

Both Kids and Non-kids (Joint): The authors found that 367 (63.2%) out of 581 utterances of this type invoked an irrelevant skill that was not in the list of skills associated with the utterance itself. The remaining 214 utterances (36.8%) invoked a relevant skill within the list of associated skills. However, 157 of these 214 utterances (73.4%) invoked a non-kid skill in preference to a kid skill.

Non-kids Only: The authors found that 1,999 (52.7%) out of 3,796 utterances of this type invoked an irrelevant skill that was not in the list of skills associated with the utterance itself. The remaining 1,797 utterances (47.3%) invoked a relevant skill within the list of associated skills.
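As a quick consistency check, the reported percentages can be recomputed from the counts given for the three utterance types. The counts below are taken directly from the text; the dictionary layout and helper name are only for illustration.

```python
# Counts as reported for each type of confounding utterance.
results = {
    "kids_only": {"total": 110, "irrelevant": 64},
    "joint": {"total": 581, "irrelevant": 367},
    "non_kids_only": {"total": 3796, "irrelevant": 1999},
}


def pct(part, whole):
    """Percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


for name, counts in results.items():
    relevant = counts["total"] - counts["irrelevant"]
    print(
        f"{name}: {pct(counts['irrelevant'], counts['total'])}% irrelevant, "
        f"{pct(relevant, counts['total'])}% relevant"
    )

# The three types together account for all confounding utterances.
grand_total = sum(counts["total"] for counts in results.values())
print("total confounding utterances:", grand_total)

# Share of relevant joint utterances that invoked a non-kid skill first.
print("joint, non-kid prioritized:", pct(157, 214), "%")
```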

Takeaway: A confounding utterance shared between a kid skill and a non-kid skill is risky. The analysis shows that kids can accidentally invoke a non-kid skill while trying to use a kid skill, and an adversary can exploit this problem to get kid users to invoke risky non-kid skills.
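One way to picture how confounding utterances and their three types could be found is to group utterances by the skills that list them. The following is a sketch under assumptions rather than the authors' method: the normalization rule (lowercasing and dropping punctuation), the function names, and the sample skills are all hypothetical.

```python
import re
from collections import defaultdict


def normalize(utterance):
    """Assumed normalization: lowercase, keep only words and digits."""
    return " ".join(re.findall(r"[a-z0-9']+", utterance.lower()))


def find_confounding(skills):
    """Map each utterance shared by two or more skills to its type.

    skills: iterable of (skill_id, is_kid_skill, utterances) tuples.
    """
    owners_by_utterance = defaultdict(set)
    for skill_id, is_kid, utterances in skills:
        for utterance in utterances:
            owners_by_utterance[normalize(utterance)].add((skill_id, is_kid))

    confounding = {}
    for utterance, owners in owners_by_utterance.items():
        if len(owners) < 2:
            continue  # listed by a single skill, so not confounding
        kinds = {is_kid for _, is_kid in owners}
        if kinds == {True}:
            confounding[utterance] = "kids_only"
        elif kinds == {False}:
            confounding[utterance] = "non_kids_only"
        else:
            confounding[utterance] = "joint"
    return confounding


# Made-up skills for illustration only.
sample_skills = [
    ("kid-a", True, ["Alexa, tell me a story", "Alexa, play animal game"]),
    ("kid-b", True, ["Alexa, play animal game"]),
    ("adult-c", False, ["Alexa, tell me a story.", "Alexa, start my workout"]),
    ("adult-d", False, ["Alexa, start my workout"]),
]

for utterance, kind in sorted(find_confounding(sample_skills).items()):
    print(f"{kind:14} {utterance}")
```

The "joint" type is the one the takeaway singles out, since it is where a child asking for a kid skill could end up in a non-kid skill.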


Conclusion 

The authors designed and implemented an automated skill interaction system called SkillBot and used it to analyze 3,434 Alexa kid skills. They identified a number of risky skills with inappropriate content or personal data requests, as well as the confounding utterance threat. To further evaluate the impact of these risky skills on kids, they then conducted a user study of 232 U.S. parents who use Alexa in their household. The study found widespread concern about the contents of these skills, combined with general disbelief that these skills might actually be available to kids, and low adoption of parental control features.
