Science journalism meets artificial intelligence "Robotic Journalism"

By Raghuram Vadapalli, Bakhtiyar Syed, Nishant Prabhu, Balaji Vasan Srinivasan, Vasudeva Varma.



Summary of Research papers from IIT Hyd





Over the past couple of years, the "robot reporter" has been attracting attention in the machine learning and artificial intelligence world. Today's paper is inspired by that concept. Applying it to science journalism is non-trivial, as that would entail understanding scientific content and translating it into simpler language without distorting the underlying semantics. The paper takes first steps towards addressing a few of these challenges.

The authors built a tool which, given the title and abstract of a research paper, generates a blog title by mimicking a human science journalist. The tool uses a model trained on 87,328 pairs of research papers and their related blogs.





The contributions can be summed up as follows:


1. A new parallel corpus of 87,328 pairs of research paper titles and abstracts and their corresponding blog titles.

2. A demonstration of the web application, which uses a pipeline-based architecture to generate blog titles step by step, while letting the user choose among various heuristic functions as well as the neural model used for generating the blog title.

3. An analysis of the outcomes of the experiments conducted to find the best heuristic function as well as network architecture.


Architecture 








Stage 1 


Stage 1 uses a heuristic function to analyse the input and extract an intermediate sequence.

What is a heuristic function?

A heuristic function is a way to inform a search about the direction to a goal. It provides an informed way to guess which neighbor of a node will lead to the goal. There is nothing magical about a heuristic function: it must use only information that can be readily obtained about a node.


Stage 2 


Stage 2 uses the pointer-generator model to generate the output sequence from the intermediate sequence.

Sequence-to-sequence (seq2seq) is a learning model that converts an input sequence into an output sequence. Seq2seq models have achieved great success in fields such as machine translation, dialog systems, and question answering.
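
To make the "pointer" part concrete, here is a minimal sketch of how a pointer-generator combines generating a word from a fixed vocabulary with copying a word from the input. It assumes the model follows the pointer-generator formulation of See et al. (2017); the post does not state the exact configuration used in the tool. The function name, the toy vocabulary, and all probabilities below are illustrative assumptions, not values from the paper.

```python
def final_distribution(p_gen, vocab_dist, attention, source_tokens):
    """Mix a generation distribution with a copy distribution.

    p_gen         -- probability of generating from the vocabulary (0..1)
    vocab_dist    -- dict: vocabulary word -> generation probability
    attention     -- attention weights over source positions (sum to 1)
    source_tokens -- source words, aligned with `attention`
    """
    if len(attention) != len(source_tokens):
        raise ValueError("attention and source_tokens must have equal length")

    # Generation part: scale the vocabulary distribution by p_gen.
    final = {word: p_gen * prob for word, prob in vocab_dist.items()}

    # Copy part: each source position contributes (1 - p_gen) * attention,
    # so words that are not in the vocabulary can still be produced.
    for token, weight in zip(source_tokens, attention):
        final[token] = final.get(token, 0.0) + (1.0 - p_gen) * weight
    return final


# Toy example (made-up numbers): "graphene" is not in the vocabulary,
# but it can still be copied from the input sequence.
vocab_dist = {"a": 0.5, "new": 0.3, "model": 0.2}
source_tokens = ["graphene", "model"]
attention = [0.75, 0.25]
mixed = final_distribution(0.6, vocab_dist, attention, source_tokens)
```

The copy path is what makes this kind of model attractive for scientific text, where many technical terms in a title or abstract are unlikely to appear in a fixed output vocabulary.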


Blog Title Generation

The heuristic function takes the title and abstract of a research paper as input, H(pt, abs), where pt is the paper title and abs is the paper abstract. Various heuristic functions were explored and are outlined below:
1) pt
2) RP (TF-IDF based)
3) RD (Flesch reading ease based)
4) RPD (normalized combination of RD and RP)
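
The list above does not spell out how RP, RD, and RPD are computed, so the following is only a rough sketch under stated assumptions: each heuristic scores the sentences of the abstract and returns the best one, RP uses a smoothed TF-IDF score with each sentence treated as a document, RD uses the standard Flesch reading ease formula with a crude vowel-group syllable count, and RPD adds the two scores after min-max normalization. All function names here are hypothetical, and the paper's actual definitions may differ.

```python
import math
import re
from collections import Counter


def split_sentences(text):
    """Naive splitter: break after ., ! or ? followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tokenize(text):
    return re.findall(r"[a-z]+", text.lower())


def count_syllables(word):
    """Rough estimate: number of vowel groups, at least 1."""
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))


def flesch_reading_ease(sentence):
    """Standard Flesch formula applied to one sentence (higher = easier)."""
    words = tokenize(sentence)
    if not words:
        return 0.0
    syllables = sum(count_syllables(w) for w in words)
    return 206.835 - 1.015 * len(words) - 84.6 * (syllables / len(words))


def tfidf_scores(sentences):
    """Sum of smoothed TF-IDF weights per sentence (sentence = document)."""
    docs = [tokenize(s) for s in sentences]
    n_docs = len(docs)
    doc_freq = Counter(term for doc in docs for term in set(doc))
    scores = []
    for doc in docs:
        if not doc:
            scores.append(0.0)
            continue
        term_freq = Counter(doc)
        scores.append(sum(
            (count / len(doc))
            * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1.0)
            for term, count in term_freq.items()
        ))
    return scores


def min_max(values):
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def stage1_sequence(pt, abs_text, heuristic="pt"):
    """Hypothetical H(pt, abs): return the intermediate sequence s."""
    if heuristic == "pt":
        return pt
    sentences = split_sentences(abs_text)
    if not sentences:
        return ""
    rp = min_max(tfidf_scores(sentences))
    rd = min_max([flesch_reading_ease(s) for s in sentences])
    if heuristic == "RP":
        scores = rp
    elif heuristic == "RD":
        scores = rd
    elif heuristic == "RPD":
        scores = [p + d for p, d in zip(rp, rd)]
    else:
        raise ValueError(f"unknown heuristic: {heuristic}")
    best = max(range(len(sentences)), key=scores.__getitem__)
    return sentences[best]
```

For example, `stage1_sequence(title, abstract, "RD")` would return the abstract sentence that this sketch rates as easiest to read, which could then be passed on to the second stage.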


The output of the previous step is fed into a sequence-to-sequence neural generation model in order to generate the title of the blog post.

The system provides a baseline attention network, which defines 'attention' over the input sequence to allow the network to focus on specific parts of the input text, and the pointer-generator.
The sequence s obtained from the first stage is the input to the neural natural language generation model, which generates bt' as output with loss function L(bt, bt'), given by the sum of the cross-entropy loss at all time-steps.
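
Written out under the usual setup for such models (an assumption here, since only the verbal description is available), the loss is L(bt, bt') = -sum over t of log P_t(bt_t), where P_t is the distribution the model predicts at time-step t and bt_t is the t-th token of the reference blog title. A minimal sketch of that computation, with a hypothetical function name and made-up probabilities:

```python
import math


def sequence_loss(step_distributions, reference_tokens):
    """Sum of per-step cross-entropy: -sum_t log P_t(reference token t).

    step_distributions -- one dict (word -> probability) per time-step
    reference_tokens   -- the reference blog title, one token per time-step
    """
    if len(step_distributions) != len(reference_tokens):
        raise ValueError("need exactly one distribution per reference token")
    loss = 0.0
    for dist, token in zip(step_distributions, reference_tokens):
        # Clamp to avoid log(0) when the model gives the token no mass.
        prob = max(dist.get(token, 0.0), 1e-12)
        loss += -math.log(prob)
    return loss


# Toy example: a two-token reference title.
predicted = [
    {"robots": 0.5, "write": 0.5},
    {"news": 0.25, "write": 0.75},
]
loss = sequence_loss(predicted, ["robots", "write"])
```

The loss shrinks as the model assigns more probability to each reference token, and it is zero only when every reference token is predicted with probability 1.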










The working prototype gives an opportunity to play around with combinations of heuristic functions and model types for generating a blog title.

Link for working site

