Interpretation of KDD2016 Papers (2)

Joint compilation: Gao Fei, Zhang Min, Chen Yang Yingjie

Introduction: KDD 2016 is the premier interdisciplinary conference bringing together researchers and practitioners in data science, data mining, knowledge discovery, large-scale data analysis, and big data.

Paper One: Compressing Convolutional Neural Networks in the Frequency Domain

Summary

Convolutional neural networks (CNNs) are used ever more widely across computer vision research. Because CNNs can "absorb" large amounts of labeled data through their millions of parameters, they have attracted broad attention. However, as model sizes keep growing, so do the storage and memory requirements of the classifiers, which hinders deployment in many applications, such as image and speech recognition on cell phones and other devices. This paper presents a novel network architecture, the frequency-sensitive hashed network (FreshNets), which exploits the inherent redundancy in both the convolutional layers and the fully connected layers of a deep learning model to greatly reduce memory and storage consumption. The key observation is that the learned weights of convolutional filters are typically smooth and low-frequency. Building on this observation, we first transform the filter weights into the frequency domain with a discrete cosine transform and use a low-cost hash function to randomly group the frequency parameters into hash buckets. All parameters assigned to the same hash bucket share a single value, which can be learned with standard back-propagation. To further reduce model size, we allocate fewer hash buckets to the high-frequency components, which are usually less important. We evaluated FreshNets on eight datasets, and the results show that FreshNets achieves better compression performance than several competing baselines.
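The frequency-domain hashing scheme can be sketched in a few lines. The snippet below is a simplified illustration, not the paper's implementation: the shared value of each bucket is set by averaging the hashed DCT coefficients rather than learned by back-propagation, and `hash_bucket` is an invented stand-in for the low-cost hash function the paper describes.

```python
import numpy as np
from scipy.fft import dctn, idctn

def hash_bucket(i, j, num_buckets, seed=0):
    # Cheap deterministic hash of a frequency coordinate (a stand-in
    # for the paper's low-cost hash function).
    return (i * 92821 + j * 31 + seed) % num_buckets

def compress_filter(w, num_buckets):
    """Map a filter's DCT coefficients into shared hash buckets."""
    f = dctn(w, norm='ortho')            # spatial weights -> frequency domain
    shared = np.zeros(num_buckets)
    counts = np.zeros(num_buckets)
    for (i, j), v in np.ndenumerate(f):
        b = hash_bucket(i, j, num_buckets)
        shared[b] += v                   # coefficients hashed to one bucket
        counts[b] += 1                   #   collapse into a single value
    shared /= np.maximum(counts, 1)      # averaging stands in for learning
    return shared

def decompress_filter(shared, shape):
    """Rebuild an approximate filter from the shared bucket values."""
    f = np.empty(shape)
    for (i, j) in np.ndindex(shape):
        f[i, j] = shared[hash_bucket(i, j, len(shared))]
    return idctn(f, norm='ortho')        # frequency -> spatial domain

w = np.random.randn(5, 5)                # a 5x5 conv filter: 25 parameters
shared = compress_filter(w, 8)           # stored as only 8 shared values
w_hat = decompress_filter(shared, w.shape)
print(w_hat.shape)                       # (5, 5)
```

In the actual model the shared bucket values are the trainable parameters, so gradients from all filter positions hashed to a bucket accumulate into one weight; allocating fewer buckets to high-frequency coordinates then trims the least important capacity first.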

Keywords: Model Compression; Convolutional Neural Network; Hashing

First author introduction

Wenlin Chen

School: PhD, Department of Computer Science and Engineering, Washington University in St. Louis

Main research fields: machine learning, data mining, and artificial intelligence, with particular interest in deep learning and large-scale machine learning

Related academic achievements:

· Strategies for Training Large Vocabulary Neural Language Models (Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics, ACL 2016)

· Compressing Convolutional Neural Networks in the Frequency Domain (Proceedings of the ACM SIGKDD Conference, KDD 2016)

· Deep Metric Learning with Data Summarization (European Conference on Machine Learning, ECML 2016)

Paper link: original paper download

Paper Two: Multi-Task Feature Interaction Learning

Summary

Linear models have been widely used in data mining and machine learning. A major limitation of such models is their inability to capture predictive information from feature interactions. Although introducing higher-order feature interaction terms can overcome this shortcoming, doing so greatly increases model complexity and raises the risk of overfitting during learning. When there are many interrelated learning tasks, the feature interactions in these tasks are usually related to each other, and modeling this relatedness plays a key role in improving the generalization of feature interaction learning. In this paper, we present a novel multi-task feature interaction learning (MTIL) framework that exploits the connections among tasks through their high-order feature interactions. Specifically, we represent the feature interactions of the tasks as a tensor, through which prior knowledge of task relatedness can be incorporated into different structured regularizers. Within this framework, we develop two concrete approaches: the shared interaction approach and the embedded interaction approach. The former assumes that all tasks share a common interaction model, while the latter assumes that the feature interactions of the tasks share a common subspace. We provide efficient algorithms for both formulations. Extensive empirical studies on synthetic and real datasets confirm the effectiveness of the proposed multi-task feature interaction learning framework.
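The shared interaction approach can be illustrated with a small sketch. The code below is an assumed toy formulation, not the paper's algorithm: each task t predicts y = w_t·x + x^T Q x with task-specific linear weights and one interaction matrix Q shared across all tasks, fitted by plain gradient descent with a small ridge penalty on synthetic data that truly shares an interaction matrix.

```python
import numpy as np

rng = np.random.default_rng(0)
d, n, T = 4, 50, 3                        # features, samples per task, tasks

# Synthetic tasks that genuinely share one interaction matrix Q_true.
Q_true = rng.standard_normal((d, d))
Q_true = (Q_true + Q_true.T) / 2          # only the symmetric part matters
X = [rng.standard_normal((n, d)) for _ in range(T)]
W_true = rng.standard_normal((T, d))
Y = [X[t] @ W_true[t] + np.einsum('ni,ij,nj->n', X[t], Q_true, X[t])
     for t in range(T)]

# Shared-interaction model: per-task linear weights plus one common Q.
W = np.zeros((T, d))
Q = np.zeros((d, d))
lr, lam = 0.01, 1e-3                      # step size and ridge penalty
for _ in range(500):
    gQ = lam * Q
    for t in range(T):
        r = X[t] @ W[t] + np.einsum('ni,ij,nj->n', X[t], Q, X[t]) - Y[t]
        W[t] -= lr * (X[t].T @ r / n + lam * W[t])
        gQ += np.einsum('n,ni,nj->ij', r, X[t], X[t]) / n  # pooled gradient
    Q -= lr * gQ

err = np.linalg.norm(Q - Q_true) / np.linalg.norm(Q_true)
print(f"relative error of recovered shared Q: {err:.3f}")
```

Because every task's residual contributes to the gradient of the single Q, data from all tasks is pooled to estimate the interaction structure; the embedded interaction approach would instead factor per-task matrices Q_t through a common low-dimensional subspace.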

Keywords: multi-task learning; feature interaction; structured regularization; tensor norm

First author introduction

Kaixiang Lin

School: PhD student, Department of Computer Science and Engineering, Michigan State University

Main research areas: machine learning and data mining

Related academic achievements:

· Online Multi-task Learning Framework for Ensemble Forecasting (submitted to TKDE)

· Synergies that Matter: Efficient Interaction Selection via Sparse Factorization Machine (SDM 2016)

· GSpartan: a Geospatio-Temporal Multi-task Learning Framework for Multi-location Prediction (SDM 2016)

Download link: Original paper download

Paper 3: Contextual Intent Tracking for Personal Assistants (KDD 2016 Best Student Paper)

Summary

Smart personal assistants such as Apple's Siri, Google Now, and Microsoft Cortana have given rise to a new form of recommendation that "recommends the right information at the right time" and proactively helps users "get things done". This type of recommendation requires accurately tracking the user's intent at each moment, i.e., what type of information the user wants to know (e.g., weather, stock prices) and what task the user wants to accomplish (e.g., playing music, hailing a taxi). The user's intent is closely tied to context, which includes both the external environment, such as time and location, and the user's internal activities that the personal assistant can sense. The complex co-occurrence and sequential correlations between context and intent, together with contextual signals that are highly heterogeneous and sparse, make modeling the relationship between context and intent a challenging task. To solve the intent tracking problem, we propose the Kalman-filter-regularized PARAFAC2 (KP2) real-time forecasting model, which captures the structure and co-movement between context and intent. The KP2 model leverages collaboration among users and learns a personalized dynamic system for each user, enabling efficient real-time prediction of user intent. Extensive experiments on real-world datasets from a commercial personal assistant show that the KP2 model clearly outperforms the competing methods and offers valuable guidance for deploying large-scale proactive recommendation systems in personal assistants.
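The Kalman-filter component that regularizes KP2 follows the standard predict/update recursion. The sketch below is the textbook linear Kalman filter applied to an invented two-dimensional "intent intensity" state; it is not the KP2 model itself, whose state and observation structure come from the PARAFAC2 factorization and whose parameters are learned collaboratively across users.

```python
import numpy as np

def kalman_step(x, P, z, A, H, Q, R):
    """One predict/update cycle of a linear Kalman filter.

    x, P : previous state estimate and its covariance
    z    : new observation (e.g. a vector of context signals)
    A, H : state-transition and observation matrices
    Q, R : process and observation noise covariances
    """
    # Predict: propagate the latent state forward in time.
    x_pred = A @ x
    P_pred = A @ P @ A.T + Q
    # Update: correct the prediction with the new observation.
    S = H @ P_pred @ H.T + R                  # innovation covariance
    K = P_pred @ H.T @ np.linalg.inv(S)       # Kalman gain
    x_new = x_pred + K @ (z - H @ x_pred)
    P_new = (np.eye(len(x)) - K @ H) @ P_pred
    return x_new, P_new

# Toy run: a 2-d latent "intent intensity" observed through identity H.
rng = np.random.default_rng(1)
A = np.array([[0.9, 0.1], [0.0, 0.9]])        # assumed slow-decay dynamics
H = np.eye(2)
Qn = 0.01 * np.eye(2)
Rn = 0.1 * np.eye(2)
x, P = np.zeros(2), np.eye(2)
true = np.array([1.0, -1.0])
for _ in range(30):
    z = true + rng.normal(scale=0.3, size=2)  # noisy context observation
    x, P = kalman_step(x, P, z, A, H, Qn, Rn)
print(np.round(x, 2))
```

In KP2 this recursion acts as a temporal regularizer: the per-user dynamic system smooths the latent factors over time, which is what makes one-step-ahead intent prediction cheap enough for real-time use.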

Keywords: recommendation; real-time prediction; multi-task learning

First author introduction

Yu Sun

School: Department of Computing and Information Systems, University of Melbourne

Research direction: contextual behavior mining, reinforcement learning, optimal location discovery, spatial/temporal indexing, algorithm design and analysis.

Related academic achievements:

· A Contextual Collaborative Approach for App Usage Forecasting, (UbiComp, 2016)

· Reverse Nearest Neighbor Heat Maps: A Tool for Influence Exploration, (ICDE, 966-977, 2016)

Download link: Original paper download

Paper 4: Bid-Aware Gradient Descent for Unbiased Learning with Censored Data in Display Advertising

Summary

In real-time display advertising, the ad slot for each impression is sold through an auction mechanism. From an advertiser's perspective, the information about a campaign is incomplete: the user's response (e.g., click or conversion) and the market price of each impression can be observed only after the advertiser's bid wins the corresponding auction. Predictions such as bid landscape forecasting, click-through-rate (CTR) estimation, and bid optimization are all performed over the full volume of bid requests in the pre-bid stage. However, the training data is gathered in the post-bid stage, so it is strongly biased toward won impressions. A common way to learn from such censored data is to reweight the data instances to correct the mismatch between training and prediction. However, there has been little study of how to obtain weights independently of the bidding strategy and integrate them into the final CTR prediction and bid generation steps. In this paper, we formulate CTR estimation and bid optimization under such censored auction data. Starting from a survival model, we show that historical bid information can be naturally incorporated into a bid-aware gradient descent (BGD), which controls both the importance and the direction of the gradient to achieve unbiased learning. Empirical studies on two large-scale real-world datasets demonstrate substantial performance gains from our solution. The learning framework has been deployed on Yahoo!'s real-time bidding platform, where an online A/B test showed a 2.97% AUC lift in CTR estimation and a 9.30% eCPC reduction in bid optimization.
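The reweighting idea behind bid-aware gradient descent can be sketched as follows. This is an illustrative toy, not the paper's method: `win_prob` is an assumed win-rate curve standing in for the survival-model estimate of P(win | bid), and the update simply scales a logistic-regression gradient by the inverse win probability of the bid, so impressions that were unlikely to be won stand in for the many similar impressions that were lost.

```python
import numpy as np

def sigmoid(u):
    return 1.0 / (1.0 + np.exp(-u))

def bgd_update(theta, x, y, bid, win_prob, lr=0.1):
    """One bid-weighted learning step on a single won impression."""
    p = sigmoid(theta @ x)
    grad = (p - y) * x                  # standard logistic-loss gradient
    w = 1.0 / max(win_prob(bid), 1e-3)  # bid-aware importance weight
    return theta - lr * w * grad

# Toy win-rate curve: an assumed form, not the paper's fitted survival model.
def win_prob(bid, scale=5.0):
    return 1.0 - np.exp(-bid / scale)

rng = np.random.default_rng(0)
true_theta = np.array([1.0, -2.0, 0.5])
theta = np.zeros(3)
for _ in range(1000):
    x = rng.normal(size=3)
    y = float(rng.random() < sigmoid(x @ true_theta))  # click / no click
    bid = rng.uniform(1.0, 10.0)
    theta = bgd_update(theta, x, y, bid, win_prob, lr=0.05)
print(np.round(theta, 1))
```

The key design point the paper argues is that these weights fall out of the survival model of the auction rather than of any particular bidding strategy, so the same corrected gradient can drive both CTR estimation and bid optimization.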

Keywords: unbiased learning, censored data, real-time bidding, display advertising.

First author introduction

Weinan Zhang (Zhang Weinan)

School: PhD, Department of Computer Science, University College London / Assistant Professor, Shanghai Jiao Tong University (from August 2016)

Research directions: machine learning, big data mining, and their applications in computational advertising and recommender systems

Related academic achievements:

· User Response Learning for Directly Optimizing Campaign Performance in Display Advertising (CIKM 2016)

· Learning, Prediction and Optimisation in RTB Display Advertising (CIKM, October 2016)

Download link: Original paper download

Paper 5: Collaborative Knowledge Base Embedding for Recommender Systems

Summary

Among the various recommendation techniques, collaborative filtering is often limited by the sparsity of user-item interactions. A common remedy is to use auxiliary information to improve performance. With the rapid accumulation of information on the web, knowledge bases can provide heterogeneous information, including structured and unstructured data with different semantics, that can be exploited in many applications. In this paper, we investigate how to use the heterogeneous information in a knowledge base to improve the quality of recommender systems. First, using the knowledge base, we design three components to extract semantic representations of items from their structural, textual, and visual content. Specifically, we adopt a heterogeneous network embedding method called TransR, which extracts the structural representations of items while accounting for the heterogeneity of nodes and relations. We apply stacked denoising auto-encoders and stacked convolutional auto-encoders, two deep-learning-based embedding techniques, to extract the textual and visual representations of items, respectively. Finally, we propose an integrated framework, Collaborative Knowledge Base Embedding (CKE), to jointly learn the latent representations of collaborative filtering and the semantic representations of items from the knowledge base. To evaluate the performance of each embedding component as well as the whole system, we conduct extensive experiments on real-world datasets in two different scenarios. The results show that our method outperforms several widely used state-of-the-art recommendation methods.
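The TransR scoring function used for the structural component has a simple closed form: project the head and tail entities into the relation-specific space with a matrix M_r, then measure how well the relation vector translates one projection to the other (lower score = more plausible triple). A minimal sketch with invented dimensions:

```python
import numpy as np

def transr_score(h, t, r, M_r):
    """TransR plausibility score for a (head, relation, tail) triple.

    Entities live in entity space; M_r projects them into the
    relation-specific space where h_r + r ~ t_r should hold for
    true triples. Lower score means more plausible.
    """
    h_r = M_r @ h
    t_r = M_r @ t
    return np.linalg.norm(h_r + r - t_r)

rng = np.random.default_rng(0)
d_e, d_r = 4, 3                        # entity and relation dimensions
M_r = rng.standard_normal((d_r, d_e))  # relation-specific projection
h = rng.standard_normal(d_e)
r = rng.standard_normal(d_r)

# A tail chosen so the translation holds exactly scores ~0;
# a random tail scores much worse.
t_good = np.linalg.lstsq(M_r, M_r @ h + r, rcond=None)[0]
t_rand = rng.standard_normal(d_e)
print(transr_score(h, t_good, r, M_r) < transr_score(h, t_rand, r, M_r))  # True
```

In CKE, this structural score is trained jointly with the auto-encoder-based textual and visual embeddings and the collaborative-filtering factors, so all three semantic views regularize the item latent vectors.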

Keywords: recommendation system; knowledge base embedding; collaborative learning

First author introduction

Fuzheng Zhang

School: Ph.D. in Computer Science, University of Science and Technology of China, and currently an assistant researcher at Microsoft Research Asia.

Research direction: user model, recommendation system, deep learning, emotion detection, location-based social network, spatio-temporal data mining, pervasive computing, large-scale systems

Author information link: https://

Download link: Original paper download

Paper 6: Robust Influence Maximization

Summary

In this paper, we study an important issue in influence maximization, the task of finding k seed nodes that maximize the spread of influence in a social network: the uncertainty in the estimated influence probabilities. The robust influence maximization problem we propose is, given uncertainty intervals for the input parameters, to maximize the worst-case ratio between the influence spread of the chosen seed set and that of the optimal seed set. We design an algorithm that solves this problem with provable bounds on the solution quality. We further study uniform sampling and adaptive sampling methods that reduce parameter uncertainty and thereby improve the robustness of influence maximization. Our experimental results show that parameter uncertainty can seriously affect influence maximization: with empirically learned influence probabilities, the uncertainty of parameter estimation can make robustness very poor. Information cascades collected with the adaptive sampling method can effectively improve the robustness of influence maximization.
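The worst-case evaluation at the heart of the robust formulation can be illustrated on a toy graph. The sketch below is not the paper's algorithm: it only estimates a fixed seed set's Independent Cascade spread at the lower and upper ends of each edge's (invented) uncertainty interval, which are the pessimistic and optimistic quantities a robust seed-selection procedure would compare.

```python
import random

def ic_spread(edges, seeds, prob, trials=500):
    """Monte-Carlo estimate of expected spread under the Independent
    Cascade model, with one activation probability per directed edge."""
    total = 0
    for _ in range(trials):
        active, frontier = set(seeds), list(seeds)
        while frontier:
            u = frontier.pop()
            for v in edges.get(u, []):
                if v not in active and random.random() < prob[(u, v)]:
                    active.add(v)
                    frontier.append(v)
        total += len(active)
    return total / trials

# Tiny 6-node graph; each edge has an uncertainty interval [lo, hi]
# for its true activation probability (values invented for illustration).
edges = {0: [1, 2], 1: [3], 2: [3, 4], 3: [5], 4: [5]}
interval = {(u, v): (0.1, 0.6) for u in edges for v in edges[u]}

random.seed(0)
seeds = {0}
# Pessimistic spread uses the lower end of every interval, optimistic
# spread the upper end; robust influence maximization chooses seeds by
# their guarantee against the worst case.
lo = ic_spread(edges, seeds, {e: iv[0] for e, iv in interval.items()})
hi = ic_spread(edges, seeds, {e: iv[1] for e, iv in interval.items()})
print(f"worst-case spread {lo:.2f} vs best-case spread {hi:.2f}")
```

The gap between the two estimates is exactly what the paper's sampling methods try to shrink: more cascade observations tighten each edge's interval, pulling the worst case toward the true spread.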

Keywords: social network; influence maximization; robust optimization; information diffusion

First author introduction

Wei Chen (Chen Wei)

School: Senior Researcher at Microsoft Research Asia; Visiting Professor at Tsinghua University; Visiting Fellow at the Institute of Computing Technology, Chinese Academy of Sciences; program committee member of several top international data mining and data management conferences (KDD, WSDM, SIGMOD, ICDE, WWW, etc.); founding member of the Big Data Expert Committee of the China Computer Federation; editorial board member of the journal Big Data.

Research directions: social and information network algorithms and data mining, online game theory and economics, online learning, etc.

In recent years, his series of pioneering results on social influence maximization, published at top data mining, artificial intelligence, and database conferences such as KDD, ICDM, SDM, WSDM, ICWSM, AAAI, and VLDB, has been well received and has spawned a large body of follow-up work in this direction. His first paper in this line, at KDD 2009, ranked second in citations among all papers at that conference, and his second, at KDD 2010, ranked first. In 2013, together with two collaborators, he wrote a monograph on influence propagation and maximization (Information and Influence Propagation in Social Networks, Morgan & Claypool, 2013), systematically summarizing the research results and latest developments in this area. He has also done pioneering work in related directions in social and information networks, such as community detection, network centrality ranking, network games, network pricing, and network incentive mechanisms; his paper introducing game theory into online community detection won the Best Student Paper Award at the 2010 European Conference on Machine Learning and Data Mining (ECML PKDD 2010).

Download link: Original paper download

Via:KDD2016 accepted-papers

PS: This article was compiled by Lei Feng Network (search for the "Lei Feng Network" public account). Reproduction without permission is prohibited.