-
Automatic Classification of Games using Support Vector Machine
Authors:
Ismo Horppu,
Antti Nikander,
Elif Buyukcan,
Jere Mäkiniemi,
Amin Sorkhei,
Frederick Ayala-Gómez
Abstract:
Game developers benefit from the availability of custom game genres when doing game market analysis. This information can help them spot opportunities in the market and plan a new game more successfully. In this paper we find a good classifier for predicting the category of a game. The prediction is based on the description and title of a game. We use 2443 iOS App Store games as a data set to generate a document-term matrix. To reduce the curse of dimensionality we use Latent Semantic Indexing, which reduces the term dimension to approximately 1/9 of its original size. A Support Vector Machine supervised learning model is fit to the pre-processed data. Model parameters are optimized using grid search and 20-fold cross validation. The best model yields 77% mean accuracy, or roughly 70% accuracy with 95% confidence. The developed classifier has been used in-house to assist games market research.
Submitted 17 May, 2021; v1 submitted 12 May, 2021;
originally announced May 2021.
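The first step named in the abstract above, building a document-term matrix from game titles and descriptions, can be sketched as follows. This is a minimal standard-library illustration, not the authors' pipeline: the tokenizer, the function names, and the two sample game texts are assumptions made for the example, and the later steps (Latent Semantic Indexing, the Support Vector Machine, grid search) are not shown.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens (an assumed, simple tokenizer)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def document_term_matrix(documents):
    """Return (vocabulary, matrix) where matrix[i][j] counts term j in document i."""
    counts = [Counter(tokenize(doc)) for doc in documents]
    vocabulary = sorted(set().union(*counts))
    # Counter returns 0 for terms a document does not contain.
    matrix = [[c[term] for term in vocabulary] for c in counts]
    return vocabulary, matrix


# Hypothetical title + description strings, invented for illustration.
games = [
    "Puzzle Quest: match three puzzle",
    "Racing Quest: fast cars",
]
vocabulary, matrix = document_term_matrix(games)
```

Each row is one game and each column one term, so the matrix grows wide quickly as the vocabulary grows, which is the dimensionality problem the abstract addresses with Latent Semantic Indexing.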
-
Revenue Attribution on iOS 14 using Conversion Values in F2P Games
Authors:
Frederick Ayala-Gomez,
Ismo Horppu,
Erlin Gulbenkoglu,
Vesa Siivola,
Balázs Pejó
Abstract:
Mobile app developers use paid advertising campaigns to acquire new users. Marketing managers decide where to spend and how much to spend based on the campaigns' performance. Apple's new privacy mechanisms have a profound impact on how performance marketing is measured. Starting with iOS 14.5, all apps must explicitly get system permission for tracking via the new App Tracking Transparency Framework, which shows users a pop-up asking whether they give the app permission to track. If a user does not allow tracking, the identifier required to deterministically find the online advertising campaign that brought the user to install the app is not shared. Instead of relying on individual identifiers, Apple proposed a new performance mechanism called conversion value, which is an integer set by the app for each user; developers can get the number of installs per conversion value for each campaign. However, interpreting how conversion values are used to measure the campaigns' performance is not obvious because it requires a method to translate the conversion values to revenue. This paper investigates the task of attributing revenue to advertising campaigns using the reported conversion values per campaign. Our contributions are to formalize the problem, find the theoretically optimal revenue attribution function for any conversion value schema, and show empirical results on past data of a free-to-play mobile game using different conversion value schemas.
Submitted 24 January, 2022; v1 submitted 16 February, 2021;
originally announced February 2021.
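The attribution task described in the abstract above, translating per-campaign install counts per conversion value into revenue, can be sketched as follows. This is only one simple choice and not necessarily the paper's attribution function: it assumes each conversion value is worth the mean revenue of past users who had that value. All function names, campaign names, and numbers are hypothetical.

```python
from collections import defaultdict


def mean_revenue_per_conversion_value(history):
    """history: iterable of (conversion_value, revenue) pairs for past users."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for conversion_value, revenue in history:
        totals[conversion_value] += revenue
        counts[conversion_value] += 1
    return {cv: totals[cv] / counts[cv] for cv in totals}


def attribute_revenue(installs_by_campaign, revenue_per_cv):
    """installs_by_campaign: {campaign: {conversion_value: install_count}}.

    Conversion values never seen in the history contribute 0.0 (an assumption).
    """
    return {
        campaign: sum(n * revenue_per_cv.get(cv, 0.0) for cv, n in installs.items())
        for campaign, installs in installs_by_campaign.items()
    }


# Hypothetical past users: (conversion value, observed revenue).
history = [(0, 0.0), (0, 0.0), (1, 2.0), (1, 4.0), (2, 10.0)]
revenue_per_cv = mean_revenue_per_conversion_value(history)

# Hypothetical aggregated reports: installs per conversion value per campaign.
reports = {
    "campaign_a": {0: 10, 1: 2},
    "campaign_b": {1: 1, 2: 3},
}
estimates = attribute_revenue(reports, revenue_per_cv)
```

With this toy data, campaign_a brings more installs but campaign_b receives the larger revenue estimate, because its installs fall under higher conversion values.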
-
Malware distributions and graph structure of the Web
Authors:
Sanja Šćepanović,
Igor Mishkovski,
Jukka Ruohonen,
Frederick Ayala-Gómez,
Tuomas Aura,
Sami Hyrynsalmi
Abstract:
Knowledge about the graph structure of the Web is important for understanding this complex socio-technical system and for devising proper policies supporting its future development. Knowledge about the differences between clean and malicious parts of the Web is important for understanding potential threats to its users and for devising protection mechanisms. In this study, we apply data science methods to a large crawl of surface and deep Web pages with the aim of increasing such knowledge. To accomplish this, we answer the following questions. Which theoretical distributions explain important local characteristics and network properties of websites? How do these characteristics and properties differ between clean and malicious (malware-affected) websites? What is the predictive power of local characteristics and network properties for classifying malware websites? To the best of our knowledge, this is the first large-scale study describing the differences in global properties between malicious and clean parts of the Web. In other words, our work builds on and bridges the gap between Web science, which tackles large-scale graph representations, and Web cyber security, which is concerned with malicious activities on the Web. The results presented herein can also help antivirus vendors in devising approaches to improve their detection algorithms.
Submitted 19 July, 2017;
originally announced July 2017.
-
Item-to-item recommendation based on Contextual Fisher Information
Authors:
Bálint Daróczy,
Frederick Ayala-Gómez,
András Benczúr
Abstract:
Web recommendation services bear great importance in e-commerce, as they aid the user in navigating through the items that are most relevant to her needs. On a typical Web site, a long history of previous activities or purchases by the user is rarely available. Hence in most cases, recommenders propose items that are similar to the most recent ones viewed in the current user session. The corresponding task is called session-based item-to-item recommendation. For frequent items, it is easy to present item-to-item recommendations by "people who viewed this, also viewed" lists. However, most of the items belong to the long tail, where previous actions are sparsely available. Another difficulty is the so-called cold start problem, when the item has recently appeared and has not yet had time to accumulate a sufficient number of transactions. In order to recommend a next item in a session in sparse or cold start situations, we also have to incorporate item similarity models. In this paper we describe a probabilistic similarity model based on Random Fields to approximate item-to-item transition probabilities. We give a generative model for the item interactions based on arbitrary distance measures over the items, including explicit and implicit ratings and external metadata. The model may change in time to better fit recent events and recommend the next item based on the updated Fisher Information. Our new model outperforms both simple similarity baseline methods and recent item-to-item recommenders under several different performance metrics and on publicly available data sets. We reach significant gains in particular for recommending a new item following a rare item.
Submitted 8 November, 2016; v1 submitted 7 November, 2016;
originally announced November 2016.
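The "people who viewed this, also viewed" lists that the abstract above mentions for frequent items can be sketched with plain co-view counting. This is a minimal illustration of that simple baseline only, not the Fisher Information model the paper proposes; the function names and the toy sessions are assumptions made for the example.

```python
from collections import Counter, defaultdict
from itertools import permutations


def co_view_counts(sessions):
    """Count, for each ordered item pair (a, b), the sessions in which both were viewed."""
    counts = defaultdict(Counter)
    for session in sessions:
        # Use a set so repeated views within one session count once.
        for a, b in permutations(set(session), 2):
            counts[a][b] += 1
    return counts


def also_viewed(counts, item, k=3):
    """Return up to k items most often co-viewed with `item` (ties broken by item name)."""
    ranked = sorted(counts.get(item, {}).items(), key=lambda kv: (-kv[1], kv[0]))
    return [other for other, _ in ranked[:k]]


# Hypothetical user sessions, each a list of viewed item identifiers.
sessions = [["a", "b", "c"], ["a", "b"], ["b", "c"], ["a", "d"]]
counts = co_view_counts(sessions)
top_for_a = also_viewed(counts, "a")
```

An item that has never appeared in any session yields an empty list here, and a rarely viewed item such as "d" has a single co-view to rank by, which illustrates the cold start and long-tail cases where the abstract argues a similarity model is needed.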