Zemanta's Student Programming Challenge 2017
by Žiga Stopinšek · 16 Oct 2017
The University of Ljubljana’s Faculty of Computer and Information Science and Zemanta are co-hosting a programming competition, with a chance to win great prizes - Google Pixel 2, Google Home Max and the opportunity for a paid internship at Zemanta.
This year’s assignments are:
- Alternative Ad Title Generator
- Real-time Business Intelligence Analytics
Alternative Ad Title Generator
Everyone likes interesting and clickable ad titles: they are more intriguing to web users, help get advertisers’ messages across and generate more revenue for publishers. The goal of this assignment is to create an Alternative Ad Title Generator that takes an existing title and comes up with meaningful alternatives that maximize ‘clickability’.
We are looking for a teachable system that takes an ad title as input and outputs 5 automatically generated alternatives. Outputs have to be meaningful and preferably retain the context of the original title.
You’re required to prepare your own ad title learning datasets by scraping the web or by using a public database. We’ll provide a dataset of words and their average click-through rates (CTRs) which you can use to help train your generators. We’ll also include a sample of titles, some of which will be used for assessing the results.
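As a rough illustration of how a word-level CTR table could be used, the sketch below ranks candidate titles by the mean CTR of their words and keeps the best 5. The table contents, the fallback value and the function names are all hypothetical, and the format of the provided dataset is not specified here. Producing the candidates in the first place is the actual assignment and is not covered.

```python
import re

# Hypothetical word -> average CTR table (made-up values for illustration).
WORD_CTR = {
    "secret": 0.031,
    "simple": 0.024,
    "tricks": 0.027,
    "tips": 0.018,
    "guide": 0.012,
}
DEFAULT_CTR = 0.010  # assumed fallback for words missing from the table


def score_title(title, word_ctr=WORD_CTR, default=DEFAULT_CTR):
    """Return the mean CTR of the words in a title (0.0 for an empty title)."""
    words = re.findall(r"[a-z']+", title.lower())
    if not words:
        return 0.0
    return sum(word_ctr.get(w, default) for w in words) / len(words)


def top_alternatives(candidates, n=5):
    """Return up to n distinct candidates, highest score first."""
    unique = list(dict.fromkeys(candidates))  # drop duplicates, keep order
    return sorted(unique, key=score_title, reverse=True)[:n]
```

A mean-of-words score ignores word order and meaning, so on its own it cannot check that an alternative is meaningful or retains the original context.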
Your solution must be open source and contain the code for both learning and predicting. Additionally, provide the dataset used for teaching the model and a short report, clearly describing both setup instructions and your approach to solving the problem.
Your program will be tested with a subset of titles from the provided sample. We’ll submit the generated alternative titles to our Zemanta One platform, exposing your work to some of the largest publishers on the web like cnn.com, bbc.com or foxnews.com. In addition, our marketing experts will look at your generated titles and score them based on clickability and meaningfulness.
Real-time Business Intelligence Analytics
Every time we browse the web, advertisers compete for our attention. They bid against each other for ad space offered by websites via ad exchanges. This process is called real-time bidding. Each pageview starts an auction: bidders are notified with a bid request that bidding is about to happen, they respond with suitable ads and a price, the exchange selects the winning ad and shows it to the viewer, and the winning bidder is notified of the win. After that, the viewer might or might not click on the ad, and might or might not buy the offered item. All of these events are valuable business metrics and are therefore stored in logs. Now imagine millions of users browsing the web at the same time.
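The auction flow described above can be sketched in a few lines to show what kind of log events a single pageview might produce. This is only an illustration: the event names, the fields and the highest-price-wins rule are assumptions, not the format of the logs used in the challenge.

```python
from dataclasses import dataclass


@dataclass
class Bid:
    bidder: str
    ad_id: str
    price: float  # bid price; currency and units are assumptions


def run_auction(request_id, bids):
    """Pick the highest bid and return (winner, log_events).

    The highest-price-wins rule is an assumption made for illustration.
    """
    # One bid_request event marks the start of the auction.
    events = [{"type": "bid_request", "request_id": request_id}]
    for bid in bids:
        events.append({
            "type": "bid",
            "request_id": request_id,
            "bidder": bid.bidder,
            "ad_id": bid.ad_id,
            "price": bid.price,
        })
    if not bids:
        return None, events  # nobody responded, so there is no win event
    winner = max(bids, key=lambda b: b.price)
    events.append({
        "type": "win",
        "request_id": request_id,
        "bidder": winner.bidder,
        "ad_id": winner.ad_id,
    })
    return winner, events
```

Clicks and purchases would add further events tied to the same request, which is why even modest traffic generates a large volume of logs.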
The goal of this assignment is to build a robust and scalable service that enables its users to run ad-hoc queries against such huge but well-defined amounts of data in near real time.
You’re required to choose the best storage solution, build a pipeline that consumes and stores the data, and provide a simple programmable interface for querying that will be able to return results in just a few seconds on a consumer laptop. You can use any suitable open-source storage solution in your service.
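To make the three pieces (storage, ingestion pipeline, query interface) concrete, here is a minimal sketch that uses SQLite from the Python standard library as a stand-in store. SQLite is used only because it keeps the example self-contained, not because it is the right choice for this load. The table layout, column names and the `clicks_per_ad` query are hypothetical.

```python
import sqlite3


def open_store(path=":memory:"):
    """Open the store and create the (assumed) events table and index."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS events ("
        "ts INTEGER NOT NULL, kind TEXT NOT NULL, ad_id TEXT NOT NULL)"
    )
    conn.execute(
        "CREATE INDEX IF NOT EXISTS idx_events_kind_ad ON events (kind, ad_id)"
    )
    return conn


def ingest(conn, batch):
    """Insert a batch of (ts, kind, ad_id) tuples in one transaction."""
    with conn:  # commits on success, rolls back on error
        conn.executemany(
            "INSERT INTO events (ts, kind, ad_id) VALUES (?, ?, ?)", batch
        )


def clicks_per_ad(conn, since_ts=0):
    """Example ad-hoc query: number of click events per ad since a timestamp."""
    rows = conn.execute(
        "SELECT ad_id, COUNT(*) FROM events "
        "WHERE kind = 'click' AND ts >= ? GROUP BY ad_id ORDER BY ad_id",
        (since_ts,),
    )
    return dict(rows.fetchall())
```

Batching inserts into a single transaction is the main design choice here, since committing each log line separately is usually the first bottleneck under sustained writes.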
We’ll provide a script that simulates different services by generating random logs and some example queries.
Your solution has to be open-source and it has to include setup and query instructions. Dockerized solutions are a big plus. Prepare a short report where you describe the architecture of the system and include some benchmarks. We’ll test your service by continuously inserting thousands of logs per second, evaluate the results of a set of non-disclosed queries and measure the response times.
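For the benchmarks in the report, one simple way to measure response times is to time repeated calls to a query and report a few summary statistics. The helper below is a hypothetical sketch: `query` stands for any zero-argument callable that runs one request against your service, and the choice of median and maximum is an assumption about what is useful to report.

```python
import statistics
import time


def benchmark(query, runs=20):
    """Call query() `runs` times and return latency stats in milliseconds."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        query()
        samples.append((time.perf_counter() - start) * 1000.0)
    samples.sort()
    return {
        "runs": runs,
        "median_ms": statistics.median(samples),
        "max_ms": samples[-1],
    }
```

Running the benchmark while logs are still being inserted gives numbers closer to the test conditions described above than timing queries against an idle store.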
Assignments will be presented on 25 Oct 2017 at 13:15 at Garaža, FRI.
Assignments have to be submitted to Učilnica FRI 17/18 by 11 Dec 2017. The award ceremony will take place on 21 Dec 2017 at Zemanta.