Project content:
Text Mining: implementation of a new, scalable item based collaborative filtering algorithm for 150 million events per day.
Increase in click through rate by 10% compared to previous implementation – constructed real time streaming pipeline to transform newspaper text into vector format for about 20 texts per second being transformed, inserted and updated each day, implemented a fully automated model retraining pipeline.
Technologies & methods:
Python, Spark, TensorFlow, Machine & Deep Learning, Big Data