BERT4Rec for News Recommendation

Dataset used:

The Microsoft News Dataset (MIND) is a large-scale dataset for news
recommendation research. It was collected from anonymized
behavior logs of the Microsoft News website. The purpose of
MIND is to serve as a benchmark dataset for news
recommendation and to facilitate research on news
recommendation and recommender systems.
MIND contains about 160k English news articles and more
than 15 million impression logs generated by 1 million
users. The users were randomly sampled from those who had at
least 5 news click records during the 6 weeks from October 12 to
November 22, 2019. Every news article contains textual
content including title, abstract, body, category and entities.
Each impression log contains the click events, non-clicked
events and historical news click behaviors of the user before
that impression.
There are 2,186,683 samples in the training set, 365,200
samples in the validation set, and 2,341,619 samples in the
test set, which is enough to support the training
of data-intensive news recommendation models.

[MIND Dataset]
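
To make the impression-log structure concrete, here is a minimal parsing sketch. It assumes the tab-separated `behaviors.tsv` layout used by MIND (impression ID, user ID, time, click history, impressions with a `-1`/`-0` click label); the `Impression` type and `parse_behavior_line` helper are hypothetical names introduced for illustration and are not part of this repository.

```python
from typing import List, NamedTuple


class Impression(NamedTuple):
    """One impression log (hypothetical container, for illustration only)."""
    impression_id: str
    user_id: str
    time: str
    history: List[str]      # previously clicked news IDs, oldest first
    candidates: List[str]   # news IDs shown in this impression
    labels: List[int]       # 1 = clicked, 0 = not clicked


def parse_behavior_line(line: str) -> Impression:
    """Parse one line, assuming five tab-separated columns with labeled impressions."""
    impression_id, user_id, time, history, impressions = line.rstrip("\n").split("\t")
    candidates, labels = [], []
    for item in impressions.split():
        # Each item is assumed to look like "N55689-1" (news ID, dash, click label).
        news_id, _, label = item.rpartition("-")
        candidates.append(news_id)
        labels.append(int(label))
    return Impression(impression_id, user_id, time, history.split(), candidates, labels)


example = "1\tU13740\t11/11/2019 9:05:58 AM\tN55189 N42782 N34694\tN55689-1 N35729-0"
imp = parse_behavior_line(example)
print(imp.history, imp.candidates, imp.labels)
```

The sketch only handles labeled splits; impressions without click labels (as in a held-out test set) would need a separate branch.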

Model Description:

BERT4Rec is a model used for product recommendation. In this project we use the same model to train on sequences of news articles.
BERT4Rec uses a transformer model to learn the sequential
representation of elements in a sequence. In this model we
assume the news articles in the historical data to be arranged in
chronological order. The ordering is done by a script, after which we mask
elements of the sequences and train the model so that it is able to
predict the masked elements.
We use the output of the pretrained
BERT4Rec model to get the user representation
by summing up the outputs of this model. Later we use this
user representation to rank the candidate news.
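
The masked-sequence objective described above can be sketched as follows. This is an illustrative sketch only, not the repository's preprocessing script: the `MASK_TOKEN` value, the `mask_prob` default and the `mask_sequence` helper are all assumptions made for the example.

```python
import random
from typing import List, Optional, Tuple

MASK_TOKEN = "[MASK]"  # hypothetical placeholder ID for a masked position


def mask_sequence(
    history: List[str],
    mask_prob: float = 0.2,          # assumed masking rate, not taken from the project
    rng: Optional[random.Random] = None,
) -> Tuple[List[str], List[Optional[str]]]:
    """Randomly replace news IDs with MASK_TOKEN.

    Returns the masked sequence and, per position, the original ID the model
    should predict (None where the position was left unmasked).
    """
    rng = rng or random.Random()
    masked: List[str] = []
    targets: List[Optional[str]] = []
    for news_id in history:
        if rng.random() < mask_prob:
            masked.append(MASK_TOKEN)
            targets.append(news_id)
        else:
            masked.append(news_id)
            targets.append(None)
    return masked, targets


history = ["N55189", "N42782", "N34694", "N45794", "N18445"]
masked, targets = mask_sequence(history, mask_prob=0.4, rng=random.Random(0))
print(masked)
print(targets)
```

During training, the loss would be computed only at positions where `targets` is not `None`, so the model learns to recover the hidden news IDs from context on both sides.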

[BERT4Rec: Sequential Recommendation with Bidirectional Encoder Representations from Transformer]


Taking the news titles in the history, which are arranged in chronological order, we randomly mask some news IDs in the sequence. We train the BERT4Rec model, which tries to identify the representation of the masked sequence.
We run the following code:


Later we fine-tune a CNN model for news representation. The CNN representation of a candidate news article and the mean of the BERT4Rec outputs are combined with a dot product, and the result is passed to a sigmoid layer.
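
The scoring step just described can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the project's fine-tuning code: the vectors are toy values, and `mean_pool`, `click_probability` and `rank_candidates` are hypothetical helper names.

```python
import math
from typing import Dict, List, Sequence


def mean_pool(vectors: Sequence[Sequence[float]]) -> List[float]:
    """Average the per-position model outputs into one user vector."""
    n = len(vectors)
    dim = len(vectors[0])
    return [sum(v[d] for v in vectors) / n for d in range(dim)]


def click_probability(user_vec: Sequence[float], news_vec: Sequence[float]) -> float:
    """Dot product of user and candidate vectors, squashed by a sigmoid."""
    dot = sum(u * c for u, c in zip(user_vec, news_vec))
    # Numerically stable sigmoid for both signs of the dot product.
    if dot >= 0:
        return 1.0 / (1.0 + math.exp(-dot))
    e = math.exp(dot)
    return e / (1.0 + e)


def rank_candidates(user_vec: Sequence[float],
                    candidates: Dict[str, Sequence[float]]) -> List[str]:
    """Return candidate news IDs sorted by descending click probability."""
    return sorted(candidates,
                  key=lambda nid: click_probability(user_vec, candidates[nid]),
                  reverse=True)


# Toy stand-ins for the sequence-model outputs and the CNN news vectors.
sequence_outputs = [[1.0, 0.0], [0.0, 1.0]]
candidate_vectors = {"N1": [2.0, 2.0], "N2": [-2.0, -2.0]}

user_vec = mean_pool(sequence_outputs)
ranking = rank_candidates(user_vec, candidate_vectors)
print(user_vec, ranking)
```

Because the sigmoid is monotonic, the ranking depends only on the dot products; the sigmoid matters for training with a click/non-click loss.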
This is done using




Before submission pass the result.txt file to prediction.txt for proper formatting.
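
One possible reading of this formatting step is that per-candidate scores are turned into rank lists. The sketch below assumes the MIND evaluation layout for prediction.txt, one line per impression in the form `impression_id [rank1,rank2,...]`, where each rank is the position of that candidate when sorted by score. The layout of result.txt is not described here, so the example starts from in-memory scores; `scores_to_ranks` and `format_prediction_line` are hypothetical helpers.

```python
from typing import List, Sequence


def scores_to_ranks(scores: Sequence[float]) -> List[int]:
    """Rank 1 goes to the highest score; ranks stay in original candidate order."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    ranks = [0] * len(scores)
    for rank, idx in enumerate(order, start=1):
        ranks[idx] = rank
    return ranks


def format_prediction_line(impression_id: str, scores: Sequence[float]) -> str:
    """Build one line in the assumed 'impression_id [r1,r2,...]' layout."""
    ranks = scores_to_ranks(scores)
    return f"{impression_id} [{','.join(map(str, ranks))}]"


line = format_prediction_line("1", [0.1, 0.9, 0.5])
print(line)
```

The rank list is written without spaces so that each line splits cleanly into an impression ID and a list.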



[BERT4Rec: Sequential Recommendation with Bidirectional Encoder Representations from Transformer]

