A hackathon promoting Why R? 2020 (Remote) Conference.

A language-agnostic competition devoted to text mining, where every machine learning practitioner can find challenges to test their team!

At this hackathon you choose the level of difficulty and the area of the challenges yourself. Depending on your skills and the time you have, you can tune the fun to suit you!

Table of Contents

1. Why text mining?
2. Why Hackathon?
3. Challenges
4. Competition Rules
5. Mentors and Judges
6. Sponsor
7. Talks
8. For whom?
9. Teams
10. Registration
11. Event details
12. Organizers

Why text mining?

Text mining is widely known among machine learning practitioners. The increased interest in text mining is driven by the growing number of internet users and by the rapid growth of internet data, a large share of which is said to be text. Extracting information from articles, news, posts and comments has become a desirable skill, but what is needed even more are tools for diagnosing and visualizing text mining models.

Even though there are a lot of tools, books and webinars available online, there is still room for improvement and development.

Why Hackathon?

Hackathons are events where enthusiasts of a specific topic gather in one place to work together on challenges that have arisen in a particular community.

Hackathons tend to be time-pressured events, where solutions need to be created quickly and active cooperation between participants is necessary. To set the pace of the event, participants are divided into teams which compete to prepare the most valuable solution and win a prize.

For a participant such an undertaking is a great chance to:

  • develop the ability to work in a group
  • learn from more experienced practitioners
  • take part in lectures and workshops related to text mining
  • have a remarkable networking experience
  • participate in healthy and fair competition
  • test your skills against others
  • win prizes
  • learn new data analysis techniques
  • use new tools
  • brainstorm new business use cases


Challenges

Challenges and guidance for solutions are published here


Competition Rules

Since the event is a competition with symbolic prizes, we would like to grade solutions. Solutions should be sent as videos (max 5 min per video!). They should present the insights developed to solve the stated challenges. Each team can send a solution for each challenge as a separate video (one video per challenge). More details about the hackathon criteria will be announced at the event opening!

  • Are there at least 3 people in the team?
  • Is the presentation based on HackeR News data?
  • Is the solution a result of teamwork?
  • Is the solution hosted in a public place?
  • Is the solution useful for the imaginary business team at Hacker News, or does it have potential or clear business applications or a story?
  • Is there a clear business problem/story that you are explaining?
  • How attractive is the use case?
  • How well are you able to present your solution?
  • Is the solution explainable?
  • Does the solution have any statistical validation?

The presented solution should be submitted as a video. It is nice to have a solution based on a presentation or a dashboard. For challenges 2-4 the winning solution will be chosen based on the insightfulness and usefulness of the identified patterns. For challenge 1 the winning solution will be chosen based on a cost function; however, we would also like to know how you arrived at your predictions.

Mentors and Judges

McKinsey Analytics in Poland combines advanced data analytics solutions with in-depth industry and business knowledge, including multiple sectors such as commerce, banking, insurance, telecommunications, industrial production and heavy industry. McKinsey data scientists and architects, together with machine learning and data engineers, complement strategic and operational consulting and provide clients with advanced and robust data-driven solutions.

McKinsey Analytics experts specialize in many different areas: statistical learning, deep learning, evolutionary and multi-criteria optimization, multi-agent simulations, game theory, reinforcement learning, advanced econometrics, causal & Bayesian inference, uplift modelling, Explainable Artificial Intelligence, visualization and data engineering.

We are all looking forward to sharing with you some insights on how to identify and capture the most value and meaningful insights from data, and turn them into competitive advantages!


Talks

  • 2020-09-23 5:00pm UTC Julia Silge Data visualization for machine learning practitioners
  • 2020-09-24 1:00pm UTC Kenneth Benoit Why you should stop using other text mining packages and embrace quanteda
  • 2020-09-24 5:30pm UTC Why McKinsey Analytics? And how we use technology, data and global capabilities to serve our clients?

For whom?

We strongly encourage people with analytical thinking skills to participate in the event. Data analysts, developers, storytellers, BI consultants, web designers, researchers and data enthusiasts are all welcome, since they can learn a lot from one another!

  • Have a good understanding of text mining challenges?
  • Eager to get familiar with text mining concepts and good practices?
  • Want to meet people devoted to text analyses?
  • Enthusiastic about presenting insights related to text mining analysis?

The event is made just for you!


Teams

We would like participants to gather in teams of 4 or 5.

If you do not have a team, please join our Slack at whyr.pl/slack/ to find people willing to form one. We can also combine teams if you find it hard to gather a group of 4 or 5 on your own.

During registration we will collect a list of your skills to adjust the difficulty level of the challenges and to inform the judges about the overall skill level of participants. This can also help us create teams if you are unable to find one!


Registration

The Text Mining Hackathon is a free event. However, registration is required. This allows the organizers to arrange enough mentors and judges for the event.

Use the form at whyr.pl/2020/hackathon/register/ to register your team for the competition!

Event details

  • Place: Remote Global Challenge
  • Date: 23.09.2020 - 24.09.2020
  • Start - 5:00pm UTC 23.09.2020
  • End - 5:30pm UTC 24.09.2020

  • Talks during the event
    • 2020-09-23 5:00pm UTC Julia Silge Data visualization for machine learning practitioners
    • 2020-09-24 1:00pm UTC Kenneth Benoit Why you should stop using other text mining packages and embrace quanteda
    • 2020-09-24 5:30pm UTC Why McKinsey Analytics? And how we use technology, data and global capabilities to serve our clients?
  • For? For everyone interested in text analysis and data science!
  • Tech? Any software that helps you win can be used!