In this article, we take a thorough look at user behavior on Hacker News and try to understand how it was used in 2021.
2021 saw the birth and rise of Whaly, and we went through YCombinator in the Summer 2021 batch (S21). Since we also do data analysis for a living, we wanted to take some time to offer a comprehensive analysis of what happened on Hacker News, the famous social network created by YCombinator. Whaly is a business intelligence platform, so all the analyses that you'll see here were done using Whaly.
Let's get started!
If you don't yet know how the social network works, you can find the guide describing its rules here. For this analysis, we used the public BigQuery dataset that is published by Hacker News. You can find the dataset here. We created a public dashboard using Whaly that you can access here; it contains data from the first of January 2021 until the end of December 2021.
If we define an active user as a user who has posted or commented on Hacker News at least once, then there have been 767,496 active users in total since 2005. This year there were 153,047 active users, which is 20% of the all-time total. Not bad!
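The analysis itself was done in Whaly, but the definition above can be sketched in a few lines of Python. This is only an illustration: the `Item` record, its field names, and the choice to ignore job posts are assumptions, not the dataset's actual schema or the exact rule used for the dashboard.

```python
from typing import Iterable, NamedTuple, Optional, Set


class Item(NamedTuple):
    author: str  # username (hypothetical field name)
    kind: str    # "story", "comment" or "job" (hypothetical values)
    year: int    # year the item was created


def active_users(items: Iterable[Item], year: Optional[int] = None) -> Set[str]:
    """Users with at least one story or comment, optionally restricted to one year."""
    return {
        item.author
        for item in items
        if item.kind in {"story", "comment"}
        and (year is None or item.year == year)
    }


sample = [
    Item("alice", "story", 2019),
    Item("alice", "comment", 2021),
    Item("bob", "comment", 2021),
    Item("carol", "job", 2021),   # assumption: a job post alone does not count
    Item("dave", "story", 2018),
]

all_time = active_users(sample)
in_2021 = active_users(sample, year=2021)
share = len(in_2021) / len(all_time)

# The same ratio with the figures quoted in the article.
article_share = 153_047 / 767_496
print(f"{article_share:.1%}")
```

With the article's figures, the ratio comes out just under 20%, which matches the rounded number quoted above.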
This year has been very active in terms of engagement: more jobs were posted on the platform than ever, and the number of comments went through the roof, even though fewer stories were posted this year.
Hacker News grew by more than 10% this year, with 88,559 new active users joining its ranks!
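As a quick sanity check of the growth figure, here is the arithmetic in Python. It assumes that the all-time total of 767,496 active users already includes the 2021 newcomers, which the article does not state explicitly.

```python
total_active_all_time = 767_496  # from the article
new_active_2021 = 88_559         # from the article

# Assumption: the all-time total already includes the 2021 newcomers.
active_before_2021 = total_active_all_time - new_active_2021
growth = new_active_2021 / active_before_2021

print(f"{growth:.1%}")  # prints 13.0%
```

Under that assumption the growth is about 13%, consistent with "more than 10%".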
When I asked a couple of friends how they consume content from Hacker News, they all told me the same thing: "I open and browse Hacker News when my code is compiling." I wanted to understand whether the vast majority of users behave like my friends or whether my friends are complete outliers.
The heatmap above breaks down comments by day of the week and hour of the day, based on their creation date. This helps us understand how people interact with stories. The time zone used to create this chart is UTC. If you don't know the time difference between your time zone and UTC, please check here. The chart shows that the vast majority of comments are made between 2pm and 10pm UTC on weekdays, which is roughly 6am to 2pm Pacific time and 9am to 5pm Eastern time, so US working hours. I couldn't find any information on the location of each commenter, so I cannot drill down by location to see which one is the most active.
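For readers who want to rebuild this kind of heatmap outside of Whaly, here is a minimal Python sketch of the bucketing step. The function name and the sample timestamps are made up for illustration; the only thing taken from the article is the idea of counting comments per weekday and hour in UTC.

```python
from collections import Counter
from datetime import datetime, timezone

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]


def heatmap_counts(comment_times):
    """Count comments per (weekday, hour) cell, reading every timestamp in UTC."""
    counts = Counter()
    for created_at in comment_times:
        utc = created_at.astimezone(timezone.utc)
        counts[(WEEKDAYS[utc.weekday()], utc.hour)] += 1
    return counts


# Hypothetical creation dates of three comments.
sample_comments = [
    datetime(2021, 1, 4, 15, 30, tzinfo=timezone.utc),  # a Monday
    datetime(2021, 1, 4, 15, 45, tzinfo=timezone.utc),  # same Monday, same hour
    datetime(2021, 1, 9, 3, 0, tzinfo=timezone.utc),    # a Saturday
]

cells = heatmap_counts(sample_comments)
for (day, hour), n in sorted(cells.items()):
    print(day, hour, n)
```

Each key of `cells` is one square of the heatmap, and its value is the number of comments that fall into it.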
Conclusion: given these hours, people in Europe would be commenting on Hacker News during the afternoon and evening, and people in the States during business hours.
Unsurprisingly, we see that the vast majority of active users throughout the year are there to comment. We can distinguish three kinds of population:
Hacker News users tend to post between 1pm and 6pm UTC on weekdays, which makes me wonder whether most users post stories during their morning commute or morning coffee break in the States. Without knowing the country from which users are posting, we cannot say more on the question.
What we do see is that most active users on Hacker News are, as on any social network, there to interact with content that is already posted rather than to post content themselves. Indeed, 60% of active users only comment and never post, and only 2.6% of them post more than 12 stories throughout the year.
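To make the two percentages above concrete, here is a small Python sketch of how such shares could be computed from per-user yearly counts. The input shape (a username mapped to a pair of story and comment counts) and the sample users are assumptions for illustration only.

```python
def segment_shares(activity):
    """activity maps username -> (stories_posted, comments_made) for the year.

    Returns the share of users who only comment and the share who post
    more than 12 stories.
    """
    total = len(activity)
    comment_only = sum(1 for stories, comments in activity.values()
                       if stories == 0 and comments > 0)
    frequent_posters = sum(1 for stories, _ in activity.values() if stories > 12)
    return comment_only / total, frequent_posters / total


# Hypothetical users, not taken from the dataset.
sample_activity = {
    "alice": (0, 40),
    "bob": (0, 3),
    "carol": (2, 10),
    "dave": (15, 0),
    "erin": (0, 1),
}

comment_only_share, frequent_poster_share = segment_shares(sample_activity)
print(f"comment only: {comment_only_share:.0%}, "
      f"more than 12 stories: {frequent_poster_share:.0%}")
```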
To look at extreme behavior, we plotted users by their total number of comments and the number of stories they posted during the year. It is no surprise that most users neither comment nor post a lot, but a small proportion of users can be called super users: some post more than 1,000 stories (roughly 3 per day if you are consistent enough) and some write 3,000 comments or more (roughly 8 comments a day). Surprisingly, there don't seem to be outliers that both post and comment a lot. So it seems you are either someone who comments or someone who posts.
In order to make it to the front page of Hacker News, you need to maximize the number of comments and upvotes in a minimum amount of time, hence creating engagement rather quickly.
Unfortunately, we did not have enough information in this dataset to recreate the actual score that Hacker News uses to rank posts, so we decided to use a different formula.
We defined a custom ranking score as:
(Number of Comments on a Story * 3 + Number of Upvotes) / 4
This gives comments 3 times the weight of upvotes. On average, we noticed that there are 3 times more upvotes than comments, so this weighting puts comments and upvotes on an equal footing.
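The formula is simple enough to express directly in Python. The function below is a direct transcription of it; the function name and the example stories are made up for illustration.

```python
def custom_score(comments: int, upvotes: int) -> float:
    """Custom ranking from the article: comments weigh 3 times as much as upvotes."""
    return (comments * 3 + upvotes) / 4


# Hypothetical stories as (title, comments, upvotes).
stories = [
    ("Story A", 100, 300),  # typical 1:3 ratio of comments to upvotes
    ("Story B", 10, 500),   # many upvotes, little discussion
    ("Story C", 200, 100),  # heavily discussed
]

ranked = sorted(stories, key=lambda s: custom_score(s[1], s[2]), reverse=True)
for title, comments, upvotes in ranked:
    print(title, custom_score(comments, upvotes))
```

Story A shows the balancing effect: with the typical ratio of 3 upvotes per comment, the comment term and the upvote term contribute equally (300 each). Story C ranks first because heavy discussion is rewarded more than raw upvotes.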
Overall, the best time to post seems to be on Sundays. The screenshot below shows the top ten hour/day combinations for posting, and they are definitely on Sunday during the day (remember that we are still talking in UTC), so either early morning for the US or daytime for Europe.
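A ranking of posting slots like this one could be approximated as follows. This is a sketch under stated assumptions: the article does not say how scores were aggregated per slot, so the mean used here is a guess, and the function name and sample stories are hypothetical.

```python
from collections import defaultdict
from datetime import datetime, timezone
from statistics import mean

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]


def best_slots(stories, top_n=10):
    """stories: iterable of (posted_at, comments, upvotes), posted_at timezone-aware.

    Returns the top_n (weekday, hour) slots in UTC, ordered by mean custom score.
    Assumption: averaging per slot; the article does not specify the aggregation.
    """
    scores = defaultdict(list)
    for posted_at, comments, upvotes in stories:
        utc = posted_at.astimezone(timezone.utc)
        slot = (WEEKDAYS[utc.weekday()], utc.hour)
        scores[slot].append((comments * 3 + upvotes) / 4)
    averaged = {slot: mean(values) for slot, values in scores.items()}
    return sorted(averaged.items(), key=lambda kv: kv[1], reverse=True)[:top_n]


# Hypothetical stories, not taken from the dataset.
sample_stories = [
    (datetime(2021, 1, 3, 10, 0, tzinfo=timezone.utc), 40, 120),   # Sunday
    (datetime(2021, 1, 10, 10, 0, tzinfo=timezone.utc), 20, 60),   # Sunday
    (datetime(2021, 1, 4, 15, 0, tzinfo=timezone.utc), 10, 30),    # Monday
]

top = best_slots(sample_stories)
print(top)
```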
To understand what is discussed on Hacker News, we looked at the stories that got the most engagement in terms of comments and upvotes, using our custom ranking. You can find a link to those stories below:
Looking at the list above, I would say that the core of the community is made up of entrepreneurs and developers who like polarized content and a good mocking session (Facebook 😁), as on any major social network, but who mostly like to interact around tech-oriented subjects.
We then wanted to see which news providers have the most authority on Hacker News, so we extracted the domains from all the stories published this year, and this is what we got:
As we can see, most of the content discussed comes from domains such as GitHub, YouTube, Medium, Substack, and Twitter. This suggests that most of the discussions on Hacker News are about sharing opinions rather than facts coming from traditional newspapers.
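The domain extraction was done in Whaly, but the same idea can be sketched with Python's standard library. The helper name, the choice to strip a leading `www.`, and the sample URLs are assumptions for illustration; the real dataset may need extra cleaning.

```python
from collections import Counter
from urllib.parse import urlparse


def domain_of(url: str) -> str:
    """Return the host of a URL, lowercased and without a leading 'www.'."""
    host = urlparse(url).hostname or ""
    return host[4:] if host.startswith("www.") else host


# Hypothetical story URLs.
story_urls = [
    "https://github.com/some/repo",
    "https://www.github.com/another/repo",
    "https://www.youtube.com/watch?v=example",
    "https://medium.com/some-post",
]

domain_counts = Counter(domain_of(url) for url in story_urls)
print(domain_counts.most_common(3))
```

Counting stories per domain this way gives a rough view of which sites are shared most often. It does not group subdomains (for example, individual Substack newsletters), which would need an extra normalization step.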
We wanted to give a special medal to the most active users on Hacker News.
Users that have commented the most:
Users that have posted stories the most:
Users that have posted job offers the most:
There is probably a ton of bias in the work that we did on the Hacker News dataset, but the original goal was to discover insights using Whaly without writing any SQL. Mission accomplished!