How to Promote Your Great Idea?

  • The main aim of this project is to determine the best way to post information about a project or idea and to get competent feedback and valuable insights. Besides the content of the post itself, timing may be an important factor. For that reason, show posts and other types of posts on the popular Hacker News website are investigated with respect to the number of comments and upvotes they receive.
  • The dataset covers Hacker News posts, where user-submitted stories and questions are voted and commented upon. The original dataset is taken from Kaggle and is openly available. The site has around five million users today, and The New Yorker writes that "landing a blog post or personal project on the front page is a badge of honor for many technologists" and that "the site is now a portal to tech culture for millions of people". It is therefore valuable to know how to make a popular post on Hacker News and draw the attention of others, including prominent investors such as Y Combinator.
  • The final result of the investigation may serve those who want to promote their ideas or get competent answers in the tech field and wish to increase their odds of doing so. Our answer is that the Hacker News website is a suitable place for this, and that the best time to publish a post is around midday or a bit later, up to 15:00 (Eastern Time in the US).

Initial Data Exploration

The Hacker News website was launched as a side project on February 19, 2007 by Paul Graham's investment fund and startup incubator, Y Combinator. The original intent was to build a community around a social news website in the spirit of intellectual curiosity. Its focus is on computer science and tech entrepreneurship. The content is actively moderated to keep out spam and misconduct. Users cannot downvote comments until they have accumulated 501 "karma points", where karma is calculated as the difference between the upvotes and downvotes received by their own content.

There are several types of posts on the website, such as comments, ask posts, show posts and job-related posts. The two categories of major interest to us are ask posts and show posts. In the former, users ask for advice or for help with a problem, while in the latter they show others a project, a potential product, etc.

Let's load our dataset (downloaded from Kaggle) and explore the content.

In [17]:
# Read the downloaded data from Hacker News, turn it into a list of lists and print the first several rows
from csv import reader

# the with block closes the file once it has been read
with open('/Users/mac/downloads/HN_posts_year_to_Sep_26_2016.csv') as open_file:
    hn = list(reader(open_file))

for row in hn[:5]:  # loop for printing each row
    print(row, '\n')
['id', 'title', 'url', 'num_points', 'num_comments', 'author', 'created_at'] 

['12579008', 'You have two days to comment if you want stem cells to be classified as your own', 'http://www.regulations.gov/document?D=FDA-2015-D-3719-0018', '1', '0', 'altstar', '9/26/2016 3:26'] 

['12579005', 'SQLAR  the SQLite Archiver', 'https://www.sqlite.org/sqlar/doc/trunk/README.md', '1', '0', 'blacksqr', '9/26/2016 3:24'] 

['12578997', 'What if we just printed a flatscreen television on the side of our boxes?', 'https://medium.com/vanmoof/our-secrets-out-f21c1f03fdc8#.ietxmez43', '1', '0', 'pavel_lishin', '9/26/2016 3:19'] 

['12578989', 'algorithmic music', 'http://cacm.acm.org/magazines/2011/7/109891-algorithmic-composition/fulltext', '1', '0', 'poindontcare', '9/26/2016 3:16'] 

Cleaning Data to Analyse Posts with Comments

There are seven columns in the dataset: post id, post title, url, number of upvotes, number of comments, author's nickname and, finally, the date and time when the post was created (time zone: Eastern Time in the US). For the subsequent analysis we remove the header row from the dataset and save it separately.

In [18]:
headers = hn[0] # create list of headers
hn = hn[1:] # delete headers from the main body of the dataset

print('Headers:', headers, '\n')

print('Main body of the dataset:')
for row in hn[:2]:
    print(row, '\n')
Headers: ['id', 'title', 'url', 'num_points', 'num_comments', 'author', 'created_at'] 

Main body of the dataset:
['12579008', 'You have two days to comment if you want stem cells to be classified as your own', 'http://www.regulations.gov/document?D=FDA-2015-D-3719-0018', '1', '0', 'altstar', '9/26/2016 3:26'] 

['12579005', 'SQLAR  the SQLite Archiver', 'https://www.sqlite.org/sqlar/doc/trunk/README.md', '1', '0', 'blacksqr', '9/26/2016 3:24'] 

It is possible to see that many rows in the original dataset have zero in the number of comments column, which is one of the variables of major interest to us. Therefore, we drop such rows as irrelevant and analyse only those posts that drew the attention of others.

In [19]:
# Delete rows with 0 for the number of comments column
print("Length of the initial dataset:", len(hn))

hn_new = [] # create dataset after cleaning

for row in hn:
    if row[4] != '0':
        hn_new.append(row)

print("Length of the cleaned dataset:", len(hn_new))
Length of the initial dataset: 293119
Length of the cleaned dataset: 80401
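
The same filter can also be written as a list comprehension. The snippet below is a minimal sketch on three made-up rows that follow the dataset's column order; the helper name keep_commented and the sample rows are illustrative assumptions, not part of the original notebook.

```python
# Column order follows the dataset: id, title, url, num_points,
# num_comments, author, created_at. The rows are made up for illustration.
sample_rows = [
    ['1', 'Ask HN: Example question?', '', '4', '7', 'user_a', '9/26/2016 2:53'],
    ['2', 'Some story', 'http://example.com', '1', '0', 'user_b', '9/26/2016 3:24'],
    ['3', 'Show HN: Example project', 'http://example.org', '3', '2', 'user_c', '9/25/2016 19:06'],
]

NUM_COMMENTS = 4  # index of the num_comments column


def keep_commented(rows):
    """Return only the rows whose comment count is greater than zero."""
    return [row for row in rows if int(row[NUM_COMMENTS]) > 0]


commented = keep_commented(sample_rows)
print(len(sample_rows), len(commented))
```

Comparing integers rather than the string '0' also guards against values such as '00', although the dataset does not appear to contain any.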

We should also check whether there are outliers in our dataset that could influence our subsequent inference. First, we group ask posts and show posts by the number of comments they received and calculate each group's percentage share in the total number of comments on these posts.

In [20]:
# Create frequency table to calculate the total amount of comments received by each
# group of posts, i.e. those with 1 comment, 2 comments, etc.
comments = {}

comments_total = 0 # total of all comments on ask and show posts in the cleaned dataset

for row in hn_new:  # loop through the dataset to fill our dictionary
    title = row[1]
    if title.lower().startswith('ask hn') or title.lower().startswith('show hn'):
        comments_total += int(row[4]) # only ask and show posts are considered
        if row[4] not in comments:
            comments[row[4]] = int(row[4])
        else:
            comments[row[4]] += int(row[4])

 
comments_frequency = [] # create list of tuples for sorting

for key in comments: # loop through dictionary to fill the list of tuples
        comments[key] = comments[key]*100/comments_total 
        comments_tuple = (comments[key], key)    # which can then be sorted
        comments_frequency.append(comments_tuple)

# arrange results in descending order
comments_frequency = sorted(comments_frequency, reverse = True) 
        
for entry in comments_frequency[:15]: # print each type of post with corresponding
        print(entry[1], ':', round(entry[0], 2), '(%)') # percentages
2 : 2.85 (%)
3 : 2.64 (%)
4 : 2.48 (%)
1 : 2.17 (%)
6 : 2.14 (%)
5 : 1.97 (%)
7 : 1.78 (%)
9 : 1.65 (%)
10 : 1.41 (%)
8 : 1.39 (%)
11 : 1.31 (%)
12 : 1.15 (%)
14 : 1.06 (%)
16 : 1.04 (%)
15 : 1.01 (%)

The results, shown in descending order, indicate that no single group dominates: none of them accounts for more than about 3% of all comments, and the shares decline gradually. The largest shares belong to posts that were commented on just a few times, which suggests that there are no significant outliers with respect to the number of comments received by single posts.
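
To make the calculation concrete: each percentage is the sum of comments over posts with exactly k comments, divided by the total number of comments. The following is a minimal sketch on made-up counts; the function name comment_share_by_group is hypothetical and the input is not taken from the dataset.

```python
from collections import defaultdict


def comment_share_by_group(comment_counts):
    """Map each distinct comment count k to the percentage of all comments
    contributed by posts that received exactly k comments."""
    total = sum(comment_counts)
    group_totals = defaultdict(int)
    for k in comment_counts:
        group_totals[k] += k
    return {k: subtotal * 100 / total for k, subtotal in group_totals.items()}


# Made-up example: five posts with 1, 1, 2, 2 and 4 comments (10 in total).
shares = comment_share_by_group([1, 1, 2, 2, 4])
for k, share in sorted(shares.items(), key=lambda item: item[1], reverse=True):
    print(k, ':', round(share, 2), '(%)')
```

In this toy input the single post with 4 comments contributes as much as the two posts with 2 comments each, which shows that the table measures shares of comments rather than shares of posts.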

Second, we repeat the same procedure for the upvotes received by show and ask posts.

In [21]:
# Create frequency table to calculate the total amount of upvotes received
# by each group of posts, i.e. those with 1 upvote, 2 upvotes, etc.
upvotes = {}

upvotes_total = 0 # total of all upvotes on ask and show posts in the cleaned dataset

for row in hn_new:  # loop through the dataset to fill our dictionary
    title = row[1]
    if title.lower().startswith('ask hn') or title.lower().startswith('show hn'):
        upvotes_total += int(row[3]) # only ask and show posts are considered
        if row[3] not in upvotes:
            upvotes[row[3]] = int(row[3])
        else:
            upvotes[row[3]] += int(row[3])

 
upvotes_frequency = []  # create list of tuples for sorting

for key in upvotes:   # loop through dictionary to fill the list of tuples
        upvotes[key] = upvotes[key]*100/upvotes_total 
        upvotes_tuple = (upvotes[key], key)    # which can then be sorted
        upvotes_frequency.append(upvotes_tuple)

# arrange results in descending order
upvotes_frequency = sorted(upvotes_frequency, reverse = True) 
        
for entry in upvotes_frequency[:15]: # print each type of post with corresponding
        print(entry[1], ':', round(entry[0], 2), '(%)') # percentages
3 : 1.82 (%)
4 : 1.76 (%)
2 : 1.67 (%)
6 : 1.5 (%)
5 : 1.5 (%)
8 : 1.27 (%)
7 : 1.25 (%)
9 : 1.16 (%)
11 : 1.0 (%)
12 : 0.99 (%)
10 : 0.99 (%)
13 : 0.98 (%)
15 : 0.86 (%)
14 : 0.84 (%)
16 : 0.79 (%)

Similar results are obtained in this case as well: no group of posts received the majority of upvotes. Therefore, we may proceed with no fear of outliers distorting our conclusions.

Figuring Out the Best Hour to Post a Post

Let's now separate the two types of posts we are interested in and compare them with the total. In addition, we may also consider the other types of posts. For this purpose we create three separate lists, one per category, and compare them.

In [22]:
ask_posts = []  # list for asking type posts
show_posts = [] # list for showing type posts
other_posts = [] # list for other posts

for row in hn_new:       # loop through the main dataset to create lists
    title = row[1]
    if title.lower().startswith('ask hn'): # when a post starts with 'ask hn'
        ask_posts.append(row)              # it is added to the asking type list
    elif title.lower().startswith('show hn'): # when a post starts with 'show hn'
        show_posts.append(row)                # it is added to the showing type list
    else:                                     # otherwise it is added to 
        other_posts.append(row)               # the list of other posts
        
# Print results for each type of posts       
print('Number of asking type posts and share in total:', len(ask_posts),"|", 
      round(len(ask_posts)/len(hn_new)*100,2),"(%)", '\n', ask_posts[:2], '\n', '\n')
print('Number of showing type posts and share in total:', len(show_posts),"|", 
      round(len(show_posts)/len(hn_new)*100,2),"(%)", '\n', show_posts[:2], '\n', '\n')
print('Number of other types of posts and share in total:', len(other_posts),"|", 
      round(len(other_posts)/len(hn_new)*100,2),"(%)", '\n', other_posts[:2])
Number of asking type posts and share in total: 6911 | 8.6 (%) 
 [['12578908', 'Ask HN: What TLD do you use for local development?', '', '4', '7', 'Sevrene', '9/26/2016 2:53'], ['12578522', 'Ask HN: How do you pass on your work when you die?', '', '6', '3', 'PascLeRasc', '9/26/2016 1:17']] 
 

Number of showing type posts and share in total: 5059 | 6.29 (%) 
 [['12577142', 'Show HN: Jumble  Essays on the go #PaulInYourPocket', 'https://itunes.apple.com/us/app/jumble-find-startup-essay/id1150939197?ls=1&mt=8', '1', '1', 'ryderj', '9/25/2016 20:06'], ['12576813', 'Show HN: Learn Japanese Vocab via multiple choice questions', 'http://japanese.vul.io/', '1', '1', 'soulchild37', '9/25/2016 19:06']] 
 

Number of other types of posts and share in total: 68431 | 85.11 (%) 
 [['12578975', 'Saving the Hassle of Shopping', 'https://blog.menswr.com/2016/09/07/whats-new-with-your-style-feed/', '1', '1', 'bdoux', '9/26/2016 3:13'], ['12578822', 'Amazons Algorithms Dont Find You the Best Deals', 'https://www.technologyreview.com/s/602442/amazons-algorithms-dont-find-you-the-best-deals/', '1', '1', 'yarapavan', '9/26/2016 2:26']]

The percentage shares show that other posts make up the large majority (about 85%) of the commented posts in the dataset, while ask posts (8.6%) are somewhat more numerous than show posts (6.29%).
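
The classification rule used in the cell above can be isolated in a small function. This is a minimal sketch; the name classify_post is an assumption introduced here, and the three titles are taken from the sample rows printed above.

```python
def classify_post(title):
    """Return 'ask', 'show' or 'other' based on the title prefix."""
    lowered = title.lower()
    if lowered.startswith('ask hn'):
        return 'ask'
    if lowered.startswith('show hn'):
        return 'show'
    return 'other'


titles = [
    'Ask HN: What TLD do you use for local development?',
    'Show HN: Learn Japanese Vocab via multiple choice questions',
    'Saving the Hassle of Shopping',
]
labels = [classify_post(title) for title in titles]
print(labels)
```

Lowercasing the title once keeps the comparison case-insensitive, as in the original loop.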

Let's now calculate the average number of comments and upvotes for each type of post to see which one is the most popular.

In [23]:
total_ask_comments = 0 # total amount of comments for ask posts
total_ask_upvotes = 0

for row in ask_posts:
    total_ask_comments += int(row[4]) # add the number of post's comments
    total_ask_upvotes += int(row[3])
    
avg_ask_comments = total_ask_comments/len(ask_posts) # calculate average
avg_ask_upvotes = total_ask_upvotes/len(ask_posts)
    
print('Ask posts:', '\n', "Average number of comments -", avg_ask_comments, 
      "|", "Average number of upvotes -", avg_ask_upvotes, "\n")

total_show_comments = 0 # total amount of comments for show posts
total_show_upvotes = 0

for row in show_posts:
    total_show_comments += int(row[4]) 
    total_show_upvotes += int(row[3])
    
avg_show_comments = total_show_comments/len(show_posts)
avg_show_upvotes = total_show_upvotes/len(show_posts)
    
print('Show posts:', '\n', "Average number of comments -", avg_show_comments, 
      "|", "Average number of upvotes -", avg_show_upvotes, "\n")

total_other_comments = 0 # total amount of comments for other posts
total_other_upvotes = 0

for row in other_posts:
    total_other_comments += int(row[4]) 
    total_other_upvotes += int(row[3])
    
avg_other_comments = total_other_comments/len(other_posts)
avg_other_upvotes = total_other_upvotes/len(other_posts)
    
print('Other posts:', '\n', "Average number of comments -", avg_other_comments, 
      "|", "Average number of upvotes -", avg_other_upvotes)
Ask posts: 
 Average number of comments - 13.744175951381855 | Average number of upvotes - 14.40457242077847 

Show posts: 
 Average number of comments - 9.810832180272781 | Average number of upvotes - 26.62126902549911 

Other posts: 
 Average number of comments - 25.838318890561297 | Average number of upvotes - 53.43003901740439

From the results above we may notice that ask posts receive more comments on average than show posts (13.74 versus 9.81). This may imply that visitors of the Hacker News website are more likely to comment on other users' questions than to add their thoughts on projects and ideas being shown. At the same time, other posts receive even more comments on average (25.84).

Nonetheless, in terms of average upvotes show posts score higher than ask posts (26.62 versus 14.40). This can probably be explained to a certain extent by the content itself: questions in ask posts are generally just meant to be answered, while posts about projects and ideas, i.e. show posts, are published to receive feedback and reactions from other users. Other types of posts again show even higher numbers (53.43), but due to their miscellaneous nature this is more difficult to explain.
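
The three nearly identical averaging loops above could be replaced by one helper. The sketch below assumes a hypothetical function column_average and applies it to the two ask posts printed earlier, so the numbers are only a toy check and not the dataset averages.

```python
def column_average(rows, index):
    """Average of the integer values stored as strings in column `index`."""
    return sum(int(row[index]) for row in rows) / len(rows)


# The two ask posts printed earlier: num_points is at index 3,
# num_comments is at index 4.
rows = [
    ['12578908', 'Ask HN: What TLD do you use for local development?', '',
     '4', '7', 'Sevrene', '9/26/2016 2:53'],
    ['12578522', 'Ask HN: How do you pass on your work when you die?', '',
     '6', '3', 'PascLeRasc', '9/26/2016 1:17'],
]
avg_comments = column_average(rows, 4)
avg_upvotes = column_average(rows, 3)
print(avg_comments, avg_upvotes)
```

The helper raises ZeroDivisionError on an empty list, which is acceptable here because all three categories are non-empty.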

Let's now explore how comments and upvotes on each type of post are distributed over the hours of the day, in order to figure out at what hours the visitors of the Hacker News website are most active.
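
The key step is turning a created_at string into an hour of the day. The sketch below shows it in isolation on four timestamps that appear in the sample rows above; the names DATE_FORMAT and posts_per_hour are assumptions made for this illustration.

```python
import datetime as dt
from collections import Counter

DATE_FORMAT = '%m/%d/%Y %H:%M'  # matches values such as '9/26/2016 2:53'

timestamps = ['9/26/2016 2:53', '9/26/2016 1:17', '9/25/2016 20:06', '9/26/2016 2:26']

# Parse each string and keep only the hour (0-23).
hours = [dt.datetime.strptime(ts, DATE_FORMAT).hour for ts in timestamps]

# Count how many posts fall into each hour.
posts_per_hour = Counter(hours)
print(hours)
print(posts_per_hour)
```

A Counter removes the need for the explicit "if hour not in ..." branch when only the number of posts is required.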

In [24]:
# Import module to work with time objects
import datetime as dt
ask_result_list = [] # create list of ask posts with respective time,
                     # number of comments and number of upvotes
show_result_list = [] # create list of show posts with respective time,
                      # number of comments and number of upvotes
other_result_list = [] # create list of other posts with respective time,
                       # number of comments and number of upvotes
    
for row in ask_posts:
    ask_result_list.append([row[6], int(row[4]), int(row[3])])
    
for row in show_posts:
    show_result_list.append([row[6], int(row[4]), int(row[3])])
    
for row in other_posts:
    other_result_list.append([row[6], int(row[4]), int(row[3])])

# print created lists
print("Ask posts timing and number of comments:", ask_result_list[:2], "...") 
print("Show posts timing and number of comments:", show_result_list[:2], "...")
print("Other posts timing and number of comments:", other_result_list[:2], "...")
print("\n")

ask_counts_by_hour = {}  # create frequency table for the distribution   
                         # of ask posts per hour
ask_comments_by_hour = {} # create frequency table for the distribution  
                          # of ask posts' number of comments per hour
ask_upvotes_by_hour = {} # create frequency table for the distribution  
                         # of ask posts' number of upvotes per hour

for row in ask_result_list:
    date = dt.datetime.strptime(row[0], "%m/%d/%Y %H:%M") # create time objects
    hour = date.hour                                     # to take hours out of it
    if hour not in ask_counts_by_hour:                
        ask_counts_by_hour[hour] = 1        # count the number of ask posts per hour
        ask_comments_by_hour[hour] = row[1]  # count the number of comments per hour
        ask_upvotes_by_hour[hour] = row[2]   # count the number of upvotes per hour
    else:
        ask_counts_by_hour[hour] += 1
        ask_comments_by_hour[hour] += row[1]
        ask_upvotes_by_hour[hour] += row[2]

# The same procedure is done for show posts
show_counts_by_hour = {}
show_comments_by_hour = {}
show_upvotes_by_hour = {}
        
for row in show_result_list:
    date = dt.datetime.strptime(row[0], "%m/%d/%Y %H:%M")
    hour = date.hour
    if hour not in show_counts_by_hour:
        show_counts_by_hour[hour] = 1
        show_comments_by_hour[hour] = row[1]
        show_upvotes_by_hour[hour] = row[2]
    else:
        show_counts_by_hour[hour] += 1
        show_comments_by_hour[hour] += row[1]
        show_upvotes_by_hour[hour] += row[2]
        
# The same procedure is done for other posts
other_counts_by_hour = {}
other_comments_by_hour = {}
other_upvotes_by_hour = {}
        
for row in other_result_list:
    date = dt.datetime.strptime(row[0], "%m/%d/%Y %H:%M")
    hour = date.hour
    if hour not in other_counts_by_hour:
        other_counts_by_hour[hour] = 1
        other_comments_by_hour[hour] = row[1]
        other_upvotes_by_hour[hour] = row[2]
    else:
        other_counts_by_hour[hour] += 1
        other_comments_by_hour[hour] += row[1]
        other_upvotes_by_hour[hour] += row[2]

# Print three frequency tables for show posts
print("Frequency of show posts per each hour:", show_counts_by_hour, "\n")
print("Frequency of show posts' comments per each hour:", show_comments_by_hour, "\n")
print("Frequency of show posts' upvotes per each hour:", show_upvotes_by_hour, "\n")
Ask posts timing and number of comments: [['9/26/2016 2:53', 7, 4], ['9/26/2016 1:17', 3, 6]] ...
Show posts timing and number of comments: [['9/25/2016 20:06', 1, 1], ['9/25/2016 19:06', 1, 1]] ...
Other posts timing and number of comments: [['9/26/2016 3:13', 1, 1], ['9/26/2016 2:26', 1, 1]] ...


Frequency of show posts per each hour: {20: 246, 19: 270, 16: 389, 14: 331, 10: 156, 9: 158, 6: 95, 3: 97, 18: 311, 17: 370, 15: 392, 11: 228, 13: 334, 1: 135, 22: 192, 12: 300, 8: 160, 4: 90, 0: 141, 21: 209, 5: 76, 23: 148, 7: 127, 2: 104} 

Frequency of show posts' comments per each hour: {20: 2183, 19: 2791, 16: 3769, 14: 3839, 10: 1228, 9: 1411, 6: 904, 3: 934, 18: 3242, 17: 3236, 15: 3824, 11: 2413, 13: 3314, 1: 1006, 22: 1450, 12: 3609, 8: 1771, 4: 978, 0: 1283, 21: 1759, 5: 592, 23: 1444, 7: 1577, 2: 1076} 

Targeting Posts' Timing With Comments

Now that we have the hourly frequency distributions of both posts and their comments, we may figure out at which hours posts are commented on the most. To do so, the number of comments on posts created in a given hour is divided by the number of posts created in that hour. Again, we do that for both ask posts and show posts, though the latter are of greater interest from the perspective of getting something promoted on the Hacker News website.
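
As a worked example of this division, the sketch below takes three hours from the show posts tables printed above (12:00, 7:00 and 1:00) and computes the average number of comments per post. The function name average_per_post is an assumption; it is not the function used in the notebook.

```python
def average_per_post(totals_by_hour, posts_by_hour):
    """Divide the hourly total (comments or upvotes) by the number of posts
    created in the same hour, rounded to two decimals."""
    return {
        hour: round(totals_by_hour[hour] / posts_by_hour[hour], 2)
        for hour in posts_by_hour
        if hour in totals_by_hour
    }


# Values copied from the show posts frequency tables for three hours.
show_posts = {12: 300, 7: 127, 1: 135}
show_comments = {12: 3609, 7: 1577, 1: 1006}

averages = average_per_post(show_comments, show_posts)
best_hour = max(averages, key=averages.get)
print(averages)
print(best_hour)
```

The averages are comments divided by posts, not the other way round, so a larger value means more comments per post.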

Let's start with the average number of comments per ask post for each hour.

In [25]:
# Create function for hourly averages calculation
def hourly_rates(items, counts):
    avg_by_hour = [] # create list of lists with hours and corresponding averages

    for hour in counts: # loop through dictionaries to calculate the average
        if hour in items:    # number of items per post
            average = items[hour]/counts[hour] 
            avg_by_hour.append([hour, round(average, 2)])
        
    swap_avg_by_hour = []  # create list for subsequent sorting

    for row in avg_by_hour:
        swap_avg_by_hour.append([row[1], row[0]])

    sorted_swap = sorted(swap_avg_by_hour, reverse = True) # sort by the number of items
    line = "{}: {} average comments per ask post"     # set format for printing (the label is the same for every call)

    print("Posts comments' hourly averages:","\n")
    for row in sorted_swap:      # print all rows in descending order
        time = dt.datetime.strptime(str(row[1]), "%H")
        time = time.strftime("%H:%M")
        print(line.format(time, row[0]))
In [26]:
# Run the function for comments' hourly averages calculation of ask posts
hourly_rates(ask_comments_by_hour, ask_counts_by_hour) 
Posts comments' hourly averages: 

15:00: 39.67 average comments per ask post
13:00: 22.22 average comments per ask post
12:00: 15.45 average comments per ask post
10:00: 13.76 average comments per ask post
17:00: 13.73 average comments per ask post
02:00: 13.2 average comments per ask post
14:00: 13.15 average comments per ask post
04:00: 12.69 average comments per ask post
08:00: 12.43 average comments per ask post
22:00: 11.75 average comments per ask post
20:00: 11.38 average comments per ask post
11:00: 11.14 average comments per ask post
05:00: 11.14 average comments per ask post
21:00: 11.06 average comments per ask post
18:00: 10.79 average comments per ask post
16:00: 10.76 average comments per ask post
03:00: 10.16 average comments per ask post
07:00: 10.1 average comments per ask post
00:00: 9.86 average comments per ask post
19:00: 9.41 average comments per ask post
01:00: 9.37 average comments per ask post
06:00: 9.02 average comments per ask post
09:00: 8.39 average comments per ask post
23:00: 8.32 average comments per ask post

According to the results above, 15:00 (Eastern Time in the US) is the optimal time to create an ask post in order to get the most comments on it. The second best time is 13:00, and the third best is 12:00. Thus, the top three hours for ask posts are close together, falling around and shortly after lunch time in the time zone considered. Moreover, the difference between the top hour and the rest in terms of average comments is striking: ask posts created at 15:00 receive at least two and a half times as many comments on average as those created at any other hour except 13:00 (39.67 versus 15.45 or fewer).

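Let's repeat the procedure for show posts and see the most popular hours for this category. Note that the function prints the fixed label "average comments per ask post" for every call, so in the outputs that follow the label should be read according to the arguments passed (here: average comments per show post).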

In [27]:
# Run the function for comments' hourly averages calculation of show posts
hourly_rates(show_comments_by_hour, show_counts_by_hour)
Posts comments' hourly averages: 

07:00: 12.42 average comments per ask post
12:00: 12.03 average comments per ask post
14:00: 11.6 average comments per ask post
08:00: 11.07 average comments per ask post
04:00: 10.87 average comments per ask post
11:00: 10.58 average comments per ask post
18:00: 10.42 average comments per ask post
02:00: 10.35 average comments per ask post
19:00: 10.34 average comments per ask post
13:00: 9.92 average comments per ask post
23:00: 9.76 average comments per ask post
15:00: 9.76 average comments per ask post
16:00: 9.69 average comments per ask post
03:00: 9.63 average comments per ask post
06:00: 9.52 average comments per ask post
00:00: 9.1 average comments per ask post
09:00: 8.93 average comments per ask post
20:00: 8.87 average comments per ask post
17:00: 8.75 average comments per ask post
21:00: 8.42 average comments per ask post
10:00: 7.87 average comments per ask post
05:00: 7.79 average comments per ask post
22:00: 7.55 average comments per ask post
01:00: 7.45 average comments per ask post

The results are quite different from what we have seen above, which is surprising at first sight. The distribution of averages is rather uniform, with the top three not differing much from the rest, so we cannot single out the best hour for a show post based on these results alone. Moreover, the top hour (7:00) is early in the morning, which might suggest that users outside the US Eastern Time Zone, i.e. those from other parts of the US and the rest of the world, are actively using this section of the Hacker News website.

Finally, let's carry out the same procedure for other types of posts.

In [28]:
# Run the function for comments' hourly averages calculation of other posts
hourly_rates(other_comments_by_hour, other_counts_by_hour)
Posts comments' hourly averages: 

13:00: 29.37 average comments per ask post
12:00: 29.2 average comments per ask post
14:00: 28.09 average comments per ask post
15:00: 27.97 average comments per ask post
11:00: 27.13 average comments per ask post
17:00: 26.92 average comments per ask post
16:00: 26.83 average comments per ask post
02:00: 26.79 average comments per ask post
05:00: 26.14 average comments per ask post
09:00: 26.12 average comments per ask post
18:00: 26.08 average comments per ask post
08:00: 25.95 average comments per ask post
10:00: 25.74 average comments per ask post
19:00: 25.37 average comments per ask post
03:00: 24.56 average comments per ask post
00:00: 24.43 average comments per ask post
07:00: 24.33 average comments per ask post
06:00: 24.06 average comments per ask post
20:00: 23.68 average comments per ask post
04:00: 23.51 average comments per ask post
01:00: 23.51 average comments per ask post
21:00: 23.05 average comments per ask post
23:00: 22.84 average comments per ask post
22:00: 22.72 average comments per ask post

The distribution of averages for other posts resembles that of show posts: it is rather uniform, with even smaller differences between the maximum and minimum values. On the other hand, the top four hours all fall in the period between 12:00 and 15:00, which also contains the top three hours for ask posts.
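
One simple way to quantify how uniform each distribution is would be the ratio of the highest to the lowest hourly average. The ratio is a measure chosen here for illustration, not something the notebook computes; the pairs below are the first and last values of the three comment outputs above.

```python
# (highest, lowest) hourly comment averages copied from the outputs above.
extremes = {
    'ask': (39.67, 8.32),
    'show': (12.42, 7.45),
    'other': (29.37, 22.72),
}

# A ratio close to 1 means the hourly averages barely differ.
ratios = {
    name: round(highest / lowest, 2)
    for name, (highest, lowest) in extremes.items()
}
print(ratios)
```

By this measure other posts are the most uniform and ask posts the least, which matches the reading of the tables.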

On balance, the most favourable hours for the three categories overlap: 12:00 and 14:00 seem to bring slightly more comments on average for show posts, and both fall within or next to the top hours for ask posts and other types of posts. Therefore, the period between 12:00 and 15:00 seems to be the sweet spot we are ultimately looking for.
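
The overlap can be checked with set operations. The sketch below uses the top three hours for ask and show posts and the top four hours for other posts, as read from the comment outputs above; the variable names are assumptions made for this illustration.

```python
# Top hours by average comments, read from the outputs above.
top_hours = {
    'ask': {15, 13, 12},
    'show': {7, 12, 14},
    'other': {13, 12, 14, 15},
}

# Hours that appear in the top list of every category.
common_to_all = set.intersection(*top_hours.values())

# Which top hours of each category fall into the 12:00-15:00 window.
window = set(range(12, 16))
inside_window = {name: sorted(hours & window) for name, hours in top_hours.items()}

print(common_to_all)
print(inside_window)
```

Only 12:00 is shared by all three lists, while every category has at least two of its top hours inside the 12:00 to 15:00 window.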

Targeting Posts' Timing With Upvotes

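Using our dataset we may assess the popularity of posts not only by the number of comments, but also by the number of upvotes. Let's proceed with the hourly averages of this second metric for the types of posts under investigation. The label printed by the function is fixed, so the values in the outputs below are average upvotes per post even though the label says "comments".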

In [29]:
# Run the function for upvotes' hourly averages calculation of ask posts
hourly_rates(ask_upvotes_by_hour, ask_counts_by_hour) 
Posts comments' hourly averages: 

15:00: 29.31 average comments per ask post
13:00: 23.77 average comments per ask post
17:00: 16.96 average comments per ask post
10:00: 16.71 average comments per ask post
12:00: 16.53 average comments per ask post
18:00: 14.54 average comments per ask post
08:00: 13.81 average comments per ask post
04:00: 13.7 average comments per ask post
16:00: 13.69 average comments per ask post
14:00: 13.68 average comments per ask post
02:00: 12.63 average comments per ask post
07:00: 12.26 average comments per ask post
22:00: 12.0 average comments per ask post
05:00: 11.98 average comments per ask post
21:00: 11.9 average comments per ask post
00:00: 11.75 average comments per ask post
03:00: 11.5 average comments per ask post
01:00: 11.39 average comments per ask post
20:00: 10.99 average comments per ask post
06:00: 10.98 average comments per ask post
11:00: 10.91 average comments per ask post
19:00: 10.88 average comments per ask post
09:00: 9.52 average comments per ask post
23:00: 9.07 average comments per ask post

The top two hours are the same as in terms of comments: ask posts created at 15:00 and 13:00 receive more upvotes on average, roughly 1.4 to 1.7 times as many as the third-ranked hour (17:00, 16.96). This is not surprising: the most commented posts can be expected to be the most upvoted as well.

However, it is necessary to check if the same logic applies to show posts.

In [30]:
# Run the function for upvotes' hourly averages calculation of show posts
hourly_rates(show_upvotes_by_hour, show_counts_by_hour)
Posts comments' hourly averages: 

12:00: 33.57 average comments per ask post
11:00: 31.57 average comments per ask post
23:00: 30.4 average comments per ask post
19:00: 29.8 average comments per ask post
06:00: 29.38 average comments per ask post
14:00: 28.53 average comments per ask post
13:00: 28.51 average comments per ask post
18:00: 28.41 average comments per ask post
00:00: 27.65 average comments per ask post
04:00: 26.49 average comments per ask post
15:00: 26.17 average comments per ask post
16:00: 26.16 average comments per ask post
08:00: 25.98 average comments per ask post
17:00: 25.53 average comments per ask post
21:00: 25.08 average comments per ask post
20:00: 24.81 average comments per ask post
10:00: 23.89 average comments per ask post
22:00: 23.25 average comments per ask post
07:00: 23.15 average comments per ask post
02:00: 22.62 average comments per ask post
09:00: 20.84 average comments per ask post
01:00: 19.1 average comments per ask post
05:00: 19.08 average comments per ask post
03:00: 18.57 average comments per ask post

In the case of show posts the distribution of upvotes is again rather uniform, ranging from about 19 to 34 upvotes per post on average depending on the hour. Hence no strong inference can be made from these results alone. Nonetheless, 12:00 seems to be slightly more favourable for a show post, which is consistent with the results on comments for this category, where 12:00 ranks second.
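
To compare the comment and upvote rankings more directly, one can look at how many of the top five hours they share. The sketch below copies the top five hours from the ask and show outputs above; using the top five as the cut-off is an arbitrary choice made for this illustration.

```python
# Top five hours by average comments and by average upvotes,
# read from the ask and show outputs above.
top5 = {
    'ask': {'comments': [15, 13, 12, 10, 17], 'upvotes': [15, 13, 17, 10, 12]},
    'show': {'comments': [7, 12, 14, 8, 4], 'upvotes': [12, 11, 23, 19, 6]},
}

# Hours that appear in both rankings for each post type.
shared = {
    name: sorted(set(ranks['comments']) & set(ranks['upvotes']))
    for name, ranks in top5.items()
}
print(shared)
```

For ask posts the two rankings share all five hours, while for show posts only 12:00 appears in both, which is in line with the weaker pattern found for show posts.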

In [31]:
# Run the function for upvotes' hourly averages calculation of other posts
hourly_rates(other_upvotes_by_hour, other_counts_by_hour)
Posts comments' hourly averages: 

13:00: 58.62 average comments per ask post
12:00: 57.53 average comments per ask post
15:00: 55.95 average comments per ask post
17:00: 55.64 average comments per ask post
16:00: 55.62 average comments per ask post
02:00: 55.13 average comments per ask post
08:00: 55.12 average comments per ask post
18:00: 54.88 average comments per ask post
19:00: 54.72 average comments per ask post
14:00: 54.38 average comments per ask post
11:00: 53.6 average comments per ask post
05:00: 53.58 average comments per ask post
10:00: 53.24 average comments per ask post
07:00: 53.01 average comments per ask post
00:00: 52.65 average comments per ask post
03:00: 51.89 average comments per ask post
09:00: 51.62 average comments per ask post
06:00: 51.2 average comments per ask post
01:00: 50.97 average comments per ask post
21:00: 50.41 average comments per ask post
04:00: 49.46 average comments per ask post
23:00: 48.98 average comments per ask post
22:00: 48.5 average comments per ask post
20:00: 47.91 average comments per ask post

In terms of upvotes, the subset of other posts gives an almost identical result to what we have seen for the comment averages. The top three hours (13:00, 12:00 and 15:00) again indicate that items posted between 12:00 and 15:00 are slightly more likely to get upvotes.

Therefore, our preliminary proposition about the optimal time span for posting is further supported by the hourly distribution of upvotes for the major types of posts.

Conclusion

In this project we tried to estimate the best time to create a post on Hacker News in order to get as much attention as possible. For those seeking to promote their projects and ideas in the start-up community, it is valuable to know how to make the most of this website, which is run by Y Combinator and closely watched by prominent venture investors. Our conclusion is that 12:00 is the sweet spot for a show post to get comments and upvotes. For asking questions and other inquiries the hours after midday also work well.

Even though this timing is not significantly different from other hours in terms of comments or upvotes for show posts, there are two reasons to believe that you can gain more by posting at this time:

  • 12:00 is among the top hours to post a show post both in terms of comments and upvotes;
  • the time period between 12:00 and 15:00 (Eastern Time in the US) is the most favourable to post any type of post in general.

All in all, the Hacker News website is a great place to ask questions, promote projects and get feedback on them, and if you manage to make your posts around midday or a bit later the odds of getting what you are looking for are higher.