Wednesday the 21st, November, 2018
I've been helping my 5th grade sister with her math league problems, and some brainteasers involving leap years came up. This sparked a seemingly (and probably actually) trivial question: how do the days of the week compare to one another in how often they appear in each month of our calendar? One would guess that over time the frequency of each weekday in each month would more or less even out, right? (i.e. an equal number of mon, tues, wed, etc.) Let's find out by taking a look at how the months on our calendar actually cycle through the days of the week, and when (or if) the sequence comes back to itself. Some questions we may answer are:
There are a number of existing algorithms that can determine the day of the week for any given date. But I wanted to see how the distribution of these days in each month of the year plays out over the course of an entire Gregorian cycle. We'll be running a program to evaluate the frequency of each weekday as it occurs in each month throughout the years since our current calendar's conception.
Fun fact: The first day of the Gregorian calendar was October 15, 1582. A Friday.
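As a quick illustration of one such existing algorithm, here is a minimal sketch of Zeller's congruence, checked against the fun fact above. This is not the method used in the rest of this post, and the names `zeller_weekday` and `WEEKDAY_NAMES` are my own, chosen for illustration.

```python
from datetime import date

# Weekday labels with Sunday = 0, matching strftime("%w")
WEEKDAY_NAMES = ["sun", "mon", "tues", "wed", "thurs", "fri", "sat"]

def zeller_weekday(year, month, day):
    """Return the weekday (0 = Sunday ... 6 = Saturday) of a Gregorian date."""
    # Zeller treats January and February as months 13 and 14 of the previous year
    if month < 3:
        month += 12
        year -= 1
    k = year % 100   # year within the century
    j = year // 100  # zero-based century
    h = (day + (13 * (month + 1)) // 5 + k + k // 4 + j // 4 + 5 * j) % 7
    # Zeller's h uses 0 = Saturday; shift so that 0 = Sunday
    return (h + 6) % 7

print(WEEKDAY_NAMES[zeller_weekday(1582, 10, 15)])  # fri
# Cross-check one date against the standard library (isoweekday: Mon = 1 ... Sun = 7)
print(zeller_weekday(2000, 1, 1) == date(2000, 1, 1).isoweekday() % 7)  # True
```

The final `(h + 6) % 7` only relabels Zeller's Saturday-first numbering so it lines up with the Sunday-first `%w` convention used in the rest of the code.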
# Setting up the libraries & functions we'll be using
import pandas as pd
from IPython.display import display_html, HTML
from datetime import date, datetime, timedelta as td
from plotly.grid_objs import Grid, Column
import plotly.plotly as py
import time
# Dictionaries for dataframe column & row headers
weekdays = {0:"sun", 1:"mon", 2:"tues", 3:"wed",
            4:"thurs", 5:"fri", 6:"sat"}
months = {0:"jan", 1:"feb", 2:"mar", 3:"apr",
          4:"may", 5:"jun", 6:"jul", 7:"aug",
          8:"sept", 9:"oct", 10:"nov", 11:"dec"}
ordinal = {0:"1st", 1:"2nd", 2:"3rd", 3:"4th",
           4:"5th", 5:"6th", 6:"7th"}
# Print the dates + weekday information given the
# intervals of days from a specified origin date
def printWeekDays(daysFrom, originDate):
    DayOnes = pd.to_datetime(daysFrom, unit='D', origin=pd.Timestamp(originDate))
    for day in DayOnes:
        print (day.strftime("%Y %b %d: %A (Day %w)"))
# Display dataframes side-by-side with their names on top
def disp_dfs(*args):
    html_str = ''
    for df in args:
        html_str += '<div style=max-width:45%;float:left;>'\
                    '<p style=font-weight:bold;text-align:center;>'\
                    +df.name+'</p>'\
                    +df.to_html()+'</div>'
    display_html(html_str.replace('table', 'table style=display:inline'), raw=True)
# Date ranges using datetime dates "date(%Y,%m,%d)" as input
def dateRange(start_date, end_date):
    for n in range(int ((end_date - start_date).days)):
        yield start_date + td(n)
How long does it take for a given date to cycle back and coincide on the same day of the week again?
The day of the week of any given date shifts forward 1 day for each nonleap year, and 2 days for each leap year.
# What a difference 4 years make
daysFrom = [0, 365+1, 365*2+1, 365*3+1, 365*4+1]
printWeekDays(daysFrom, '2000-1-1')
2000 Jan 01: Saturday (Day 6)
2001 Jan 01: Monday (Day 1)
2002 Jan 01: Tuesday (Day 2)
2003 Jan 01: Wednesday (Day 3)
2004 Jan 01: Thursday (Day 4)
Over the course of each leap year interval, the total shift is by 5 (or -2) weekdays.
So after 7 intervals (4*7 = 28 years), we should be back to the same day of the week on that date of the year.
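The arithmetic behind those shifts can be sketched in a few lines. This is only a restatement of the reasoning above, and the variable names are my own.

```python
DAYS_PER_WEEK = 7

nonleap_shift = 365 % DAYS_PER_WEEK                 # 1 weekday forward
leap_shift = 366 % DAYS_PER_WEEK                    # 2 weekdays forward
four_year_shift = (3 * 365 + 366) % DAYS_PER_WEEK   # 1461 % 7 = 5 (same as -2)

# Smallest number of 4-year blocks whose combined shift is a multiple of 7
blocks = next(n for n in range(1, 8)
              if (n * four_year_shift) % DAYS_PER_WEEK == 0)

print(nonleap_shift, leap_shift, four_year_shift, blocks * 4)  # 1 2 5 28
```

Since 5 and 7 share no common factor, no fewer than 7 blocks can bring the shift back to zero, which is where the 28 years come from.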
# 7 leap cycles
daysFrom = [0] * 8
for day in range(len(daysFrom)):
    daysFrom[day] = 1461*day
printWeekDays(daysFrom, '2000-1-1')
2000 Jan 01: Saturday (Day 6)
2004 Jan 01: Thursday (Day 4)
2008 Jan 01: Tuesday (Day 2)
2012 Jan 01: Sunday (Day 0)
2016 Jan 01: Friday (Day 5)
2020 Jan 01: Wednesday (Day 3)
2024 Jan 01: Monday (Day 1)
2028 Jan 01: Saturday (Day 6)
28-year intervals maintain the same day of the week, but this does not account for an additional adjustment: every century year that is not divisible by 400 is not a leap year. So if we iterate every 28 years, the date drifts and overcounts by a day each time we pass such a century year.
# Drifting through the centuries
daysFrom = [0] * 10
for day in range(len(daysFrom)):
    daysFrom[day] = 1461*7*day
printWeekDays(daysFrom, '2000-1-1')
2000 Jan 01: Saturday (Day 6)
2028 Jan 01: Saturday (Day 6)
2056 Jan 01: Saturday (Day 6)
2084 Jan 01: Saturday (Day 6)
2112 Jan 02: Saturday (Day 6)
2140 Jan 02: Saturday (Day 6)
2168 Jan 02: Saturday (Day 6)
2196 Jan 02: Saturday (Day 6)
2224 Jan 03: Saturday (Day 6)
2252 Jan 03: Saturday (Day 6)
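The century rule behind this drift can be sketched as follows. The helper names `is_leap` and `skipped_leap_days` are my own and not part of the notebook. The idea is to count the years that a plain "every 4 years" rule would wrongly treat as leap years.

```python
def is_leap(year):
    """Gregorian rule: every 4th year, except century years not divisible by 400."""
    return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)

def skipped_leap_days(start_year, end_year):
    """Count years in [start_year, end_year) divisible by 4 that are not leap years."""
    return sum(1 for y in range(start_year, end_year)
               if y % 4 == 0 and not is_leap(y))

print(skipped_leap_days(2000, 2112))  # 1 (the year 2100)
print(skipped_leap_days(2000, 2224))  # 2 (2100 and 2200)
```

Each skipped leap day pushes the 28-year stepping one calendar date later, which matches the Jan 02 and Jan 03 entries in the output above.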
A full Gregorian calendar cycle is 400 years, with 3 leap years omitted because they are century years nondivisible by 400.
The total number of days in a full Gregorian cycle is 146097 = (400 yr * 365 days/yr) + 97 days from leap years.
# Full Greg
daysFrom = [0, 146097]
printWeekDays(daysFrom, '1700-1-1')
1700 Jan 01: Friday (Day 5)
2100 Jan 01: Friday (Day 5)
146097 days is divisible by 7, so we see that every 400 years we are back to the day of the week in which we started, on the date of the calendar in which we started.
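As a sanity check on those numbers, here is a small sketch that counts the leap years in a 400-year span and confirms the total is a whole number of weeks. The span 2000 to 2399 is an arbitrary choice of mine, since any 400 consecutive years should give the same totals.

```python
def is_leap(year):
    """Gregorian leap year rule."""
    return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)

cycle = range(2000, 2400)  # any 400 consecutive years works
leap_years = sum(is_leap(y) for y in cycle)
total_days = 365 * len(cycle) + leap_years
weeks, leftover = divmod(total_days, 7)

print(leap_years, total_days, weeks, leftover)  # 97 146097 20871 0
```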
But 400 is not divisible by 7, so the 400 occurrences of any given month cannot start equally often on each of the 7 weekdays. And because every month other than a nonleap year Feb has a number of days not divisible by 7 (i.e. 29, 30, 31), there will be extra counts for the first one to three days of the week that that month began on. So this means we should expect an unequal distribution of the frequency of days of the week for each month.
# Number of days in each month
nonleap = [31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31]
leap = [31, 29, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31]
# Number of days shifted month-to-month
nonShift = [0]*12
leapShift = [0]*12
# Dataframes for each type of year's distribution.
# Every month has 28 or more days, so we can start with
# at least 4 counts for each day of the week.
nonDist = pd.DataFrame([[4]*7]*12)
nonDist.name = ('Distribution of Days in Each Month by Order '
                'of Appearance in Non-Leap Years')
leapDist = pd.DataFrame([[4]*7]*12)
leapDist.name = ('Distribution of Days in Each Month by Order '
                 'of Appearance in Leap Years')
# Then we can add in extra days depending on how many days
# over 28 each month has.
for month in range(0,12):
    nonShift[month] = nonleap[month]%7
    for extraDay in range(0, nonShift[month]):
        nonDist.at[month, extraDay] = 5
    leapShift[month] = leap[month]%7
    for extraDay in range(0, leapShift[month]):
        leapDist.at[month, extraDay] = 5
nonDist.rename(columns=ordinal, index=months, inplace=True)
leapDist.rename(columns=ordinal, index=months, inplace=True)
disp_dfs(nonDist, leapDist)
Distribution of Days in Each Month by Order of Appearance in Non-Leap Years
month | 1st | 2nd | 3rd | 4th | 5th | 6th | 7th |
---|---|---|---|---|---|---|---|
jan | 5 | 5 | 5 | 4 | 4 | 4 | 4 |
feb | 4 | 4 | 4 | 4 | 4 | 4 | 4 |
mar | 5 | 5 | 5 | 4 | 4 | 4 | 4 |
apr | 5 | 5 | 4 | 4 | 4 | 4 | 4 |
may | 5 | 5 | 5 | 4 | 4 | 4 | 4 |
jun | 5 | 5 | 4 | 4 | 4 | 4 | 4 |
jul | 5 | 5 | 5 | 4 | 4 | 4 | 4 |
aug | 5 | 5 | 5 | 4 | 4 | 4 | 4 |
sept | 5 | 5 | 4 | 4 | 4 | 4 | 4 |
oct | 5 | 5 | 5 | 4 | 4 | 4 | 4 |
nov | 5 | 5 | 4 | 4 | 4 | 4 | 4 |
dec | 5 | 5 | 5 | 4 | 4 | 4 | 4 |
Distribution of Days in Each Month by Order of Appearance in Leap Years
month | 1st | 2nd | 3rd | 4th | 5th | 6th | 7th |
---|---|---|---|---|---|---|---|
jan | 5 | 5 | 5 | 4 | 4 | 4 | 4 |
feb | 5 | 4 | 4 | 4 | 4 | 4 | 4 |
mar | 5 | 5 | 5 | 4 | 4 | 4 | 4 |
apr | 5 | 5 | 4 | 4 | 4 | 4 | 4 |
may | 5 | 5 | 5 | 4 | 4 | 4 | 4 |
jun | 5 | 5 | 4 | 4 | 4 | 4 | 4 |
jul | 5 | 5 | 5 | 4 | 4 | 4 | 4 |
aug | 5 | 5 | 5 | 4 | 4 | 4 | 4 |
sept | 5 | 5 | 4 | 4 | 4 | 4 | 4 |
oct | 5 | 5 | 5 | 4 | 4 | 4 | 4 |
nov | 5 | 5 | 4 | 4 | 4 | 4 | 4 |
dec | 5 | 5 | 5 | 4 | 4 | 4 | 4 |
If we were 5th graders competing in a pencil + paper math contest, we could use the tables above, count the number of nonleap and leap years, and make adjustments for which weekday each year begins on.
But the easiest way to get this distribution would be to iterate through a period of 400 years. On a computer it only takes about a minute.
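For comparison, the pencil + paper approach can itself be sketched without touching any date library: every month gets 4 of each weekday, the first one to three weekdays of the month get a 5th, and the leftover days carry the starting weekday into the next month. This is only my own illustration of that idea (`count_cycle` and the constants are made-up names), not the day-by-day count that follows. It assumes the span starts on Jan 1, 2000, a Saturday (Day 6), as printed earlier.

```python
WEEKDAYS = ["sun", "mon", "tues", "wed", "thurs", "fri", "sat"]
NONLEAP = [31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31]

def is_leap(year):
    """Gregorian leap year rule."""
    return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)

def count_cycle(first_year, first_weekday, n_years=400):
    """Tally weekdays per month, given the weekday (0 = Sunday) of Jan 1 of first_year."""
    counts = [[0] * 7 for _ in range(12)]
    start = first_weekday  # weekday of the 1st of the current month
    for year in range(first_year, first_year + n_years):
        for month, length in enumerate(NONLEAP):
            if month == 1 and is_leap(year):
                length = 29
            extra = length - 28  # days beyond four full weeks
            for w in range(7):
                counts[month][w] += 4  # every weekday appears at least 4 times
            for offset in range(extra):
                counts[month][(start + offset) % 7] += 1  # first weekdays get a 5th
            start = (start + extra) % 7  # the next month starts this many weekdays later
    return counts

counts = count_cycle(2000, 6)  # assumption: Jan 1, 2000 was a Saturday (Day 6)
print(sum(map(sum, counts)))   # total days counted over 400 years
print(dict(zip(WEEKDAYS, counts[0])))  # January tallies, sun through sat
```

Because it steps month by month instead of day by day, this runs in well under a second, but it gives up the year-by-year snapshots that the day-by-day loop collects for the animation.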
# Initialize blank dataframe
counts = pd.DataFrame([[0]*7]*12)
counts.rename(columns=weekdays, index=months, inplace=True)
# Count range (end_date non-inclusive)
start_date = date(1582,10,15)
end_date = date.today()
# Columns for making grid
yList = []
mList = list(months.values())
wList = list(weekdays.values())
current_columns = []
# Adds year, month-position & count columns for each weekday,
# snapshotting the current counts under the label y
def addColumns(columns, wList, y):
    for i in wList:
        y_col_name = '{year}_{weekday}_{header}'.format(year=y, weekday=i, header='year')
        y_col = Column((list([y]*12)), y_col_name)
        columns.append(y_col)
        # x-position of each month, offset slightly per weekday
        mListInt = [(j*10+(1.5*wList.index(i))) for j in range(0, 12)]
        m_col_name = '{year}_{weekday}_{header}'.format(year=y, weekday=i, header='month')
        m_col = Column(mListInt, m_col_name)
        columns.append(m_col)
        c_col_name = '{year}_{weekday}_{header}'.format(year=y, weekday=i, header='count')
        c_col = Column(counts[i].tolist(), c_col_name)
        columns.append(c_col)
# Let's count
for single_date in dateRange(start_date, end_date):
    # Updates grid on new year's day of every 4th year,
    # labeled with the year that just ended
    if (single_date.month, single_date.day) == (1, 1) and single_date.year % 4 == 0:
        y = single_date.year - 1
        addColumns(current_columns, wList, y)
        yList.append(str(y))
    # Updates count in dataframe
    m = single_date.month - 1
    w = int(single_date.strftime("%w"))  # 0 = Sunday, matching the weekdays dict
    counts.iloc[m, w] += 1
# Update if end date not on new year's day
if (end_date.month, end_date.day) != (1, 1):
    addColumns(current_columns, wList, end_date.year)
    yList.append(str(end_date.year))
# Upload grid to plotly
countGrid = Grid(current_columns)
url = py.grid_ops.upload(countGrid, 'weekday_counter_1582_2018_grid'+str(time.time()), auto_open=False)
url
'https://plot.ly/~album/114/'
I ran the count for a full Gregorian cycle and stored the final dataframe in '400_years.csv'. Drumroll please.
gregCycle = pd.read_csv('400_years.csv', index_col=0)
gregCycle.name = 'Number of Weekdays That Occur in Each Month in Each Gregorian Calendar Cycle'
# To check that we did in fact count all the days
print ("Counted " + str(gregCycle.values.sum()) + " days")
disp_dfs(gregCycle)
Counted 146097 days
Number of Weekdays That Occur in Each Month in Each Gregorian Calendar Cycle
month | sun | mon | tues | wed | thurs | fri | sat |
---|---|---|---|---|---|---|---|
jan | 1772 | 1770 | 1772 | 1771 | 1772 | 1772 | 1771 |
feb | 1613 | 1615 | 1613 | 1615 | 1613 | 1614 | 1614 |
mar | 1772 | 1771 | 1772 | 1770 | 1772 | 1771 | 1772 |
apr | 1714 | 1715 | 1714 | 1715 | 1714 | 1714 | 1714 |
may | 1772 | 1770 | 1772 | 1771 | 1772 | 1772 | 1771 |
jun | 1714 | 1715 | 1714 | 1714 | 1714 | 1714 | 1715 |
jul | 1772 | 1771 | 1772 | 1772 | 1771 | 1772 | 1770 |
aug | 1771 | 1772 | 1770 | 1772 | 1771 | 1772 | 1772 |
sept | 1715 | 1714 | 1715 | 1714 | 1714 | 1714 | 1714 |
oct | 1770 | 1772 | 1771 | 1772 | 1772 | 1771 | 1772 |
nov | 1715 | 1714 | 1714 | 1714 | 1714 | 1715 | 1714 |
dec | 1771 | 1772 | 1772 | 1771 | 1772 | 1770 | 1772 |
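To pull the highs and lows out of the table above, here is a small sketch in plain Python. The counts are copied by hand from the table; `GREG_CYCLE` and `extremes` are names I made up for this illustration.

```python
WEEKDAYS = ["sun", "mon", "tues", "wed", "thurs", "fri", "sat"]

# Counts copied from the table above (columns: sun through sat)
GREG_CYCLE = {
    "jan":  [1772, 1770, 1772, 1771, 1772, 1772, 1771],
    "feb":  [1613, 1615, 1613, 1615, 1613, 1614, 1614],
    "mar":  [1772, 1771, 1772, 1770, 1772, 1771, 1772],
    "apr":  [1714, 1715, 1714, 1715, 1714, 1714, 1714],
    "may":  [1772, 1770, 1772, 1771, 1772, 1772, 1771],
    "jun":  [1714, 1715, 1714, 1714, 1714, 1714, 1715],
    "jul":  [1772, 1771, 1772, 1772, 1771, 1772, 1770],
    "aug":  [1771, 1772, 1770, 1772, 1771, 1772, 1772],
    "sept": [1715, 1714, 1715, 1714, 1714, 1714, 1714],
    "oct":  [1770, 1772, 1771, 1772, 1772, 1771, 1772],
    "nov":  [1715, 1714, 1714, 1714, 1714, 1715, 1714],
    "dec":  [1771, 1772, 1772, 1771, 1772, 1770, 1772],
}

def extremes(row):
    """Return (weekdays tied for most frequent, weekdays tied for least frequent)."""
    hi, lo = max(row), min(row)
    most = [WEEKDAYS[i] for i, c in enumerate(row) if c == hi]
    least = [WEEKDAYS[i] for i, c in enumerate(row) if c == lo]
    return most, least

for month, row in GREG_CYCLE.items():
    most, least = extremes(row)
    print(f"{month:>4}: most {most} ({max(row)}), least {least} ({min(row)})")
```

In every month the gap between the most and least frequent weekday is at most 2 days over the whole 400 years, so the distribution is uneven, but only barely.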
# Figure
figure = {
'data': [],
'layout': {},
'frames': [],
'config': {'scrollzoom': True}
}
# Fill in most of layout
figure['layout']['xaxis'] = {'title': 'Month', 'gridcolor': '#FFFFFF', 'range': [-2, 120], 'zeroline': False,
'tickvals': [(i*10+3) for i in range(0, 12)], 'ticktext': mList}
figure['layout']['yaxis'] = {'title': 'Days Counted', 'type': 'linear', 'range': [0, 450], 'gridcolor': '#FFFFFF', 'autorange':False}
figure['layout']['title'] = 'Counting the Days of Each Weekday in Each Month'
figure['layout']['hovermode'] = 'x'
figure['layout']['plot_bgcolor'] = 'rgb(223, 232, 243)'
figure['layout']['autosize'] = True
# Year Slider
sliders_dict = {
'active': 0,
'yanchor': 'top',
'xanchor': 'left',
'currentvalue': {
'font': {'size': 20},
'prefix': 'Year:',
'visible': True,
'xanchor': 'right'
},
'transition': {'duration': 100, 'easing': 'cubic-in-out'},
'pad': {'b': 10, 't': 50},
'len': 0.9,
'x': 0.1,
'y': 0,
'steps': [],
}
# Play & Pause
figure['layout']['updatemenus'] = [
{
'buttons': [
{
'args': [None, {'frame': {'duration': 300, 'redraw': False},
'fromcurrent': True, 'transition': {'duration': 400, 'easing': 'quadratic-in-out'}}],
'label': 'Play',
'method': 'animate'
},
{
'args': [[None], {'frame': {'duration': 0, 'redraw': False}, 'mode': 'immediate',
'transition': {'duration': 0}}],
'label': 'Pause',
'method': 'animate'
},
{
'args': [{'yaxis.autorange': True, 'xaxis.autorange': True}],
'label': 'Rescale',
'method': 'relayout'
},
],
'direction': 'left',
'pad': {'r': 10, 't': 87},
'showactive': False,
'type': 'buttons',
'x': 0.1,
'xanchor': 'right',
'y': 0,
'yanchor': 'top'
}
]
# Custom marker styles
color = {
'sun': 'rgb(250, 249, 20)', 'mon': 'rgb(250, 20, 5)', 'tues': 'rgb(50, 170, 255)', 'wed': 'rgb(222, 182, 0)',
'thurs': 'rgb(90, 110, 250)', 'fri': 'rgb(115, 211, 143)', 'sat': 'rgb(20, 211, 43)'
}
symbol = {
'sun': 'circle-open-dot', 'mon': 'square-cross', 'tues': 'star-diamond', 'wed': 'hexagram',
'thurs': 'diamond', 'fri': 'pentagon', 'sat': 'star'
}
line_color = {
'sun': 'rgb(250, 99, 220)', 'mon': 'rgb(230, 99, 250)', 'tues': 'rgb(99, 110, 250)', 'wed': 'rgb(222, 222, 44)',
'thurs': 'rgb(50, 170, 255)', 'fri': 'rgb(115, 211, 143)', 'sat': 'rgb(220, 111, 243)'
}
gradient_color = {
'sun': 'rgb(0, 0, 0)', 'mon': 'rgb(230, 20, 0)', 'tues': 'rgb(22, 55, 250)', 'wed': 'rgb(222, 140, 0)',
'thurs': 'rgb(50, 170, 255)', 'fri': 'rgb(115, 211, 143)', 'sat': 'rgb(22, 22, 111)'
}
gradient_type = {
'sun': 'radial', 'mon': 'horizontal', 'tues': 'vertical', 'wed': 'horizontal',
'thurs': 'vertical', 'fri': 'radial', 'sat': 'radial'
}
set_size = 6
set_opacity = 0.6
set_line_width = 3
# Import data from grid
col_name_template = '{year}_{weekday}_{header}'
year = yList[0]
for day in wList:
    data_dict = {
        'xsrc': countGrid.get_column_reference(col_name_template.format(
            year=year, weekday=day, header='month'
        )),
        'ysrc': countGrid.get_column_reference(col_name_template.format(
            year=year, weekday=day, header='count'
        )),
        'mode': 'markers',
        'textsrc': countGrid.get_column_reference(col_name_template.format(
            year=year, weekday=day, header='month'
        )),
        'hoverinfo': 'y+name',
        'marker': {
            'size': set_size,
            'symbol': symbol[day],
            'color': color[day],
            'opacity': set_opacity,
            'line': {'color': line_color[day], 'width': set_line_width },
            'gradient': {'color': gradient_color[day], 'type':gradient_type[day]}
        },
        'name': day
    }
    figure['data'].append(data_dict)
# Updating frames
for year in yList:
    frame = {'data': [], 'name': str(year), 'layout':[]}
    for day in wList:
        data_dict = {
            'xsrc': countGrid.get_column_reference(col_name_template.format(
                year=year, weekday=day, header='month'
            )),
            'ysrc': countGrid.get_column_reference(col_name_template.format(
                year=year, weekday=day, header='count'
            )),
            'mode': 'markers',
            'textsrc': countGrid.get_column_reference(col_name_template.format(
                year=year, weekday=day, header='month'
            )),
            'marker': {
                'size': set_size,
                'symbol': symbol[day],
                'color': color[day],
                'opacity': set_opacity,
                'line': {'color': line_color[day], 'width': set_line_width },
                'gradient': {'color': gradient_color[day], 'type':'radial'}
            },
            'name': day,
        }
        frame['data'].append(data_dict)
    layout_dict = {
        'yaxis': {'autorange': True}
    }
    frame['layout'].append(layout_dict)
    figure['frames'].append(frame)
    # One slider step per year frame
    slider_step = {'args': [
        [year],
        {'frame': {'duration': 30, 'redraw': False},
         'mode': 'immediate',
         'transition': {'duration': 10}}
        ],
        'label': year,
        'method': 'animate'}
    sliders_dict['steps'].append(slider_step)
figure['layout']['sliders'] = [sliders_dict]
# Default home zoom
yMin = (int(min(yList)) - 1582) * 4
yMax = ((int(max(yList)) - int(min(yList))) * 4.5) + yMin
figure['layout']['yaxis']['range'] = [yMin, yMax]
Have you ever seen those marble racing videos?
Here are some dots to represent the days of the week for each month. Press "play" to watch them race from the start of the Gregorian calendar to present day.
(Pressing the "Rescale" button zooms in on the action. I haven't figured out how to get the scale to autoupdate with each frame yet-- if you know, please let me know! You might have to keep clicking rescale to follow the movement, or hover over the top of the graph and use the pan tool. The house-shaped icon will reset the axes.)
py.icreate_animations(figure, 'weekday-counter-1582-2018'+str(time.time()))
Here are some takeaways about each month for a full Gregorian Cycle:
Highs & Lows
So there you have it. You can file that under useful information.
I had written a similar, shorter program in js on Feb 24, 2016, which I later learned coincided with the 434th anniversary of the papal bull known as Inter gravissimas, issued by Pope_Greg13, which gave us the calendar that we have all come to know and know.
If you want to go down this rabbit hole some more, here are some links related to time-related adjustments we face as a consequence of living on this planet:
Thanks to Nick Lanam and Ethan McIntyre for pointing me in the direction of some more helpful links.