# Frenetic Array

## A canvas for logic and imagination.

We are well aware of Euler's famous number,

$$e = 2.71828182845\ldots$$

There are also quite a few ways of deriving Euler's number. There's the Taylor series expansion:

$$e = \sum _{k=0} ^{\infty} \frac{1}{k!}$$

There is also the classical limit:

$$e = \lim_{n \rightarrow \infty} \left( 1 + \frac{1}{n} \right)^n$$

Then there is a more unique way of calculating it. Let $R$ be a random number generated between $[0, 1]$, inclusive. Then $e$ is the average number of $R$s it takes for their sum to exceed $1$. In other words, keep adding random numbers with bounds $[0, 1]$ together until the sum exceeds one, average the counts over a bunch of attempts, and you should get $e$.

With more rigor, for independent uniform $(0, 1)$ random variables $R_1$, $R_2$, $\ldots$, $R_n$, let

$$N = \min \left\{ n : \sum_{i=1} ^{n} R_i > 1 \right\}$$

then we can calculate $e$ as the expected value of $N$:

$$e = \mathrm{E}[N]$$

The proof of this is pretty rigorous, so instead of reproducing it, an easier method is to write a program and verify the result for large enough $n$.
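As a quick sanity check, here is a minimal Monte Carlo sketch of the uniform-variable method (the function names are my own, not from the source code below):

```python
import random

def draws_to_exceed_one():
    """Count how many uniform(0, 1) draws it takes for the running sum to pass 1."""
    total, draws = 0.0, 0
    while total <= 1:
        total += random.uniform(0, 1)
        draws += 1
    return draws

def estimate_e(samples):
    """Average the draw counts; by the argument above, the mean converges to e."""
    return sum(draws_to_exceed_one() for _ in range(samples)) / samples

random.seed(0)
print(estimate_e(200_000))  # hovers near 2.718
```

With a couple hundred thousand samples the estimate lands within a few thousandths of $e$, matching the last column of the tables below.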

For $n = 1,000,000,000$¹, we have the following results:

| $e$          | Sum Solution | Limit Solution | Random Uniform Variable |
|--------------|--------------|----------------|-------------------------|
| 2.7182818284 | 2.7182818284 | 2.7182820520   | 2.718250315             |

Feel free to check out the code below, or skim all of the data to see the asymptotic approach towards $e$.

## All Data

| n          | Sum Solution | Limit Solution | Random Uniform Variable |
|------------|--------------|----------------|-------------------------|
| 1          | 2            | 2              | 2                       |
| 10         | 2.7182818011 | 2.5937424601   | 2.5                     |
| 100        | 2.7182818284 | 2.7048138294   | 2.69                    |
| 1000       | 2.7182818284 | 2.7169239322   | 2.717                   |
| 10000      | N/A          | 2.7181459268   | 2.7242                  |
| 100000     | N/A          | 2.7182682371   | 2.71643                 |
| 1000000    | N/A          | 2.7182804690   | 2.71961                 |
| 10000000   | N/A          | 2.7182816941   | 2.7182017               |
| 100000000  | N/A          | 2.7182817983   | 2.71818689              |
| 1000000000 | N/A          | 2.7182820520   | 2.718250315             |

## Source Code

```python
import random
import math
import sys

from decimal import Decimal

def e_sum(upper_bound):
    if upper_bound < 0:
        return 0

    return Decimal(1.0) / Decimal(math.factorial(upper_bound)) + Decimal(e_sum(upper_bound - 1))

def e_limit(n):
    return Decimal((1 + 1.0 / float(n))**n)

def find_greater_than_one(value=0, attempts=0):
    if value <= 1:
        return find_greater_than_one(value + random.uniform(0, 1), attempts + 1)

    return attempts
```

¹ For all except the sum solution. Because it is a summation with a factorial inside, calculating it takes a long time (and, if done recursively, overflows the stack).

It’s not often I come across a problem that works out so beautifully yet draws on so many different aspects of competitive programming. One particular problem, courtesy of Kattis, really did impress me.

The problem was House of Cards, and it goes as follows (skip to the TL;DR below):

Brian and Susan are old friends, and they always dare each other to do reckless things. Recently Brian had the audacity to take the bottom right exit out of their annual maze race, instead of the usual top left one. In order to trump this, Susan needs to think big. She will build a house of cards so big that, should it topple over, the entire country would be buried by cards. It’s going to be huge! The house will have a triangular shape. The illustration to the right shows a house of height $6$ and Figure 1 shows a schematic figure of a house of height $5$.

Figure 1

For aesthetic reasons, the cards used to build the tower should feature each of the four suits (clubs, diamonds, hearts, spades) equally often. Depending on the height of the tower, this may or may not be possible. Given a lower bound $h_0$ on the height of the tower, what is the smallest possible height $h \geq h_0$ such that it is possible to build the tower?

TL;DR: Using Figure 1 as a reference, you are given a lower bound on the height of a tower of cards. However, there must be an equal distribution of all four suits: clubs, diamonds, hearts, and spades.

This implies that the number of cards has to be divisible by $4$. Seeing as the input was huge ($1 \leq h_0 \leq 10^{1000}$), there was no brute forcing this. So, first thought: turn this into a closed-form series, and solve the series.

Computing the values for the first five heights, I got the following set:

$$\{2, 7, 15, 26, 40, \ldots\}$$

I was able to turn this set into a series quite easily:

$$\sum_{n = 1} ^{h_0} \left(3n - 1\right)$$
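Splitting the series into its two component sums gives the closed form directly:

$$\sum_{n = 1} ^{h_0} \left(3n - 1\right) = 3 \cdot \frac{h_0 \left(h_0 + 1\right)}{2} - h_0 = \frac{1}{2} h_0 \left(3h_0 + 1\right)$$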

This turned into the following equation:

$$\frac{1}{2} h_0(3h_0 + 1)$$

So, all I had to do was plug $h_0$ into the equation, and increment while the number was not divisible by $4$. Then, I realized how large the input really was. The input size ($1 \cdot 10^{1000}$) was orders of magnitude larger than typical large data types allow ($1.84 \cdot 10^{19}$ for a 64-bit unsigned integer).

I realized this couldn’t be tested against an intensive data set, because there is only one number to calculate per input. I also figured that, since the card counts cycle through residues modulo $4$, the number of increments needed should be small, roughly four at most. Keeping this in mind, I decided to use Python. Python can work with arbitrarily large numbers, making it ideal in this situation.
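To check that intuition, here is a small brute-force sketch (my own, separate from the submitted solution) confirming that a valid height is never more than four increments away:

```python
def number_of_cards(h):
    # closed form derived above: h(3h + 1) / 2
    return (3 * h * h + h) // 2

def increments_needed(h0):
    # how far past h0 we must go before the card count divides evenly by 4
    h = h0
    while number_of_cards(h) % 4 != 0:
        h += 1
    return h - h0

# the counts repeat mod 4 with period 8 in h, so checking a range suffices
print(max(increments_needed(h) for h in range(1, 10_001)))  # → 4
```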

I sat down, hoped for the best, and wrote the following code.

```python
def getNumberOfCards(x):
    return (3*pow(x, 2) + x) // 2

height = int(input())
while getNumberOfCards(height) % 4 != 0:
    height += 1

print(height)
```


With a run time of 0.02 seconds, it worked.

I avoid the Mac App Store as much as possible. It’s buggy. It’s slow. Apps are getting pulled from it left and right. But I could not avoid it permanently, because of software updates. As macOS Sierra 10.12 rolled out, I clicked on the App Store, followed by Updates, and

I could not get the Updates page to open. Killed the App Store. Restarted the computer. Everything.

After doing some research, I discovered there is a way to update software from the terminal: `softwareupdate`. So, after running one command (`sudo softwareupdate -iv`), I am writing this from the latest version of macOS Sierra.

One of the reasons I kept CodeRunner handy was its ability to quickly compile code. With a single click of a button, I could run any of my frequently used languages. It didn’t matter if it was an interpreted or compiled language, so I could virtually run anything, like:

• C/C++
• Python
• Lua
• LaTeX
• Perl

However, in the last few months I started using Vim. Heavily. So much so that I was trying to use Vim commands in the CodeRunner buffers. So I decided I wanted the same functionality in Vim and, in vim-esque fashion, I mapped it to my leader key: `<leader>r`. The mnemonic `<leader>run` helped me remember the command on the first few tries.

```vim
function! MakeIfAvailable()
    " prefer a Makefile when one exists in the current directory
    if filereadable("makefile") || filereadable("Makefile")
        make
    elseif (&filetype == "cpp")
        execute("!clang++ -std=c++14 " . bufname("%"))
        execute("!./a.out")
    elseif (&filetype == "c")
        execute("!clang -std=c11 " . bufname("%"))
        execute("!./a.out")
    elseif (&filetype == "tex")
        execute("!xelatex " . bufname("%"))
        execute("!open " . expand("%:r") . ".pdf")
    endif
endfunction
```

```vim
augroup spaces
    autocmd!
    autocmd FileType c nnoremap <leader>r :call MakeIfAvailable()<cr>
    autocmd FileType cpp nnoremap <leader>r :call MakeIfAvailable()<cr>
    autocmd FileType tex nnoremap <leader>r :call MakeIfAvailable()<cr>
    autocmd FileType python nnoremap <leader>r :exec '!python' shellescape(@%, 1)<cr>
    autocmd FileType perl nnoremap <leader>r :exec '!perl' shellescape(@%, 1)<cr>
    autocmd FileType sh nnoremap <leader>r :exec '!bash' shellescape(@%, 1)<cr>
    autocmd FileType swift nnoremap <leader>r :exec '!swift' shellescape(@%, 1)<cr>
augroup END
```


When trying to go viral, one of the critical things to consider is timing. For reddit, there are many tools and studies for this, but they’re all subreddit specific. This is with good cause: different subreddits are more active at different times of the day. But what about r/all?

So, skipping the implementation details (you can read about them below), I wrote a script that parsed 165,607 posts to get a general idea of what r/all does in a given day. Note that all times are in the Pacific Time (PT) zone.

So a pretty good trend. What about the mean upvote count?

Okay, not so pretty. Let’s filter out the outliers by replacing anything above 300 upvotes with a -1.

A decent trend until around 12:00 to 14:00, where everything goes sporadic. Let’s take a look at the median upvotes.

Again, let’s filter the outliers (anything above 1000).

So we can safely assume that a lot of reddit posts' upvote counts are zero (in fact, the mode, the most commonly occurring upvote count, is by far zero).

So far, these are pretty useless. But, there is one good litmus test for popularity: reaching over 1,000 upvotes. Let's take a look at that.

It looks like there could be something here. Let's switch to a line graph and sum only to the hour.

Now that's a trend. Adding a cubic spline, we get a beautiful line graph:

It seems quite apparent that the best time to post is around 14:00 to 15:00. Neat.

## Implementation

I used the `praw` Python package to download and parse all of the data. Unfortunately, Reddit was having issues handling over 100,000 consecutive requests (around 100/second), so my connection would be severed constantly, and I had to manually restart where I left off.

The code is as follows:

```python
import praw
import datetime

def get_date(submission):
    time = submission.created
    return datetime.datetime.fromtimestamp(time)

def date_to_unix_second(t):
    return (t - datetime.datetime(1970, 1, 1)).total_seconds()

# Thanks https://stackoverflow.com/questions/10797819/finding-the-mode-of-a-list
def mode(numbers, out_mode):
    counts = {k: numbers.count(k) for k in set(numbers)}
    modes = sorted(dict(filter(lambda x: x[1] == max(counts.values()), counts.items())).keys())
    if out_mode == 'smallest':
        return modes[0]
    elif out_mode == 'largest':
        return modes[-1]
    else:
        return modes

reddit = praw.Reddit(user_agent='bot1', client_id='', client_secret='',
                     redirect_uri='authorize_callback')
subreddit = reddit.subreddit('all')

day1 = datetime.datetime(2017, 5, 26, hour=0, minute=0, second=0)
day2 = datetime.datetime(2017, 5, 27, hour=0, minute=0, second=0)

current_day = day2

while current_day != day1:
    current_minute = 0
    total = 0

    for submission in subreddit.submissions(date_to_unix_second(day1),
                                            date_to_unix_second(current_day) - 60*60):
        submission_date = get_date(submission)
        total += 1

        if submission_date.minute != current_minute:
            current_minute = submission_date.minute

            # zero-pad the hour and minute of the bucket we just entered
            print("{0:02d}:{1:02d}".format(submission_date.hour, submission_date.minute))

        current_day = submission_date

    print(total)
```


For the spline, the code is as follows:

```python
# spline comes from scipy.interpolate (an interface that was current at the time)
from scipy.interpolate import spline

delta_t = max(x) - min(x)
N_points = 300
xnew = [min(x) + i*delta_t/N_points for i in range(N_points)]

# splines need numeric inputs, so convert the datetimes to timestamps
x_ts = [x_.timestamp() for x_ in x]
xnew_ts = [x_.timestamp() for x_ in xnew]
ynew = spline(x_ts, y, xnew_ts)
```

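As an aside, the `spline` helper has since been removed from SciPy; the same smoothing can be sketched with the current `scipy.interpolate.CubicSpline` API (the data here is a toy stand-in for the hourly counts):

```python
import numpy as np
from scipy.interpolate import CubicSpline

# toy stand-ins for the hourly counts; the real x values were datetimes
x_ts = np.arange(24, dtype=float)
y = np.sin(x_ts / 24.0 * 2.0 * np.pi)

# fit a cubic spline and evaluate it on a fine 300-point grid
xnew_ts = np.linspace(x_ts.min(), x_ts.max(), 300)
ynew = CubicSpline(x_ts, y)(xnew_ts)
```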

The graphs were produced by matplotlib; the code is below.

```python
#!/usr/local/bin/python3
#
# plot.py
#
# Created by Illya Starikov on 05/16/17.
#

import matplotlib.pyplot as plt
import matplotlib.dates as mdates
from datetime import datetime

import csv

def import_from_csv(filename):
    with open(filename, "rt") as csvfile:
        csv_parsor = csv.reader(csvfile)

        date_times = []
        values = []

        for date_and_time_string, value in csv_parsor:
            datetime_object = datetime.strptime(date_and_time_string, '%H:%M')

            date_times += [datetime_object.replace(day=26, month=5, year=2017)]
            values += [float(value)]

        return (date_times, values)

def main():
    # Get the data and the subplots
    x, y = import_from_csv('data.csv')
    fig, ax = plt.subplots()

    # Make the figure wide and draw it
    fig.set_size_inches((12, 5))
    ax.plot_date(x, y, color='b', marker='+', markersize=6)

    # Custom labels, since I have milestones
    labels = ["{}:00".format(hour) for hour in range(1, 24)]
    locs = [mdates.date2num(datetime(2017, 5, 26, hour)) for hour in range(1, 24)]

    # Assign the data, change fonts to Bebas Neue, make titles and labels
    ax.set_xticklabels(labels, fontname="Bebas Neue")
    ax.set_yticklabels([int(i) for i in ax.get_yticks()], fontname="Bebas Neue")

    ax.set_title("Title", fontsize=22, fontname="Bebas Neue")
    ax.set_xlabel("Time of Day", fontsize=14, fontname="Bebas Neue")
    ax.set_xticks(locs)

    # Make nicer x axis
    plt.gcf().autofmt_xdate()

    # Export that shit
    ax.grid()
    plt.draw()
    plt.savefig("figure.png", format='png', dpi=280)

if __name__ == "__main__":
    main()
```