for ShallowDrumpf!
What does $$ \argmax_\params \sum_{(\x,\y) \in \train} \log \prob_\params(\x,\y) $$ have to do with counting?
Develop a unigram language model for generating simplified Trump speeches
China, China, China, Mexico, China, Mexico ...
m = "Mexico"
c = "China"
def prob(th_china, th_mexico, word):
    return th_china if word == 'China' else th_mexico
prob(0.7, 0.3, 'China')
0.7
Solution is counting:
$$ \paramc = \frac{\countc}{\countc + \countm} $$
def mle(data):
    theta_china = len([w for w in data if w == 'China']) / len(data)
    return theta_china, 1.0 - theta_china
mle([c,c,m,c])
(0.75, 0.25)
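The same counting recipe extends beyond two words. A minimal sketch, assuming a vocabulary made up of whatever words occur in the data; the name `mle_unigram` and the use of `collections.Counter` are illustrative choices, not part of the original code:

```python
from collections import Counter

def mle_unigram(data):
    """Relative-frequency estimate for every word that occurs in data."""
    counts = Counter(data)          # word -> number of occurrences
    total = sum(counts.values())    # total number of tokens
    return {word: count / total for word, count in counts.items()}

estimates = mle_unigram(["China", "China", "Mexico", "China"])
print(estimates)  # {'China': 0.75, 'Mexico': 0.25}
```

With only 'China' and 'Mexico' in the data this reproduces the pair returned by mle above.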
from math import log
def ll(th_china, th_mexico, data):
    # log-likelihood: sum of log-probabilities of the observed words
    return sum([log(prob(th_china, th_mexico, w)) for w in data])
data = [c,c,m,c] # how does this graph look with all Cs?
smle.plot_mle_graph(lambda x,y: ll(x,y, data), mle(data),
                    x_label='China',y_label='Mexico')
Without constraints the solution is trivial (and useless): the log-likelihood keeps growing as the parameters increase
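A quick numeric check of this point, as a sketch: the helper `log_likelihood` below is a standalone stand-in for ll, and the scale values are arbitrary. Once the parameters are allowed to exceed 1 they are no longer probabilities, and the objective can be pushed up indefinitely.

```python
from math import log

def log_likelihood(th_china, th_mexico, data):
    # Sum of log-parameters, one term per observed word.
    return sum(log(th_china if w == "China" else th_mexico) for w in data)

data = ["China", "China", "Mexico", "China"]

# Scale both parameters up together, ignoring any constraint.
for scale in [1.0, 10.0, 100.0]:
    print(scale, log_likelihood(scale, scale, data))

# For comparison: the value at the counting solution (0.75, 0.25).
print(log_likelihood(0.75, 0.25, data))
```

Each larger scale gives a larger value, so no finite maximizer exists unless the parameters are restricted.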
Constraints: both parameters must be non-negative and sum to one
smle.plot_mle_graph(lambda x,y: ll(x,y, data), mle(data),
                    show_constraint=True)
smle.plot_mle_graph(lambda x,y: ll(x,y, data), mle(data),
                    show_constraint=True, show_optimum=True)
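The optimum on the constraint line can also be checked without a plot. The sketch below assumes the constraint is that the two parameters sum to one, so the Mexico parameter is fixed once the China parameter is chosen; the grid resolution of 0.01 and the name `constrained_ll` are illustrative choices.

```python
from math import log

def constrained_ll(th_china, data):
    # On the constraint line the Mexico parameter is 1 - th_china.
    th_mexico = 1.0 - th_china
    return sum(log(th_china if w == "China" else th_mexico) for w in data)

data = ["China", "China", "Mexico", "China"]

# Candidate values strictly between 0 and 1, so every log is defined.
grid = [i / 100 for i in range(1, 100)]
best = max(grid, key=lambda th: constrained_ll(th, data))
print(best)  # 0.75
```

The best grid point coincides with the counting estimate for this data set, which is what the optimum marker in the plot is meant to show.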