This tutorial shows you how to use Tracery in your Python programs. In particular, it shows a handful of useful patterns for incorporating large amounts of data into your Tracery grammars, something that would be impractical or inconvenient to do with a Tracery generator on its own.
Tracery is an easy-to-use but powerful language and toolset, made by Kate Compton, for generating text from grammars. If you're not familiar with how Tracery works, try the official tutorial or this tutorial I wrote.
This tutorial is written for Python 3 and should also work on Python 2.7.
In order to generate text from a Tracery grammar in Python, you'll need to install the Tracery Python module. It's easiest to do this with pip at the command line, like so:
pip install tracery
(If you get a permissions error, try pip install --user tracery.)
Once you've installed the tracery module, try the following example program:
import tracery
from tracery.modifiers import base_english
# put your grammar here as the value assigned to "rules"
rules = {
    "origin": "#hello.capitalize#, #location#!",
    "hello": ["hello", "greetings", "howdy", "hey"],
    "location": ["world", "solar system", "galaxy", "universe"]
}
grammar = tracery.Grammar(rules) # create a grammar object from the rules
grammar.add_modifiers(base_english) # add pre-programmed modifiers
print(grammar.flatten("#origin#")) # and flatten, starting with origin rule
Hello, galaxy!
This program takes a Tracery grammar (in the form of a Python dictionary) and "flattens" it, printing its output to standard output. You can take the content of a Tracery grammar you've written and paste it into this Python program as the value being assigned to the variable rules (unless your Tracery grammar uses some aspect of JSON formatting that works differently in Python, like Unicode escapes). Run the program from the command line (or in the cell above, if you're viewing this in Jupyter Notebook) and you'll get a line of output from your grammar.
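If your grammar does rely on JSON-specific syntax, one way to sidestep the differences is to keep the JSON text as a string and let Python's json module parse it. The following is a minimal sketch rather than part of the original example: the grammar_json variable name and the sample expansions are illustrative, and the raw triple-quoted string (the r prefix) is an assumption made so that backslash escapes reach the JSON parser untouched.

```python
import json

# Paste the JSON text of your grammar between the triple quotes. The r prefix
# makes this a raw string, so Python leaves escapes like \u00e9 alone and the
# JSON parser interprets them instead.
grammar_json = r'''
{
    "origin": "#hello.capitalize#, #location#!",
    "hello": ["hello", "ol\u00e9"],
    "location": ["world", "caf\u00e9"]
}
'''

# json.loads() turns the JSON text into a Python dictionary
rules = json.loads(grammar_json)
print(rules["location"])
```

The resulting rules dictionary can be passed to tracery.Grammar() in the same way as the hand-written dictionary above.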
You may already have a set of Tracery grammar files that you want to generate from, or you may not want to cut and paste the grammar into your Python script. If this is the case, no problem! You can use Python's json library to load a Tracery grammar from any JSON file. The program below shows how this works.
Included is a sample grammar called test-grammar.json. Let's have a look.
!cat test-grammar.json
{ "origin": "#hello.capitalize#, #location#!", "hello": ["hello", "greetings", "howdy", "hey"], "location": ["world", "solar system", "galaxy", "universe"] }
Python's json module provides functions for reading JSON-formatted data into Python as Python data structures, and exporting Python data structures to JSON format. The .loads() function from the module parses a string containing JSON-formatted data and returns the corresponding Python data structure (typically a dictionary or a list).
import tracery
from tracery.modifiers import base_english
import json
# use json.loads() and open() to read in a JSON file as a Python data structure
rules = json.loads(open("test-grammar.json").read())
grammar = tracery.Grammar(rules)
grammar.add_modifiers(base_english)
# print ten random outputs
for i in range(10):
    print(grammar.flatten("#origin#"))
Greetings, galaxy!
Howdy, galaxy!
Howdy, galaxy!
Hey, solar system!
Hello, galaxy!
Hey, world!
Howdy, world!
Hello, world!
Hey, galaxy!
Hello, world!
The above example uses a for loop to call the .flatten() method multiple times.
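The example above leaves the opened file for Python to close on its own. A slightly more defensive variant, sketched here as an assumption rather than as this tutorial's method, wraps the file in a with block and uses json.load(), which reads directly from a file object. The load_rules name is hypothetical, the code assumes Python 3, and the demonstration writes a small throwaway grammar file so that the snippet runs on its own.

```python
import json
import os
import tempfile

def load_rules(path):
    """Read a Tracery grammar stored as JSON and return it as a dictionary."""
    # the with block closes the file as soon as the data has been read
    with open(path, encoding="utf-8") as f:
        return json.load(f)

# demonstration only: write a tiny grammar to a temporary file, then load it
sample = {"origin": "#hello#, #location#!", "hello": ["hello"], "location": ["world"]}
with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "sample-grammar.json")
    with open(path, "w", encoding="utf-8") as f:
        json.dump(sample, f)
    rules = load_rules(path)

print(sorted(rules))
```

In a real program you would call load_rules("test-grammar.json") and pass the result to tracery.Grammar() as before.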
An interesting affordance of using Tracery in Python is the ability to fill in parts of the grammar using external data. By "external" data, I mean data that isn't written directly in the grammar itself, but that you insert into the grammar when your program runs. One reason to do this might be to make the output of your grammar dynamic, using (e.g.) data returned from a web API. A simpler reason is that large Tracery grammars can be difficult to edit and navigate, especially when you're working with rules that might have hundreds or thousands of possible replacements.
To demonstrate, let's start with the generator discussed in my Tracery tutorial that generates takes on the "Dammit Jim, I'm a doctor, not a OTHER PROFESSION" snowclone/trope. Such a grammar might look like this:
{
    "origin": "#interjection#, #name#! I'm a #profession#, not a #profession#!",
    "interjection": ["alas", "congratulations", "eureka", "fiddlesticks",
        "good grief", "hallelujah", "oops", "rats", "thanks", "whoa", "yes"],
    "name": ["Jim", "John", "Tom", "Steve", "Kevin", "Gary", "George", "Larry"],
    "profession": [
        "accountant",
        "butcher",
        "economist",
        "forest fire prevention specialist",
        "mathematician",
        "proofreader",
        "singer",
        "teacher assistant",
        "travel agent",
        "welder"
    ]
}
An immediately recognizable shortcoming of this grammar is that it doesn't have a large number of alternatives. If we want there to be more professions that, dammit Jim, I'm not, we need to type them into the grammar by hand. The selection of names is also woefully small.
It would be nice if we could supplement the grammar by adding rule expansions from existing databases. For example, Corpora has a list of occupations and a list of first names, which we could incorporate into our grammar. One way to do this would be simply to copy/paste the relevant part of the JSON file into the grammar. But we can also load the data directly into the grammar using Python.
The program in the following cell specifies a partial Tracery grammar in a Python dictionary assigned to variable rules. The grammar is then augmented with data loaded from JSON files obtained from Corpora. Using the json library, we load the Corpora Project JSON files, find the data we need, and then assign it to new rules in the grammar. To get the example to work, we'll need to download firstNames.json and occupations.json, so let's do that first using wget.
!wget https://raw.githubusercontent.com/dariusk/corpora/master/data/humans/firstNames.json
!wget https://raw.githubusercontent.com/dariusk/corpora/master/data/humans/occupations.json
--2017-07-18 23:01:27-- https://raw.githubusercontent.com/dariusk/corpora/master/data/humans/firstNames.json
Resolving raw.githubusercontent.com... 151.101.164.133
Connecting to raw.githubusercontent.com|151.101.164.133|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 5647 (5.5K) [text/plain]
Saving to: 'firstNames.json'
firstNames.json 100%[===================>] 5.51K --.-KB/s in 0s
2017-07-18 23:01:28 (43.4 MB/s) - 'firstNames.json' saved [5647/5647]
--2017-07-18 23:01:28-- https://raw.githubusercontent.com/dariusk/corpora/master/data/humans/occupations.json
Resolving raw.githubusercontent.com... 151.101.164.133
Connecting to raw.githubusercontent.com|151.101.164.133|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 22680 (22K) [text/plain]
Saving to: 'occupations.json'
occupations.json 100%[===================>] 22.15K --.-KB/s in 0.03s
2017-07-18 23:01:28 (815 KB/s) - 'occupations.json' saved [22680/22680]
The key trick here is that when creating the grammar, we refer to rules that don't yet exist. Later in the code, we add those rules (and their associated expansions, from the Corpora Project JSON files) by assigning values to keys in the rules dictionary. We're essentially building the grammar up gradually over the course of the program, instead of writing it all at once.
import tracery
from tracery.modifiers import base_english
import json
# the grammar refers to "name" and "profession" rules. we're not including them in the grammar
# here, but adding them later on (using corpora project data!)
rules = {
    "origin": "#interjection.capitalize#, #name#! I'm #profession.a#, not #profession.a#!",
    "interjection": ["alas", "congratulations", "eureka", "fiddlesticks",
        "good grief", "hallelujah", "oops", "rats", "thanks", "whoa", "yes"],
}
# load the JSON data from files downloaded from corpora project
names_data = json.loads(open("firstNames.json").read())
occupation_data = json.loads(open("occupations.json").read())
# set the values for "name" and "profession" rules with corpora data
rules["name"] = names_data["firstNames"]
rules["profession"] = occupation_data["occupations"]
# generate!
grammar = tracery.Grammar(rules)
grammar.add_modifiers(base_english)
print(grammar.flatten("#origin#"))
Whoa, Alexis! I'm a power tool repairer, not a custom sewer!
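Because the grammar is assembled in stages, it is easy to forget to add one of the rules it refers to. As a rough sketch (the missing_symbols helper and its regular expression are my own assumptions, not part of the tracery module), the following code lists symbols that are referenced somewhere in the rules but not yet defined. It only understands the plain #symbol# and #symbol.modifier# forms used in this tutorial, not Tracery's bracketed actions or modifiers with parameters.

```python
import re

# matches #symbol# and #symbol.modifier#, capturing only the symbol name;
# Tracery's bracketed actions (e.g. [hero:#name#]) are not handled here
SYMBOL_PATTERN = re.compile(r"#([A-Za-z0-9_]+)(?:\.[A-Za-z0-9_.]+)?#")

def missing_symbols(rules):
    """Return the sorted names of symbols that are referenced but not defined."""
    referenced = set()
    for expansions in rules.values():
        # a rule's value may be a single string or a list of strings
        if isinstance(expansions, str):
            expansions = [expansions]
        for text in expansions:
            referenced.update(SYMBOL_PATTERN.findall(text))
    return sorted(referenced - set(rules))

rules = {
    "origin": "#interjection.capitalize#, #name#! I'm #profession.a#, not #profession.a#!",
    "interjection": ["alas", "eureka", "whoa"],
}
print(missing_symbols(rules))  # "name" and "profession" are still missing here

# placeholder expansions standing in for the Corpora data
rules["name"] = ["Jim"]
rules["profession"] = ["doctor"]
print(missing_symbols(rules))  # nothing left to define
```

Calling a check like this just before tracery.Grammar(rules) gives an early warning if one of the data files failed to supply a rule.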
EXERCISE: Write a Tracery grammar that changes its output based on the current time of day. You'll need to use something like datetime for this; after you've imported it, the expression datetime.datetime.now().hour evaluates to the current hour of the day.
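If you'd like a starting point, here is one possible sketch, offered as a hint rather than as the intended solution. The time_of_day function, its hour boundaries, and the "timeofday" rule name are all assumptions made for illustration. The snippet only builds the rules dictionary, which you would then pass to tracery.Grammar() as in the earlier examples.

```python
import datetime

def time_of_day(hour):
    """Map an hour (0-23) to a rough, illustrative part-of-day label."""
    if 5 <= hour < 12:
        return "morning"
    elif 12 <= hour < 18:
        return "afternoon"
    elif 18 <= hour < 22:
        return "evening"
    return "night"

rules = {
    "origin": "It's #timeofday# here in the #location#!",
    "location": ["world", "solar system", "galaxy", "universe"],
}

# add the rule at run time, based on the current hour
rules["timeofday"] = time_of_day(datetime.datetime.now().hour)
print(rules["timeofday"])
```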
In my CSV tutorial, the final example shows how you might build sentences from data in a CSV file (in particular, a CSV exported from this spreadsheet containing data about the dogs of NYC, originally from here). The method chosen in that example for constructing sentences is suboptimal: there's a lot of just slamming strings together with the + operator, which makes it hard to build in variation. It would be nice if we could build a Tracery grammar for generating these sentences instead!
The following example does exactly this. As with the example in the previous section, this example constructs a partial Tracery grammar, and then adds rules to the grammar with new information. The difference is that, instead of loading in data once at the beginning, we generate a sentence for each of many records. For each row in the CSV file, we create a fresh copy of the rules, then add rule/expansion pairs with the relevant data from the row. Inside the for loop, we construct a new grammar object and print the "flattened" (i.e. expanded) text.
import tracery
from tracery.modifiers import base_english
import json
import csv
# create the "template" grammar, which will be copied and augmented for each record of the CSV
rules = {
    "origin": [
        "#greeting.capitalize#! My name is #name#. #coatdesc.capitalize# and #breeddesc#. #homedesc#. #woof.capitalize#!",
        "#greeting.capitalize#! #woof.capitalize#! My name is #name# and #breeddesc#. #homedesc.capitalize# and #coatdesc#.",
        "#woof.capitalize#! #homedesc.capitalize# and they call me #name#. #breeddesc.capitalize# and #coatdesc#."
    ],
    "greeting": ["hi", "howdy", "hello", "hey"],
    "woof": ["woof", "arf", "bow-wow", "yip", "ruff-ruff"],
    "coatdesc": [
        "my coat is #color#",
        "I've got #color.a# coat",
        "you could say my coat is #color#"
    ],
    "breeddesc": [
        "I'm #breed.a#",
        "my breed is #breed#",
        "I'm the #superlative# #breed# you'll ever meet"
    ],
    "superlative": ["cutest", "strongest", "most playful", "friendliest", "cleverest", "most loyal"],
    "homedesc": [
        "I'm from #borough#",
        "I live in #borough#",
        "I love living in #borough#",
        "#borough# is where I call home"
    ]
}
# iterate over the first 100 rows in the CSV file
for row in list(csv.DictReader(open("dogs-of-nyc.csv")))[:100]:
    # copy rules so we're not continuously overwriting values
    rules_copy = dict(rules) # make a copy of the rules
    # now assign new rule/expansion pairs with the data from the current row
    rules_copy["name"] = row["dog_name"]
    rules_copy["color"] = row["dominant_color"].lower()
    rules_copy["breed"] = row["breed"]
    # little bit of fluency clean-up...
    if row["borough"] == "Bronx":
        rules_copy["borough"] = "the " + row["borough"]
    else:
        rules_copy["borough"] = row["borough"]
    # now generate!
    grammar = tracery.Grammar(rules_copy)
    grammar.add_modifiers(base_english)
    print(grammar.flatten("#origin#"))
Arf! I love living in Manhattan and they call me Buddy. My breed is Afghan Hound and my coat is brindle. Ruff-ruff! Manhattan is where I call home and they call me Nicole. I'm the cleverest Afghan Hound you'll ever meet and you could say my coat is black. Yip! I live in Manhattan and they call me Abby. I'm the friendliest Afghan Hound you'll ever meet and I've got a black coat. Howdy! My name is Chloe. I've got a white coat and my breed is Afghan Hound. I live in Manhattan. Arf! Ruff-ruff! I'm from Manhattan and they call me Jazzle. My breed is Afghan Hound and my coat is blond. Hi! My name is Trouble. My coat is blond and I'm an Afghan Hound. the Bronx is where I call home. Ruff-ruff! Hello! Arf! My name is Grace and I'm the most playful Afghan Hound you'll ever meet. I'm from Manhattan and I've got a cream coat. Hey! Ruff-ruff! My name is Sisu and I'm the cleverest Afghan Hound you'll ever meet. Manhattan is where I call home and you could say my coat is black. Arf! I'm from Queens and they call me Jakie. I'm the friendliest Afghan Hound you'll ever meet and you could say my coat is white. Hey! Arf! My name is Geo and I'm an Afghan Hound. I'm from the Bronx and my coat is orange. Hello! My name is Ginger. I've got a tan coat and my breed is Afghan Hound. the Bronx is where I call home. Woof! Woof! The Bronx is where I call home and they call me Misty. I'm an Afghan Hound and my coat is tan. Howdy! My name is Troy. You could say my coat is blond and I'm the friendliest Afghan Hound you'll ever meet. I live in Staten Island. Bow-wow! Bow-wow! I'm from Queens and they call me Nick. I'm the cutest Afghan Hound you'll ever meet and you could say my coat is black. Hello! Arf! My name is Prince and I'm the most loyal Afghan Hound you'll ever meet. Queens is where I call home and my coat is tan. Howdy! Yip! My name is KIKU-NO-HM and I'm the most playful Akita you'll ever meet. I love living in Manhattan and I've got a gray coat. Hey! My name is Sasha. 
My coat is white and I'm the strongest Akita you'll ever meet. I live in Manhattan. Bow-wow! Yip! Queens is where I call home and they call me Bernie. I'm the strongest Akita you'll ever meet and my coat is white. Howdy! My name is Coffee. You could say my coat is brown and I'm the strongest Akita you'll ever meet. I'm from the Bronx. Yip! Hi! Arf! My name is Elsa and I'm an Akita. I'm from Brooklyn and you could say my coat is tan. Hey! My name is Jason. I've got a black coat and my breed is Akita. I'm from Queens. Ruff-ruff! Ruff-ruff! I love living in Queens and they call me Socrates. I'm an Akita and my coat is white. Arf! I live in Staten Island and they call me Suki. My breed is Akita and you could say my coat is rust. Hello! Woof! My name is Angel and I'm an Akita. Staten Island is where I call home and my coat is rust. Hey! My name is Aspen. My coat is brown and my breed is Akita. I'm from Brooklyn. Woof! Hi! Woof! My name is Bear and my breed is Akita. Queens is where I call home and my coat is black. Hey! Ruff-ruff! My name is Buster and I'm the strongest Akita you'll ever meet. I love living in Queens and you could say my coat is white. Hi! Arf! My name is CHIYO-KYRA and I'm the most playful Akita you'll ever meet. I love living in Queens and you could say my coat is brindle. Howdy! My name is Chula. I've got a white coat and my breed is Akita. Queens is where I call home. Ruff-ruff! Hi! Bow-wow! My name is Kita and I'm an Akita. Brooklyn is where I call home and you could say my coat is rust. Yip! I'm from the Bronx and they call me Nicki. I'm an Akita and you could say my coat is black. Howdy! Ruff-ruff! My name is Nikita and I'm an Akita. I love living in Queens and I've got a white coat. Arf! Queens is where I call home and they call me Ralph. I'm the strongest Akita you'll ever meet and my coat is tan. Hey! Yip! My name is Rambo and I'm an Akita. I love living in Queens and my coat is tan. Hey! Woof! 
My name is Shoko and I'm the strongest Akita you'll ever meet. I'm from Brooklyn and my coat is tan. Ruff-ruff! I live in the Bronx and they call me Bear. My breed is Akita and I've got a blond coat. Hello! Bow-wow! My name is DARIUS and I'm an Akita. I live in Queens and my coat is white. Bow-wow! I live in Queens and they call me Lady. I'm an Akita and you could say my coat is tan. Hi! My name is Oreo. You could say my coat is black and my breed is Akita. I live in Staten Island. Woof! Hello! Arf! My name is Prince and I'm an Akita. Queens is where I call home and you could say my coat is blond. Hello! Woof! My name is Sabastian and my breed is Akita. I live in Manhattan and I've got a white coat. Arf! I love living in the Bronx and they call me NICKO. I'm the strongest Akita you'll ever meet and you could say my coat is tan. Hey! Bow-wow! My name is AKIRO and I'm the friendliest Akita you'll ever meet. I live in the Bronx and I've got a tan coat. Hello! My name is Annie. I've got a white coat and I'm the cutest Akita you'll ever meet. Manhattan is where I call home. Bow-wow! Howdy! My name is King. You could say my coat is brown and I'm the most playful Akita you'll ever meet. the Bronx is where I call home. Yip! Howdy! Woof! My name is SCOOBYDORA and I'm the friendliest Akita you'll ever meet. I love living in Queens and my coat is brown. Woof! I love living in Manhattan and they call me TENSHI. I'm an Akita and my coat is tan. Hello! My name is Yuki. You could say my coat is black and I'm the cutest Akita you'll ever meet. the Bronx is where I call home. Ruff-ruff! Howdy! My name is Grizz. My coat is rust and I'm an Akita. I'm from Manhattan. Ruff-ruff! Hi! Bow-wow! My name is Bear and I'm an Akita. I'm from Manhattan and my coat is white. Ruff-ruff! I live in Manhattan and they call me Sukiyaki. I'm an Akita and you could say my coat is brindle. Ruff-ruff! I live in Manhattan and they call me Amy. My breed is Akita and I've got a brown coat. Ruff-ruff! 
I'm from the Bronx and they call me Asia. I'm an Akita and my coat is black. Howdy! Bow-wow! My name is Laki and my breed is Akita. Queens is where I call home and I've got a black coat. Hey! Woof! My name is Lucy and I'm the cleverest Akita you'll ever meet. I'm from Manhattan and my coat is white. Woof! Brooklyn is where I call home and they call me Luna. I'm an Akita and you could say my coat is white. Hi! My name is Mugs. I've got a tan coat and I'm the friendliest Akita you'll ever meet. I live in Staten Island. Ruff-ruff! Howdy! My name is Nicki. You could say my coat is white and my breed is Akita. I love living in Queens. Arf! Yip! Queens is where I call home and they call me PUKI. I'm an Akita and I've got a black coat. Howdy! Woof! My name is Star and I'm the cleverest Akita you'll ever meet. Queens is where I call home and I've got a white coat. Howdy! My name is Sydney. My coat is brown and my breed is Akita. I love living in Manhattan. Woof! Woof! I love living in the Bronx and they call me Yoshi. My breed is Akita and I've got a brindle coat. Hey! Arf! My name is Babe and my breed is Akita. I'm from Queens and my coat is white. Howdy! My name is HEAVENSENT. I've got a brown coat and my breed is Akita. Queens is where I call home. Yip! Hey! Bow-wow! My name is Kobi and I'm the strongest Akita you'll ever meet. I live in the Bronx and my coat is brindle. Hello! My name is Nikita. My coat is brindle and my breed is Akita. I love living in Manhattan. Ruff-ruff! Woof! I'm from Queens and they call me Rocky. I'm an Akita and my coat is white. Hi! Arf! My name is Romeo and I'm the friendliest Akita you'll ever meet. I love living in Brooklyn and I've got a brindle coat. Yip! I love living in Manhattan and they call me Tara. My breed is Akita and my coat is tan. Arf! I live in Queens and they call me Tara. I'm the most playful Akita you'll ever meet and you could say my coat is white. Hi! My name is Tasha. 
My coat is brown and I'm the friendliest Akita you'll ever meet. I love living in Queens. Ruff-ruff! Ruff-ruff! I love living in Staten Island and they call me Bella. I'm an Akita and I've got a black coat. Yip! I'm from Queens and they call me Yoji. I'm the strongest Akita you'll ever meet and you could say my coat is black. Woof! I live in Queens and they call me Tyson. My breed is Akita and I've got a white coat. Bow-wow! I live in Staten Island and they call me Cookie. I'm an Akita and you could say my coat is black. Howdy! Woof! My name is Jax and I'm the most playful Akita you'll ever meet. I'm from Queens and you could say my coat is black. Ruff-ruff! I live in Brooklyn and they call me KELBY. I'm an Akita and I've got a black coat. Hello! Yip! My name is Kiko and I'm an Akita. I'm from Staten Island and my coat is brown. Hey! Ruff-ruff! My name is n/a and I'm the most loyal Akita you'll ever meet. Manhattan is where I call home and my coat is blond. Hi! My name is Snowball. I've got a white coat and my breed is Akita. I live in Brooklyn. Woof! Hi! Yip! My name is Bogie and I'm the strongest Akita you'll ever meet. I'm from the Bronx and my coat is black. Bow-wow! I'm from Queens and they call me WON. I'm the cleverest Akita you'll ever meet and my coat is white. Yip! Brooklyn is where I call home and they call me Ginger. I'm the friendliest Akita you'll ever meet and you could say my coat is white. Hello! Arf! My name is Bear and I'm an Akita. I love living in Brooklyn and I've got a brindle coat. Yip! I love living in the Bronx and they call me Buddy. I'm the most playful Akita you'll ever meet and you could say my coat is black. Hey! My name is Coquito. You could say my coat is tan and I'm the friendliest Akita you'll ever meet. the Bronx is where I call home. Bow-wow! Hi! Arf! My name is HENESSEY and I'm the cleverest Akita you'll ever meet. I love living in Brooklyn and my coat is black. Bow-wow! I live in Brooklyn and they call me KICHI. 
I'm the strongest Akita you'll ever meet and you could say my coat is brown. Woof! I love living in Queens and they call me Yogi. I'm the most playful Akita you'll ever meet and my coat is white. Hello! Yip! My name is Brandy and I'm the friendliest Akita you'll ever meet. I love living in Queens and I've got a tan coat. Howdy! Arf! My name is Chester and I'm the most loyal Akita you'll ever meet. Queens is where I call home and my coat is black. Hey! My name is Jazzy. My coat is blond and I'm the friendliest Akita you'll ever meet. I'm from Queens. Yip! Woof! Queens is where I call home and they call me Seven. I'm the cutest Akita you'll ever meet and I've got a blond coat. Bow-wow! I love living in Brooklyn and they call me KIRA-A. I'm the strongest Akita you'll ever meet and my coat is rust. Bow-wow! I love living in Queens and they call me Layla. My breed is Akita and you could say my coat is black. Hey! Bow-wow! My name is Charlie and my breed is Akita. I live in Brooklyn and I've got a brindle coat. Woof! Brooklyn is where I call home and they call me Reggie. I'm the most loyal Akita you'll ever meet and my coat is rust. Hey! My name is Rocco. You could say my coat is silver and I'm an Akita. I love living in Staten Island. Yip! Hello! My name is Saki. My coat is white and my breed is Akita. I love living in the Bronx. Arf! Hey! Arf! My name is Stella and I'm the strongest Akita you'll ever meet. I'm from Brooklyn and my coat is white.
As you can see, this technique allows us to combine the expressive strengths of Tracery-based text generation with Python's CSV parser to generate simple "stories" from spreadsheet data. Pretty neat!
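The per-row step can also be pulled out into a small function, which makes it easier to test without generating any text. The sketch below is one possible arrangement, not the code used above: rules_for_row is a hypothetical helper, the two CSV rows are made-up stand-ins for the real file (only the column names come from the example above), and itertools.islice is swapped in for the list slice so that only the requested rows are read.

```python
import csv
import io
import itertools

def rules_for_row(template, row):
    """Return a copy of the template rules extended with one CSV row's data."""
    rules = dict(template)  # shallow copy; the template itself is left unchanged
    rules["name"] = row["dog_name"]
    rules["color"] = row["dominant_color"].lower()
    rules["breed"] = row["breed"]
    # "the Bronx" reads more naturally than "Bronx" in a sentence
    borough = row["borough"]
    rules["borough"] = "the " + borough if borough == "Bronx" else borough
    return rules

template = {"origin": "My name is #name# and I live in #borough#."}

# made-up stand-in for the CSV file, using the column names from the example above
sample_csv = io.StringIO(
    "dog_name,dominant_color,breed,borough\n"
    "Rex,BLACK,Akita,Bronx\n"
    "Fifi,WHITE,Poodle,Queens\n"
)

# islice stops after the requested number of rows without reading the rest
for row in itertools.islice(csv.DictReader(sample_csv), 100):
    print(rules_for_row(template, row))
```

With the real data you would replace sample_csv with a file opened in a with block, and pass each returned dictionary to tracery.Grammar() as in the loop above.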