import os
import pandas as pd
from bbw import bbw
from IPython.display import display, HTML
data = [
['col0', 'col1', 'col2', 'col3'],
['Mannheim','Rhine', '97', 'Baden-Württemberg'],
['Edinburgh','River Forth', '47', 'City of Edinburgh']
]
df = pd.DataFrame(data)
df
| | 0 | 1 | 2 | 3 |
|---|---|---|---|---|
| 0 | col0 | col1 | col2 | col3 |
| 1 | Mannheim | Rhine | 97 | Baden-Württemberg |
| 2 | Edinburgh | River Forth | 47 | City of Edinburgh |
[web_table, url_table, label_table, cpa, cea, cta] = bbw.annotate(df)
display(HTML(web_table.to_html(escape=False)))
The examples so far worked without SearX, because it is not installed locally alongside this Jupyter notebook.
However, we can try it out with a public instance listed at https://searx.space/# (use it sparingly, as this only works for a handful of examples at a time).
# For example
os.environ["BBW_SEARX_URL"] = "https://searx.monicz.pl/"
os.environ["BBW_SEARX_URL"]
'https://searx.monicz.pl/'
# Use searx to get the bestname for a string with mistakes
[bbw.get_searx_bestname('Monnhem'), bbw.get_searx_bestname('dingbur')]
[['Mannheim'], ['Edinburgh', 'edinburgh', 'Dingbur']]
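Each call returns a list of candidate names rather than a single string. A minimal sketch of reducing such a list to one name follows; it assumes, without confirmation from bbw, that the first candidate is the preferred one. The helper `pick_name` is hypothetical and not part of bbw, and the candidate lists are copied from the output above.

```python
# Candidate lists copied from the output shown above.
candidates = {
    "Monnhem": ["Mannheim"],
    "dingbur": ["Edinburgh", "edinburgh", "Dingbur"],
}


def pick_name(original, names):
    """Hypothetical helper: take the first candidate, or keep the input if none was found.

    Assumption: the first entry is the preferred suggestion.
    """
    return names[0] if names else original


corrected = {query: pick_name(query, names) for query, names in candidates.items()}
print(corrected)  # {'Monnhem': 'Mannheim', 'dingbur': 'Edinburgh'}
```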
# Introduce spelling mistakes; .loc[row, column] avoids chained assignment
df.loc[1, 0] = "Monnheim"
df.loc[2, 0] = "dingbur"
df
| | 0 | 1 | 2 | 3 |
|---|---|---|---|---|
| 0 | col0 | col1 | col2 | col3 |
| 1 | Monnheim | Rhine | 97 | Baden-Württemberg |
| 2 | dingbur | River Forth | 47 | City of Edinburgh |
[web_table, url_table, label_table, cpa, cea, cta] = bbw.annotate(df)
display(HTML(web_table.to_html(escape=False)))
The GUI runs on a special port, 8501, which you can access from the current URL by replacing notebooks/bbw.ipynb with proxy/8501/.
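The URL rewrite described above can be sketched in a few lines of Python. The function name `gui_url` and the host `hub.example.org` are made up for illustration; only the path fragments notebooks/bbw.ipynb and proxy/8501/ come from the text.

```python
def gui_url(notebook_url, port=8501, notebook_path="notebooks/bbw.ipynb"):
    """Hypothetical helper: swap the notebook path in the URL for the proxy path of the GUI port."""
    if notebook_path not in notebook_url:
        raise ValueError(f"{notebook_path!r} not found in {notebook_url!r}")
    # Replace only the first occurrence so the rest of the URL stays untouched.
    return notebook_url.replace(notebook_path, f"proxy/{port}/", 1)


# Example with a placeholder host; substitute the URL shown in your browser.
print(gui_url("https://hub.example.org/user/demo/notebooks/bbw.ipynb"))
# https://hub.example.org/user/demo/proxy/8501/
```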