Hacky answer to post 'Convert formulas into values using jupyterlab-spreadsheet-editor' on Jupyter Discourse page.
Note that jupyterlab-spreadsheet-editor is based on use of jexcel/Jspreadsheet CE.
To have something to work with, I went to here in the JupyterLab Spreadsheet Editor repo and clicked on the launch binder
badge to launch a session with the extension working. And then in the sessio essentially followed the gif below 'Formula support basic formula calculation (rendering) - as implemented by jExcel' as a guide to make a table that uses formulas.
Got:
Note the actual content I got from copying that table, which from was made like that using the gif below 'Formula support basic formula calculation (rendering) - as implemented by jExcel' as a guide:
5
1 =A3*B2
2 =A4*B3
3 =A5*B4
4 =A6*B5
5 =A7*B6
I figured that the HTML representation had the values because they are rendered and so I should be able to get them from there at least.
Getting the HTML table code:
In the jupyterlab-spreadsheet-editor view of my demo spreadheeet, did Shift
+ Right-click on bottom value in column that I made with formula and choose from Chrome broser menu Inspect
. (You may need to have developer tools on prior to this?) In my example it was column B. Then I right-clicked on line that is </table>
below that in the HTML elements listing. (For some reason when I did that, it looked more like </table> == $0
.) Then I right-clicked and choose Copy
> OuterHTML
.
I examined what that looked like by adding %%HTML
cell magic to the start of a Jupyter code cell and pastin in the copied HTML below it, like so:
%%HTML
<table class="jexcel jexcel_overflow" cellpadding="0" cellspacing="0" unselectable="yes"><colgroup><col width="50"><col width="40"><col width="40"><col width="40"><col width="40"></colgroup><thead class="draggable resizable"><tr><td class="jexcel_selectall"></td><td data-x="0" class="selected" style="text-align: center;">A</td><td data-x="1" style="text-align: center;" class="">B</td><td data-x="2" style="text-align: center;">C</td><td data-x="3" style="text-align: center;">D</td></tr></thead><tbody class="draggable"><tr data-y="0" class="selected"><td data-y="0" class="jexcel_row">1</td><td data-x="0" data-y="0" class="highlight-selected highlight highlight-top highlight-bottom highlight-left highlight-right" style="text-align: center;"></td><td data-x="1" data-y="0" style="text-align: center;"></td><td data-x="2" data-y="0" style="text-align: center;"></td><td data-x="3" data-y="0" style="text-align: center;"></td></tr><tr data-y="1" class=""><td data-y="1" class="jexcel_row">2</td><td data-x="0" data-y="1" style="text-align: center;" class=""></td><td data-x="1" data-y="1" style="text-align: center;" class="">5</td><td data-x="2" data-y="1" style="text-align: center;"></td><td data-x="3" data-y="1" style="text-align: center;"></td></tr><tr data-y="2" class=""><td data-y="2" class="jexcel_row">3</td><td data-x="0" data-y="2" style="text-align: center;" class="">1</td><td data-x="1" data-y="2" style="text-align: center;" class="">5</td><td data-x="2" data-y="2" style="text-align: center;"></td><td data-x="3" data-y="2" style="text-align: center;"></td></tr><tr data-y="3" class=""><td data-y="3" class="jexcel_row">4</td><td data-x="0" data-y="3" style="text-align: center;" class="">2</td><td data-x="1" data-y="3" style="text-align: center;" class="">10</td><td data-x="2" data-y="3" style="text-align: center;"></td><td data-x="3" data-y="3" style="text-align: center;"></td></tr><tr data-y="4" class=""><td data-y="4" class="jexcel_row">5</td><td data-x="0" data-y="4" style="text-align: center;" class="">3</td><td data-x="1" data-y="4" style="text-align: center;" class="">30</td><td data-x="2" data-y="4" style="text-align: center;"></td><td data-x="3" data-y="4" style="text-align: center;"></td></tr><tr data-y="5" class=""><td data-y="5" class="jexcel_row">6</td><td data-x="0" data-y="5" style="text-align: center;" class="">4</td><td data-x="1" data-y="5" style="text-align: center;" class="">120</td><td data-x="2" data-y="5" style="text-align: center;"></td><td data-x="3" data-y="5" style="text-align: center;"></td></tr><tr data-y="6" class=""><td data-y="6" class="jexcel_row" style="">7</td><td data-x="0" data-y="6" style="text-align: center;" class="">5</td><td data-x="1" data-y="6" style="text-align: center;" class="">600</td><td data-x="2" data-y="6" style="text-align: center;"></td><td data-x="3" data-y="6" style="text-align: center;"></td></tr></tbody></table>
A | B | C | D | |
1 | ||||
2 | 5 | |||
3 | 1 | 5 | ||
4 | 2 | 10 | ||
5 | 3 | 30 | ||
6 | 4 | 120 | ||
7 | 5 | 600 |
I could see that it indeed had the values in there.
Use BeautifulSoup to parse the HTML code.
First, assign the code text to a variable:
h='''<table class="jexcel jexcel_overflow" cellpadding="0" cellspacing="0" unselectable="yes"><colgroup><col width="50"><col width="40"><col width="40"><col width="40"><col width="40"></colgroup><thead class="draggable resizable"><tr><td class="jexcel_selectall"></td><td data-x="0" class="selected" style="text-align: center;">A</td><td data-x="1" style="text-align: center;" class="">B</td><td data-x="2" style="text-align: center;">C</td><td data-x="3" style="text-align: center;">D</td></tr></thead><tbody class="draggable"><tr data-y="0" class="selected"><td data-y="0" class="jexcel_row">1</td><td data-x="0" data-y="0" class="highlight-selected highlight highlight-top highlight-bottom highlight-left highlight-right" style="text-align: center;"></td><td data-x="1" data-y="0" style="text-align: center;"></td><td data-x="2" data-y="0" style="text-align: center;"></td><td data-x="3" data-y="0" style="text-align: center;"></td></tr><tr data-y="1" class=""><td data-y="1" class="jexcel_row">2</td><td data-x="0" data-y="1" style="text-align: center;" class=""></td><td data-x="1" data-y="1" style="text-align: center;" class="">5</td><td data-x="2" data-y="1" style="text-align: center;"></td><td data-x="3" data-y="1" style="text-align: center;"></td></tr><tr data-y="2" class=""><td data-y="2" class="jexcel_row">3</td><td data-x="0" data-y="2" style="text-align: center;" class="">1</td><td data-x="1" data-y="2" style="text-align: center;" class="">5</td><td data-x="2" data-y="2" style="text-align: center;"></td><td data-x="3" data-y="2" style="text-align: center;"></td></tr><tr data-y="3" class=""><td data-y="3" class="jexcel_row">4</td><td data-x="0" data-y="3" style="text-align: center;" class="">2</td><td data-x="1" data-y="3" style="text-align: center;" class="">10</td><td data-x="2" data-y="3" style="text-align: center;"></td><td data-x="3" data-y="3" style="text-align: center;"></td></tr><tr data-y="4" class=""><td data-y="4" class="jexcel_row">5</td><td data-x="0" data-y="4" style="text-align: center;" class="">3</td><td data-x="1" data-y="4" style="text-align: center;" class="">30</td><td data-x="2" data-y="4" style="text-align: center;"></td><td data-x="3" data-y="4" style="text-align: center;"></td></tr><tr data-y="5" class=""><td data-y="5" class="jexcel_row">6</td><td data-x="0" data-y="5" style="text-align: center;" class="">4</td><td data-x="1" data-y="5" style="text-align: center;" class="">120</td><td data-x="2" data-y="5" style="text-align: center;"></td><td data-x="3" data-y="5" style="text-align: center;"></td></tr><tr data-y="6" class=""><td data-y="6" class="jexcel_row" style="">7</td><td data-x="0" data-y="6" style="text-align: center;" class="">5</td><td data-x="1" data-y="6" style="text-align: center;" class="">600</td><td data-x="2" data-y="6" style="text-align: center;"></td><td data-x="3" data-y="6" style="text-align: center;"></td></tr></tbody></table>'''
Now can parse that table.
# based on https://stackoverflow.com/a/45843420/8508004
from bs4 import BeautifulSoup
html = h
soup = BeautifulSoup(html, 'html.parser')
tables = [
[
[td.get_text(strip=True) for td in tr.find_all('td')]
for tr in table.find_all('tr')
]
for table in soup.find_all('table')
]
tables
print(tables)
[[['', 'A', 'B', 'C', 'D'], ['1', '', '', '', ''], ['2', '', '5', '', ''], ['3', '1', '5', '', ''], ['4', '2', '10', '', ''], ['5', '3', '30', '', ''], ['6', '4', '120', '', ''], ['7', '5', '600', '', '']]]
Now collect the B column and the values.
column_b = [row[2] for row in tables[0][1:] if len(row) > 2]
column_b_nums = [int(x) for x in column_b if x.strip()]
column_b_nums
[5, 5, 10, 30, 120, 600]
There's the values of the formula.
What if just wanted a CSV?
Since the HTML representation has the values, that can be used to make a CSV that won't have the formulas and instead has the values as requested in the post 'Convert formulas into values using jupyterlab-spreadsheet-editor' on Jupyter Discourse page.
h='''<table class="jexcel jexcel_overflow" cellpadding="0" cellspacing="0" unselectable="yes"><colgroup><col width="50"><col width="40"><col width="40"><col width="40"><col width="40"></colgroup><thead class="draggable resizable"><tr><td class="jexcel_selectall"></td><td data-x="0" class="selected" style="text-align: center;">A</td><td data-x="1" style="text-align: center;" class="">B</td><td data-x="2" style="text-align: center;">C</td><td data-x="3" style="text-align: center;">D</td></tr></thead><tbody class="draggable"><tr data-y="0" class="selected"><td data-y="0" class="jexcel_row">1</td><td data-x="0" data-y="0" class="highlight-selected highlight highlight-top highlight-bottom highlight-left highlight-right" style="text-align: center;"></td><td data-x="1" data-y="0" style="text-align: center;"></td><td data-x="2" data-y="0" style="text-align: center;"></td><td data-x="3" data-y="0" style="text-align: center;"></td></tr><tr data-y="1" class=""><td data-y="1" class="jexcel_row">2</td><td data-x="0" data-y="1" style="text-align: center;" class=""></td><td data-x="1" data-y="1" style="text-align: center;" class="">5</td><td data-x="2" data-y="1" style="text-align: center;"></td><td data-x="3" data-y="1" style="text-align: center;"></td></tr><tr data-y="2" class=""><td data-y="2" class="jexcel_row">3</td><td data-x="0" data-y="2" style="text-align: center;" class="">1</td><td data-x="1" data-y="2" style="text-align: center;" class="">5</td><td data-x="2" data-y="2" style="text-align: center;"></td><td data-x="3" data-y="2" style="text-align: center;"></td></tr><tr data-y="3" class=""><td data-y="3" class="jexcel_row">4</td><td data-x="0" data-y="3" style="text-align: center;" class="">2</td><td data-x="1" data-y="3" style="text-align: center;" class="">10</td><td data-x="2" data-y="3" style="text-align: center;"></td><td data-x="3" data-y="3" style="text-align: center;"></td></tr><tr data-y="4" class=""><td data-y="4" class="jexcel_row">5</td><td data-x="0" data-y="4" style="text-align: center;" class="">3</td><td data-x="1" data-y="4" style="text-align: center;" class="">30</td><td data-x="2" data-y="4" style="text-align: center;"></td><td data-x="3" data-y="4" style="text-align: center;"></td></tr><tr data-y="5" class=""><td data-y="5" class="jexcel_row">6</td><td data-x="0" data-y="5" style="text-align: center;" class="">4</td><td data-x="1" data-y="5" style="text-align: center;" class="">120</td><td data-x="2" data-y="5" style="text-align: center;"></td><td data-x="3" data-y="5" style="text-align: center;"></td></tr><tr data-y="6" class=""><td data-y="6" class="jexcel_row" style="">7</td><td data-x="0" data-y="6" style="text-align: center;" class="">5</td><td data-x="1" data-y="6" style="text-align: center;" class="">600</td><td data-x="2" data-y="6" style="text-align: center;"></td><td data-x="3" data-y="6" style="text-align: center;"></td></tr></tbody></table>'''
import csv
import io
from bs4 import BeautifulSoup
def html_table_to_csv(html):
# Parse the HTML
soup = BeautifulSoup(html, 'html.parser')
# Find the table
table = soup.find('table')
# Prepare a CSV output
output = io.StringIO()
csv_writer = csv.writer(output)
# Extract rows
for tr in table.find_all('tr'):
# Extract cell texts, stripping whitespace
# Skip the first column (row numbers) by slicing [1:]
row = [td.get_text(strip=True) for td in tr.find_all('td')[1:]]
# Only write non-empty rows
if row:
csv_writer.writerow(row)
# Get the CSV content as a string
csv_content = output.getvalue()
return csv_content
# Convert the HTML to CSV
csv_output = html_table_to_csv(h)
print(csv_output)
A,B,C,D ,,, ,5,, 1,5,, 2,10,, 3,30,, 4,120,, 5,600,,
That works.
So save it as a file:
%store csv_output >csv_output.csv
Writing 'csv_output' (str) to file 'csv_output.csv'.
cat csv_output.csv
A,B,C,D ,,, ,5,, 1,5,, 2,10,, 3,30,, 4,120,, 5,600,,
That looks like the following in the result, which is pretty close to the initially rendered table that had the formulas, but now only has values: