WARNING: This document will not render correctly using nbviewer or nbconvert. To render this notebook correctly, open it in IPython Notebook and run Cell->Run All from the menu bar.
The IPython Notebook allows Markdown, HTML, and inline LaTeX in Markdown cells. The inline LaTeX is parsed with MathJax and Markdown is parsed with marked. Any inline HTML is left to the web browser to parse. NBConvert is a utility that allows users to easily convert their notebooks to various formats. Pandoc is used to parse Markdown text in NBConvert. Since what the notebook web interface supports is a mix of Markdown, HTML, and LaTeX, Pandoc has trouble converting notebook Markdown. This results in incomplete representations of the notebook in nbviewer or a compiled LaTeX PDF.
This isn't a Pandoc flaw; Pandoc isn't designed to parse and convert a mixed format document. Unfortunately, this means that Pandoc can only support a subset of the markup supported in the notebook web interface. This notebook compares output of Pandoc to the notebook web interface.
Changes:
05102013
06102013
Define functions to render Markdown using the notebook and Pandoc.
from IPython.nbconvert.utils.pandoc import pandoc
from IPython.display import HTML, Javascript, display
from IPython.nbconvert.filters import citation2latex, strip_files_prefix, \
    markdown2html, markdown2latex
def pandoc_render(markdown):
    """Render Pandoc Markdown->LaTeX content."""
    ## Convert the markdown directly to latex. This is what nbconvert does.
    #latex = pandoc(markdown, "markdown", "latex")
    #html = pandoc(markdown, "markdown", "html", ["--mathjax"])

    # nbconvert template conversions
    html = strip_files_prefix(markdown2html(markdown))
    latex = markdown2latex(citation2latex(markdown))
    display(HTML(data="<div style='display: inline-block; width: 30%; vertical-align: top;'>" \
        "<div style='background: #AAFFAA; width: 100%;'>NBConvert Latex Output</div>" \
        "<pre class='prettyprint lang-tex' style='background: #EEFFEE; border: 1px solid #DDEEDD;'><xmp>" + latex + "</xmp></pre>"\
        "</div>" \
        "<div style='display: inline-block; width: 2%;'></div>" \
        "<div style='display: inline-block; width: 30%; vertical-align: top;'>" \
        "<div style='background: #FFAAAA; width: 100%;'>NBViewer Output</div>" \
        "<div style='display: inline-block; width: 100%;'>" + html + "</div>" \
        "</div>"))
    # Load the prettify script so the LaTeX source gets syntax highlighting.
    javascript = """
        $.getScript("https://google-code-prettify.googlecode.com/svn/loader/run_prettify.js");
    """
    display(Javascript(data=javascript))
def notebook_render(markdown):
    """Render Markdown with the notebook's own MarkdownCell (marked + MathJax)."""
    # Escape backslashes, single quotes and newlines so the Markdown source
    # survives inside the single-quoted JavaScript string literal below.
    javascript = """
        var mdcell = new IPython.MarkdownCell();
        mdcell.create_element();
        mdcell.set_text('""" + markdown.replace("\\", "\\\\").replace("'", "\\'").replace("\n", "\\n") + """');
        mdcell.render();
        $(element).append(mdcell.element)
            .removeClass()
            .css('left', '66%')
            .css('position', 'absolute')
            .css('width', '30%');
        mdcell.element.prepend(
            $('<div />')
                .removeClass()
                .css('background', '#AAAAFF')
                .css('width', '100%')
                .html('Notebook Output')
        );
        container.show();
    """
    display(Javascript(data=javascript))
def pandoc_html_render(markdown):
    """Render Pandoc Markdown->LaTeX->HTML content."""
    # Convert the markdown directly to latex. This is what nbconvert does.
    latex = pandoc(markdown, "markdown", "latex")

    # Convert the pandoc generated latex to HTML so it can be rendered in
    # the web browser.
    html = pandoc(latex, "latex", "html", ["--mathjax"])
    display(HTML(data="<div style='background: #AAFFAA; width: 40%;'>HTML Pandoc Output</div>" \
        "<div style='display: inline-block; width: 40%;'>" + html + "</div>"))
    return html
def compare_render(markdown):
    notebook_render(markdown)
    pandoc_render(markdown)
try:
    import lxml
    print('LXML found!')
except ImportError:
    print('Warning! No LXML found - the old citation2latex filter will not work')
LXML found!
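notebook_render embeds the Markdown source in a single-quoted JavaScript string literal, so backslashes, single quotes, and newlines all need escaping. A minimal standalone sketch of that escaping follows; the helper name `js_single_quoted` is hypothetical and not part of IPython.

```python
def js_single_quoted(text):
    """Return text as a single-quoted JavaScript string literal (sketch)."""
    escaped = (
        text.replace("\\", "\\\\")  # backslashes first, so later escapes are not doubled
            .replace("'", "\\'")    # an unescaped quote would end the literal early
            .replace("\r", "\\r")
            .replace("\n", "\\n")   # raw newlines are not allowed inside the literal
    )
    return "'" + escaped + "'"


print(js_single_quoted("it's a\ntest"))
```

This only covers the characters that matter for the examples in this notebook; `json.dumps` from the standard library would be a more thorough way to produce a (double-quoted) JavaScript string.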
Heading level 6 is not supported by Pandoc.
compare_render(r"""
# Heading 1
## Heading 2
### Heading 3
#### Heading 4
##### Heading 5
###### Heading 6""")
\section{Heading 1} \subsection{Heading 2} \subsubsection{Heading 3} \paragraph{Heading 4} \subparagraph{Heading 5} Heading 6
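The mapping visible in the LaTeX output above can be written down as a small lookup table. This is only an illustrative sketch of the observed behaviour, not Pandoc's implementation; `heading_to_latex` and `LATEX_HEADING_COMMANDS` are hypothetical names.

```python
# Sectioning commands seen in the output above; there is no entry for level 6.
LATEX_HEADING_COMMANDS = {
    1: "section",
    2: "subsection",
    3: "subsubsection",
    4: "paragraph",
    5: "subparagraph",
}


def heading_to_latex(line):
    """Convert one ATX heading line to LaTeX, mimicking the output above."""
    stripped = line.strip()
    level = len(stripped) - len(stripped.lstrip("#"))
    title = stripped[level:].strip()
    command = LATEX_HEADING_COMMANDS.get(level)
    if command is None:
        # Level 6 (or no heading at all) falls back to plain text.
        return title
    return "\\%s{%s}" % (command, title)


print(heading_to_latex("## Heading 2"))
print(heading_to_latex("###### Heading 6"))
```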
Headers aren't recognized by Pandoc (possibly only on Windows?) if there isn't a blank line above them.
compare_render(r"""
# Heading 1
## Heading 2
### Heading 3
#### Heading 4
##### Heading 5
###### Heading 6 """)
print("\n"*10)
\section{Heading 1} \subsection{Heading 2} \subsubsection{Heading 3} \paragraph{Heading 4} \subparagraph{Heading 5} Heading 6
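One way to sidestep the blank-line issue would be to normalize the Markdown before handing it to Pandoc. The sketch below uses a hypothetical helper, `ensure_blank_line_before_headings`, and assumes the text contains no fenced code blocks whose lines start with `#` (those would be treated as headings too).

```python
import re

# An ATX heading: one to six '#' characters followed by whitespace.
HEADING = re.compile(r"^#{1,6}\s")


def ensure_blank_line_before_headings(markdown):
    """Insert a blank line before every heading that lacks one."""
    out = []
    for line in markdown.split("\n"):
        if HEADING.match(line) and out and out[-1].strip():
            out.append("")
        out.append(line)
    return "\n".join(out)


print(ensure_blank_line_before_headings("text\n# Heading 1\n## Heading 2"))
```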
Internal links that point at the local notebook server will not work in nbviewer or in LaTeX output, because the local target does not exist there.
compare_render(r"""
[Link2Heading](http://127.0.0.1:8888/0a2d8086-ee24-4e5b-a32b-f66b525836cb#General-markdown)
""")
\href{http://127.0.0.1:8888/0a2d8086-ee24-4e5b-a32b-f66b525836cb\#General-markdown}{Link2Heading}
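Links like the one above could be flagged before conversion. The following sketch is written for Python 3 (on Python 2 `urlparse` lives in the `urlparse` module); `find_local_links` is a hypothetical helper, and the regular expression only handles simple inline links without nested brackets or parentheses.

```python
import re
from urllib.parse import urlparse

# Simple inline Markdown links: [text](url)
LINK = re.compile(r"\[([^\]]*)\]\(([^)\s]+)\)")
LOCAL_HOSTS = {"127.0.0.1", "localhost"}


def find_local_links(markdown):
    """Return (text, url) pairs whose URL points at the local machine."""
    local = []
    for text, url in LINK.findall(markdown):
        if urlparse(url).hostname in LOCAL_HOSTS:
            local.append((text, url))
    return local


source = "[Link2Heading](http://127.0.0.1:8888/0a2d8086-ee24-4e5b-a32b-f66b525836cb#General-markdown)"
print(find_local_links(source))
```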
Basic Markdown bold and italic works.
compare_render(r"""
This is Markdown **bold** and *italic* text.
""")
This is Markdown \textbf{bold} and \emph{italic} text.
This is Markdown bold and italic text.
Nested lists work as well
compare_render(r"""
- li 1
- li 2
1. li 3
1. li 4
- li 5
""")
\begin{itemize} \itemsep1pt\parskip0pt\parsep0pt \item li 1 \item li 2 \begin{enumerate} \def\labelenumi{\arabic{enumi}.} \itemsep1pt\parskip0pt\parsep0pt \item li 3 \item li 4 \end{enumerate} \item li 5 \end{itemize}
Unicode support
compare_render(ur"""
überschuß +***^°³³ α β θ
""")
überschuß +\emph{*}\^{}°³³ α β θ
überschuß +*^°³³ α β θ
Pandoc may produce invalid LaTeX, e.g. \sout is not allowed in headings.
compare_render(r"""
# Heading 1 ~~strikeout~~
""")
\section{Heading 1 \sout{strikeout}}
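A possible workaround is to remove strikeout markup from heading lines before conversion, so that no \sout ends up inside a sectioning command. This is a sketch with a hypothetical helper name, `strip_strikeout_in_headings`; it keeps the struck-out words and only drops the `~~` markers.

```python
import re

STRIKEOUT = re.compile(r"~~(.+?)~~")


def strip_strikeout_in_headings(markdown):
    """Drop ~~strikeout~~ markers on heading lines, keeping the text."""
    lines = []
    for line in markdown.split("\n"):
        if line.lstrip().startswith("#"):
            line = STRIKEOUT.sub(lambda m: m.group(1), line)
        lines.append(line)
    return "\n".join(lines)


print(strip_strikeout_in_headings("# Heading 1 ~~strikeout~~"))
```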
Horizontal lines work just fine
compare_render(r"""
above
--------
below
""")
above \begin{center}\rule{3in}{0.4pt}\end{center} below
above
below
Pandoc renders its subscript and superscript extensions; the notebook does not (maybe we should deactivate this).
compare_render(r"""
This is Markdown ~subscript~ and ^superscript^ text.
""")
This is Markdown \textsubscript{subscript} and \textsuperscript{superscript} text.
This is Markdown subscript and superscript text.
An underscore without a preceding space is handled inconsistently (Pandoc extension: intraword_underscores; deactivate?).
compare_render(r"""
This is Markdown not_italic_.
""")
This is Markdown not\_italic\_.
This is Markdown not_italic_.
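Both remarks above suggest deactivating Pandoc extensions. Pandoc accepts a reader specification of the form `markdown-extension`, so a command line could be assembled as sketched below. `pandoc_command` is a hypothetical helper, and the sketch assumes a Pandoc version that supports this syntax and these extension names.

```python
def pandoc_command(to_format, disabled_extensions=()):
    """Build a pandoc argument list with some Markdown extensions disabled."""
    # Each disabled extension is appended to the reader name as "-name".
    source = "markdown" + "".join("-" + ext for ext in disabled_extensions)
    return ["pandoc", "--from", source, "--to", to_format]


print(pandoc_command("latex", ["subscript", "superscript", "intraword_underscores"]))
```

The resulting list could be passed to `subprocess` with the Markdown text on standard input; whether disabling these extensions actually brings the output closer to the notebook would still need to be checked against the examples above.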
Pandoc allows TeX macros to be defined, and they are respected in all output formats; the notebook does not support this.
compare_render(r"""
\newcommand{\tuple}[1]{\langle #1 \rangle}
$\tuple{a, b, c}$
""")
\newcommand{\tuple}[1]{\langle #1 \rangle} $\tuple{a, b, c}$
\(\langle a, b, c \rangle\)
When placing the \newcommand inside a math environment it works within the notebook and nbviewer, but produces invalid latex (the newcommand is only valid in the same math environment).
compare_render(r"""
$\newcommand{\foo}[1]{...:: #1 ::...}$
$\foo{bar}$
""")
$\newcommand{\foo}[1]{...:: #1 ::...}$ $\foo{bar}$
\(\newcommand{\foo}[1]{...:: #1 ::...}\) \(\foo{bar}\)
Raw HTML gets dropped entirely when converting to $\LaTeX$.
compare_render(r"""
This is HTML <b>bold</b> and <i>italic</i> text.
""")
This is HTML bold and italic text.
This is HTML bold and italic text.
The same applies to tags such as center.
compare_render(r"""
<center>Center aligned</center>
""")
Center aligned
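Since raw HTML is lost in the LaTeX output, simple inline tags could be rewritten as Markdown before conversion. The sketch below covers only un-nested, attribute-free `<b>`, `<strong>`, `<i>`, and `<em>` tags; `simple_html_to_markdown` is a hypothetical helper, and tags such as `<center>` have no Markdown equivalent and are left alone.

```python
import re

# HTML tag name -> Markdown emphasis marker
SIMPLE_TAGS = {"b": "**", "strong": "**", "i": "*", "em": "*"}


def simple_html_to_markdown(text):
    """Replace simple bold/italic HTML tags with Markdown emphasis."""
    for tag, marker in SIMPLE_TAGS.items():
        pattern = re.compile(r"<%s>(.*?)</%s>" % (tag, tag), re.IGNORECASE | re.DOTALL)
        text = pattern.sub(lambda m, marker=marker: marker + m.group(1) + marker, text)
    return text


print(simple_html_to_markdown("This is HTML <b>bold</b> and <i>italic</i> text."))
```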
Raw $\LaTeX$ gets dropped entirely when converted to HTML. (I don't know why the HTML output is cropped here???)
compare_render(r"""
This is \LaTeX \bf{bold} and \emph{italic} text.
""")
This is \LaTeX \bf{bold} and \emph{italic} text.
This is
A combination of raw $\LaTeX$ and raw HTML
compare_render(r"""
**foo** $\left( \sum_{k=1}^n a_k b_k \right)^2 \leq$ <b>b\$ar</b> $$test$$
\cite{}
""")
\textbf{foo} $\left( \sum_{k=1}^n a_k b_k \right)^2 \leq$ b\$ar \[test\] \cite{}
foo \(\left( \sum_{k=1}^n a_k b_k \right)^2 \leq\) b$ar \[test\]
HTML tables render in the notebook, but not in Pandoc.
compare_render(r"""
<table>
<tr>
<td>a</td>
<td>b</td>
</tr>
<tr>
<td>c</td>
<td>d</td>
</tr>
</table>
""")
a b c d
a | b |
c | d |
Instead, Pandoc supports simple ascii tables. Unfortunately marked.js doesn't support this, and therefore it is not supported in the notebook.
compare_render(r"""
+---+---+
| a | b |
+---+---+
| c | d |
+---+---+
""")
\begin{longtable}[c]{@{}ll@{}} \hline\noalign{\medskip} \begin{minipage}[t]{0.06\columnwidth}\raggedright a \end{minipage} & \begin{minipage}[t]{0.06\columnwidth}\raggedright b \end{minipage} \\\noalign{\medskip} \begin{minipage}[t]{0.06\columnwidth}\raggedright c \end{minipage} & \begin{minipage}[t]{0.06\columnwidth}\raggedright d \end{minipage} \\\noalign{\medskip} \hline \end{longtable}
a |
b |
c |
d |
Pipe tables are an alternative to basic ASCII tables. They are recognized by Pandoc and supported by marked, so they are the best way to add tables.
compare_render(r"""
|Left |Center |Right|
|:----|:-----:|----:|
|Text1|Text2 |Text3|
""")
\begin{longtable}[c]{@{}lcr@{}} \hline\noalign{\medskip} Left & Center & Right \\\noalign{\medskip} \hline\noalign{\medskip} Text1 & Text2 & Text3 \\\noalign{\medskip} \hline \end{longtable}
Left | Center | Right |
---|---|---|
Text1 | Text2 | Text3 |
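Because pipe tables are the common denominator, generating them programmatically can be convenient. The helper below is a hypothetical sketch (`pipe_table` is not part of IPython or Pandoc); it assumes cell values contain no `|` characters and does not pad columns.

```python
# Alignment name -> separator cell in the second row of a pipe table
ALIGNMENT_RULES = {"left": ":---", "center": ":---:", "right": "---:"}


def pipe_table(header, rows, alignments):
    """Build a Markdown pipe table from a header, rows, and column alignments."""
    def format_row(cells):
        return "|" + "|".join(str(cell) for cell in cells) + "|"

    lines = [format_row(header), format_row(ALIGNMENT_RULES[a] for a in alignments)]
    lines.extend(format_row(row) for row in rows)
    return "\n".join(lines)


print(pipe_table(["Left", "Center", "Right"],
                 [["Text1", "Text2", "Text3"]],
                 ["left", "center", "right"]))
```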
Pandoc recognizes cell alignment in simple tables. Since marked.js doesn't recognize ascii tables, it can't render this table.
compare_render(r"""
Right Aligned Center Aligned Left Aligned
------------- -------------- ------------
Why does this
actually work? Who
knows ...
""")
print("\n"*5)
\begin{longtable}[c]{@{}lll@{}} \hline\noalign{\medskip} Right Aligned & Center Aligned & Left Aligned \\\noalign{\medskip} \hline\noalign{\medskip} Why & does & this \\\noalign{\medskip} actually & work? & Who \\\noalign{\medskip} knows & \ldots{} & \\\noalign{\medskip} \hline \end{longtable}
Right Aligned | Center Aligned | Left Aligned |
---|---|---|
Why | does | this |
actually | work? | Who |
knows | ... |
Markdown images work on both. However, remote images are not allowed in $\LaTeX$. Maybe add a preprocessor to download these. The alternate text is displayed in nbviewer next to the image.
compare_render(r"""
![Alternate Text](http://ipython.org/_static/IPy_header.png)
""")
\begin{figure}[htbp] \centering \includegraphics{http://ipython.org/_static/IPy_header.png} \caption{Alternate Text} \end{figure}
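The preprocessor idea mentioned above could start from something like the following sketch, which rewrites remote image references to local paths and reports what would need to be downloaded. `localize_images` and the `images` directory are hypothetical, the code is written for Python 3, and it does not handle two URLs that share the same file name.

```python
import os
import re
from urllib.parse import urlparse

# Markdown images with an http(s) URL: ![alt](http://...)
IMAGE = re.compile(r"!\[([^\]]*)\]\((https?://[^)\s]+)\)")


def localize_images(markdown, directory="images"):
    """Point remote images at local files; return (markdown, {url: local_path})."""
    downloads = {}

    def replace(match):
        alt, url = match.group(1), match.group(2)
        filename = os.path.basename(urlparse(url).path) or "image"
        local_path = directory + "/" + filename
        downloads[url] = local_path
        return "![%s](%s)" % (alt, local_path)

    return IMAGE.sub(replace, markdown), downloads


text, downloads = localize_images("![Alternate Text](http://ipython.org/_static/IPy_header.png)")
print(text)
print(downloads)
```

Each entry in the returned mapping could then be fetched, for example with `urllib.request.urlretrieve(url, local_path)`, before running LaTeX.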
HTML Images only work in the notebook.
compare_render(r"""
<img src="http://ipython.org/_static/IPy_header.png">
""")
Simple display-style math works fine. Inline math is only recognized by Pandoc when there is no space directly inside the $ delimiters.
compare_render(r"""
My equation:
$$ 5/x=2y $$
It is inline $ 5/x=2y $ here.
""")
My equation: \[ 5/x=2y \] It is inline \$ 5/x=2y \$ here.
My equation: \[ 5/x=2y \]
It is inline $ 5/x=2y $ here.
If the first $ is on a new line, the equation is not captured by md2tex. If both $s are on a new line, md2html fails (note that the raw LaTeX is dropped), but the notebook renders it correctly.
compare_render(r"""
$5 \cdot x=2$
$
5 \cdot x=2$
$
5 \cdot x=2
$
""")
$5 \cdot x=2$ \$ 5 \cdot x=2\$ \$ 5 \cdot x=2 \$
\(5 \cdot x=2\)
$ 5 x=2$
$ 5 x=2 $
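Whitespace directly inside the $ delimiters is what trips up the converters in the examples above, so one option is to tighten inline math before conversion. The sketch below uses a hypothetical helper, `tighten_inline_math`, and assumes that every unescaped single $ in the text is a math delimiter; prose containing dollar amounts would be mangled.

```python
import re

# A single '$' (not escaped, not part of '$$'), the math body, and a closing single '$'.
INLINE_MATH = re.compile(r"(?<![\\$])\$(?!\$)\s*([^$]+?)\s*\$(?!\$)")


def tighten_inline_math(markdown):
    """Remove whitespace and newlines directly inside inline $...$ delimiters."""
    return INLINE_MATH.sub(lambda m: "$" + m.group(1) + "$", markdown)


print(tighten_inline_math("$\n5 \\cdot x=2\n$"))
print(tighten_inline_math("It is inline $ 5/x=2y $ here."))
```

Display math written with `$$ ... $$` is deliberately left untouched by the lookaround assertions.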
MathJax permits some $\LaTeX$ math constructs without $s; of course, this raw $\LaTeX$ is stripped when converting to HTML. Moreover, the & characters are escaped by the lxml parsing (#4251).
compare_render(r"""
\begin{align}
a & b\\
d & c
\end{align}
\begin{eqnarray}
a & b \\
c & d
\end{eqnarray}
""")
\begin{align} a & b\\ d & c \end{align} \begin{eqnarray} a & b \\ c & d \end{eqnarray}
There is another lxml issue, #4283
compare_render(r"""
1<2 is true, but 3>4 is false.
$1<2$ is true, but $3>4$ is false.
1<2 it is even worse if it is alone in a line.
""")
14 is false. $14$ is false. 1
1<2 is true, but 3>4 is false.
\(1<2\) is true, but \(3>4\) is false.
1<2 it is even worse if it is alone in a line.
compare_render(r"""
some source code
```
a = "test"
print(a)
```
""")
some source code \begin{verbatim} a = "test" print(a) \end{verbatim}
some source code
a = "test"
print(a)
Language specific syntax highlighting by Pandoc requires additional dependencies to render correctly.
compare_render(r"""
some source code
```python
a = "test"
print(a)
```
""")
some source code \begin{Shaded} \begin{Highlighting}[] \NormalTok{a = }\StringTok{"test"} \KeywordTok{print}\NormalTok{(a)} \end{Highlighting} \end{Shaded}
some source code
a = "test"
print(a)