nbio

Reading and writing Jupyter notebooks

Reading a notebook

A notebook is just a JSON file.

Exported source

from json import loads
from pathlib import Path
from fastcore.basics import AttrDict

def _read_json(self, encoding=None, errors=None):
    return loads(Path(self).read_text(encoding=encoding, errors=errors))

minimal_fn = Path('../tests/minimal.ipynb')
minimal_txt = AttrDict(_read_json(minimal_fn))

It contains two sections, the metadata…:

minimal_txt.metadata

{'kernelspec': {'display_name': 'Python 3 (ipykernel)',
  'language': 'python',
  'name': 'python3'}}

…and, more importantly, the cells:

minimal_txt.cells

[{'cell_type': 'markdown',
  'metadata': {},
  'source': ['## A minimal notebook']},
 {'cell_type': 'code',
  'execution_count': None,
  'metadata': {},
  'outputs': [{'data': {'text/plain': ['2']},
    'execution_count': None,
    'metadata': {},
    'output_type': 'execute_result'}],
  'source': ['# Do some arithmetic\n', '1+1']}]
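Because the file is plain JSON, this structure can be inspected with nothing but the standard library. The following is a minimal sketch and not part of execnb: NB_TEXT is an inline stand-in for a notebook file, and summarize_cells is a hypothetical helper.

```python
import json

# Inline stand-in for a notebook file (assumption: loosely mirrors the minimal notebook)
NB_TEXT = '''
{
 "cells": [
  {"cell_type": "markdown", "metadata": {}, "source": ["## A minimal notebook"]},
  {"cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [],
   "source": ["# Do some arithmetic\\n", "1+1"]}
 ],
 "metadata": {},
 "nbformat": 4,
 "nbformat_minor": 5
}
'''

def summarize_cells(nb):
    "Return a (cell_type, first source line) pair for each cell in the notebook dict"
    return [(c["cell_type"], c["source"][0].rstrip("\n")) for c in nb["cells"]]

nb = json.loads(NB_TEXT)
top_level_keys = sorted(nb)
summary = summarize_cells(nb)
print(top_level_keys)
print(summary)
```

Note that source is stored as a list of lines, each keeping its trailing newline except possibly the last.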

The second cell here is a code cell. Its execution_count is None, although the file does store an output for it. To execute a notebook, we first need to convert it into a format suitable for nbclient, which expects some dict keys to be available as attrs and others as regular dict keys. Normally nbformat is used for this step, but it’s rather slow and inflexible, so we’ll write our own function based on fastcore’s handy dict2obj, which makes all keys available as both attrs and keys.
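To make the idea of "keys available as both attrs and keys" concrete, here is a small standard-library sketch. AttrDictSketch and to_attr are hypothetical names used for illustration only; they are not fastcore's AttrDict or dict2obj, whose implementations may differ.

```python
class AttrDictSketch(dict):
    "Hypothetical stand-in: a dict whose keys can also be read and set as attributes"
    def __getattr__(self, k):
        # Only called when normal attribute lookup fails, so dict methods still work
        try: return self[k]
        except KeyError: raise AttributeError(k) from None
    def __setattr__(self, k, v): self[k] = v

def to_attr(o):
    "Recursively convert nested dicts (including those inside lists) to AttrDictSketch"
    if isinstance(o, dict): return AttrDictSketch({k: to_attr(v) for k, v in o.items()})
    if isinstance(o, list): return [to_attr(v) for v in o]
    return o

cell = to_attr({"cell_type": "code", "metadata": {"tags": ["demo"]}, "source": "1+1"})
cell.idx_ = 1  # attribute assignment is stored as a regular key
print(cell.cell_type, cell["cell_type"], cell.metadata.tags, cell["idx_"])
```

Either access style reaches the same underlying data, which is why one object can satisfy code that uses attrs and code that uses keys.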

NbCell

 NbCell (idx, cell)

dict subclass that also provides access to keys as attrs

We use an AttrDict subclass which has some basic functionality for accessing notebook cells.

dict2nb

 dict2nb (js=None, **kwargs)

Convert dict js to an AttrDict.

We can now convert our JSON into this nbclient-compatible format, which pretty prints the source code of cells in notebooks.

minimal = dict2nb(minimal_txt)
cell = minimal.cells[1]
cell

{ 'cell_type': 'code',
  'execution_count': None,
  'idx_': 1,
  'metadata': {},
  'outputs': [ { 'data': {'text/plain': ['2']},
                 'execution_count': None,
                 'metadata': {},
                 'output_type': 'execute_result'}],
  'source': '# Do some arithmetic\n1+1'}
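Comparing this output with the raw JSON, source has changed from a list of lines to a single string. A sketch of that conversion and its inverse, assuming plain str.join and str.splitlines; the helper names are hypothetical, and execnb's own code may differ:

```python
def join_source(lines):
    "Join the list-of-lines form stored in the JSON into one string"
    return "".join(lines)

def split_source(text):
    "Split a source string back into lines, keeping the line endings"
    return text.splitlines(keepends=True)

raw = ["# Do some arithmetic\n", "1+1"]
joined = join_source(raw)
restored = split_source(joined)
print(repr(joined))
```

Keeping the line endings when splitting is what makes the two forms interchangeable without losing information.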

The abstract syntax tree of a code cell’s source is available by calling the parsed_ method:

cell.parsed_(), cell.parsed_()[0].value.op

([<ast.Expr>], <ast.Add>)
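The same kind of tree can be produced directly with Python's ast module. This sketch parses the cell's source string on its own and makes no claim about how parsed_ is implemented:

```python
import ast

src = "# Do some arithmetic\n1+1"
tree = ast.parse(src)   # the comment is not part of the tree; only the expression remains
stmt = tree.body[0]     # an ast.Expr wrapping the BinOp for 1+1

# Evaluate just that expression by compiling it in "eval" mode
result = eval(compile(ast.Expression(stmt.value), "<cell>", "eval"))
print(type(stmt).__name__, type(stmt.value.op).__name__, result)
```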

read_nb

 read_nb (path)

Return notebook at path

This reads the JSON for the file at path and converts it with dict2nb. For instance:

minimal = read_nb(minimal_fn)
str(minimal.cells[0])

"{'cell_type': 'markdown', 'metadata': {}, 'source': '## A minimal notebook', 'idx_': 0}"

The file name read is stored in path_:

minimal.path_

'../tests/minimal.ipynb'

Creating a notebook

new_nb

 new_nb (cells=None, meta=None, nbformat=4, nbformat_minor=5)

Returns an empty new notebook

Use this function to create a new notebook in memory, so you don’t have to create one on disk first and then read it.
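For orientation, the top-level structure of a notebook built in memory can be sketched as a plain dict. empty_notebook is a hypothetical helper that borrows the parameter names from the signature above; it returns an ordinary dict rather than the object new_nb returns.

```python
def empty_notebook(cells=None, meta=None, nbformat=4, nbformat_minor=5):
    "Hypothetical helper: build the top-level dict of a notebook without touching disk"
    return {"cells": list(cells or []), "metadata": dict(meta or {}),
            "nbformat": nbformat, "nbformat_minor": nbformat_minor}

blank = empty_notebook()
with_cell = empty_notebook([{"cell_type": "markdown", "metadata": {}, "source": ["# Title"]}])
print(blank)
```

Copying cells and meta into fresh containers avoids sharing mutable defaults between notebooks.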

mk_cell

 mk_cell (text, cell_type='code', **kwargs)

Create an NbCell containing text

Parameter	Type	Default	Details
text			`source` attr in cell
cell_type	str	code	`cell_type` attr in cell
kwargs

mk_cell('print(1)', execution_count=0)

{ 'cell_type': 'code',
  'directives_': {},
  'execution_count': 0,
  'idx_': 0,
  'metadata': {},
  'source': 'print(1)'}

Writing a notebook

nb2dict

 nb2dict (d, k=None)

Convert parsed notebook to dict

This returns the exact same dict as is read from the notebook JSON.

minimal_fn = Path('../tests/minimal.ipynb')
minimal = read_nb(minimal_fn)

minimal_dict = _read_json(minimal_fn)
assert minimal_dict==nb2dict(minimal)
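The parsed cells shown earlier carry extra keys such as idx_ and hold source as a single string, while the JSON has neither. One way to undo both is sketched below with plain dicts, under the assumption that every helper key ends in an underscore. strip_parsed is a hypothetical name, and this is not presented as execnb's code.

```python
def strip_parsed(d, key=None):
    "Recursively drop keys ending in '_' and split `source` strings back into lines"
    if key == "source" and isinstance(d, str): return d.splitlines(keepends=True)
    if isinstance(d, list): return [strip_parsed(o) for o in d]
    if not isinstance(d, dict): return d
    return {k: strip_parsed(v, k) for k, v in d.items() if not k.endswith("_")}

parsed = {"cells": [{"cell_type": "code", "idx_": 1, "metadata": {},
                     "source": "# Do some arithmetic\n1+1"}],
          "metadata": {}, "path_": "minimal.ipynb"}
plain = strip_parsed(parsed)
print(plain)
```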

nb2str

 nb2str (nb)

Convert nb to a str

To save a notebook we first need to convert it to a str:

print(nb2str(minimal)[:45])

{
 "cells": [
  {
   "cell_type": "markdown",

write_nb

 write_nb (nb, path)

Write nb to path

This writes the exact same string as saved by Jupyter.

tmp = Path('tmp.ipynb')
try:
    minimal_txt = minimal_fn.read_text()
    write_nb(minimal, tmp)
    assert minimal_txt==tmp.read_text()
finally: tmp.unlink()

Here’s how to put all the pieces of execnb.nbio together:

nb = new_nb([mk_cell('print(1)')])
path = Path('test.ipynb')
write_nb(nb, path)
nb2 = read_nb(path)
print(nb2.cells)
path.unlink()

[{'cell_type': 'code', 'metadata': {}, 'source': 'print(1)', 'idx_': 0}]
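When experimenting with a write-then-read round trip, a temporary directory guarantees cleanup even if the read step raises. The sketch below uses only the standard library and plain dicts in place of new_nb, write_nb, and read_nb, so it illustrates the workflow rather than the execnb API.

```python
import json, tempfile
from pathlib import Path

nb = {"cells": [{"cell_type": "code", "execution_count": None, "metadata": {},
                 "outputs": [], "source": ["print(1)"]}],
      "metadata": {}, "nbformat": 4, "nbformat_minor": 5}

with tempfile.TemporaryDirectory() as d:
    path = Path(d)/"test.ipynb"
    # Write the notebook as JSON text, then read it straight back
    path.write_text(json.dumps(nb, indent=1) + "\n", encoding="utf-8")
    nb2 = json.loads(path.read_text(encoding="utf-8"))
    existed = path.exists()

print(nb2 == nb)
```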