A simple wrapper over `pydot` to make it more consistent, unsurprising, and pythonic

Acknowledgement: fastdot is heavily influenced by work from David Page, who built a system for drawing graphs based on a highly flexible data structure he designed.

Install

We suggest installing with conda: `conda install -c fastai fastdot`. You can alternatively install with pip: `pip install fastdot`; however, if you use this approach, you'll also need to install graphviz yourself (e.g. using apt, brew, etc).
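
If you go the pip route, it can help to confirm that Graphviz is actually reachable before drawing anything. The following is a minimal sketch using only the standard library; it assumes the Graphviz executable is called `dot` and is on your `PATH`, and the helper name `graphviz_available` is made up for this example (it is not part of fastdot).

```python
import shutil


def graphviz_available(executable: str = "dot") -> bool:
    """Return True if the named executable can be found on PATH."""
    return shutil.which(executable) is not None


if graphviz_available():
    print("Graphviz found at:", shutil.which("dot"))
else:
    print("Graphviz not found; install it with your system package manager.")
```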

Synopsis

Start with some data representing objects and the connections between them. This example uses plain strings, but in practice the objects would typically be neural net layers, users and products, car trips, and so on:

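# Assumed import path for the names used below (graph_items, seq_cluster, ...); adjust to your install
from fastdot.core import *
layers1 = ['conv','conv','lin']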
layers2 = ['conv','lin']
block1,block2 = ['block1','block2']
conns = ((block1, block2),
         (block1, layers2[-1]))

Then map them directly to a visual representation:

g = graph_items(seq_cluster(layers1, block1),
                seq_cluster(layers2, block2))
g.add_items(*object_connections(conns))
g
[Rendered graph: cluster block1 contains conv -> conv -> lin; cluster block2 contains conv -> lin; edges run from block1's lin to block2's conv and to block2's lin.]

See the symbolic graphs and object graphs sections below for a more complete example.
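
Judging from the rendered output, the items passed to `seq_cluster` end up chained in order. As a rough illustration of that chaining idea only, here is a small standard-library sketch; `sequential_edges` is a hypothetical helper written for this example, not a fastdot function, and it says nothing about how fastdot builds its edges internally.

```python
def sequential_edges(items):
    """Pair each item with its successor: [a, b, c] -> [(a, b), (b, c)]."""
    return list(zip(items, items[1:]))


layers1 = ['conv', 'conv', 'lin']
print(sequential_edges(layers1))  # [('conv', 'conv'), ('conv', 'lin')]
```

A list with fewer than two items simply yields no edges.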

fastdot overview

fastdot is a thin wrapper over the excellent pydot library (which is in turn a thin wrapper over the absolutely wonderful Graphviz software), designed to make it more consistent, unsurprising, and pythonic. An example of removing surprise: `pydot.Node('node')` gives an obscure compilation exception, since `node` is a keyword in the underlying graphviz program, whereas `fastdot.Node('node')` works just fine, due to auto-quoting. In fact, you never need to provide names in fastdot; you can create edges directly between objects.
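
To make the keyword problem concrete, here is a minimal sketch of one way a wrapper could sidestep it: give every node a generated name and put the user's text in a quoted label. This is an illustration under stated assumptions, not fastdot's actual code; the helpers `uniq_name`, `quote`, and `node_statement` are hypothetical, and the escaping handles only embedded double quotes.

```python
from uuid import uuid4


def uniq_name() -> str:
    """Return a generated identifier: the letter 'n' followed by 32 hex digits."""
    return "n" + uuid4().hex


def quote(label: str) -> str:
    """Wrap a label in double quotes, escaping any embedded double quotes."""
    return '"' + label.replace('"', '\\"') + '"'


def node_statement(label: str) -> str:
    """Build a DOT node statement with a generated name and a quoted label."""
    return f"{uniq_name()} [label={quote(label)}];"


# 'node' is a DOT keyword, but it is harmless as a quoted label.
print(node_statement("node"))
```

Because the node name is generated, two nodes can share a label without colliding, and the label itself never has to be a valid DOT identifier.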

Here's a quick example of some of the main functionality:

g = Dot()
c = Cluster('cl', fillcolor='pink')
a1,a2,b = c.add_items('a', 'a', 'b')
c.add_items(a1.connect(a2), a2.connect(b))
g.add_item(Node('Check tooltip', tooltip="I have a tooltip!"))
g.add_item(c)
g
[Rendered graph: a standalone node labelled "Check tooltip", and a cluster cl containing a -> a -> b.]

As you can see, graphs know how to show themselves in Jupyter notebooks directly and can be exported to HTML (SVG is used behind the scenes). Tooltips appear in both notebooks and exported HTML pages. Nodes with the same label are, by default, set to the same color. Also, as shown above, you can just use `add_item` or `add_items`, regardless of the type of item.
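
To illustrate the idea that nodes sharing a label share a color, here is a minimal sketch that derives a color from the label text. It is an illustration only, not fastdot's implementation: the `PALETTE` list and the `color_for` helper are hypothetical, and fastdot's actual color choice may work quite differently.

```python
import hashlib

# Hypothetical palette of Graphviz (X11) color names, chosen for this example.
PALETTE = ["lightblue", "pink", "greenyellow", "khaki", "plum", "lightsalmon"]


def color_for(label: str) -> str:
    """Map a label to a palette entry; equal labels always get the same color."""
    digest = hashlib.sha256(label.encode("utf-8")).digest()
    return PALETTE[digest[0] % len(PALETTE)]


for label in ["a", "a", "b"]:
    print(label, color_for(label))
```

With a small palette, different labels can land on the same color; the only guarantee in this sketch is that identical labels match.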

Symbolic graphs

fastdot is particularly designed to make it easier to create graphs symbolically - for instance, for Python dictionaries, PyTorch/TensorFlow models, and so forth. Here's a simple example with some mock neural network layers and sequential models. First, let's define our mock classes:

from dataclasses import dataclass
@dataclass(frozen=True)
class Layer: name:str; n_filters:int=1
class Linear(Layer): pass
class Conv2d(Layer): pass

@dataclass(frozen=True)
class Sequential: layers:list; name:str

Here are the sequential blocks for our "model":

block1 = Sequential([Conv2d('conv', 5), Linear('lin', 3)], 'block1')
block2 = Sequential([Conv2d('conv1', 8), Conv2d('conv2', 2), Linear('lin')], 'block2')

fastdot can create all node properties directly from objects; you just have to define functions describing how to map the object's attributes to graph properties. These mappings go in the `node_defaults` and `cluster_defaults` dictionaries. By default, labels are set using `str()`; here we override the label for both nodes and clusters to use each object's `name` attribute:

from operator import attrgetter
node_defaults['fillcolor'] = lambda o: 'greenyellow' if isinstance(o,Linear) else 'pink'
cluster_defaults['label'] = node_defaults['label'] = attrgetter('name')
node_defaults['tooltip'] = str
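
To show how a dictionary of functions can turn an object into concrete graph attributes, here is a minimal standard-library sketch. It is an assumption about the general pattern, not a description of fastdot's internals: `resolve_attrs` is a hypothetical helper, the local `defaults` dictionary stands in for `node_defaults`, and the `style` and `shape` entries are added purely for illustration.

```python
from dataclasses import dataclass
from operator import attrgetter


@dataclass(frozen=True)
class Layer:
    name: str
    n_filters: int = 1


class Linear(Layer): pass
class Conv2d(Layer): pass


def resolve_attrs(obj, defaults, **overrides):
    """Apply callable defaults to obj, keep plain values, then apply overrides."""
    attrs = {k: (v(obj) if callable(v) else v) for k, v in defaults.items()}
    attrs.update(overrides)
    return attrs


defaults = {
    'fillcolor': lambda o: 'greenyellow' if isinstance(o, Linear) else 'pink',
    'label': attrgetter('name'),
    'style': 'filled',  # a plain (non-callable) value is passed through unchanged
}

print(resolve_attrs(Linear('lin', 3), defaults))
# {'fillcolor': 'greenyellow', 'label': 'lin', 'style': 'filled'}
print(resolve_attrs(Conv2d('conv', 5), defaults, shape='box'))
# {'fillcolor': 'pink', 'label': 'conv', 'style': 'filled', 'shape': 'box'}
```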

With that in place, we can directly create nodes from our objects, for instance using the convenient seq_cluster function:

c1 = seq_cluster(block1.layers, block1)
c2 = seq_cluster(block2.layers, block2)
e1,e2 = c1.connect(c2),c1.connect(c2.last())
graph_items(c1,c2,e1,e2)
[Rendered graph: cluster block1 contains conv -> lin; cluster block2 contains conv1 -> conv2 -> lin; edges run from block1's lin to block2's conv1 and to block2's lin.]

Note that in this example we didn't even need to create the Dot object separately - graph_items creates it directly from the graph items provided.

Using object graphs

In the above example, we defined our edges directly between fastdot objects. In practice, however, you'll most likely have your edges defined directly between Python objects, for instance like this:

conns = (
    (block1, block2),
    (block1, block2.layers[-1]),
)

In this case, you'll want some way to connect your Python objects to the fastdot graph items that represent them. A mapping is stored automatically by fastdot, and is made available through the `object2graph` function:

g = graph_items(seq_cluster(block1.layers, block1), seq_cluster(block2.layers, block2))
object2graph(block1.layers[-1])
<pydot.Node at 0x7fa459e92690>
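
The general idea of an object-to-item mapping can be sketched with a plain dictionary. The version below is a simplified illustration, not fastdot's implementation: `GraphItem`, `make_item`, and `lookup` are hypothetical names, and keying the registry by `id()` is an assumption made here so that unhashable objects such as lists can be registered too.

```python
class GraphItem:
    """Stand-in for a graph node; it only remembers its label."""
    def __init__(self, label):
        self.label = label

    def __repr__(self):
        return f"GraphItem({self.label!r})"


# Maps id(obj) -> (obj, item). Storing obj keeps it alive so its id is not reused.
_registry = {}


def make_item(obj):
    """Create a graph item for obj and remember which object it represents."""
    item = GraphItem(str(obj))
    _registry[id(obj)] = (obj, item)
    return item


def lookup(obj):
    """Return the graph item previously created for this exact object."""
    return _registry[id(obj)][1]


layer = ['conv', 5]          # a list is unhashable, but id() still works
item = make_item(layer)
print(lookup(layer) is item)  # True
```

Looking up an object that was never registered raises `KeyError` in this sketch.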

You can use this to graph your connections without needing access to the graph items:

g.add_items(*[object2graph(a).connect(object2graph(b))
              for a,b in conns])
g
[Rendered graph: cluster block1 contains conv -> lin; cluster block2 contains conv1 -> conv2 -> lin; edges run from block1's lin to block2's conv1 and to block2's lin.]

There's a helper function, object_connections, which creates these connections for you. So the above can be simplified to:

g = graph_items(seq_cluster(block1.layers, block1), seq_cluster(block2.layers, block2))
g.add_items(*object_connections(conns))
g
[Rendered graph: identical to the previous one, with cluster block1 containing conv -> lin, cluster block2 containing conv1 -> conv2 -> lin, and edges from block1's lin to block2's conv1 and to block2's lin.]