Jupyter books in Azure Data Studio

Jupyter Book: A collection of SQL, PowerShell, and Python-based notebooks and markdown files.

In a recent Channel9 segment, Vicky Harp gave a brief glimpse of the Jupyter Book support that was announced in the November 2019 release of Azure Data Studio (ADS). Jupyter books add logical organization to physical collections of SQL, PowerShell, and Python-based Jupyter notebooks and markdown (text) files. Azure Data Studio can also search through a book, and the notebooks in a book can refer to each other using relative links.

Essentially, the Jupyter Book support allows for creating a Table of Contents for a bunch of individual ADS notebooks. Each of the notebooks becomes a page or chapter in the overall book.   

After watching the Channel9 segment, my question was: 
How can I create a Jupyter book for my project & work-related notebooks?

The first thing to know is that Jupyter Books and “Jupyter Book support” (in Azure Data Studio) are slightly different concepts. Jupyter Books let you build web-based collections of Jupyter notebooks. Jupyter Book support lets you build collections of Jupyter notebooks on your local computer or network (i.e., not web-based). Additionally, some of the standards and functionality of the online Jupyter Books may not be fully supported or implemented in Azure Data Studio.

The basic anatomy of a Jupyter Book in Azure Data Studio

The physical folder & file structure (shown below left) consists of:

  • the _config.yml file that’s in the book’s root folder.
  • a content folder that contains all of the SQL & PowerShell notebooks, markdown files, images, etc. (This folder can contain subfolders.)
  • a _data folder that contains the toc.yml file.

The toc.yml file (above center) controls the logical appearance of the Jupyter book in the ADS sidebar (above right). It’s essentially the book’s primary configuration file and is written in YAML (a human-readable data-serialization language).

The _config.yml file contains the book’s title and a basic description. 
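The layout described above can also be created with a short script. The following is a minimal sketch, not an official tool: the function name scaffold_book is hypothetical, the title and description key names in _config.yml are an assumption based on the description above, and the starter readme entry assumes that a url of readme resolves to content/readme.md.

```python
import json
from pathlib import Path


def scaffold_book(root: Path, title: str, description: str) -> Path:
    """Create the folder layout described above under `root` (hypothetical helper)."""
    content = root / "content"   # notebooks, markdown files, images, etc.
    data = root / "_data"        # holds toc.yml
    content.mkdir(parents=True, exist_ok=True)
    data.mkdir(parents=True, exist_ok=True)

    # _config.yml sits in the book's root folder.
    # ASSUMPTION: the keys are named `title` and `description`.
    # json.dumps produces a double-quoted string, which is also valid YAML.
    config_text = (
        f"title: {json.dumps(title)}\n"
        f"description: {json.dumps(description)}\n"
    )
    (root / "_config.yml").write_text(config_text, encoding="utf-8")

    # Starter page so the book has something to show.
    # ASSUMPTION: `url: readme` resolves to content/readme.md.
    readme = content / "readme.md"
    if not readme.exists():
        readme.write_text(f"# {title}\n\n{description}\n", encoding="utf-8")

    # toc.yml lives in the _data folder; do not overwrite an existing one.
    toc = data / "toc.yml"
    if not toc.exists():
        toc.write_text("- title: About this book\n  url: readme\n", encoding="utf-8")

    return root
```

For example, scaffold_book(Path("MyBook"), "My Book", "Notebooks for my project") would create MyBook/_config.yml, MyBook/content/readme.md, and MyBook/_data/toc.yml, leaving any existing toc.yml untouched.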

Creating a toc.yml file

The toc.yml file can be created using a text editor or in Azure Data Studio (choose YAML from the Save As menu). The contents of the file will vary between books. However, here are some syntax rules to keep in mind:

  • YAML structure is shown through indentation (one or more spaces, never tabs). In most cases, a new line indicates the end of a field.
  • Comments start with a hash symbol.
    • # This line is a comment
  • Sequence items are denoted by a dash (-).
    • - Item 1
      - Item 2
  • Key-value pairs are separated by a colon.
    • key name : key value
    • title : The title of a notebook
  • Arrays are signified by a colon followed by a sequence of items.
    • sections :
      - Chapter 1
      - Chapter 2
  • Each notebook or markdown file is designated with a title and a URL (path). The URL contains the relative path to the file, using forward slashes (/Section1/PS1), and must not include a filename extension. Using a filename extension in the URL will result in a Missing file error when the book is opened in ADS.
    • - title: PowerShell Notebook
      url: /Section1/PS1

Example toc.yml
 - title: About this book
   url: readme
   search: true
   not_numbered: true
   expand_sections: true
 - title: Table of Contents
   url: toc
   not_numbered: true
   expand_sections: true
   sections:
      - title: Section 1 (markdown file)
        url: /Section1/MD1
        not_numbered: true
        expand_sections: false
        sections:
           - title: PowerShell Notebook
             url: /Section1/PS1
      - title: Section 2 (SQL notebook)
        url: /Section2/SQL2
        not_numbered: true
        expand_sections: true
        sections:
           - title: Chapter 2A (SQL notebook)
             url: /Section2/SQL2A
             not_numbered: true
             expand_sections: true
             sections:
                - title: Subchapter 1 (SQL notebook)
                  url: /Section2/SQL2A-1
                - title: Subchapter 2 (SQL notebook)
                  url: /Section2/SQL2A-2
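
Because a filename extension in a url causes a Missing file error, it can help to check a toc.yml before opening the book. The following is a rough sketch that scans the file line by line rather than parsing the YAML. The function name check_toc is hypothetical, and the sketch assumes that urls resolve relative to the content folder and that pages are .ipynb or .md files.

```python
import re
from pathlib import Path
from typing import List

# Matches lines such as "url: /Section1/PS1" or "- url: readme".
URL_LINE = re.compile(r"^\s*(?:-\s+)?url\s*:\s*(.+?)\s*$")

# ASSUMPTION: book pages are notebooks or markdown files.
EXTENSIONS = (".ipynb", ".md")


def check_toc(toc_text: str, content_dir: Path) -> List[str]:
    """Return a list of human-readable problems found in the toc text."""
    problems: List[str] = []
    for number, line in enumerate(toc_text.splitlines(), start=1):
        match = URL_LINE.match(line)
        if not match:
            continue  # not a url line
        url = match.group(1).strip("'\"")

        # Rule from the text: urls must not carry a filename extension.
        if url.lower().endswith(EXTENSIONS):
            problems.append(f"line {number}: url '{url}' has a filename extension")
            continue

        # ASSUMPTION: the url is relative to the content folder.
        base = content_dir / url.lstrip("/")
        candidates = [base.with_name(base.name + ext) for ext in EXTENSIONS]
        if not any(candidate.is_file() for candidate in candidates):
            problems.append(f"line {number}: no notebook or markdown file found for '{url}'")
    return problems
```

Running check_toc(Path("_data/toc.yml").read_text(), Path("content")) from the book's root folder returns an empty list when every url is extension-free and points at an existing file.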

Opening a Jupyter Book

Once the toc.yml and _config.yml files have been created, and the notebooks organized in the content folder, the Jupyter Book can be opened in ADS. To open the book:

  1. Click on the Jupyter Book icon in the Side Bar.
  2. Click on the drop-down arrow beside the Saved Books list, if it is not already open.
  3. Click on the Open Jupyter Book icon to the right of the Saved Books list.
  4. In the Select Folder window that opens, browse to (and select) the root folder of the Jupyter Book (where the _config.yml file is).
  5. The book’s structure (toc.yml) will now appear under the Saved Books list.

The new Create Book wizard

The March 2020 update to Azure Data Studio added a Create Book option that assists with consolidating multiple notebooks into a single Jupyter book.

To open the wizard, click on the Jupyter Books icon in the left-side menu. Then click on the ellipsis (…) to the right of the Saved Books list. The Create Book (Preview) option should then appear.

Clicking on the Create Book (Preview) button will open a new notebook. The first step will use pip to install the jupyter-book module.

Step 2 will prompt for where to create the new Jupyter book, followed by the path to the existing notebooks.

When the initial book has been created, the structure contains a number of files and subfolders. This structure appears to be the same as the one used in the Jupyter Book online project and may be an indication of where ADS Jupyter Book support is headed.

Executing the next script block will clean up the extraneous files & folders.

The third step will generate a link that will open the new book in Azure Data Studio.

Unfortunately, this doesn’t always work as expected…

So, while the addition of an automated process to create a new Jupyter Book is an exciting new feature, it’s definitely still a Preview (in-development) option.
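
In the meantime, a flat toc.yml can be generated from an existing content folder with a few lines of Python. This is only a sketch under stated assumptions and not what the wizard does internally: the function name build_toc is hypothetical, titles are taken from the filenames, and the url style (no leading slash for files directly in the content folder, a leading slash for files in subfolders) simply mirrors the example toc.yml earlier in this post.

```python
import json
from pathlib import Path
from typing import List

# ASSUMPTION: book pages are notebooks or markdown files.
PAGE_SUFFIXES = (".ipynb", ".md")


def build_toc(content_dir: Path) -> str:
    """Build a flat toc.yml listing every notebook and markdown file."""
    pages = sorted(
        path for path in content_dir.rglob("*")
        if path.is_file() and path.suffix.lower() in PAGE_SUFFIXES
    )
    lines: List[str] = []
    for path in pages:
        # Drop the extension: urls in toc.yml must not include it.
        relative = path.relative_to(content_dir).with_suffix("")
        url = relative.as_posix()
        if len(relative.parts) > 1:
            url = "/" + url  # mirror the example: subfolder urls start with a slash
        # The filename (without extension) doubles as a placeholder title.
        lines.append(f"- title: {json.dumps(path.stem)}")
        lines.append(f"  url: {url}")
    return "\n".join(lines) + "\n"
```

The returned text can be written to _data/toc.yml and then edited by hand to add nicer titles, sections, and options such as not_numbered.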
