Arc Jupyter
Arc Jupyter is a custom kernel for Jupyter Notebooks (via the plugin API) which allows users to build Arc jobs interactively. This page documents some of the functionality of the plugin. Because Spark requires a Java Virtual Machine, which takes time to start, some of these features can take a while to become active.
Magics
Arc Jupyter provides some magic commands to make developing notebooks easier.
Magic | Description |
---|---|
%conf | Allows setting Arc Jupyter configuration variables. |
%configexecute | Shorthand for executing a ConfigExecute stage. Expects a single line configuration then a SQL statement. This can be used to generate runtime variables from data. |
%env | Allows setting Arc Jupyter environment variables as key/value pairs. E.g. ETL_CONF_BASE_DIR=/home/jovyan/tutorial to set the ETL_CONF_BASE_DIR variable. |
%list | Allows listing files in a directory. Expects the directory to be passed on the second line. |
%log | Shorthand for executing a LogExecute stage. Expects a single line configuration then a SQL statement. |
%metadata | Displays an Arc metadata dataset for the input view. |
%metadatafilter | Shorthand for executing a MetadataFilterTransform stage. Expects a single line configuration then a SQL statement. |
%metadatavalidate | Shorthand for executing a MetadataValidate stage. Expects a single line configuration then a SQL statement. |
%schema | Displays a JSON formatted schema for the input view. |
%secret | Allows entering runtime secrets which are not saved. |
%sql | Shorthand for executing a SQLTransform stage. Expects a single line configuration then a SQL statement. As views are registered (by stages like SQLTransform) they will be added to this list to rapidly generate select statements for all columns (including nested values). |
%sqlvalidate | Shorthand for executing a SQLValidate stage. Expects a single line configuration then a SQL statement. |
%version | Prints Arc Jupyter version information. |
Example
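Several of the magics above expect "a single line configuration then a SQL statement". The Python sketch below only illustrates that cell layout; it is not how Arc Jupyter parses cells. The helper `parse_magic_cell`, the configuration keys `name` and `outputView`, and the `orders` view are assumptions made for illustration.

```python
import shlex


def parse_magic_cell(cell: str):
    """Split a notebook cell into (magic, config, body).

    Illustrative only: this hypothetical helper is NOT Arc Jupyter's parser.
    """
    # The first line holds the magic and its configuration; the rest is the body.
    first_line, _, body = cell.partition("\n")
    tokens = shlex.split(first_line)  # honours quoted values such as name="a b"
    if not tokens or not tokens[0].startswith("%"):
        raise ValueError("cell does not start with a magic command")
    magic = tokens[0][1:]
    config = {}
    for token in tokens[1:]:
        key, _, value = token.partition("=")
        config[key] = value
    return magic, config, body.strip()


# A hypothetical %sql cell: one configuration line, then the SQL statement.
# The keys `name` and `outputView` are assumed, not confirmed parameter names.
cell = '''%sql name="calculate totals" outputView=totals
SELECT customer_id, SUM(amount) AS total
FROM orders
GROUP BY customer_id'''

magic, config, body = parse_magic_cell(cell)
print(magic)
print(config)
print(body)
```

The same two-part shape (configuration line first, statement second) applies to the other shorthand magics in the table, such as %sqlvalidate and %log.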
Completer
Arc Jupyter provides Completer functionality to help rapidly develop jobs. This functionality relies on the Spark Kernel having started, so it will only work after at least one stage has executed.
To use it, start typing some letters and press the tab key to invoke the Completer functionality. The arrow keys and the enter key can be used to select an item. Additionally, the link icon on the right of the box can be clicked to open the external documentation.
This functionality is part of the Arc plugin API, so it is added automatically to any custom extension you develop that implements the JupyterCompleter trait.