PixieDust_Optimus
PixieDust is a productivity tool for Python or Scala notebooks that lets developers encapsulate business logic into something easy for their customers to consume. This version is for the Optimus fork.
What is PixieDust?
PixieDust is an open source helper library that works as an add-on to Jupyter notebooks to improve the user experience of working with data. It also fills a gap for users who have no access to configuration files when a notebook is hosted on the cloud.
Use in Python or Scala
PixieDust greatly simplifies working with Python display libraries like matplotlib, but works just as effectively in Scala notebooks too. You no longer have to compromise your love of Scala to generate great charts. PixieDust lets you bring robust Python visualization options to your Scala notebooks. Installer and instructions to use Scala with PixieDust are coming soon...
Features
PixieDust's current capabilities include:
-
packageManager lets you install Spark packages inside a Python notebook. This is something that you can't do today on hosted Jupyter notebooks, which prevents developers from using a large number of Spark package add-ons.
-
Visualizations. One single API called display() lets you visualize your Spark object in different ways: table, charts, maps, etc. This module is designed to be extensible, providing an API that lets anyone easily contribute a new visualization plugin. This sample visualization plugin uses d3 to show the different flight routes for each airport:
-
Embedded apps. Let nonprogrammers actively use notebooks. Transform a hard-to-read notebook into a polished graphic app for business users. Check out these preliminary sample apps:
- An app can feature embedded forms and responses, like flightpredict, which lets users enter flight details to see the likelihood of landing on time.
- Or present a sophisticated workflow, like our twitter demo, which delivers a real-time feed of tweets, trending hashtags, and aggregated sentiment charts with Watson Tone Analyzer.
-
Extensibility. Create your own visualizations or apps using the PixieDust extensibility APIs. If you know HTML and CSS, you can write and deliver amazing graphics without forcing notebook users to type one line of code. Use the shape of the data to control when PixieDust shows your visualization in a menu.
-
Export. Notebook users can download data as CSV, HTML, JSON, etc. to their local machine.
-
Scala Bridge. Use Scala directly in your Python notebook. Variables are automatically transferred from Python to Scala and vice versa. Learn more.
Or start in a Scala notebook. As mentioned, all these PixieDust features work not only in Python, but in Scala too. So if you prefer Scala, you'll soon be able to start there and use PixieDust to insert sophisticated Python graphic options within your Scala notebook. Instructions coming soon.
-
Spark progress monitor. Track the status of your Spark job. No more waiting in the dark. Notebook users can now see how a cell's code is running behind the scenes.
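As a rough illustration of the first two features, the sketch below asks packageManager for a Spark package and then hands a DataFrame to display(). It assumes a Jupyter notebook with PixieDust and pandas installed. The sample rows, the helper names (install_graphframes, show), and the GraphFrames coordinate string are illustrative assumptions, not values taken from this README. The imports are guarded so the cell does nothing when PixieDust is missing.

```python
# Sketch only: intended for a Jupyter notebook where PixieDust is installed.
try:
    import pandas as pd
    import pixiedust
    from pixiedust.display import display
    HAVE_PIXIEDUST = True
except ImportError:
    HAVE_PIXIEDUST = False

# Hypothetical sample data; any Spark or pandas DataFrame could be used instead.
SAMPLE_ROWS = [
    {"airport": "BOS", "departures": 120},
    {"airport": "SFO", "departures": 95},
    {"airport": "ORD", "departures": 143},
]


def install_graphframes():
    """Ask packageManager for GraphFrames (the coordinate string is an assumption)."""
    if not HAVE_PIXIEDUST:
        return False
    pixiedust.installPackage("graphframes:graphframes:0")
    return True


def show(rows):
    """Render rows through display(); returns False when PixieDust is missing."""
    if not HAVE_PIXIEDUST:
        return False
    display(pd.DataFrame(rows))
    return True


# In a notebook cell you would then call:
#   install_graphframes()
#   show(SAMPLE_ROWS)
```

Wrapping the calls in small functions keeps the package download and the rendering step explicit, so neither runs until you invoke it from a cell.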
Usage
You can use PixieDust locally or online within IBM's Data Science Experience (DSX).
Use online
To use PixieDust online:
-
Sign up for a free trial on IBM's Data Science Experience
-
Create a new notebook from URL using this template and learn the basics
https://github.com/ibm-watson-data-lab/pixiedust/blob/master/notebook/DSX/Welcome%20to%20PixieDust.ipynb
Use locally
- PixieDust supports:
- Spark 1.6 or 2.0
- Python 2.7 or 3.5
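Before installing locally, it can help to check your environment against the versions listed above. The following is a minimal sketch using only the standard library; the helper names (major_minor, is_supported) are hypothetical and are not part of PixieDust.

```python
import sys

# Versions listed in this README; a snapshot, not a live compatibility matrix.
SUPPORTED_PYTHON = {(2, 7), (3, 5)}
SUPPORTED_SPARK = {(1, 6), (2, 0)}


def major_minor(version):
    """Turn '2.0.2' or a tuple like sys.version_info into (major, minor)."""
    if isinstance(version, str):
        parts = version.split(".")
        return int(parts[0]), int(parts[1])
    return int(version[0]), int(version[1])


def is_supported(python_version, spark_version):
    """True when both versions appear in the supported sets above."""
    return (major_minor(python_version) in SUPPORTED_PYTHON
            and major_minor(spark_version) in SUPPORTED_SPARK)
```

In a notebook that already has a SparkContext named sc (an assumption about your setup), the check would be is_supported(sys.version_info, sc.version).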
Sample notebooks
Wherever you prefer to work, try out the following sample notebooks:
- Welcome to PixieDust. The ultimate notebook to get started with PixieDust.
- Intro to PixieDust. Uses PackageManager to install GraphFrames, generates a dataframe from a simple data set, and lets you try the display() API. See also: Intro to PixieDust for Spark 2.0
- Mapping Intro lets you load sample data sets and explore display() API features, including maps.
Tutorials
- Discover hidden Facebook usage insights
- FlightPredict II: The Sequel shows how to predict flight delays with PixieDust. Includes an embedded app.
- Sentiment Analysis of Twitter Hashtags with Spark revisits a Spark Streaming app, this time using PixieDust and Jupyter. Includes an embedded app.
Contribute
Note: PixieDust currently supports Spark DataFrames, Spark GraphFrames and Pandas DataFrames, with more to come. If you can't wait, write your own today and contribute it back.
Read how to contribute for details on our code of conduct and instructions for submitting pull requests to us.
Developer Guide
Dive into the PixieDust developer docs and learn how to build your own custom visualization or embedded app. You can also pitch in and contribute an enhancement to PixieDust's core features.
We can't wait to see what you build.
License
Apache License, Version 2.0.
For details and all the legalese, read LICENSE.