Running Zeppelin 0.8.2 on Windows 10 with Python and PySpark support

December 18, 2019

While trying to get Zeppelin up and running locally for a local PySpark project I hit no end of issues. Googling online turned up minimal information related to my supply of error messages. Hopefully this helps you get up and running.

0. Prerequisites

To get things started you need a functioning Python environment with PIP available. On my machine I’m using Anaconda to install Python 3. The conda command is available in PowerShell.

Second Zeppelin is written in Java and requires the JDK to run. Given licensing changes over the past year with Oracle’s Java download and install version 8 of the OpenJDK. Make sure to check the box adding it’s folders on to your PATH.

Finally after we download zeppelin we will need to extract it from an archive. I’m using 7-zip as my tool of choice to extract the .tar.gz file.

1. Download Zeppelin

For my local environment I downloaded the zeppelin-0.8.2-bin-all.tar.gz from the [Apache Zeppelin download page].

2. “Install” Zeppelin

Zeppelin doesn’t have an installer. It is “installed” in whatever directory contains the zeppelin-0.8.2-bin-all folder. I’ve run in to issues in the past with applications having problems residing in paths containing a “ ” (space) character. To get around this I’m going to extract the archive in to a directory named C:\apps\zeppelin-0.8.2. Fire up PowerShell and run the following:

mkdir C:\apps
mkdir C:\apps\zeppelin-0.8.2

Now copy the extracted files in to this directory. Note you may have to extract the file twice as the .tar.gz indicates the file is compressed 2x.

cp -r ~\Downloads\zeppelin-0.8.2-bin-all\* C:\apps\zeppelin-0.8.2

With all the files in place Zeppelin is now “installed”. Let’s get it started and explore some error messages.

3. Running Zeppelin

Open powershell, navigate to the C:\apps\zeppelin-0.8.2\ directory, and run bin\zeppelin.cmd

cd C:\apps\zeppelin-0.8.2
.\bin\zeppelin.cmd

This is going to start the zeppelin application. Note that some output will be displayed, but it won’t be available in your browser until the text Done, zeppelin server started is shown. See my output below:

PS C:\apps\zeppelin-0.8.2> .\bin\zeppelin.cmd
OpenJDK 64-Bit Server VM warning: ignoring option MaxPermSize=512m; support was removed in 8.0
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/C:/apps/zeppelin-0.8.2/lib/interpreter/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/C:/apps/zeppelin-0.8.2/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
 WARN [2019-12-18 19:36:28,404] ({main} ZeppelinConfiguration.java[create]:117) - Failed to load configuration, proceeding with a default
 INFO [2019-12-18 19:36:28,419] ({main} ZeppelinConfiguration.java[create]:129) - Server Host: 127.0.0.1
 INFO [2019-12-18 19:36:28,435] ({main} ZeppelinConfiguration.java[create]:131) - Server Port: 8080
 INFO [2019-12-18 19:36:28,435] ({main} ZeppelinConfiguration.java[create]:135) - Context Path: /
 INFO [2019-12-18 19:36:28,466] ({main} ZeppelinConfiguration.java[create]:136) - Zeppelin Version: 0.8.2
 INFO [2019-12-18 19:36:28,482] ({main} Log.java[initialized]:193) - Logging initialized @391ms to org.eclipse.jetty.util.log.Slf4jLog
 WARN [2019-12-18 19:36:28,561] ({main} ServerConnector.java[setSoLingerTime]:458) - Ignoring deprecated socket close linger time
 INFO [2019-12-18 19:36:28,593] ({main} ZeppelinServer.java[setupWebAppContext]:413) - ZeppelinServer Webapp path: C:\apps\zeppelin-0.8.2\webapps
 INFO [2019-12-18 19:36:28,652] ({main} ZeppelinServer.java[main]:239) - Starting zeppelin server
 INFO [2019-12-18 19:36:28,655] ({main} Server.java[doStart]:370) - jetty-9.4.14.v20181114; built: 2018-11-14T21:20:31.478Z; git: c4550056e785fb5665914545889f21dc136ad9e6; jvm 1.8.0_222-b10
 INFO [2019-12-18 19:37:25,514] ({main} StandardDescriptorProcessor.java[visitServlet]:283) - NO JSP Support for /, did not find org.eclipse.jetty.jsp.JettyJspServlet
 INFO [2019-12-18 19:37:25,545] ({main} DefaultSessionIdManager.java[doStart]:365) - DefaultSessionIdManager workerName=node0
 INFO [2019-12-18 19:37:25,545] ({main} DefaultSessionIdManager.java[doStart]:370) - No SessionScavenger set, using defaults
 INFO [2019-12-18 19:37:25,545] ({main} HouseKeeper.java[startScavenging]:149) - node0 Scavenging every 600000ms
 INFO [2019-12-18 19:37:25,764] ({main} SchedulerFactory.java[<init>]:59) - Scheduler Thread Pool Size: 100
 WARN [2019-12-18 19:37:25,780] ({main} ZeppelinConfiguration.java[getConfigFSDir]:527) - zeppelin.config.fs.dir is not specified, fall back to local conf directory zeppelin.conf.dir
 WARN [2019-12-18 19:37:25,780] ({main} ZeppelinConfiguration.java[getConfigFSDir]:527) - zeppelin.config.fs.dir is not specified, fall back to local conf directory zeppelin.conf.dir
DEBUG [2019-12-18 19:37:25,795] ({main} InterpreterSettingManager.java[<init>]:150) - InterpreterRootPath: C:\apps\zeppelin-0.8.2\interpreter
 INFO [2019-12-18 19:37:25,878] ({main} InterpreterSettingManager.java[<init>]:165) - Using RecoveryStorage: org.apache.zeppelin.interpreter.recovery.NullRecoveryStorage
 INFO [2019-12-18 19:37:25,878] ({main} InterpreterSettingManager.java[<init>]:169) - Using LifecycleManager: org.apache.zeppelin.interpreter.lifecycle.NullLifecycleManager
 WARN [2019-12-18 19:37:25,878] ({main} InterpreterSettingManager.java[init]:331) - No interpreter-setting.json found in C:\apps\zeppelin-0.8.2\interpreter\${interpreter.name}
DEBUG [2019-12-18 19:37:25,894] ({main} InterpreterSettingManager.java[registerInterpreterFromPath]:373) - Reading interpreter-setting.json from file C:\apps\zeppelin-0.8.2\interpreter\alluxio\interpreter-setting.json
 INFO [2019-12-18 19:37:26,117] ({main} InterpreterSettingManager.java[registerInterpreterSetting]:425) - Register InterpreterSettingTemplate: alluxio
DEBUG [2019-12-18 19:37:26,117] ({main} InterpreterSettingManager.java[registerInterpreterFromPath]:373) - Reading interpreter-setting.json from file C:\apps\zeppelin-0.8.2\interpreter\angular\interpreter-setting.json
 INFO [2019-12-18 19:37:26,117] ({main} InterpreterSettingManager.java[registerInterpreterSetting]:425) - Register InterpreterSettingTemplate: angular
DEBUG [2019-12-18 19:37:26,132] ({main} InterpreterSettingManager.java[registerInterpreterFromPath]:373) - Reading interpreter-setting.json from file C:\apps\zeppelin-0.8.2\interpreter\bigquery\interpreter-setting.json
 INFO [2019-12-18 19:37:26,132] ({main} InterpreterSettingManager.java[registerInterpreterSetting]:425) - Register InterpreterSettingTemplate: bigquery
DEBUG [2019-12-18 19:37:26,148] ({main} InterpreterSettingManager.java[registerInterpreterFromPath]:373) - Reading interpreter-setting.json from file C:\apps\zeppelin-0.8.2\interpreter\cassandra\interpreter-setting.json
 INFO [2019-12-18 19:37:26,148] ({main} InterpreterSettingManager.java[registerInterpreterSetting]:425) - Register InterpreterSettingTemplate: cassandra
DEBUG [2019-12-18 19:37:26,164] ({main} InterpreterSettingManager.java[registerInterpreterFromPath]:373) - Reading interpreter-setting.json from file C:\apps\zeppelin-0.8.2\interpreter\elasticsearch\interpreter-setting.json
 INFO [2019-12-18 19:37:26,179] ({main} InterpreterSettingManager.java[registerInterpreterSetting]:425) - Register InterpreterSettingTemplate: elasticsearch
DEBUG [2019-12-18 19:37:26,179] ({main} InterpreterSettingManager.java[registerInterpreterFromPath]:373) - Reading interpreter-setting.json from file C:\apps\zeppelin-0.8.2\interpreter\file\interpreter-setting.json
 INFO [2019-12-18 19:37:26,179] ({main} InterpreterSettingManager.java[registerInterpreterSetting]:425) - Register InterpreterSettingTemplate: file
DEBUG [2019-12-18 19:37:26,179] ({main} InterpreterSettingManager.java[registerInterpreterFromPath]:373) - Reading interpreter-setting.json from file C:\apps\zeppelin-0.8.2\interpreter\flink\interpreter-setting.json
 INFO [2019-12-18 19:37:26,179] ({main} InterpreterSettingManager.java[registerInterpreterSetting]:425) - Register InterpreterSettingTemplate: flink
DEBUG [2019-12-18 19:37:26,179] ({main} InterpreterSettingManager.java[registerInterpreterFromPath]:373) - Reading interpreter-setting.json from file C:\apps\zeppelin-0.8.2\interpreter\groovy\interpreter-setting.json
 INFO [2019-12-18 19:37:26,179] ({main} InterpreterSettingManager.java[registerInterpreterSetting]:425) - Register InterpreterSettingTemplate: groovy
DEBUG [2019-12-18 19:37:26,179] ({main} InterpreterSettingManager.java[registerInterpreterFromPath]:373) - Reading interpreter-setting.json from file C:\apps\zeppelin-0.8.2\interpreter\hbase\interpreter-setting.json
 INFO [2019-12-18 19:37:26,179] ({main} InterpreterSettingManager.java[registerInterpreterSetting]:425) - Register InterpreterSettingTemplate: hbase
DEBUG [2019-12-18 19:37:26,179] ({main} InterpreterSettingManager.java[registerInterpreterFromPath]:373) - Reading interpreter-setting.json from file C:\apps\zeppelin-0.8.2\interpreter\ignite\interpreter-setting.json
 INFO [2019-12-18 19:37:26,179] ({main} InterpreterSettingManager.java[registerInterpreterSetting]:425) - Register InterpreterSettingTemplate: ignite
DEBUG [2019-12-18 19:37:26,179] ({main} InterpreterSettingManager.java[registerInterpreterFromPath]:373) - Reading interpreter-setting.json from file C:\apps\zeppelin-0.8.2\interpreter\jdbc\interpreter-setting.json
 INFO [2019-12-18 19:37:26,195] ({main} InterpreterSettingManager.java[registerInterpreterSetting]:425) - Register InterpreterSettingTemplate: jdbc
DEBUG [2019-12-18 19:37:26,195] ({main} InterpreterSettingManager.java[registerInterpreterFromPath]:373) - Reading interpreter-setting.json from file C:\apps\zeppelin-0.8.2\interpreter\kylin\interpreter-setting.json
 INFO [2019-12-18 19:37:26,210] ({main} InterpreterSettingManager.java[registerInterpreterSetting]:425) - Register InterpreterSettingTemplate: kylin
DEBUG [2019-12-18 19:37:26,210] ({main} InterpreterSettingManager.java[registerInterpreterFromPath]:373) - Reading interpreter-setting.json from file C:\apps\zeppelin-0.8.2\interpreter\lens\interpreter-setting.json
 INFO [2019-12-18 19:37:26,210] ({main} InterpreterSettingManager.java[registerInterpreterSetting]:425) - Register InterpreterSettingTemplate: lens
 WARN [2019-12-18 19:37:26,210] ({main} InterpreterSettingManager.java[init]:331) - No interpreter-setting.json found in C:\apps\zeppelin-0.8.2\interpreter\lib
DEBUG [2019-12-18 19:37:26,210] ({main} InterpreterSettingManager.java[registerInterpreterFromPath]:373) - Reading interpreter-setting.json from file C:\apps\zeppelin-0.8.2\interpreter\livy\interpreter-setting.json
 INFO [2019-12-18 19:37:26,210] ({main} InterpreterSettingManager.java[registerInterpreterSetting]:425) - Register InterpreterSettingTemplate: livy
DEBUG [2019-12-18 19:37:26,210] ({main} InterpreterSettingManager.java[registerInterpreterFromPath]:373) - Reading interpreter-setting.json from file C:\apps\zeppelin-0.8.2\interpreter\md\interpreter-setting.json
 INFO [2019-12-18 19:37:26,210] ({main} InterpreterSettingManager.java[registerInterpreterSetting]:425) - Register InterpreterSettingTemplate: md
DEBUG [2019-12-18 19:37:26,226] ({main} InterpreterSettingManager.java[registerInterpreterFromResource]:361) - Reading interpreter-setting.json from jar:file:/C:/apps/zeppelin-0.8.2/interpreter/neo4j/zeppelin-neo4j-0.8.2.jar!/interpreter-setting.json as Resource
 INFO [2019-12-18 19:37:26,226] ({main} InterpreterSettingManager.java[registerInterpreterSetting]:425) - Register InterpreterSettingTemplate: neo4j
DEBUG [2019-12-18 19:37:26,243] ({main} InterpreterSettingManager.java[registerInterpreterFromPath]:373) - Reading interpreter-setting.json from file C:\apps\zeppelin-0.8.2\interpreter\pig\interpreter-setting.json
 INFO [2019-12-18 19:37:26,272] ({main} InterpreterSettingManager.java[registerInterpreterSetting]:425) - Register InterpreterSettingTemplate: pig
DEBUG [2019-12-18 19:37:26,274] ({main} InterpreterSettingManager.java[registerInterpreterFromPath]:373) - Reading interpreter-setting.json from file C:\apps\zeppelin-0.8.2\interpreter\python\interpreter-setting.json
 INFO [2019-12-18 19:37:26,275] ({main} InterpreterSettingManager.java[registerInterpreterSetting]:425) - Register InterpreterSettingTemplate: python
DEBUG [2019-12-18 19:37:26,302] ({main} InterpreterSettingManager.java[registerInterpreterFromPath]:373) - Reading interpreter-setting.json from file C:\apps\zeppelin-0.8.2\interpreter\sap\interpreter-setting.json
 INFO [2019-12-18 19:37:26,318] ({main} InterpreterSettingManager.java[registerInterpreterSetting]:425) - Register InterpreterSettingTemplate: sap
 WARN [2019-12-18 19:37:26,380] ({main} InterpreterSettingManager.java[init]:331) - No interpreter-setting.json found in C:\apps\zeppelin-0.8.2\interpreter\scio
DEBUG [2019-12-18 19:37:26,380] ({main} InterpreterSettingManager.java[registerInterpreterFromPath]:373) - Reading interpreter-setting.json from file C:\apps\zeppelin-0.8.2\interpreter\sh\interpreter-setting.json
 INFO [2019-12-18 19:37:26,380] ({main} InterpreterSettingManager.java[registerInterpreterSetting]:425) - Register InterpreterSettingTemplate: sh
DEBUG [2019-12-18 19:37:26,396] ({main} InterpreterSettingManager.java[registerInterpreterFromPath]:373) - Reading interpreter-setting.json from file C:\apps\zeppelin-0.8.2\interpreter\spark\interpreter-setting.json
 INFO [2019-12-18 19:37:26,396] ({main} InterpreterSettingManager.java[registerInterpreterSetting]:425) - Register InterpreterSettingTemplate: spark
 INFO [2019-12-18 19:37:26,396] ({main} LocalConfigStorage.java[loadInterpreterSettings]:63) - Load Interpreter Setting from file: C:\apps\zeppelin-0.8.2\conf\interpreter.json
 INFO [2019-12-18 19:37:26,427] ({main} InterpreterSettingManager.java[loadFromFile]:279) - Create Interpreter Setting ignite from interpreter.json
 INFO [2019-12-18 19:37:26,427] ({main} InterpreterSettingManager.java[loadFromFile]:279) - Create Interpreter Setting python from interpreter.json
 INFO [2019-12-18 19:37:26,443] ({main} InterpreterSettingManager.java[loadFromFile]:279) - Create Interpreter Setting sap from interpreter.json
 INFO [2019-12-18 19:37:26,458] ({main} InterpreterSettingManager.java[loadFromFile]:279) - Create Interpreter Setting jdbc from interpreter.json
 INFO [2019-12-18 19:37:26,477] ({main} InterpreterSettingManager.java[loadFromFile]:279) - Create Interpreter Setting lens from interpreter.json
 INFO [2019-12-18 19:37:26,478] ({main} InterpreterSettingManager.java[loadFromFile]:279) - Create Interpreter Setting pig from interpreter.json
 INFO [2019-12-18 19:37:26,481] ({main} InterpreterSettingManager.java[loadFromFile]:279) - Create Interpreter Setting flink from interpreter.json
 INFO [2019-12-18 19:37:26,493] ({main} InterpreterSettingManager.java[loadFromFile]:279) - Create Interpreter Setting angular from interpreter.json
 INFO [2019-12-18 19:37:26,509] ({main} InterpreterSettingManager.java[loadFromFile]:279) - Create Interpreter Setting livy from interpreter.json
 INFO [2019-12-18 19:37:26,511] ({main} InterpreterSettingManager.java[loadFromFile]:279) - Create Interpreter Setting file from interpreter.json
 INFO [2019-12-18 19:37:26,524] ({main} InterpreterSettingManager.java[loadFromFile]:279) - Create Interpreter Setting groovy from interpreter.json
 INFO [2019-12-18 19:37:26,540] ({main} InterpreterSettingManager.java[loadFromFile]:279) - Create Interpreter Setting neo4j from interpreter.json
 INFO [2019-12-18 19:37:26,543] ({main} InterpreterSettingManager.java[loadFromFile]:279) - Create Interpreter Setting elasticsearch from interpreter.json
 INFO [2019-12-18 19:37:26,550] ({main} InterpreterSettingManager.java[loadFromFile]:279) - Create Interpreter Setting cassandra from interpreter.json
 INFO [2019-12-18 19:37:26,550] ({main} InterpreterSettingManager.java[loadFromFile]:279) - Create Interpreter Setting sh from interpreter.json
 INFO [2019-12-18 19:37:26,566] ({main} InterpreterSettingManager.java[loadFromFile]:279) - Create Interpreter Setting spark from interpreter.json
 INFO [2019-12-18 19:37:26,566] ({main} InterpreterSettingManager.java[loadFromFile]:279) - Create Interpreter Setting md from interpreter.json
 INFO [2019-12-18 19:37:26,566] ({main} InterpreterSettingManager.java[loadFromFile]:279) - Create Interpreter Setting alluxio from interpreter.json
 INFO [2019-12-18 19:37:26,566] ({main} InterpreterSettingManager.java[loadFromFile]:279) - Create Interpreter Setting bigquery from interpreter.json
 INFO [2019-12-18 19:37:26,566] ({main} InterpreterSettingManager.java[loadFromFile]:279) - Create Interpreter Setting hbase from interpreter.json
 INFO [2019-12-18 19:37:26,566] ({main} InterpreterSettingManager.java[loadFromFile]:279) - Create Interpreter Setting kylin from interpreter.json
 INFO [2019-12-18 19:37:26,598] ({main} LocalConfigStorage.java[save]:53) - Save Interpreter Setting to C:\apps\zeppelin-0.8.2\conf\interpreter.json
 INFO [2019-12-18 19:37:26,717] ({main} VfsLog.java[info]:138) - Using "C:\Users\Trilogy\AppData\Local\Temp\vfs_cache" as temporary files store.
 INFO [2019-12-18 19:37:26,764] ({main} GitNotebookRepo.java[<init>]:64) - Opening a git repo at '/apps/zeppelin-0.8.2/notebook'
 INFO [2019-12-18 19:37:26,950] ({main} NotebookRepoSync.java[<init>]:77) - Instantiate NotebookRepo: org.apache.zeppelin.notebook.repo.GitNotebookRepo
 WARN [2019-12-18 19:37:27,051] ({main} NotebookAuthorization.java[getInstance]:86) - Notebook authorization module was called without initialization, initializing with default configuration
 WARN [2019-12-18 19:37:27,051] ({main} LocalConfigStorage.java[loadNotebookAuthorization]:77) - NotebookAuthorization file C:\apps\zeppelin-0.8.2\conf\notebook-authorization.json is not existed
 INFO [2019-12-18 19:37:27,051] ({main} Credentials.java[loadFromFile]:121) - C:\apps\zeppelin-0.8.2\conf\credentials.json
 INFO [2019-12-18 19:37:27,101] ({main} StdSchedulerFactory.java[instantiate]:1184) - Using default implementation for ThreadExecutor
 INFO [2019-12-18 19:37:27,104] ({main} SimpleThreadPool.java[initialize]:268) - Job execution threads will use class loader of thread: main
 INFO [2019-12-18 19:37:27,115] ({main} SchedulerSignalerImpl.java[<init>]:61) - Initialized Scheduler Signaller of type: class org.quartz.core.SchedulerSignalerImpl
 INFO [2019-12-18 19:37:27,116] ({main} QuartzScheduler.java[<init>]:240) - Quartz Scheduler v.2.2.1 created.
 INFO [2019-12-18 19:37:27,117] ({main} RAMJobStore.java[initialize]:155) - RAMJobStore initialized.
 INFO [2019-12-18 19:37:27,133] ({main} QuartzScheduler.java[initialize]:305) - Scheduler meta-data: Quartz Scheduler (v2.2.1) 'DefaultQuartzScheduler' with instanceId 'NON_CLUSTERED'
  Scheduler class: 'org.quartz.core.QuartzScheduler' - running locally.
  NOT STARTED.
  Currently in standby mode.
  Number of jobs executed: 0
  Using thread pool 'org.quartz.simpl.SimpleThreadPool' - with 10 threads.
  Using job-store 'org.quartz.simpl.RAMJobStore' - which does not support persistence. and is not clustered.

 INFO [2019-12-18 19:37:27,133] ({main} StdSchedulerFactory.java[instantiate]:1339) - Quartz scheduler 'DefaultQuartzScheduler' initialized from default resource file in Quartz package: 'quartz.properties'
 INFO [2019-12-18 19:37:27,149] ({main} StdSchedulerFactory.java[instantiate]:1343) - Quartz scheduler version: 2.2.1
 INFO [2019-12-18 19:37:27,164] ({main} QuartzScheduler.java[start]:575) - Scheduler DefaultQuartzScheduler_$_NON_CLUSTERED started.
 INFO [2019-12-18 19:37:27,251] ({main} FolderView.java[createFolder]:107) - Create folder Zeppelin Tutorial
 INFO [2019-12-18 19:37:27,251] ({main} FolderView.java[createFolder]:107) - Create folder /
 INFO [2019-12-18 19:37:27,251] ({main} Folder.java[setParent]:169) - Set parent of / to /
 INFO [2019-12-18 19:37:27,251] ({main} Folder.java[setParent]:169) - Set parent of Zeppelin Tutorial to /
 INFO [2019-12-18 19:37:27,267] ({main} Folder.java[addNote]:185) - Add note 2C2AUG798 to folder Zeppelin Tutorial
 WARN [2019-12-18 19:37:27,267] ({main} Notebook.java[refreshCron]:981) - execution of the cron job is skipped cron is not enabled from Zeppelin server
 WARN [2019-12-18 19:37:27,267] ({main} InterpreterSettingManager.java[compare]:886) - InterpreterGroup sap is not specified in zeppelin.interpreter.group.order
 WARN [2019-12-18 19:37:27,267] ({main} InterpreterSettingManager.java[compare]:886) - InterpreterGroup sap is not specified in zeppelin.interpreter.group.order
 WARN [2019-12-18 19:37:27,267] ({main} InterpreterSettingManager.java[compare]:892) - InterpreterGroup sap is not specified in zeppelin.interpreter.group.order
 WARN [2019-12-18 19:37:27,267] ({main} InterpreterSettingManager.java[compare]:892) - InterpreterGroup sap is not specified in zeppelin.interpreter.group.order
 WARN [2019-12-18 19:37:27,267] ({main} InterpreterSettingManager.java[compare]:892) - InterpreterGroup sap is not specified in zeppelin.interpreter.group.order
 WARN [2019-12-18 19:37:27,267] ({main} InterpreterSettingManager.java[compare]:892) - InterpreterGroup sap is not specified in zeppelin.interpreter.group.order
 INFO [2019-12-18 19:37:27,267] ({main} Notebook.java[<init>]:127) - Notebook indexing started...
 INFO [2019-12-18 19:37:27,370] ({main} LuceneSearch.java[addIndexDocs]:305) - Indexing 1 notebooks took 90ms
 INFO [2019-12-18 19:37:27,370] ({main} Notebook.java[<init>]:129) - Notebook indexing finished: 1 indexed in 0s
 INFO [2019-12-18 19:37:27,375] ({main} Helium.java[loadConf]:103) - Add helium local registry C:\apps\zeppelin-0.8.2/helium
 INFO [2019-12-18 19:37:27,396] ({main} Helium.java[loadConf]:100) - Add helium online registry https://s3.amazonaws.com/helium-package/helium.json
 WARN [2019-12-18 19:37:27,400] ({main} Helium.java[loadConf]:111) - C:\apps\zeppelin-0.8.2\conf/helium.json does not exists
Dec 18, 2019 7:37:28 PM org.glassfish.jersey.internal.inject.Providers checkProviderRuntime
WARNING: A provider org.apache.zeppelin.rest.ConfigurationsRestApi registered in SERVER runtime does not implement any provider interfaces applicable in the SERVER runtime. Due to constraint configuration problems the provider org.apache.zeppelin.rest.ConfigurationsRestApi will be ignored.
Dec 18, 2019 7:37:28 PM org.glassfish.jersey.internal.inject.Providers checkProviderRuntime
WARNING: A provider org.apache.zeppelin.rest.InterpreterRestApi registered in SERVER runtime does not implement any provider interfaces applicable in the SERVER runtime. Due to constraint configuration problems the provider org.apache.zeppelin.rest.InterpreterRestApi will be ignored.
Dec 18, 2019 7:37:28 PM org.glassfish.jersey.internal.inject.Providers checkProviderRuntime
WARNING: A provider org.apache.zeppelin.rest.NotebookRestApi registered in SERVER runtime does not implement any provider interfaces applicable in the SERVER runtime. Due to constraint configuration problems the provider org.apache.zeppelin.rest.NotebookRestApi will be ignored.
Dec 18, 2019 7:37:28 PM org.glassfish.jersey.internal.inject.Providers checkProviderRuntime
WARNING: A provider org.apache.zeppelin.rest.CredentialRestApi registered in SERVER runtime does not implement any provider interfaces applicable in the SERVER runtime. Due to constraint configuration problems the provider org.apache.zeppelin.rest.CredentialRestApi will be ignored.
Dec 18, 2019 7:37:28 PM org.glassfish.jersey.internal.inject.Providers checkProviderRuntime
WARNING: A provider org.apache.zeppelin.rest.LoginRestApi registered in SERVER runtime does not implement any provider interfaces applicable in the SERVER runtime. Due to constraint configuration problems the provider org.apache.zeppelin.rest.LoginRestApi will be ignored.
Dec 18, 2019 7:37:28 PM org.glassfish.jersey.internal.inject.Providers checkProviderRuntime
WARNING: A provider org.apache.zeppelin.rest.NotebookRepoRestApi registered in SERVER runtime does not implement any provider interfaces applicable in the SERVER runtime. Due to constraint configuration problems the provider org.apache.zeppelin.rest.NotebookRepoRestApi will be ignored.
Dec 18, 2019 7:37:28 PM org.glassfish.jersey.internal.inject.Providers checkProviderRuntime
WARNING: A provider org.apache.zeppelin.rest.HeliumRestApi registered in SERVER runtime does not implement any provider interfaces applicable in the SERVER runtime. Due to constraint configuration problems the provider org.apache.zeppelin.rest.HeliumRestApi will be ignored.
Dec 18, 2019 7:37:28 PM org.glassfish.jersey.internal.inject.Providers checkProviderRuntime
WARNING: A provider org.apache.zeppelin.rest.SecurityRestApi registered in SERVER runtime does not implement any provider interfaces applicable in the SERVER runtime. Due to constraint configuration problems the provider org.apache.zeppelin.rest.SecurityRestApi will be ignored.
Dec 18, 2019 7:37:28 PM org.glassfish.jersey.internal.inject.Providers checkProviderRuntime
WARNING: A provider org.apache.zeppelin.rest.ZeppelinRestApi registered in SERVER runtime does not implement any provider interfaces applicable in the SERVER runtime. Due to constraint configuration problems the provider org.apache.zeppelin.rest.ZeppelinRestApi will be ignored.
Dec 18, 2019 7:37:28 PM org.glassfish.jersey.internal.Errors logErrors
WARNING: The following warnings have been detected: WARNING: A HTTP GET method, public javax.ws.rs.core.Response org.apache.zeppelin.rest.InterpreterRestApi.listInterpreter(java.lang.String), should not consume any entity.
WARNING: A HTTP GET method, public javax.ws.rs.core.Response org.apache.zeppelin.rest.CredentialRestApi.getCredentials(java.lang.String) throws java.io.IOException,java.lang.IllegalArgumentException, should not consume any entity.
WARNING: The (sub)resource method createNote in org.apache.zeppelin.rest.NotebookRestApi contains empty path annotation.
WARNING: The (sub)resource method getNoteList in org.apache.zeppelin.rest.NotebookRestApi contains empty path annotation.

Dec 18, 2019 7:37:28 PM org.glassfish.jersey.internal.Errors logErrors
WARNING: The following warnings have been detected: WARNING: A HTTP GET method, public javax.ws.rs.core.Response org.apache.zeppelin.rest.CredentialRestApi.getCredentials(java.lang.String) throws java.io.IOException,java.lang.IllegalArgumentException, should not consume any entity.
WARNING: A HTTP GET method, public javax.ws.rs.core.Response org.apache.zeppelin.rest.InterpreterRestApi.listInterpreter(java.lang.String), should not consume any entity.
WARNING: The (sub)resource method createNote in org.apache.zeppelin.rest.NotebookRestApi contains empty path annotation.
WARNING: The (sub)resource method getNoteList in org.apache.zeppelin.rest.NotebookRestApi contains empty path annotation.

 INFO [2019-12-18 19:37:28,493] ({main} ContextHandler.java[doStart]:855) - Started [email protected]{zeppelin-web,/,file:///C:/apps/zeppelin-0.8.2/webapps/webapp/,AVAILABLE}{C:\apps\zeppelin-0.8.2\zeppelin-web-0.8.2.war}
 INFO [2019-12-18 19:37:28,524] ({main} AbstractConnector.java[doStart]:292) - Started [email protected]{HTTP/1.1,[http/1.1]}{127.0.0.1:8080}
 INFO [2019-12-18 19:37:28,524] ({main} Server.java[doStart]:407) - Started @60427ms
 INFO [2019-12-18 19:37:28,524] ({main} ZeppelinServer.java[main]:249) - Done, zeppelin server started

With Zeppelin started we can now navigate to the web interface and verify it is behaving as we expect. Open http://127.0.0.1:8080/#/ in your browser and let the user interface load.

Zeppelin Homepage

Testing Zeppelin

Now that we are in the web interface let’s kick the tires and see if it just works. Zeppelin ships with a number of tutorial notebooks. In the left-most column select the Zeppelin Tutorial entry in the list (it has a little folder icon next to it) then choose the Basic Features (Spark) notebook. Next click the “Play” icon in the top toolbar and confirm the operation. This will execute all paragraphs in the notebook using a background process.

As it runs you will see some output beneath each paragraph. The second paragraph reads in a CSV file in to a dataframe and registers it for querying with Spark SQL. The following three paragraphs run SQL queries against this table and graph the results.

Zeppelin Spark Tutorial

With a regular Spark notebook running without any issues let’s try some Python and PySpark. Back at the top of the window select the Notebook entry in the main navigation bar. Select Zeppelin Tutorial again then choose Matplotlib (Python - PySpark). You may be asked to configure your interpreters. Simply click the Save button if it appears. Just like last time let’s play button to run all paragraphs again.

We are quickly alerted to an issue when the second paragraph is attempted to run. The error message presented:

org.apache.thrift.TApplicationException: Internal error processing createInterpreter
	at org.apache.thrift.TApplicationException.read(TApplicationException.java:111)
	at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:71)
	at org.apache.zeppelin.interpreter.thrift.RemoteInterpreterService$Client.recv_createInterpreter(RemoteInterpreterService.java:209)
	at org.apache.zeppelin.interpreter.thrift.RemoteInterpreterService$Client.createInterpreter(RemoteInterpreterService.java:192)
	at org.apache.zeppelin.interpreter.remote.RemoteInterpreter$2.call(RemoteInterpreter.java:169)
	at org.apache.zeppelin.interpreter.remote.RemoteInterpreter$2.call(RemoteInterpreter.java:165)
	at org.apache.zeppelin.interpreter.remote.RemoteInterpreterProcess.callRemoteFunction(RemoteInterpreterProcess.java:135)
	at org.apache.zeppelin.interpreter.remote.RemoteInterpreter.internal_create(RemoteInterpreter.java:165)
	at org.apache.zeppelin.interpreter.remote.RemoteInterpreter.open(RemoteInterpreter.java:132)
	at org.apache.zeppelin.interpreter.remote.RemoteInterpreter.getFormType(RemoteInterpreter.java:299)
	at org.apache.zeppelin.notebook.Paragraph.jobRun(Paragraph.java:408)
	at org.apache.zeppelin.scheduler.Job.run(Job.java:188)
	at org.apache.zeppelin.scheduler.RemoteScheduler$JobRunner.run(RemoteScheduler.java:315)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:748)

is a bit misleading. The error message is saying there was an issue creating the interpreter where the paragraph is actually executed. In our PowerShell window the true error is displayed:

DEBUG [2019-12-18 19:57:56,112] ({Exec Stream Pumper} RemoteInterpreterManagedProcess.java[processLine]:298) - Caused by: java.lang.RuntimeException: java.io.IOException: The system cannot find the path specified
DEBUG [2019-12-18 19:57:56,112] ({Exec Stream Pumper} RemoteInterpreterManagedProcess.java[processLine]:298) -  at org.apache.zeppelin.python.PythonInterpreter.<init>(PythonInterpreter.java:105)
DEBUG [2019-12-18 19:57:56,112] ({Exec Stream Pumper} RemoteInterpreterManagedProcess.java[processLine]:298) -  ... 13 more
DEBUG [2019-12-18 19:57:56,112] ({Exec Stream Pumper} RemoteInterpreterManagedProcess.java[processLine]:298) - Caused by: java.io.IOException: The system cannot find the path specified
DEBUG [2019-12-18 19:57:56,112] ({Exec Stream Pumper} RemoteInterpreterManagedProcess.java[processLine]:298) -  at java.io.WinNTFileSystem.createFileExclusively(Native Method)
DEBUG [2019-12-18 19:57:56,112] ({Exec Stream Pumper} RemoteInterpreterManagedProcess.java[processLine]:298) -  at java.io.File.createTempFile(File.java:2024)
DEBUG [2019-12-18 19:57:56,112] ({Exec Stream Pumper} RemoteInterpreterManagedProcess.java[processLine]:298) -  at org.apache.zeppelin.python.PythonInterpreter.<init>(PythonInterpreter.java:102)
DEBUG [2019-12-18 19:57:56,112] ({Exec Stream Pumper} RemoteInterpreterManagedProcess.java[processLine]:298) -  ... 13 more

Googling around for java.io.IOException: The system cannot find the path specified wasn’t very productive so our next step was to take a look at the code. Github hosts a mirror of the project’s repository at apache/zeppelin. Select the Branches drop down on th left and select the Tags tab. Here enter the version we’re running 0.8.2 or select it from the list. This will take you to https://github.com/apache/zeppelin/tree/v0.8.2 which should represent the code used to build the version we have installed.

![github_tags.png]

Now we press the t key which brings up a search box that filters file names. Enter PythonInterpreter.java which is listed in our error message output and press Enter. Now we will scroll down to line 105 and take a look at the code:

  public PythonInterpreter(Properties property) {
    super(property);
    try {
      File scriptFile = File.createTempFile("zeppelin_python-", ".py", new File("/tmp"));
      scriptPath = scriptFile.getAbsolutePath();
    } catch (IOException e) {
      throw new RuntimeException(e);
    }
  }

The error message is being caused by an attempt to create a file under /tmp. Looking at our harddrive there is no /tmp directory. Let’s make that and see what happens.

mkdir /tmp

Clicking the play button on the second paragraph resulted in a graph being displayed! Excellent, let’s try it again with the entire notebook.

Traceback (most recent call last):
  File "C:\Users\Trilogy\AppData\Local\Temp\zeppelin_pyspark-5585656243242624288.py", line 22, in <module>
    from pyspark.conf import SparkConf
ModuleNotFoundError: No module named 'pyspark'
Traceback (most recent call last):
  File "C:\Users\Trilogy\AppData\Local\Temp\zeppelin_pyspark-5585656243242624288.py", line 22, in <module>
    from pyspark.conf import SparkConf
ModuleNotFoundError: No module named 'pyspark'
pyspark is not responding

Drat, it looks like we don’t have PySpark installed. Let’s take care of this with:

pip install pyspark

Running the paragraph returned the text pyspark is not returning. Given we just changed a library let’s stop Zeppelin and start it back up. After the process restarts run the notebook again. Now we are met with a new error:

import org.apache.spark.SparkContext._
import spark.implicits._
import spark.sql
import org.apache.spark.sql.functions._
Traceback (most recent call last):
  File "C:\Users\Trilogy\AppData\Local\Temp\zeppelin_pyspark-6390702313557505597.py", line 309, in <module>
    sc = _zsc_ = SparkContext(jsc=jsc, gateway=gateway, conf=conf)
  File "C:\Users\Trilogy\Anaconda3\lib\site-packages\pyspark\context.py", line 119, in __init__
    if gateway is not None and gateway.gateway_parameters.auth_token is None:
AttributeError: 'GatewayParameters' object has no attribute 'auth_token'

This looks like we’re trying to call code that doesn’t exist. Poking around in the source code for this library indicates that this method should have the auth_token parameter. Digging around in the C:\apps\zeppelin-0.8.2\interpreter\spark\pyspark directory there are two zip files present.

Python Libraries

Looking at the version of py4j installed along with PySpark my versions don’t match! Let’s see if we can figure out the version of PySpark in use here (along with Py4J). Py4J makes it easy for us by including the version in the filename 0.10.4 (vs 0.10.7 that was installed with our pip install invocation). PySpark is a little trickier as the version is not included in the filename of the archive. Let’s extract this archive and take a look at the version.py file in the extracted folder. This indicates Zeppelin 0.8.2 wants to use pyspark==2.2.1. At this point we can delete our extracted folder and invalid version then install the appropriate version.

rm C:\apps\zeppelin-0.8.2\interpreter\spark\pyspark\pyspark
pip uninstall pyspark py4j
pip install pyspark==2.2.1 py4j==0.10.4

With these changes in place go ahead and fire Zeppelin back up. Running all the paragraphs should complete without any issues.

Python Output

You now have a fully functioning Zeppelin installation with connected Python and PySpark interpretters.

TL;DR

mkdir /tmp
pip install pyspark==2.2.1 py4j==0.10.4