Install without pip
NOTE: Only Python 2.7 and Python 3.5 are supported for now.
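As a quick sanity check before installing, the interpreter version can be compared against the two versions named in the note. The following is only a minimal sketch; the names `SUPPORTED_VERSIONS` and `is_supported` are illustrative and not part of BigDL.

```python
import sys

# Versions named in the note above, as (major, minor) pairs.
SUPPORTED_VERSIONS = {(2, 7), (3, 5)}


def is_supported(version_info=None):
    """Return True if the (major, minor, ...) tuple is a supported version."""
    if version_info is None:
        version_info = sys.version_info
    return tuple(version_info[:2]) in SUPPORTED_VERSIONS


if __name__ == "__main__":
    current = "%d.%d" % tuple(sys.version_info[:2])
    if is_supported():
        print("Python %s is supported." % current)
    else:
        print("Python %s is not supported; use Python 2.7 or 3.5." % current)
```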
- You can download a BigDL release or nightly build from the Release Page, or build the BigDL package from source.
- Install Python dependencies:
  - BigDL only depends on Numpy for now.
  - For a Spark standalone cluster:
    - If you're running in cluster mode, you need to install the Python dependencies on the client and on each worker node.
    - Install Numpy (Ubuntu): sudo apt-get install python-numpy
  - For a YARN cluster:
    - You can run BigDL Python programs on YARN clusters without changing the cluster (e.g., there is no need to pre-install the Python dependencies). First package all the required Python dependencies into a virtual environment on the local node (where you will run the spark-submit command), then use spark-submit to run the BigDL Python program on the YARN cluster with that virtual environment. See Packing-dependencies for more details.
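For a standalone cluster, it can help to confirm that Numpy is actually importable on the client and on each worker node before submitting a job. The sketch below is one way to do that; `find_missing` is a hypothetical helper name, and the script would have to be run on each node with the same interpreter that Spark uses.

```python
import importlib


def find_missing(modules=("numpy",)):
    """Return the names of modules that cannot be imported on this machine."""
    missing = []
    for name in modules:
        try:
            importlib.import_module(name)
        except ImportError:
            missing.append(name)
    return missing


if __name__ == "__main__":
    missing = find_missing()
    if missing:
        print("Missing Python dependencies: " + ", ".join(missing))
    else:
        print("All Python dependencies are importable.")
```

The module list is a parameter so the same check can cover any extra packages your own program imports.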
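The YARN workflow described above (pack a virtual environment locally, then let spark-submit ship it) might be sketched as follows. This is an illustration under stated assumptions, not the procedure from Packing-dependencies: the archive name `venv.zip`, the alias `venv`, the interpreter path `venv/bin/python`, and the helper `build_submit_command` are all hypothetical, and any BigDL-specific options are omitted. The `--archives` flag and the two `PYSPARK_PYTHON` settings are standard Spark-on-YARN options.

```python
def build_submit_command(script, archive="venv.zip#venv",
                         python_path="venv/bin/python"):
    """Build a spark-submit argument list that ships a packed virtualenv to YARN."""
    return [
        "spark-submit",
        "--master", "yarn",
        "--deploy-mode", "cluster",
        # Ship the archive; YARN unpacks it under the alias given after '#'.
        "--archives", archive,
        # Point the application master (driver) and the executors
        # at the interpreter inside the unpacked archive (assumed layout).
        "--conf", "spark.yarn.appMasterEnv.PYSPARK_PYTHON=" + python_path,
        "--conf", "spark.executorEnv.PYSPARK_PYTHON=" + python_path,
        script,
    ]


if __name__ == "__main__":
    # my_bigdl_job.py is a placeholder script name.
    print(" ".join(build_submit_command("my_bigdl_job.py")))
```

Returning a list rather than a single string means the result can be passed directly to `subprocess.call` without shell quoting concerns.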