Install without pip
- You can download a BigDL release or nightly build from the Release Page, or build the BigDL package from source.
- Install Python dependencies:
  - BigDL only depends on NumPy for now.
  - For a Spark standalone cluster:
    - If you're running in cluster mode, you need to install the Python dependencies on both the client and each worker node.
    - Install NumPy: `sudo apt-get install python-numpy` (Ubuntu)
  - For a YARN cluster:
    - You can run BigDL Python programs on YARN clusters without changes to the cluster (e.g., no need to pre-install the Python dependencies). First package all the required Python dependencies into a virtual environment on the local node (where you will run the spark-submit command), then use spark-submit to run the BigDL Python program on the YARN cluster with that virtual environment. Please refer to Packing-dependencies for more details.
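On a standalone cluster, it can help to confirm that NumPy is importable by the Python interpreter on a given machine before submitting a job. The following is a minimal sketch and not part of BigDL; the helper name `missing_dependencies` is hypothetical. Run it with the same interpreter on the client and on each worker node.

```python
import importlib


def missing_dependencies(module_names):
    """Return the names in module_names that this interpreter cannot import."""
    missing = []
    for name in module_names:
        try:
            importlib.import_module(name)
        except ImportError:
            # Covers ModuleNotFoundError on Python 3 as well.
            missing.append(name)
    return missing


if __name__ == "__main__":
    # "numpy" is the only Python dependency named in this section.
    missing = missing_dependencies(["numpy"])
    if missing:
        print("Missing Python dependencies: " + ", ".join(missing))
    else:
        print("All required Python dependencies are importable.")
```

An empty list means every listed module can be imported on that node.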
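The YARN workflow (package a virtual environment locally, then submit with it) can be sketched as a small Python helper that assembles a spark-submit argument list. This is an illustration under stated assumptions, not the procedure from Packing-dependencies: the helper `build_yarn_submit_command`, the archive name `venv.zip`, the alias `venv`, and the script name are all hypothetical, and the sketch assumes cluster deploy mode and an archive whose root contains `bin/python`. The `--archives` option and the `spark.yarn.appMasterEnv.*` and `spark.executorEnv.*` settings are standard Spark-on-YARN options; BigDL-specific options are omitted.

```python
def build_yarn_submit_command(script, venv_archive="venv.zip", alias="venv"):
    """Assemble a spark-submit argument list that ships a packaged virtualenv to YARN."""
    # Interpreter path inside the unpacked archive on each YARN container.
    # Assumption: the archive root contains bin/python.
    python_in_venv = "{}/bin/python".format(alias)
    return [
        "spark-submit",
        "--master", "yarn",
        "--deploy-mode", "cluster",
        # YARN unpacks the archive into each container's working directory
        # under the name given after '#'.
        "--archives", "{}#{}".format(venv_archive, alias),
        # Point the driver (application master) and the executors at that interpreter.
        "--conf", "spark.yarn.appMasterEnv.PYSPARK_PYTHON=" + python_in_venv,
        "--conf", "spark.executorEnv.PYSPARK_PYTHON=" + python_in_venv,
        script,
    ]


if __name__ == "__main__":
    # "my_bigdl_program.py" is a placeholder script name.
    print(" ".join(build_yarn_submit_command("my_bigdl_program.py")))
```

The helper only builds the argument list; to launch the job, the list could be passed to `subprocess.check_call` on the node where spark-submit is installed.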