How to install numpy on M1 Max, with the most accelerated performance (Apple's vecLib)? Here's the answer as of Dec 6 2021.
Install Miniforge, so that your Python runs natively on arm64, not translated via Rosetta.
- Download Miniforge3-MacOSX-arm64.sh, then
- Run the script, then open another shell
$ bash Miniforge3-MacOSX-arm64.sh
- Create an environment (here I use the name np_veclib)
$ conda create -n np_veclib python=3.9
$ conda activate np_veclib
- To compile numpy, first install cython and pybind11:
$ conda install cython pybind11
- Compile numpy with pip (thanks to @Marijn's answer); don't use conda install!
$ pip install --no-binary :all: --no-use-pep517 numpy
- An alternative to the pip command above is to build from source:
$ git clone https://github.com/numpy/numpy
$ cd numpy
$ cp site.cfg.example site.cfg
$ nano site.cfg
Edit the copied site.cfg and add the following lines:
[accelerate]
libraries = Accelerate, vecLib
Then build and install:
$ NPY_LAPACK_ORDER=accelerate python setup.py build
$ python setup.py install
- After either the pip install or the source build, test whether numpy is using vecLib:
>>> import numpy
>>> numpy.show_config()
Then, info like /System/Library/Frameworks/vecLib.framework/Headers should be printed.
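The same check can be scripted. The sketch below is a minimal example, not part of the original steps: it uses platform.machine() to confirm that the interpreter itself runs as arm64 (an x86_64 Python translated by Rosetta reports x86_64), and it captures whatever numpy.show_config() prints to look for Accelerate or vecLib. The helper names are hypothetical, and the substring match is an assumption, since the exact show_config() output differs between NumPy versions.

```python
import contextlib
import io
import platform


def is_native_arm64() -> bool:
    """True when the interpreter itself runs as arm64 (not x86_64 via Rosetta)."""
    return platform.machine() == "arm64"


def mentions_accelerate(config_text: str) -> bool:
    """Heuristic (assumption): look for Accelerate/vecLib in show_config() output."""
    text = config_text.lower()
    return "accelerate" in text or "veclib" in text


def numpy_config_text() -> str:
    """Capture what numpy.show_config() prints to stdout."""
    import numpy  # imported lazily so the helpers above work without numpy

    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        numpy.show_config()
    return buffer.getvalue()
```

In the np_veclib environment, is_native_arm64() and mentions_accelerate(numpy_config_text()) should both return True if the steps above worked.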
Make conda recognize packages installed by pip
$ conda config --set pip_interop_enabled true
This must be done; otherwise, if you run e.g. conda install pandas, numpy will show up in the "The following packages will be installed" list and be installed again. The newly installed one comes from the conda-forge channel and is slow.
Besides the optimal installation above, I also tried several other installations:
- A. np_default: conda create -n np_default python=3.9 numpy
- B. np_openblas: conda create -n np_openblas python=3.9 numpy blas=*=*openblas*
- C. np_netlib: conda create -n np_netlib python=3.9 numpy blas=*=*netlib*
Options A, B, and C above are installed directly from the conda-forge channel, and numpy.show_config() shows identical results for all three. To see the difference, inspect the output of conda list; e.g. openblas packages are installed in B. Note that mkl and blis are not supported on arm64.
- D. np_openblas_source: First install openblas with brew install openblas. Then add the [openblas] path /opt/homebrew/opt/openblas to site.cfg and build NumPy from source.
For comparison, I also include results from other CPUs:
- M1 and i9-9880H in this post.
- My old i5-6360U (2 cores) on a MacBook Pro 2016 13in.
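Since numpy.show_config() does not distinguish the conda-forge environments A to C, one way to compare them is to list the BLAS-related packages in each environment. The sketch below is only an illustration: it assumes conda is on the PATH, that conda list --json returns records with name, version, and build_string fields, and that matching package names on a few substrings is good enough. The function names and the BLAS_HINTS tuple are hypothetical.

```python
import json
import subprocess
from typing import Dict, List

# Substrings treated as "BLAS-related" (an assumption, adjust as needed).
BLAS_HINTS = ("blas", "lapack", "accelerate", "netlib")


def blas_packages(packages: List[Dict]) -> List[str]:
    """Format name/version/build for every BLAS-related package record."""
    rows = []
    for pkg in packages:
        name = pkg.get("name", "")
        if any(hint in name for hint in BLAS_HINTS):
            rows.append(
                f"{name} {pkg.get('version', '?')} {pkg.get('build_string', '?')}"
            )
    return sorted(rows)


def conda_list(env_name: str) -> List[Dict]:
    """Run `conda list -n ENV --json` and parse its output."""
    result = subprocess.run(
        ["conda", "list", "-n", env_name, "--json"],
        check=True,
        capture_output=True,
        text=True,
    )
    return json.loads(result.stdout)
```

For example, looping over ("np_default", "np_openblas", "np_netlib") and printing blas_packages(conda_list(env)) for each puts the differing build strings side by side.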
Here I use two benchmarks:
- mysvd.py: My SVD decomposition
import time
import numpy as np

np.random.seed(42)
a = np.random.uniform(size=(300, 300))  # fixed 300x300 input matrix
runtimes = 10

timecosts = []
for _ in range(runtimes):
    s_time = time.time()
    # 100 SVDs per run, shifting the matrix before each one
    for i in range(100):
        a += 1
        np.linalg.svd(a)
    timecosts.append(time.time() - s_time)

print(f'mean of {runtimes} runs: {np.mean(timecosts):.5f}s')
- dario.py: A benchmark script by Dario Radečić at the post above.
+-------+-----------+------------+-------------+-----------+--------------------+----+----------+----------+
| sec   | np_veclib | np_default | np_openblas | np_netlib | np_openblas_source | M1 | i9-9880H | i5-6360U |
+-------+-----------+------------+-------------+-----------+--------------------+----+----------+----------+
| mysvd | 1.02300   | 4.29386    | 4.13854     | 4.75812   | 12.57879           | /  | /        | 2.39917  |
+-------+-----------+------------+-------------+-----------+--------------------+----+----------+----------+
| dario | 21        | 41         | 39          | 323       | 40                 | 33 | 23       | 78       |
+-------+-----------+------------+-------------+-----------+--------------------+----+----------+----------+
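To read the table as relative numbers, each timing can be divided by the np_veclib timing of the same benchmark. The sketch below only does that arithmetic on the values copied from the table; the names TIMINGS and slowdown_vs are illustrative, not from the original benchmarks.

```python
# Timings in seconds, copied from the table above ("/" entries are omitted).
TIMINGS = {
    "mysvd": {
        "np_veclib": 1.02300, "np_default": 4.29386, "np_openblas": 4.13854,
        "np_netlib": 4.75812, "np_openblas_source": 12.57879, "i5-6360U": 2.39917,
    },
    "dario": {
        "np_veclib": 21, "np_default": 41, "np_openblas": 39, "np_netlib": 323,
        "np_openblas_source": 40, "M1": 33, "i9-9880H": 23, "i5-6360U": 78,
    },
}


def slowdown_vs(baseline: str, timings: dict) -> dict:
    """Return each timing divided by the baseline timing (1.0 = same speed)."""
    base = timings[baseline]
    return {name: value / base for name, value in timings.items()}


for bench, timings in TIMINGS.items():
    for name, ratio in slowdown_vs("np_veclib", timings).items():
        print(f"{bench:6s} {name:20s} {ratio:6.2f}x")
```

A ratio above 1 means that setup took longer than np_veclib on that benchmark.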
micromamba install numpy "libblas=*=*accelerate" works well.