How to set up cuBLAS on a Windows machine
**warning**
- DO NOT install multiple versions of the CUDA Toolkit.
- Check which CUDA Toolkit versions are compatible with your device:
  https://www.wikiwand.com/en/CUDA#/GPUs_supported
- MUST CHECK which development environment you are using:
  * pipenv in VS Code: use the VS Code terminal.
  * Anaconda: use the Anaconda Prompt.
**install**
1. Install the CUDA Toolkit => https://developer.nvidia.com/cuda-toolkit-archive
2. Install Visual Studio with the options below:
- C++ core features
- MSVC vxxx - VS 2022 C++ x64/x86 build tools
- C++ CMake tools for Windows
- Windows 10/11 SDK
**cuda version check**
- nvcc --version (reports the installed CUDA Toolkit version)
or
- nvidia-smi (reports the driver's maximum supported CUDA version, which may differ from the Toolkit version)
**run the commands below in a legacy Command Prompt (cmd.exe) as administrator, not in PowerShell**
1. Clone llama-cpp-python:
git clone --recursive -j8 https://github.com/abetlen/llama-cpp-python.git
2. Clone llama.cpp:
git clone https://github.com/ggerganov/llama.cpp
3. Move the "llama.cpp" folder into "llama-cpp-python/vendor/".
4. cd llama-cpp-python
5. Run the commands below:
mkdir build
cd build
cmake .. -DLLAMA_CUBLAS=ON
cmake --build . --config Release
6. Set the environment variables below:
set FORCE_CMAKE=1
set CMAKE_ARGS=-DLLAMA_CUBLAS=ON
* You can check a value you set with "echo %FORCE_CMAKE%".
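If you prefer to drive steps 6 and 7 from Python instead of cmd, the environment variables can be prepared like this. This is only a sketch: the pip invocation is commented out so you can review it before running, and it assumes you execute the script from the llama-cpp-python folder.

```python
import os
import subprocess  # used if you uncomment the pip call below

# Copy the current environment and add the build flags from step 6
env = os.environ.copy()
env["FORCE_CMAKE"] = "1"
env["CMAKE_ARGS"] = "-DLLAMA_CUBLAS=ON"

# Equivalent of "echo %FORCE_CMAKE%" in cmd
print(env["FORCE_CMAKE"])

# Uncomment to run step 7 with these variables applied
# (assumes the current directory is the llama-cpp-python checkout):
# subprocess.run(["python", "-m", "pip", "install", "-e", "."], env=env, check=True)
```

Passing `env=` to `subprocess.run` keeps the flags scoped to that one command, so they do not leak into the rest of your shell session.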
7. Run the command below in the llama-cpp-python folder:
python -m pip install -e . --force-reinstall --prefer-binary --no-cache-dir --extra-index-url=https://jllllll.github.io/llama-cpp-python-cuBLAS-wheels/AVX2/cu118
** Check your CUDA Toolkit version and put the matching tag at the end of the URL above.
ex) cu117, cu120, cu122, etc.
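The mapping from the `nvcc --version` output to the cuXYZ suffix in that URL can be sketched in Python. The parsing pattern below is an assumption based on typical nvcc output, which contains a line like "Cuda compilation tools, release 11.8, V11.8.89":

```python
import re

def cuda_wheel_tag(nvcc_output: str) -> str:
    """Turn `nvcc --version` output into a cuXYZ wheel suffix, e.g. cu118.

    Assumes the output contains "release <major>.<minor>" somewhere,
    as typical nvcc builds print.
    """
    match = re.search(r"release (\d+)\.(\d+)", nvcc_output)
    if match is None:
        raise ValueError("could not find a CUDA release number in nvcc output")
    return f"cu{match.group(1)}{match.group(2)}"

# In practice you would feed it the real output, e.g.:
#   subprocess.check_output(["nvcc", "--version"], text=True)
sample = "Cuda compilation tools, release 11.8, V11.8.89"
print(cuda_wheel_tag(sample))  # -> cu118
```

So CUDA 11.8 maps to cu118 and CUDA 12.2 maps to cu122 (not "cu12.2"): the dot is dropped in the wheel index path.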