Installation of MiniCPM-V 4.5 and test scripts
MiniCPM is a family of compact, efficient, open-source language models developed by OpenBMB.
Its multimodal variants, like MiniCPM-V, can understand both text and images, enabling strong performance on vision-language tasks while being small enough to run on consumer-level hardware.
(Thank you AI, for the definition.)
In summary, MiniCPM works like a VLM: if you feed it pictures, it can comment on them.
You can feed it many things: one picture, several pictures, videos, PDFs (with Ollama-OCR)...
For its size it is surprisingly smart.
Honestly, wow...
It can even fit on a mobile phone, and there is a test APK project that you can try.
Check this page for the APK:
https://github.com/harjeb/MiniCPM-V
However (as of September 14, 2025), the mobile phone version is far from working as well as the computer version: it takes ages to load pictures, crashes when the loaded picture is too big, is not efficient enough for a mobile phone, drains your battery way too fast, needs (I would say) at least 16 GB of RAM on your phone, and is slow. Should you install the APK? I uninstalled mine...
So for the moment the Android version is very experimental, but if you are somewhere without internet and need some extra help, it could certainly be the future. The technology doesn't need much more to be usable, and an assistant that tells you how to treat a wound, helps identify a plant, or translates signs offline is certainly interesting. The problem is likely the absence of efficient GPU acceleration on a mobile phone, which makes it far too slow compared to an online solution.
(Ending the digression about MiniCPM on Android phones...)
Here is how to install it from the GitHub source:
While there seems to be a way to install MiniCPM-V 4.5 with Ollama
https://github.com/OpenSQZ/MiniCPM-V-CookBook/blob/main/deployment/ollama/minicpm-v4_5_ollama.md
I didn't test that procedure and I didn't favor it, because I myself had context-length problems when I tried to use GGUF versions of MiniCPM (through another procedure), likely due to a compatibility issue with llama.cpp.
If you read this after September 2025, it is possible that llama.cpp now works with MiniCPM-V 4.5.
So I installed MiniCPM-V 4.5 from the source available on GitHub, and I decided to store the model in a dedicated local directory, since my Hugging Face cache directory is a little bit bloated.
Below is the procedure I suggest for installing the packages/libraries that Python needs to run MiniCPM-V 4.5 on a computer with a CUDA graphics card.
Besides PyTorch being installed without CUDA support, the main problem you may experience with the default MiniCPM-V 4.5 installation procedure is this error:
from transformers import Qwen3Config, PretrainedConfig
ImportError: cannot import name 'Qwen3Config' from 'transformers' (D:\download\AI\minicpm_install\venv\lib\site-packages\transformers\__init__.py)
It is due to an incorrect version of "transformers", so don't follow what is said on the GitHub page of MiniCPM:
"Please ensure that transformers==4.44.2 is installed, as other versions may have compatibility issues."
This advice is likely meant for "MiniCPM-o 2.6", which relies on the Qwen2 classes; MiniCPM-V 4.5 needs Qwen3Config instead, and transformers 4.44.2 does not provide it.
I used transformers==4.56.1 and was able to make it work.
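If you want to verify your environment before going further, the quickest test is to try the exact import that fails (a small sanity-check sketch, to be run inside the activated venv):
import transformers
print(transformers.__version__)  # I used 4.56.1
from transformers import Qwen3Config  # raises the ImportError above on old versions such as 4.44.2
print("Qwen3Config found, transformers is recent enough")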
I coded this prototype script to comment on pictures (chat) and this other one to comment on videos (a single command) with MiniCPM-V 4.5, so you can test them.
This is what I did to make them work:
Create a new virtual environment in the directory where you want to install the script:
python -m venv venv
Activate it:
.\venv\Scripts\activate.bat
Clone the repository:
git clone https://github.com/OpenBMB/MiniCPM-V
cd MiniCPM-V
You only need the line below if you want to use my test scripts; otherwise it is not needed. If you run the basic "chat.py" script that is in the main folder of MiniCPM-V, it will instead download and use the model from the Hugging Face cache.
git clone https://huggingface.co/openbmb/MiniCPM-V-4_5
(Takes time)
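This local clone matters when the model is loaded: with the Hugging Face id, transformers downloads the weights into its cache, while with the clone you point it at the directory you just created. Here is a minimal sketch of the two options, usable once the dependencies below are installed (the relative path assumes you run Python from the MiniCPM-V directory; my scripts may use a slightly different path):
from transformers import AutoModel
# Option 1: let transformers download into the Hugging Face cache (what the stock chat.py does)
# model = AutoModel.from_pretrained('openbmb/MiniCPM-V-4_5', trust_remote_code=True)
# Option 2: use the local clone made just above
model = AutoModel.from_pretrained('./MiniCPM-V-4_5', trust_remote_code=True)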
python.exe -m pip install --upgrade pip
(always a good idea)
pip install wheel
(the requirements include a wheel file)
Download requirements_MiniCPM-V-4_5.txt and put it in the MiniCPM-V directory:
Don't forget to adapt the URL below if needed: https://download.pytorch.org/whl/cu128 is only for CUDA 12.8:
pip install --force-reinstall -r requirements_MiniCPM-V-4_5.txt torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/cu128
or do this instead (faster):
pip install --force-reinstall -r https://m14w.com/img/requirements_MiniCPM-V-4_5.txt torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/cu128
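Once the install finishes, you can quickly check that the CUDA build of PyTorch was actually picked up (a small sanity check; the +cu128 suffix is specific to the CUDA 12.8 wheels):
import torch
print(torch.__version__)          # should end with +cu128 with the CUDA 12.8 wheels
print(torch.cuda.is_available())  # should print True if the GPU is usable
if torch.cuda.is_available():
    print(torch.cuda.get_device_name(0))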
If you came to this page because of an error, with the pip packages only partially installed, make sure to use compatible packages. Refer to requirements_MiniCPM-V-4_5.txt to find which versions I used.
I'll probably soon provide a short manual for these scripts: the one to comment on pictures (chat) and the other one to comment on videos.
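Until then, here is the kind of core such a picture-commenting script is built around, following the usage example from the MiniCPM-V README. This is a minimal sketch: the image path is a placeholder, and the actual scripts do more around this:
import torch
from PIL import Image
from transformers import AutoModel, AutoTokenizer

model_path = './MiniCPM-V-4_5'  # the local clone; use 'openbmb/MiniCPM-V-4_5' for the Hugging Face cache instead
model = AutoModel.from_pretrained(model_path, trust_remote_code=True,
                                  attn_implementation='sdpa', torch_dtype=torch.bfloat16)
model = model.eval().cuda()
tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)

image = Image.open('test.jpg').convert('RGB')  # placeholder picture path
msgs = [{'role': 'user', 'content': [image, 'Describe this picture.']}]
answer = model.chat(msgs=msgs, tokenizer=tokenizer)
print(answer)
For a multi-turn chat, you append the model's answer to msgs as an assistant message and add your next question, which is essentially what a chat loop does.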