Home
What can you ask to MiniCPM-V 4.5 on pictures?
- Details
- Written by: Super User
- Category: tests
- Hits: 76
What can you ask to MiniCPM-V 4.5 on pictures?
In my latest tutorial, I provided some corrected installation method for MiniCPM-V 4.5 and some test scripts, so it's time to use them to see what we can ask to MiniCPM-V 4.5.
Today I will use the script to test the pictures.
Download the script above and put it in your "MiniCPM-V" directory, then activate your venv environment and run it in a command line prompt from the "MiniCPM-V" directory. (If you are used to use automatic1111 or comfyUI, you know what I mean, so I won't explain more).I would also suggest that you read first the "MiniCPM Model License.md" file (open it with notepad) from the same directory.
When you start the script, the main commands are:
Installation of MiniCPM-V 4.5 and test scripts
- Details
- Written by: Super User
- Category: tutorial
- Hits: 67
Installation of MiniCPM-V 4.5 and test scripts
MiniCPM is a family of compact, efficient, open-source language models developed by OpenBMB.
Its multimodal variants, like MiniCPM-V, can understand both text and images, enabling strong performance on vision-language tasks while being small enough to run on consumer-level hardware.
(Thank you AI, for the definition.)
In summary MiniCPM is like a VLM and if you feed it pictures, it can comment on them.
You can feed it many things: one picture, several pictures, videos, pdf (with Ollama-OCR)...
For its size it is surprisingly smart.
Honestly, wow...
It could even fit in a mobile phone and there is a test apk project that you can test.
How to install fastvlm from Apple on a windows computer?
- Details
- Written by: Super User
- Category: tutorial
- Hits: 107
Can you install an Apple VLM model on a Windows system?
Beside a couple of other products that look promising like the Airpod pro 3 (reportedly with live translation). Apple also released, a little bit before, a series of VLM models on huggingface, the FastVLM series:
Problem: its license is extremely restrictive
Second problem: which version of FastVLM are we talking about?
Read more: How to install fastvlm from Apple on a windows computer?
How to install SageAttention2 on a Windows system with ComfyUI?
- Details
- Written by: Super User
- Category: tutorial
- Hits: 140
How to install SageAttention2 on a Windows system with ComfyUI?
As I explained in this recent tutorial on how I optimized a ComfyUI workflow for Wan 2.2, SageAttention is a tool/component that allows you to generate videos faster.
To make the things easier, I'll consider that SINCE YOU HAVE COMFYUI, you already have a python virtual environment (venv) with everything installed and so on, because this will make the things much shorter to explain.
If you come here, I suppose that you are already aware of this kind of command:
Read more: How to install SageAttention2 on a Windows system with ComfyUI?
Comfyui: Is your workflow optimized for your computer?
- Details
- Written by: Super User
- Category: video generation
- Hits: 183
This article, based on my own experimentations with wan 2.2, explains why you should optimize your settings and your nodes with ComfyUI. This means to not just rely on third-party tools...
At the end, I created my own optimized workflow that I link here to do a "First frame, last Frame" video generation with wan 2.2.
If you are interested in the workflow, this link will redirect you to the related CivitAi page. Note, that I still plan to tune it a bit.
Click on "read more" if you want to know how I did...
Read more: Comfyui: Is your workflow optimized for your computer?
Official release of Sora
- Details
- Written by: Super User
- Category: Text-to-Video
- Hits: 1939
Click on the picture below or here where you can find how to use Sora
How to generate super-resolution versions of images ? - Part 2: Basic txt2img upscaling concepts and upscalers in Automatic1111 (img2img and Extras tab)
- Details
- Written by: Super User
- Category: image to image
- Hits: 3705
This is the second part of my tutorials on super-resolutions.
I want to be able to generate pictures above their normal resolutions.
It can be like in this famous "zoom and enhance" trick, to give additional details to pictures.
But it can also be to make generated images larger with a better quality or to remove the blur that would otherwise have an upscaled picture. This is what we are going to see here.
This article is more a serie of experiments that I make with Automatic1111, to see the changes induced by some upscale settings on generated pictures, that's not really a "how to".
How to generate super-resolution versions of images ? - Part 1: restore pictures with img2img
- Details
- Written by: Super User
- Category: image to image
- Hits: 7692
Part 1: How to restore blurry/pixelated pictures with basic img2img in Automatic111 ?
I want to be able to do this famous Hollywood "zoom - enhance" trope trick at home.
I used various tools to get a result and this result varies a lot in function of the technic that I used.
In this article I will cover how to use the img2img tab of Automatic1111, an interface to use "stable diffusion" models, to generate enhanced pictures.
How to use Stable Diffusion 3 models with Automatic1111 ?
- Details
- Written by: Super User
- Category: Text-to-Image
- Hits: 11229
Update: i see that this tutorial is now obsolete, there is no more a sd3 branch of Automatic1111. If you experience any issue, download Forge instead.
Let's make it short: You can't with the normal version but you can with a special branch of the software.

Do you want to know more ?
Read more: How to use Stable Diffusion 3 models with Automatic1111 ?
Microsoft released some courses on Generative AI
- Details
- Written by: Super User
- Category: Uncategorised
- Hits: 984
There are, for the moment, 18 lessons that are available on github.
The github features videos and source code.
Have a look on the screenshot or click on this link to go to the Github of Microsoft.
Page 1 of 6