Hands-on with Cursor Max Mode: Building a Mistral Small 3.1 Multimodal App for $1.7 (With Source Code)
Today, I’m excited to introduce Sonnet Max mode in the latest Cursor release and share my hands-on experience using this powerful feature.
For just $1.7 and two quick calls to Sonnet thinking mode, I successfully built a Mistral Small 3.1 API-powered multimodal application. This app not only performs image analysis and text-based conversations but also includes a convenient chat history review feature.
1. App Demo: Intuitive and User-Friendly
Let’s first take a look at the application interface built with Cursor Max mode:
As you can see, the app features a clean, modern UI with straightforward functionality. Users can easily:
✅ Upload images for analysis
✅ Engage in text conversations with the model
✅ Review past interactions via the history button
At its core, the app leverages Mistral Small 3.1, which provides robust multimodal capabilities.
2. Mistral Small 3.1: Small Model, Big Potential
Mistral Small 3.1 is the latest model from Mistral AI, an upgrade over Small 3 with enhanced multimodal understanding.
Key Features:
🔹 Multimodal Comprehension – Processes both text and images
🔹 Extended Context (128K) – Handles longer conversations and complex queries
🔹 High Performance (24B Parameters) – Outperforms Gemma 3 27B and GPT-4o Mini in many benchmarks
Performance Comparison (Mistral Official Data):
Limitations:
- Math performance slightly lags behind Gemma 3 27B
- Optimized for English & French – East Asian language support still needs improvement
3. Cursor Max Mode: The Ultimate Development Tool
Cursor Max mode is the smartest, longest-context, and most “thoughtful” mode available in Cursor.
- Cost: $0.05 per request & per tool use
- First-time setup: Enable “Usage-based pricing” in settings
- Pro Tip: Set a “Hard Limit” to control costs
4. Step-by-Step Development: From Prototype to Polish
Preparation:
- Created a new
app.py
file in Cursor - Copied Mistral’s official Python code example into
app.py
- Modified the model name to
mistral-small-latest
Initial Build:
- Activated Agent Mode → Max Mode
- Entered the prompt:
Cursor auto-generated the app code, including necessary files and folders.
- First request cost: ~$0.6 (producing a functional MVP)
Optimization & Refinements:
⚠ Caution: The 3.7 thinking mode sometimes adds unnecessary restrictions—always review generated code carefully.
Key Improvements:
🔸 Added support for multiple image sizes/formats
🔸 Fixed Base64-encoded image display in chat history
🔸 Enhanced UI/UX design for better usability
5. Multimodal Testing: Strong but Room for Growth
I tested Mistral Small 3.1’s capabilities with various inputs:
✅ IKEA Sofa Assembly Diagram
- Recognized most parts but misidentified washers as screws
✅ Chart Analysis
- Accurately interpreted trends, peaks, and volatility
⚠ Chinese Comic Recognition
- Mistranslated “轻音乐” (light music) as “can’t hear”
- Needs improvement in Chinese OCR
✅ English Chart Recognition
- Provided precise metric estimations
Verdict: Excellent with English/Western content, but East Asian language support needs work.
Source Code Available!
Want to try this app yourself? Grab the code here:
🔗 https://github.com/nicekate/mistral-flask-app
Final Thoughts:
For just $1.7, Cursor Max mode + Mistral Small 3.1 delivers impressive multimodal functionality. While not perfect (especially for non-Latin scripts), it’s a cost-effective way to prototype AI-powered apps.
What will you build with it? 🚀