Running local models on an M4 with 24GB memory

TL;DR

A software engineer demonstrates running smaller AI models locally on a 24GB M4 MacBook Pro. While not comparable to state-of-the-art models, it offers a usable experience for basic tasks, reducing reliance on cloud services.

A software engineer has demonstrated that it is possible to run certain smaller AI models locally on a 24GB M4 MacBook Pro, enabling basic tasks without internet access.The engineer experimented with various models and setup options, ultimately achieving a workable configuration with Qwen 3.5 9B (Q4) running on LM Studio. This model can perform tasks such as code suggestions and research, but it does not match the capabilities of larger, state-of-the-art models. The setup involves complex configuration, including model selection, inference settings, and enabling features like ‘thinking’ mode. Performance is limited to approximately 40 tokens per second, and the model sometimes gets distracted or loops. Despite these limitations, the setup allows users to reduce dependence on cloud-based AI services and operate offline, which is especially relevant for privacy-conscious users or those with limited internet bandwidth.

Why It Matters

This development shows that accessible hardware like a 24GB M4 MacBook Pro can handle smaller AI models effectively, opening possibilities for offline AI use, privacy preservation, and reduced reliance on large cloud providers. While not replacing high-end AI, it democratizes basic AI tasks for more users, especially developers and researchers, and highlights the growing viability of local AI deployment.

Apple 2024 MacBook Pro with Apple M4 Pro Chip (16-inch, 24GB RAM, 512GB SSD Storage) (QWERTY English) Space Black (Renewed)

Apple 2024 MacBook Pro with Apple M4 Pro Chip (16-inch, 24GB RAM, 512GB SSD Storage) (QWERTY English) Space Black (Renewed)

SUPERCHARGED BY M4 PRO OR M4 MAX — The 16-inch MacBook Pro with the M4 Pro or M4…

As an affiliate, we earn on qualifying purchases.

As an affiliate, we earn on qualifying purchases.

Background

Recent advances in AI model compression and optimization have made it feasible to run smaller models locally. Previous efforts focused on cloud-based solutions; however, hardware improvements and software tools now enable more capable local inference. The experiment aligns with ongoing trends toward edge AI and privacy-focused computing. The engineer’s setup builds on existing open-source tools like llama.cpp, LM Studio, and Pi, which facilitate local model deployment. Prior to this, most users relied on remote cloud services for AI tasks, with local options limited to very small models or requiring specialized hardware.

“It’s surprisingly good for something that can run on a 24GB MacBook Pro while leaving space for other applications.”

— Johanna Larsson, Software Engineer

“While it’s not as powerful as SOTA models, it encourages a more engaged workflow and reduces dependency on big tech cloud services.”

— Johanna Larsson

Domain-Specific Small Language Models: Efficient AI for local deployment

Domain-Specific Small Language Models: Efficient AI for local deployment

As an affiliate, we earn on qualifying purchases.

As an affiliate, we earn on qualifying purchases.

What Remains Unclear

It is not yet clear how well this setup will scale with different models or tasks, or how it performs over extended use. The performance and stability may vary depending on hardware, configuration, and model choice, and further testing is needed to establish broader applicability.

Amazon

AI inference hardware for MacBook

As an affiliate, we earn on qualifying purchases.

As an affiliate, we earn on qualifying purchases.

What’s Next

Further experimentation will focus on optimizing configurations, testing additional models, and assessing long-term stability. Developers may explore automating setup processes and improving model performance, with potential community sharing of best practices. Future updates could include support for larger context windows or more advanced model features as hardware and software tools evolve.

AI at the Edge: Solving Real-World Problems with Embedded Machine Learning

AI at the Edge: Solving Real-World Problems with Embedded Machine Learning

As an affiliate, we earn on qualifying purchases.

As an affiliate, we earn on qualifying purchases.

Key Questions

Can I run larger models on my MacBook Pro with 24GB RAM?

Currently, only smaller models like Qwen 3.5 9B (Q4) are feasible. Larger models require more memory and computational power, making them impractical on this hardware.

What are the main challenges in setting up local models?

Configuring the model, enabling features like ‘thinking’ mode, and optimizing inference settings require technical expertise and trial-and-error. Compatibility issues and performance tuning are common hurdles.

Does running models locally compromise their capabilities compared to cloud-based models?

Yes, smaller local models typically lack the complexity and long-term reasoning abilities of state-of-the-art cloud models. They are suitable for basic tasks and research but not for solving complex, multi-step problems.

What are the benefits of running models locally?

Offline operation, enhanced privacy, reduced reliance on internet connectivity, and potential cost savings are key benefits. It also allows more control over the environment and data.

You May Also Like

Team USA Dominates Canada in Basketball Showdown

Dive into Team USA's dominant performance against Canada in basketball, featuring standout players and a preview of their journey to the Paris Olympics.

Explore the Miscalculations in AI Marketing That Are Costing Brands Millions in Lost Opportunities.

Learn how miscalculations in AI marketing strategies could be draining your budget, but the solution might be simpler than you think.

Flipper One – we need your help

The creators of Flipper One announce an open development process for their Linux-based hardware, calling for community contributions to achieve full open-source support.

Capricorn Traits Unveiled: Yolanda Hadid's Revelation

Open the door to the world of Capricorn traits through Yolanda Hadid's story, revealing a tapestry of strength, ambition, and unwavering determination.