How to run local LLMs on your smartphone

Our smartphones are essentially pocket computers, with capabilities that rival those of desktops and laptops in many ways. From productivity to gaming, there is a lot you can do to get the most out of your phone. For example, you can run an LLM entirely on your smartphone without an internet connection. The only downside is that you need a reasonably high-end phone to run any of these models.

If you want to run an AI model on a smartphone, the first thing to know is that running virtually any model requires a lot of RAM. This is the same reason you need plenty of VRAM when working with applications like Stable Diffusion, and it applies to text-based models too. These models are typically loaded into RAM for the duration of the workload, which is much faster than executing from storage.

RAM is faster for several reasons, but the two most important are its lower latency, since it sits closer to the CPU, and its higher bandwidth. Because of these characteristics, large language models (LLMs) need to be loaded into RAM, and the next question that usually arises is exactly how much RAM these models use. Vicuna-7B is one of the most popular models that anyone can run. It is an LLM with 7 billion parameters that can be deployed on an Android smartphone through MLC LLM, a universal app for deploying LLMs. Interacting with the model on an Android smartphone requires around 6 GB of RAM, which is not a high hurdle these days.
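You can sanity-check that 6 GB figure with some back-of-the-envelope math: each parameter takes a fixed number of bits, so a 7B model quantized to 4 bits per weight comes to roughly 3.5 GB of weights before runtime overhead. Here is a minimal Python sketch of that arithmetic; the flat 1 GB overhead for the runtime and KV cache is an assumption for illustration, not a measured value.

```python
# Rough RAM estimate for an LLM's weights at a given quantization level.
# This is back-of-the-envelope math; real usage also includes the
# runtime, activations, and KV cache, approximated here by a flat
# overhead (an assumption, not a measured value).

def estimate_ram_gb(params_billions: float, bits_per_param: float,
                    overhead_gb: float = 1.0) -> float:
    """Approximate RAM needed: weight bytes plus a fixed overhead."""
    weight_gb = params_billions * 1e9 * bits_per_param / 8 / 1e9
    return weight_gb + overhead_gb

# Vicuna-7B at 4-bit quantization (typical for mobile deployments):
print(f"{estimate_ram_gb(7, 4):.1f} GB")   # ~4.5 GB, in line with the ~6 GB figure above
# The same model unquantized at 16 bits would never fit on a phone:
print(f"{estimate_ram_gb(7, 16):.1f} GB")  # ~15 GB
```

This is why mobile deployments lean so heavily on quantization: dropping from 16-bit to 4-bit weights cuts the memory footprint by roughly a factor of four.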


If you are interested in running an LLM on your smartphone, read on because you will be surprised how easy it is.


How to use MLC to run LLMs on your smartphone

It is a very simple application

To download and run LLMs on your smartphone, you can use MLC LLM, an app that deploys and loads models for you. These models can then be run within the app, which takes care of loading them into RAM and executing them. In some ways, it's a bit like a mobile version of LM Studio, the desktop app for running local LLMs on a PC or Mac.
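The mobile apps are the focus here, but the same MLC LLM project also ships a Python package with an OpenAI-style chat API, which gives a sense of what the app is doing under the hood. Below is a minimal sketch based on the project's documented MLCEngine interface; it assumes a recent mlc-llm build installed per the project's instructions, and the model ID is just an example of an MLC-compiled model, so substitute whichever one you want to try.

```python
# Minimal sketch of MLC LLM's Python API (MLCEngine), assuming a recent
# mlc-llm installation. The model ID is an example placeholder;
# substitute any MLC-compiled model from Hugging Face.
from mlc_llm import MLCEngine

model = "HF://mlc-ai/Llama-3-8B-Instruct-q4f16_1-MLC"  # example model ID
engine = MLCEngine(model)

# Stream a chat completion, OpenAI-style.
for response in engine.chat.completions.create(
    messages=[{"role": "user", "content": "Why do LLMs need so much RAM?"}],
    model=model,
    stream=True,
):
    for choice in response.choices:
        print(choice.delta.content or "", end="", flush=True)

engine.terminate()  # shut the engine down cleanly
```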


MLC LLM supports Vicuna 7B and RedPajama 3B out of the box, but you can also give it a URL to load a different model manually. There are plenty of models on Hugging Face worth trying, in a range of sizes, but it's best not to go beyond a 7B-parameter model; anything larger will likely use more RAM than your phone has.
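Before downloading a model, it can help to check how large its files are and compare that against your phone's RAM and storage. One way to do this is a short sketch with the huggingface_hub Python library, which can sum the file sizes in a model repo; the repo ID below is an illustrative example, not a recommendation.

```python
# Sketch: estimate a model's download size by summing the file sizes in
# its Hugging Face repo. Requires `pip install huggingface_hub`; the
# repo ID is an example, so substitute the model you want to check.
from huggingface_hub import HfApi

repo_id = "lmsys/vicuna-7b-v1.5"  # example repo ID
info = HfApi().model_info(repo_id, files_metadata=True)

total_bytes = sum(sibling.size or 0 for sibling in info.siblings)
print(f"{repo_id}: ~{total_bytes / 1e9:.1f} GB to download")
```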

To set up an LLM on your smartphone, follow these steps:

  1. Download and install MLC LLM.
  2. Download one of the models shown, or add one manually from Hugging Face.
  3. Open the model and wait for it to load.


That's all! MLC makes it incredibly easy to get an LLM up and running on your smartphone, and if you want to try one out, this is by far the best way. MLC also has an iOS app, available on the App Store. There are hundreds of models on Hugging Face, so check them out and see what you can find!
