MLC LLM

MLC LLM is a universal solution that allows any language model to be deployed natively on a diverse set of hardware backends and native applications.

Please visit Getting Started for detailed instructions.
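As a quick taste of the workflow before the platform-specific demos below, here is a minimal sketch that chats with a model through the mlc_llm Python package's OpenAI-style API. The model identifier is only an example, and the exact API surface may differ across releases; the Getting Started page is the authoritative reference.

```python
# Minimal sketch: chatting with an MLC-compiled model from Python.
# The model identifier is an example; substitute any MLC-compiled model.
from mlc_llm import MLCEngine

model = "HF://mlc-ai/Llama-3-8B-Instruct-q4f16_1-MLC"
engine = MLCEngine(model)

# Stream a chat completion and print tokens as they arrive.
for response in engine.chat.completions.create(
    messages=[{"role": "user", "content": "What is MLC LLM?"}],
    model=model,
    stream=True,
):
    for choice in response.choices:
        print(choice.delta.content, end="", flush=True)
print()

engine.terminate()
```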

Demos

iOS

Our iOS app, MLCChat, is available on the App Store for iPhone and iPad. The app has been tested on the iPhone 15 Pro Max, iPhone 14 Pro Max, iPhone 14 Pro, and iPhone 12 Pro. Besides the Getting Started page, documentation is available for building iOS apps with MLC LLM.

Note: Llama-7B takes 4GB of RAM to run and RedPajama-3B takes 2.2GB. We recommend a recent device with 6GB of RAM for Llama-7B, or 4GB of RAM for RedPajama-3B, to run the app. Text generation speed may vary from time to time, for example starting slow and then recovering to normal speed.

Android

The demo APK is available for download. The demo has been tested on a Samsung S23 (Snapdragon 8 Gen 2), a Redmi Note 12 Pro (Snapdragon 685), and Google Pixel phones. Besides the Getting Started page, documentation is available for building Android apps with MLC LLM.

Windows, Linux, Mac

Our C++ interface runs on AMD, Intel, Apple, and NVIDIA GPUs. Besides the Getting Started page, documentation is available for building C++ apps with MLC LLM.
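To illustrate targeting a specific GPU backend from the Python API, here is a hedged sketch assuming the `device` argument of `MLCEngine` (by default the runtime picks a suitable local GPU automatically):

```python
# Sketch: selecting a GPU backend explicitly. Assumes MLCEngine accepts a
# `device` argument; "auto" (the default) lets the runtime detect one.
from mlc_llm import MLCEngine

model = "HF://mlc-ai/Llama-3-8B-Instruct-q4f16_1-MLC"  # example model id

# Common device strings:
#   "cuda"   - NVIDIA GPUs
#   "rocm"   - AMD GPUs
#   "vulkan" - AMD/Intel/NVIDIA GPUs via Vulkan
#   "metal"  - Apple GPUs
engine = MLCEngine(model, device="vulkan")
```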

Web Browser

WebLLM is our companion project that deploys MLC LLM natively to browsers using WebGPU and WebAssembly. Everything runs inside the browser with no server resources, accelerated by the local GPU (e.g. AMD, Intel, Apple, or NVIDIA).

Disclaimer

The pre-packaged demos are subject to the licenses of the underlying models.