📖 Overview
🔹 A Flutter application for private, on-device Large Language Model (LLM) inference. Built on llama.cpp, it runs quantized GGUF models directly on the user's device with no internet connection required, so prompts and responses never leave the device. The app supports common quantized model formats and provides a responsive chat interface.
💻 Tech Stack & More
✅ Flutter Framework
✅ llama.cpp / GGUF support
✅ On-device Inference
✅ Model Quantization
✅ Responsive UI
✅ Secure Local Storage
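
The "Model Quantization" item refers to compressing model weights into low-bit formats, such as the 4-bit GGUF variants llama.cpp can load. As a rough illustration of the general idea (a blockwise scale plus small integers), here is a minimal sketch; it is not llama.cpp's actual quantization algorithm, and the function names are hypothetical:

```python
# Illustrative sketch of blockwise 4-bit quantization: the general idea
# behind low-bit GGUF formats, NOT llama.cpp's exact scheme.
from typing import List, Tuple

def quantize_block(weights: List[float]) -> Tuple[float, List[int]]:
    """Map a block of float weights to one float scale plus signed 4-bit ints (-8..7)."""
    amax = max(abs(w) for w in weights) or 1.0
    scale = amax / 7.0  # 7 is the largest positive signed 4-bit value
    q = [max(-8, min(7, round(w / scale))) for w in weights]
    return scale, q

def dequantize_block(scale: float, q: List[int]) -> List[float]:
    """Reconstruct approximate float weights from the quantized block."""
    return [scale * v for v in q]

block = [0.12, -0.07, 0.31, -0.28]
scale, q = quantize_block(block)
approx = dequantize_block(scale, q)
# Each reconstructed weight lands within one quantization step of the original.
assert all(abs(a - b) <= scale for a, b in zip(approx, block))
```

Storing one scale per block plus 4-bit integers is what lets multi-gigabyte FP16 models shrink to a fraction of their size, which is why quantized GGUF files fit in mobile device memory at all.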