Aana is open intelligence.

'Aana' (ആന) means 'elephant' in Malayalam.

Open-source multimodal AI that delivers high accuracy with just 10% of the compute, cutting time and cost across text, audio, and video for accessible, affordable enterprise AI.

Aana on GitHub
Contact us

Fresh from the Lab: Our Latest Innovations

Re-Distilling Smaller DeepSeek R1 Models for Better Performance
Learn how we enhanced the distilled DeepSeek R1 models using logits distillation, achieving +4-14% gains on GSM8K while keeping training costs as low as $3-$18 per run. (See the distillation sketch below.)
Accelerating LLM Inference with GemLite, TorchAO, and SGLang
We collaborated with the PyTorch team and the SGLang team at LMSYS to integrate GemLite into TorchAO and SGLang, unlocking faster and more efficient LLM inference. (See the quantization sketch below.)
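For the curious, here is a minimal sketch of the logits-distillation loss behind the R1 results above: the student is trained to match the teacher's temperature-scaled output distribution while still seeing the hard labels. The temperature, mixing weight, and shapes below are illustrative assumptions, not the exact recipe from the post.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=2.0, alpha=0.5):
    """Blend a soft-target KL term (match the teacher) with hard-label CE.

    T: softmax temperature that softens both distributions.
    alpha: weight on the distillation term vs. the cross-entropy term.
    Both hyperparameters are illustrative, not the values from the post.
    """
    # KL(teacher || student) on temperature-scaled distributions; the T**2
    # factor keeps gradient magnitudes comparable across temperatures.
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)
    # Standard next-token cross-entropy on the ground-truth labels.
    hard = F.cross_entropy(
        student_logits.reshape(-1, student_logits.size(-1)), labels.reshape(-1)
    )
    return alpha * soft + (1.0 - alpha) * hard

# Toy usage: batch of 2 sequences, length 8, vocabulary of 100.
s = torch.randn(2, 8, 100, requires_grad=True)
t = torch.randn(2, 8, 100)
y = torch.randint(0, 100, (2, 8))
distillation_loss(s, t, y).backward()
```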
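And as a rough illustration of where GemLite plugs in: TorchAO exposes a one-call weight-only quantization API, and torch.compile can then fuse dequantization into fast low-bit kernels (GemLite kernels, where available, are one such backend). This is a hedged sketch; exact import paths, layouts, and kernel selection vary by TorchAO version and hardware, and the model id is just an example.

```python
import torch
from transformers import AutoModelForCausalLM
from torchao.quantization import quantize_, int4_weight_only  # names vary by version

# Illustrative model; any HF causal LM on a CUDA device works the same way.
model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Meta-Llama-3-8B", torch_dtype=torch.bfloat16
).cuda()

# Swap every nn.Linear weight for packed int4 tensors, in place.
quantize_(model, int4_weight_only(group_size=64))

# Compilation fuses dequant + matmul into the optimized low-bit kernels.
model = torch.compile(model, mode="max-autotune")
```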

10x smaller model footprint

We've developed extreme quantization techniques that retain high accuracy, including the world's first usable pure 1-bit LLM. (See the HQQ sketch after these highlights.)

10-50x faster

Our kernels run Llama-3-8b at 200 tokens/sec on consumer GPUs.

Up to 40x cheaper

Our optimizations let you run 40B-parameter LLMs on consumer gaming GPUs, drastically reducing cost (e.g., Llama-3-70b on an A6000 for $0.50/hour).
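The quantizer behind these footprint numbers is HQQ, our open-source, calibration-free method (see the highlights below). As a minimal sketch of the hqq package's API, a single linear layer can be quantized to very low bit-widths in one call; the layer size and settings here are illustrative.

```python
import torch
from hqq.core.quantize import BaseQuantizeConfig, HQQLinear

# A toy fp16 layer standing in for one linear layer of an LLM.
linear = torch.nn.Linear(4096, 4096, bias=False).half().cuda()

# Calibration-free config; nbits=1 is the extreme 1-bit setting, and
# group_size trades accuracy against compression (values illustrative).
cfg = BaseQuantizeConfig(nbits=1, group_size=64)

# Replaces the fp16 weights with packed low-bit weights + per-group metadata.
qlinear = HQQLinear(linear, quant_config=cfg,
                    compute_dtype=torch.float16, device="cuda")

x = torch.randn(1, 4096, dtype=torch.float16, device="cuda")
y = qlinear(x)  # weights are dequantized on the fly in the forward pass
```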

Advancing the Frontier of GenAI

GemLite: Towards Building Custom Low-Bit Fused CUDA Kernels
Simple CUDA kernels that help developers create their own low-bit fused General Matrix-Vector Multiplication (GEMV) code.
Introducing the open-source Aana SDK
Our open-source SDK, powering the future of multimodal AI applications.


HQQ+: towards 1-bit models
Extreme low-bit quantization for greater compute efficiency.
Aanaphi-2
Our fast and capable 3B small language model, ranked #1 on the Open LLM Leaderboard (Feb 2024).
Half-Quadratic Quantization (HQQ)
A faster, more accurate quantization technique that requires no calibration data.
Faster & smaller Whisper for ASR
A deep dive into quantization and torch compilation. (A usage sketch follows this list.)
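To make the Whisper work concrete, here is a hedged sketch of a compile-friendly inference pattern: a static KV cache keeps tensor shapes fixed so torch.compile can reuse its graph. The checkpoint, compile mode, and cache setting are illustrative and depend on your transformers version; the full recipe, including the quantization settings, is in the blog post.

```python
import numpy as np
import torch
from transformers import AutoProcessor, WhisperForConditionalGeneration

model_id = "openai/whisper-large-v3"  # illustrative checkpoint
processor = AutoProcessor.from_pretrained(model_id)
model = WhisperForConditionalGeneration.from_pretrained(
    model_id, torch_dtype=torch.float16
).to("cuda")

# Fixed-shape KV cache so the compiled graph is reused across decode steps
# (support depends on your transformers version).
model.generation_config.cache_implementation = "static"
model.forward = torch.compile(model.forward, mode="reduce-overhead")

# Placeholder audio: 30 s of silence at 16 kHz; use real audio in practice.
audio = np.zeros(16000 * 30, dtype=np.float32)
inputs = processor(audio, sampling_rate=16000, return_tensors="pt")
features = inputs.input_features.to("cuda", torch.float16)

tokens = model.generate(features)
print(processor.batch_decode(tokens, skip_special_tokens=True))
```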

Existing capabilities

Summarization
Automatically analyze videos with remarkable nuance and capability.
Search
Converse with a system that understands your intent.
Recommendations
Extract similar and related content from first-party data, in a non-intrusive and fully private manner.
Data analytics
Ask, not code. Express yourself in natural language to answer complex questions.
Have a use case to discuss?
Contact us

About us

We are experts in multimodal artificial intelligence, computer vision, and audio recognition who build efficient models and systems for processing large data streams.

Our team holds three Best PhD Awards in machine learning from European institutions, with 100+ publications and 10,000+ citations.

Mobius Labs is based in Berlin, Germany.

30+ companies already trust us globally.

X
LinkedIn
Blog
Imprint
Mobius Labs GmbH receives additional funding from the ProFIT program of the Investment Bank of Berlin. The goal of the ProFIT project “Superhuman Vision 2.0 for every application: no-code, customizable, on-premise AI solutions” is to revolutionize the way we work with technical images. The project is co-financed by the European Fund for Regional Development (EFRE). Within the ProFIT project, we are exploring models that can recognize various objects and keywords in images and can also detect and segment these objects at specific pixel locations. Furthermore, we are investigating the application of ML algorithms on edge devices, including satellites, mobile phones, and tablets. Additionally, we will explore the combination of multiple modalities, such as audio embedded in videos and the language extracted from that audio.