We've developed extreme quantization techniques that retain high accuracy, including the world's first usable pure 1-bit LLM.
Our kernels run Llama-3-8B at 200 tokens/sec on consumer GPUs.
Our optimizations let you run 40B-parameter LLMs on consumer gaming GPUs, drastically reducing cost (e.g., Llama-3-70B on an A6000 for $0.50/hour).
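To give a feel for what 1-bit quantization means, here is a minimal sketch of a generic sign-based scheme: each weight is stored as a single sign bit plus one shared per-tensor scale (the mean absolute value). This is illustrative only; the function names are placeholders and the project's actual quantizer may use a different scale or grouping.

```python
import numpy as np

def quantize_1bit(w: np.ndarray):
    """Binarize a weight tensor to {-1, +1} with a per-tensor scale.

    Illustrative sign-based scheme (scale = mean |w|); the project's
    actual quantizer may differ.
    """
    scale = float(np.abs(w).mean())   # one scaling factor for the tensor
    signs = np.where(w >= 0, 1, -1)   # the 1-bit codes
    return signs.astype(np.int8), scale

def dequantize_1bit(signs: np.ndarray, scale: float) -> np.ndarray:
    """Reconstruct an approximate weight tensor from signs and scale."""
    return signs.astype(np.float32) * scale

# Roughly 16x storage reduction vs FP16: one bit per weight plus one scale.
w = np.random.randn(4, 4).astype(np.float32)
signs, scale = quantize_1bit(w)
w_hat = dequantize_1bit(signs, scale)
```

The payoff is memory: at one bit per weight, a model that needs ~16 GB in FP16 fits in roughly 1 GB of weight storage, which is what makes large models feasible on consumer GPUs.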