
MoffettAI
MoffettAI is an AI chip design company specializing in sparse computing for high‑performance, energy‑efficient AI inference in data centers and edge environments. Founded in 2018 and headquartered in Shenzhen with offices in Shanghai, Beijing and Silicon Valley, the company provides cloud and endpoint AI acceleration platforms that support computer vision, natural language processing, multimodal models and other large‑scale AI workloads.
MoffettAI’s core innovation is its proprietary dual‑sparsity architecture and Antoum series AI chips, which support sparsity ratios up to 32× and are designed specifically for large‑scale inference. These chips power the company’s AI accelerator cards and inference engine, enabling customers in internet, telecom, smart city, life sciences, autonomous driving and other industries to run complex models with higher efficiency and lower total cost of ownership.
Main products and technologies
Antoum AI chip family
Antoum is a high‑performance, general‑purpose programmable AI chip that supports CNN, RNN, LSTM, Transformer, BERT and other mainstream network architectures, as well as both floating‑point and fixed‑point data types. By combining dual‑sparsity algorithms with a redesigned AI chip architecture, Antoum achieves up to 32× sparsity, significantly improving effective compute utilization and energy efficiency for inference workloads.
Sparse AI accelerator cards (S4, S10, S30, S100, S300)
MoffettAI’s accelerator cards integrate Antoum chips and deliver ultra‑high performance and energy efficiency for data‑center AI inference, supporting computer vision, NLP, recommendation, speech and other large‑scale applications. Compared with GPU‑based solutions, these cards can provide several times the effective compute at similar hardware cost while reducing power consumption and TCO.
Sparse AI inference engine and software stack
The company offers a full software stack including the Moffett NNKit and NNCompressor toolchain, which supports mainstream frameworks such as TensorFlow, PyTorch and MXNet and provides 4–32× model sparsification while maintaining accuracy. This allows customers to compress and deploy large models onto a single card with lower latency, simplifying migration from existing GPU environments.
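To make the sparsity ratios concrete: an N× sparsity ratio corresponds to keeping roughly 1 in N weights, so a 32× ratio retains about 3% of a model's weights. The sketch below shows simple magnitude‑based pruning to a target ratio. It is a generic illustration of the concept only, not the Moffett NNCompressor API; the function name and logic are hypothetical, and production sparsification pipelines typically also fine‑tune the pruned model to preserve accuracy.

```python
# Illustrative only: magnitude pruning to a target sparsity ratio.
# A ratio of N means we keep the 1/N fraction of weights with the
# largest magnitudes and zero out the rest.

def prune_to_ratio(weights, sparsity_ratio):
    """Keep the largest-magnitude len(weights)//sparsity_ratio weights,
    zeroing all others. E.g. ratio 32 keeps ~1 in 32 weights."""
    n_keep = max(1, len(weights) // sparsity_ratio)
    # Indices of the n_keep largest-magnitude weights.
    keep = set(sorted(range(len(weights)),
                      key=lambda i: abs(weights[i]),
                      reverse=True)[:n_keep])
    return [w if i in keep else 0.0 for i, w in enumerate(weights)]

weights = [0.9, -0.05, 0.4, 0.01, -0.8, 0.02, 0.3, -0.6]
pruned = prune_to_ratio(weights, 4)  # ratio 4: keep 8 // 4 = 2 weights
print(pruned)  # → [0.9, 0.0, 0.0, 0.0, -0.8, 0.0, 0.0, 0.0]
```

The remaining zeros are what sparse hardware exploits: a sparsity‑aware architecture can skip the zeroed multiplications entirely, which is the source of the effective‑compute and energy‑efficiency gains described above.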
Key advantages
Leadership in sparse AI computing
MoffettAI positions itself as a global leader in sparse computing, with a proprietary dual‑sparsity architecture and commercial AI chips that support up to 32× sparsity and deliver order‑of‑magnitude gains in performance per watt.
End‑to‑end “chip + card + software” ecosystem
By providing AI chips, accelerator cards and a complete inference software stack, the company offers a unified platform that is compatible with mainstream AI frameworks and easier to adopt in existing data‑center and edge‑computing environments.
Broad application coverage and strong efficiency gains
MoffettAI’s solutions are used in data‑center AI, internet services, telecom, smart city, finance, education, healthcare, industrial manufacturing, energy and other sectors, enabling customers to run vision, NLP, speech and recommendation models at lower cost and power while maintaining high accuracy.

