Khatrimazafullnet Fixed - The
Abstract

We introduce KhatrimazaFullNet-Fixed, a fixed-point variant of the KhatrimazaFullNet architecture designed for resource-constrained devices that perform multimodal (image, audio, text) inference and continual on-device learning. By combining block-wise quantization, low-rank weight factorization, and a stability-preserving fixed-point optimizer, our method reduces memory footprint and energy use while maintaining accuracy and training stability. Experiments on image classification (CIFAR-100), audio keyword spotting (Speech Commands), and multimodal retrieval (an MS-COCO subset) show that KhatrimazaFullNet-Fixed achieves up to an 8× reduction in model size, 3–5× lower inference energy, and less than 2% absolute accuracy loss relative to full-precision baselines. On-device continual updates using the fixed-point optimizer avoid the catastrophic divergence typical of quantized training. We release code and profiling scripts to facilitate reproducible evaluation on mobile NPUs.
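To make the block-wise quantization mentioned in the abstract concrete, the following is a minimal sketch rather than the released implementation. The abstract does not state the block size, bit width, or scaling rule, so the symmetric 8-bit scheme with one max-abs scale per block, and the names `quantize_blockwise` and `dequantize_blockwise`, are assumptions made purely for illustration.

```python
from typing import List, Tuple


def quantize_blockwise(
    weights: List[float], block_size: int = 4, bits: int = 8
) -> Tuple[List[int], List[float]]:
    """Symmetric block-wise quantization (illustrative assumption, not the paper's scheme).

    Each block of `block_size` weights gets its own scale, chosen so that the
    largest magnitude in the block maps to the largest representable integer.
    """
    qmax = 2 ** (bits - 1) - 1  # 127 for signed 8-bit codes
    codes: List[int] = []
    scales: List[float] = []
    for start in range(0, len(weights), block_size):
        block = weights[start:start + block_size]
        max_abs = max(abs(w) for w in block)
        # An all-zero block would give a zero scale; fall back to 1.0.
        scale = max_abs / qmax if max_abs > 0 else 1.0
        scales.append(scale)
        # Round to the nearest integer code and clamp to the signed range.
        codes.extend(max(-qmax, min(qmax, round(w / scale))) for w in block)
    return codes, scales


def dequantize_blockwise(
    codes: List[int], scales: List[float], block_size: int = 4
) -> List[float]:
    """Map integer codes back to approximate real values using per-block scales."""
    return [q * scales[i // block_size] for i, q in enumerate(codes)]


# Two blocks with very different dynamic ranges.
weights = [0.5, -1.0, 0.25, 0.0, 100.0, -50.0, 25.0, 10.0]
codes, scales = quantize_blockwise(weights, block_size=4)
restored = dequantize_blockwise(codes, scales, block_size=4)
max_err = max(abs(a - b) for a, b in zip(weights, restored))
```

In this toy input, a single global scale would be set by the value 100.0 and would round 0.25 to zero. With one scale per block, the small weights in the first block keep their own resolution, at the cost of storing one extra scale value per block.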