Testing Qwen3.6-35B-A3B on RTX 3060 for Receipt-to-JSON Extraction

A user replaced Google Vision in a receipt processing pipeline with the local Qwen3.6-35B-A3B model running on an RTX 3060 GPU. The experiment demonstrated that the local setup could successfully parse key fields from Japanese receipts into JSON format.

Hardware: RTX 3060 12GB VRAM, utilizing llama.cpp and Qwen3.6-35B-A3B 12GB-target GGUF quantization.
Accuracy: Consistent extraction of store, date, subtotal, tax, and total fields across approximately 30 Japanese receipts.
Performance: Average processing time of ~31.75 seconds per receipt with ~11.06 GiB peak VRAM usage.

This approach offers a viable local alternative to cloud-based OCR services for document extraction tasks without requiring external API dependencies.