VoxieCam was built by Marko Balabanovic in early 2026 to test how much can be done in a few days with agentic AI engineering. You can read about how VoxieCam was created on the Whizzy Ideas blog: "Agentic Engineering: 5 lessons from a 4-day AI experiment".
The app recognises objects in real time using an AI model that runs on the phone itself ("MobileCLIP", created by Apple in 2024). This is a very small AI model (small enough to run quickly on a phone!), and the trade-off is that it is only around 70% accurate. If the first word card isn't right, you can usually swipe through the others to find the one you're looking for.
We'll be adding more languages, but for now the app can only translate from English to Serbian. The written and audio Serbian translations are all built into the app.