OCR documents, transcribe speech in 40+ languages, classify images, detect documents and barcodes, smart-crop photos, upscale with Core ML, and colorize black-and-white photos. Every model runs on your device's Neural Engine, so your content stays private.

Extract printed and handwritten text from photos using Vision.
Transcribe audio in 40+ languages using Apple's Speech framework.
Detect the language of any block of text.
Identify objects, scenes and concepts in any photo.
Find faces with bounding boxes and landmark points.
Auto-crop to the salient subject of the image.
Find document edges in a photo for scanning.
Read every common barcode and QR code type.
Generate QR codes from URLs, contacts, Wi-Fi credentials and more.
2× and 4× upscale with DRCT or RealESRGAN models.
Restore old or damaged photos with InstructIR.
Bring black-and-white photos to life with DDColor.
Models like HVI-CIDNet (8 MB), Adaptive 3DLUT (5 MB), FBCNN (70 MB), InstructIR (64 MB).
DRCT, DDColor, GFPGAN, BiSeNet for advanced edits.
Depth Anything, RealESRGAN x4, NAFNet, CodeFormer, LaMa.
Yes. OCR uses Apple's Vision framework, which runs entirely on-device. No internet connection is needed and your photos are never uploaded.
Filemorph supports speech-to-text in 40+ languages including English, Spanish, French, German, Japanese, Korean, Chinese, Russian and Arabic, using Apple's on-device Speech framework.
Yes, the Core ML models are downloaded once on first use, then run locally on the Neural Engine for every subsequent conversion. Models are tier-based: small for iPhone, larger for iPad Pro and Mac.
No. After the initial model download, every AI operation runs on your device. None of your content ever leaves it.
13 AI tools running on your Neural Engine.
Download on the App Store