GuideDog

Authors: Kyuhee Jo, Steven Gunarso, Jacky Wang, Raghav Sharma

GuideDog is an AI/ML-based mobile app designed to assist the visually impaired, and it is 100% voice-controlled. You may as well think of it as a "speaking guide dog," as the name suggests. It has three key features, each based on the scene captured by your mobile phone:

  1. Reads text aloud upon command
  2. Describes the scene around you upon command
  3. Warns you when there is an obstacle in front of you

Check out this demo video to learn more about our app!

Android App

  • UI/UX

    • Simple and responsive
    • Voice-assistant architecture designed for the target audience
  • Libraries / APIs

    • Google Cloud Speech-to-Text and Text-to-Speech
    • Android SDK, AndroidX
    • ML Kit Object Detection and Tracking API
    • TensorFlow Lite MobileNet image classification model (see the sketch below)
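As a rough illustration of the on-device classification step, here is the same TensorFlow Lite inference flow written in Python (the app itself uses the equivalent TFLite Android API). The model and label file names are placeholders, and we assume a quantized MobileNet that takes uint8 input:

```python
# Illustration only: the Android app performs this via the TFLite Android API.
# "mobilenet_v1.tflite", "labels.txt", and "frame.jpg" are placeholder names.
import numpy as np
import tensorflow as tf
from PIL import Image

interpreter = tf.lite.Interpreter(model_path="mobilenet_v1.tflite")
interpreter.allocate_tensors()
input_details = interpreter.get_input_details()[0]
output_details = interpreter.get_output_details()[0]

# MobileNet expects a fixed-size RGB input, e.g. 224x224.
_, height, width, _ = input_details["shape"]
image = Image.open("frame.jpg").convert("RGB").resize((width, height))
# Assumes a quantized model (uint8); a float model would need normalization.
pixels = np.expand_dims(np.array(image, dtype=np.uint8), axis=0)

interpreter.set_tensor(input_details["index"], pixels)
interpreter.invoke()
scores = interpreter.get_tensor(output_details["index"])[0]

labels = [line.strip() for line in open("labels.txt")]
top = int(np.argmax(scores))
print(labels[top], scores[top])
```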

Backend

  • Flask API

    • Image Captioning
    • Optical Character Recognition
  • Deployment

    • Google App Engine
    • A fast central API with separate endpoints for each feature (see the sketch below)
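A minimal sketch of how such a central Flask API could be wired up is below. The /caption and /ocr routes are assumed endpoint names, and the two helpers are placeholders for the real model calls in the repo:

```python
# Minimal sketch of the central Flask API; endpoint names and helpers are
# assumptions for illustration, not the repo's exact routes.
from flask import Flask, request, jsonify

app = Flask(__name__)

def generate_caption(image_bytes: bytes) -> str:
    # Placeholder: the real implementation runs the image-captioning model.
    return "a caption"

def extract_text(image_bytes: bytes) -> str:
    # Placeholder: the real implementation runs OCR on the image.
    return "recognized text"

@app.route("/caption", methods=["POST"])
def caption():
    # The app uploads the current camera frame as multipart form data.
    return jsonify({"caption": generate_caption(request.files["image"].read())})

@app.route("/ocr", methods=["POST"])
def ocr():
    return jsonify({"text": extract_text(request.files["image"].read())})

if __name__ == "__main__":
    # Local run; on Google App Engine a WSGI server serves `app` instead.
    app.run(host="0.0.0.0", port=8080)
```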

Image Captioning

We used TensorFlow to build and train a model for image captioning on MS-COCO 2014, based on the paper Show, Attend and Tell: Neural Image Caption Generation with Visual Attention. The model uses a standard convolutional network as an encoder to extract features from images (we use Inception V3) and feeds the extracted features into an attention-based decoder that generates sentences. While the paper used an LSTM as the decoder, we use a simpler RNN instead.
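A condensed sketch of the attention and decoder modules is shown below, in the style of the public TensorFlow image-captioning tutorial for this paper; the GRU stands in for the "simpler RNN" mentioned above, and the layer sizes are assumptions, so the trained model in the repo may differ in details:

```python
# Sketch of a Bahdanau-attention decoder over InceptionV3 features;
# shapes and cell choice (GRU) are assumptions for illustration.
import tensorflow as tf

class BahdanauAttention(tf.keras.Model):
    def __init__(self, units):
        super().__init__()
        self.W1 = tf.keras.layers.Dense(units)
        self.W2 = tf.keras.layers.Dense(units)
        self.V = tf.keras.layers.Dense(1)

    def call(self, features, hidden):
        # features: (batch, 64, embedding_dim) from InceptionV3's 8x8 grid.
        hidden_with_time = tf.expand_dims(hidden, 1)
        score = self.V(tf.nn.tanh(self.W1(features) + self.W2(hidden_with_time)))
        attention_weights = tf.nn.softmax(score, axis=1)   # where to look
        context = tf.reduce_sum(attention_weights * features, axis=1)
        return context, attention_weights

class RNNDecoder(tf.keras.Model):
    def __init__(self, embedding_dim, units, vocab_size):
        super().__init__()
        self.embedding = tf.keras.layers.Embedding(vocab_size, embedding_dim)
        self.gru = tf.keras.layers.GRU(units, return_sequences=True,
                                       return_state=True)
        self.fc1 = tf.keras.layers.Dense(units)
        self.fc2 = tf.keras.layers.Dense(vocab_size)
        self.attention = BahdanauAttention(units)

    def call(self, x, features, hidden):
        # Attend over image features, then feed [context; word embedding]
        # into the RNN to predict the next word.
        context, attention_weights = self.attention(features, hidden)
        x = self.embedding(x)                        # (batch, 1, embedding_dim)
        x = tf.concat([tf.expand_dims(context, 1), x], axis=-1)
        output, state = self.gru(x)
        x = self.fc1(output)
        x = tf.reshape(x, (-1, x.shape[2]))
        return self.fc2(x), state, attention_weights  # logits over the vocab
```

At inference time the decoder is unrolled one word at a time, feeding each predicted word and the updated hidden state back in until an end-of-sentence token is produced.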

Get more insights: Devpost

GitHub

https://github.com/kyuheejo/guidedog