When I started working at NVIDIA we were given some money to spend at the employee store. Rather than buy T-shirt swag, I bought a Jetson Nano right away. I’ve been playing with it off and on in the year and a half I’ve been here. So here’s a nice project I did to end 2025: a simple app that recognizes people and greets them with a nice-sounding voice.
This used to be a much bigger deal, but now that Cursor does most of the coding for me, it’s really just the infrastructure stuff I have to take care of. So here’s what I did:
Person Recognition
The community really has this down. I thought I’d have to train a model or do something equally difficult. Not so: the off-the-shelf models work fine, and Python packages for them already exist. The hardest part is just making sure you have the right things installed.
For facial recognition I used InsightFace. The Python package was all I needed to get started. The hard part here was getting all the dependencies. I’m using JetPack 6.2. At some point you’ll need packages which apt can’t seem to find. You install them as wheels from this website. There’s also some dependency hell:
pip install insightface
pip install 'matplotlib<3.8.0' && pip install 'numpy<1.24'
You’ll also need onnxruntime-gpu. I had the plain onnxruntime package installed and it wasn’t running on the GPU. Since we have a GPU, let’s use it.
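A quick way to tell which onnxruntime build you ended up with is to ask it for its execution providers. This is a minimal sketch, not the code from my project: `pick_providers` and `PREFERRED` are names made up for illustration, and the only onnxruntime call used is `get_available_providers()`.

```python
import importlib.util

# Providers we would like to use, GPU first, CPU as the fallback.
PREFERRED = ["CUDAExecutionProvider", "CPUExecutionProvider"]


def pick_providers(available):
    """Keep the preferred providers that this onnxruntime build offers, in order."""
    return [p for p in PREFERRED if p in available]


if importlib.util.find_spec("onnxruntime") is not None:
    import onnxruntime as ort

    available = ort.get_available_providers()
    print("available:", available)
    print("will use:", pick_providers(available))
    if "CUDAExecutionProvider" not in available:
        print("No CUDA provider found: this looks like a CPU-only onnxruntime build.")
else:
    print("onnxruntime is not installed")
```

If the CUDA provider is missing from the list, inference will quietly run on the CPU, which matches what I saw with the plain onnxruntime package.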
To get the face recognition working I took some photos from my library, named them after the people in my family, and uploaded them to the Jetson Nano. The code in my project directory then computed the face features for each person. After that it was able to print out the names of the faces it saw.
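The matching step can be sketched as a nearest-neighbor lookup over face features. This is an illustration under assumptions, not the actual project code: the features are treated as plain lists of floats (in InsightFace they would be embedding vectors), cosine similarity is one common choice of comparison, and `identify`, `known_faces`, and the `0.4` threshold are hypothetical names and values you would need to tune.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two feature vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def identify(embedding, known_faces, threshold=0.4):
    """Return the best-matching name, or None if nobody scores above the threshold."""
    best_name, best_score = None, threshold
    for name, reference in known_faces.items():
        score = cosine_similarity(embedding, reference)
        if score > best_score:
            best_name, best_score = name, score
    return best_name


# Toy 3-dimensional "features"; real face embeddings have hundreds of dimensions.
known_faces = {"alice": [1.0, 0.0, 0.0], "bob": [0.0, 1.0, 0.0]}
print(identify([0.9, 0.1, 0.0], known_faces))  # closest to alice
print(identify([0.0, 0.0, 1.0], known_faces))  # matches nobody, so None
```

The threshold is what keeps strangers from being greeted by the name of whoever happens to be the least bad match.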
Voice Text to Speech
I started with espeak. I was able to get the face recognition to say: “Hello <your name>”. I had my daughters walk in and look at the camera and it said: “Hello <child name>”. They were amazed! I love seeing that look! When the technology becomes personal and blurs the lines between magic and science!
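Hooking a recognized name up to espeak can be as small as a subprocess call. A minimal sketch, assuming the `espeak` command is on the PATH; `greeting` and `say` are hypothetical helper names, not functions from my code.

```python
import shutil
import subprocess


def greeting(name):
    """Build the sentence to speak for a recognized person."""
    return f"Hello {name}"


def say(text):
    """Speak text with the espeak CLI. Returns False if espeak is not installed."""
    if shutil.which("espeak") is None:
        return False
    # Passing the text as a list item avoids any shell quoting problems.
    subprocess.run(["espeak", text], check=False)
    return True


# Usage, once a face has been matched to a name:
#   say(greeting("Kristin"))
```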
But the espeak voice wasn’t that great. On to Piper. Contrary to my LLM telling me to download a bunch of binary files, it’s a simple pip install piper-tts.
Then I looked at the sample voices. Kristin worked for me, so I downloaded her voice:
python3 -m piper.download_voices en_US-kristin-medium
I also liked some of the British English voices like Alba. Anyway, now I had voice files that I could integrate into my code.
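One way to do that integration is to call the Piper command line from Python and write a WAV file. This is a sketch under assumptions: it follows the `python3 -m piper -m <voice> -f <file> -- '<text>'` form from the documentation of recent piper-tts releases (older releases read the text from stdin instead), it expects the downloaded voice files to be in the current directory, and `piper_command`, `synthesize`, and `greeting.wav` are made-up names.

```python
import importlib.util
import subprocess
import sys


def piper_command(text, voice="en_US-kristin-medium", wav_path="greeting.wav"):
    """Build the Piper CLI call that renders text to a WAV file."""
    return [sys.executable, "-m", "piper", "-m", voice, "-f", wav_path, "--", text]


def synthesize(text, **kwargs):
    """Run Piper if the package is installed. Returns False when it is missing."""
    if importlib.util.find_spec("piper") is None:
        return False
    subprocess.run(piper_command(text, **kwargs), check=True)
    return True


# Show the command that would run, without the interpreter path.
print(" ".join(piper_command("Hello Kristin")[1:]))
```

The resulting WAV file can then be played with whatever audio player the Jetson has available.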
Fin
I deployed this for my family coming over for New Year’s Eve! The code is all up here. Let’s see how they react when their faces are detected!