Portfolio

Live Portrait Monitor

A deep learning-based application for animating portraits displayed on a monitor, leveraging advanced face reenactment techniques. Learn more on GitHub.

Webcam Live Portrait

Real-time portrait animation using a webcam feed, utilizing deep learning-based face tracking and reenactment methods. Learn more on GitHub.

VoiceGuard

An AI-powered system that detects voice phishing in real time, helping protect against audio-based fraud. Learn more on GitHub.

Face Segmentation

Semantic segmentation of facial features using PyTorch, enabling applications in augmented reality, digital makeup, and face modification. Learn more on GitHub.

Deep-Live Monitor

A deep learning system for animating images displayed on a monitor using computer vision techniques. Learn more on GitHub.

DeepVoiceGuard

DeepVoiceGuard detects spoofed audio in Automatic Speaker Verification (ASV) systems. The project uses the RawNet2 model, trained on the ASVspoof 2019 dataset, and serves the trained model through FastAPI for real-time inference. Learn more on GitHub.

VoiceGUARD2

VoiceGUARD2 offers an end-to-end solution for classifying audio as human or AI-generated using the Wav2Vec2 model. It supports multi-class classification, distinguishing between real voices and synthetic audio produced by models such as DiffWave and WaveNet... The project encompasses dataset preparation, preprocessing, fine-tuning, inference, and API deployment for real-time predictions via FastAPI. Learn more on GitHub.

face_detection_onnx

This repository implements face detection using the SCRFD model, a fast and lightweight solution optimized for edge devices. The project employs the ONNX format for the model and leverages OpenCV for processing images and videos, enabling efficient and accurate face detection across various media formats. Learn more on GitHub.

License-Plate-Detection_ONNX

This repository provides code and instructions for performing license plate detection using YOLOv5 with ONNX Runtime. It supports inference on images, videos, and webcam feeds, utilizing GPU acceleration for efficient processing. The project includes Python scripts for easy deployment and integration into various applications. Learn more on GitHub.
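Detectors in the YOLO family typically emit many overlapping candidate boxes, which are usually pruned with non-maximum suppression (NMS) after inference. The sketch below shows that step in plain Python. It is an illustration only, not the repository's code: the function names `iou` and `nms`, the `(x1, y1, x2, y2)` box layout, and the 0.45 threshold are assumptions.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    # Width and height of the overlap are clamped at zero for disjoint boxes.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, iou_threshold=0.45):
    """Return indices of the boxes to keep, highest score first.

    The threshold is a hypothetical default chosen for this example.
    """
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        # Keep a box only if it does not overlap too much with any kept box.
        if all(iou(boxes[i], boxes[k]) <= iou_threshold for k in keep):
            keep.append(i)
    return keep


# Toy candidates: the first two overlap heavily, the third is far away.
boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (50, 50, 60, 60)]
scores = [0.9, 0.8, 0.7]
kept = nms(boxes, scores)
print(kept)
```

In the toy input, the second box overlaps the first with an IoU of about 0.68, so it is suppressed and only the first and third boxes remain.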

Uzbek Sign Language Recognition

This project focuses on recognizing Uzbek Sign Language (USL), the primary language of deaf and hard-of-hearing individuals in Uzbekistan. The system aims to facilitate communication by translating USL gestures into text, benefiting both the deaf community and those seeking to communicate with them. The dataset comprises images representing various USL gestures, and the model is trained to accurately classify these signs. Learn more on GitHub.

VoiceVerifier-vv

VoiceVerifier-vv is a FastAPI-based speaker classification system that removes silence from audio files, extracts speaker embeddings using SpeechBrain's ECAPA-TDNN model, and performs classification using cosine similarity and a Random Forest classifier. Learn more on GitHub.
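The cosine-similarity step can be sketched in a few lines of plain Python. This is a minimal illustration, not the project's code: the helper names `cosine_similarity` and `best_match`, the three-dimensional toy embeddings, and the 0.5 threshold are assumptions, and real ECAPA-TDNN embeddings have far more dimensions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat an all-zero embedding as matching nothing
    return dot / (norm_a * norm_b)


def best_match(query, enrolled, threshold=0.5):
    """Return (speaker, score) for the closest enrolled embedding.

    The speaker is None when the best score is below the threshold,
    which is a hypothetical value chosen for this example.
    """
    best_name, best_score = None, -1.0
    for name, embedding in enrolled.items():
        score = cosine_similarity(query, embedding)
        if score > best_score:
            best_name, best_score = name, score
    if best_score < threshold:
        return None, best_score
    return best_name, best_score


# Toy enrolled speakers with made-up embeddings.
enrolled = {"alice": [1.0, 0.0, 0.0], "bob": [0.0, 1.0, 0.0]}
speaker, score = best_match([0.9, 0.1, 0.0], enrolled)
print(speaker, round(score, 3))
```

A similarity threshold like this only handles the nearest-speaker decision; how the project combines the score with its Random Forest classifier is not described here, so that part is left out of the sketch.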

Gaze Emotion Recognition

This project integrates gaze tracking and facial emotion estimation to analyze user emotions in real time. Using OpenCV for face detection and DeepFace for emotion recognition, it processes webcam input and displays emotion labels on detected faces. It also includes an audio-to-text feature, which extends its multimodal analysis capabilities. Learn more on GitHub.

Additional Projects

Explore more of my work on GitHub.