Pratik Shrestha

Registered: 20.02.2026

Specialization: Computer Vision

● Software Coordinator with experience managing large-scale technical events and leading open-source projects.
● Computer Engineering graduate specializing in Computer Vision and Deep Learning, with a strong research focus on 3D Reconstruction, Multimodal AI, and Medical Image Analysis.
● Proven track record of winning international and national AI competitions, contributing to journal publications, and developing end-to-end AI systems from research to deployment.

Honors and awards:
● 2nd Position: SAGES CVS Lighthouse Challenge (Sub-Challenge C) - International Conference on Medical Image Computing and Computer-Assisted Intervention (MICCAI 2025), Sep 2025.
● 3rd Position: SAGES CVS Lighthouse Challenge (Sub-Challenge A) - International Conference on Medical Image Computing and Computer-Assisted Intervention (MICCAI 2025), Sep 2025.
● Best students: Applied Data Science Elective Course - Samsung and Pulchowk Campus, Jan 2025.
● Winner: LLM Category, Dataverse - LOCUS, Feb 2024.
● Winner: Data Insights Category - Jan 2022.
● Winner: Climate Change Category - Nov 2021.

Artificial Intelligence, Computer Vision, PyTorch, Python, Deep Learning, JavaScript, scikit-learn, LLM, Git, Docker, 3D Modeling, TensorFlow, OpenCV

Skills

Python

PyTorch

NumPy

Work experience

Research Assistant

05.2025 - Present | Nepal Applied Mathematics and Informatics Institute for Research

PyTorch, OpenCV, MMCV

Research and apply state-of-the-art AI techniques to medical imaging, specifically medical image segmentation, out-of-distribution detection, and federated learning.

Computer Vision Research Engineer

06.2025 - Present | Redev AI

PyTorch, OpenCV, Hugging Face

Experiment with state-of-the-art VLMs for automatic annotation and for training detection and classification models.

Software Coordinator

06.2024 - Present | LOCUS

CRM

● Managed software-related events for LOCUS 2025, including the Software Fellowship, Hackathons, and AI Competitions.
● Oversaw and led various software projects under LOCUS and the LOCUS Open Source Team (LOST).
● Worked on screening and guiding software projects for the exhibition at LOCUS, Nepal's largest technological festival.

Vice President

08.2019 - 02.2020 |St. Xavier's College Computer Club

CRM

● Organized club events and competitions, including the CSP Olympiad, Web Development Competition, and the SXC Computer Festival.

Computer Vision

Projects: Reconstruction of Heritage Structures

Python, PyTorch, 3D Gaussian Splatting, gsplat, Computer Vision, 3D Reconstruction

● Researched dataset capturing methodologies for the 3D reconstruction of large heritage structures.
● Utilized state-of-the-art Gaussian Splatting techniques (Hierarchical Gaussian Splatting, CityGaussian, 3DGS, gsplat) to reconstruct and compare 3D models for their effectiveness in this task.
● Investigated the impact of using masks and bilateral grids on the reconstruction quality of 3D models.
● Researched and worked on methods to build virtual walkthroughs of heritage structures.

Computer Vision

Projects: Virtual Try-On of Shoes

Python, PyTorch, SAM (Segment Anything Model), Gaussian Splatting, Snapchat LensStudio, AR, Image Segmentation, Dataset Curation

● Collected over 100 videos of shoes and sampled 30 images per video to create a dataset of 3000 shoe images from various orientations, resolutions, lighting conditions, and backgrounds.
● Created an automatic annotation pipeline with SAM to produce a large-scale shoe segmentation dataset.
● Trained different segmentation models on the created dataset.
● Utilized Gaussian Splatting to generate 3D models of the shoes from segmented images.
● Built an AR application using Snapchat's LensStudio to overlay the generated 3D shoe models onto a user's foot.

Computer Vision

Projects: Cattle Muzzle Pattern Matching

Python, PyTorch, Siamese Networks, ViT, ResNet, EfficientNet, Grad-CAM

● Researched existing methodologies, pre-processing techniques, models, and metrics for pattern matching.
● Trained a Siamese Model with various backbones (ViT, ResNet, EfficientNet) and compared their performance.
● Utilized Grad-CAM visualizations to interpret model results.
● Experimented with different datasets, models, and pre-processing techniques, performing ablation studies.

Computer Vision

Projects: SCOPE - Semantic Captioning for Optimized Photo Exploration

Python, PyTorch, Image Captioning, ViT, DeiT, GPT-2, BERT, Model Quantization, Knowledge Distillation, Embeddings

● Researched deep learning methods for Image Captioning.
● Experimented with combinations of different Vision Encoders (ViT, DeiT) and Text Decoders (GPT-2, distilGPT).
● Evaluated the performance of large and small models under time-constrained training scenarios.
● Researched downsizing methods for large models (Knowledge Distillation, Quantization) to enable edge-device deployment.
● Quantized GPT-2, BERT, and ViT models to reduce their space and compute requirements.
● Developed a system that generates image captions, processes them through BERT to create embeddings, and saves them as metadata to enable text-based image searches.

Computer Vision

Project

Python, PyTorch, CNNs, Medical Image Analysis, Image Classification, Dataset Curation, Transformers, Deep Learning Fundamentals

Personal Project:
● Trained classification models with different CNN backbones to classify fundus images into normal, three disease categories (Diabetes, Glaucoma, Cataract), or other.
● Analyzed model performance and dataset, proposing measures to increase accuracy.
● Searched for additional data and wrote transformation scripts to align the new dataset with the old one.
● Identified and corrected inconsistencies in the dataset labels by writing correction scripts.

Other Projects:
● Mahakabi: Explored generative models for creative text generation.
● Vision Transformer Implementation: Built a Vision Transformer model from scratch.
● Anode: A novel project concept and implementation.

Teaching Assistant

01.2025 - 01.2025 |NAAMII

CRM

● Volunteered as a Teaching Assistant, supporting the Foundational Models lab and guiding participants through a project on SimCLR.

Lead, Children in Technology / Lead, Software Fellowship

07.2024 - 11.2024 |LOCUS

CRM

● Led a team of six to educate high school students on internet safety and technology.
● Conducted sessions in eight schools across four districts of Nepal.
● Organized a 10-day workshop on software fundamentals for over 250 beginner IT students.
● Led a team of over 30 volunteers in designing the syllabus and preparing teaching materials covering Web Design, Frontend, Backend, Deployment, and Data Science.
● Oversaw sponsorships, scheduling, and logistics for the event.

Assistant, Seminar on Generative AI

06.2023 - 06.2023 |IT Club

CRM

● Assisted participants in understanding generative model concepts and guided them through coding and executing the examples.

Education

Computer Engineering (Bachelor's)

2021 - 2025

Pulchowk Campus, Tribhuvan University

Languages

English: Advanced
Nepali: Native