Albert Mundu
Hey there! I am a researcher and educator specializing in the intersection of Computer Vision and Natural Language Processing. With a strong foundation in deep learning, my work focuses on building intelligent systems for complex image and video scene understanding.
- PhD Research Scholar at IIIT-Allahabad (CVBL): Researching Image/Video Scene Understanding with a core focus on Multimodal AI, Computer Vision, and Natural Language Processing.
- Assistant Professor at Galgotias University: Teaching graduate courses in Machine Learning, Advanced Data Structures, and Algorithm Design & Analysis.
- Former Computer Vision Engineer Intern at Spyne AI: Engineered and deployed AI models for e-commerce object shadow generation using VAEs, GANs, diffusion models, and Blender.
- Research Objective: Committed to advancing deep learning technologies and leveraging expertise in Multimodal AI for pioneering academic and industry projects.
Let’s Connect
I am always open to discussing new research opportunities, academic collaborations, or just chatting about the latest advancements in Computer Vision, Natural Language Processing, Deep Learning, LLMs, MLLMs, VLMs, Image Generation (SD, FLUX, ZIT), and Distillation. Feel free to reach out!
news
| Date | Update |
|---|---|
| Dec 14, 2025 | Attended and presented ThreatNet at IEEE UPCON 2025. |
| Oct 10, 2025 | ThreatNet: Multimodal Firearm Threat Assessment Network accepted at IEEE UPCON 2025. |
| Aug 27, 2024 | ETransCap: Efficient Transformer for Image Captioning is in press at Applied Intelligence (Springer). |
| Aug 24, 2023 | Joined Galgotias University, Greater Noida as Assistant Professor. |