Prabhjyot Singh
ML Researcher & Engineer at the University of Waterloo.
Ethical AI · Reinforcement Learning · Autonomous Systems
About Me
I'm a Master's student at the University of Waterloo researching ethical AI and reinforcement learning in the UWECEML Lab. My work focuses on building evaluation frameworks that surface misaligned agent behavior and on training methods that respect human values.
Across 6 co-op work terms, I've applied ML and software engineering in industry, from building PyTorch computer vision models at Kindred AI to designing LLM-powered agentic systems at BrainRidge Consulting. I enjoy working at the intersection of research and engineering.
Outside of research, I'm drawn to the philosophical questions behind alignment: what does it mean for an AI to behave "ethically," and how do we measure that rigorously?
Work History
6 co-op terms and industry experience across ML, robotics, and software engineering.
Graduate Research Assistant
- Researching ethical AI within reinforcement learning environments on the Moral AI Systems team — defining experiment protocols and evaluation criteria.
- Building Craftax RL experiments and JAX training/evaluation pipelines for agent-behavior analysis and benchmarking.
- Developing a Compute Canada-compatible framework for scalable, reproducible ethical-AI experiments (configs, seeding, logging, batch runs).
Software Engineer
- Designed and developed LLM-powered agents using Claude Sonnet 4, implementing advanced prompt engineering and validation loops for reliable structured outputs.
- Built and deployed scalable NestJS microservices enabling secure GitHub and Jira REST API integration for automated issue and repository management.
- Architected a role-based authentication system with Auth0, Redis, and JWT for consistent RBAC across distributed services.
Robotics Software Developer
- Developed and optimized embedded firmware for FANUC and ABB robotic systems in TypeScript, C++, and KAREL.
- Improved trajectory planning and motion control algorithms, reducing erratic robotic movement by 30%.
- Led a codebase refactoring initiative, reducing file count by 20% and improving overall architecture clarity.
System Analyst
- Deployed an ITSM solution reducing ticket turnaround time by 45% and standardizing support workflows.
- Rolled out networked digital signage across five warehouses, linking plug-and-play devices into a centralized dashboard.
Robotics Test Engineer
- Designed and implemented a PyTorch supervised learning model to adjust image brightness/contrast, reducing segmentation error from 30% to 10%.
- Expanded automated end-to-end test coverage to 95% using Python and Cucumber in a virtual simulation environment.
- Uncovered a 30% error rate in low-light scenarios, driving firmware calibration improvements for enhanced sensor reliability.
QA Engineer
- Implemented automated regression testing with Cypress, reducing manual QA effort by 15%.
- Collaborated with product and engineering teams to improve documentation and accelerate feature releases by 20%.
QA Engineer
- Built automated test suites in JavaScript with Ghost Inspector to validate OCR workflows.
- Optimized HubSpot web pages (HTML/CSS/JS), boosting Lighthouse performance scores by 30 points.
Fullstack Engineer
- Developed AWS-integrated REST APIs and built responsive frontends with Angular, TypeScript, and CSS.
- Configured Ubuntu and AWS Linux servers for scalable, production-ready deployments.
Selected Work
Projects spanning ML systems, edge computing, human-robot interaction, and reinforcement learning.
Aether-Edge
Decentralized, edge-native building management system that reduces Age of Information from 19.4 s (centralized) to near zero.
NAO Robot Teacher Gender Study
Webots simulation investigating gender bias in human-robot interaction within educational environments using a humanoid NAO robot.
Aegis Lights
Self-adaptive urban traffic control system achieving a 45–49% reduction in average trip time across all traffic scenarios.
Canary
IoT personal air quality monitor: custom PCB, embedded BLE firmware, 3D-printed enclosure, and Android companion app. Built as a fourth-year capstone project.
Ethical AI & Reinforcement Learning
Building AI systems that are both capable and aligned with human values. Currently at the UWECEML Lab, University of Waterloo.
Ethical AI & Value Alignment
Investigating how reinforcement learning agents can learn to respect human values and moral constraints — through reward modeling, constrained optimization, and alignment verification.
Agent Behavior Analysis
Designing rigorous evaluation protocols that measure ethical behavior in RL environments. Building reproducible benchmarks on Craftax to surface and quantify misaligned agent behaviors.
Autonomous & Adaptive Systems
Exploring how intelligent systems can dynamically adapt to complex environments while maintaining behavioral guarantees — from self-adaptive control loops to edge-native sensing.
Methods for Training and Evaluating Ethical Reinforcement Learning Behaviour
- Researching ethical AI within reinforcement learning environments — defining experiment protocols and evaluation criteria for agent moral behavior.
- Building Craftax RL experiments and JAX training/evaluation pipelines for large-scale agent-behavior analysis and benchmarking.
- Developing a Compute Canada-compatible framework for scalable, reproducible ethical-AI experiments with full configuration management, seeding, and batch run support.
Academic Background
University of Waterloo, home to one of Canada's top engineering programs.
Get In Touch
I'm currently seeking ML research internships and full-time roles for 2026. If you're working on interesting problems in AI alignment, RL, or autonomous systems, I'd love to connect.