Pioneered fully automated pipeline for the core product, reducing the time to process new client projects from 4 weeks to 1 day
Standardized component interfaces and built parameterizable template repository with fully automated CI/CD process, reducing the time to integrate new research solutions from 2 months to 6 hours
Orchestrated distributed experiments with full lineage, processing up to 400 million frames per hour
Deep Learning Data Engineer
Oct 2023 - May 2024 • 8 months
Created scalable framework for ingesting, processing, and curating visual data, delivering 7 datasets and 54 million samples
Developed sampling algorithm for custom training distributions, increasing model performance by 3-7% for 12 projects in total
Invented method to measure degree of privacy-preservation in text embeddings, contributing to $90K business deal
Designed parallelized, distributed experiments with innovative caching scheme, reducing benchmarking time for different product solutions to under 3 minutes
Created web-based labeling tool pre-annotated with internal ML-based suggestions, reducing manual annotation time by 95%
Deployed an enhanced email parser in production, leading to an average accuracy gain of 0.5% across all customers
Investigated 11 approaches to detect performance issues in neural networks deployed in production
Delivered novel method detecting inference mistakes and distribution shifts on all production datasets with an overhead of less than 1% by combining 3 different techniques
Collaborated within Algorithms Team to improve existing algorithms and finish 4 software engineering issues
Introduced new custom debugging approach, making team members 50% more efficient in solving new issues
Enhanced optimization scheme of 2 production models, resulting in performance increases at 20+ customers
Increased stability and performance of most important classifier algorithm in company by 10%
Documented 3 solutions for general problems during existing codebase setup, optimizing future employee onboarding process
Developed end-to-end testing framework for satellite traffic in Python and Robot Framework
Created package evaluating and summarizing performance of complete Dialog system with a single command
Presented results and discussed software architecture in daily meetings with international team of 15 people
Created 3 integration tests for modem configuration user interface, ensuring reliable software at 120+ customers
Compared testing framework support and learning curve for 7 different IDEs and summarized the findings in Confluence, optimizing development speed for all employees