Adam Ivansky, Developer in Buffalo, NY, United States

Adam Ivansky

Verified Expert in Engineering

Machine Learning Developer

Location
Buffalo, NY, United States
Toptal Member Since
November 6, 2018

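Adam has nine years of experience as an engineer and two years of experience as a tech lead. His tools of choice include Python 3, Snowflake, Spark, and SQL. His main focus areas include ETLs and machine learning marketing pipelines. Adam communicates effectively with both highly technical and non-technical professionals.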

Portfolio

Apple
Python 3, Python API, Amazon EKS, Docker, Kubernetes, Amazon S3 (AWS S3)...
BJ's Wholesale Club
Jenkins, AWS CLI, Amazon S3 (AWS S3), Redshift, Python 3, Spark...
eBay
SQL, TensorFlow, Scikit-learn, Tableau, PySpark, Apache Hive, Python, Teradata...

Experience

Availability

Part-time

Preferred Environment

Amazon Web Services (AWS), Python, Terraform, Snowflake, PySpark, Amazon Elastic Container Service (Amazon ECS), ETL, Django, FastAPI, Streaming Data

The most amazing...

...project I've worked on is the development of a Spark metastore data warehouse.

Work Experience

Data Engineering Tech Lead

2020 - 2021
Apple
  • Served as a data engineer in charge of two projects end-to-end. The projects involved collecting data from third-party cloud vendors.
  • Developed scheduled ETLs based on Python and Spark that collected data from various APIs and loaded it into Amazon S3 and PostgreSQL databases. The ETLs were deployed to Airflow and Kubernetes.
  • Built a number of APIs that exposed data from the data warehouse to data consumers.
  • Created and modified ETLs based on AWS Glue. Created a serverless ETL based on Amazon SQS and AWS Lambda.
Technologies: Python 3, Python API, Amazon EKS, Docker, Kubernetes, Amazon S3 (AWS S3), Amazon Simple Queue Service (SQS), Amazon Elastic MapReduce (EMR), Redshift, PostgreSQL, SQL, Spark, Data Engineering

Data Engineer

2019 - 2020
BJ's Wholesale Club
  • Developed an ETL pipeline based on PySpark running on AWS EMR for extracting data from Redshift to S3.
  • Contributed to a product recommendation engine based on Spark machine learning.
  • Developed a data quality assessment tool in PySpark.
  • Owned cloud cost reporting. Managed EMR cluster creation/termination in AWS CLI and AWS console.
  • Fully automated the ETL/marketing pipeline in Jenkins.
  • Contributed to an algorithm for identifying new potential members based on third-party data.
Technologies: Jenkins, AWS CLI, Amazon S3 (AWS S3), Redshift, Python 3, Spark, Amazon Elastic MapReduce (EMR), SQL, Data Engineering

Senior Database Marketing Analyst

2017 - 2018
eBay
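  • Developed targeting scripts for flagship marketing campaigns with a focus on email, mobile push notification, social, and on-site channels. These campaigns often targeted more than 50 million users and sometimes yielded more than $100,000 in iGMB annually.
  • Designed, developed, implemented, and maintained a multi-armed bandit algorithm written in Python while adhering to eBay's marketing standards and processes. The algorithm was measured to generate $5 million annually.
  • Trained an algorithm for send-time optimization. This led to a 15% increase in click-through rate for campaigns that implemented this strategy.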
  • Assessed existing email, social, and mobile marketing campaigns in terms of KPIs such as iGMB, OR, and CTR.
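  • Created dashboards in Tableau that reported on the performance of the different marketing algorithms I created.
  • Created scripts that moved data between Hive and Teradata servers.
  • Worked with the world's largest Teradata DWH, frequently querying tables with more than 100 billion rows.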
  • Communicated with stakeholders across multiple timezones.
Technologies: SQL, TensorFlow, Scikit-learn, Tableau, PySpark, Apache Hive, Python, Teradata, Python 3, Spark, Data Engineering

Machine Learning SW Developer

2016 - 2017
Valeo
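  • Developed and trained a machine vision algorithm for recognizing pedestrians in front of a vehicle. The algorithm has since been implemented in multiple car models, including the GM 2019 Chevy.
  • Trained an algorithm for detecting dirt on the camera lens. This algorithm played a critical role in supporting other, more complex self-driving features.
  • Assessed the quality of unstructured annotated video data used for algorithm training.
  • Created a script for syncing structured and unstructured data between the multiple teams involved in the project.
  • Attended computer science conferences and studied scientific literature to keep up with the latest trends in machine learning and computer science. Exchanged knowledge with other team members.
  • Communicated and liaised with teammates and stakeholders from France and Ireland.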
Technologies: Protocol Buffers, Intel TBB, C++, OpenCV, SQL, MATLAB, Python, Python 3, Data Engineering

Credit Risk Analyst

2014 - 2015
Erste Group
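  • Calculated the risk parameters CCF, LGD, and PD according to Basel II.
  • Reduced the overall reserve requirements of Erste Bank subsidiaries by more than 7% thanks to improvements I introduced to the statistical engine used to calculate the risk parameters CCF, LGD, and PD.
  • Designed and trained a mathematical model in SAS for predicting the overall loss upon client default. This helped Erste improve the repossession process and reduce expenses.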
  • Performed ad-hoc stress-tests for Erste subsidiaries. The results were later submitted directly to the European National Bank.
  • Assessed risk portfolio stability via bootstrapping and Monte Carlo methods.
  • Created interactive dashboards for risk parameter reporting in MS SQL and Excel.
  • Developed a data quality testing system.
Technologies: Microsoft Excel, MATLAB, Microsoft SQL Server, SAS, SQL

Teaching and Research Assistant

2012 - 2014
University of Rochester
  • Led lab lectures for undergraduate students.
  • Developed software for experiment automation and for analyzing experimental data.
  • Wrote several scientific papers that are available online.
Technologies: MATLAB

eBay App Push Notification Send Time Optimization Project

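The project aimed to improve the click-through rates of mobile push notifications. Introducing the algorithm raised the click-through rate of mobile push notifications by 15%.

I decided to achieve this by developing an ML algorithm that predicts the best contact time for each user. The algorithm was developed in Python and was trained using scikit-learn. Obtaining training data required the use of Hive and PySpark. I successfully deployed the algorithm to the marketing production environment and coached marketing analysts on how to use it.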
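
As a rough illustration of per-user send-time selection (not the production model, which was trained with scikit-learn), the sketch below simply picks, for each user, the hour of day with the highest smoothed historical click rate. The event schema, the function name `best_send_hours`, and the smoothing constants are assumptions made for this example.

```python
from collections import defaultdict

def best_send_hours(events, prior_clicks=1.0, prior_sends=20.0):
    """Return {user_id: hour} with the highest smoothed click rate.

    events: iterable of (user_id, send_hour, clicked) tuples (hypothetical schema).
    """
    sends = defaultdict(lambda: [0] * 24)
    clicks = defaultdict(lambda: [0] * 24)
    for user_id, hour, clicked in events:
        sends[user_id][hour] += 1
        clicks[user_id][hour] += int(clicked)

    best = {}
    for user_id in sends:
        # Additive smoothing keeps an hour from winning on a single lucky click.
        rates = [
            (clicks[user_id][h] + prior_clicks) / (sends[user_id][h] + prior_sends)
            for h in range(24)
        ]
        best[user_id] = max(range(24), key=lambda h: rates[h])
    return best

# Made-up notification history: (user, hour sent, clicked?)
history = [
    ("u1", 9, True), ("u1", 9, True), ("u1", 9, False),
    ("u1", 20, False), ("u1", 20, False),
    ("u2", 7, False), ("u2", 18, True), ("u2", 18, True),
]
print(best_send_hours(history))  # {'u1': 9, 'u2': 18}
```

A trained model would generalize across users and features instead of relying on each user's own history, which is what makes it useful for users with little data.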

Model for Dynamic Content Optimization and Customization

The project aimed to improve the click-through rates of eBay coupon campaigns by using machine learning. The development of the algorithm was successful, and it was measured to generate a 20% lift in click-through rate and iGMB.

The early version of the algorithm was based on the multi-armed bandit. Later versions made use of a contextual NLP-based multi-armed bandit. The algorithm was developed using a combination of Teradata SQL and Python. I also developed an interactive Tableau dashboard to monitor how the algorithm was functioning and to measure the KPI lift it delivered.
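
For readers unfamiliar with the multi-armed bandit, here is a minimal sketch of one common policy, epsilon-greedy, choosing among three content variants. The profile does not say which bandit policy was used, so the policy, the class name, and the simulated click rates are all assumptions for illustration.

```python
import random

class EpsilonGreedyBandit:
    """Epsilon-greedy policy: mostly exploit the best arm, sometimes explore."""

    def __init__(self, n_arms, epsilon=0.1, seed=0):
        self.epsilon = epsilon
        self.counts = [0] * n_arms
        self.values = [0.0] * n_arms  # running mean reward per arm
        self.rng = random.Random(seed)

    def select_arm(self):
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(len(self.counts))  # explore
        return max(range(len(self.values)), key=lambda a: self.values[a])  # exploit

    def update(self, arm, reward):
        self.counts[arm] += 1
        # Incremental update of the mean reward observed for this arm.
        self.values[arm] += (reward - self.values[arm]) / self.counts[arm]

# Simulated click probabilities for three coupon variants (made-up numbers).
true_ctr = [0.02, 0.05, 0.10]
bandit = EpsilonGreedyBandit(n_arms=3, epsilon=0.1, seed=42)
env = random.Random(7)
for _ in range(50000):
    arm = bandit.select_arm()
    reward = 1.0 if env.random() < true_ctr[arm] else 0.0
    bandit.update(arm, reward)

print(bandit.counts)
print([round(v, 3) for v in bandit.values])
```

Over enough rounds the policy concentrates traffic on the variant with the highest click rate while still sampling the others occasionally. A contextual bandit extends this by conditioning the choice on user or content features.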

Model for Pedestrian Detection Intended for Self-driving Vehicles

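The project aimed to develop a machine vision algorithm capable of detecting pedestrians in front of a vehicle by analyzing input from an onboard camera. The algorithm is now fully functional and has been embedded in several new car models, including the GM 2019 Chevy.

The machine learning algorithm we decided to use was an AdaBoost cascade classifier combined with a deep neural network. We wrote the training application from scratch in C++. Training had to be multithreaded in order to be efficient. Testing and validation were done in Python. A large database of annotated video data was used for algorithm training.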
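
To show the boosting idea in isolation, the following is a toy sketch of plain AdaBoost with one-dimensional threshold stumps. It is not the project's C++ trainer: it omits the cascade structure, the image features, the neural network, and multithreading, and the data and function names are invented for the example.

```python
import math

def train_stump(xs, ys, weights):
    """Find the threshold/polarity stump with the lowest weighted error."""
    best = None
    for threshold in sorted(set(xs)):
        for polarity in (1, -1):
            preds = [polarity if x >= threshold else -polarity for x in xs]
            err = sum(w for w, p, y in zip(weights, preds, ys) if p != y)
            if best is None or err < best[0]:
                best = (err, threshold, polarity)
    return best

def adaboost(xs, ys, rounds=3):
    n = len(xs)
    weights = [1.0 / n] * n
    ensemble = []
    for _ in range(rounds):
        err, threshold, polarity = train_stump(xs, ys, weights)
        err = min(max(err, 1e-10), 1 - 1e-10)  # guard against log(0)
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, threshold, polarity))
        # Misclassified samples gain weight, correctly classified ones lose it.
        weights = [
            w * math.exp(-alpha * y * (polarity if x >= threshold else -polarity))
            for w, x, y in zip(weights, xs, ys)
        ]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble

def predict(ensemble, x):
    score = sum(a * (p if x >= t else -p) for a, t, p in ensemble)
    return 1 if score >= 0 else -1

# Toy labels that no single threshold can separate.
xs = list(range(10))
ys = [1, 1, 1, -1, -1, -1, 1, 1, 1, -1]
model = adaboost(xs, ys, rounds=3)
print([predict(model, x) for x in xs])
```

On this toy set, three weighted stumps are enough to reproduce all ten labels, even though each stump alone misclassifies at least three points. A cascade chains several such boosted classifiers so that easy negatives are rejected early.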

Prediction Model

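Accurately predicting the total final loss after a client defaults is key to reducing the risk associated with different loan products.

I developed a model based on the loan-to-value ratio and collateral value. It was built using a combination of SAS and Microsoft SQL Server. Developing the model required extensive data cleaning and data quality testing.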
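
As a simplified illustration of how loan-to-value and collateral value relate to loss, the sketch below computes a textbook-style loss given default: the collateral is sold at a discount and whatever it does not cover is lost. The 25% haircut, the function names, and the formula itself are assumptions for this example and are not the model described above.

```python
def loan_to_value(exposure, collateral_value):
    """Outstanding exposure as a fraction of the collateral's value."""
    return exposure / collateral_value

def loss_given_default(exposure, collateral_value, haircut=0.25):
    """Share of the exposure lost after selling collateral at a discount.

    haircut is a hypothetical forced-sale discount on the collateral value.
    """
    recovery = min(exposure, collateral_value * (1.0 - haircut))
    return (exposure - recovery) / exposure

print(loan_to_value(80_000, 100_000))       # 0.8
print(loss_given_default(80_000, 100_000))  # 0.0625
print(loss_given_default(50_000, 100_000))  # 0.0
```

The third call shows why loan-to-value matters: at an LTV of 0.5, the discounted collateral still covers the full exposure, so the modeled loss is zero.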

Product Recommendation Algorithm

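Contributed to the development of a recommendation engine based on a collaborative filtering model. The engine could even recommend products that a given customer had not necessarily purchased in the past. The solution was implemented in PySpark and was based on MLlib, Spark's machine learning (ML) library.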
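
The idea of recommending items a customer has never bought can be sketched with a small item-based collaborative filter in plain Python. This stands in for the PySpark/MLlib implementation only conceptually: the cosine-similarity scoring, the function name `recommend`, and the purchase data are assumptions for illustration.

```python
import math
from collections import defaultdict

def recommend(purchases, user, top_n=2):
    """Item-based collaborative filtering on a {user: set(items)} mapping."""
    buyers = defaultdict(set)
    for u, items in purchases.items():
        for item in items:
            buyers[item].add(u)

    def cosine(a, b):
        # Similarity of two items based on how many buyers they share.
        return len(buyers[a] & buyers[b]) / math.sqrt(len(buyers[a]) * len(buyers[b]))

    owned = purchases[user]
    scores = {
        candidate: sum(cosine(candidate, item) for item in owned)
        for candidate in buyers
        if candidate not in owned
    }
    ranked = sorted(scores, key=lambda c: (-scores[c], c))
    return [c for c in ranked if scores[c] > 0][:top_n]

# Made-up purchase history.
purchases = {
    "alice": {"coffee", "filters"},
    "bob": {"coffee", "filters", "mug"},
    "carol": {"coffee", "mug"},
    "dave": {"tires"},
}
print(recommend(purchases, "alice"))  # ['mug']
```

Here "mug" is suggested to alice because customers with overlapping baskets bought it, while "tires" is not, since no shared buyers link it to anything she owns.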

ETL for Recommendation Algorithm

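Developed an ETL in PySpark that transferred data from Amazon Redshift to an Amazon S3 data lake. I also developed code for customer-level data aggregation and historicization. Finally, I assessed data quality and investigated and fixed data quality issues.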

Education

2012 - 2014

Master of Science Degree in Physics

University of Rochester - New York, USA

2008 - 2012

Bachelor's Degree in Physics

National University of Ireland, Galway - Galway, Ireland

JANUARY 2023 - JANUARY 2026

AWS Certified Developer

AWS

JANUARY 2023 - JANUARY 2026

AWS Certified Cloud Practitioner

AWS

Languages

SQL, Python 3, Python 2, C++14, Python, C++, SAS, Snowflake

Frameworks

Spark, Hadoop, Django

Libraries/APIs

PySpark, Scikit-learn, TensorFlow, OpenCV, Intel TBB, Amazon EC2 API, Python API

Tools

Amazon Elastic MapReduce (EMR), Apache Airflow, Git, Spark SQL, AWS Glue, Bitbucket, Tableau, MATLAB, Microsoft Excel, Jenkins, AWS CLI, Amazon EKS, Amazon Simple Queue Service (SQS), Terraform, Amazon Elastic Container Service (Amazon ECS), GitHub

Paradigms

Unit Testing, Agile, Continuous Integration (CI), ETL

Storage

Amazon S3 (AWS S3), Teradata, Redshift, Microsoft SQL Server, Apache Hive, PostgreSQL, Data Lakes

Industry Expertise

Marketing

Other

Data Analytics, Data Engineering, Recommendation Systems, Machine Learning, Data Quality Analysis, Deep Learning, Protocol Buffers, ETL Tools, Physics, FastAPI, Streaming Data

Platforms

iOS, Windows, Linux, Amazon EC2, Spark Core, Docker, Kubernetes, Amazon Web Services (AWS), Visual Studio Code (VS Code)
