Hi! I'm Dan Friedman.
I love learning about practical applications of data and techniques of software engineering and data analysis. The goal of this site is to help solidify my knowledge of these topics and hopefully make them easy to learn for others.
I have years of engineering and data science experience. I've worked for WeWork, consulted for the Chan Zuckerberg Initiative, Tesla Motors and Boosted Boards. I cofounded and co-organized the first MHacks, a hackathon at the University of Michigan. I'm a graduate of Metis and the University of Michigan.
My technical skills: Python, pandas, dbt, Airflow, SQL, AWS (S3, Lambda, Kinesis, Redshift, Glue, Athena), Docker, Terraform, data viz, analytics and ML.
I have a wide range of interests across fields. If you want to talk about data engineering, data science, new opportunities or something else, email me at email@example.com.
Products I love and recommend, and that would greatly appreciate your support too:
- dbt (data build tool) is a free, open source command line tool that enables data analysts and engineers to transform data in their data warehouses more effectively. dbt was initially created by Fishtown Analytics, which also offers a SaaS product with a web IDE called dbt Cloud. They have an active open source community on GitHub and would love contributors. I recommend joining their active Slack community too.
- pandas is an open source Python library that makes it easier to do data analysis through table-like data structures called DataFrames and Series. These data structures come with a large, easy-to-use selection of methods.
- Apache Airflow is an open source library to create and manage workflows. A workflow is essentially a defined order of execution for Bash commands, Python scripts, SQL operations or any number of other operations.
- Apache Spark is an open source, distributed, general-purpose cluster-computing framework. Essentially, you have one data store and use multiple machines and CPUs to run operations in parallel.
- Python is a programming language widely used for web applications, data science and more. Python is a high-level, interpreted, general-purpose programming language.
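To make the "order of execution" idea from the Airflow item a bit more concrete, here is a minimal sketch using only Python's standard library (`graphlib`, Python 3.9+). It is not Airflow's API; the task names and the `execution_order` helper are hypothetical and only illustrate how declared dependencies determine a run order.

```python
from graphlib import TopologicalSorter

# Hypothetical tasks, each mapped to the set of tasks it depends on.
dependencies = {
    "extract": set(),
    "load_staging": {"extract"},
    "transform": {"load_staging"},
    "build_dashboard": {"transform"},
}


def execution_order(deps):
    """Return one valid order in which the tasks can run."""
    return list(TopologicalSorter(deps).static_order())


order = execution_order(dependencies)
print(order)
```

A real scheduler like Airflow adds scheduling, retries and monitoring on top of this basic dependency ordering.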
Software Development & Data Consulting
2017 - present
- Research for Chan Zuckerberg Initiative (CZI): inferred that 30-40% of U.S. adults with criminal records, ~25M people, are eligible for criminal record expungement. Insights helped CZI lobby regulators for state and federal legislation on automatic record clearance.
- Gathered continual insights on anonymous criminal record dataset of ~60K records to relay to CZI to ensure continued funding of project.
- Engineering for Tesla Motors: created Python library for drive simulation hardware to programmatically control car’s 12V battery; tests used to validate various aspects of Model 3’s design.
- Designed and taught curriculum for 11-week master’s business analytics course in Python at Santa Clara University for 30 students; received positive reviews from students.
- Created Random Forest classifier predicting trial conversion rate for personal trainer app with ROC AUC score of 0.88; model helped team interact with leads more efficiently.
- Discovered drop-offs in user onboarding flow for trainer signups on app; made recommendations that developers adopted to redesign flow.
WeWork - Data Scientist
- Designed database schema for Redshift tables utilizing star schema; constructed SQL queries to move data from staging to production tables (ELT); used Airflow to schedule workflow.
- Investigated user behavior flows in community manager app; found unexpected user patterns in event creation process and made suggestions on how to improve user experience.
- Set up instrumentation, designed metrics and built dashboards to measure new features.
TargetX - Data Scientist
2017 - 2018
- Designed and implemented lead scoring model using engagement in social network app that estimated student college enrollment likelihood; guided Director of Admissions on outreach.
- Implemented XGBoost classifier that predicted admitted college students’ likelihood to enroll.
- Analyzed trends in customer data and produced 30+ data visualizations using Python and Plotly that were featured in customer-facing product.
- Identified bottlenecks in email deliverability product through SQL queries and development of internal Chartio dashboards; insights used by support for diagnosis and helped engineers identify bugs and reduce time to send.
- Created metrics and visualizations to highlight product value for renewal sales decks; helped close multiple $15K+ annual contracts.
Boosted Boards - Engineering & Marketing
2014 - 2016 (took ~6 month sabbatical)
- Developed software for electrical test rack using multiple sensors & libraries; collected data and ran experiments to report on performance of lithium-ion batteries in cold weather.
- Provided evidence for engineers to make hardware and software changes before production by identifying outliers and issues in riding telemetry.
- Created KPIs for engineering teams evaluating product performance from test riders highlighting engineering efforts towards hardware launch goals.
- Designed and executed demo and rental offerings at newly partnered retail stores; trained team of contractors to manage events; substantially increased revenue.