Caleb Johnson

Data Scientist

Experienced Machine Learning Engineer specializing in computer vision, NLP, and AI-driven software solutions.

Passionate about producing insights from chaos.

Work Experience

Machine Learning Engineer II

Oct 2024 - Present

Arturo

South Jordan, UT

  • Developed advanced computer vision models using YOLOv8 for geospatial analysis, improving aerial property intelligence.
  • Led the migration of legacy models to modern architectures, optimizing inference speed and cost.
  • Deployed and maintained machine learning pipelines on AWS, integrating MLOps best practices.
  • Conducted model evaluation and QA for large-scale data batches, securing key contracts with insurance providers.

Machine Learning Engineer

May 2022 - Oct 2024

Allstate

Chicago, IL

  • Developed a CNN-LSTM model in PyTorch to analyze dashcam footage and detect risky driving behaviors.
  • Optimized a multi-modal model for real-time analysis of lane changes, achieving a 95% F-score.
  • Implemented scalable data pipelines for processing terabytes of driving footage.

Data Science Intern

Oct 2021 - Feb 2022

KPMG

Salt Lake City, UT

  • Built a deep learning model in PyTorch to classify the prevailing emotion of client inquiries. Created the necessary processing pipelines, deployed the model to AWS, and worked with the development team to integrate it into the company’s application.
  • Designed a high‑quality information dashboard that provided supervisors up‑to‑date statistics on accountant behavior and performance.

Computer Vision Intern

Oct 2020 - Apr 2021

DataMachines

Remote

  • Enhanced the OSIRIS iris recognition algorithm package by developing a UNet+-based masking pipeline and a convolutional neural network to address rotational invariance in scans.

Java Software Developer

Feb 2020 - Jul 2021

DRAGN Labs

Provo, UT

  • Implemented a new data classification process connecting Android/iOS clients to a hosted server, improving the average classification speed by a factor of over 3. This enabled our team to finish the data classification process 4 months ahead of schedule.
  • Designed and trained a novel emotion classifier to compare communication trends online by their perceived level of identifiability. Leveraged this model to classify millions of records with a classification accuracy over 3% better than other models.
  • Organized the pipelining and high‑level classification of hundreds of thousands of Twitter accounts by their level of identifiability.

Software Developer

Jul 2018 - Aug 2019

Record Linking Lab

Provo, UT

  • Collaborated with Ancestry.com to optimize a matching algorithm, improving correct parent classifications by over 20%.
  • Implemented a document segmentation application that improved data categorization of marriage records by a factor of over 5 compared to manual entry.
  • Key contributor to the formation of the Longitudinal Intergenerational Family Electronic Microdatabase (LIFE‑M), a robust database that has been used in papers by researchers at Yale, Michigan, and other institutions.

Education

M.S. in Computer Science

Aug 2021 - Apr 2023

University of Utah

Salt Lake City, UT

  • Emphasis in data science and machine learning

B.S. (Honors) in Statistics and Computer Science

Aug 2017 - Apr 2021

Brigham Young University

Provo, UT

  • Graduated with a double major in CS and Stats, and a minor in Math
  • Successfully defended Honors Thesis, "The Communicative Effects of Anonymity Online: A Natural Language Processing Approach"
  • Honors Program Student Leadership Council Member
  • Awarded Summer 2020 BYU Honors Summer Research Fellowship

Skills

  • Machine Learning & Deep Learning: 95%
  • Computer Vision & Geospatial Analysis: 90%
  • MLOps & Cloud (AWS, GCP): 85%
  • Python, PyTorch, TensorFlow: 90%
  • Data Engineering & SQL: 80%

Interests

Books

Sports

Art
