Meet Isai – Isai Garcia-Baza

I am a data scientist with a strong background in applied machine learning, natural language processing, and statistical analysis. My work combines technical rigor with a problem-solving mindset. I enjoy taking complex, ambiguous challenges and designing clear, data-driven solutions that create real impact.

I thrive on open-ended challenges and am obsessive about solving the problems I take on. Whether the task calls for a quick, scrappy prototype to prove feasibility or a carefully engineered, production-ready pipeline, I can deliver. I am just as comfortable serving as the primary technical lead who builds solutions from the ground up as I am embedding in a highly technical team and contributing at a deep level. My approach is proactive and resourceful: I seek out documentation, track down maintainers when information is scarce, learn the data inside and out, and make sure I understand the problem space before proposing solutions. This “fire-and-forget” mentality means I don’t just complete assigned tasks; I drive them forward relentlessly until they’re solved, often uncovering new opportunities along the way. Above all, I aim to be the kind of data scientist who elevates any team I join: adaptable, persistent, and able to translate between technical execution and organizational impact.

Most recently, I developed end-to-end modeling pipelines using Python, Hugging Face libraries, scikit-learn, and imbalanced-learn, supported by SQL for data extraction and preparation. This work introduced new analytical capabilities to my division, allowing non-technical team members to gain insight from text classifiers through a user-friendly interface. My presentation of this work to leadership was met with enthusiasm. Alongside this, I am collaborating with a professor on research that extracts and classifies images from digitized yearbooks. This project involves image segmentation, classification with vision–language models (both accessed through APIs and hosted locally with Ollama), and the development of a pipeline that automatically ingests scanned yearbooks and extracts images for downstream analysis. To handle the complexity of mapping model outputs to detailed survey responses, I have applied prompt engineering and few-shot learning to align model behavior with the judgments of human reviewers.