Zachary S. Brown, PhD

Former physicist turned data scientist and applied research leader. Currently leading generative AI initiatives for legal research at Thomson Reuters Labs. Specializing in impactful 0-1 ML/AI projects and ML modernization efforts. Deep experience with production generative AI systems, agentic workflows, modern neural architectures for NLP, and building ML systems at scale.


Employment History

Senior Manager, Applied Research | Thomson Reuters Labs

October 2024 — Present

  • Lead team of applied scientists developing and delivering generative AI products for legal research
  • Developed and delivered Thomson Reuters’ first agentic legal research system (Westlaw Advantage Deep Research), taking it from prototype to product in four months; now a market-leading product for legal deep research
  • Manage evaluation process for detailed manual and scalable automatic evaluation of long-form legal research reports and agent trajectories
  • Grew team from two to nine in one year to accommodate rapid evolution in ML and GenAI products

NLP Machine Learning Lead | Kensho (S&P Global)

December 2022 — September 2024

  • Applied research leadership for Kensho and S&P Global NLP products
  • Led development and delivery of S&P’s first generative AI solution to market (ChatIQ)
  • Architected production ML pipelines for financial document analysis and summarization
  • Established best practices for LLM development and evaluation frameworks

Speech & Language Processing Team Lead | Balto.ai

August 2021 — December 2022

  • Machine learning group lead for real-time NLP and speech processing systems
  • Led full migration of legacy speech and language processing model stack
  • Built and deployed systems for real-time transcription, intent detection, and agent assist features
  • Scaled systems to handle millions of customer interactions

NLP Research Engineer | Capital One

January 2020 — August 2021

  • Machine learning development and deployment for conversational AI and human dialogue systems
  • Developed and delivered modern architecture update for core entity recognition model for C1 Eno product
  • Modernized chatbot stack with transformer-based models

Data Science Practice Lead | Snagajob

July 2019 — January 2020

  • Definition and execution of enterprise data science strategy for information retrieval and recommendations

Lead Data Scientist | S&P Global Market Intelligence

November 2017 — July 2019

  • Data science team lead for document enrichment and information extraction pipelines
  • Optimized content enrichment pipeline systems

Lead Data Scientist | UnitedHealth Group

November 2016 — November 2017

Data Scientist | Snagajob

January 2016 — November 2016

Data Science Curriculum Developer | Cloudera

July 2015 — January 2016

Principal Analyst/Engineer | Capital One

February 2014 — July 2015


Projects

Impactful 0-1 ML/AI Projects

Westlaw Advantage Deep Research | Thomson Reuters Labs | 2024

First agentic legal research system deployed to market. Multi-agent RAG architecture for complex query decomposition and comprehensive legal research. Production deployment serving thousands of legal professionals. Built and launched in four months from prototype to product.

Technologies: LangChain, RAG, Multi-agent systems, Evaluation frameworks

ChatIQ | Kensho (S&P Global) | 2023-2024

S&P Global’s first generative AI solution deployed to market. Production LLM systems for financial document understanding, analysis, and Q&A over complex financial datasets. Scaled to handle thousands of daily queries.

Technologies: LLMs, Financial NLP, Document analysis, Production ML

Capital One Eno Modernization | Capital One | 2020-2021

Modern architecture update for core entity recognition model in C1’s Eno chatbot product. Updated chatbot stack with transformer-based models, replacing legacy NLP systems.

Technologies: Transformers, Entity recognition, Chatbots, Production NLP

ML Modernization Efforts

Balto ML Backend Overhaul | Balto.ai | 2021-2022

Full backend ML overhaul for real-time conversation intelligence platform. Complete migration of both speech-to-text (STT) and NLP model stacks. Enabled low-latency inference for live call guidance with sub-second response times.

Technologies: Real-time ML, Speech-to-text, NLP modernization, Low-latency inference

S&P Content Enrichment Optimization | S&P Global | 2017-2019

Document enrichment and information extraction pipeline optimizations. Improved throughput and accuracy for content processing systems at enterprise scale.

Technologies: Information extraction, Pipeline optimization, Document processing, Enterprise ML


Community & Extracurricular

RVATech Summit Founding Organizer | 2018-Present

Founded and continue to organize RVATech’s flagship technical conference. Originally launched as RVATech Data Summit (2018-2023), rebranded to RVATech AI Summit in 2024 to reflect its evolution toward AI and generative AI topics. Draws 300+ attendees annually, with speakers from industry and academia.

RVA Data Science Community Meetup Organizer | 2017-2021

Founded and organized the Richmond, Virginia Data Science Community Meetup. Built a community of 1,500+ data practitioners with monthly technical talks, workshops, and networking events. Organized 100+ events.

Adjunct Professor | Virginia Commonwealth University | 2019-2021

Taught graduate courses in VCU School of Business Decision Analytics program: DAPT615 (Emerging Technologies) and SCMA645 (Management Science TA). Focus on cloud computing, distributed systems, and modern ML frameworks.

PyData Virginia Organizing Committee | 2025

Organizing committee member for PyData Virginia conference in Charlottesville, bringing the global PyData conference series to the Commonwealth.

Contributor | Hugging Face Datasets | 2020

Open source contributions to the Hugging Face Datasets library.

Capstone Project Mentor | VCU Masters in Decision Analytics | 2016

Mentored graduate students on capstone projects for VCU’s Masters in Decision Analytics program.


Education

Doctor of Philosophy, Physics | The College of William & Mary | January 2015

Williamsburg, VA • Thesis: “Heavy hadron spectroscopy and interactions from Lattice QCD”

Master of Science, Physics | The College of William & Mary | January 2010

Williamsburg, VA

Bachelor of Science, Physics | SUNY Fredonia | May 2008

Fredonia, NY


Technical Skills

Generative AI & LLMs: OpenAI, Anthropic, LangChain, Hugging Face, PEFT, Prompt Engineering

NLP & ML: PyTorch, spaCy, AllenNLP, Prodigy, Transformers

Data Science: Python, Pandas, NumPy, SciPy, TensorBoard, PySpark, SparkML

ML Ops: Kubernetes, Seldon Core, Docker, ECR, Kafka, MLflow

Data Storage: Chroma, Redis, Elasticsearch, Snowflake, Postgres, S3