Zachary S. Brown, PhD
Former physicist turned data scientist and applied research leader. Currently leading generative AI initiatives for legal research at Thomson Reuters Labs. Specialize in impactful 0-1 ML/AI projects and ML modernization efforts. Deep experience with production generative AI systems, agentic workflows, modern neural architectures for NLP, and building ML systems at scale.
Employment History
Senior Manager, Applied Research | Thomson Reuters Labs
October 2024 — Present
- Lead team of applied scientists developing and delivering generative AI products for legal research
- Developed and delivered Thomson Reuters’ first agentic legal research system (Westlaw Advantage Deep Research), taking it from prototype to product in four months; now a market-leading product for legal deep research
- Manage evaluation process for detailed manual and scalable automatic evaluation of long-form legal research reports and agent trajectories
- Grew team from two to nine in one year to accommodate rapid evolution in ML and GenAI products
NLP Machine Learning Lead | Kensho (S&P Global)
December 2022 — September 2024
- Applied research leadership for Kensho and S&P Global NLP products
- Led development and delivery of S&P’s first generative AI solution to market (ChatIQ)
- Architected production ML pipelines for financial document analysis and summarization
- Established best practices for LLM development and evaluation frameworks
Speech & Language Processing Team Lead | Balto.ai
August 2021 — December 2022
- Machine learning group lead for real-time NLP and speech processing systems
- Led full migration of legacy speech and language processing model stack
- Built and deployed systems for real-time transcription, intent detection, and agent assist features
- Scaled systems to handle millions of customer interactions
NLP Research Engineer | Capital One
January 2020 — August 2021
- Machine learning development and deployment for conversational AI and human dialogue systems
- Developed and delivered a modern architecture update for the core entity recognition model in Capital One’s Eno product
- Modernized chatbot stack with transformer-based models
Data Science Practice Lead | Snagajob
July 2019 — January 2020
- Definition and execution of enterprise data science strategy for information retrieval and recommendations
Lead Data Scientist | S&P Global Market Intelligence
November 2017 — July 2019
- Data science team lead for document enrichment and information extraction pipelines
- Optimized content enrichment pipeline systems
Lead Data Scientist | UnitedHealth Group
November 2016 — November 2017
Data Scientist | Snagajob
January 2016 — November 2016
Data Science Curriculum Developer | Cloudera
July 2015 — January 2016
Principal Analyst/Engineer | Capital One
February 2014 — July 2015
Projects
Impactful 0-1 ML/AI Projects
| Westlaw Advantage Deep Research | Thomson Reuters Labs | 2024 |
First agentic legal research system deployed to market. Multi-agent RAG architecture for complex query decomposition and comprehensive legal research. Production deployment serving thousands of legal professionals. Built and launched in four months from prototype to product.
Technologies: LangChain, RAG, Multi-agent systems, Evaluation frameworks
| ChatIQ | Kensho (S&P Global) | 2023-2024 |
S&P Global’s first generative AI solution deployed to market. Production LLM systems for financial document understanding, analysis, and Q&A over complex financial datasets. Scaled to handle thousands of daily queries.
Technologies: LLMs, Financial NLP, Document analysis, Production ML
| Capital One Eno Modernization | Capital One | 2020-2021 |
Modern architecture update for the core entity recognition model in Capital One’s Eno chatbot product. Updated chatbot stack with transformer-based models, replacing legacy NLP systems.
Technologies: Transformers, Entity recognition, Chatbots, Production NLP
ML Modernization Efforts
| Balto ML Backend Overhaul | Balto.ai | 2021-2022 |
Full backend ML overhaul for real-time conversation intelligence platform. Complete migration of both speech-to-text (STT) and NLP model stacks. Enabled low-latency inference for live call guidance with sub-second response times.
Technologies: Real-time ML, Speech-to-text, NLP modernization, Low-latency inference
| S&P Content Enrichment Optimization | S&P Global | 2017-2019 |
Document enrichment and information extraction pipeline optimizations. Improved throughput and accuracy for content processing systems at enterprise scale.
Technologies: Information extraction, Pipeline optimization, Document processing, Enterprise ML
Community & Extracurricular
| RVATech Summit Founding Organizer | 2018 — Present |
Founded and organize RVATech’s flagship technical conference. Originally launched as RVATech Data Summit (2018-2023), rebranded to RVATech AI Summit in 2024 to reflect evolution toward AI and generative AI topics. 300+ attendees annually, featuring speakers from industry and academia.
| RVA Data Science Community Meetup Organizer | 2017 — 2021 |
Founded and organized Richmond Virginia Data Science Community Meetup. Built community of 1500+ data practitioners with monthly technical talks, workshops, and networking events. 100+ events organized.
| Adjunct Professor | Virginia Commonwealth University | 2019 — 2021 |
Taught graduate courses in the VCU School of Business Decision Analytics program: DAPT615 (Emerging Technologies) and SCMA645 (Management Science TA). Focus on cloud computing, distributed systems, and modern ML frameworks.
| PyData Virginia Organizing Committee | 2025 |
Organizing committee member for PyData Virginia conference in Charlottesville, bringing the global PyData conference series to the Commonwealth.
| Contributor | Hugging Face Datasets | 2020 |
Open source contributions to the Hugging Face Datasets library.
| Capstone Project Mentor | VCU Master’s in Decision Analytics | 2016 |
Mentored graduate students on capstone projects for VCU’s Master’s in Decision Analytics program.
Education
| Doctor of Philosophy, Physics | The College of William & Mary | January 2015 |
Williamsburg, VA • Thesis: “Heavy hadron spectroscopy and interactions from Lattice QCD”
| Master of Science, Physics | The College of William & Mary | January 2010 |
Williamsburg, VA
| Bachelor of Science, Physics | SUNY Fredonia | May 2008 |
Fredonia, NY
Technical Skills
Generative AI & LLMs: OpenAI, Anthropic, LangChain, Hugging Face, PEFT, Prompt Engineering
NLP & ML: PyTorch, spaCy, AllenNLP, Prodigy, Transformers
Data Science: Python, Pandas, NumPy, SciPy, TensorBoard, PySpark, SparkML
MLOps: Kubernetes, Seldon Core, Docker, ECR, Kafka, MLflow
Data Storage: Chroma, Redis, Elasticsearch, Snowflake, Postgres, S3
