Category 4: AI Infrastructure -- RLVR Environments & Training Infra

RLVR environments & training infrastructure -- Bay Area

Companies building the tooling, compute, frameworks, and platforms that AI labs rely on to train, evaluate, and deploy models. Understanding this infrastructure is critical for eventually building and training an investment AI via RLVR or continual learning.
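To anchor the RLVR framing above: "verifiable rewards" means the reward signal comes from checking an output against a ground-truth outcome rather than from human preference labels. A toy, self-contained sketch (not from any of these companies' tooling; all numbers and function names are illustrative):

```python
import random

def verifiable_reward(prediction: float, realized: float, tol: float = 0.02) -> float:
    """Binary reward: 1.0 if the prediction lands within `tol` of the
    realized value, else 0.0. The outcome itself is the verifier --
    no human labeler in the loop."""
    return 1.0 if abs(prediction - realized) <= tol else 0.0

def rlvr_step(policy_mean: float, realized: float, lr: float = 0.5,
              n_samples: int = 64, rng: random.Random = random.Random(0)) -> float:
    """One toy policy-gradient-style update: sample predictions around the
    current policy mean, then move toward the reward-weighted average."""
    samples = [rng.gauss(policy_mean, 0.05) for _ in range(n_samples)]
    rewards = [verifiable_reward(s, realized) for s in samples]
    if sum(rewards) == 0:
        return policy_mean  # no reward signal this step
    weighted = sum(s * r for s, r in zip(samples, rewards)) / sum(rewards)
    return policy_mean + lr * (weighted - policy_mean)

# The policy mean drifts toward the verifiable target over iterations.
mean = 0.0
for _ in range(50):
    mean = rlvr_step(mean, realized=0.10)
```

Real RLVR systems apply the same loop to LLM policies with rewards from unit tests, math checkers, or market outcomes; the infrastructure companies below supply the data, compute, and orchestration layers around it.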

Bay Area focus: 6 companies in the main section. 3 non-Bay Area companies archived below for reference.

Bay Area Companies (6)


1. Scale AI -- Bay Area

RLHF data labeling platform -- backbone of LLM alignment
$13.8B
Valuation
1,200+
Employees
2016
Founded
San Francisco, CA
HQ

Key People

Alexandr Wang

CEO & Co-Founder
Became the world's youngest self-made billionaire. Dropped out of MIT at 19. Recruited to Meta's AI leadership in 2025. Built Scale into the dominant RLHF data labeling platform, used by virtually every major AI lab.

Product: RLHF Data Labeling Platform

  • Human preference data at scale -- provides the human preference data that virtually every major AI lab uses to align their LLMs via RLHF
  • Data annotation at massive scale using a global network of human labelers for training data curation
  • Evaluation and red-teaming services for model safety testing -- stress-testing LLMs for alignment failures
  • Large government/defense contracts (DoD, intelligence community) -- significant revenue from federal AI programs
  • Enterprise AI deployment tools for organizations adopting LLMs in production
  • Note: laid off ~14% of workforce in 2025, suggesting tighter operations and higher hiring bar

Compensation (ML Engineer)

  • ML Engineer: $250K-$400K total comp
  • Senior/Staff: $400K-$600K total comp
  • Equity at $13.8B valuation
  • Recent layoffs may mean higher bar for new hires
  • Source: levels.fyi, Glassdoor

Why It Matters

Scale AI is the backbone of RLHF -- the technique used to align every major LLM. Understanding how high-quality human preference data is generated at scale is essential for anyone working on RLVR. If you want to build an AI that learns from human feedback (or verifiable rewards), understanding the data pipeline is foundational. Scale's position at the center of the AI alignment ecosystem gives unique visibility into how every major lab approaches the problem.

Path to Entry

  • ML Engineer roles in data quality/annotation systems
  • Meta background is strong signal -- Alexandr Wang was recruited to Meta leadership
  • Post-layoff hiring bar is likely higher, but leaner team means more impact per engineer
  • San Francisco HQ, ideal location


2. Anyscale (Ray) -- Bay Area

Ray distributed computing framework + RLlib scalable RL library -- Ray is used by OpenAI
$260M+
Total Raised
200+
Employees
2019
Founded
UC Berkeley
Origin (RISELab)
San Francisco, CA
HQ
RLlib
RL Framework

Key People

Robert Nishihara

Co-founder
PhD in ML systems. Co-created Ray at UC Berkeley RISELab. Expert in distributed computing for machine learning workloads.
Philipp Moritz

Co-founder
Distributed systems expert. Co-created Ray. Focused on making distributed computing accessible for ML practitioners.
Ion Stoica

Co-founder
UC Berkeley professor. Also co-founded Databricks. Created Apache Spark and Mesos. One of the most influential figures in distributed systems and data infrastructure.

Product: Ray Framework + RLlib

  • Ray -- open-source distributed computing framework used by OpenAI, Uber, Spotify, Instacart for scaling ML workloads
  • RLlib -- one of the most popular RL libraries, purpose-built for scalable reinforcement learning in production
  • Ray Train for distributed model training orchestration across GPU clusters
  • Ray Serve for model serving and inference at scale
  • Ray Data for distributed data processing pipelines
  • Most directly relevant company for RLVR infrastructure -- RLlib is purpose-built for RL at scale, exactly the tooling needed for training an investment AI with reinforcement learning
  • OpenAI uses Ray for training their models -- proven at the frontier of AI
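For orientation, Ray's core pattern -- fan episodes out to parallel workers, gather experience back to a trainer -- can be sketched with the standard library alone. This is a conceptual stand-in, not Ray's API; in Ray, `rollout` would be a `@ray.remote` task and the gather would be a `ray.get` over futures:

```python
from concurrent.futures import ThreadPoolExecutor
import random

def rollout(policy_version: int, seed: int) -> dict:
    """Toy environment episode: returns the experience a worker would
    ship back to the trainer (a remote task in Ray)."""
    rng = random.Random(seed)
    steps = [rng.random() for _ in range(10)]
    return {"policy": policy_version, "return": sum(steps)}

def gather_rollouts(policy_version: int, n_workers: int = 8) -> list[dict]:
    """Fan out episodes in parallel and gather the results -- the sample
    collection pattern RLlib implements at cluster scale."""
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        futures = [pool.submit(rollout, policy_version, seed)
                   for seed in range(n_workers)]
        return [f.result() for f in futures]

batch = gather_rollouts(policy_version=1)
```

Ray's value is that the same few lines scale from threads on a laptop to thousands of workers across a GPU cluster, with fault tolerance and scheduling handled for you.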

Compensation (ML Engineer)

  • ML Engineer: $250K-$400K total comp
  • Senior: $400K-$550K total comp
  • Pre-IPO equity with significant upside potential
  • Source: levels.fyi, Glassdoor

Why It Matters

Arguably the most directly relevant company for RLVR infrastructure. RLlib is purpose-built for scalable reinforcement learning -- exactly the kind of framework needed to train an investment AI with verifiable rewards. OpenAI uses Ray for training, validating it as frontier-grade infrastructure. Understanding distributed RL systems from the inside would be invaluable for eventually building custom RLVR training pipelines. The Berkeley pedigree (Ion Stoica also co-founded Databricks and created Spark) means world-class systems engineering culture.

Path to Entry

  • Strong fit for ML Engineer with distributed systems experience from Meta
  • RLlib team is the ideal placement -- directly working on scalable RL infrastructure
  • Contributing to the Ray open-source project is a strong entry signal -- it demonstrates competence before applying
  • San Francisco HQ, ideal location


3. Weights & Biases -- Bay Area

Standard MLOps experiment tracker -- acquired by CoreWeave (2025)
Acquired by CoreWeave in 2025. Combined entity offers GPU compute + ML observability. SF office maintained. Now CoreWeave equity (post-IPO).
$250M
Total Raised
~250
Employees
2017
Founded
San Francisco, CA
HQ
70K+
Organizations
CoreWeave
Acquirer (2025)

Key People

Lukas Biewald

CEO & Co-founder
Previously founded CrowdFlower/Appen (data labeling pioneer). Stanford CS. Built W&B into the standard experiment tracking tool for ML research and production.
Chris Van Pelt

Co-founder
Co-founded W&B. Engineering leadership building the core experiment tracking and visualization platform.
Shawn Lewis

Co-founder
Co-founded W&B. Technical leadership on the platform's core infrastructure.

Product: MLOps Platform

  • Experiment tracking and visualization -- log metrics, hyperparameters, model outputs during training runs in real time
  • Sweeps -- automated hyperparameter search and optimization
  • Model registry and artifact management -- version and manage trained model weights
  • Dataset versioning -- track and version training datasets
  • Collaborative ML reports -- share experiment results and analysis with teams
  • Used by 70K+ organizations -- the standard experiment tracker for ML training including RLHF/RLVR runs
  • Now part of CoreWeave (acquired 2025) -- combined entity offers GPU compute + ML observability, a full-stack AI training platform
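The heart of such a platform is structured, step-stamped metric logging. A minimal standard-library stand-in for the `wandb.init()` / `wandb.log()` usage pattern (field names here are invented for illustration; W&B ships these records to a hosted dashboard):

```python
import json
import time

class RunLogger:
    """Minimal stand-in for the wandb.init()/wandb.log() pattern:
    one JSON-serializable record per step so runs can be compared later."""
    def __init__(self, project: str, config: dict):
        self.project = project
        self.step = 0
        self.records = [{"event": "init", "time": time.time(), "config": config}]

    def log(self, metrics: dict) -> None:
        """Record one step of metrics, like wandb.log({"loss": ...})."""
        self.step += 1
        self.records.append({"event": "log", "step": self.step, **metrics})

    def export(self) -> str:
        """Serialize the run as JSON lines for offline comparison."""
        return "\n".join(json.dumps(r) for r in self.records)

run = RunLogger("rlvr-sweep", config={"lr": 3e-4, "batch_size": 64})
for loss in (0.9, 0.5, 0.2):
    run.log({"loss": loss})
```

W&B layers visualization, sweeps, and artifact versioning on top of exactly this kind of record stream.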

Compensation (ML Engineer)

  • ML Engineer: $250K-$400K total comp
  • Now CoreWeave equity (post-IPO)
  • SF office maintained post-acquisition
  • Source: levels.fyi, Glassdoor

Why It Matters

Weights & Biases is the standard experiment tracking tool for ML training. Every RLHF/RLVR training run, reward model iteration, and RL optimization loop is typically tracked through W&B. Understanding the observability layer of model training gives deep insight into what makes training succeed or fail. Now part of CoreWeave, the combined entity offers compute + observability -- a full-stack view of the AI training pipeline. The SF office is maintained.

Path to Entry

  • Now part of CoreWeave -- apply through CoreWeave hiring
  • ML Engineer roles on platform/infrastructure side
  • SF office maintained post-acquisition
  • Meta ML infrastructure experience is directly relevant


4. Fireworks AI -- Bay Area

Fast LLM inference platform -- founded by ex-Meta PyTorch team
$77M+
Total Raised
~50
Employees
2022
Founded
Redwood City, CA
HQ

Key People

Lin Qiao

CEO & Co-founder
Previously on Meta's PyTorch team, led engineering at Meta's AI infrastructure org. Deep expertise in ML systems, model optimization, and production inference at scale.

Product: LLM Inference Platform

  • Low-latency, high-throughput serving of open-source models (Llama, Mistral, etc.) -- optimized inference stack
  • Quantization and optimization techniques for inference -- reducing model size while preserving quality
  • Custom model fine-tuning and deployment -- end-to-end from fine-tune to production serving
  • Function calling and structured output -- reliable tool use and JSON output from LLMs
  • Multi-model routing -- intelligently route requests across different models based on task requirements
  • Founded by ex-Meta PyTorch team -- direct expertise in the core ML framework stack

Compensation (ML Engineer)

  • Startup comp, estimated $250K-$400K total
  • Early-stage equity at $77M+ raised stage -- significant upside potential
  • Small team (~50) means high individual impact
  • Source: estimated based on stage and Bay Area market

Why It Matters

Inference optimization is crucial for RLVR -- fast model serving is needed during the RL training loop where the policy model must be evaluated repeatedly. Understanding how to make LLM inference fast and efficient is directly applicable to building a training pipeline for an investment AI. The ex-Meta PyTorch team founding means direct cultural fit for Ravi -- these are people who built the infrastructure he uses daily at Meta. Small team in the Bay Area with strong technical pedigree.

Path to Entry

  • Strong Meta connection -- founded by ex-Meta PyTorch/AI infra team
  • ML Infrastructure roles focused on inference optimization
  • Small team (~50) in the Bay Area -- high individual impact
  • Ravi's Meta ML background is an ideal cultural and technical fit


5. Lambda Labs -- Bay Area

GPU cloud for AI training -- workstations, clusters, and Lambda Stack
$500M+
Total Raised
200+
Employees
2012
Founded
San Francisco, CA
HQ

Key People

Stephen Balaban

CEO & Founder
Founded Lambda in 2012, well before the current AI boom -- an early bet on GPU computing for ML that has proven prescient. Built Lambda into a leading GPU cloud and workstation provider for AI researchers and startups.

Product: GPU Cloud & Workstations

  • GPU cloud instances for AI training and inference -- A100, H100 clusters on demand
  • GPU workstations and servers for on-premise use -- pre-built hardware for ML teams
  • Lambda Stack -- pre-configured ML software suite (PyTorch, TensorFlow, CUDA all pre-installed and tested together)
  • Competitive pricing vs. major cloud providers (AWS, GCP, Azure) -- often 50%+ cheaper for GPU compute
  • Popular with researchers and startups for on-demand GPU access without the complexity of hyperscaler clouds

Compensation (ML/Infra Engineer)

  • ML/Infra Engineer: $200K-$350K total comp
  • Equity at $500M+ raised stage
  • SF-based
  • Source: Glassdoor

Why It Matters

Lambda Labs provides the GPU compute layer for RLVR training. Understanding GPU cluster management, scheduling, and optimization is fundamental to training any serious ML model. Lambda's position as a GPU cloud provider gives deep exposure to the compute infrastructure that powers AI training. For building an investment AI, you need to understand how to provision and manage the compute resources for large-scale RL training runs. SF-based with a strong engineering culture.

Path to Entry

  • Cloud infrastructure engineering roles
  • ML platform roles focused on GPU orchestration
  • San Francisco based
  • Meta infrastructure experience is relevant for large-scale systems


6. Replicate -- Bay Area

Run ML models via cloud API + Cog (Docker for ML) -- founded by the creator of Docker Compose
$58M+
Total Raised
~50+
Employees
2019
Founded
San Francisco, CA
HQ

Key People

Ben Firshman

CEO & Co-founder
Created Docker Compose, one of the most widely used developer tools in the world. Applies the same philosophy -- making complex infrastructure accessible -- to ML model deployment.
Andreas Jansson

Co-founder
Built Cog -- the open-source tool for packaging ML models into containers. Like Docker for ML models, solving the reproducibility and deployment problem.

Product: ML Model Cloud API + Cog

  • Run open-source ML models via simple cloud API -- no infrastructure management needed
  • Cog -- open-source tool for packaging ML models into containers (like Docker for ML models), solving reproducibility
  • GPU infrastructure for inference -- auto-scaling GPU allocation based on demand
  • Model marketplace with community-contributed models -- discover and run models instantly
  • Fine-tuning workflows -- customize open-source models with your own data
  • Popular for image generation (Stable Diffusion), LLMs, audio models
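For flavor, the shape of a Cog model definition, sketched from Cog's public docs (package versions and the predictor reference below are placeholders, not from this document):

```yaml
# cog.yaml -- declares the environment Cog bakes into the container
build:
  gpu: true
  python_version: "3.11"
  python_packages:
    - "torch==2.1.0"
# Points Cog at the class implementing setup() and predict()
predict: "predict.py:Predictor"
```

`cog predict` then builds the image and runs the model locally, and `cog push` publishes it to Replicate -- the whole reproducibility story lives in one declarative file plus a predictor class.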

Compensation (ML/Infra Engineer)

  • Startup comp, estimated $200K-$350K total
  • Early-stage equity
  • Small team (~50), high individual impact
  • Source: estimated based on stage and Bay Area market

Why It Matters

Replicate solves the model serving and reproducibility problem for ML -- critical for the inference side of RLVR. Cog (Docker for ML models) addresses one of the biggest pain points in ML: making models reproducible and deployable. For building an investment AI, you need to version, serve, and iterate on model checkpoints rapidly. The creator of Docker Compose bringing the same philosophy to ML infrastructure is a strong signal. Small team in SF with strong engineering culture.

Path to Entry

  • Small team (~50) -- infrastructure engineering, GPU orchestration, ML deployment
  • Docker/container experience valued -- Cog is the core product
  • San Francisco based
  • Meta infrastructure and ML experience is directly relevant


Archived -- Not Bay Area (3 companies)

Hugging Face -- New York / Paris

The "GitHub of ML" -- 500K+ models, TRL library, Transformers
$4.5B
Valuation
500K+
Models Hosted
~250
Employees
NYC / Paris
HQ

Why Archived

HQ in New York and Paris -- not Bay Area. The "GitHub of ML" with 500K+ models and 100K+ datasets. Transformers library is the de facto standard for working with pre-trained models. TRL (Transformer Reinforcement Learning) library is directly relevant for RLHF/RLVR. Also PEFT/LoRA for efficient fine-tuning. Founded by Clement Delangue, Julien Chaumond, Thomas Wolf. Extremely relevant technically but requires relocation to NYC or Paris.

CoreWeave -- Livingston, NJ

GPU cloud infrastructure -- IPO'd 2025, acquired W&B
$75B+
Market Cap
IPO 2025
Stage
Livingston, NJ
HQ
NVIDIA
Key Backer

Why Archived

HQ in Livingston, New Jersey. IPO'd in 2025, $75B+ market cap. GPU cloud infrastructure purpose-built for AI workloads. NVIDIA-backed. Acquired Weights & Biases in 2025. Originally a crypto mining operation, pivoted to AI compute. Founded by Michael Intrator, Brian Venturo, Brannin McBee. Note: W&B acquisition means some Bay Area presence through the SF W&B office, but core operations are NJ-based.

Lightning AI -- New York

PyTorch Lightning framework -- 160M+ downloads, LitGPT
$108M+
Total Raised
~100
Employees
New York
HQ
160M+
Downloads

Why Archived

HQ in New York. PyTorch Lightning framework with 160M+ downloads -- the most popular high-level wrapper for PyTorch training. LitGPT for LLM training and fine-tuning. Lightning Studios cloud IDE for ML development. ~100 employees. Founded by William Falcon (PhD under Yann LeCun at NYU). Strong technical foundation but requires relocation to NYC.
