AWB Investment Research Strategy

"The Painting" — A progressive, layered approach to finding 10x investments with downside protection. Last updated: 2026-03-11.

Contents

  1. The Goal
  2. Current State — What We Have
  3. Storage Strategy — The Single Source of Truth
  4. The Layers — Progressive Deepening
  5. The Funnel — From 8,049 to ~10
  6. Prioritization Logic
  7. Execution Pipeline — What to Do Next
  8. Cost Budget
  9. Phase 3: The Watchlist
  10. Operating Principles

1. The Goal

We are looking for:

Companies where we can buy at a price that gives us (a) very little chance of losing money AND (b) potential for 10x in 5 years if AGI plays out.

This means finding the intersection of two things:

  1. Downside protection — the stock trades at or below a "floor price" based on tangible assets, guaranteed cash flows, or liquidation value. Not just book value from the balance sheet, but real asset value: cash, real estate, equipment, owned subsidiaries, contracted revenue, IP with clear market value.
  2. AGI upside — the company is a strong AGI beneficiary (scored 7+ in our framework). If AGI arrives by 2027-2030 as we expect, this company's earnings could grow 5-20x.

The output is a small watchlist (5-15 companies) with specific entry prices. We monitor prices and act when they hit our targets.

2. Current State — What We Have

Data Assets

Asset | Status | Location | Coverage
Company Universe | DONE | awb/data/universe/universe.db | 8,049 companies (CIK, ticker, name, exchange)
10-K Extracts | DONE | awb/data/exports/10k_all/ | 3,648 companies (10-K text extracts, up to 20K chars each)
AGI Impact Scores | DONE | agi_scores/results/{TICKER}.json and agi_scores/agi_scores_all.json | 3,648 companies scored (5 dimensions + holistic score 1-10 + category + reasoning)
Returns Backtest | DONE | agi_scores/returns_data.json | 59 of 68 score 9-10 companies (1yr, 2yr, 3yr returns vs SPY)
Leopold Portfolio | DONE | /tmp/salp_13f/ | 5 quarterly 13F filings parsed. Full position analysis with P&L.
Investor Tracker | DONE | outputs/investor-tracker-*.html | 5 investors profiled (Li Lu, Weschler, Combs, Pabrai, Leopold)
META Deep Dive | DONE | awb/data/exports/META_analysis/ | Full 6-agent analysis + 3 price targets. Template for future deep dives.
Floor Price Pilot | DONE | outputs/floor-price-analysis-10-companies.html | 10 random companies. Proof of concept only.
AWB SQLite DB | EMPTY | awb/data/awb.db | Schema exists (companies, financials, facts, analyses) but only 7 companies, 0 financials. Not being used.
Structured Financials | NOT STARTED | | No structured balance sheet / income statement data at scale

Problem: Data is Scattered

Results are spread across JSON files, SQLite databases (2 of them, both underused), HTML reports, markdown files, and /tmp. There is no single place where "everything we know about Company X" lives. This means:

3. Storage Strategy — The Single Source of Truth

Decision: Use awb.db as the master database

Every piece of analysis we produce gets stored in the SQLite database at awb/data/awb.db. One database. One source of truth. Every session reads from and writes to this database.

Database Schema

The existing schema is close to what we need but requires these changes:

-- 1. COMPANIES TABLE (exists, needs population)
--    Import all 8,049 companies from universe.db
--    Add columns for quick filtering

-- 2. NEW: company_metrics — the "painting" table
--    Each row = one metric for one company
--    This is where ALL layers of analysis accumulate
CREATE TABLE company_metrics (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    ticker TEXT NOT NULL,
    cik TEXT,
    metric_name TEXT NOT NULL,     -- e.g. 'agi_score', 'tangible_book_value', 'floor_price'
    metric_value REAL,             -- numeric value (NULL if text-only)
    metric_text TEXT,              -- text value (for reasoning, category, etc.)
    metric_json TEXT,              -- JSON for complex data (timelines, breakdowns)
    source TEXT,                   -- 'agi_scoring_v1', 'yfinance', 'sec_xbrl', 'deep_dive_v1'
    confidence TEXT,               -- 'high', 'medium', 'low'
    computed_at TIMESTAMP,
    cost_usd REAL,                 -- what it cost to compute this
    UNIQUE(ticker, metric_name, source)  -- no duplicates
);

-- 3. NEW: watchlist — the final output
CREATE TABLE watchlist (
    ticker TEXT PRIMARY KEY,
    entry_price REAL,              -- our target buy price
    current_price REAL,
    floor_price REAL,
    bull_case_5yr REAL,            -- 5-year bull case price
    agi_score INTEGER,
    conviction TEXT,               -- 'high', 'medium', 'low'
    thesis TEXT,                   -- 1-2 sentence investment thesis
    last_updated TIMESTAMP
);
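A minimal sketch of how a session might write into company_metrics using Python's built-in sqlite3 module. The store_metric helper and its default arguments are hypothetical, not existing AWB code. It relies on SQLite's upsert syntax (3.24 or later) so that re-running the same source refreshes the row, while a different source tag adds a new row alongside the old one.

```python
import sqlite3

# Same columns as the company_metrics definition above.
SCHEMA = """
CREATE TABLE IF NOT EXISTS company_metrics (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    ticker TEXT NOT NULL,
    cik TEXT,
    metric_name TEXT NOT NULL,
    metric_value REAL,
    metric_text TEXT,
    metric_json TEXT,
    source TEXT,
    confidence TEXT,
    computed_at TIMESTAMP,
    cost_usd REAL,
    UNIQUE(ticker, metric_name, source)
);
"""


def store_metric(conn, ticker, metric_name, value, source,
                 confidence="medium", cost_usd=0.0):
    """Insert a numeric metric, or refresh it if the same
    (ticker, metric_name, source) row already exists.

    Hypothetical helper: name and defaults are assumptions.
    """
    conn.execute(
        """
        INSERT INTO company_metrics
            (ticker, metric_name, metric_value, source,
             confidence, computed_at, cost_usd)
        VALUES (?, ?, ?, ?, ?, datetime('now'), ?)
        ON CONFLICT(ticker, metric_name, source) DO UPDATE SET
            metric_value = excluded.metric_value,
            confidence   = excluded.confidence,
            computed_at  = excluded.computed_at,
            cost_usd     = excluded.cost_usd
        """,
        (ticker, metric_name, value, source, confidence, cost_usd),
    )
    conn.commit()
```

Usage would look like `conn = sqlite3.connect("awb/data/awb.db"); conn.executescript(SCHEMA); store_metric(conn, "XYZ", "agi_score", 8, "agi_scoring_v1")`, where XYZ is a placeholder ticker.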

What Gets Stored Where

Data Type | Storage | Why
All metrics (scores, financials, valuations) | company_metrics table | Queryable, deduplicated, versionable
Deep-dive analyses (long text) | analyses table (existing) | Already designed for this
Key facts/signals | facts table (existing) | Structured findings from analyses
Final watchlist | watchlist table | The output: what we actually monitor
Raw data files (10-K text, XML) | Flat files on disk | Too large for SQLite, used as inputs
HTML reports | outputs/ directory | Human-readable outputs, synced to Google Drive

Deduplication Rule

Before computing any metric, check if it already exists:

SELECT * FROM company_metrics
WHERE ticker = ? AND metric_name = ?
AND computed_at > datetime('now', '-30 days')

If a result exists and is less than 30 days old, skip. If older than 30 days, recompute (prices change, filings update). Each metric has a source field so we can track which version of the analysis produced it.
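The freshness check can be wrapped in a small function so every pipeline step calls it the same way. This is a sketch only: is_fresh and max_age_days are hypothetical names, and it assumes computed_at is stored in SQLite's default 'YYYY-MM-DD HH:MM:SS' UTC text format, which is what datetime('now') produces.

```python
import sqlite3


def is_fresh(conn, ticker, metric_name, max_age_days=30):
    """Return True if a result for this metric was computed within
    the last max_age_days, meaning the caller can skip recomputing it."""
    row = conn.execute(
        """
        SELECT 1 FROM company_metrics
        WHERE ticker = ? AND metric_name = ?
          AND computed_at > datetime('now', ?)
        LIMIT 1
        """,
        # The modifier string, e.g. '-30 days', is passed as a parameter.
        (ticker, metric_name, f"-{int(max_age_days)} days"),
    ).fetchone()
    return row is not None
```

A step would then start with `if is_fresh(conn, ticker, "floor_price"): continue` before doing any paid work.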

Migration Plan (One-Time)

  1. Import 8,049 companies from universe.db into awb.db companies table
  2. Import 3,648 AGI scores from agi_scores_all.json into company_metrics
  3. Import Leopold portfolio data into company_metrics (as a "smart_money_signal" metric)
  4. Import the 10-company floor price pilot results

4. The Layers — Progressive Deepening

Each layer adds more detail to our picture of each company. Earlier layers are cheap and broad; later layers are expensive and narrow. We paint the most important parts first.

# | Layer | Coverage | Cost | Metrics Produced | Status
0 | Universe (Who exists?) | 8,049 | Free | ticker, name, CIK, exchange, SIC code | DONE
1 | AGI Score (Does AGI help or hurt?) | 3,648 | $355 | agi_score (1-10), 5 dimension scores, category, confidence, reasoning | DONE
2 | Financial Snapshot (What do the numbers say?) | ~350 (score 7+) | ~$0 (yfinance API) | market_cap, revenue, net_income, total_assets, total_liabilities, total_debt, cash, book_value, tangible_book_value, fcf, pe_ratio, pb_ratio, ev_ebitda, dividend_yield, shares_outstanding | NEXT
3 | Asset-Based Floor (What's the downside?) | ~350 | ~$35 (Haiku batch) | floor_price, floor_confidence, floor_methodology, tangible_book_per_share, net_current_asset_value, liquidation_value_estimate | NEXT
4 | Smart Money Signal (Who else owns this?) | ~350 | ~$0 (SEC EDGAR 13F) | leopold_holds, leopold_shares, leopold_avg_cost, top_13f_holders, insider_buying_ratio, institutional_ownership_pct | NEXT
5 | Quick Valuation (What's cheap vs expensive?) | ~100 (passed filters) | ~$10 (Haiku) | price_to_tangible_book, ev_to_fcf, price_vs_floor, margin_of_safety_pct, cheapness_rank | LATER
6 | 5-Year Cash Flow Model (What's the upside?) | ~30 | ~$30 (Sonnet) | bull_revenue_2031, bull_fcf_2031, bull_price_2031, base_price_2031, bear_price_2031, entry_price_for_10x, current_price_vs_10x_entry | LATER
7 | Deep Dive (Full 6-agent analysis) | ~10 | ~$50 (Opus) | Full META-style analysis: business moat, management, risks, 3 price targets, AGI impact deep-dive | LATER
8 | Watchlist (Monitor and act) | 5-15 | ~$0 | entry_price, current_price, alert_threshold, thesis, conviction | LATER

Total estimated cost to reach the watchlist: ~$125

Layers 0-1 are already done ($355 spent). Layers 2-8 cost approximately $75-125 more. The funnel narrows aggressively, so expensive analysis is only done on a handful of companies.

5. The Funnel — From 8,049 to ~10

8,049 companies — Full universe
3,648 — Have 10-K filings (scored for AGI impact)
Filter: AGI score ≥ 7
~350 — AGI beneficiaries (Layers 2-4 applied here)
Filter: trades near/below floor OR P/TB < 2 OR smart money signal
~50-100 — Cheap + AGI tailwind (Layer 5 applied here)
Filter: margin of safety ≥ 30% AND 10x entry price plausible
~20-30 — Candidates for cash flow modeling (Layer 6)
Filter: 10x entry price ≤ current price × 1.5
5-15 companies — WATCHLIST (Deep Dive + Monitor)

6. Prioritization Logic

Within each layer, we prioritize companies using a composite score:

Priority Score = weighted combination of:

Signal | Weight | Why
AGI Score (1-10) | 30% | Higher score = more AGI tailwind = more upside potential
Proximity to Floor (current price / floor price) | 25% | Closer to floor = more downside protection = safer entry
Smart Money Signal (held by tracked investors) | 20% | Leopold, Li Lu, etc. holding = validation of thesis
Asset Density (tangible book / market cap) | 15% | Higher = more asset backing per dollar of market cap
Simplicity (can we understand the business?) | 10% | Simpler businesses = more reliable floor estimates
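One way the weighted combination could be computed is sketched here. The weights come from the table above, but the document does not say how each signal is scaled, so the normalization of every signal to a 0-1 range is our assumption: AGI score divided by 10, floor divided by current price, holder count capped at 2, tangible book over market cap capped at 1, and simplicity supplied as a manual 0-1 judgment.

```python
# Weights from the prioritization table; they sum to 1.0.
WEIGHTS = {
    "agi": 0.30,
    "floor_proximity": 0.25,
    "smart_money": 0.20,
    "asset_density": 0.15,
    "simplicity": 0.10,
}


def clamp01(x):
    """Clamp a value into the range [0, 1]."""
    return max(0.0, min(1.0, x))


def priority_score(agi_score, current_price, floor_price,
                   tracked_holders, tangible_book, market_cap, simplicity):
    """Composite 0-1 priority score. All scalings are assumptions."""
    signals = {
        "agi": clamp01(agi_score / 10),
        # 1.0 when trading at or below the floor, falling as price rises.
        "floor_proximity": clamp01(floor_price / current_price),
        # Two or more tracked investors counts as a full signal.
        "smart_money": clamp01(tracked_holders / 2),
        "asset_density": clamp01(tangible_book / market_cap),
        "simplicity": clamp01(simplicity),
    }
    return sum(WEIGHTS[name] * value for name, value in signals.items())
```

Because every signal is clamped to 0-1, the score itself stays in 0-1 and companies can be ranked directly with an ORDER BY once it is stored as a metric.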

Special Priority Boost

7. Execution Pipeline — What to Do Next

Step 0: Database Migration DO FIRST

One-time setup. Consolidate all existing data into awb.db.

  • Create company_metrics and watchlist tables
  • Import 8,049 companies from universe.db
  • Import 3,648 AGI scores from JSON into company_metrics
  • Import Leopold holdings as smart money signals

Cost: $0   Time: ~30 minutes

Step 1: Financial Snapshot (Layer 2) DO SECOND

Pull structured financial data for all ~350 score-7+ companies via yfinance.

  • Market cap, revenue, net income, total assets, total liabilities, cash, debt
  • Tangible book value (total equity minus goodwill minus intangibles)
  • FCF, P/E, P/B, EV/EBITDA
  • Store all in company_metrics

Cost: $0   Time: ~15 minutes (yfinance API calls)
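The derived fields in this step are simple arithmetic once the raw numbers are in hand. The sketch below deliberately leaves out the yfinance calls and works on a plain dict, so the field names (total_equity, goodwill, intangibles, market_cap) are hypothetical and would need mapping from whatever the data source returns.

```python
def tangible_book_value(total_equity, goodwill=None, intangibles=None):
    """Total equity minus goodwill minus other intangibles.
    Missing items are treated as zero."""
    return total_equity - (goodwill or 0.0) - (intangibles or 0.0)


def snapshot_rows(ticker, fin, source="yfinance"):
    """Turn one company's raw financials into
    (ticker, metric_name, metric_value, source) rows for company_metrics.

    `fin` is an assumed dict shape, not a real yfinance object.
    Ratios are None when the denominator is not positive.
    """
    equity = fin["total_equity"]
    tbv = tangible_book_value(equity, fin.get("goodwill"), fin.get("intangibles"))
    market_cap = fin["market_cap"]
    metrics = {
        "market_cap": market_cap,
        "tangible_book_value": tbv,
        "pb_ratio": market_cap / equity if equity > 0 else None,
        "price_to_tangible_book": market_cap / tbv if tbv > 0 else None,
    }
    return [(ticker, name, value, source) for name, value in metrics.items()]
```

Returning None for a ratio with a non-positive denominator keeps negative-equity companies from looking artificially cheap in later filters.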

Step 2: Asset-Based Floor Screening (Layer 3) DO THIRD

For each of the ~350 companies, compute a rough floor price. Two approaches in parallel:

  1. Quantitative floor (automated): tangible book value per share, net current asset value (NCAV), cash per share. Flag anything trading below 2x tangible book.
  2. AI-assisted floor (Haiku batch): feed financial snapshot + AGI score + 10-K excerpt to Haiku. Ask: "What is the approximate floor price for this company? Consider real asset value, contracted revenue, owned subsidiaries, cash, and debt. Output a number and confidence level."

Cross-reference both approaches. Where they disagree, flag for manual review.

Cost: ~$35   Time: ~1 hour
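A sketch of the quantitative floor and the cross-check between the two approaches. NCAV uses the standard definition of current assets minus total liabilities. The function names and the 25% disagreement tolerance are assumptions; the document only says to flag cases where the two floors disagree.

```python
def quantitative_floor(price, shares, tangible_book,
                       current_assets, total_liabilities, cash):
    """Per-share floor metrics computed from balance sheet totals."""
    tb_ps = tangible_book / shares
    ncav_ps = (current_assets - total_liabilities) / shares
    cash_ps = cash / shares
    return {
        "tangible_book_per_share": tb_ps,
        "ncav_per_share": ncav_ps,
        "cash_per_share": cash_ps,
        # Flag anything trading below 2x tangible book (positive book only).
        "below_2x_tangible_book": tb_ps > 0 and price < 2 * tb_ps,
    }


def floors_disagree(quant_floor, ai_floor, tolerance=0.25):
    """Return True when the two floor estimates differ by more than
    `tolerance` relative to the larger one, or when either is not
    positive. The 25% default is an assumed threshold."""
    if quant_floor <= 0 or ai_floor <= 0:
        return True
    gap = abs(quant_floor - ai_floor) / max(quant_floor, ai_floor)
    return gap > tolerance
```

Anything where floors_disagree returns True would go onto the manual review list instead of straight into company_metrics with high confidence.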

Step 3: Smart Money Overlay (Layer 4)

For the ~350 companies, check if any tracked investors hold them.

  • Leopold's current 25 positions (already have this data)
  • Pull 13F filings for Li Lu, Weschler, Combs, Pabrai (4 more investors)
  • Flag any overlaps (multiple tracked investors holding same stock)
  • Compute insider buy/sell ratio from Form 4 data

Cost: $0   Time: ~30 minutes (SEC EDGAR API)
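Once each investor's holdings are parsed into a set of tickers, the overlap flag is a counting exercise. This sketch assumes a hypothetical in-memory mapping of investor name to tickers and does not show the 13F parsing itself.

```python
from collections import Counter


def holder_counts(holdings):
    """Count how many tracked investors hold each ticker.

    `holdings` maps investor name -> iterable of tickers (assumed shape).
    Duplicate tickers within one investor's list are counted once.
    """
    counts = Counter()
    for tickers in holdings.values():
        counts.update(set(tickers))
    return counts


def overlapping_tickers(holdings, min_holders=2):
    """Tickers held by at least `min_holders` tracked investors, sorted."""
    counts = holder_counts(holdings)
    return sorted(t for t, n in counts.items() if n >= min_holders)
```

The per-ticker count from holder_counts could also be stored directly as the smart money metric, which is what the "held by 2+ tracked investors" rule in Step 4 needs.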

Step 4: First Cut — Filter to ~50-100 candidates

Apply the priority score. Keep companies that meet ANY of:

  • Trading within 50% of floor price (cheap on assets)
  • Held by 2+ tracked investors
  • AGI score 9+ AND P/TB < 3
  • Leopold holds AND currently underwater

Cost: $0   Time: 5 minutes (SQL query)
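The four keep-rules can be written as a single predicate. This is a sketch with assumed field names, and two interpretations are ours: "within 50% of floor price" is read as price at most 1.5x the floor, and "underwater" is read as current price below Leopold's average cost.

```python
def passes_first_cut(c):
    """Return True if a company meets ANY of the four Step 4 rules.

    `c` is a dict with hypothetical keys; missing data simply fails
    the rule that needs it.
    """
    price = c["price"]
    floor = c.get("floor_price")
    p_tb = c.get("price_to_tangible_book")
    avg_cost = c.get("leopold_avg_cost")

    rules = (
        # Cheap on assets: price no more than 50% above the floor.
        floor is not None and floor > 0 and price <= 1.5 * floor,
        # Held by 2+ tracked investors.
        c.get("tracked_holders", 0) >= 2,
        # AGI score 9+ AND P/TB < 3.
        c.get("agi_score", 0) >= 9 and p_tb is not None and 0 < p_tb < 3,
        # Leopold holds AND the position is currently underwater.
        bool(c.get("leopold_holds")) and avg_cost is not None and price < avg_cost,
    )
    return any(rules)
```

The same logic translates to one SQL query with OR clauses once the metrics are pivoted per ticker; the Python form is easier to unit test.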

Step 5: 5-Year Cash Flow Models (Layer 6)

For the ~20-30 highest priority candidates, build bull/base/bear scenarios.

  • Model revenue growth under "AGI buildout plays out" scenario
  • Project FCF margins 5 years out
  • Apply terminal multiple (15-25x FCF)
  • Compute: "What price today gives me 10x in 5 years?"
  • Use Sonnet for this — good enough quality, reasonable cost

Cost: ~$30   Time: ~2-3 hours
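The arithmetic behind "what price today gives me 10x in 5 years" is short enough to sketch. The function names are hypothetical, and the inputs in the test values are illustrative numbers, not a model of any real company.

```python
def bull_price(bull_fcf, terminal_multiple, shares_outstanding):
    """Year-5 bull case price per share: projected FCF times the
    terminal multiple, divided by shares outstanding."""
    return bull_fcf * terminal_multiple / shares_outstanding


def entry_price_for_multiple(target_price, multiple=10.0):
    """Price we would need to pay today for target_price to be a
    `multiple`-x return."""
    return target_price / multiple


def implied_cagr(multiple, years):
    """Compound annual growth rate implied by a total return multiple."""
    return multiple ** (1.0 / years) - 1.0
```

For scale, a 10x return over 5 years implies roughly a 58% compound annual return, which is why the bull case has to assume the AGI buildout scenario plays out.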

Step 6: Deep Dives (Layer 7)

Full META-style 6-agent analysis for the final ~10 companies.

  • Business & competitive analysis
  • Financial deep dive (every balance sheet line item)
  • Management & governance assessment
  • Risk & regulatory analysis
  • AGI impact deep dive
  • Valuation with 3 price targets

Cost: ~$50   Time: ~4-6 hours

Step 7: Build Watchlist (Layer 8)

Compile the final watchlist with entry prices. Set up monitoring.

  • One-page summary per company: thesis, entry price, floor price, bull case, risks
  • Weekly price check script (automated)
  • Alert when any watchlist stock drops within 10% of entry price

Cost: $0   Time: 1 hour
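A sketch of the alert rule for the weekly price check. It reads "within 10% of entry price" as the current price being no more than 10% above the entry price; that reading, the function name, and the tuple layout are assumptions. Price fetching is left out.

```python
def watchlist_alerts(rows, threshold=0.10):
    """Return tickers whose current price is at or below
    entry_price * (1 + threshold).

    `rows` is an iterable of (ticker, entry_price, current_price) tuples,
    for example the result of
    SELECT ticker, entry_price, current_price FROM watchlist.
    """
    alerts = []
    for ticker, entry_price, current_price in rows:
        if entry_price is None or current_price is None:
            continue  # Skip rows that have not been priced yet.
        if current_price <= entry_price * (1 + threshold):
            alerts.append(ticker)
    return alerts
```

With an entry price of $12.50 the alert fires at $13.75 or lower, so a current price of $18.40 stays quiet.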

8. Cost Budget

Phase | What | Model | Companies | Est. Cost
Already spent | AGI scoring (Layer 1) | Sonnet | 3,648 | $355
Step 0 | DB migration | | | $0
Step 1 | Financial snapshot | yfinance | ~350 | $0
Step 2 | Floor screening | Haiku batch | ~350 | $35
Step 3 | Smart money overlay | SEC API | ~350 | $0
Step 4 | Filter / prioritize | SQL query | | $0
Step 5 | 5-year cash flow models | Sonnet | ~30 | $30
Step 6 | Deep dives | Opus | ~10 | $50
Step 7 | Watchlist setup | | 5-15 | $0
Total remaining | | | | ~$115
Grand total (including Layer 1) | | | | ~$470

9. Phase 3: The Watchlist

What the watchlist looks like

Ticker | Entry Price | Current | Floor | Bull 5yr | Upside | Conviction | Thesis
EXAMPLE | $12.50 | $18.40 | $10.00 | $125 | 10x | High | AGI infrastructure with $X/share in tangible assets...
(Populated after Steps 5-7 complete)

Monitoring Cadence

Position Sizing

When we do buy, position sizing follows the floor price confidence:

10. Operating Principles

Never Lose Work

Every analysis result goes into awb.db. Every metric has a source tag, timestamp, and cost. If we redo an analysis with a better model, the old result stays (different source tag). We can always compare versions.

Never Redo Work

Before computing anything, check if it exists in the database. If it's less than 30 days old, skip. If it's older, we can choose to recompute or keep. The UNIQUE(ticker, metric_name, source) constraint prevents accidental duplicates.

Cheap Before Expensive

Free data (yfinance, SEC EDGAR) before paid data (API calls). Haiku ($0.01/company) before Sonnet ($0.10/company) before Opus ($5/company). Only escalate model quality when the cheaper model's output is the bottleneck.

Broad Before Deep

Score 3,648 companies at low depth before deep-diving 10 companies. The funnel ensures we don't waste expensive analysis on companies that fail basic screens. Each layer narrows the funnel by 3-5x.

Quantitative Before Qualitative

Pull numbers first (financials, ratios, prices). Then apply judgment (floor estimates, thesis evaluation). Numbers are fast to compute and easy to filter. Judgment is expensive and should only be applied to the survivors.

Trust but Verify

AI-generated floor prices are estimates. Before putting real money in, we manually verify the deep dive: read the 10-K ourselves, check the balance sheet line by line, validate the thesis. The AI does the screening; we do the final diligence.

What About RL Fine-Tuning?

Considered and rejected for now. The core problem: RL needs a reward signal, and for investments, the reward is whether the investment makes money — which we only know years later. Any proxy reward would just encode our existing beliefs, gaining nothing over good prompting.

The real bottleneck is not model capability (Claude is already excellent at reading 10-Ks and reasoning about businesses). The bottleneck is having the right data in the right structure and applying disciplined screening. That's what this strategy addresses.

If we wanted to revisit later, the most promising angle would be fine-tuning on historical 10-K → 5-year-return pairs (supervised learning, not RL). But we'd need ~20 years of 10-K data mapped to outcomes, and the resulting model would still be backward-looking. Better to invest the time in fundamental analysis.

Immediate Next Session: Start with Steps 0-2

The next Claude Code session should:

  1. Step 0: Create tables, migrate data into awb.db (30 min)
  2. Step 1: Pull financial snapshots for 350 companies via yfinance (15 min)
  3. Step 2: Compute quantitative floors (tangible book, NCAV) for 350 companies (15 min)
  4. Generate: HTML report showing the intersection of "AGI score 7+ AND cheap on assets"

By the end of that session, we'll have the first filtered list of candidates. Total cost: ~$0 (the quantitative part of Steps 0-2 needs no paid API calls).