Hacker News Top 30 — 2026-04-21

1.The Vercel breach: OAuth attack exposes risk in platform environment variables

Sourcehttps://www.trendmicro.com/en_us/research/26/d/vercel-breach-oauth-supply-chain.html

SiteTrend Micro

AuthorBy: Peter Girnus Apr 20, 2026 Read time: ( words)

Published2026-04-20

HN activity132 points · 53 comments

Length6.4K words (~28 min read)

Languageen-US

An OAuth supply chain compromise at Vercel exposed how trusted third party apps and platform environment variables can bypass traditional defenses and amplify blast radius. This article examines the attack chain, underlying design tradeoffs, and what it reveals about modern PaaS and software supply chain risk.

Key takeaways

A compromised third‑party OAuth application enabled long‑lived, password‑independent access to Vercel’s internal systems, demonstrating how OAuth trust relationships can bypass traditional perimeter defenses.
The impact was amplified by Vercel’s environment variable model, where credentials not explicitly marked as sensitive were readable with internal access, exposing customer secrets at platform scale.
A publicly reported leaked‑credential alert predating disclosure highlights detection‑to‑notification latency as a critical risk factor in platform breaches.
This incident fits a broader 2026 convergence pattern (LiteLLM, Axios) in which attackers consistently target developer‑stored credentials across CI/CD, package registries, OAuth integrations, and deployment platforms.
Effective defense requires architectural change: treating OAuth apps as third‑party vendors, eliminating long‑lived platform secrets, and designing for the assumption of provider‑side compromise.

Developing situation — last updated Monday, April 20, 2026

This analysis reflects what is publicly known about the Vercel OAuth supply chain compromise at the time of publication. The incident remains under active investigation by Vercel and affected parties, and key details — including the full scope of downstream impact, the precise initial access vector, and attribution — may evolve as additional information becomes available. Where gaps exist, we have noted them explicitly rather than speculating. Defensive recommendations and detection guidance are based on the confirmed attack chain and established supply chain compromise patterns; organizations should act on these now rather than waiting for a complete picture. We will update this analysis as new technical details, vendor disclosures, or third-party research emerge.

In an intrusion that began around June 2024 and was disclosed in April 2026, attackers leveraged a compromise of Context.ai's Google Workspace OAuth application to gain a foothold into Vercel's internal systems, exposing environment variables for an undisclosed but reportedly limited subset of customer projects. Vercel is a cloud deployment and hosting platform widely used for front‑end and serverless applications.

On April 19, 2026, Vercel published its security bulletin and CEO Guillermo Rauch posted a detailed thread on X confirming the attack chain and naming Context.ai as the compromised third party.

The incident is significant because it demonstrates how OAuth supply-chain trust relationships create lateral movement paths that bypass traditional perimeter defenses, and because Vercel's environment variable sensitivity model left non-sensitive credentials not encrypted at rest, making it readable to an attacker with internal access.

This analysis examines the attack chain, evaluates the platform design decisions that amplified blast radius, contextualizes the breach against a rising wave of supply chain compromises (LiteLLM, Axios, Codecov, CircleCI), and provides actionable detection and hardening guidance for organizations operating on Vercel and similar PaaS platforms.

What this incident reveals

What makes this incident notable is not its sophistication, the techniques used are well-established, but for three broader implications that make it especially significant:

OAuth amplification. A single OAuth trust relationship cascaded into a platform-wide exposure event affecting downstream customers who had no direct relationship with the compromised vendor.
AI-accelerated tradecraft. The CEO publicly attributed the attacker's unusual velocity to AI augmentation — an early, high-profile data point in the 2026 discourse around AI-accelerated adversary tradecraft.
Detection-to-disclosure latency. At least one public customer report suggests credentials were being flagged as leaked in the wild nine days before Vercel's disclosure — raising questions about detection-to-disclosure latency in platform breaches.

Incident timeline

The attack spanned approximately 22 months from the initial OAuth compromise to Vercel's public disclosure. This dwell time is consistent with other OAuth-based intrusions, where attackers leverage legitimate application permissions that rarely trigger standard detection controls.

Data	Event	Verification status
~June 2024	Context.ai's Google Workspace OAuth application compromised	CONFIRMED — Rauch statement
June 2024 – 2025	Attacker maintains persistent access via compromised OAuth token	CONFIRMED — Vercel bulletin
Late 2024 – Early 2025	Attacker pivots from Context.ai OAuth access to a Vercel employee's Google Workspace account	CONFIRMED — Rauch statement
Early - mid-2025	Internal Vercel systems accessed; customer environment variable enumeration begins	CONFIRMED — Vercel bulletin
~February 2025	ShinyHunters-affiliated actor allegedly begins selling Vercel data on BreachForums	UNVERIFIED — threat actor claims only
April 10, 2026	OpenAI notifies a Vercel customer of a leaked API key (per customer account on X)	REPORTED — single source
April 19, 2026	Vercel publishes security bulletin; Rauch posts detailed thread on X naming Context.ai	CONFIRMED
April 19, 2026 onward	Customer notification, credential rotation guidance, and dashboard changes rolled out	CONFIRMED

Table 1. Summary of key events and their confirmation status

A key observation from the timeline is that the dwell time from initial OAuth compromise and public disclosure spanned approximately 22 months. While extended dwell time is not unusual for sophisticated intrusions—Codecov breach went undetected for around 2 months and CircleCI for weeks—it shows the difficulty of detecting OAuth-based lateral movement that uses legitimate application permissions.

Compounding this issue, Google Workspace OAuth audit logs are retained six months by default on many subscription tiers, meaning forensic visibility into the earliest compromise activity was likely gone before investigators could even look.

Attack chain

The attack exploited a trust chain that is endemic to modern SaaS environments: third-party OAuth applications granted access to corporate Google Workspace accounts.

Stage 1: Third-Party OAuth compromise (T1199)

Context.ai, a company providing AI analytics tooling, had a Google Workspace OAuth application authorized by Vercel employees. The attacker compromised this OAuth application — the exact mechanism of Context.ai's compromise has not been publicly disclosed. In his post on X, Rauch stated that Vercel has “reached out to Context to assist in understanding the full scale of the incident,” phrasing that suggests Context may not have detected the compromise itself.

This is the critical initial access vector. OAuth applications, once authorized, maintain persistent access tokens that:

Do not require the user's password
Survive password rotations
Often have broad scopes (email, drive, calendar access)
Are rarely audited after initial authorization

Stage 2: Workspace account takeover (T1550.001)

Using the compromised OAuth application's access, the attacker pivoted to a Vercel employee's Google Workspace account. This provided email access (potential for further credential harvesting), internal document access via Google Drive, calendar visibility into meetings and linked resources, and potential access to other OAuth-connected services.

Stage 3: Internal system access (T1078)

From the compromised Workspace account, the attacker pivoted into Vercel's internal systems. Rauch described the escalation as “a series of maneuvers that escalated from our colleague's compromised Vercel Google Workspace account.” The specific lateral movement technique — whether via SSO federation, harvested credentials from email/drive, or another OAuth-connected internal tool — has not been disclosed.

Stage 4: Environment variable enumeration (T1552.001)

The attacker accessed Vercel's internal systems with sufficient privileges to enumerate customer project environment variables. As per Rauch's public statement: Vercel stores all customer environment variables fully encrypted at rest, but the platform offers a capability to designate variables as “non-sensitive.” Through enumeration of these non-sensitive variables, the attacker obtained further access.

Stage 5: Potential downstream exploitation (T1078.004)

Exposed environment variables commonly contain credentials for downstream services. A single public customer report by Andrey Zagoruiko (April 19, 2026) described receiving an OpenAI leaked-key notification on April 10 for an API key that, according to the report, only existed only in Vercel—suggesting that at least one exposed credential was detected in the wild prior to Vercel’s disclosure.

This report introduces a potential detection-to-disclosure anomaly, which warrants closer examination and is explored in the following section.

Disclosure timeline anomaly

A public reply to Guillermo Rauch's April 19 thread on X surfaced a timeline detail that deserves independent attention. A Vercel customer, Andrey Zagoruiko, reported receiving a leaked-key notification from OpenAI on April 10, 2026—for an API key that, according to the customer, had never existed outside Vercel.

OpenAI's leaked-credential detection system typically triggers when an API key is found in a public location where it should not appear in (e.g., GitHub, paste sites, and similar sources). The pathway from a Vercel environment variable to an OpenAI notification is not trivially explained. Notably, the date creates a nine-day window between the earliest public evidence of exposure and Vercel's disclosure.

Figure 3. Disclosure timeline anomaly showing a nine‑day gap between apparent credential exposure and public notification.

What the 9-day gap means and what it does not

It is important to note that this is a single public report, not a forensic finding. It should not be interpreted as proof that Vercel knew about the compromise on April 10.

It is, however, evidence that at least one credential was detected in the wild before customers were formally notified to rotate secrets. This distinction matters for three audiences:

Regulators: Under GDPR, the 72-hour breach notification clock starts when a controller becomes aware of a breach. The question of when Vercel became aware is now public.
Auditors: SOC 2 and ISO 27001 assessors will examine the detection-to-notification latency as part of continuous-monitoring evidence.
Customers: Organizations whose credentials may have been exposed cannot assume the exposure window ended on April 19. It may have begun being actively exploited well before.

From an incident-response planning perspective, this data point also validates a practical point: unsolicited leaked-credential notifications from providers, such as OpenAI, Anthropic, GitHub, AWS, Stripe, and the likes, are now a primary early-warning channel for platform breaches. Security teams should treat them as high-priority signals, not routine noise.

AI-accelerated tradecraft (CEO Assessment)

In his April 19 thread on X, Vercel CEO Guillermo Rauch explicitly stated:

“We believe the attacking group to be highly sophisticated and, I strongly suspect, significantly accelerated by AI. They moved with surprising velocity and in-depth understanding of Vercel.”

This is a noteworthy on-record claim from a CEO of an affected platform and should be evaluated carefully. Attribution based on "velocity" is inherently interpretive, but it warrants attention for several reasons which we discuss in this section.

What "AI-accelerated" could plausibly look like in evidence

If Rauch’s assessment reflects something real rather than post-hoc rationalization, the underlying forensic signals would likely include one or more of the following:

Enumeration speed that exceeds manual pace. Scripting alone accounts for some of this, but LLM-driven reconnaissance can parallelize schema discovery, endpoint probing, and credential format recognition faster than manual query construction.
Contextual query construction. Queries that appear aware of Vercel-specific terminology (project slugs, deployment target names, env var naming conventions) without obvious prior reconnaissance.
Adaptive behavior in response to errors. LLM-assisted attackers tend to recover from API errors and rate-limits more fluently than static scripts, shifting strategy on the fly.
Prompt-engineered social artifacts. Phishing lures, commit messages, or support tickets that read as locally authentic rather than translated or templated.

Why this matters beyond the Vercel incident

Regardless of whether Rauch's assessment holds up to formal forensic review, the category itself—AI-augmented adversary operations—is no longer simply speculative. Microsoft's April 2026 publication on AI-enabled device-code phishing (Storm-2372 successor campaigns) documented live threat actors using generative AI for dynamic code generation, hyper-personalized lures, and backend automation orchestration. The implication is that telemetry baselines calibrated against human-paced attacker behavior may generate false negatives against AI-accelerated operators.

Detection-engineering implication

If AI-accelerated attackers compress the timeline of enumeration and lateral movement, detection rules tuned on dwell-time and velocity thresholds from older incident data may under-alert. In particular, teams should consider revisiting thresholds on: unique-resource enumeration rate per session, error-to-success ratio recovery curves, and diversity of query patterns within a short window.

The environment variable design problem

The most consequential aspect of this breach is not the initial access vector — OAuth compromises are a known and studied risk. It is Vercel's environment variable sensitivity model, which created a default-insecure configuration for customer secrets.

How Vercel environment variables worked at the time of the breach

Vercel projects use environment variables to inject configuration and secrets into serverless functions and build processes. These variables have a "sensitive" flag that controls access restrictions, as seen in Table 2.

Property	Default (Non-sensitive)	Sensitive
Default state	ON (all new vars)	Must be explicitly enabled
Visible in dashboard	Yes	Masked after creation
Accessible via internal APIs	Yes	Restricted
Encrypted at rest	No (according to Rauch)	Yes, with additional restrictions
Accessible to attacker in this breach	Yes	Appears not

Table 2. Comparison of Vercel environment variable handling based on sensitivity flag.

The critical design choice

The sensitive flag is off by default. Every DATABASE_URL, API_KEY, STRIPE_SECRET_KEY, or AWS_SECRET_ACCESS_KEY added by a developer who did not explicitly toggle this flag was stored unencrypted at rest in Vercel's internal access model.

Any security control that requires explicit opt-in for every individual secret, with no guardrails or defaults, will have a low adoption rate in practice.

Vercel's response

Rauch confirmed that Vercel has already rolled out dashboard changes: an overview page for environment variables and an improved UI for sensitive variable creation and management. These changes improve discoverability, but as of this writing do not change the default — developers must still opt in per variable. Whether Vercel will flip the default remains an open question that customers should press on.

Comparison to industry peers

The industry trend is toward purpose-built secret storage, such as Vault, AWS Secrets Manager, Doppler, and Infisical, rather than environment variable stores with sensitivity tiers. This breach validates that architectural choice.

Table 3 summarizes how Vercel’s environment variable based approach compares to common practices among similar platforms.

Platform	Default secret handling	Auto-detection
Vercel	Non-sensitive by default; manual flag	No
AWS SSM Parameter Store	Supports SecureString type	No (but distinct API)
HashiCorp Vault	All secrets encrypted with ACL	N/A (purpose-built)
GitHub Actions	All secrets masked in logs	No (but separate secrets UI)
Netlify	Environment variables with secret toggle	No

Table 3. Comparison of Vercel’s environment variable–based secret handling with industry peer platforms that employ dedicated secret management systems.

Credential fan-Out: Quantifying downstream risk

The term “credential fan-out” describes how a single platform breach cascades into exposure across every downstream service authenticated by credentials stored on that platform.

Figure 5. Illustration of credential fan-out and how one platform breach can turn into many

For this particular case, we summarize in Table 4 what Vercel project environment variables may typically include and their downstream impact.

Category	Example variables	Downstream impact
Database	DATABASE_URL, POSTGRES_PASSWORD	Full data access
Cloud	AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY	Cloud account compromise
Payment	STRIPE_SECRET_KEY, STRIPE_WEBHOOK_SECRET	Financial data, refund fraud
Auth	AUTH0_SECRET, NEXTAUTH_SECRET	Session forgery, account takeover
Email	SENDGRID_API_KEY, POSTMARK_TOKEN	Phishing from trusted domains
Monitoring	DATADOG_API_KEY, SENTRY_DSN	Telemetry manipulation
Source	GITHUB_TOKEN, NPM_TOKEN	Supply chain injection
AI/ML	OPENAI_API_KEY, ANTHROPIC_API_KEY	API abuse, cost generation

Table 4. Environment variables commonly stored in Vercel projects and the potential downstream impact if exposed.

A single Vercel project commonly contains 10 to 30 environment variables. At an organization scale, a portfolio of 50 projects could have 500 to 1,500 credentials within the platform. Each credential is a potential pivot point into an entirely separate system with its own blast radius.

This is the multiplier that elevates a platform breach from a confidentiality event into a potential cascade across the software supply chain.

Why OAuth trust relationships bypass perimeter defenses

A fundamental reason this attack succeeded for approximately 22 months is that OAuth-based intrusion bypasses most of the controls that would catch a traditional credential-based attack.

Every defensive control in the left column is something security teams rely on to detect or block account compromise. Every one of those controls is either irrelevant or already satisfied in the OAuth-app compromise path. This asymmetry is the reason OAuth governance is emerging as a distinct security discipline, separate from identity and access management.

Figure 6. Comparison of traditional credential based attack paths and OAuth application compromise, illustrating how OAuth trust relationships bypass perimeter security controls and enable silent lateral movement.

OAuth governance as a vendor-risk function

Most organizations treat OAuth grants as a developer self-service problem: each employee authorizes the tools they need, with minimal central review. This incident argues OAuth grants should be treated as third-party risk management — every authorized OAuth app is effectively a vendor with persistent access to corporate data, and should be vendor-reviewed, periodically re-authorized, and monitored for anomalous use.

Threat actor claims and attribution

Threat actor claims on underground forums are inherently unreliable. The following is documented for awareness and threat tracking, not as confirmed fact. Attribution in breach scenarios is notoriously difficult, and forum claims are frequently exaggerated, fabricated, or made by parties tangentially related to an incident.

ShinyHunters-affiliated claims

A threat actor claiming affiliation with the ShinyHunters group posted on BreachForums alleging possession of Vercel data.

Claimed data	Quantity
Employee records	~580
Source code repositories	Not specified
API keys and internal tokens	Not specified
GitHub and NPM tokens	Not specified
Internal communications	Not specified
Linear workspace access	Not specified

Table 5. Summary of claimed data and their quantity, all of which remain unverified.

Several factors complicate attribution of the incident to the actor claiming ShinyHunters affiliation:

Known ShinyHunters members have publicly denied involvement to BleepingComputer.
A $2M ransom demand was allegedly communicated via Telegram — a common monetization pattern for both legitimate and fabricated breach claims.
The ShinyHunters brand has been used by multiple, potentially unrelated actors since the group's original campaigns.
Vercel's security bulletin does not reference these claims; Rauch's thread addresses the attack chain but not the forum posting directly.

Supply chain release path: Vercel's position

Rauch directly addressed the highest-impact scenario stating that “We've analyzed our supply chain, ensuring Next.js, Turbopack, and our many open source projects remain safe for our community.”

Independent verification of release-path integrity is ongoing at the time of writing. Organizations using Next.js, Turbopack, or other Vercel open source projects should continue to monitor package integrity signals (checksums, signing, provenance attestations) as standard practice.

Without independent verification of the forum-claimed data, those claims should be treated as unconfirmed. The OAuth-based attack chain described by Vercel is technically sound and does not require the scope of access claimed by the forum poster, suggesting the claims may be exaggerated, may represent a separate unrelated incident, or may be fabricated.

MITRE ATT&CK Mapping

The confirmed attack chain maps cleanly to established MITRE ATT&CK techniques, as summarized in Table 6. The mapping reflects behaviors explicitly described in Vercel’s disclosure and aligns with well‑understood OAuth abuse patterns rather than novel exploitation.

Tactic	Technique	ID	Application
Initial Access	Trusted Relationship	T1199	Context.ai OAuth app as trusted third party
Persistence	Application Access Token	T1550.001	OAuth token survives password rotation
Credential Access	Valid Accounts	T1078	Compromised employee Workspace credentials
Discovery	Account Discovery	T1087	Internal system and project enumeration
Credential Access	Unsecured Credentials: Credentials in Files	T1552.001	Non-sensitive env vars accessible
Lateral Movement	Valid Accounts: Cloud Accounts	T1078.004	Potential use of exposed cloud credentials
Collection	Data from Information Repositories	T1213	Env var enumeration across projects

Table 6. MITRE ATT&CK technique mapping for the Vercel incident.

Based on this mapping, the pivot from OAuth application access to internal system access (T1199 to T1078) is the highest-value detection point.

Organizations should therefore monitor for anomalous OAuth application behavior, particularly applications accessing resources outside their expected scope or from unexpected IP ranges.

The supply chain siege: LiteLLM, Axios and a converged pattern

The Vercel breach did not occur in isolation. The period from March to April 2026 has seen an unprecedented concentration of software supply chain attacks, suggesting either coordinated campaign activity or—more likely—convergent discovery by multiple threat actors of the same structural weakness: the trust boundaries between package registries, CI/CD systems, OAuth providers, and deployment platforms.

Figure 7. Convergence of three distinct supply‑chain attack vectors on a single target: developer‑stored credentials and secrets.

March 24, 2026: LiteLLM PyPI supply chain compromise

Malicious PyPI packages litellm versions 1.82.7 and 1.82.8 were published using stolen CI/CD publishing credentials from Trivy (Aqua Security's vulnerability scanner). The attack targeted LiteLLM, a widely-used LLM proxy with ~3.4 million daily downloads.

Initial access: Attacker (tracked as "TeamPCP") compromised Trivy's CI/CD pipeline credentials, which had PyPI publishing permissions.
Payload: Three-stage backdoor targeting 50+ credential types across major cloud providers, with Kubernetes DaemonSet persistence for lateral movement.
Dwell time: Malicious packages were live for approximately 40 minutes to 3 hours before detection and removal.
CVE involved: CVE-2026-33634.

March 31, 2026: Axios npm supply chain compromise

The npm package axios (70–100 million weekly downloads) was compromised via maintainer account hijacking. Malicious versions 1.14.1 and 0.30.4 injected a dependency on plain-crypto-js@4.2.1, which contained a cross-platform Remote Access Trojan (RAT).

Initial access: Maintainer account hijacked (mechanism not disclosed; credential stuffing or phishing suspected).
Scale: 135 endpoints detected contacting attacker command-and-control infrastructure.
Dwell time: 2–3 hours before detection.
Attribution: Microsoft attributed the attack to Sapphire Sleet, a North Korean state-sponsored threat actor.

The convergence pattern

Three attacks in three weeks. Three different vectors. The same target: the credentials and secrets that developers store in their toolchains.

Incident	Date	Vector	Target asset	Dwell time
LiteLLM	Mar 24, 2026	CI/CD credential theft → PyPI	Developer credentials, API keys	40 min – 3 hrs
Axios	Mar 31, 2026	Maintainer account hijack → npm	Developer workstations (RAT)	2–3 hrs
Vercel	Apr 19, 2026	OAuth app compromise → platform	Customer env vars (credentials)	~22 months

Table 7. Summary of recent supply chain adjacent incidents targeting developer credentials and secret storage layers.

What previous platform breaches reveal

The Vercel breach follows a well-documented pattern of platform-level compromises that expose customer secrets at scale.

Codecov bash uploader breach (January – April 2021)

What happened: Attackers modified Codecov's Bash Uploader script (used in CI/CD pipelines) to exfiltrate environment variables from customers' CI environments. The compromise went undetected for approximately two months. 29,000+ customers potentially affected, including Twitch, HashiCorp, and Confluent.

Parallel to Vercel: Both incidents expose customer credentials stored as environment variables through a platform compromise.

CircleCI security incident (January 2023)

What happened: An attacker stole an employee's SSO session token via malware on a personal device, used it to access internal CircleCI systems, and exfiltrated customer secrets and encryption keys. CircleCI recommended all customers rotate every secret stored on the platform.

Parallel to Vercel: Nearly identical pattern — employee account compromise → internal system access → customer secret exfiltration.

Snowflake customer credential attacks (May–June 2024)

Threat actor UNC5537 used credentials obtained from infostealer malware to access Snowflake customer accounts that lacked MFA. Over 165 organizations affected, including Ticketmaster, Santander Bank, and AT&T.

Okta support system breach (October 2023)

Attackers accessed Okta's customer support case management system using stolen credentials, viewing HAR files that contained session tokens for Okta customers including Cloudflare, 1Password, and BeyondTrust.

Pattern summary

The pattern is clear. Platform-level access to customer secrets is a systemic risk that has been exploited repeatedly across CI/CD, identity, data warehouse, and deployment platforms. Each incident follows the same arc: initial access through a trust relationship or credential, lateral movement to internal systems, and exfiltration of customer secrets at scale.

Incident	Year	Initial vector	Customer asset exposed	Detection lag
Codecov	2021	Supply chain (script modification)	CI env vars	~2 months
Okta	2023	Stolen support credentials	Session tokens (HAR files)	Weeks
CircleCI	2023	SSO session token theft	Secrets + encryption keys	Weeks
Snowflake	2024	Infostealer credentials (no MFA)	Customer data	Months
Vercel	2024–2026	OAuth app compromise	Deployment env vars	~22 months

Table 8. Pattern of recent platform level breaches illustrating repeated exposure of customer secrets following trust based initial access and prolonged detection latency.

What remains unknown

Despite the volume of public reporting, executive statements, and third party commentary surrounding this incident, material gaps remain in the public record. A rigorous analysis requires not only examining what is known but explicitly acknowledging what has not been disclosed or independently verified.

The following unresolved questions represent significant gaps in publicly available information that are directly relevant to understanding the root cause, scope, and impact of this incident:

How Context.ai was compromised. The root cause of the OAuth application compromise has not been disclosed. Rauch's statement that Vercel has "reached out to Context to assist" suggests the scope may still be unclear to Context.ai itself.
When Vercel first detected anomalous activity. The April 10 OpenAI notification received by a Vercel customer raises this question sharply. Vercel has not published an internal-detection timeline.
Why the nine-day gap between the earliest public evidence of credential abuse and Vercel's disclosure. Multiple explanations are plausible (coordinated disclosure, ongoing investigation, customer notifications in progress); the public record does not resolve which applies.
Number of affected customers. Rauch described the impact as "quite limited"; a specific count has not been disclosed.
Whether the ShinyHunters forum claims represent the same attacker. Whether the claims match the confirmed attack chain or a separate incident remains unverified.
Context.ai's current status and downstream-customer notifications. Whether Context.ai has published its own incident report or notified other customers is unknown.
Full scope of internal access. Beyond environment variables, what other internal Vercel systems or data the attacker accessed during the 22-month dwell time.

Detection and hunting guidance

This section provides practical detection and hunting guidance for organizations potentially affected by the incident.

For Vercel customers (Immediate)

1. Audit all environment variables by entering the following code in Vercel projects to verify the configuration

# List all env vars across all Vercel projects via CLI
vercel env ls --environment production
vercel env ls --environment preview
vercel env ls --environment development

# Check which variables are NOT marked as sensitive
# (Vercel CLI does not currently expose the sensitive flag —
# check via dashboard or API)

2. Search for unauthorized usage of exposed credentials

Query cloud provider CloudTrail/audit logs for API calls using exposed access keys from unexpected IP ranges or user agents.
- AWS CloudTrail: Filter on eventSource containing sts.amazonaws.com, iam.amazonaws.com, s3.amazonaws.com. Search for userIdentity.accessKeyId matching any rotated Vercel-stored access key. Flag any sourceIPAddress outside your known CIDR ranges or any userAgent containing python-requests, curl, Go-http-client, or unfamiliar automation strings. Time window: June 2024 – present.
- GCP Audit Logs: Query protoPayload.authenticationInfo.principalEmail for service accounts whose keys were stored in Vercel. Filter protoPayload.requestMetadata.callerIp against your known ranges. Look for protoPayload.methodName containing storage.objects.get, compute.instances.list, or iam.serviceAccountKeys.create from unexpected sources.
- Azure Activity Logs: Filter on caller matching any application ID or service principal whose credentials were in Vercel env vars. Flag callerIpAddress outside expected ranges. Priority queries: Microsoft.Storage/storageAccounts/listKeys, Microsoft.Compute/virtualMachines/write, Microsoft.Authorization/roleAssignments/write.
Database access logs: For every database whose connection string was stored as a Vercel environment variable, query connection logs for the full exposure window (June 2024 – April 2026). Search for connections originating from IPs outside your application's known egress ranges (Vercel edge IPs, your VPN, your office). Flag connections using the exposed credentials that occurred outside normal deployment windows. For PostgreSQL: query pg_stat_activity and log_connections logs. For MySQL: query the general log or audit plugin. For MongoDB Atlas: query the Project Activity Feed for DATA_EXPLORER and CONNECT events from unknown IPs.
Payment processors: For Stripe, check the Dashboard → Developers → Logs for API calls using the exposed key. Filter for source_ip outside your servers. Look for /v1/charges, /v1/transfers, /v1/payouts, and /v1/customers calls you don't recognize. For Braintree/Adyen, query the equivalent API transaction logs. Priority: any api_key that was stored in Vercel as a non-sensitive env var and has not yet been rotated. Audit email sending service logs for unexpected sends.
Check for unsolicited leaked-credential notifications from OpenAI, Anthropic, GitHub, AWS, Stripe, and similar providers during the exposure window. These automated detection systems are now a primary early-warning channel for this class of breach.

3. Rotate AND redeploy

A critical operational detail to note is that a rotating Vercel environment variable does not retroactively invalidate old deployments. According to Vercel's documentation, prior deployments continue using the old credential value until they are redeployed.

Rotation without redeploy leaves the compromised credential live in any previous deployment artifact that is still reachable. Every credential rotation must be followed by a redeploy of every environment that used that variable, or the old deployments must be explicitly disabled.

Priority order for rotation:
Cloud provider credentials (AWS, GCP, Azure).
Database connection strings.
Payment processor keys.
Authentication secrets (JWT secrets, session keys).
Third-party API keys.
Monitoring and logging tokens.

For security teams (Proactive)

OAuth application audit — Google Workspace

Admin Console → Security → API Controls → Third-party app access.
Review all authorized OAuth applications.
Flag applications with broad scopes (Drive, Gmail, Calendar).
Investigate applications from vendors without active business relationships.
Monitor for OAuth token usage from unexpected IP ranges.
Search for the known-bad OAuth Client ID: 110671459871-30f1spbu0hptbs60cb4vsmv79i7bbvqj.apps.googleusercontent.com

Detection Logic for SIEM Implementation

The following detection patterns map to the confirmed attack chain stages. Each pattern describes the observable behavior, the log source to instrument, and the conditions that should trigger investigation. Organizations should translate these into rules native to their SIEM platform (Sigma, Splunk SPL, KQL, Chronicle YARA-L) after validating field names against their specific log source schemas.

OAuth application anomalies (Stages 1–2)

Monitor Google Workspace token and admin audit logs for three patterns. First, any token refresh or authorization event associated with the known-bad OAuth Client ID (110671459871-30f1spbu0hptbs60cb4vsmv79i7bbvqj.apps.googleusercontent.com) should trigger an immediate alert, this is the compromised Context.ai application.

Second, any OAuth application authorization event that grants broad scope (including full mail access, Drive read/write, calendar access) warrants review against your active vendor inventory; applications that are no longer in active business use should be revoked. Third, token usage from any authorized OAuth application where the source IP falls outside your expected corporate and vendor CIDR ranges should be flagged for investigation, as this may indicate token theft or application compromise.

Internal system access and lateral movement (Stage 3, T1078)

Once attackers control a compromised Google Workspace account, they pivot into internal systems that trust that identity. Detection should focus on four indicators:

Anomalous SSO/SAML authentication events. Monitor your identity provider logs for the compromised Workspace account authenticating into internal applications (Vercel dashboard, CI/CD platforms, internal tooling) from unfamiliar IP addresses, geolocations, or device fingerprints — particularly first-time access to systems that account had never previously touched.
Email and Drive credential harvesting. Review Google Workspace audit logs for bulk email search queries (keywords like "API key," "secret," "token," "password," ".env"), unusual Google Drive file access patterns (opening shared credential stores, engineering runbooks, or infrastructure documentation), and mail forwarding rule creation on the compromised account.
OAuth-connected internal tool access. The compromised Workspace identity likely had existing OAuth grants to internal tools (Slack, Jira, GitHub, internal dashboards). Monitor those downstream services for session creation or API activity tied to the compromised user that occurs outside normal working hours or from infrastructure inconsistent with the user's historical access pattern.
Privilege escalation attempts. Watch for the compromised identity requesting elevated permissions, joining new groups or roles, or accessing admin consoles it had not previously used. In Google Workspace specifically, monitor for Directory API calls, delegation changes, or attempts to enumerate other users' OAuth tokens.

Environment variable enumeration (Stage 4)

Monitor Vercel team audit logs for unusual patterns of environment variable access. The specific event types will depend on Vercel's audit log schema, but the target behavior is any API call that reads, lists, or decrypts environment variables at a volume or frequency inconsistent with normal deployment activity.

Baseline your normal deployment cadence first — CI/CD pipelines legitimately read environment variables at build time — then alert access patterns that deviate from that baseline in volume, timing, or source identity. Pay particular attention to any environment variable access originating from user accounts rather than service accounts, or from accounts that do not normally interact with the projects being accessed.

Downstream credential abuse (Stage 5)

For every credential that was stored as a non-sensitive Vercel environment variable during the exposure window (June 2024 – April 2026), query the corresponding service's access logs for usage from unexpected sources. In AWS, this means CloudTrail queries filtered on the specific access key IDs, looking for API calls from IP addresses outside your known application, CI/CD, and corporate ranges.

In GCP and Azure, the equivalent is audit log queries filtered on the relevant service account or application identity. For SaaS APIs (Stripe, OpenAI, Anthropic, SendGrid, Twilio), check the provider's dashboard or API logs for key usage from unrecognized IPs or during time windows when your application was not active. Any credential showing usage that cannot be attributed to your own infrastructure should be treated as compromised, rotated immediately, and investigated for what actions the attacker performed with it.

Third-Party credential leak notifications

Configure monitoring for unsolicited leaked-credential notifications from providers that operate automated secret scanning, including but not limited to GitHub (secret scanning partner program), AWS (compromised key detection), OpenAI, Anthropic, Stripe, and Google Cloud. These notifications are now a primary early-warning channel for platform-level credential exposure. Any such notification for a key that exists only in a deployment platform should be treated as a potential indicator of platform compromise, not routine key hygiene noise.

Threat hunting

Google Workspace Admin Console — manual search steps:

Admin Console → Reports → Audit and Investigation → OAuth Log Events
Filter: Application Name = "Context.ai" OR Client ID = 110671459871-30f1spbu0hptbs60cb4vsmv79i7bbvqj.apps.googleusercontent.com
Date range: January 2024 – present
Export all results. Any hits = immediate revocation and incident investigation.

Google Workspace — all third-party OAuth apps with broad scopes:

Admin Console → Security → API Controls → Third-party app access → Manage Google Services
Sort by: "App access" → "Unrestricted"
For each app: verify (a) active vendor relationship exists, (b) scopes are justified by business use, (c) last-used date is recent. Any app not used in 90+ days: revoke.

Defensive recommendations

This section outlines defensive recommendations based on the confirmed attack tactics from this incident.

Immediate actions (0–48 hours)

Rotate all Vercel environment variables that were not marked as sensitive, regardless of whether you believe they were accessed. The cost of unnecessary rotation is trivial compared to the cost of a compromised credential.
Redeploy every environment after rotation — rotation alone does not invalidate old deployments.
Enable the sensitive flag on all environment variables containing any form of credential, token, key, or secret. Audit every project.
Audit OAuth application authorizations in your Google Workspace (or Microsoft Entra) admin console. Revoke access for any application that is no longer actively used.
Review access logs for all services whose credentials were stored as Vercel environment variables, covering the period June 2024 through present.

Short-term hardening (1–4 weeks)

Migrate secrets to a dedicated secrets manager (HashiCorp Vault, AWS Secrets Manager, Doppler, Infisical). Inject secrets at runtime rather than storing them as platform environment variables.
Implement OIDC-based authentication for CI/CD and deployment pipelines where supported, eliminating long-lived credentials entirely.
Deploy OAuth application monitoring — commercial solutions (Nudge Security, Grip Security, Valence Security) or Google Workspace's built-in OAuth app management.
Establish credential rotation automation — secrets should rotate on a defined schedule (30–90 days) regardless of incident status.
Treat OAuth grants as vendor relationships — add them to your third-party risk inventory alongside contracted vendors.

Architectural changes (1–6 months)

Adopt a zero-trust posture for environment variables — assume that any secret stored in a deployment platform may be exposed in a platform-level breach. Design systems so that a single credential exposure does not cascade.
Implement least-privilege scoping for all credentials — database credentials should have minimum required permissions, API keys should be scoped to specific operations, cloud credentials should use role-based temporary credentials rather than long-lived access keys.
Establish third-party vendor security review for any OAuth application or integration that accesses corporate identity systems. Include periodic re-review of existing authorizations.
Include PaaS platforms in your SBOM/ASPM inventory — this breach argues deployment platforms should be treated as tier-1 supply-chain dependencies, not external services.

Recommended monitoring

Audit Google Workspace Admin Console for the above OAuth Client ID.
Monitor Vercel audit logs for unexpected env.read or env.list API calls.
Review CloudTrail, GCP Audit Logs, and Azure Activity Logs for usage of credentials stored as Vercel env vars from unexpected IPs or user agents during June 2024 – April 2026.
Monitor for any of the LiteLLM or Axios-related IOCs published by their respective advisories if those packages are in your dependency tree.
Watch for unsolicited leaked-credential notifications from major API providers during the exposure window.

Regulatory and compliance implications

Organizations affected by credential exposure through the Vercel breach should evaluate notification obligations under:

GDPR (EU): If exposed credentials provided access to systems containing EU personal data, the 72-hour breach notification clock may have started upon confirmation of exposure. The April 10 OpenAI notification raises the question of whether some organizations' awareness predates Vercel's April 19 disclosure.
CCPA/CPRA (California): Exposure of credentials providing access to consumer data may trigger notification requirements.
PCI DSS: If payment processor credentials (Stripe, Braintree, Adyen) were exposed, PCI incident response procedures and forensic investigation requirements may apply.
SOC 2: Organizations with SOC 2 obligations should document the incident, credential rotation actions taken, and updated controls in their continuous monitoring evidence.
SEC Cybersecurity Rules (8-K): Public companies determining the breach is material have a 4-business-day disclosure obligation.

The challenge is that many organizations may not yet know whether the exposed credentials were actually used for unauthorized access — but regulatory frameworks often trigger on exposure, not confirmed exploitation.

Conclusion

The Vercel breach is not an isolated incident — it is the latest manifestation of a structural vulnerability in how the software industry manages secrets and trust relationships. In the span of three weeks, we have seen:

LiteLLM: CI/CD credentials stolen → malicious packages harvesting developer secrets at scale.
Axios: Maintainer account hijacked → RAT deployed to millions of developer environments.
Vercel: OAuth application compromised → platform-level access to customer deployment secrets, with at least one public report suggesting downstream credential abuse detected in the wild prior to disclosure.

Each attack targets a different link in the software supply chain. Together, they paint a picture of an ecosystem where credentials are the universal target and trust relationships are the universal attack surface. The cascade the industry has warned about is no longer purely theoretical.

The defensive path forward is clear, if not easy:

Stop storing long-lived credentials in platform environment variables. Use dedicated secret managers with runtime injection.
Stop trusting OAuth applications implicitly. Audit, monitor, and periodically re-authorize.
Stop assuming your platform provider's internal security posture. Design for the scenario where they are breached.
Start rotating credentials proactively — and remember to redeploy afterward.
Treat leaked-credential notifications from third-party providers as high-priority early-warning signals, not routine noise.

The organizations that will weather the next platform breach are those that assumed it would happen and built their credential architecture accordingly.

Indicators of Compromise (IoCs)

Confirmed IoC

Type	Value	Context
OAuth Client ID	110671459871-30f1spbu0hptbs60cb4vsmv79i7bbvqj.apps.googleusercontent.com	Compromised Context.ai OAuth application

↑ top

4.Cal.diy: open-source community edition of cal.com

Sourcehttps://github.com/calcom/cal.diy

SiteGitHub

Submitterpetecooper (Hacker News)

Submitted2026-04-21 17:58 UTC (Hacker News)

HN activity49 points · 11 comments

Length4.1K words (~18 min read)

Languageen

Scheduling infrastructure for absolutely everyone. - calcom/cal.diy

Warning

Use at your own risk. Cal.diy is the open source community edition of Cal.com and it is intended for users who want to self-host their own Cal.diy instance. It is strictly recommended for personal, non-production use. Please review all installation and configuration steps carefully. Self-hosting requires advanced knowledge of server administration, database management, and securing sensitive data. Proceed only if you are comfortable with these responsibilities.

Tip

For any commercial and enterprise-ready scheduling infrastructure, use Cal.com, not Cal.diy; hosted by us or get invited to on-prem enterprise access here: https://cal.com/sales

Cal.diy

About Cal.diy

Cal.diy is the community-driven, fully open-source scheduling platform — a fork of Cal.com with all enterprise/commercial code removed.

Cal.diy is 100% MIT-licensed with no proprietary "Enterprise Edition" features. It's designed for individuals and self-hosters who want full control over their scheduling infrastructure without any commercial dependencies.

What's different from Cal.com?

No enterprise features — Teams, Organizations, Insights, Workflows, SSO/SAML, and other EE-only features have been removed
No license key required — Everything works out of the box, no Cal.com account or license needed
100% open source — The entire codebase is licensed under MIT, no "Open Core" split
Community-maintained — Contributions are welcome and go directly into this project (see CONTRIBUTING.md)

Note: Cal.diy is a self-hosted project. There is no hosted/managed version. You run it on your own infrastructure.

Built With

Getting Started

To get a local copy up and running, please follow these simple steps.

Prerequisites

Here is what you need to be able to run Cal.diy.

Node.js (Version: >=18.x)
PostgreSQL (Version: >=13.x)
Yarn (recommended)

If you want to enable any of the available integrations, you may want to obtain additional credentials for each one. More details on this can be found below under the integrations section.

Development

Setup

Clone the repo (or fork https://github.com/calcom/cal.diy/fork)
```
git clone https://github.com/calcom/cal.diy.git
```
If you are on Windows, run the following command on gitbash with admin privileges:
> git clone -c core.symlinks=true https://github.com/calcom/cal.diy.git
Go to the project folder
```
cd cal.diy
```
Install packages with yarn
```
yarn
```
Set up your .env file
- Duplicate .env.example to .env
- Use openssl rand -base64 32 to generate a key and add it under NEXTAUTH_SECRET in the .env file.
- Use openssl rand -base64 24 to generate a key and add it under CALENDSO_ENCRYPTION_KEY in the .env file.

Windows users: Replace the packages/prisma/.env symlink with a real copy to avoid a Prisma error (unexpected character / in variable name):
# Git Bash / WSL
rm packages/prisma/.env && cp .env packages/prisma/.env

Setup Node If your Node version does not meet the project's requirements as instructed by the docs, "nvm" (Node Version Manager) allows using Node at the version required by the project:
```
nvm use
```
You first might need to install the specific version and then use it:
```
nvm install && nvm use
```
You can install nvm from here.

Quick start with `yarn dx`

Requires Docker and Docker Compose to be installed

Will start a local Postgres instance with a few test users - the credentials will be logged in the console

yarn dx

Default credentials created:

Email	Password	Role
`free@example.com`	`free`	Free user
`pro@example.com`	`pro`	Pro user
`trial@example.com`	`trial`	Trial user
`admin@example.com`	`ADMINadmin2022!`	Admin user
`onboarding@example.com`	`onboarding`	Onboarding incomplete

You can use any of these credentials to sign in at http://localhost:3000

Tip: To view the full list of seeded users and their details, run yarn db-studio and visit http://localhost:5555

Development tip

Add export NODE_OPTIONS="--max-old-space-size=16384" to your shell script to increase the memory limit for the node process. Alternatively, you can run this in your terminal before running the app. Replace 16384 with the amount of RAM you want to allocate to the node process.
Add NEXT_PUBLIC_LOGGER_LEVEL={level} to your .env file to control the logging verbosity for all tRPC queries and mutations.
Where {level} can be one of the following:

0 for silly
1 for trace
2 for debug
3 for info
4 for warn
5 for error
6 for fatal

When you set NEXT_PUBLIC_LOGGER_LEVEL={level} in your .env file, it enables logging at that level and higher. Here's how it works:

The logger will include all logs that are at the specified level or higher. For example: \
- If you set NEXT_PUBLIC_LOGGER_LEVEL=2, it will log from level 2 (debug) upwards, meaning levels 2 (debug), 3 (info), 4 (warn), 5 (error), and 6 (fatal) will be logged. \
- If you set NEXT_PUBLIC_LOGGER_LEVEL=3, it will log from level 3 (info) upwards, meaning levels 3 (info), 4 (warn), 5 (error), and 6 (fatal) will be logged, but level 2 (debug) and level 1 (trace) will be ignored. \

echo 'NEXT_PUBLIC_LOGGER_LEVEL=3' >> .env

for Logger level to be set at info, for example.

Gitpod Setup

Click the button below to open this project in Gitpod.
This will open a fully configured workspace in your browser with all the necessary dependencies already installed.

Manual setup

Configure environment variables in the .env file. Replace <user>, <pass>, <db-host>, and <db-port> with their applicable values
```
DATABASE_URL='postgresql://<user>:<pass>@<db-host>:<db-port>'
```
If you don't know how to configure the DATABASE_URL, then follow the steps here to create a quick local DB
1. Download and install postgres in your local (if you don't have it already).
2. Create your own local db by executing createDB <DB name>
3. Now open your psql shell with the DB you created: psql -h localhost -U postgres -d <DB name>
4. Inside the psql shell execute \conninfo. And you will get the following info.
5. Now extract all the info and add it to your DATABASE_URL. The url would look something like this postgresql://postgres:postgres@localhost:5432/Your-DB-Name. The port is configurable and does not have to be 5432.
If you don't want to create a local DB. Then you can also consider using services like railway.app, Northflank or render.
Copy and paste your DATABASE_URL from .env to .env.appStore.
Set up the database using the Prisma schema (found in packages/prisma/schema.prisma)

In a development environment, run:
```
yarn workspace @calcom/prisma db-migrate
```
In a production environment, run:
```
yarn workspace @calcom/prisma db-deploy
```
Run mailhog to view emails sent during development

NOTE: Required when E2E_TEST_MAILHOG_ENABLED is "1"
```
docker pull mailhog/mailhog
docker run -d -p 8025:8025 -p 1025:1025 mailhog/mailhog
```
Run (in development mode)
```
yarn dev
```

Setting up your first user

Approach 1

Open Prisma Studio to look at or modify the database content:
```
yarn db-studio
```
Click on the User model to add a new user record.
Fill out the fields email, username, password, and set metadata to empty {} (remembering to encrypt your password with BCrypt) and click Save 1 Record to create your first user.

New users are set on a TRIAL plan by default. You might want to adjust this behavior to your needs in the packages/prisma/schema.prisma file.
Open a browser to http://localhost:3000 and login with your just created, first user.

Approach 2

Seed the local db by running

cd packages/prisma
yarn db-seed

The above command will populate the local db with dummy users.

E2E-Testing

Be sure to set the environment variable NEXTAUTH_URL to the correct value. If you are running locally, as the documentation within .env.example mentions, the value should be http://localhost:3000.

# In a terminal just run:
yarn test-e2e

# To open the last HTML report run:
yarn playwright show-report test-results/reports/playwright-html-report

Resolving issues

E2E test browsers not installed

Run npx playwright install to download test browsers and resolve the error below when running yarn test-e2e:

Executable doesn't exist at /Users/alice/Library/Caches/ms-playwright/chromium-1048/chrome-mac/Chromium.app/Contents/MacOS/Chromium

Upgrading from earlier versions

Pull the current version:
```
git pull
```
Check if dependencies got added/updated/removed
```
yarn
```
Apply database migrations by running one of the following commands:

In a development environment, run:
```
yarn workspace @calcom/prisma db-migrate
```
(This can clear your development database in some cases)

In a production environment, run:
```
yarn workspace @calcom/prisma db-deploy
```
Check for .env variables changes
```
yarn predev
```
Start the server. In a development environment, just do:
```
yarn dev
```
For a production build, run for example:
```
yarn build
yarn start
```
Enjoy the new version.

Deployment

Docker

The Docker image can be found on DockerHub at https://hub.docker.com/r/calcom/cal.diy.

Note for ARM Users: Use the {version}-arm suffix for pulling images. Example: docker pull calcom/cal.diy:v5.6.19-arm.

Requirements

Make sure you have docker & docker compose installed on the server / system. Both are installed by most docker utilities, including Docker Desktop and Rancher Desktop.

Note: docker compose without the hyphen is now the primary method of using docker-compose, per the Docker documentation.

Running Cal.diy with Docker Compose

Clone the repository

git clone --recursive https://github.com/calcom/cal.diy.git

Change into the directory
```
cd cal.diy
```
Prepare your configuration: Rename .env.example to .env and then update .env
```
cp .env.example .env
```
Most configurations can be left as-is, but for configuration options see Important Run-time variables below.

Required Secret Keys

Before starting, you must generate secure values for NEXTAUTH_SECRET and CALENDSO_ENCRYPTION_KEY. Using the default secret placeholder in production is a security risk.

Generate NEXTAUTH_SECRET (cookie encryption key):
```
openssl rand -base64 32
```
Generate CALENDSO_ENCRYPTION_KEY (must be 32 bytes for AES256):
```
openssl rand -base64 24
```
Update your .env file with these values:
```
NEXTAUTH_SECRET=<your_generated_secret>
CALENDSO_ENCRYPTION_KEY=<your_generated_key>
```
Push Notifications (VAPID Keys) If you see an error like:
```
Error: No key set vapidDetails.publicKey
```
This means your environment variables for Web Push are missing. You must generate and set NEXT_PUBLIC_VAPID_PUBLIC_KEY and VAPID_PRIVATE_KEY.

Generate them with:
```
npx web-push generate-vapid-keys
```
Then update your .env file:
```
NEXT_PUBLIC_VAPID_PUBLIC_KEY=your_public_key_here
VAPID_PRIVATE_KEY=your_private_key_here
```
Do not commit real keys to .env.example — only placeholders.

Update the appropriate values in your .env file, then proceed.
(optional) Pre-Pull the images by running the following command:
```
docker compose pull
```
Start Cal.diy via docker compose

To run the complete stack, which includes a local Postgres database, Cal.diy web app, and Prisma Studio:
```
docker compose up -d
```
To run Cal.diy web app and Prisma Studio against a remote database, ensure that DATABASE_URL is configured for an available database and run:
```
docker compose up -d calcom studio
```
To run only the Cal.diy web app, ensure that DATABASE_URL is configured for an available database and run:
```
docker compose up -d calcom
```
Note: to run in attached mode for debugging, remove -d from your desired run command.
Open a browser to http://localhost:3000, or your defined NEXT_PUBLIC_WEBAPP_URL. The first time you run Cal.diy, a setup wizard will initialize. Define your first user, and you're ready to go!

Note for first-time setup (Calendar integration): During the setup wizard, you may encounter a "Connect your Calendar" step that appears to be required. If you do not wish to connect a calendar at this time, you can skip this step by navigating directly to the dashboard at <NEXT_PUBLIC_WEBAPP_URL>/event-types. Calendar integrations can be added later from the Settings > Integrations page.

Updating Cal.diy

Stop the Cal.diy stack
```
docker compose down
```
Pull the latest changes
```
docker compose pull
```
Update env vars as necessary.
Re-start the Cal.diy stack
```
docker compose up -d
```

Building from source with Docker

Clone the repository

git clone https://github.com/calcom/cal.diy.git

Change into the directory
```
cd cal.diy
```
Rename .env.example to .env and then update .env

For configuration options see Build-time variables below. Update the appropriate values in your .env file, then proceed.
Build the Cal.diy docker image:

Note: Due to application configuration requirements, an available database is currently required during the build process.

a) If hosting elsewhere, configure the DATABASE_URL in the .env file, and skip the next step

b) If a local or temporary database is required, start a local database via docker compose.
```
docker compose up -d database
```
Build Cal.diy via docker compose (DOCKER_BUILDKIT=0 must be provided to allow a network bridge to be used at build time. This requirement will be removed in the future)
```
DOCKER_BUILDKIT=0 docker compose build calcom
```
Start Cal.diy via docker compose

To run the complete stack, which includes a local Postgres database, Cal.diy web app, and Prisma Studio:
```
docker compose up -d
```
To run Cal.diy web app and Prisma Studio against a remote database, ensure that DATABASE_URL is configured for an available database and run:
```
docker compose up -d calcom studio
```
To run only the Cal.diy web app, ensure that DATABASE_URL is configured for an available database and run:
```
docker compose up -d calcom
```
Note: to run in attached mode for debugging, remove -d from your desired run command.
Open a browser to http://localhost:3000, or your defined NEXT_PUBLIC_WEBAPP_URL. The first time you run Cal.diy, a setup wizard will initialize. Define your first user, and you're ready to go!

Configuration

Important Run-time variables

These variables must also be provided at runtime

Variable	Description	Required	Default
DATABASE_URL	database url with credentials - if using a connection pooler, this setting should point there	required	`postgresql://unicorn_user:magical_password@database:5432/calendso`
NEXT_PUBLIC_WEBAPP_URL	Base URL of the site. NOTE: if this value differs from the value used at build-time, there will be a slight delay during container start (to update the statically built files).	optional	`http://localhost:3000`
NEXTAUTH_URL	Location of the auth server. By default, this is the Cal.diy docker instance itself.	optional	`{NEXT_PUBLIC_WEBAPP_URL}/api/auth`
NEXTAUTH_SECRET	Cookie encryption key. Must match build variable. Generate with: `openssl rand -base64 32`	required	`secret`
CALENDSO_ENCRYPTION_KEY	Authentication encryption key (32 bytes for AES256). Must match build variable. Generate with: `openssl rand -base64 24`	required	`secret`

Build-time variables

If building the image yourself, these variables must be provided at the time of the docker build, and can be provided by updating the .env file. Currently, if you require changes to these variables, you must follow the instructions to build and publish your own image.

Variable	Description	Required	Default
DATABASE_URL	database url with credentials - if using a connection pooler, this setting should point there	required	`postgresql://unicorn_user:magical_password@database:5432/calendso`
MAX_OLD_SPACE_SIZE	Needed for Nodejs/NPM build options	required	4096
NEXTAUTH_SECRET	Cookie encryption key	required	`secret`
CALENDSO_ENCRYPTION_KEY	Authentication encryption key	required	`secret`
NEXT_PUBLIC_WEBAPP_URL	Base URL injected into static files	optional	`http://localhost:3000`
NEXT_PUBLIC_WEBSITE_TERMS_URL	custom URL for terms and conditions website	optional
NEXT_PUBLIC_WEBSITE_PRIVACY_POLICY_URL	custom URL for privacy policy website	optional
CALCOM_TELEMETRY_DISABLED	Allow Cal.diy to collect anonymous usage data (set to `1` to disable)	optional

Troubleshooting

SSL edge termination

If running behind a load balancer which handles SSL certificates, you will need to add the environmental variable NODE_TLS_REJECT_UNAUTHORIZED=0 to prevent requests from being rejected. Only do this if you know what you are doing and trust the services/load-balancers directing traffic to your service.

Failed to commit changes: Invalid 'prisma.user.create()'

Certain versions may have trouble creating a user if the field metadata is empty. Using an empty json object {} as the field value should resolve this issue. Also, the id field will autoincrement, so you may also try leaving the value of id as empty.

CLIENT_FETCH_ERROR

If you experience this error, it may be the way the default Auth callback in the server is using the WEBAPP_URL as a base url. The container does not necessarily have access to the same DNS as your local machine, and therefore needs to be configured to resolve to itself. You may be able to correct this by configuring NEXTAUTH_URL=http://localhost:3000/api/auth, to help the backend loop back to itself.

docker-calcom-1  | @calcom/web:start: [next-auth][error][CLIENT_FETCH_ERROR]
docker-calcom-1  | @calcom/web:start: https://next-auth.js.org/errors#client_fetch_error request to http://testing.localhost:3000/api/auth/session failed, reason: getaddrinfo ENOTFOUND testing.localhost {
docker-calcom-1  | @calcom/web:start:   error: {
docker-calcom-1  | @calcom/web:start:     message: 'request to http://testing.localhost:3000/api/auth/session failed, reason: getaddrinfo ENOTFOUND testing.localhost',
docker-calcom-1  | @calcom/web:start:     stack: 'FetchError: request to http://testing.localhost:3000/api/auth/session failed, reason: getaddrinfo ENOTFOUND testing.localhost\n' +
docker-calcom-1  | @calcom/web:start:       '    at ClientRequest.<anonymous> (/calcom/node_modules/next/dist/compiled/node-fetch/index.js:1:65756)\n' +
docker-calcom-1  | @calcom/web:start:       '    at ClientRequest.emit (node:events:513:28)\n' +
docker-calcom-1  | @calcom/web:start:       '    at ClientRequest.emit (node:domain:489:12)\n' +
docker-calcom-1  | @calcom/web:start:       '    at Socket.socketErrorListener (node:_http_client:494:9)\n' +
docker-calcom-1  | @calcom/web:start:       '    at Socket.emit (node:events:513:28)\n' +
docker-calcom-1  | @calcom/web:start:       '    at Socket.emit (node:domain:489:12)\n' +
docker-calcom-1  | @calcom/web:start:       '    at emitErrorNT (node:internal/streams/destroy:157:8)\n' +
docker-calcom-1  | @calcom/web:start:       '    at emitErrorCloseNT (node:internal/streams/destroy:122:3)\n' +
docker-calcom-1  | @calcom/web:start:       '    at processTicksAndRejections (node:internal/process/task_queues:83:21)',
docker-calcom-1  | @calcom/web:start:     name: 'FetchError'
docker-calcom-1  | @calcom/web:start:   },
docker-calcom-1  | @calcom/web:start:   url: 'http://testing.localhost:3000/api/auth/session',
docker-calcom-1  | @calcom/web:start:   message: 'request to http://testing.localhost:3000/api/auth/session failed, reason: getaddrinfo ENOTFOUND testing.localhost'
docker-calcom-1  | @calcom/web:start: }

Railway

You can deploy Cal.diy on Railway. The team at Railway also have a detailed blog post on deploying on their platform.

Northflank

You can deploy Cal.diy on Northflank. The team at Northflank also have a detailed blog post on deploying on their platform.

Vercel

Currently Vercel Pro Plan is required to be able to Deploy this application with Vercel, due to limitations on the number of serverless functions on the free plan.

Render

Elestio

License

Cal.diy is fully open source, licensed under the MIT License.

Unlike Cal.com's "Open Core" model, Cal.diy has no commercial/enterprise code. The entire codebase is available under the same open-source license.

Enabling Content Security Policy

Set CSP_POLICY="non-strict" env variable, which enables Strict CSP except for unsafe-inline in style-src . If you have some custom changes in your instance, you might have to make some code change to make your instance CSP compatible. Right now it enables strict CSP only on login page and on other SSR pages it is enabled in Report only mode to detect possible issues. On, SSG pages it is still not supported.

Integrations

Obtaining the Google API Credentials

Open Google API Console. If you don't have a project in your Google Cloud subscription, you'll need to create one before proceeding further. Under Dashboard pane, select Enable APIS and Services.
In the search box, type calendar and select the Google Calendar API search result.
Enable the selected API.
Next, go to the OAuth consent screen from the side pane. Select the app type (Internal or External) and enter the basic app details on the first page.
In the second page on Scopes, select Add or Remove Scopes. Search for Calendar.event and select the scope with scope value .../auth/calendar.events, .../auth/calendar.readonly and select Update.
In the third page (Test Users), add the Google account(s) you'll be using. Make sure the details are correct on the last page of the wizard and your consent screen will be configured.
Now select Credentials from the side pane and then select Create Credentials. Select the OAuth Client ID option.
Select Web Application as the Application Type.
Under Authorized redirect URI's, select Add URI and then add the URI <Cal.diy URL>/api/integrations/googlecalendar/callback and <Cal.diy URL>/api/auth/callback/google replacing Cal.diy URL with the URI at which your application runs.
The key will be created and you will be redirected back to the Credentials page. Select the newly generated client ID under OAuth 2.0 Client IDs.
Select Download JSON. Copy the contents of this file and paste the entire JSON string in the .env file as the value for GOOGLE_API_CREDENTIALS key.

Adding google calendar to Cal.diy App Store

After adding Google credentials, you can now Google Calendar App to the app store. You can repopulate the App store by running

cd packages/prisma
yarn seed-app-store

You will need to complete a few more steps to activate Google Calendar App. Make sure to complete section "Obtaining the Google API Credentials". After that do the following

Add extra redirect URL <Cal.diy URL>/api/auth/callback/google
Under 'OAuth consent screen', click "PUBLISH APP"

Obtaining Microsoft Graph Client ID and Secret

Open Azure App Registration and select New registration
Name your application
Set Who can use this application or access this API? to Accounts in any organizational directory (Any Azure AD directory - Multitenant)
Set the Web redirect URI to <Cal.diy URL>/api/integrations/office365calendar/callback replacing Cal.diy URL with the URI at which your application runs.
Use Application (client) ID as the MS_GRAPH_CLIENT_ID attribute value in .env
Click Certificates & secrets create a new client secret and use the value as the MS_GRAPH_CLIENT_SECRET attribute

Obtaining Zoom Client ID and Secret

Open Zoom Marketplace and sign in with your Zoom account.
On the upper right, click "Develop" => "Build App".
Select "General App" , click "Create".
Name your App.
Choose "User-managed app" for "Select how the app is managed".
De-select the option to publish the app on the Zoom App Marketplace, if asked.
Now copy the Client ID and Client Secret to your .env file into the ZOOM_CLIENT_ID and ZOOM_CLIENT_SECRET fields.
Set the "OAuth Redirect URL" under "OAuth Information" as <Cal.diy URL>/api/integrations/zoomvideo/callback replacing Cal.diy URL with the URI at which your application runs.
Also add the redirect URL given above as an allow list URL and enable "Subdomain check". Make sure, it says "saved" below the form.
You don't need to provide basic information about your app. Instead click on "Scopes" and then on "+ Add Scopes". On the left,
1. click the category "Meeting" and check the scope meeting:write:meeting.
2. click the category "User" and check the scope user:read:settings.
Click "Done".
You're good to go. Now you can easily add your Zoom integration in the Cal.diy settings.

Obtaining Daily API Credentials

Open Daily.co and create an account.
From within your dashboard, go to the developers tab.
Copy your API key.
Now paste the API key to your .env file into the DAILY_API_KEY field in your .env file.
If you have the Daily Scale Plan set the DAILY_SCALE_PLAN variable to true in order to use features like video recording.

Obtaining Basecamp Client ID and Secret

Visit the 37 Signals Integrations Dashboard and sign in.
Register a new application by clicking the Register one now link.
Fill in your company details.
Select Basecamp 4 as the product to integrate with.
Set the Redirect URL for OAuth <Cal.diy URL>/api/integrations/basecamp3/callback replacing Cal.diy URL with the URI at which your application runs.
Click on done and copy the Client ID and secret into the BASECAMP3_CLIENT_ID and BASECAMP3_CLIENT_SECRET fields.
Set the BASECAMP3_CLIENT_SECRET env variable to {your_domain} ({support_email}).

Obtaining HubSpot Client ID and Secret

Open HubSpot Developer and sign into your account, or create a new one.
From within the home of the Developer account page, go to "Manage apps".
Click "Create legacy app" button top right and select public app.
Fill in any information you want in the "App info" tab
Go to tab "Auth"
Now copy the Client ID and Client Secret to your .env file into the HUBSPOT_CLIENT_ID and HUBSPOT_CLIENT_SECRET fields.
Set the Redirect URL for OAuth <Cal.diy URL>/api/integrations/hubspot/callback replacing Cal.diy URL with the URI at which your application runs.
In the "Scopes" section at the bottom of the page, make sure you select "Read" and "Write" for scopes called crm.objects.contacts and crm.lists.
Click the "Save" button at the bottom footer.
You're good to go. Now you can see any booking in Cal.diy created as a meeting in HubSpot for your contacts.

Obtaining Webex Client ID and Secret

See Webex Readme

Obtaining ZohoCRM Client ID and Secret

Open Zoho API Console and sign into your account, or create a new one.
From within the API console page, go to "Applications".
Click "ADD CLIENT" button top right and select "Server-based Applications".
Fill in any information you want in the "Client Details" tab
Go to tab "Client Secret" tab.
Now copy the Client ID and Client Secret to your .env file into the ZOHOCRM_CLIENT_ID and ZOHOCRM_CLIENT_SECRET fields.
Set the Redirect URL for OAuth <Cal.diy URL>/api/integrations/zohocrm/callback replacing Cal.diy URL with the URI at which your application runs.
In the "Settings" section check the "Multi-DC" option if you wish to use the same OAuth credentials for all data centers.
Click the "Save"/ "UPDATE" button at the bottom footer.
You're good to go. Now you can easily add your ZohoCRM integration in the Cal.diy settings.

Rate Limiting with Unkey

Cal.diy uses Unkey for rate limiting. This is an optional feature and is not required for self-hosting.

If you want to enable rate limiting:

Sign up for an account at unkey.com
Create a Root key with permissions for ratelimit.create_namespace and ratelimit.limit
Copy the root key to your .env file into the UNKEY_ROOT_KEY field

Note: If you don't configure Unkey, Cal.diy will work normally without rate limiting enabled.

Contributing

We welcome contributions! Whether it's fixing a typo, improving documentation, or building new features, your help makes Cal.diy better.

Important: Cal.diy is a community fork. Contributions to this repo do not flow to Cal.com's production platform. See CONTRIBUTING.md for details.

Check out our Contributing Guide for detailed steps.
Join the discussion on GitHub Discussions.
Please follow our coding standards and commit message conventions to keep the project consistent.

Even small improvements matter — thank you for helping us grow!

Good First Issues

We have a list of help wanted that contain small features and bugs which have a relatively limited scope. This is a great place to get started, gain experience, and get familiar with our contribution process.

Contributors

Translations

Don't code but still want to contribute? Join our Discussions and help translate Cal.diy into your language.

Acknowledgements

Cal.diy is built on the foundation created by Cal.com and the many contributors to the original project. Special thanks to:

↑ top

8.Show HN: GoModel – an open-source AI gateway in Go

Sourcehttps://github.com/ENTERPILOT/GOModel/

SiteGitHub

Submittersantiago-pl (Hacker News)

Submitted2026-04-21 14:11 UTC (Hacker News)

HN activity126 points · 44 comments

Length1.1K words (~5 min read)

Languageen

High-performance AI gateway written in Go - unified OpenAI-compatible API for OpenAI, Anthropic, Gemini, Groq, xAI & Ollama. LiteLLM alternative with observability, guardrails & streaming. ...

A high-performance AI gateway written in Go, providing a unified OpenAI-compatible API for OpenAI, Anthropic, Gemini, xAI, Groq, OpenRouter, Z.ai, Azure OpenAI, Oracle, Ollama, and more.

Animated GoModel AI gateway dashboard showing usage analytics, token tracking, and estimated cost monitoring

Quick Start - Deploy the AI Gateway

Step 1: Start GoModel

docker run --rm -p 8080:8080 \
  -e LOGGING_ENABLED=true \
  -e LOGGING_LOG_BODIES=true \
  -e LOG_FORMAT=text \
  -e LOGGING_LOG_HEADERS=true \
  -e OPENAI_API_KEY="your-openai-key" \
  enterpilot/gomodel

Pass only the provider credentials or base URL you need (at least one required):

docker run --rm -p 8080:8080 \
  -e OPENAI_API_KEY="your-openai-key" \
  -e ANTHROPIC_API_KEY="your-anthropic-key" \
  -e GEMINI_API_KEY="your-gemini-key" \
  -e GROQ_API_KEY="your-groq-key" \
  -e OPENROUTER_API_KEY="your-openrouter-key" \
  -e ZAI_API_KEY="your-zai-key" \
  -e XAI_API_KEY="your-xai-key" \
  -e AZURE_API_KEY="your-azure-key" \
  -e AZURE_BASE_URL="https://your-resource.openai.azure.com/openai/deployments/your-deployment" \
  -e AZURE_API_VERSION="2024-10-21" \
  -e ORACLE_API_KEY="your-oracle-key" \
  -e ORACLE_BASE_URL="https://inference.generativeai.us-chicago-1.oci.oraclecloud.com/20231130/actions/v1" \
  -e ORACLE_MODELS="openai.gpt-oss-120b,xai.grok-3" \
  -e OLLAMA_BASE_URL="http://host.docker.internal:11434/v1" \
  enterpilot/gomodel

⚠️ Avoid passing secrets via -e on the command line - they can leak via shell history and process lists. For production, use docker run --env-file .env to load API keys from a file instead.

Step 2: Make your first API call

curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-5-chat-latest",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'

That's it! GoModel automatically detects which providers are available based on the credentials you supply.

Supported LLM Providers

Example model identifiers are illustrative and subject to change; consult provider catalogs for current models. Feature columns reflect gateway API support, not every individual model capability exposed by an upstream provider.

Provider	Credential	Example Model	Chat	`/responses`	Embed	Files	Batches	Passthru
OpenAI	`OPENAI_API_KEY`	`gpt-4o-mini`	✅	✅	✅	✅	✅	✅
Anthropic	`ANTHROPIC_API_KEY`	`claude-sonnet-4-20250514`	✅	✅	❌	❌	✅	✅
Google Gemini	`GEMINI_API_KEY`	`gemini-2.5-flash`	✅	✅	✅	✅	✅	❌
Groq	`GROQ_API_KEY`	`llama-3.3-70b-versatile`	✅	✅	✅	✅	✅	❌
OpenRouter	`OPENROUTER_API_KEY`	`google/gemini-2.5-flash`	✅	✅	✅	✅	✅	✅
Z.ai	`ZAI_API_KEY` (`ZAI_BASE_URL` optional)	`glm-5.1`	✅	✅	✅	❌	❌	✅
xAI (Grok)	`XAI_API_KEY`	`grok-2`	✅	✅	✅	✅	✅	❌
Azure OpenAI	`AZURE_API_KEY` + `AZURE_BASE_URL` (`AZURE_API_VERSION` optional)	`gpt-4o`	✅	✅	✅	✅	✅	✅
Oracle	`ORACLE_API_KEY` + `ORACLE_BASE_URL`	`openai.gpt-oss-120b`	✅	✅	❌	❌	❌	❌
Ollama	`OLLAMA_BASE_URL`	`llama3.2`	✅	✅	✅	❌	❌	❌

✅ Supported ❌ Unsupported

For Z.ai's GLM Coding Plan, set ZAI_BASE_URL=https://api.z.ai/api/coding/paas/v4. For Oracle, set ORACLE_MODELS=openai.gpt-oss-120b,xai.grok-3 when the upstream /models endpoint is unavailable.

Alternative Setup Methods

Running from Source

Prerequisites: Go 1.26.2+

Create a .env file:
```
cp .env.template .env
```
Add your API keys to .env (at least one required).
Start the server:
```
make run
```

Docker Compose

Infrastructure only (Redis, PostgreSQL, MongoDB, Adminer - no image build):

docker compose up -d
# or: make infra

Full stack (adds GoModel + Prometheus; builds the app image):

cp .env.template .env
# Add your API keys to .env
docker compose --profile app up -d
# or: make image

Service	URL
GoModel API	http://localhost:8080
Adminer (DB UI)	http://localhost:8081
Prometheus	http://localhost:9090

Building the Docker Image Locally

docker build -t gomodel .
docker run --rm -p 8080:8080 --env-file .env gomodel

OpenAI-Compatible API Endpoints

Endpoint	Method	Description
`/v1/chat/completions`	POST	Chat completions (streaming supported)
`/v1/responses`	POST	OpenAI Responses API
`/v1/embeddings`	POST	Text embeddings
`/v1/files`	POST	Upload a file (OpenAI-compatible multipart)
`/v1/files`	GET	List files
`/v1/files/{id}`	GET	Retrieve file metadata
`/v1/files/{id}`	DELETE	Delete a file
`/v1/files/{id}/content`	GET	Retrieve raw file content
`/v1/batches`	POST	Create a native provider batch (OpenAI-compatible schema; inline `requests` supported where provider-native)
`/v1/batches`	GET	List stored batches
`/v1/batches/{id}`	GET	Retrieve one stored batch
`/v1/batches/{id}/cancel`	POST	Cancel a pending batch
`/v1/batches/{id}/results`	GET	Retrieve native batch results when available
`/p/{provider}/...`	GET, POST, PUT, PATCH, DELETE, HEAD, OPTIONS	Provider-native passthrough with opaque upstream responses
`/v1/models`	GET	List available models
`/health`	GET	Health check
`/metrics`	GET	Prometheus metrics (when enabled)
`/admin/api/v1/usage/summary`	GET	Aggregate token usage statistics
`/admin/api/v1/usage/daily`	GET	Per-period token usage breakdown
`/admin/api/v1/usage/models`	GET	Usage breakdown by model
`/admin/api/v1/usage/log`	GET	Paginated usage log entries
`/admin/api/v1/audit/log`	GET	Paginated audit log entries
`/admin/api/v1/audit/conversation`	GET	Conversation thread around one audit log entry
`/admin/api/v1/models`	GET	List models with provider type
`/admin/api/v1/models/categories`	GET	List model categories
`/admin/dashboard`	GET	Admin dashboard UI
`/swagger/index.html`	GET	Swagger UI (when enabled)

Gateway Configuration

GoModel is configured through environment variables and an optional config.yaml. Environment variables override YAML values. See .env.template and config/config.example.yaml for the available options.

Key settings:

Variable	Default	Description
`PORT`	`8080`	Server port
`GOMODEL_MASTER_KEY`	(none)	API key for authentication
`ENABLE_PASSTHROUGH_ROUTES`	`true`	Enable provider-native passthrough routes under `/p/{provider}/...`
`ALLOW_PASSTHROUGH_V1_ALIAS`	`true`	Allow `/p/{provider}/v1/...` aliases while keeping `/p/{provider}/...` canonical
`ENABLED_PASSTHROUGH_PROVIDERS`	`openai,anthropic,openrouter,zai`	Comma-separated list of enabled passthrough providers
`STORAGE_TYPE`	`sqlite`	Storage backend (`sqlite`, `postgresql`, `mongodb`)
`METRICS_ENABLED`	`false`	Enable Prometheus metrics
`LOGGING_ENABLED`	`false`	Enable audit logging
`GUARDRAILS_ENABLED`	`false`	Enable the configured guardrails pipeline

Quick Start - Authentication: By default GOMODEL_MASTER_KEY is unset. Without this key, API endpoints are unprotected and anyone can call them. This is insecure for production. Strongly recommend setting a strong secret before exposing the service. Add GOMODEL_MASTER_KEY to your .env or environment for production deployments.

Response Caching

GoModel has a two-layer response cache that reduces LLM API costs and latency for repeated or semantically similar requests.

Layer 1 - Exact-match cache

Hashes the full request body (path + Workflow + body) and returns a stored response on byte-identical requests. Sub-millisecond lookup. Activate by environment variables: RESPONSE_CACHE_SIMPLE_ENABLED and REDIS_URL.

Responses served from this layer carry X-Cache: HIT (exact).

Layer 2 - Semantic cache

Embeds the last user message via your configured provider’s OpenAI-compatible /v1/embeddings API (cache.response.semantic.embedder.provider must name a key in the top-level providers map) and performs a KNN vector search. Semantically equivalent queries - e.g. "What's the capital of France?" vs "Which city is France's capital?" - can return the same cached response without an upstream LLM call.

Expected hit rates: ~60–70% in high-repetition workloads vs. ~18% for exact-match alone.

Responses served from this layer carry X-Cache: HIT (semantic).

Supported vector backends: qdrant, pgvector, pinecone, weaviate (set cache.response.semantic.vector_store.type and the matching nested block).

Both cache layers run after guardrail/workflow patching so they always see the final prompt. Use Cache-Control: no-cache or Cache-Control: no-store to bypass caching per-request.

See DEVELOPMENT.md for testing, linting, and pre-commit setup.

Roadmap to 0.2.0

Must Have

Intelligent routing
Broader provider support: Oracle model configuration via environment variables, plus Cohere, Command A, Operational, and DeepSeek V3
Budget management with limits per user_path and/or API key
Editable model pricing for accurate cost tracking and budgeting
Full support for the OpenAI /responses and /conversations lifecycle
Prompt cache visibility showing how much of each prompt was cached by the provider
Guardrails hardening: better UI, simpler architecture, easier custom guardrails, and response-side guardrails before output reaches the client
Passthrough for all providers, beyond the current OpenAI and Anthropic beta
Fix failover charts in the dashboard

Should Have

Cluster mode

Community

Join our Discord to connect with other GoModel users.

Star History

↑ top

10.Edit store price tags using Flipper Zero

Sourcehttps://github.com/i12bp8/TagTinker

SiteGitHub

Submittertrueduke (Hacker News)

Submitted2026-04-19 09:26 UTC (Hacker News)

HN activity186 points · 189 comments

Length843 words (~4 min read)

Languageen

Flipper Zero app for ESL research using IR. All based on https://www.furrtek.org/?a=esl - i12bp8/TagTinker

Infrared ESL Research Toolkit for Flipper Zero
_{Protocol study • Signal analysis • Controlled display experiments on authorized hardware}

_{Owner-authorized lab display experiment}

Important

TagTinker is a research tool.

It is intended only for protocol study, signal analysis, and controlled experiments on hardware you personally own or are explicitly authorized to test.

This repository does not authorize access to, modification of, or interference with any third-party deployment, commercial installation, or retail environment.

Warning

Strictly prohibited uses include:

Testing against deployed third-party systems
Use in retail or commercial environments
Altering prices, product data, or operational displays
Interfering with business operations
Bypassing pairing, authorization, or security controls
Any unauthorized, unlawful, or harmful activity

Overview

TagTinker is a Flipper Zero app for educational research into infrared electronic shelf-label protocols and related display behavior on authorized test hardware.

It is focused on:

protocol observation and replay analysis
controlled display experiments
monochrome image preparation workflows
local tooling for research and interoperability testing

This README intentionally avoids deployment-oriented instructions and excludes guidance for interacting with live commercial systems.

Features

Text, image, and test-pattern display experiments
Local web-based image preparation utility (tools/tagtinker.html)
Signal and response testing for authorized bench hardware
Small, modular codebase suitable for further research
Research-first project structure with clear scope boundaries

FAQ

Where is the .fap release?

The Flipper app is source-first. Build the .fap yourself from this repository with ufbt so it matches your firmware and local toolchain.

What if it crashes or behaves oddly?

The maintainer primarily uses TagTinker on Momentum firmware with asset packs disabled and has not had issues in that setup. If you are using a different firmware branch, custom asset packs, or a heavily modified device setup, start by testing from a clean baseline.

What happens if I pull the battery out of the tag?

Many infrared ESL tags store their firmware, address, and display data in volatile RAM (not flash memory) to save cost and energy.
If you remove the battery or let it fully discharge, the tag will lose all programming and become unresponsive ("dead"). It usually cannot be recovered without the original base station.

I found a bug or want to contribute — how can I get in touch?

You can contact me on:

Discord: @i12bp8
Telegram: @i12bp8

I'm currently traveling, so response times may be slower than usual. Feel free to open issues or Pull Requests anyway — contributions (bug fixes, improvements, documentation, etc.) are very welcome and will help keep the project alive while I'm away.

How It Works

TagTinker is built around the study of infrared electronic shelf-label communication used by fixed-transmitter labeling systems.

At a high level:

tags receive modulated infrared transmissions rather than ordinary consumer-IR commands
communication is based on addressed protocol frames containing command, parameter, and integrity fields
display updates are carried as prepared payloads for supported monochrome graphics formats
local tooling in this project helps researchers prepare assets and perform controlled experiments on authorized hardware

This project is intended to help researchers understand:

signal structure
frame and payload behavior
display data preparation constraints
safe, authorized bench-testing workflows

For the underlying reverse-engineering background and deeper protocol research, see:

Furrtek’s ESL research: https://www.furrtek.org/?a=esl
PrecIR reference implementation: https://github.com/furrtek/PrecIR

Project Scope

TagTinker is limited to home-lab and authorized research use, including:

infrared protocol study
signal timing and frame analysis
controlled experiments on owned or authorized hardware
monochrome asset preparation for testing
educational diagnostics and interoperability research

It is not a retail tool, operational tool, or field-use utility.

Responsible Use

You are solely responsible for ensuring that any use of this software is lawful, authorized, and appropriate for your environment.

The maintainer does not authorize, approve, or participate in any unauthorized use of this project, and disclaims responsibility for misuse, damage, disruption, legal violations, or any consequences arising from such use.

If you do not own the hardware, or do not have explicit written permission to test it, do not use this project on it.

Any unauthorized use is outside the intended scope of this repository and is undertaken entirely at the user’s own risk.

No Affiliation

This is an independent research project.

It is not affiliated with, endorsed by, authorized by, or sponsored by any electronic shelf-label vendor, retailer, infrastructure provider, or system operator.

Any references to external research, public documentation, or reverse-engineering work are included strictly for educational and research context.

Credits

This project is a port and adaptation of the excellent public reverse-engineering work by furrtek / PrecIR and related community research.

License

Licensed under the GNU General Public License v3.0 (GPL-3.0).
See the LICENSE file for details.

Warranty Disclaimer

This software is provided “AS IS”, without warranty of any kind, express or implied.

In no event shall the authors or copyright holders be liable for any claim, damages, or other liability arising from the use of this software.

Maintainer Statement

This repository is maintained as a narrowly scoped educational research project.

The maintainer does not authorize, encourage, condone, or accept responsibility for use against third-party devices, deployed commercial systems, retail infrastructure, or any environment where the user lacks explicit permission.

Research responsibly.

↑ top

12.Trellis AI (YC W24) Is hiring engineers to build self-improving agents

Sourcehttps://www.ycombinator.com/companies/trellis-ai/jobs/SvzJaTH-member-of-technical-staff-product-engineering-full-time

SiteY Combinator

Submittermacklinkachorn (Hacker News)

Submitted2026-04-21 17:01 UTC (Hacker News)

HN activity1 points · 0 comments

Length636 words (~3 min read)

Languageen

Trellis builds and deploys computer use agents to get patients access to life-saving medicine. Our computer-use AI agents process billions of dollars worth of therapies annually with patients in all fifty states. We do this by automating document intake, prior authorizations, and appeals at scale to streamline operations and accelerate care. We classify medical referrals, understand chart notes, and automate contract and reimbursement search to provide patients with accurate coverage determinations and cost responsibility. Think of us as the Stripe of healthcare billing and reimbursements. Trellis is a spinout from Stanford AI Lab and is backed by leading investors including YC, General Catalyst, Telesoft Partners, and executives at Google and Salesforce. 🧍🏻‍♂️Why work with us - Real impact at massive scale: We serve patients in all fifty states and are scaling to hundreds of healthcare locations. You'll directly see the number of patients who received treatment because of the agents you built. - Work with industry experts: Apply your AI alongside healthcare operations leaders who have overseen 50+ healthcare locations, gaining deep domain expertise while building cutting-edge technology. - Be at the forefront of AI in healthcare: Build production-grade agentic systems that make critical healthcare decisions, backed by robust evaluation frameworks. - Direct customer engagement: Work closely with F500 customers and the founding team. You'll wear multiple hats from technical architecture to customer success. - Extreme ownership: Own key parts of Trellis's technical infrastructure and have opportunities to launch new initiatives that process billions in healthcare transactions. - World-class team: Join team members who have won international physics olympiads, published economics research, were founding engineers at unicorn startups, and taught AI classes to hundreds of Stanford graduate students. - Incredible growth and traction: We've grown revenue 10x in the past few months alone and have XX% market share in the specialty healthcare markets we serve. What you'll build - Agentic frameworks for healthcare decision-making: Design and implement AI systems that autonomously navigate complex reimbursement logic and prior authorization workflows. - 24/7 AI co-workers: Build and deploy long-running agent workers that triage and process healthcare data around the clock, functioning as reliable digital teammates for care teams. - Production-grade AI systems: Develop your agents within our comprehensive evaluation suite, ensuring production-ready performance from day one. Requirements - Experience architecting, developing, and testing full-stack code end-to-end - Expertise in programming languages such as Python, Go and ML/NLP libraries such as PyTorch, TensorFlow, Transformers - Being proactive and a fast-learner with bias for action - Experience working with relational and non-relational databases, especially Postgres - Experience with data and ML infrastructure - Open source contributions and projects are a big plus - Experience with cloud platforms (e.g., AWS, Azure, GCP) and containerization technologies (e.g., Docker, Kubernetes) is a plus

AI for streamlining healthcare paperwork

Member of Technical Staff, Product Engineering (full-time)

$100K - $225K•0.10% - 1.50%•San Francisco, CA, US

Job type

Full-time

Role

Engineering, Full stack

Experience

Any (new grads ok)

Visa

US citizen/visa only

Apply to Trellis AI and hundreds of other fast-growing YC startups with a single profile.

Apply to role ›

About the role

Trellis builds and deploys computer use agents to get patients access to life-saving medicine.

Our computer-use AI agents process billions of dollars worth of therapies annually with patients in all fifty states. We do this by automating document intake, prior authorizations, and appeals at scale to streamline operations and accelerate care. We classify medical referrals, understand chart notes, and automate contract and reimbursement search to provide patients with accurate coverage determinations and cost responsibility. Think of us as the Stripe of healthcare billing and reimbursements.

Trellis is a spinout from Stanford AI Lab and is backed by leading investors including YC, General Catalyst, Telesoft Partners, and executives at Google and Salesforce.

🧍🏻‍♂️Why work with us

Real impact at massive scale: We serve patients in all fifty states and are scaling to hundreds of healthcare locations. You'll directly see the number of patients who received treatment because of the agents you built.
Work with industry experts: Apply your AI alongside healthcare operations leaders who have overseen 50+ healthcare locations, gaining deep domain expertise while building cutting-edge technology.
Be at the forefront of AI in healthcare: Build production-grade agentic systems that make critical healthcare decisions, backed by robust evaluation frameworks.
Direct customer engagement: Work closely with F500 customers and the founding team. You'll wear multiple hats from technical architecture to customer success.
Extreme ownership: Own key parts of Trellis's technical infrastructure and have opportunities to launch new initiatives that process billions in healthcare transactions.
World-class team: Join team members who have won international physics olympiads, published economics research, were founding engineers at unicorn startups, and taught AI classes to hundreds of Stanford graduate students.
Incredible growth and traction: We've grown revenue 10x in the past few months alone and have XX% market share in the specialty healthcare markets we serve.

What you'll build

Agentic frameworks for healthcare decision-making: Design and implement AI systems that autonomously navigate complex reimbursement logic and prior authorization workflows.
24/7 AI co-workers: Build and deploy long-running agent workers that triage and process healthcare data around the clock, functioning as reliable digital teammates for care teams.
Production-grade AI systems: Develop your agents within our comprehensive evaluation suite, ensuring production-ready performance from day one.

Requirements

Experience architecting, developing, and testing full-stack code end-to-end
Expertise in programming languages such as Python, Go and ML/NLP libraries such as PyTorch, TensorFlow, Transformers
Being proactive and a fast-learner with bias for action
Experience working with relational and non-relational databases, especially Postgres
Experience with data and ML infrastructure
Open source contributions and projects are a big plus
Experience with cloud platforms (e.g., AWS, Azure, GCP) and containerization technologies (e.g., Docker, Kubernetes) is a plus

About Trellis AI

Trellis helps healthcare providers treat more patients, faster—while eliminating pre-service paperwork.

We automate document intake, prior authorizations, and appeals at scale to streamline operations and accelerate care.

Our AI agent is trained on millions of clinical data points and converts messy, unstructured documents into clean, structured data directly in your EHR.

With Trellis, leading healthcare providers and pharmaceutical companies were able to:

Reduce time to treatment by over 90%
Improve prior authorization approval and reimbursement rates
Leverage structured data to enhance drug program performance and clinical decision-making

Administrative costs account for over 20% of U.S. healthcare spending—delaying care, draining revenue, and driving staff burnout while having less visibility into patient care than ever before. We built Trellis to tackle this head on.

Founded:2024

Batch:W24

Team Size:25

Status:Active

Founders

Mac Klinkachorn

Founder

Jacky Lin

Founder

Similar Jobs

Trackstar

Pax

zudo.work

Adaptional

Overstand Labs

Variance

Kinro

Alinea

HappyRobot

Dex

Solum Health

Landeed

Deep24

TraceRoot.AI

Adam

Dover

Exa

Closure

GroundControl

BoldVoice

↑ top

13.Running a Minecraft Server and More on a 1960s Univac Computer

Sourcehttps://farlow.dev/2026/04/17/running-a-minecraft-server-and-more-on-a-1960s-univac-computer

Sitefarlow.dev

AuthorNathan Farlow

Published2026-04-17

HN activity146 points · 24 comments

Length6.1K words (~27 min read)

Languageen

18 bit registers, 90kb of memory, ones’ complement, powered by RISC-V.

Check it out! Here I am running a Minecraft server on a 1960s UNIVAC 1219B computer:

Nathan standing next to the UNIVAC 1219B with a laptop running Minecraft.

Here’s a NES emulator rendering the first frame of Pinball:

An ASCII rendering of the NES Pinball title frame on teletype paper.

… and a selfie printed using the “overstrike” technique:

An ASCII portrait of Nathan printed on the teletype.

We ran a ton more crazy stuff, including:

OCaml programs (!)
A webserver
Curve25519 + AES encryption
A BASIC interpreter
ELIZA
Games like Oregon Trail, Wordle, and Battleship

… and so much more! All this on a 250khz computer with only 90kb RAM from the 1960s. I live for this kind of stuff! I’m obsessed with running code in weird places and smashing technical limitations. This project is my most ambitious project so far, taking about 8 months of work from myself and others.

The source for the project is here. Also see TheScienceElf’s video on this project!

The UNIVAC is a weird machine

The UNIVAC 1219B is a super weird machine and is hostile to modern programming in almost every way:

18 bit words. Memory addresses and values are 18 bits! Not even a power of two.
Ones’ complement arithmetic, kinda. Modern computers use two’s complement to represent signed integers. This computer uses ones’ complement, but with annoying differences around signed zero that we had to reverse engineer.
Just a few registers. One 36 bit register A can be individually addressed by AU:AL. You get that and another 18 bit B register.
Only 40,960 words of memory. That’s only 90kb total memory to split between our code and the memory it needs at runtime.
Banked memory. These 40,960 words of memory are split into 10 banks. You have to configure which bank your instructions address in advance.

The computer’s original purpose was to be used by the Navy to read in radar signals and direct artillery. It really is an amazing feat of engineering. The computer is shown on the left in the image below. To its right is the (currently semi-functional) magnetic tape unit.

The UNIVAC 1219B and its tape drive at the Vintage Computer Federation museum.

Nearby is the teletype, which is how we interface with the computer. You can type to the UNIVAC and it can type back; everything is printed to the same sheet of paper. It’s the stdin and stdout.

A Model 35 Teletype with its dust cover, paper spool, and keyboard.

Only two UNIVAC 1219s exist today, both rescued from Johns Hopkins University by folks from the Vintage Computer Federation. This is the only one that is operational.

Before we started this project, all the programs that existed were hand-written in UNIVAC assembly. We’re going to change that by getting C compiling!

The first encounter at VCF East 2025

The first time I came across the computer was during a trip to VCF East in April 2025. Bill and Steven were running demo programs on the machine. Duane, Bill, and Steven had done a ton of amazing work to rescue and restore this computer over the last 10 years.

Seeing this thing in person was genuinely inspiring: the flashing lights, the tacking of the teletype, the smell of the oil… I knew then that I needed to get some crazy code running on this thing. Something much more than fizzbuzz. I wanted a NES emulator. I wanted OCaml. How far could we push this hardware?

We need an emulator and assembler

The first things we need are an assembler for the UNIVAC assembly language and an emulator to run that assembled program. Luckily for us, Duane had written an assembler for UNIVAC assembly in BASIC (!) and an emulator in VB.NET many years ago.

Soon after VCF was over, TheScienceElf took a stab at writing a new assembler and emulator in Rust by consulting the scans of the incredible manuals and using Duane’s implementations as a reference.

The Rust emu was fast. It was 400x faster than the real UNIVAC hardware and 40,000x faster than the VB.NET emulator. This speed turned out to be entirely necessary to power the fuzz testing I’ll discuss later.

Both emulators weren’t hardware accurate at this point, but it was good enough to start!

Wee as a first attempt at a C compiler

Now that we have an emulator, how can we get C code running in it?

The fastest way to prove out a C compiler was to use wee, an old project of mine. It’s a tiny instruction set I’ve used previously to compile C to weird places.

It worked, but holy moly it was bad. A trivial fizzbuzz program took up ~27k words, or about 67% of the total memory of the computer. It took a full minute to compute the first 100 fizzbuzz lines. Since my goal was to get real and complex programs running, this was clearly not viable.

A RISC-V emulator is the move

We have to do something smarter than wee. There are many options, so let me clarify my main two goals:

I want to run real, big, interesting programs. I want to compile straight from github and let it rip on the machine. It’s less important that these real programs run maximally fast.
I must maintain my sanity.

We need to use a real compiler, like LLVM or GCC

I need all of the following to accomplish the goal of running real programs:

Full C standard library. In this case I used picolibc.
Soft float and other legalizations. I need all the types and operations to work. Floats, doubles, int32, int64, everything. Even though the UNIVAC doesn’t have hardware to do this natively.
Dead code elimination + size optimization. We need to pack things tightly into 90kb of space.
Other languages. I want to support more than C, like Rust, C++, Zig, etc.

Directly compiling to the UNIVAC won’t cut it

Writing an LLVM or GCC backend for the UNIVAC would be absolutely nightmarish and would violate my second goal to maintain my sanity. The ones’ complement arithmetic, 18-bit words, and banked memory would all be painful to hack into modern compilers.

And even if we did, to actually benefit from direct compilation, your C ints would be 18-bit ones’ complement ints. That’s technically allowed by the C spec (at least until C23 mandated two’s complement), but in practice, real code often assumes >=32-bit two’s complement, so off-the-shelf programs would break.

So emulate a target GCC already supports, like RISC-V

The idea is to use GCC to compile C to RISC-V, and then emulate that RISC-V on the UNIVAC by writing a RISC-V emulator in UNIVAC assembly.

Think about how nice this is:

One and done. Write the emulator once and never look at UNIVAC assembly again.
You can fuzz it. You can have high confidence that the emulator is correct by generating random RISC-V programs, running them through the emulator and a reference emulator, and comparing the final state of the registers.
Incremental dopamine. I read a blog post many years ago that stuck with me about structuring projects in a way that gives incremental dopamine throughout the implementation. If you try to write the whole project and only test things at the end, you may burn out before you’re positively rewarded by seeing something work. The base RISC-V instruction set has only 38 instructions we care about, which means there’s a clear end goal. We can check them off as we implement them and they pass the fuzz tests.
Dense binaries. We can encode a RISC-V instruction efficiently into 2 18-bit UNIVAC words to efficiently pack them into our limited memory. This also reserves us the option in the future to implement the compressed extension or add additional bespoke compression methods.

Emulation is slower, but that’s fine

The real downside of this approach is the runtime penalty to decode and emulate each instruction. After all the optimizations, it takes ~40 UNIVAC instructions to emulate 1 RISC-V instruction. That means that our 250khz UNIVAC computer can run a ~6khz RISC-V computer.

… and that’s pretty good! The real obstacle to running real, complex programs is that 40kw of memory. This emulation gives us the best space efficiency along with its other benefits.

Building the toolchain

Here’s the high level flow of the toolchain:

Write C.
Compile to RISC-V with GCC.
Re-encode each instruction into a UNIVAC-efficient format, 2 words per RISC-V instruction.
Append these re-encoded instructions to the emulator’s source.
Assemble the program into a .76 tape file to be loaded onto the machine.

Writing ~1000 lines of UNIVAC assembly for the RISC-V emulator isn’t going to be easy; you have to have good tooling before doing this. Before I ever started writing this program, I spent a couple weeks preparing:

An emacs major mode.
OCaml tooling for parsing, emulating, and re-encoding RISC-V, with round-trip fuzzing.
Differential fuzzer that checked my UNIVAC RISC-V emulator against a ground truth (mini-rv32ima).
Efficient test case reducer (using a port of Lithium).

And oh boy this investment paid dividends.

Claude Code can’t write UNIVAC assembly yet

Claude Code is great – it wrote the entire emacs major mode for me given the instruction docs. I use it frequently for code editing tasks as I write OCaml. To my dismay though, even with the docs, emulator, and differential fuzzer, Claude Code fell on its face when writing UNIVAC assembly. I can’t really blame it. UNIVAC assembly is just really weird.

No matter what I did, at this point of the project, Claude Code could not internalize the UNIVAC’s idiosyncrasies, like its ones’ complement arithmetic, the fact that left shift is circular and right shift is arithmetic, and the weird instruction special cases, like CPAL behaving differently with 0.

I can write UNIVAC assembly, though

There are moments in all programmers’ lives where you have to just lock in and grind it out. So I rolled up my sleeves, and in a matter of a few days, I typed the ~1000 lines of UNIVAC assembly to implement the 38 RISC-V instructions we needed from the base set. It was honestly an enjoyable experience!

A UNIVAC assembly file open in emacs with syntax highlighting.

The emacs major mode enables syntax highlighting and provides help text that shows the timing of the instruction.

The fuzz testing caught bugs and reduced them to a minimal repro instantly. Once the fuzzer passed for an instruction, I happily moved on; I didn’t care about efficiency at this point, just correctness.

The first C program works!

Once all the fuzz tests were passing, I ran my first C program. It…. almost worked! There was a small bug in how RISC-V memory addresses translated to UNIVAC memory addresses. I updated my fuzzer so that it would catch the bug, fixed it, and all the C programs just worked from that point on! I thanked my past self profusely for writing the fuzzer.

This was an amazing moment. Fizzbuzz worked. A BASIC interpreter worked. Even smolnes, a NES emulator, was working!

…the only catch is that it would take 20 hours to render the first frame of Pinball on the real computer (3 minutes in the emulator). We don’t have 20 hours to wait at the museum unfortunately, so is the NES idea doomed?

Not even close; we just have to optimize the hell out of this thing.

Now make it 30x faster

Our UNIVAC emulator keeps track of the total time it would take to run the programs on the real machine. This gives us a number to optimize against.

There were two numbers I focused on optimizing:

The runtime of all fuzzed programs, which gives a good average metric across all instructions
The NES demo, a representative benchmark I actually cared about making as fast as possible

Move work from runtime to encode time

The most important optimization of all is to re-encode the RISC-V instructions into a format that’s maximally efficient for the UNIVAC. A RISC-V instruction is 32 bits. Our re-encoding takes this 32 bit instruction, does some transformation, and writes the result into two 18 bit words for the UNIVAC emulator to use.

I was blown away when I read the RISC-V spec and learned how it encodes immediates: the bits are scrambled within the instruction!

The RISC-V JAL instruction encoding diagram from the official spec.

...huh? (source)

You need to spend a ton of cycles bit shifting and masking in order to reconstruct the immediate in your software emulator. Apparently this is convenient and efficient for hardware implementations? We can’t spare those cycles though, so the obvious idea is to unscramble the bits ahead of time and write them down in the right order in the re-encoding we give to the UNIVAC emu.

It’s the same story for the opcode. Deciding on how to emulate a RISC-V instruction can sometimes require you to check various non-contiguous bits in the instruction. Our encoding just assigns a convenient opcode number to each instruction.

Beyond unscrambling immediates, if there is anything that an instruction handler does immediately, bake that into the instruction directly. For example, some handlers need to immediately compute immediate * 2. May as well just store immediate * 2 instead of immediate.

The most extreme version of this are the SRLI and SRAI instructions. On the UNIVAC, we can’t shift by a variable amount. The solution is to dynamically create a shift instruction at runtime in a self-modifying-code-like way, and then execute it. But the work of creating said UNIVAC instruction can actually be done ahead of time! For SRLI/SRAI, we straight up package a UNIVAC instruction directly in the payload to later be extracted, written to RAM, and executed.

These transformations technically mean that we lose the ability to support RISC-V programs that depend on self-modifying code. But that’s a fine tradeoff for this massive speed gain.

Make the hot path faster

Classic optimization ideas still apply on the UNIVAC:

Delete dead code. A clever thing I did here was repurpose my test case minimizer to delete as many UNIVAC instructions as possible from the emulator such that the fuzz tests continued to pass. That found code I could just delete!

Jump tables. The most efficient instruction dispatch method turned out to be jump tables based on the opcode.

Instruction reordering and register liveness. The fewer times you have to store and reload registers to/from memory, the better.

Inline code. Subroutine calls have jump + return overhead; inline small functions to skip that.

Add an OCaml macro system to manage inlining

Inlining code buys you speed but will become unmaintainable if you don’t have a macro system to save you from copy + pasting code all over the place. I wrote a simple OCaml macro system: any OCaml you write between triple backticks can inject contents directly into the file. How fun 🐪

Here’s an example of reducing code duplication:

An OCaml helper function defined at the top of an asm file and called from two instruction handlers below.

And here’s an example where I use OCaml to generate a lookup table with 32 entries:

A short OCaml expression in an asm file that generates a 32-entry data table.

Add some fast syscalls, use good compiler flags

Most of these C programs have global variables we need to set to 0 at startup in the .bss section. That takes time, so I add a memclear syscall that will do this quickly in UNIVAC assembly to optimize startup time.

I also added a .noinit linker annotation to opt some big global buffers out of .bss initialization that didn’t really need it.

On the compiler flag side, -O3 does help speed things up, but not drastically compared to -Os. The UNIVAC lacks fancy hardware like caches, branch predictors, and the like for compilers to take advantage of.

Claude Code micro-optimizes massively in parallel

Having both a comprehensive fuzzer and a numeric metric to optimize against is a perfect environment for LLMs to do great work on a project.

There was a ton of low hanging fruit in my initial implementation around instruction reordering, dead code, etc. I had a Claude Code workflow that spawned 10 subagents in parallel, each in its own worktree, to independently explore and test different optimization ideas.

A terminal showing 10 Claude Code agents running in parallel in separate worktrees.

10 Claude Code subagents trying to optimize the emulator in parallel.

The main agent would merge them together assuming they met some criteria I wrote about maintainability/quality. (Don’t just inline everything, pretty please). I’d look at the final result and weigh the complexity-maintainability tradeoff before merging.

This worked well! After many iterations, I got a ~20% total speedup from this method alone.

I had to strengthen my fuzzer a couple times when the LLM would break something and the fuzzer didn’t catch it. I’d like to propose Murphy’s law of vibe-optimizing:

When LLMs optimize a program, in the limit, if any part of the system is not codified by tests, there will be a bug introduced there.

Claude Code writes the multiplication handler in Python

Another way we could get big speedup on some C programs is to implement the multiply instruction in our emulator. The base RISC-V instruction set doesn’t have multiply; the compiler can work around it by emitting adds and shifts.

I set Claude Code on the job, but this is a big ask. We need to emulate two’s complement 32 bit multiplication with weird 18 bit ones’ complement operations. Even with fuzz testing, the ability to trace the program execution, docs, 1000+ lines of high quality example asm, and many parallel attempts, Claude Code still failed.

That’s when I had the following idea: for each of the UNIVAC arithmetic instructions, implement them as Python functions. Then, ask Claude Code to write a Python program that emulates 32 bit multiplication with these functions. I’ll give it some fuzz tests, too.

The motivation here is that:

Claude Code is more familiar with Python
It can write nested expressions rather than simple asm statements
It can assign results to variables and write helper functions
It can use standard Python debugging techniques

Sure enough, with enough parallelism and time, it was able to write this Python script!

I then prompted the simpler task: translate the Python program to UNIVAC assembly. And it worked!

The multiply handler is 676 lines of inscrutable UNIVAC assembly, making up ~43% of the entire emulator. It’s a gross monstrosity, but it offers a 6x speedup for multiplication-heavy programs like primality checking and elliptic curve crypto, so it stays for now.

30x speedup

All in, NES frame time dropped from ~20 hours to ~40 minutes (30x speedup!). This was finally short enough that we could run it at the museum over lunch.

It was about time TheScienceElf and I reached out to Duane, Bill, and Steven to tell them what we had done. We hadn’t really talked to them since our visit, and since I had already sunk so much time into this project, I suddenly worried what I’d do if the computer had broken since the last time we talked.

I sent the email off and announced our UNIVAC 1219 Rust emulator, the C toolchain, and the fact that we could run real programs. So, can we visit the museum and try it out?

Everyone loved it! We made a plan to visit the museum in January.

In the weeks leading up to our trip, Duane was a massive help on technical questions given his 25+ years of UNIVAC experience. He answered questions about the computer’s ones’ complement edge cases, the IO channel setup, TTY character encoding, the bootstrap loading process, and much more. Thank you Duane!

Museum Visit #1: Hardware debugging and loading code

The day finally came. TheScienceElf, Steven, Bill, and I rolled up to the museum on a January morning. Duane was on call remotely. We booted up the UNIVAC, but there was trouble. The WAIT light came on.

The UNIVAC indicator panels with one lamp lit on the channel 4 row.

WAIT light is on due to spurious activity on channel 4.

The computer refuses to execute any instructions when the WAIT light is on. Apparently this has been a known issue for a while; the strategy in the past was to wait until the machine warms up for it to go away. After we waited 30 minutes and the light was still on, we were giving up hope. We gave Duane a call for help. Bill, Duane, and Steven traced the circuits in the manual and decided to disconnect IO channel 4 altogether. That worked! No more interrupt light! We think channel 4 had some bad hardware that was causing spurious activity and therefore interrupts.

Now for the fun part: we need to figure out how to load our programs. The usual way this works is to:

Manually push buttons and levers on the front panel to program in ~30 instructions into the computer memory. This is the paper tape bootstrap program, capable of loading a program from the paper tape reader.
Next, load the LECPAC roll of tape into the tape reader. LECPAC is a utility program that has useful debugging and program loading features.
Push some buttons and flip some levers to configure LECPAC to read from channel 7, the serial IO channel. Duane did amazing work to develop a Teensy project that converts the UNIVAC’s parallel IO interface to serial so that we could connect our computers and send/receive data.
Run the LECPAC loading routine to read our program in from serial!

A Teensy-based RS-232 to parallel adapter.

Duane's UNIVAC IO <-> serial adapter. A Teensy gives us a regular serial port so we can talk to the UNIVAC from our laptops.

But we were having trouble with step 4: we were just loading garbage data. We tried every permutation of USB cable, serial cable, and laptop we had. Nothing was getting through.

Duane emailed us a small program, only 8 instructions, to debug the serial input. We keyed it in by hand using the front panel. The program would wait for a character on the serial channel and display the result in the accumulator, whose value would then be shown in the lights on the front panel.

We used this program to experiment with different serial configurations until we sent the letter “A” and saw the correct value appear in AL. And we found the serial configuration we needed!

We loaded “Hunt the Wumpus”, a known good program written by Duane, to test the loading process over serial with our laptop. It worked! But when we tried to load our own programs from our toolchain, they failed to load. Why??

We diffed our tape files against Wumpus and realized we needed to pad the beginning of our tape file with zeros… for some reason. With that fix, our programs loaded into memory successfully!

Now for the moment of truth. We set the PC register to the start address of our C “hello world” program, hit the run switch, and…. nothing. The program was stuck for some reason that we didn’t understand. We loaded up another of our programs, a program to compute pi, and started it. Instead of printing pi, it printed a random sequence of garbage:

Teletype paper showing a line of random ASCII characters instead of the digits of pi.

Definitely not pi.

We used the front levers to step through about 20 instructions and compared against our emulator trace. It was looking good, but after spending all day on the hardware and loader, we ran out of time to find the divergence. We used LECPAC to take some core dumps for offline analysis and called it a day. (A real core dump! This thing actually uses core memory!)

What a great success though! We fixed a hardware issue and figured out how to load our programs. The next time we come back, it will be all software debugging, and that’s what I’m best at.

Fuzzing and tracing to get the emulator matching the hardware

We scheduled another trip for a month out. How can we prepare in the meantime? If the computer is just spitting garbage at you, what do you do?

We need to gain confidence that our Rust emulator matches the hardware. This is when I wrote some of my favorite programs of all:

A fuzzing program generates instruction “fingerprints”

I wrote a diagnostic program in UNIVAC assembly that takes each arithmetic instruction (ADDAL, ADDA, SUBAL, SUBA, etc.), runs it hundreds of times with pseudorandom inputs, accumulates the results into a hash, and prints the hash to the teletype. The hash is a fingerprint for the instruction’s behavior. If you run the same program on two different implementations, matching hashes mean the implementations agree. Different hashes mean there’s a divergence somewhere.

The output is one opcode per line, each followed by its octal hash:

ADDAL: 614424 223254
ADDA: 020656 635560
ADDAB: 401323 107167
SUBAL: 633336 720540
SUBA: 235365 124723
...

That’s all well and great, but when the fingerprint differs, what do you do? Why did it differ? And on what inputs? You can’t know. That’s where the software tracer comes in:

A tracer to run UNIVAC instructions one at a time

This is the wildest program that I wrote. It’s a software tracer, written in UNIVAC assembly, that runs another UNIVAC program instruction by instruction, printing the full machine state (PC, instruction, AU, AL, B, SR, ICR) between every step to the serial port. The idea is that we can diff this printout with our emulated trace and identify exactly when and why the trace differed.

Exactly how to write this software tracer is a mind-bending challenge. In short, the tracer maintains its own PC pointing into the target program. For each step, it copies the current instruction from the target into its own memory. It saves the full machine state, executes the copied instruction, then saves the state again and prints the result. Some instructions, like jumps, have to be modified to point to the tracer’s handlers, but the CPU still evaluates the jump condition itself, so the tracer doesn’t reimplement conditional logic.

Here’s the tracer running over the first few instructions of an example program. Each row is the machine state captured before that instruction executes:

PC     INSN   AU     AL     B      SR     ICR
050000 340007 000000 000000 000000 000000 000000
050007 507300 000000 000000 000000 000000 000000
050010 507200 000000 000000 000000 000000 000000
050011 701234 000000 000000 000000 000000 000000
050012 100001 000000 001234 000000 000000 000000
050013 440003 123456 001234 000000 000000 000000
050014 460003 123456 001234 000000 000000 000000
050015 120001 123456 001234 000000 000000 000000
050016 140006 123456 123456 000000 000000 000000
050017 507200 123456 123555 000000 000000 000000
050020 360144 123456 123555 000000 000000 000000
050021 420003 123456 123555 000144 000000 000000
050022 507203 123456 123555 000144 000000 000000
050023 360310 123456 123555 000141 000000 000003

There’s actually nothing stopping us from making this an interactive program and hooking GDB up to the real hardware this way. It would be totally doable to set breakpoints, inspect memory, modify registers, single-step, etc.

Museum Visit #2: The first working programs

A month after our first visit, equipped with some legendary debugging programs, we made our way back to the museum. We need to start by proofing out the most basic primitives. Can we even print text to the teletype correctly?

We started with a handwritten “HI” program in UNIVAC assembly. It worked on the first try! Now it was time to run our instruction fingerprinting program. The fingerprints came streaming out, and sure enough, there was a difference from our emulator! The four 36-bit add/sub instructions were printing different fingerprints.

Teletype output listing one opcode per line, each followed by an octal hash.

The fingerprinting program reports hashes for each instruction.

I sicced Claude Code on the hardware fingerprints and let it brute force various interpretations of the manual until we had something that matched.

After we fixed this emulator difference, we ran the asm pi program in the emu. And it printed the same garbage that we saw on the hardware!!! This means that our emulator is probably accurate now. I have never been so happy to see garbage!

Teletype paper above a laptop terminal, both showing the same garbage output from the pi program.

At this point we fixed the pi program and RISC-V emulator to work with the new interpretation of the 36 bit ops.

…and just like that, all of our programs worked. Hello world, fizzbuzz, Oregon Trail, BASIC, Figlet, ELIZA. A sudoku solver compiled from OCaml using C_of_ocaml. AES encryption. Baseball. Blackjack. Enigma encrypter and cracker. Wordle. All working! No need for the software tracer, even!

Teletype paper showing HELLO WORLD followed by fizzbuzz output.

Hello world and fizzbuzz were the first C programs to work on the UNIVAC.

Teletype paper showing an ELIZA conversation about being sad that the snow hasn't melted.

An ELIZA session. Come come, elucidate your thoughts.

A BASIC prime sieve listing and its output on the teletype.

Interactive BASIC interpreter running a prime sieve program.

We started the NES emulator over lunch. We came back thrilled to see that it printed the first visible frame!

A teletype printout of an NES Pinball frame held next to a laptop showing the same frame.

We also seized this opportunity to dump the full ASCII table to the teletype to learn its character set:

An ASCII table printed on the teletype with decimal, octal, and glyph columns.

This trip was so successful that we had some time to try out my most ambitious goal of the whole project: can we host a Minecraft server? I brought a PoC that I knew worked in the emulator.

A workbench with the UNIVAC teletype on the left, a Raspberry Pi in the middle, and a Mac laptop on the right.

The network setup: a raspberry pi runs the pppd bridge between Mac and UNIVAC serial.

We got as far as a PPP and TCP handshake happening, but didn’t get data through end to end.

Networking on the UNIVAC

My initial dream was to get a NES emulator working on the UNIVAC. But ever since we had accomplished that, I set my sights higher to try and host a Minecraft server on the computer. This is my most cursed idea yet, very technically hard, and it requires all the tools and knowledge we have spent the last many months building.

It’s important to me that we don’t cheat anything, so let’s lay out our goals:

For our PoC, I only care that our Minecraft client can login. So we only need to implement the Minecraft login protocol.
All interesting logic must happen on UNIVAC. No cheating.

The approach is to forward IP packets to the UNIVAC via PPP, and the UNIVAC itself implements all the PPP/IP/TCP/Minecraft protocols. In my setup, my Mac laptop connects to a port on the pi which simply forwards IP packets via pppd over serial to the UNIVAC. I’m pretty sure the Mac itself can directly run pppd, but I’m most comfortable with Linux, so I had the pi as an intermediary.

But how is it possible for the UNIVAC to run all these protocols? I’m not even sure 90kb of memory is enough to store the code of a full TCP implementation, let alone run one.

Here’s the key idea: throw all error handling out of TCP. Assume that only one connection happens at a time, all packets arrive in full and in order, and suddenly TCP is extremely simple! As long as you turn TCP into UDP, you can run it on the UNIVAC.

The Minecraft login implementation is derived from bareiron. With all these pieces together, you should be able to log into a world without blocks and fall to your death before disconnecting.

So why didn’t it work in the previous visit? I hypothesized that the issue was that the UNIVAC was dropping incoming packets on the floor as it was writing its packets out. This would actually not be a problem if we had correctly implemented all the TCP error handling, but we didn’t, so it’s critical that we don’t let this happen.

Fixing this means that we have to get our hands dirty and understand the concurrent IO features of the UNIVAC. The UNIVAC’s IO interface is roughly DMA: the hardware writes the incoming bytes into the buffer in memory you point it to. The IO interface has a mode called “Continuous Data Mode”, or “CDM”. We can configure CDM to restart the DMA at the start of the buffer once the buffer is full.

This gives us a ringbuffer primitive. We can separately track the last byte that we read from our program, and so long as we don’t fall behind more than our 4kb buffer size, we won’t drop bytes on the floor even if we’re busy processing or sending data on another channel.

Overstrike selfie art on the TTY

In the downtime before we went back to the museum, TheScienceElf was working on improving the accuracy of the emulated TTY. He sent me a screenshot correctly showing the TTY typing over the same character at the end of the line, just like what we saw at the museum when we forgot to put newlines in our output:

The emulated TTY printing pi digits and overwriting itself when the line wraps.

I was with my partner at the time and we had the idea that if we could type over the same character many times, we could achieve higher resolution ascii art. More variables to control = higher resolution. (Unfortunately we would be 50 years too late to this idea)

On the model 35 TTY, when you want to go to the next line, you send a carriage return (\r) followed by a newline (\n). \r moves the cursor back to the left, and \n moves the cursor down one row. If you only send \r, you’re able to type over the same line again, typing over what already was written.

I wrote a Python script that converts an image into a string of characters to send out to the teletype to do this. The algorithm is as follows:

Render each printable Model 35 character into a bitmap of ink coverage.
Divide the target image into a grid of cells, one per character position.
For each cell, greedily pick the character that most reduces perceptual error. Repeat up to some max strikes per cell. If ink overlaps, then set pixel darkness according to Beer-Lambert (0.5 -> 0.75 -> 0.875). Edges detected in the image are weighted higher in error calculation.
Spread residual error to neighboring cells via Floyd-Steinberg diffusion.

Take a close look at the image. You can see how several chars typed over each other contribute more darkness + texture:

A close-up of the overstrike portrait showing individual character cells.

Museum Visit #3: Minecraft, webserver, and a selfie

On our final trip to the museum, the UNIVAC came online right away with no hardware issues. I immediately had a good feeling about how the day was going to go.

We started the day by running a couple experiments that we brought with us:

We tested all cases of add/sub with -0 and +0. This is when we confirmed that the UNIVAC deviates from the typical ones’ complement scheme by normalizing -0 to +0 in the non-carry path.
We ran a memory check to confirm that we have exactly 40,960 words.

Then we ran some other programs we brought: TheScienceElf’s neat pi and Euler programs, my SHA-256 program, and Steven’s Battleship program all worked.

Teletype paper showing the SHA-256 utility hashing three inputs.

The SHA-256 utility hashing three inputs (HELLO WORLD, YAY, UNIVAC).

But now the moment of truth, will our concurrent IO changes work? Can we get Minecraft running?

As usual, we have to start simple and build up our primitives. I came prepared with a simple test program that printed the ringbuffer stats as it ran to confirm that my understanding of the manual was correct. And sure enough, it worked exactly like I expected over serial. The write pointer was circling through the buffer.

Next up for testing was our webserver program, since that was still simpler than Minecraft. I could feel myself getting nervous. We loaded it up and connected PPP… and no good! We couldn’t connect. My heart sank. It was lunch time, so we left to eat and brainstorm. But before we left, we kicked off the overstrike ASCII art program, which we expected to take 10s of minutes to print. The result looked great!

A strip of teletype paper held up above the Model 35, showing an ASCII portrait of Nathan.

When we came back, we reloaded the webserver to just try again… and it just worked! Maybe we misconfigured it on the first attempt? I could curl no problem. I loaded on my browser and…

Nathan giving a thumbs up next to the UNIVAC while holding a laptop with a webpage open.

Unreal! This demonstrates PPP/IP/TCP all working over serial on the UNIVAC to serve a webpage that I fetched with my modern computer! I couldn’t believe it. (I don’t know why that extra “H” appears. I bet it has something to do with the additional request Chrome makes for “favicon.ico”. No idea :) )

Now for the moment of truth. What about Minecraft?

We loaded the program. I started up my Minecraft client. I pointed it to the UNIVAC IP and clicked connect. Sure enough, on the first try, we logged in!

I was over the moon. All these months of debugging + clever hacks and finally we did what was thought impossible (at least, TheScienceElf thought it was impossible :)).

We spent the rest of the afternoon running programs to get footage for the video and generally celebrating our accomplishments of the last 8 months.

Conclusion

What a wild ride. My favorite projects are the ones that I didn’t know were possible when I set off to go do them.

I enjoy the thought that everything we did here was technically possible 60 years ago. Can you imagine going back in time and dropping them a paper tape with these programs on it? They’d lose their minds!

Thanks so much to the people that made this possible: Duane, Bill, Steven, and TheScienceElf. Thanks to all the staff at VCF for allowing us to come out and have a great time with the computer! What an amazing experience.

Thanks for reading! Source is here.

Wile E. Coyote operating a UNIVAC in a Looney Tunes cartoon.

Nathan pushes the buttons to start the Minecraft server on the UNIVAC, 2026 (colorized)

↑ top

14.Theseus, a Static Windows Emulator

Sourcehttps://neugierig.org/software/blog/2026/04/theseus.html

Siteneugierig.org

Submitterzdw (Hacker News)

Submitted2026-04-20 04:14 UTC (Hacker News)

HN activity24 points · 1 comments

Length2.9K words (~13 min read)

An new old approach to emulation.

April 19, 2026

This post is likely the end of my series on retrowin32.

I bring you: Theseus, a new Windows/x86 emulator that translates programs statically, solving a bunch of emulation problems while surely introducing new ones.

What happened to retrowin32?

I haven't been working on retrowin32, my win32 emulator, in part due to life stuff and in part because I haven't been sure where I wanted to go with it. And then someone who had contributed to it in the past posted retrotick, their own web-based Windows emulator that looks better than my years of work, and commented on HN that it took them an hour with Claude.

This is not a post about AI, both because there are too many of those already and because I'm not yet sure of my own feelings on it. But one small thing I have been thinking about is that (1) AI has been slowly but surely climbing the junior to senior engineer ladder; and (2) one of the main pieces of being a senior engineer is better understanding what you ought to be building, as distinct from how to build it.

(Is that just the Innovator's Dilemma's concept of "retreating upmarket", applied to my own utility as a human? Not even sure. I am grateful I do this work for the journey, to satisfy my own curiosity, because that means I am not existentially threatened like a business would be in this situation. As Benny Feldman says: "I cheat at the casino by secretly not having an attachment to material wealth!")

So, Mr. Senior Engineer, what ought we build? What problem are we even solving with emulators, and how do our approaches meet that? I came to a kind of unorthodox solution that I'd like to tell you about!

Emulators and JITs

The simplest CPU emulator is very similar to an interpreter. An input program, after parsing, becomes x86 instructions like:

mov eax, 3
add eax, 4
call ...  ; some Windows system API

An interpreting emulator is a big loop that steps through the instructions. It looks like:

loop {
   let instr = next_instruction();
   match instr {
      // e.g. `mov eax, 3`
      Mov => { set(argument_1(), argument_2()); }
      // e.g. `add eax, 4`
      Add => { set(argument_1(), argument_1() + argument_2()); }
      ...
  }
}

Like an interpreter, this approach is slow.

At a high level interpreters are slow because they are doing a bunch of dynamic work for each instruction. Imagine emulating a program that runs the same add instruction in a loop; the above emulator loop has all these function calls to repeatedly ask "what instruction am I running now?" and inspect the arguments, only to eventually do the same add on each iteration. x86 memory references are extra painful because they are very flexible.

Further, on x86 the add instruction not only adds the numbers but also computes six derived values, including things like the parity flag: whether the result contains an even number of 1 bits(!). A correct emulator needs to either compute all of these as well, or perform some sort of side analysis of the code to decide how to run it efficiently.

There are various fun techniques to improve emulators. But if you want to go fast what you really need is some combination of analyzing the code and generating native machine code from it — a JIT. JITs are famously hard to write! They are effectively optimizing compilers, which means all the complexity of optimization and generating machine code, but also where the runtime of the compilation itself is in the critical performance path. I liked this post's discussion of why JITs are hard which mentions there have been more than 15 attempts at a Python JIT.

Static binary translation

So suppose you want to generate efficient machine code, but you don't want to write a JIT. You know what's really good at analyzing code and generating efficient machine code from it? A compiler!

So here's the main idea. Given code like the above input x86 snippet, we can process it into source code that looks like:

regs.eax = 3;
regs.eax = add(regs.eax, 4);
windows_api();  // some native implementation of the API that was called

We then feed this code back in to an optimizing compiler to get a program native to your current architecture, x86 no longer needed.

In other words, instead of handing an .exe file directly to an emulator that might JIT code out, we instead have a sort of compiler that statically translates the .exe (via a second compiler in the middle) directly into a "native" executable.

(I write native in scare quotes because while the resulting executable is a native binary, it is a binary that is carrying around a sort of inner virtual machine representing the x86 state, like the regs struct in the above code. More on this in a bit.)

I think I came up with this basic idea on my own just by thinking hard about what I was trying to achieve, but it turns out this approach is known as static binary translation and is well studied. It has some nice properties, and also some big problems.

Decompilation

I'll go into those, but first, a minor detour about how I ended up here.

Have you heard of decompilation? These madmen (madpeople?) are manually recreating the source code to old video games, one function at a time. They take the game binary, extract the machine code of one function, then use a fancy UI (click one of the entries under "Recent activity") to iteratively tinker on reproducing the higher-level code that generates the exact same machine code. It's kind of amazing.

(To do this, they need to even run the same original compiler that was used to compile the target game. Those compilers are often Windows programs, which means implementing the above fancy UI involves running old Windows binaries on their Linux servers. This is how I first learned about them — they need a Windows emulator!)

Decompilation is not only just a weird and fascinating (and likely tedious?) human endeavor. It also highlighted something important for me: I don't so much care about having an emulator that can run any random program, I care about running a few very specific programs and I'm willing to go to even some manual lengths to help out.

In practice, if you look at a person building a Windows emulator, they end up as surgeons needing to kind of manually reach in and pump the heart of the target program themselves anyway, including debugging the target program and working around its individual bugs. It's common for emulators to even manually curate a list of programs that are known to work or fail.

An old idea

Statically translating machine code is not a new idea. Why isn't it more popular? My impression in trying to read about it is that it is often dismissed because it can't work, but at least so far it's worked well. Maybe I haven't yet encountered some impossible problem that I've so far overlooked?

(When trying to look up related work for this blog post, I saw this attempt at statically translating NES that concluded it can't be done, but then also these people seem to be succeeding at it so it's hard to say.)

I think there are two main problems, a technical one and a more cultural one.

The technical part is that the simple idea has complex details. To start with, any program that generates code at runtime (e.g. itself containing a JIT) won't work, but it's easy for me to just dismiss those programs as out of scope. There are also challenges around things like how control flow works, but those are small and interesting and I might go into them in future posts.

A common topic of research is that it's in the limit impossible to statically find all of the code that might be executed even in a program that doesn't generate code at runtime, because of dynamic control flow from vtables or jump tables. In particular, while there are techniques to find most of the code, no approach is guaranteed to work perfectly. This is where decompilation changed my view: if I'm willing to manually help out a bit on a specific program, then this problem might be fine?

The main cultural reason I think binary translation isn't more common is that it's not as convenient as a generic emulator that handles most programs already. Users aren't likely to want to run a compiler toolchain, though I have seen projects embed the compiler (e.g. LLVM) directly to avoid this.

The other cultural problem is there are legal ramifications if you intend to distribute translated programs. Every video game emulator relies on the legal fiction of "first, copy the game data from the physical copy you already own and pass that in as an input", so they get to plausibly remain non-derivative works.

But I'm not solving for users, I'm solving for my own interest. These cultural problems don't matter to me.

Benefits

Again consider the snippet above, which is adding 3 and 4. In a static translator world we parse the instruction stream ahead of time, so the compiler gets to see that we want to put a 3 in eax and not (as an interpreter would) spend runtime considering what values we are reading and writing where.

A compiler will not only generate the correct machine code for the target architecture, it even will optimize code like the above to just store the resulting value 7. And a compiler is capable of eliminating unneeded code like parity computations if you frame things right. Because the Theseus code generation happens "offline", separately from the execution of the program, I can worry less than a JIT might to about spending runtime analyzing the code to try to help.

When I started this I had thought that performance would be the whole benefit of this approach, but it turns out to be easier to develop as well because it brings in all of the other developer tools:

The translated instructions appear as regular code in the output program, which means the native debugger can step translated instructions, which appear as regular source code.
If the program crashes, the native stack trace traces back in to the (translated assembly of the) original program.
I haven't tried it yet, but CPU profiling ought to have the same benefit.

In retrowin32 I ended up building a whole debugger UI to help track down problems, but in Theseus I've just used my system debugger so far and it's been fine.

In retrowin32 I also spent a lot of time fiddling with the bridge between the emulator and native code. This boundary still exists in Theseus but it is so much smaller, because the translated code can directly call my native win32 system API implementation (with a bit of glue code to move data in and out of the inner machine's representation).

On MacOS retrowin32 could run under Rosetta but it meant the entire executable needed to be an x86-64 binary, which meant it required a cross-compiled SDL. A Theseus binary is native code that just calls the native SDL.

All told it is just much simpler. From the start of this idea to getting the test program I've tinkered with all this while running its first scene, including DirectX, FPU, and MMX, only took me a couple weeks.

Partial evaluation

You can think of the different approaches of interpreter to JIT to static binary as a spectrum of how much work you do ahead of time versus at runtime. Theseus take the dynamic question of "what kind of mov is this" and move it to the ahead of time compilation step, partially evaluating the generic instruction handler into a specific instruction with nailed-down arguments. (I'll link again to the excellent blog about meta-tracing C code. Read about Futamura projections for this idea taken to its extreme conclusion!)

For another example, a typical Windows emulator must parse and load the PE executable on startup, but Theseus does that at compile time and writes out just the data structures needed to execute it. The PE-parsing code isn't needed in the output.

Similarly, executable startup involves linking and loading any referenced DLLs including those from the system, but Theseus must see all the code it will run, so it does this linking ahead of time. Here's some output near a call to a Windows API, where at compile time it resolved an IAT reference (the ds:[...] address) directly to the Rust implementation I wrote:

// 004012a0 push 4070A4h
push(ctx, 0x4070a4u32);
// 004012a5 push 8
push(ctx, 0x8u32);
// 004012a7 call dword ptr ds:[4060E8h]
call(ctx, 0x4012ad, Cont(user32::CreateWindowExA_stdcall))

In some sense it's as if Theseus at compile time is partially running the system binary loader and the output source code is a snapshot of the ready state. It reminds me a bit of the problem of unpacking executables.

WebAssembly

Theseus should easily extend to running on the web under WebAssembly; most of it is just compiling the generated program with wasm as the target architecture. (I initially had this working then decided I don't need the additional complexity for now, so it isn't implemented.)

Separately, the output program from Theseus is inspired by how WebAssembly is executed. In both there is an outer host program that carries within it a "machine" with its own idea of code and memory. The code within that machine can only read/write to its own memory and must call provided hooks to bridge out to the host. Like WebAssembly, the Theseus output executable code is isolated from the data, with the nice property that no amount of unintentional/malicious memory writes can create new code.

A wasm Theseus would be a turducken of machines:

the native host machine's WebAssembly implementation (e.g. the Chrome runtime), with its notion of memory, runs a
WebAssembly virtual machine with the Theseus wasm blob, with its own idea about memory (e.g. where my Rust implementation of the Windows API puts allocations), and within that there is
the x86 virtual machine and Windows program's notion of memory (which e.g. might say "read from the static data table at memory offset $x").

In thinking about it, it's tempting to try to blend some layers of machines here, and make the WebAssembly program's memory 1:1 with the input Windows program's idea of memory. That is, if the input program writes to some address $x, you could translate that to exactly writing to WebAssembly memory address $x. (You'd need to adjust the middle layer to hide its data structures in places the x86 program doesn't use.) I had to do something like this to make retrowin32 work under an x86 emulator. WebAssembly even would let me lay out the memory directly from the binary. I don't think this really buys you much, it would just be kind of cute.

On the topic of WebAssembly and static binary translation, check out wastrel which is static binary translation applied to the problem of executing WebAssembly. Reading about it surely gave me the seeds of this idea.

Theseus

I named this project Theseus, as in the ship.

Consider again the x86 assembly at the top of the post. What does it do? Depending on how you look at it, one correct answer is "adds three and four" or even just "computes 7". Or you could say it puts 3 in the eax register, adds 4 to the eax register, consumes some CPU clocks, and sets various CPU flags.

If I or my compiler replaces one of these interpretations with another, is it still the same program? Depending on which context you care about — my impression is that emulating systems like the NES requires getting the clocks exactly right — these details either matter or don't. In the case of Theseus I am explicitly throwing away the input program because I have replaced all its parts, one by one.

I have one farther off idea, again along the lines of the ship of Theseus. Implementing the Windows API is an endless stream of working around four decades of Hyrum's Law. Consider that random bug workaround again: if you were documenting the API of DirectPlayEnumerateA would you write that it calls the callback, or would it be more correct to say that it calls the callback and also restores a preserved stack pointer? If you look at the code of a Windows emulator like Wine today it is full of things like this.

One idea I've been thinking about is that for problems like these, rather than making the emulator more complicated, you could take a page from the decompilation playbook and provide an easy way to manage replacing parts of the program itself.

Once you're willing to replace pieces of a program there are more interesting possibilities. If a program has some bit of code that doesn't perform well, instead of making a JIT fancier, you could just manually replace the code with your own implementation. (It's plausible you wouldn't even need to change algorithms, it might be enough to just write the same algorithm in native code and let your modern compiler apply its autovectorization logic to it.) With enough machinery, you could even replace parts to add features, as one contributor to retrowin32 investigated here and even implemented for some GameBoy games.

↑ top

15.Modern Front end Complexity: essential or accidental?

Sourcehttps://binaryigor.com/modern-frontend-complexity.html

SiteBinary Igor

AuthorIgor Roztropiński

Published2026-04-18

HN activity35 points · 21 comments

Length3.7K words (~17 min read)

Languageen

Once upon a time, at the dawn of the web, browsers and websites were simple ... Then slowly, step by step, more and more interactivity was added.

It was simple back then

What are the roots of this Complexity? How have we arrived here?

Once upon a time, at the dawn of the web, browsers and websites were simple. There were no apps really, but mostly static pages - collections of .html files sprinkled with some CSS for better look. These websites were text-based for the most part, linking to other similar documents available on the World Wide Web. Everything was plain and simple; static documents, referring to each other.

Then slowly, step by step, more and more interactivity was added; first came forms and inputs, not long afterwards - JavaScript programming language (both in 1995).

At this stage, Complexity was still low. Web systems developed then consisted mostly of:

.html documents and templates
.css file or files
some .js scripts
HTTP servers to make these static files available and handle state altering requests from forms
databases to store system's state

Crucially, the UI source code of these first websites and apps was mostly the same as the output files interpreted and executed in the browser - runtime target. Even with the use of PHP and templating languages/systems (like Mustache), it looked very similar to the target HTML files, displayed by the browser:

<h1>{{page.title}}</h1>
<div>
  <p>{{name.label}}: {{user.name}}</p>
  <p>{{email.label}}: {{user.email}}</p>
  <p>{{language.label}}: {{user.language}}</p>
</div>
<a href="/sign-out">{{sign-out}}</a>

A templating engine - just a library available in the server runtime/environment - turns this into a specific HTML page:

<h1>User Account</h1>
<div>
  <p>Name: Igor</p>
  <p>Email: [email protected]</p>
  <p>Language: EN</p>
</div>
<a href="/sign-out">Sign Out</a>

A little more complicated than static collections of .html documents, but still fairly straightforward. What has happened next?

Then came AJAX - weird acronym for Asynchronous JavaScript and XML. It brought a completely new possibility to update HTML document content in the background, asynchronously - without reloading the whole page. From this point onwards, more and more of websites functionality started to be delegated to increasingly complex JavaScript - especially for partial updates, triggered mostly by more sophisticated user interactions, to avoid full page reloads. Not long after that, the concept of Single Page Application (SPA) and first frameworks arrived: Backbone.js, Knockout.js and AngularJS (2010). In this model, the source code we work on is very remote from what finally lands in the browser environment. More elaborate abstractions came here as well - complexifying needed tooling as a result.

That is how, more or less, we ended up with today's Complexity - where most apps are built with React, Vue, Angular or Svelte, requiring a whole toolchain to build and develop, such as Vite or Webpack. How they work is inherently different from what browsers were designed to do.

Source Code vs Browser Runtime

As the gap between source code format and browser runtime has been growing - because of these newly discovered and adapted abstractions - more tools and of increasing complexity became essential to develop, build and deploy web applications.

Let's take a typical modern SPA - written in React, using TypeScript and Vite for development & building. To make it digestible and understandable by the browser:

TypeScript must be transpiled/compiled into JavaScript - it is its superset and browsers do not know anything about it
In our case, we also need to transform TSX to JSX - TSX is just a typed variety of JSX
Take JSX files and turn them into JavaScript - browsers have no clue what to do about .jsx files
It is an SPA with only one index.html HTML file and potentially tens, hundreds or even thousands of small .js files - for performance reasons, they should be packaged into a single .js bundle (or a few ones). Since at least React as a dependency has to be available, it must be added to the resulting bundle as well
With the latter, there are mostly two additional optimizations: tree shaking and minification. Tree shaking takes dependencies and leaves in the resulting bundle only actually used source code, not all of it; minification on the other hand, removes unnecessary whitespaces, shortens variable names and so on to make the final .js file as small as possible - while still keeping its functionality intact

On top of that, there might be additional steps:

Post CSS processing - transforming CSS in various ways: adding vendor prefixes, allowing the use of not yet widely supported CSS features or CSS modules/scopes
Polyfills and transpilers - as new proposals, specifications and versions of JavaScript (ECMAScript) are developed, it takes time to have them widely supported. Polyfills and transpilers close this gap - they make it possible to use new and not yet widely supported features of JavaScript by transforming our source code to a version that works in older runtimes (browsers) as well

As we can clearly see - that really is a lot! And of course, it would be highly impractical to write scripts performing all of these transformations; that is why we have build tools like Webpack, Turbopack and Vite. They of course introduce yet another dependency; something new to learn and master. But, we have gone so far away from what browsers are actually operating on at runtime, that they rather are necessary. One could make a very good case that it developed in this way purely for historical reasons, because of the browser limitations in the past (there were no native modules for a long time for example).

The current ecosystem complexity is rivaling Tower of Babel. I would then ask:

Can we start from scratch and figure out a much simpler approach, given how browsers have evolved in recent years?

What is essential

For most web apps, what today's users treat as given:

instant load times
native-like, smooth transitions between pages
high degree of interactivity; most user actions should feel fast
real-time validation and hints; especially for complex forms and processes

What programmers want:

great developer experience - ability to quickly see and validate UI changes
possibility of creating, sharing and reusing configurable UI components
testability - how do we know whether it works?
easy to introduce translations & internationalization

A simpler alternative

Here is an idea:

UI mostly server-driven, server-side rendered with the use of HTMX
Single Page Application - routing is provided by HTMX, out of the box
HTML Web Components - for reusable and framework agnostic components. It is an approach where Web Components mostly provide behavior, not the structure - we will see how it works and what the benefits are below
Mustache for server-side rendered HTML templates - implementations are available in pretty much all major programming languages
TailwindCSS to make styling easier
Simple scripts to bundle it all together and prepare a package for deployment

A fully working example is available in this repo. Let's go through the most important and interesting parts.

Server

In the example, I have written a server in Java, using Spring Boot framework; but, it could have been written in any other programming language and/or framework suited for web development. I call it a server, because in this approach, there is no frontend/backend distinction really; there is just an app, with views rendered mostly by the server, sprinkled with client-side JS here and there.

From various endpoints, rendered HTML pages or fragments are returned as:

@GetMapping("/devices")
String devices(Model model, Locale locale,
  @RequestParam(required = false) String search) {
  translations.enrich(model, locale, Map.of("devices-page.title", "title"),
    "devices-page.title",
    "devices-page.search-input-placeholder",
    "devices-page.search-indicator",
    "devices-page.trigger-error-button");

  enrichWithDevicesSearchResultsTranslations(model, locale);

  var devices = deviceRepository.devices(search);

  return templatesResolver.resolve("devices-page", 
    devicesModel(model, devices));
}

Which, depending on the context:

returns a full HTML page, if the page is loaded by the browser for the first time
returns an HTML fragment, if we arrived at the /devices url from some other place in the already loaded app

How do we know whether to return a full HTML page or fragment?

Thankfully, HTMX adds the hx-request header to each HTTP request it makes. So, if there is no hx-request header present in the HTTP request, our response is a full HTML page:

<!DOCTYPE HTML>
<html lang="en">

...

<body>

{{ page-specific-html }}

</body>

</html>

And if this is a subsequent request - clicking from one page to the next, without full page reload - the hx-request header is present and we return an HTML fragment:

{{ page-specific-html }}:

<div class="space-y-2 flex flex-col">
  ...
  <div class="cursor-pointer rounded border-2 p-0 flex">
    <span class="px-4 py-2 flex-1">9b0d5f33-6f9e-4aef-bb81-a57a045fb1aa: iPhone 13</span>
    <drop-down class="relative">
        <div data-drop-down-anchor class="absolute right-2 text-3xl">...</div>
        <div data-drop-down-options class="rounded border-2 whitespace-nowrap absolute mt-2 right-0 top-6 bg-white border rounded hidden z-99">
            <div class="p-2" hx-get="/devices/9b0d5f33-6f9e-4aef-bb81-a57a045fb1aa" hx-push-url="true" hx-target="#app">Details</div>
            <div class="p-2" hx-get="/buy-device/9b0d5f33-6f9e-4aef-bb81-a57a045fb1aa" hx-push-url="true" hx-target="#app">Buy</div>
        </div>
    </drop-down>
  </div>
  ...
</div>

This is how it looks:

A few interesting things to note here:

various hx- attributes (HTMX): hx-get, hx-push-url and hx-target
custom <drop-down> element (Web Component)
lots of Tailwind CSS classes

Let's start with hx- mechanics.

HTMX

When we click on the Details or Buy option, the browser url is changed by HTMX using standard History API. At the same time, HTMX makes GET request to /devices/9b0d5f33-6f9e-4aef-bb81-a57a045fb1aa or /buy-device/9b0d5f33-6f9e-4aef-bb81-a57a045fb1aa accordingly. Content of the HTML element identified by app id is swapped with the HTML fragment, received from the server. As a result, we see a new HTML page without full page reload - in the exact same way as it works in the traditional, client-heavy & JSON-oriented SPAs.

HTML Web Components

This is a different strategy to develop Web Components where structure is fully or mostly defined in HTML; components just add behavior to it through JavaScript.

Using <drop-down> as an example (Mustache template):

<drop-down class="relative">
  <div data-drop-down-anchor class="absolute right-2 text-3xl">...</div>
  <div data-drop-down-options class="rounded border-2 whitespace-nowrap absolute mt-2 right-0 top-6 bg-white border rounded hidden z-99">
    <div class="p-2" hx-get="/devices/{{id}}" hx-push-url="true" hx-target="#app">{{devices-search-results.details-option}}</div>
    <div class="p-2" hx-get="/buy-device/{{id}}" hx-push-url="true" hx-target="#app">{{devices-search-results.buy-option}}</div>
  </div>
</drop-down>

As we see, there are anchor and options elements marked as data-drop-down-anchor and data-drop-down-options respectively. What then the <drop-down> is doing:

class DropDown extends HTMLElement {

  #hideOnOutsideClick = undefined;

  connectedCallback() {
    const anchor = this.querySelector("[data-drop-down-anchor]");
    const options = this.querySelector("[data-drop-down-options]");

    anchor.onclick = () => options.classList.toggle("hidden");

    this.#hideOnOutsideClick = (e) => {
      if (e.target != anchor) {
        options.classList.add("hidden");
      }
    };

    window.addEventListener("click", this.#hideOnOutsideClick);
  }

  disconnectedCallback() {
    window.removeEventListener("click", this.#hideOnOutsideClick);
  }
}

It does not alter HTML structure; instead, it enriches certain elements with a dynamic drop down behavior.

In the similar vein, there are a few more components implemented in this fashion:

Thanks to this approach, UI is mostly rendered on the server side, which gives us SEO and performance benefits. We reduce JavaScript that has to be written, since things are mostly done in HTML; and because it is handled primarily by the server, it is easier to test and verify its correctness. What are the drawbacks? There is more HTML to write and sometimes we might need to be aware of some styling dependencies - as in the <drop-down> example:

<drop-down class="relative">
  <div data-drop-down-anchor class="absolute">...</div>
  <div data-drop-down-options class="absolute mt-2 right-0 top-6 hidden z-99">
    ...
  </div>
</drop-down>

A few CSS properties have to be set - relative display for the parent, absolute for its children - on the <drop-down> to be displayed as expected, so one could argue that these are not really independent components with encapsulated behavior. But, we gain a lot of flexibility thanks to this philosophy - pretty much everything is configurable here, since only behavior is provided, not structure and styling. In the context of creating generic and reusable components, that is a tradeoff definitely worth taking. Especially considering the fact that if some patterns of styling and configuration repeat, nobody stops us from creating dedicated and more specific wrappers for such cases.

Errors & Validation

When the error is caused by a user entering an unsupported or otherwise problematic url (full page load), a dedicated error page with translated exception is displayed:

In most cases though, HTMX is fetching data and triggering mutations for us. When it fails - getting non-2xx code - we do the following:

<error-modal>
  ...
</error-modal>

...

<script>
document.addEventListener("htmx:afterRequest", e => {
  if (e.detail.failed) {
    const errorModal = document.querySelector("error-modal");
    const error = e.detail.xhr.response;
    const [title, message] = error.split("#");
    errorModal.dispatchEvent(new CustomEvent("error-modal-show", 
      { detail: { title: title, content: message }}));
  }
});
</script>

In case of error, a translated error title and message is received, separated by the # sign. All we have to do is to publish a custom event that the <error-modal> listens to:

In inline validation cases, we would rather not hit the backend unnecessarily. For that, we have the <validateable-input> component that wraps a standard <input> element, allowing us to hide or show a validation error - depending on whether the configured validator returns true or false:

As mentioned, all these messages are translated into the user language - how does it work?

Translations

All translations live on the server, as the UI is rendered there; we have simple message_{locale}.properties files:

devices-page.title=Devices
devices-page.search-input-placeholder=Search devices...
devices-page.search-indicator=Searching devices...
devices-page.trigger-error-button=Trigger some error

User language is decided based on the standard Accept-Language header, but could also be resolved through cookie, query param or some user-specific settings/state stored on the server.

Testability

Another benefit of this strategy is improved testability - why is that?

Well, HTML pages and fragments are almost entirely generated on the server side. Sometimes, JavaScript is added through Web Components or inline scripts to enhance components with purely client-side behavior; this pattern is sometimes called Islands Architecture. To test and validate most of it, all we have to do is to write server integration tests of the kind:

@Test
void rendersFullDevicesPage() {
  var allDevices = deviceRepository.allDevices();

  var response = testRestClient.get()
    .uri("/devices")
    .retrieve()
    .toEntity(String.class);

  assertThat(response.getStatusCode())
    .isEqualTo(HttpStatus.OK);

  var document = Jsoup.parse(response.getBody());
  assertThat(document.select("html"))
    .isNotEmpty();

  var devicesElement = document.select("#devices");
  allDevices.forEach(device -> {
    assertThat(devicesElement.text())
      .contains(device.id().toString())
      .contains(device.name());
    
    var devicePageAttribute = "[hx-get=/devices/%s]".formatted(device.id());
    var buyDevicePageAttribute = "[hx-get=/buy-device/%s]".formatted(device.id());
    assertThat(devicesElement.select(devicePageAttribute))
      .isNotEmpty();
    assertThat(devicesElement.select(buyDevicePageAttribute))
      .isNotEmpty();
  });
}

In this way, we test pretty much all application layers at once:

tests are defined together with the server code and are running in the same environment
true HTTP requests are made to implemented by us server
our server utilizes a real database
data is transformed into the HTML page and returned to the client; in the ready to be displayed format
we make assertions on the HTML content and format - validating that it has appropriate structure, attributes and data

True, it is not rendered in the real end user environment, but using jsoup (or a similar tool) we can have a high degree of confidence that it is going to be rendered by the browser as well.

What about UI states and components that utilize JavaScript to provide related functionality & behavior? There, I would use something like Playwright to write E2E tests, running in the actual browser, for the particular pages and UI states that could not be reliably and thoroughly tested with the integration tests alone. Thankfully, with the approach taken here these cases are rather rare - the vast majority of UI constitutes server-rendered HTML pages and fragments.

Development & Production

For local development, we simply start the server:

./mvnw spring-boot:run

In our particular case, we use Spring Boot Developer Tools configured as:

spring:
  devtools:
    restart:
      enabled: true
      poll-interval: 500ms
      quiet-period: 250ms

In a nutshell, whenever server's code is modified it gets recompiled almost immediately - we have hot/live reloading thanks to this; all we have to do is to reload the page in the browser and see recently made changes.

If a database is used, we additionally run it as a Docker/Podman container.

Since TailwindCSS is applied here as well, for the local development, the following script should be running:

npm ci

cd ops
./live-css-gen.sh

≈ tailwindcss v4.2.2

Done in 93ms
Done in 169µs
Done in 4ms

So that CSS is constantly regenerated, as we edit UI-related files.

For production, what would be ideal:

bundled JavaScript - mostly Web Components - into a single/few file(s)
hashed static assets (like components_a8049c1a.js) - JavaScript, CSS, images - so that they might be cached for as long as possible
prepared self-contained app package - including these transformed assets - ready for deployment; containerized (or a binary), so that it could easily be deployed to the target environment

To support it, I have prepared two scripts. package_components.py takes all JS components and turns them into a single components_{hash}.js file. build_and_package.bash generates CSS with the help of @tailwindcss/cli tool, calls package_components.py to package & hash components, builds the server in Docker and creates ready to be deployed, self-contained Docker image of our app - with all frontend assets and backend/server code required to run it. We just then need to copy dist/load_and_run_app.bash & dist/run_app.bash scripts together with the modern-frontend-complexity-alternative.tar.gz gzipped Docker image to our prod environment and run:

bash load_and_run_app.bash 

Loading modern-frontend-complexity-alternative:latest image, this can take a while...
Loaded image: modern-frontend-complexity-alternative:latest
Image loaded, running it...
Stopping previous modern-frontend-complexity-alternative version...
modern-frontend-complexity-alternative
Removing previous container....
modern-frontend-complexity-alternative

Starting new modern-frontend-complexity-alternative version...

e12ff5c2d81e933560e2a8a974b79654cfe219c43b5a47995c576ab1a562ccf8

docker logs modern-frontend-complexity-alternative 

  .   ____          _            __ _ _
 /\\ / ___'_ __ _ _(_)_ __  __ _ \ \ \ \
( ( )\___ | '_ | '_| | '_ \/ _` | \ \ \ \
 \\/  ___)| |_)| | | | | || (_| |  ) ) ) )
  '  |____| .__|_| |_|_| |_\__, | / / / /
 =========|_|==============|___/=/_/_/_/

 :: Spring Boot ::                (v4.0.3)

2026-04-11T05:55:52.297Z  INFO 1 --- [           main] c.ModernFrontendComplexityAlternativeApp : Starting ModernFrontendComplexityAlternativeApp v0.0.1-SNAPSHOT using Java 25.0.2 with PID 1 (/modern-frontend-complexity-alternative.jar started by root in /)
2026-04-11T05:55:52.301Z  INFO 1 --- [           main] c.ModernFrontendComplexityAlternativeApp : No active profile set, falling back to 1 default profile: "default"
2026-04-11T05:55:53.003Z  INFO 1 --- [           main] o.s.boot.tomcat.TomcatWebServer          : Tomcat initialized with port 8080 (http)
2026-04-11T05:55:53.012Z  INFO 1 --- [           main] o.apache.catalina.core.StandardService   : Starting service [Tomcat]
2026-04-11T05:55:53.012Z  INFO 1 --- [           main] o.apache.catalina.core.StandardEngine    : Starting Servlet engine: [Apache Tomcat/11.0.18]
2026-04-11T05:55:53.033Z  INFO 1 --- [           main] b.w.c.s.WebApplicationContextInitializer : Root WebApplicationContext: initialization completed in 683 ms
2026-04-11T05:55:53.298Z  INFO 1 --- [           main] o.s.boot.tomcat.TomcatWebServer          : Tomcat started on port 8080 (http) with context path '/'
2026-04-11T05:55:53.305Z  INFO 1 --- [           main] c.ModernFrontendComplexityAlternativeApp : Started ModernFrontendComplexityAlternativeApp in 1.372 seconds (process running for 1.736)

Tradeoffs & Improvements

What are the drawbacks?

If we come with the traditional SPA mindset, it is quite a shift in thinking. Instead of designing and consuming REST API endpoints, responses of which we then must transform on the client side, most data is rendered on the server and received in the ready-to-be-displayed format by the browser. Additionally, we mostly work with the native browser APIs instead of relying on framework-specific ways - which is a big advantage, since native APIs have a much longer shelf life than the current version of React, Vue, Angular or Svelte.

There is no transpilation & polyfillation step. A very limited JavaScript is written - only for Web Components and some event listeners, making the UI more interactive, so it is not a problem. But that is the fact - without this step, which simplifies tooling a lot, we must choose used JS features more consciously, so that it runs in all our target environments.

Some tooling must be built. Since in this architecture source code mostly reflects what later runs in browser runtime, required tooling is minimal. In fact, if we have just a few Web Components, there is not really a need to bundle them - several HTTP requests for static files are not a problem, it will be quite performant. But, if their number grows it is better to bundle them into a single file. Same is true for hashing all static assets - adding suffixes - so that they might be cached more efficiently. On the other hand, this lower level strategy could be considered an advantage - we are more aware of how the browser processes our files and are in full control of it. Also, significantly fewer transformations are required overall, just because this approach is more aligned with how browsers work, processes and manipulate HTML documents.

For potential improvements, it would be nicer to have hot/live reloading where our browser tab is automatically refreshed and we do not have to do it manually; tooling is still not there, but definitely feasible to build. A library of reusable HTML Web Components and server-side templates would be of a great benefit as well - right now (as I am aware) there are no ready to be used components built in a way presented here, so currently it is on us to create them. Simply put, the ecosystem is not there yet.

There is a simpler way

As we have learned, what started as simple, but world-wide, static documents sharing (the web), ended up as a highly complex runtime (browsers), allowing us to build almost any application and rivaling possibilities of native environments.

Along the way, we went through a few different phases and approaches to build those increasingly more interactive websites and applications. First, there were Multi Page Applications (MPAs) sprinkled with just some JavaScript here and there to make them more interactive. Then, people started to experiment with Single Page Applications (SPAs), where there is no full page reloads and pretty much all data transformations and UI state transitions are handled in the thick & complex JavaScript layer, running entirely on the client side.

Currently, we live in the JavaScript-heavy reality, where browser runtime looks completely different from source code files we work on. It has led to massive increase in complexity of tooling required to develop and build those applications - lots of transformations must be made for apps created with this approach to work in the browser - what the runtime understands and what we work on - source code files - is often totally different. Additionally, there are many new concepts that must be understood and mastered in order to work proficiently with these tools. Complexity has reached a tipping point here; although it is becoming more hidden in the increasingly elaborate tooling.

There is a simpler way.

We can utilize HTMX, HTML Web Components and a templating language to build websites and apps in a way much more aligned with how the browser works - without sacrificing user experience, complex features or developer experience.

I then invite you to experiment with this simpler alternative: let's destroy the Tower of Babel Complexity and make web development simple and productive again!

↑ top

Plugin	What it does
cursor-line	Highlight the active line with theme-aware colors
fuzzy-finder	fzf-powered file picker as a floating overlay
sel-badge	Show selection count in the status bar
color-preview	Inline color swatches next to hex values
pane-manager	Tmux-like splits with Ctrl+W — no external multiplexer needed
image-preview	Display images in a floating overlay anchored to the cursor
smooth-scroll	Animated scrolling
prompt-highlight	Visual feedback when entering prompt mode

18.Anthropic says OpenClaw-style Claude CLI usage is allowed again

Sourcehttps://docs.openclaw.ai/providers/anthropic

SiteOpenClaw

Submitterjmsflknr (Hacker News)

Submitted2026-04-21 03:43 UTC (Hacker News)

HN activity435 points · 246 comments

Length907 words (~4 min read)

Languageen

Anthropic builds the Claude model family and provides access via an API and Claude CLI. In OpenClaw, Anthropic API keys and Claude CLI reuse are both supported. Existing legacy Anthropic token profiles are still honored at runtime if they are already configured.

Anthropic (Claude)

Anthropic builds the Claude model family and provides access via an API and Claude CLI. In OpenClaw, Anthropic API keys and Claude CLI reuse are both supported. Existing legacy Anthropic token profiles are still honored at runtime if they are already configured.

Option A: Anthropic API key

Best for: standard API access and usage-based billing. Create your API key in the Anthropic Console.

CLI setup

openclaw onboard
# choose: Anthropic API key

# or non-interactive
openclaw onboard --anthropic-api-key "$ANTHROPIC_API_KEY"

Anthropic config snippet

{
  env: { ANTHROPIC_API_KEY: "sk-ant-..." },
  agents: { defaults: { model: { primary: "anthropic/claude-opus-4-6" } } },
}

Thinking defaults (Claude 4.6)

Anthropic Claude 4.6 models default to adaptive thinking in OpenClaw when no explicit thinking level is set.
You can override per-message (/think:<level>) or in model params: agents.defaults.models["anthropic/<model>"].params.thinking.
Related Anthropic docs:
- Adaptive thinking
- Extended thinking

Fast mode (Anthropic API)

OpenClaw’s shared /fast toggle also supports direct public Anthropic traffic, including API-key and OAuth-authenticated requests sent to api.anthropic.com.

/fast on maps to service_tier: "auto"
/fast off maps to service_tier: "standard_only"
Config default:

{
  agents: {
    defaults: {
      models: {
        "anthropic/claude-sonnet-4-6": {
          params: { fastMode: true },
        },
      },
    },
  },
}

Important limits:

OpenClaw only injects Anthropic service tiers for direct api.anthropic.com requests. If you route anthropic/* through a proxy or gateway, /fast leaves service_tier untouched.
Explicit Anthropic serviceTier or service_tier model params override the /fast default when both are set.
Anthropic reports the effective tier on the response under usage.service_tier. On accounts without Priority Tier capacity, service_tier: "auto" may still resolve to standard.

Prompt caching (Anthropic API)

OpenClaw supports Anthropic’s prompt caching feature. This is API-only; legacy Anthropic token auth does not honor cache settings.

Configuration

Use the cacheRetention parameter in your model config:

Value	Cache Duration	Description
`none`	No caching	Disable prompt caching
`short`	5 minutes	Default for API Key auth
`long`	1 hour	Extended cache

{
  agents: {
    defaults: {
      models: {
        "anthropic/claude-opus-4-6": {
          params: { cacheRetention: "long" },
        },
      },
    },
  },
}

Defaults

When using Anthropic API Key authentication, OpenClaw automatically applies cacheRetention: "short" (5-minute cache) for all Anthropic models. You can override this by explicitly setting cacheRetention in your config.

Per-agent cacheRetention overrides

Use model-level params as your baseline, then override specific agents via agents.list[].params.

{
  agents: {
    defaults: {
      model: { primary: "anthropic/claude-opus-4-6" },
      models: {
        "anthropic/claude-opus-4-6": {
          params: { cacheRetention: "long" }, // baseline for most agents
        },
      },
    },
    list: [
      { id: "research", default: true },
      { id: "alerts", params: { cacheRetention: "none" } }, // override for this agent only
    ],
  },
}

Config merge order for cache-related params:

agents.defaults.models["provider/model"].params
agents.list[].params (matching id, overrides by key)

This lets one agent keep a long-lived cache while another agent on the same model disables caching to avoid write costs on bursty/low-reuse traffic.

Bedrock Claude notes

Anthropic Claude models on Bedrock (amazon-bedrock/*anthropic.claude*) accept cacheRetention pass-through when configured.
Non-Anthropic Bedrock models are forced to cacheRetention: "none" at runtime.
Anthropic API-key smart defaults also seed cacheRetention: "short" for Claude-on-Bedrock model refs when no explicit value is set.

1M context window (Anthropic beta)

Anthropic’s 1M context window is beta-gated. In OpenClaw, enable it per model with params.context1m: true for supported Opus/Sonnet models.

{
  agents: {
    defaults: {
      models: {
        "anthropic/claude-opus-4-6": {
          params: { context1m: true },
        },
      },
    },
  },
}

OpenClaw maps this to anthropic-beta: context-1m-2025-08-07 on Anthropic requests. This only activates when params.context1m is explicitly set to true for that model. Requirement: Anthropic must allow long-context usage on that credential. Note: Anthropic currently rejects context-1m-* beta requests when using legacy Anthropic token auth (sk-ant-oat-*). If you configure context1m: true with that legacy auth mode, OpenClaw logs a warning and falls back to the standard context window by skipping the context1m beta header while keeping the required OAuth betas.

Claude CLI backend

The bundled Anthropic claude-cli backend is supported in OpenClaw.

Anthropic staff told us this usage is allowed again.
OpenClaw therefore treats Claude CLI reuse and claude -p usage as sanctioned for this integration unless Anthropic publishes a new policy.
Anthropic API keys remain the clearest production path for always-on gateway hosts and explicit server-side billing control.
Setup and runtime details are in /gateway/cli-backends.

Notes

Anthropic’s public Claude Code docs still document direct CLI usage such as claude -p, and Anthropic staff told us OpenClaw-style Claude CLI usage is allowed again. We are treating that guidance as settled unless Anthropic publishes a new policy change.
Anthropic setup-token remains available in OpenClaw as a supported token-auth path, but OpenClaw now prefers Claude CLI reuse and claude -p when available.
Auth details + reuse rules are in /concepts/oauth.

Troubleshooting

401 errors / token suddenly invalid

Anthropic token auth can expire or be revoked.
For new setup, migrate to an Anthropic API key.

No API key found for provider “anthropic”

Auth is per agent. New agents don’t inherit the main agent’s keys.
Re-run onboarding for that agent, or configure an API key on the gateway host, then verify with openclaw models status.

No credentials found for profile anthropic:default

Run openclaw models status to see which auth profile is active.
Re-run onboarding, or configure an API key for that profile path.

No available auth profile (all in cooldown/unavailable)

Check openclaw models status --json for auth.unusableProfiles.
Anthropic rate-limit cooldowns can be model-scoped, so a sibling Anthropic model may still be usable even when the current one is cooling down.
Add another Anthropic profile or wait for cooldown.

More: /gateway/troubleshooting and /help/faq.

↑ top

19.A type-safe, realtime collaborative Graph Database in a CRDT

Sourcehttps://codemix.com/graph

Sitecodemix.com

Submitterphpnode (Hacker News)

Submitted2026-04-21 10:33 UTC (Hacker News)

HN activity123 points · 33 comments

Length1.2K words (~6 min read)

Languageen

Open-source TypeScript property graph database from codemix.

Plane route demo

Global airline routes demo

Load a snapshot of real airline routes into the graph and query it with TypeScript.

Live demo

Add your face to the wall

Powered by @codemix/graph and @codemix/y-graph-storage — a real graph database, synced via a Yjs CRDT across every open tab. Add yourself, rearrange people, draw connections.

Installation

Install the package from npm — no native dependencies, runs anywhere Node or a bundler can.

$ pnpm add @codemix/graph

Note: This is alpha-quality software. We use it in production at codemix and it works well for our use cases, but please be careful using it with your own data.

Define your schema

Describe vertices, edges, and indexes in a plain object. Property types flow through every query, traversal, and mutation — no casts, no runtime surprises.

import { Graph, GraphSchema, InMemoryGraphStorage } from "@codemix/graph";
import { z } from "zod";

const schema = {
  vertices: {
    User: {
      properties: {
        email: { type: z.email(), index: { type: "hash", unique: true } },
        name:  { type: z.string() },
      },
    },
    Repo: {
      properties: {
        name:  { type: z.string() },
        stars: { type: z.number() },
      },
    },
  },
  edges: {
    OWNS:    { properties: {} },
    FOLLOWS: { properties: {} },
  },
} as const satisfies GraphSchema;

const graph = new Graph({ schema, storage: new InMemoryGraphStorage() });

→Any Standard Schema library — Zod, Valibot, ArkType, or your own.
→Validated on every mutation — properties are checked on addVertex, addEdge, and updateProperty.
→Indexes declared inline — hash, B-tree, and full-text; built lazily and maintained incrementally.

Add some data

Vertices and edges are added through the graph instance. Property arguments are checked against your schema at both compile time and runtime.

// add vertices — args are typed to each label's property schema
const alice  = graph.addVertex("User", { name: "Alice", email: "alice@example.com" });
const bob    = graph.addVertex("User", { name: "Bob",   email: "bob@example.com" });
const myRepo = graph.addVertex("Repo", { name: "my-repo", stars: 0 });

// add edges
graph.addEdge(alice, "OWNS",    myRepo, {});
graph.addEdge(bob,   "FOLLOWS", alice,  {});

// read properties — types come from the schema
alice.get("name");     // string
myRepo.get("stars");   // number

// update in place
graph.updateProperty(myRepo, "stars", 42);
// or via the element itself
myRepo.set("stars", 42);

Write type-safe queries

A Gremlin-style traversal API — familiar step names, but every label, property key, and hop is checked by TypeScript against your schema.

Start a traversal

import { GraphTraversal } from "@codemix/graph";

const g = new GraphTraversal(graph);

for (const path of g.V().hasLabel("User")) {
  path.value.get("name");  // string  ✓
  path.value.get("email"); // string  ✓
}

Filter by property

// exact match or predicate
const [alice] = g.V()
  .hasLabel("User")
  .has("email", "alice@example.com");

const seniors = g.V()
  .hasLabel("User")
  .where((v) => v.get("name").startsWith("A"));

Traverse edges

// follow OWNS edges from User → Repo
for (const path of g.V()
  .hasLabel("User")
  .has("email", "alice@example.com")
  .out("OWNS").hasLabel("Repo")) {
  path.value.get("stars"); // number — typed from Repo's schema
}

Label and select

// capture vertices at multiple hops and project them together
for (const { user, repo } of g.V()
  .hasLabel("User").as("user")
  .out("FOLLOWS")
  .out("OWNS").hasLabel("Repo").as("repo")
  .select("user", "repo")) {
  console.log(
    user.value.get("name"),  // string
    repo.value.get("stars"), // number
  );
}

Offline-first sync and realtime collaboration

Swap InMemoryGraphStorage for YGraph and the entire graph lives in a Yjs CRDT document. Every traversal, Cypher query, and index works unchanged — you just get conflict-free sync on top.

Plug in a provider

import * as Y from "yjs";
import { WebsocketProvider } from "y-websocket";
import { YGraph } from "@codemix/y-graph-storage";

const doc = new Y.Doc();
const graph = new YGraph({ schema, doc });

// Connect any Yjs provider — sync happens automatically.
// Every peer that joins the room sees the same graph.
const provider = new WebsocketProvider("wss://my-server", "graph-room", doc);

Subscribe to fine-grained changes

// Events fire for local and remote mutations alike
const unsubscribe = graph.subscribe({
  next(change) {
    // change.kind is one of:
    //   "vertex.added" | "vertex.deleted"
    //   "edge.added"   | "edge.deleted"
    //   "vertex.property.set" | "vertex.property.changed"
    console.log(change.kind, change.id);
  },
});

Live queries

// Wraps any traversal and re-fires when the result set could change
const topRepos = graph.query((g) =>
  g.V().hasLabel("Repo").order("stars", "desc").limit(10)
);

const unsubscribe = topRepos.subscribe({
  next() {
    for (const path of topRepos) {
      console.log(path.value.get("name"), path.value.get("stars"));
    }
  },
});

// Adding or updating a Repo elsewhere — even from a remote peer —
// triggers the subscriber automatically.
graph.updateProperty(myRepo, "stars", 99);

Collaborative property types

import { ZodYText, ZodYArray } from "@codemix/y-graph-storage";
import { z } from "zod";

// Declare Y.Text / Y.Array / Y.Map properties in the schema
const schema = {
  vertices: {
    Document: {
      properties: {
        title:   { type: ZodYText },          // collaborative string
        tags:    { type: ZodYArray(z.string()) }, // collaborative array
      },
    },
  },
  edges: {},
} as const satisfies GraphSchema;

// Plain values are auto-converted — no need to construct Y.* manually
const doc = graph.addVertex("Document", { title: "Hello", tags: ["crdt"] });

// Mutate in place — all peers see the change with no conflicts
doc.get("title").insert(5, ", world");
doc.get("tags").push(["graph"]);

Cypher queries for APIs and LLMs

The same graph is queryable via a Cypher-compatible string language — ideal for exposing data to LLMs via an MCP server, or accepting ad-hoc queries from external clients without bundling a traversal library.

Parse and execute

import { parseQueryToSteps, createTraverser } from "@codemix/graph";

const { steps, postprocess } = parseQueryToSteps(`
  MATCH (u:User)-[:OWNS]->(r:Repo)
  WHERE r.stars > 100
  RETURN u.name, r.name
  ORDER BY r.stars DESC
  LIMIT 10
`);

const traverser = createTraverser(steps);
for (const row of traverser.traverse(graph, [])) {
  console.log(postprocess(row));
  // { u: { name: "Alice" }, r: { name: "my-repo" } }
}

Parameterised queries

// Pass parameters to avoid string interpolation
const { steps, postprocess } = parseQueryToSteps(`
  MATCH (u:User { email: $email })-[:OWNS]->(r:Repo)
  RETURN r.name, r.stars
`);

const traverser = createTraverser(steps);
const rows = Array.from(
  traverser.traverse(graph, [{ email: "alice@example.com" }])
).map(postprocess);

Mutations

// CREATE, MERGE, SET, DELETE are all supported
const { steps } = parseQueryToSteps(`
  MATCH (r:Repo { name: $name })
  SET r.stars = r.stars + 1
`);

createTraverser(steps).traverse(graph, [{ name: "my-repo" }]);

// Enforce read-only — throws ReadonlyGraphError on any write clause
const { steps: safeSteps } = parseQueryToSteps(query, { readonly: true });

License & History

This package is licensed under the MIT license.

It was orignally written as a research project by Charles Pick, founder of codemix and author of the infamous ts-sql demo. Later, when we were building codemix we needed a structured knowledge graph, so we adapted the code, added Y.js support and later, Opus 4.5 added a Cypher-like query language.

Star on GitHub

While you're here

A single source of truth for your product.
For humans and AI.

codemix captures what you actually mean — your business domain, your user flows, the concepts, the constraints — and keeps it in sync with your codebase automatically.

Change your product through chat, diagrams, or collaborative editing. Steer coding agents through development and review code with real understanding. Every agent on your team shares the same context.

Create something completely new, or import your existing codebase to get started.

Build something brand new

Try codemix for free, no credit card required.

↑ top

20.MNT Reform is an open hardware laptop, designed and assembled in Germany

Sourcehttp://mnt.stanleylieber.com/reform/

Sitemnt.stanleylieber.com

Submitterspeckx (Hacker News)

Submitted2026-04-20 14:14 UTC (Hacker News)

HN activity237 points · 89 comments

Length723 words (~4 min read)

MNT Reform

MNT Reform is an open hardware laptop, designed and assembled in Berlin, Germany.

2021.10.08: ordered mnt reform.

2021.12.27: received mnt reform #000120.

2023.04.17: mnt reform #000120 is now being offered as a loaner by sdf.org.

2023.06.02: ordered mnt reform (2023 refresh).

2023.06.29: bought used mnt reform on ebay.

2023.07.03: received used mnt reform #000158.

2024.11.05: sold mnt reform #000158.

2025.02.24: bought used mnt reform on ebay.

2025.03.05: received used mnt reform diy.

screen

pressure mark

The trackball can press against the screen when the lid is closed, causing a small mark to appear on the screen.

case

Lid, screen bezel, keyboard frame, and wrist rest are made from milled aluminium. Side panels and transparent bottom panel are made from acrylic.

Screws in the LCD bezel are not covered, and over time the one in the center can start to rub the paint off of the wrist rest.

wrist rest

fabricate new side panels - forum thread

My friend kindly sent me a pair of metal replacement side panels. First I tried painting them with a paint brush and a bottle of Vanta Black. This flaked off easily, so I sanded them down and repainted them with black spraypaint (satin finish). Managed to chip that as well during installation. I don’t know what I’m doing.

aluminum side panel - left

aluminum side panel - right

2022.03.03 Update: MNT has now made available steel replacement side panels.

black powder coated steel side panel

accessories

usb-c pd adapter (female) - bought one, it works

usb-c pd adaptor (male, non-amazon) - reported to work

lifepo4 replacement batteries (affordable, out of stock)

lifepo4 replacement batteries (expensive, in stock)

lifepo4 external charger - for recovering depleted cells (2-bay)

lifepo4 external charger - for recovering more depleted cells (8-bay)

laird wifi antenna - improved reception

2022.04.27 Update: I ended up just stretching the original molex antenna down under the trackball, which improved reception even more than buying an expensive new antenna. Because of its shape and the orientation of its cables, the Laird antenna wouldn’t quite reach.

molex wifi antenna installed beneath trackball

iogear gwu637 ethernet to wifi n adapter - for operating systems where wifi doesn’t (yet) work

piñatex sleeve - note: pull tabs broke off in the first week

2022.02.22 Update: MNT sent me a replacement sleeve with new, all-metal zipper pulls that are now standard equipment on the sleeve.

piñatex sleeve v2 with metal zipper pulls

2022.07.16 Update: One of the all-metal zipper pulls shattered as I tried to unzip the sleeve.

piñatex sleeve v2 with broken metal zipper pulls

mbk-colors: 1u and 1.5u homing - replacement key caps, some with raised edges to help with acclimating to the non-standard keyboard layout

replacement key caps

operating systems

9front/reform

9front - howto, sdcard image, sysinfo

alpine linux - fully functional (howto pending)

void linux - sdcard image (does not boot on my machine)

debian linux - pre-installed

keyboard

http://mnt.stanleylieber.com/keyboard/

audio in linux

fix for speakers too quiet:

    By default, the speaker output of MNT Reform is a bit quiet, and
    changing the volume with PulseAudio won’t dramatically change it.
    There’s one more knob you can turn up that is only accessible via
    ALSA.

    Open a Terminal and type alsamixer.  Then press F6 and select
    the wm8960-audio card.  Navigate with Cursor keys to the Playback
    slider and turn it up

Well, there is no wm8960-audio listed on my system, only (default). And Master is already cranked to 100. Investigating, I noticed:

   sl@reform:~$ dmesg | grep 8960
   [    3.613559] wm8960 2-001a: Failed to issue reset

edgineer says:

    Usually a reboot gets the audio going for me if I see failed to issue
    reset (happens on booting from power off).  Lukas speculates on a fix
    here[1] and another person[2] provided this line in order to rebind the
    device without a reboot:

          echo 2-001a > /sys/bus/i2c/drivers/wm8960/bind

    I was able to replicate the issue and test the above line out just
    now.  I had to “sudo su” first.  Then the audio device showed up in
    alsamixer again just fine.

[1] [2]

This worked for me, as well.

Update 2022.06.20: After numerous updates, sound no longer works for me in Alpine Linux.

leds in linux

turn off wifi led:

   echo 0 > /sys/class/leds/ath9k-phy0/brightness   # needs root permissions

files

foot - foot.ini (sl)

rofi - mnt-reform.rasi

sway - config (default), config (sl)

vga - font (download page)

waybar - config, style.css

doc

operator handbook - buy, pdf

diy assembly manual - pdf

interactive system diagram and interactive PCBs - html

sources (kicad, etc.) - repository

external usb keyboard manual - pdf

reviews

arstechnica

links

buy, community, faq, ifixit, reform school

Parker, Blaine L.

↑ top

21.Show HN: Ctx – a /resume that works across Claude Code and Codex

Sourcehttps://github.com/dchu917/ctx

SiteGitHub

Submitterdchu17 (Hacker News)

Submitted2026-04-20 16:35 UTC (Hacker News)

HN activity34 points · 15 comments

Length1.3K words (~6 min read)

Languageen

Local context manager for Claude Code and Codex with workstreams, transcript binding, and branching. - dchu917/ctx

Local context manager for Claude Code and Codex.

Keep exact conversation bindings, resume work cleanly, branch context without mixing streams, and optionally inspect saved workstreams in a local browser frontend.

Claude Code chat          Codex chat
      |                      |
      v                      v
   /ctx ...               ctx ...
          \              /
           v            v
      +----------------------+
      | workstream: feature-audit |
      |   claude:  abc123    |
      |   codex:   def456    |
      +----------------------+
                 |
                 +--> feature-audit-v2 branch

Why `ctx`

Exact transcript binding: each internal ctx session can bind to the exact Claude and/or Codex conversation it came from.
No transcript drift: later pulls stay on that bound conversation instead of jumping to the newest chat on disk.
Safe branching: start a new workstream from the current state of another one without sharing future transcript pulls or hijacking the source conversation.
Indexed retrieval: saved workstreams, sessions, and entries are indexed for fast ctx search lookup.
Curated loads: pin saved entries so they always load, exclude saved entries so they stay searchable but stop getting passed back to the model, or delete them entirely.
Local-first: no API keys, no hosted service, plain SQLite plus local files.

Quick Install

Clone the repo and do the standard project-local setup:

git clone https://github.com/dchu917/ctx.git
cd ctx
./setup.sh

This is the main development-friendly install path.

It does the following:

creates ./.contextfun/context.db
writes ./ctx.env
installs a repo-backed ctx shim into ~/.contextfun/bin
links local skills into ~/.claude/skills and ~/.codex/skills

Use this when:

you want the repo checked out locally
you want ctx to use a project-local DB by default
you are developing or editing the repo itself

4-Step Demo

Clone and set it up:

git clone https://github.com/dchu917/ctx.git
cd ctx
./setup.sh

Start a new workstream:

Claude Code:

/ctx start feature-audit --pull

Codex or your terminal:

ctx start feature-audit --pull

Know what --pull means:

ctx start feature-audit --pull creates the workstream and pulls the existing context from the current conversation into it.
ctx start feature-audit creates the workstream starting from that point only. It does not backfill the earlier conversation.

Come back later and continue or branch:

Claude Code:

/ctx resume feature-audit
/ctx branch feature-audit feature-audit-v2

Codex:

ctx resume feature-audit
ctx branch feature-audit feature-audit-v2

Daily Use

Claude Code:

/ctx: show the current workstream for this repo, or tell you that none is set yet.
/ctx list: list saved workstreams, with this repo first when applicable.
/ctx search dataset download: search saved workstreams and entries for matching context.
/ctx start my-stream --pull: create a new workstream and pull the existing context from the current conversation into it before continuing.
/ctx resume my-stream: continue an existing workstream and append new context from this conversation to it.
/ctx rename better-name: rename the current workstream.
/ctx rename better-name --from old-name: rename a specific workstream without switching to it first.
/ctx delete my-stream: delete the latest saved ctx session in that workstream.
/ctx curate my-stream: open the saved-memory curation UI for that workstream.
/ctx branch source-stream target-stream: create a new workstream seeded from the current saved state of another one.
/branch source-stream target-stream: Claude shortcut for the same branch operation.

Codex:

ctx: show the current workstream for this repo, or tell you that none is set yet.
ctx list: list saved workstreams.
ctx list --this-repo: list only workstreams linked to the current repo.
ctx search dataset download: search saved workstreams and entries for matching context.
ctx search dataset download --this-repo: search only workstreams linked to the current repo.
ctx web --open: open the optional local browser UI for browsing, searching, and copying continuation commands.
ctx start my-stream: create a new workstream starting from this point only.
ctx start my-stream --pull: create a new workstream and pull the existing context from the current conversation into it first.
ctx resume my-stream: continue an existing workstream.
ctx resume my-stream --compress: continue an existing workstream with a smaller load pack.
ctx rename better-name: rename the current workstream.
ctx rename better-name --from old-name: rename a specific workstream without switching to it first.
ctx delete my-stream: delete the latest saved ctx session in that workstream.
ctx curate my-stream: open the saved-memory curation UI for that workstream.
ctx branch source-stream target-stream: create a new workstream seeded from the current saved state of another one.

Codex note:

Codex does not currently support repo-defined custom slash commands like /ctx list, so in Codex you should use the installed ctx command with subcommands. When ctx start, ctx resume, or ctx branch load context, they print a short summary of what the workstream is, the latest session being targeted, and the most recent items. They also include an explicit hint that in Codex you can inspect the full command output with ctrl-t, and in Claude you can expand the tool output block, plus guidance for the agent to summarize briefly and ask how you want to proceed instead of pasting the full pack back.

Other Installation Paths

Clone the repo and install a shared global setup from that clone

git clone https://github.com/dchu917/ctx.git
cd ctx
./setup.sh --global

This runs the same quickstart entrypoint, but installs the pinned global release into ~/.contextfun instead of wiring the current clone as the live runtime.

Install globally without cloning first

curl -fsSL https://raw.githubusercontent.com/dchu917/ctx/main/scripts/install.sh | bash

This installs a pinned tagged release into ~/.contextfun, including the ctx binary, the Python package, the default DB, and the self-contained Claude/Codex skills.

Install the bootstrap skill first with `skills.sh`

npx skills add https://github.com/dchu917/ctx --skill ctx -y -g

This installs the ctx bootstrap skill first, not the CLI binary directly. After that, the bundled skills/ctx/scripts/ctx.sh wrapper can run ctx install or auto-install the global CLI into ~/.contextfun on first use.

Bootstrap an agent shell without a full manual clone flow

Global shell bootstrap:

source <(curl -fsSL https://raw.githubusercontent.com/dchu917/ctx/main/scripts/agent_bootstrap.sh)

Project-local shell bootstrap:

source <(curl -fsSL https://raw.githubusercontent.com/dchu917/ctx/main/scripts/agent_setup_local_ctx.sh)

These are best for Claude Code or Codex terminals.

Advanced manual wiring after cloning

Repo-backed ctx shim:

bash scripts/install_shims.sh

Skill links only:

bash scripts/install_skills.sh

Override skill directories if needed:

CODEX_SKILLS_DIR=/custom/codex/skills \
CLAUDE_SKILLS_DIR=/custom/claude/skills \
bash scripts/install_skills.sh

Documentation

Curate Saved Memory

Use ctx curate <workstream> to review the saved entries that feed future loads for a workstream:

ctx curate my-stream

The terminal UI lets you scroll saved entries, inspect a preview, and change how each entry behaves in future packs:

j / k or arrow keys move through entries
Enter toggles a larger preview
p pins an entry so it always loads, even in compressed mode
x excludes an entry from future loads, but keeps it saved and searchable
a restores the default load behavior
d marks an entry for deletion, then y confirms the delete
q exits

Notes:

This changes ctx memory only. It does not edit or delete the original Claude/Codex chat.
If you are in a non-interactive shell, use ctx web --open and manage entries from the browser detail page instead.
ctx delete --interactive <workstream> opens the same curation UI.
See docs/usage.md and docs/architecture.md for deeper detail on load controls.

Clear Workstreams

Use ctx clear to delete whole workstreams together with their linked sessions and saved entries:

ctx clear --this-repo --yes
ctx clear --all --yes

Notes:

--this-repo deletes only workstreams linked to the current repo.
--all deletes workstreams across the entire current ctx DB.
--yes is required for the actual delete. Without it, ctx prints what would be removed and exits without deleting anything.
This clears ctx-managed memory, attachments, and current-workstream pointers for the deleted workstreams. It does not delete the original Claude/Codex chat files.

Security

ctx is a context layer, not a sandbox. See SECURITY.md for the threat-model summary and docs/maintenance.md for operational notes.

FAQ

Do I need API keys?

No. Everything is local.

Can multiple repos share the same context DB?

Yes. Set ctx_DB to a shared path such as ~/.contextfun/context.db.

Does deleting a ctx session delete the actual Claude/Codex chat?

No. It only deletes the internal ctx session and its stored attachments.

License

MIT. See LICENSE.

↑ top

26.Leonardo, Borgia, and Machiavelli: A Fateful Collusion

Sourcehttps://www.historytoday.com/archive/leonardo-borgia-and-machiavelli-fateful-collusion

Sitehistorytoday.com

AuthorPaul Strathern is author of The Medici: Godfathers of the Renaissance (Plmlico, 2004).

Submitted2026-04-16 06:38 UTC (Hacker News)

HN activity45 points · 0 comments

Length2.3K words (~10 min read)

Languageen

During the latter half of 1502, when the Italian Renaissance was at its height, three of its most distinguished yet disparate figures travelled together through the remote hilly region of the Romagna in north-eastern Italy. Cesare Borgia (1475-1507), backed by his father Pope Alexander VI (1431-1503), was leading a military campaign whose aim was to carve out his own personal princedom. He had hired Leonardo da Vinci (1452-1519) as his chief military engineer whose brief was to reinforce the castles and defences in the region as well as to construct a number of revolutionary new military machines, which he had designed in his notebooks. Accompanying this unlikely duo was the enigmatic figure of Niccolò Machiavelli (1469-1527), who had been despatched by the Florentine authorities as an emissary to the travelling ‘court’ with instructions to ingratiate himself with Borgia and, as far as possible, discover his intentions towards Florence whose position to the west, just across the Apennine mountains, left it particularly vulnerable to Borgia’s territorial ambitions.

In a characteristically Machiavellian situation Borgia knew perfectly well what Machiavelli was up to, and Machiavelli knew that he knew this. Machiavelli had been instructed to send regular diplomatic despatches back to Florence, reporting on all he had discovered. Machiavelli well understood that Borgia was intercepting these despatches and reading them himself, discarding those he felt should not be sent. As a result, Machiavelli would often resort to alluding in the most oblique form to what was actually taking place. Borgia, a man whose considerable intellect matched his reputation for treachery and violence, was not fooled by this. He knew that the Florentine authorities would certainly have established a simple code with Machiavelli before he had set out. Remarks about the mountains, the local people, the weather and even the state of Machiavelli's accommodation might all refer to vital intelligence.

Machiavelli's information came from a number of unlikely sources. Sometimes it even came directly from Borgia himself, but could he believe what Borgia told him? Machiavelli had to be guarded about any other sources of information, which usually came from careless remarks let drop by secretaries or high-ranking officers among Borgia's entourage whom Machiavelli had befriended. Though everyone knew Machiavelli was a spy, there was something wittily subversive in his character which seemed to appeal to them. This also appealed to Borgia himself: here was a man of some learning, whose intellect matched his own, who genuinely appeared more interested in discussing philosophical ideas than in per- forming the task of a mere envoy. Such a man was rare company among the rough and ready mercenary commanders with whom Borgia was surrounded. And, unlike his commanders, in a curious way he knew that he could trust Machiavelli, man to man: up to a point, that is. Many of Borgia's most daring and sensational plans relied upon the notion of secrecy and betrayal, elements which he was not even willing to pass on to his military commanders until the last moment, when there was no chance of such secrecy being compromised.

**Portrait of Cesare Borgia, Tobias Stimmer, c.1549-75. RIjksmuseum. Public Domain.**

For obvious reasons, Machiavelli frequently made misleading remarks about the sources of his information in order to protect their identity. However, one particular source - referred to only as a 'friend' - was a combination of various informants, who observed intelligence and bits of gossip picked up here and there. Or so Machiavelli would have had us believe. It has now become clear that most of the information from this 'friend' did in fact come directly from one source and that this vital informant was none other than Machiavelli's friend and fellow Florentine Leonardo da Vinci.

Borgia's reasons for hiring Leonardo da Vinci were obvious. Besides being known as a great artist, he had already established himself as the most ingenious and talented military engineer in Italy. Yet why on earth should an artist of such refined sensibilities as Leonardo simply abandon painting to face the rigours as well as the dangers of campaign life with a man as notorious as Borgia. The evidence suggests that Leonardo was going through something of a crisis at this time. He had grown tired of painting - so much so that he had already become notorious for leaving canvases and frescoes unfinished because he had 'solved' their difficulties and they thus no longer interested him. He wished to have time to pursue his inventive and ingenious scientific pursuits, which he secretly jotted down in his coded notebooks, and perhaps felt that the freedom given to him by Borgia would let him do this. Borgia's instructions allowed Leonardo to roam the Romagna almost at will, coming up with ideas for new defences and infrastructure as he saw fit. Another quirk of Leonardo's character was that he seemed to be attracted to, and do his best work for, men of powerful and unpredictable temperament who nonetheless allowed him freedom to develop his own ideas in between his undemanding public duties. Many of Leonardo's most accomplished and ingenious creations literally disappeared into the air - intricate ice sculptures, technically sophisticated machines which would explode into fireworks, sensational dramatic stage devices which would be discarded after the night's performance.

In his time, Leonardo da Vinci would be employed by some of the most powerful and flamboyant figures of his age ranging from Lorenzo the Magnificent of Florence to Galleazzo 'il Moro' Sforza, who murdered his way to becoming Duke of Milan; from the young Francis I of France, king of the most powerful nation in Europe, and to Cesare Borgia, a man whose misdeeds were of such enormity that he has become a byword for evil.

Borgia was the illegitimate son of Pope Alexander VI, a pontiff whose notoriety placed him in a class of his own, even among the popes of the period. (Cardinal Rodrigo Borgia, as he was at the time of the papal elections, was the first man to ensure himself the papal throne by unashamedly buying - with mule trains of jewels and gold - the requisite amount of cardinals required to ensure his election.) His second son Cesare carried on the Borgia traditions to the best of his considerable abilities: he may well have murdered his older brother to ensure his place as his father's son and heir, and had a psychologically intense relationship with his notorious sister Lucrezia, which was at the very least subconsciously incestuous. (A suspicious number of her husbands and lovers met a gruesome end while he was around.) And, where treachery was concerned, he was second to none - in an age and culture where treachery was very much the norm.

**A hoist in use at an arsenal, engraving by Francesco Bartolozzi after Leonardo da Vinci, 1796. Wellcome Collection. Public Domain.**

We know that Borgia and Machiavelli formed a close, if somewhat wary, friendship. Leonardo's reactions to his companions are less clear: Borgia is mentioned just once, in an aside, in his notebooks. What we do know is that during the course of Leonardo's travels of inspection for Borgia he came across the mountainous landscape in the upper Arno valley that would form the mysterious background to the Mona Lisa, one of the few paintings he would keep in his possession to the end of his days, constantly returning to it, pondering its composition, emphasising or toning down details and so forth. The present somewhat podgy-faced beauty which hangs in the Louvre is now known to be a travesty of the original. Over centuries the surface of the lighter pigments of her face have developed many tiny fissures, thus broadening and rounding her cheeks, while the darker pigments which depict her more definite features have lesser fissures and have thus retained a much closer approximation to their original form. This continuous retouching of the Mona Lisa was a symptom of a psychological trait in Leonardo, which became much more accentuated after his service with Borgia.

Leonardo's tendency to leave works unfinished and to flit from one subject to another in his notebooks, his inability to order this work into separate topics, or execute any overall extensive plan, all these minor traits became exaggerated to almost pathological proportions after his work with Borgia. Despite Leonardo's later attempts to order his voluminous notebooks, nothing whatsoever came of this project except a comparatively brief treatise on painting (which was probably put together by his faithful assistant Melzi). As a result, Leonardo's scientific legacy - to say nothing of the ground- breaking anatomical investigations that took him so much effort and caused him so much trouble - would play no part whatsoever in the advancement of science. All those ingenious devices, the working machines (from helicopters to submarines), the screws, the gears, the 'hodometer' (for the precise measuring of distances, invented for Borgia), all this came to nothing. In the event, the notebooks would be sold off after Leonardo's death, sometimes a few separated sheets at a time, to rich collectors. These souvenir hunters had no conception of what Leonardo's notebooks were about and regarded them merely as curiosities of genius. They could not even read the mirror-written Latin instructions beside the drawings, a simple code whose secretive crabbed script was not fully deciphered until well over a century later. The waste is inestimable. If Galileo (born less than half a century after Leonardo's death) had been able to peruse Leonardo's notebooks, entire new branches of science might have come into being, while others would have made significant advances, in some cases centuries before they in fact did so.

How did Borgia contribute to this psychological flaw in Leonardo? And why did Machiavelli make Borgia the exemplary hero of his notorious political treatise The Prince? Ironically, the reason for these two disparate effects is the same: Borgia's duplicitous ruthlessness. A supreme example of this was witnessed by both Machiavelli and Leonardo on the occasion when Borgia charmed his treacherous commanders into meeting him for a reconciliation at the town of Senigallia, assuring them that he could not fulfil his ambitions without them then had them all murdered. Some were garrotted in his presence, others transported in cages and slaughtered later.

**Study of a warrior for a fresco on The Battle of Anghiari, Leonardo da Vinci, 1505. Katholieke Universiteit Leuven. Public Domain.**

Machiavelli's initial despatch to Florence describing these events indicates that he was almost out of his wits with terror. News of the betrayals spread fast, and Sinigallia was in mayhem as Borgia's troops went on the rampage, beyond the control of even their redoubtable commander. We can only imagine how this must have affected the sensitive mind of Leonardo, who was with Machiavelli on this occasion. The oblique, ever-secretive Leonardo makes no mention of this event in his notebooks. Such an omission is not unusual; he often simply shut out from his mind any upsetting reality he could not face. But this horrific event would have its effect nonetheless almost at once it would accentuate what might be termed his 'intellectual stutter'. The meticulous details of his observations would lose any semblance of overall fluency as the intensity of his mind darted from one idea to another. It was at this time that he attempted to explain this curious mental tic (to himself?) by writing beside a diagram in his notebook that he would not complete this project because of 'the evil nature of man'.

The more resilient and realistic Machiavelli would eventually take a diametrically different attitude. Indeed, he even went so far as to embrace the 'evil nature of man'. If a prince was to conquer a territory, rule it and continue to govern it amid the treacherous politics of Renaissance Italy, then Borgia's ruthless lack of moral concern was the only way he could succeed. All this Machiavelli would later set down in The Prince, whose amorality would inspire indignant outrage across Europe and beyond.

As for Borgia himself, the truly astonishing extent of his ambitions only gradually emerged after his death. His plan had been to establish his own princedom in the Romagna. Backed by the diplomatic machinations of his powerful father, he would then take Florence and eventually unite the whole of Italy under his power. To give Machiavelli his due he probably realised this earlier than most; he too wished to see a united Italy that would achieve a power it had not seen since the collapse of the Roman Empire over a millennium before. Yet even Machiavelli did not suspect the full enormity of what Borgia had planned with his father. Upon the death of Alexander VI a new pope would be elected by the college of cardinals. There is some evidence that Borgia planned to dispense with this centuries-old tradition for voting in St Peter's successor to the rule of Christendom. Instead, he intended to seize the papacy, declare himself pope and turn this office into a secular hereditary institution ruled by the House of Borgia. As Machiavelli had seen, the key to Borgia's success lay in his astonishing ability to outwit his enemies by means of treachery beyond wildest imagination.

Ironically, when Borgia's luck finally ran out, it was he who would fall victim to others, betrayed by Pius III, the Pope who succeeded his father, and then by his ally and protector the Viceroy of Naples. Shipped in irons to Spain, here too he would be dogged by bad luck. Despite escaping from his castle prison, the once mighty Cesare Borgia would suffer an ignominious end in a minor military skirmish far removed from Rome in obscure rural Spain, all his grand ambitions unachieved.

↑ top

28.Show HN: Daemons – we pivoted from building agents to cleaning up after them

Sourcehttps://charlielabs.ai/

SiteCharlie Labs

Submitterrileyt (Hacker News)

Submitted2026-04-21 16:16 UTC (Hacker News)

HN activity39 points · 24 comments

Length1.0K words (~5 min read)

Languageen

Keep PRs mergeable, documentation accurate, issues up to date, and bugs out of production with AI daemons.

Keep PRs mergeable, documentation accurate, issues up to date, and bugs out of production with a new type of AI background process that is self-initiated and defined in easy-to-use .md files.

---
name: pr-helper
purpose: Keeps PRs review-ready.
watch:
  - when a pull request is opened
  - when a pull request is synchronized
routines:
  - suggest PR description improvements
  - flag missing reviewer context
deny:
  - merge pull requests
  - push to protected branches
schedule: "0 9 * * *"
---

## Policy
Focus on short, actionable feedback.

## Output format
1. Findings
2. Suggested edits
3. Questions for author

What's in a Daemon .md file?: Daemons are defined in Markdown files that live in your repo. You define the role once — what it watches, what it does, what it can't do — and the daemon handles it from there.
Frontmatter: Declarative fields between --- fences define what the daemon is: its name, purpose, watch conditions, routines, deny rules, and schedule.
Content: Markdown below the --- frontmatter defines how the daemon operates: policy, output format, escalation rules, limits, and more.
Portable: Daemon files are an open format. The same file works across any provider that supports the spec.

Where They Fit

Where Daemons fit in

Agents are human-initiated. Daemons are self-initiated — they observe the environment, detect drift, and act without a prompt.

     GitHub   Linear   Sentry   Slack   Docs
                         |
                         v
+------------------------------------------------+
|  AGENTS                     (human-initiated)  |
|  Build features, fix bugs, ship code           |
+------------------------+-----------------------+
                         |
           +-------------+--------------+
           | Code, PRs, Issues, Docs    |
           |     drift accrues here     |
           +-------------+--------------+
                         |
+------------------------------------------------+
|  DAEMONS                     (self-initiated)  |
|  Watch, detect, fix, repeat. No prompt needed. |
|                                                |
|  > Resolve merge conflicts                     |
|  > Update stale documentation                  |
|  > Triage and assign bugs                      |
|  > Patch outdated dependencies                 |
|  > Label and organize issues                   |
|  > Fix failing CI checks                       |
+------------------------------------------------+

The Problem

Daemons do the work that agents leave behind

Operational debt is the new technical debt. Daemons pay it down.

Debt accumulates

Operational debt accrues in your Linear issues, GitHub PRs, dependencies, and more, creating serious drag and reducing overall quality.

Agents accelerate it

Agents help teams ship faster, which creates operational debt faster too. More code, more docs, and more issues to maintain.

Daemons maintain it

A daemon fills this maintenance role. You define the role once — what it watches, what it does, what it can't do — and the daemon handles it from there.

Daemon Library

Daemons are defined in Markdown files that you can modify, create, and share

Project Manager

Keep your issues up to date

Bug Triage

Watch your bug tracker and prevent reoccurrences

Codebase Maintainer

Keep dependencies up to date and patches in place

Librarian

Keep your documentation accurate so onboarding is not a wild goose chase

---
name: issue-labeler
purpose: Ensures every Linear issue has the correct labels from the type and touchpoint label groups.
watch:
  - when a Linear issue is created
routines:
  - add missing labels to a new Linear issue
  - find issues with missing labels and add them
deny:
  - remove labels from issues
  - replace or change existing labels on issues
  - comment on issues
  - change issue status, priority, assignee, or any field other than labels
schedule: "0 2 * * *"
---

## Policy
- Only add labels. Never remove, replace, or overwrite existing labels.
- If an issue already has a label from a group, do not touch that group.
- Apply the single best-fit label from each missing group.

## Limits
- On issue-created events, process only the triggering issue.
- On the daily sweep, label at most 20 issues per activation.

Hybrid activation: Wakes on new issues and sweeps daily to catch anything missed.
Additive only: Deny rules ensure the daemon can only add labels, never remove or change existing ones.
Rate-limited: Limits section caps work per activation so the daemon doesn't overwhelm reviewers.

Autonomy

Predictable and reliable autonomy

Daemons excel at ongoing work. Use agents to build, and use daemons to maintain what you've built.

Work you didn't have to notice

Every action a daemon takes is one a human didn't have to notice, decide on, and initiate.

Specialized and improving

Daemons perform specific roles, get better over time, and always follow your team's conventions.

Predictable behavior earns trust

Encode your preferences and style once, and daemons will keep things tidy. Predictable behavior earns autonomy.

Define a role, not a task

A task has a start, an end, and a definition of done. A role is an ongoing responsibility with judgment. The daemon file is a role description.

Compounding control

Every daemon file edit changes behavior across every future activation for the whole team. Each change is small, but the effects multiply.

Direction once, not every time

Agents require direction every time. Task 500 costs the same human attention as task 1. Daemons require direction once, then less and less over time.

Accumulating context

Daemons build a richer model of the team and codebase over time. A daemon at month three is sharper than at day one, without anyone updating a file.

Infrastructure

Always on and easy to use

Local agents need your laptop to run, and cloud-based agents can be flaky and unpredictable.

Config in your repo

The daemon file is a spec in your repo. The team tunes it like any other config: tighten a threshold, add a deny rule, narrow the scope.

Continuous observation

Daemons run continuously in the background, observing where work happens — in GitHub, Linear, Slack, and more.

Zero maintenance

Daemons run smoothly and execute reliably, without having to stare at logs, monitor uptime, or restart processes manually.

Eventually, you forget they're running. That's the daemon working.

Open Format

Build your own daemons with our flexible specification that anyone can use

Daemons don't complete tasks — they fulfill roles. Create your own now.

"The Charlie Daemons are working very, very well, both for commenting and cleaning things up in Linear as well as the event-based actions in GitHub. We're moving so f@#$ing fast with agents that it's really great to have Daemons watching our backs to make sure we can keep up the accelerated pace."

— Jasper Croome, aarden.ai

↑ top

29.Tim Cook's Impeccable Timing

Sourcehttps://stratechery.com/2026/tim-cooks-impeccable-timing/

SiteStratechery by Ben Thompson

Submitterhasheddan (Hacker News)

Published2026-04-21

HN activity251 points · 344 comments

Length2.7K words (~12 min read)

Languageen-US

Tim Cook had an extraordinary run — and impeccable timing, both in terms of when he became CEO, and when he is stepping down.

Listen to this post:

It’s the nature of business that the eulogy for a chief executive doesn’t happen when they die, but when they retire, or, in the case of Apple CEO Tim Cook, announce that they will step up to the role of Executive Chairman on September 1. The one morbid exception is when a CEO dies on the job — or quits because they are dying — and the truth of the matter is that that is where any honest recounting of Cook’s incredibly successful tenure as Apple CEO, particularly from a financial perspective, has to begin.

The numbers, to be clear, are extraordinary. Cook became CEO of Apple on August 24, 2011, and in the intervening 15 years revenue has increased 303%, profit 354%, and the value of Apple has gone from $297 billion to $4 trillion, a staggering 1,251% increase.

Apple's increase in market cap over Tim Cook's tenure as CEO

The reason for Cook’s accession in 2011 became clear a mere six weeks later, when Steve Jobs passed away from cancer on October 5, 2011. Jobs’ death isn’t the reason Cook was chosen — Cook had already served as interim CEO while Jobs underwent treatment in 2009 — but I think the timing played a major role in making Cook arguably the greatest non-founder CEO of all time.

Zero to One

Peter Thiel introduced the concept of Zero To One thusly:

When we think about the future, we hope for a future of progress. That progress can take one of two forms. Horizontal or extensive progress means copying things that work — going from 1 to n. Horizontal progress is easy to imagine because we already know what it looks like. Vertical or intensive progress means doing new things — going from 0 to 1. Vertical progress is harder to imagine because it requires doing something nobody else has ever done. If you take one typewriter and build 100, you have made horizontal progress. If you have a typewriter and build a word processor, you have made vertical progress.

Steve Jobs made 0 to 1 products, as he reminded the audience in the introduction to his most famous keynote:

Every once in a while, a revolutionary product comes along that changes everything. First of all, one’s very fortunate if one gets to work on one of these in your career. Apple’s been very fortunate: it’s been able to introduce a few of these into the world.

In 1984, we introduced the Macintosh. It didn’t just change Apple, it changed the whole computer industry. In 2001, we introduced the first iPod. It didn’t just change the way we all listen to music, it changed the entire music industry.

Well, today we’re introducing three revolutionary products of this class. The first one: a widescreen iPod with touch controls. The second: a revolutionary mobile phone. And the third is a breakthrough Internet communications device. Three things…are you getting it? These are not three separate devices. This is one device, and we are calling it iPhone.

Steve Jobs would, three years later, also introduce the iPad, which makes four distinct product categories if you’re counting. Perhaps the most important 0 to 1 product Jobs created, however, was Apple itself, which raises the question: what makes Apple Apple?

The Cook Doctrine

“What Makes Apple Apple” isn’t a new question; it was the central question of Apple University, the internal training program the company launched in 2008. Apple University was hailed on the outside as a Steve Jobs creation, but while I’m sure he green lit the concept, it was clear to me as an intern on the Apple University team in 2010, that the program’s driving force was Tim Cook.

The core of the program, at least when I was there, was what became known as The Cook Doctrine:

We believe that we’re on the face of the Earth to make great products, and that’s not changing.

We’re constantly focusing on innovating.

We believe in the simple, not the complex.

We believe that we need to own and control the primary technologies behind the products we make, and participate only in markets where we can make a significant contribution.

We believe in saying no to thousands of projects so that we can really focus on the few that are truly important and meaningful to us.

We believe in deep collaboration and cross-pollination of our groups, which allow us to innovate in a way that others cannot.

And frankly, we don’t settle for anything less than excellence in every group in the company, and we have the self-honesty to admit when we’re wrong and the courage to change.

And I think, regardless of who is in what job, those values are so embedded in this company that Apple will do extremely well.

Cook explained this on Apple’s January 2009 earnings call, during Jobs’ first leave of absence, in response to a question about how Apple would fare without its founder. It’s a brilliant statement, but it is — as the last paragraph makes clear — ultimately about maintaining, nurturing, and growing what Jobs built.

That is why I started this Article by highlighting the timing of Cook’s ascent to the CEO role. The challenge for CEOs following iconic founders is that the person who took the company from 0 to 1 usually sticks around for 2, 3, 4, etc.; by the time they step down the only way forward is often down. Jobs, however, by virtue of leaving the world too soon, left Apple only a few years after its most important 0 to 1 product ever, meaning it was Cook who was in charge of growing and expanding Apple’s most revolutionary device yet.

Cook’s Triumphs

Cook, to be clear, managed this brilliantly. Under his watch the iPhone not only got better every year, but expanded its market to every carrier in basically every country, and expanded the line from one model in two colors to five models in a plethora of colors sold at the scale of hundreds of millions of units a year.

Cook was, without question, an operational genius. Moreover, this was clearly the case even before he scaled the iPhone to unimaginable scale. When Cook joined Apple in 1998 the company’s operations — centered on Apple’s own factories and warehouses — were a massive drag on the company; Cook methodically shut them down and shifted Apple’s manufacturing base to China, creating a just-in-time supply chain that year-after-year coordinated a worldwide network of suppliers to deliver Apple’s ever-expanding product line to customers’ doorsteps and a fleet of beautiful and brand-expanding stores. There was not, under Cook’s leadership, a single significant product issue or recall.

Cook also oversaw the introduction of major new products, most notably AirPods and Apple Watch; the “Wearables, Home, and Accessories” category delivered $35.4 billion in revenue last year, which would rank 128 on the Fortune 500. Still, both products are derivative of the iPhone; Cook’s signature 0 to 1 product, the Apple Vision Pro, is more of a 0.5.

Cook’s more momentous contribution to Apple’s top line was the elevation of Services. The Google search deal actually originated in 2002 with an agreement to make Google the default search service for Safari on the Mac, and was extended to the iPhone in 2007; Google’s motivation was to ensure that Apple never competed for their core business, and Cook was happy to take an ever increasing amount of pure profit.

The App Store also predated Cook; Steve Jobs said during the App Store’s introduction that “we keep 30 [percent] to pay for running the App Store”, and called it “the best deal going to distribute applications to mobile platforms”. It’s important to note that, in 2008, this was true! The App Store really was a great deal.

Three years later, in a July 28, 2011 email — less than a month before Cook officially became CEO — Phil Schiller wondered if Apple should lower its take once they were making $1 billion a year in profit from the App Store. John Gruber, writing on Daring Fireball in 2021, wondered what might have been had Cook followed Schiller’s advice:

In my imagination, a world where Apple had used Phil Schiller’s memo above as a game plan for the App Store over the last decade is a better place for everyone today: developers for sure, but also users, and, yes, Apple itself. I’ve often said that Apple’s priorities are consistent: Apple’s own needs first, users’ second, developers’ third. Apple, for obvious reasons, does not like to talk about the Apple-first part of those priorities, but Cook made explicit during his testimony during the Epic trial that when user and developer needs conflict, Apple sides with users. (Hence App Tracking Transparency, for example.)

These priorities are as they should be. I’m not complaining about their order. But putting developer needs third doesn’t mean they should be neglected or overlooked. A large base of developers who are experts on developing and designing for Apple’s proprietary platforms is an incredible asset. Making those developers happy — happy enough to keep them wanting to work and focus on Apple’s platforms — is good for Apple itself.

I want to agree with Gruber — I was criticizing Apple’s App Store policies within weeks of starting Stratechery, years before it became a major issue — but from a shareholder perspective, i.e. Cook’s ultimate bosses, it’s hard to argue with Apple’s uncompromising approach. Last year Apple Services generated 26% of Apple’s revenue and 41% of the company’s profit; more importantly, Services continues to grow year-over-year, even as iPhone growth has slowed from the go-go years.

China and AI

Another way to frame the Services question is to say that Gruber is concerned about the long-term importance of something that is somewhat ineffable — developer willingness and desire to support Apple’s platforms — which is, at least in Gruber’s mind, essential for Apple’s long-term health. Cook, in this critique, prioritized Apple’s financial results and shareholder returns over what was best for Apple in the long run.

This isn’t the only part of Apple’s business where this critique has validity. Cook’s greatest triumph was, as I noted above, completely overhauling and subsequently scaling Apple’s operations, which first and foremost meant developing a heavy dependence on China. This dependence was not inevitable: Patrick McGee explained in Apple In China, which I consider one of the all-time great books about the tech industry, how Apple made China into the manufacturing behemoth it became. McGee added in a Stratechery Interview:

Let me just refer back to something that you wrote I think a few months ago when you called the last 20, 25 years, like the golden age for companies like Apple and Silicon Valley focused on software and Chinese taking care of the hardware manufacturing. That is a perfect partnership, and if we were living in a simulation and it ended tomorrow, you’d give props for Apple to taking advantage of the situation better than anybody else.

The problem is we’re probably not living in the simulation and things go on, and I’ve got this rather disquieting conclusion where, look, Apple’s still really good probably, they’re not as good as they once were under Jony Ive, but they’re still good at industrial design and product design, but they don’t do any operations in our own country. That’s all dependent on China. You’ve called this in fact the biggest violation of the Tim Cook doctrine to own and control your destiny, but the Chinese aren’t just doing the operations anymore, they also have industrial design, product design, manufacturing design.

It really is ironic: Tim Cook built what is arguably Apple’s most important technology — its ability to build the world’s best personal computer products at astronomical scale — and did so in a way that leaves Apple more vulnerable than anyone to the deteriorating relationship between the United States and China. China was certainly good for the bottom line, but was it good for Apple’s long-run sustainability?

This same critique — of favoring a financially optimal strategy over long-term sustainability — may also one day be levied on the biggest question Cook leaves his successor: what impact will AI have on Apple? Apple has, to date, avoided spending hundreds of billions of dollars on the AI buildout, and there is one potential future where the company profits from AI by selling the devices everyone uses to access commoditized models; there is another future where AI becomes the means by which Apple’s 50 Years of Integration is finally disrupted by companies that actually invested in the technology of the future.

Cook’s Timing

If Tim Cook’s timing was fortunate in terms of when in Apple’s lifecycle he took the reins, then I would call his timing in terms of when in Apple’s lifecycle he is stepping down as being prudent, both for his legacy and for Apple’s future.

Apple is, in terms of its traditional business model, in a better place than it has ever been. The iPhone line is fantastic, and selling at a record pace; the Mac, meanwhile, is poised to massively expand its market share as Apple Silicon — another Jobs initiative, appropriately invested in and nurtured by Cook — makes the Mac the computer of choice for both the high end (thanks to Apple Silicon’s performance and unified memory architecture) and the low end (the iPhone chip-based MacBook Neo significantly expands Apple’s addressable market). Meanwhile, the Services business continues to grow. Cook is stepping down after Apple’s best-ever quarter, a milestone that very much captures his tenure, for better and for worse.

At the same time, the AI question looms — and it suggests that Something Is Rotten in the State of Cupertino. The new Siri still hasn’t launched, and when it does, it will be with Google’s technology at the core. That was, as I wrote in an Update, a momentous decision for Apple’s future:

Apple’s plans are a bit like the alcoholic who admits that they have a drinking problem, but promises to limit their intake to social occasions. Namely, how exactly does Apple plan on replacing Gemini with its own models when (1) Google has more talent, (2) Google spends far more on infrastructure, and (3) Gemini will be continually increasing from the current level, where it is far ahead of Apple’s efforts? Moreover, there is now a new factor working against Apple: if this white-labeling effort works, then the bar for “good enough” will be much higher than it is currently. Will Apple, after all of the trouble they are going through to fix Siri, actually be willing to tear out a model that works so that they can once again roll their own solution, particularly when that solution hasn’t faced the market pressure of actually working, while Gemini has?

In short, I think Apple has made a good decision here for short term reasons, but I don’t think it’s a short-term decision: I strongly suspect that Apple, whether it has admitted it to itself or not, has just committed itself to depending on 3rd-parties for AI for the long run.

As I noted above and in that Update, this decision may work out; if it doesn’t, however, the sting will be felt long after Cook is gone. To that end, I certainly hope that John Ternus, the new CEO, was heavily involved in the decision; truthfully, he should have made it.

To that end, it’s right that Cook is stepping down now. Jobs might have been responsible for taking Apple from 0 to 1, but it was Cook that took Apple from 1 to $436 billion in revenue and $118 billion in profit last year. It’s a testament to his capabilities and execution that Apple didn’t suffer any sort of post-founder hangover; only time will tell if, along the way, Cook created the conditions for a crash out, by virtue of he himself forgetting The Cook Doctrine and what makes Apple Apple.

↑ top

1.The Vercel breach: OAuth attack exposes risk in platform environment variables

2.Britannica11.org – a structured edition of the 1911 Encyclopædia Britannica

3.OpenAI Livestream

Past livestreams

Built to benefit everyone

Introducing ChatGPT Atlas

DevDay 2025

Introducing gpt-realtime and Realtime API updates for production voice agents

Introducing GPT-5

Introducing ChatGPT agent: bridging research and action

ChatGPT for Business

Introducing Codex

Introducing OpenAI o3 and o4-mini

Introducing GPT-4.1 in the API

Introducing 4o Image Generation

Introducing next-generation audio models in the API

New tools for building agents

Introducing GPT-4.5

4.Cal.diy: open-source community edition of cal.com

Cal.diy

About Cal.diy

What's different from Cal.com?

Built With

Getting Started

Prerequisites

Development

Setup

Quick start with yarn dx

Development tip

Gitpod Setup

Manual setup

Setting up your first user

Approach 1

Approach 2

E2E-Testing

Resolving issues

E2E test browsers not installed

Upgrading from earlier versions

Deployment

Docker

Requirements

Running Cal.diy with Docker Compose

Updating Cal.diy

Building from source with Docker

Configuration

Important Run-time variables

Build-time variables

Troubleshooting

SSL edge termination

Failed to commit changes: Invalid 'prisma.user.create()'

CLIENT_FETCH_ERROR

Railway

Northflank

Vercel

Render

Elestio

License

Enabling Content Security Policy

Integrations

Obtaining the Google API Credentials

Adding google calendar to Cal.diy App Store

Obtaining Microsoft Graph Client ID and Secret

Obtaining Zoom Client ID and Secret

Obtaining Daily API Credentials

Obtaining Basecamp Client ID and Secret

Obtaining HubSpot Client ID and Secret

Obtaining Webex Client ID and Secret

Obtaining ZohoCRM Client ID and Secret

Obtaining Zoho Calendar Client ID and Secret

Obtaining Zoho Bigin Client ID and Secret

Obtaining Pipedrive Client ID and Secret

Rate Limiting with Unkey

Contributing

Good First Issues

Contributors

Translations

Acknowledgements

5.Framework Laptop 13 Pro

6.Laws of Software Engineering

7.A Periodic Map of Cheese

Quick start with `yarn dx`