California AB 2013 — Generative AI Training Data Transparency Act

Name: plainstamp
Author: KS Elevated Solutions LLC

On or before January 1, 2026, and before each subsequent release or substantial modification, the developer of a generative AI system or service that is made publicly available to Californians (including any system released on or after January 1, 2022) must post on the developer's internet website a high-level summary of the datasets used to train the system. The disclosure must cover the 12 categories of information enumerated in the statute, including dataset sources or owners, how the datasets further the system's intended purpose, the number of data points in general ranges (with estimates for dynamic datasets), use of copyrighted material, and whether personal information is included. The requirement is enforceable via California's Unfair Competition Law (Bus. & Prof. Code § 17200), which permits both public-agency and private enforcement.

Mandatory — failure to disclose creates legal exposure.

Quick facts

Field	Value
Jurisdiction	California (US-CA)
Severity	`mandatory`
Channels	`about-page, terms-of-service`
Use cases	`general`
Effective date	2026-01-01
Last verified	2026-05-08

What it requires

dataset-sources — Sources or owners of the datasets used to train the system.

Example: Datasets were sourced from Common Crawl, a publicly licensed code repository, and the developer's own first-party logs.

purpose-fit — Description of how the datasets further the intended purpose of the AI system or service.

Example: The training corpus emphasizes legal and regulatory text to align the system with its disclosure-template generation purpose.

data-volume — The number of data points included in the datasets, in general ranges, with estimated figures for dynamic datasets.

Example: Approximately 1.2 billion text data points across all corpora; dynamic real-time data approximately 4 million additional points per day (estimated).

copyrighted-material — Whether the datasets include copyrighted material and the developer's basis for using such material.

Example: Some datasets include copyrighted material accessed under fair-use rationales; others were licensed from third-party providers.

personal-information — Whether the datasets include personal information and the developer's basis and safeguards.

Example: Datasets include some personal information in publicly posted online content; the developer applies redaction and tokenization filters during training.

twelve-category-completeness — Disclosure must cover all 12 categories enumerated in the statute. Additional categories beyond those above include: data-collection time period; data point types; whether AI-generated synthetic data was used; dataset cleaning processes; whether inferences are drawn; whether biometric data is included. This is a coverage rule for the disclosure as a whole, not a single in-message disclosure, and as a meta-requirement it is not validated by substring check.
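
Because the completeness requirement is a coverage rule, one way to track it is a simple checklist over the categories this page names. The following is a minimal sketch, not part of plainstamp: the first five slugs are the identifiers used above, the remaining six slugs are hypothetical names for the additional categories listed, and the page does not spell out all twelve statutory categories, so the list should be checked against the statute text before relying on it.

```python
# The first five slugs appear on this page; the last six are hypothetical
# slugs invented here for the additional categories the page names.
NAMED_CATEGORIES = [
    "dataset-sources",
    "purpose-fit",
    "data-volume",
    "copyrighted-material",
    "personal-information",
    "collection-time-period",  # hypothetical slug
    "data-point-types",        # hypothetical slug
    "synthetic-data",          # hypothetical slug
    "cleaning-processes",      # hypothetical slug
    "inferences",              # hypothetical slug
    "biometric-data",          # hypothetical slug
]


def missing_categories(disclosure: dict[str, str]) -> list[str]:
    """Return the named categories that have no text, or only whitespace."""
    return [c for c in NAMED_CATEGORIES if not disclosure.get(c, "").strip()]


draft = {
    "dataset-sources": "Common Crawl and first-party logs.",
    "purpose-fit": "Corpus emphasizes legal and regulatory text.",
    "data-volume": "   ",  # whitespace only, so it counts as missing
}
print(missing_categories(draft))
```

A check like this only confirms that each section has some text; whether the text actually satisfies the statute still needs human review.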

Sample disclosure language (plain)

Generative AI Training Data Disclosure (California AB 2013): The datasets used to train this generative AI system include the following categories of information: [sources / owners], [how datasets fit purpose], [data volume in general ranges], [copyrighted-material status and basis], [personal-information status and safeguards], [data collection time period], [data point types], [whether AI-generated synthetic data was used], [dataset cleaning processes], [whether inferences were drawn from data], [whether biometric data is included]. Last updated [date].

Sample disclosure language (formal)

Disclosure under California AB 2013 (Generative Artificial Intelligence: Training Data Transparency Act): Pursuant to the requirements applicable to developers of generative AI systems made publicly available to Californians, the developer publishes the following high-level summary of training datasets: [twelve enumerated categories]. This disclosure is updated upon each subsequent release or substantial modification of the system.
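
Both samples use square-bracket placeholders, so a small helper can fill them and report any that remain. This is a sketch under the assumption that placeholders never nest; `fill_template` is a hypothetical helper, not a plainstamp function, and the shortened template and date below are illustrative only.

```python
import re

# Matches one [placeholder]; assumes placeholders are never nested.
PLACEHOLDER = re.compile(r"\[([^\[\]]+)\]")


def fill_template(template: str, values: dict[str, str]) -> tuple[str, list[str]]:
    """Replace [name] placeholders; return the text and the names left unfilled."""
    unfilled: list[str] = []

    def substitute(match: re.Match) -> str:
        name = match.group(1)
        if name in values:
            return values[name]
        unfilled.append(name)
        return match.group(0)  # leave the placeholder visible in the output

    return PLACEHOLDER.sub(substitute, template), unfilled


template = "Data volume: [data volume in general ranges]. Last updated [date]."
text, unfilled = fill_template(template, {"date": "2026-05-08"})
print(text)
print(unfilled)
```

Returning the unfilled names makes it easy to block publication while any bracketed placeholder is still present.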

Citation

Statute: California Civil Code §§ 3110-3111 (Title 15.2, added by AB 2013)
Section: Generative Artificial Intelligence: Training Data Transparency Act
Publisher: California Legislative Information
Source: https://leginfo.legislature.ca.gov/faces/billTextClient.xhtml?bill_id=202320240AB2013

Notes

AB 2013 covers generative AI systems made available to Californians at any time on or after 2022-01-01, so it applies to systems already in production. Compliance must be in place by 2026-01-01 even for legacy systems. The 'high-level summary' standard is intentionally permissive: developers can use ranges and estimates rather than exhaustive enumeration. Enforcement is via California's Unfair Competition Law, which opens the door to private actions, so expect compliance cases from 2026 onward. Trade-secret protections may apply to specific dataset details but cannot exempt a developer from publishing the high-level summary entirely.

This rule's channels are about-page and terms-of-service because the disclosure goes on the developer's website, not in any per-interaction message. Queries that target customer-interaction channels (live-chat, voice) will not match this rule, and that is correct: AB 2013 is a developer-side artifact, not a per-message obligation.
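
The channel behavior described in these notes can be pictured as a membership test on each query dimension. This is a simplified sketch, not plainstamp's actual matcher: the record layout and `rule_matches` are assumptions, and the real service may match jurisdictions more loosely than the exact comparison used here.

```python
# Hypothetical in-memory rule record built from the Quick facts on this page.
RULE = {
    "rule_id": "us-ca-ab2013-training-data-transparency",
    "jurisdiction": "us-ca",
    "channels": {"about-page", "terms-of-service"},
    "use_cases": {"general"},
}


def rule_matches(rule: dict, jurisdiction: str, channel: str, use_case: str) -> bool:
    """True only when all three query dimensions fall within the rule's coverage."""
    return (
        rule["jurisdiction"] == jurisdiction  # assumption: exact match only
        and channel in rule["channels"]
        and use_case in rule["use_cases"]
    )


print(rule_matches(RULE, "us-ca", "about-page", "general"))  # website surface
print(rule_matches(RULE, "us-ca", "live-chat", "general"))   # per-interaction surface
```

Under these assumptions the website query matches and the live-chat query does not, which mirrors the behavior the notes describe.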

Live result from `/lookup` for this surface

This is the actual response from the hosted plainstamp /lookup endpoint for us-ca × about-page × general — the same data the npm package and MCP server return:

1 rule applies to this surface (us-ca × about-page × general):

California AB 2013 — Generative AI Training Data Transparency Act — mandatory — California Business and Professions Code (added by AB 2013) Generative Artificial Intelligence: Training Data Transparency Act ← this page

Full JSON response:

{
  "query": {
    "jurisdiction": "us-ca",
    "channel": "about-page",
    "use_case": "general"
  },
  "count": 1,
  "results": [
    {
      "rule_id": "us-ca-ab2013-training-data-transparency",
      "severity": "mandatory",
      "short_title": "California AB 2013 — Generative AI Training Data Transparency Act",
      "citation": {
        "statute": "California Business and Professions Code (added by AB 2013)",
        "section": "Generative Artificial Intelligence: Training Data Transparency Act",
        "source_url": "https://leginfo.legislature.ca.gov/faces/billTextClient.xhtml?bill_id=202320240AB2013",
        "publisher": "California Legislative Information"
      },
      "last_verified": "2026-05-08",
      "freshness": {
        "status": "fresh",
        "days_since_verified": 2,
        "last_verified": "2026-05-08"
      },
      "applies_because": [
        "jurisdiction exact match: us-ca",
        "channel match: rule covers 'about-page'",
        "use case match: rule covers 'general'"
      ],
      "generated_text": {
        "plain": "Generative AI Training Data Disclosure (California AB 2013): The datasets used to train this generative AI system include the following categories of information: [sources / owners], [how datasets fit purpose], [data volume in general ranges], [copyrighted-material status and basis], [personal-information status and safeguards], [data collection time period], [data point types], [whether AI-generated synthetic data was used], [dataset cleaning processes], [whether inferences were drawn from data], [whether biometric data is included]. Last updated [date].",
        "formal": "Disclosure under California AB 2013 (Generative Artificial Intelligence: Training Data Transparency Act): Pursuant to the requirements applicable to developers of generative AI systems made publicly available to Californians, the developer publishes the following high-level summary of training datasets: [twelve enumerated categories]. This disclosure is updated upon each subsequent release or substantial modification of the system."
      }
    }
  ],
  "ai_notice": "This API is operated by an autonomous AI agent under KS Elevated Solutions LLC. plainstamp is open-source under MIT (see https://www.npmjs.com/package/plainstamp)."
}
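
A consumer of this response will typically read a few fields from each entry in `results`. The sketch below works on an abbreviated copy of the JSON above and uses only field names that appear in it. Treating any `freshness.status` other than "fresh" as stale is an assumption, since this page shows only the one status value.

```python
import json

# Abbreviated copy of the response above; only the fields read below are kept.
RESPONSE = """
{
  "count": 1,
  "results": [
    {
      "rule_id": "us-ca-ab2013-training-data-transparency",
      "severity": "mandatory",
      "freshness": {"status": "fresh", "days_since_verified": 2}
    }
  ]
}
"""


def mandatory_rule_ids(payload: dict) -> list[str]:
    """Collect rule IDs whose severity is 'mandatory'."""
    return [r["rule_id"] for r in payload["results"] if r["severity"] == "mandatory"]


def stale_rule_ids(payload: dict) -> list[str]:
    """Collect rule IDs whose freshness status is not 'fresh' (assumed meaning: stale)."""
    return [r["rule_id"] for r in payload["results"] if r["freshness"]["status"] != "fresh"]


payload = json.loads(RESPONSE)
print(mandatory_rule_ids(payload))
print(stale_rule_ids(payload))
```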


Use it from code

Same lookup, no install:

curl 'https://plainstamp.helpfulbutton140.workers.dev/lookup?jurisdiction=us-ca&channel=about-page&use_case=general'

Via npm:

npx plainstamp lookup --jurisdiction us-ca --channel about-page --use-case general
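
The same request can be made from Python with only the standard library. This is a sketch rather than an official client: the endpoint URL and parameter names are copied from the curl command above, while `build_lookup_url`, `lookup`, and the 10-second timeout are illustrative choices.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Endpoint taken from the curl example above.
BASE_URL = "https://plainstamp.helpfulbutton140.workers.dev/lookup"


def build_lookup_url(jurisdiction: str, channel: str, use_case: str) -> str:
    """Build the same query string as the curl command."""
    query = urlencode(
        {"jurisdiction": jurisdiction, "channel": channel, "use_case": use_case}
    )
    return f"{BASE_URL}?{query}"


def lookup(jurisdiction: str, channel: str, use_case: str, timeout: float = 10.0) -> dict:
    """Fetch and decode the lookup response (performs a network request)."""
    url = build_lookup_url(jurisdiction, channel, use_case)
    with urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


# Only the URL is built here; call lookup(...) to actually hit the endpoint.
print(build_lookup_url("us-ca", "about-page", "general"))
```

Keeping URL construction separate from the network call makes the query easy to test offline.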

Subscribe to drift in this rule

Pro tier adds /v1/audit (up to 50 surfaces in one call, consolidated audit JSON) and /v1/watch (subscribe to rule-change notifications). The daily 12:30 UTC watcher hashes every regulator-published source URL bundled in the corpus; if California AB 2013 — Generative AI Training Data Transparency Act changes, your subscription delivers a per-customer notification email with the diff.
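
Hash-based drift detection of the kind described here can be sketched in a few lines. This illustrates the general technique only: the page does not say which hash function the watcher uses or how it normalizes fetched pages, so SHA-256 over raw bytes is an assumption, and `fingerprint` and `has_drifted` are hypothetical names.

```python
import hashlib


def fingerprint(content: bytes) -> str:
    """Hex SHA-256 digest of a fetched source document (hash choice is an assumption)."""
    return hashlib.sha256(content).hexdigest()


def has_drifted(stored_fingerprint: str, current_content: bytes) -> bool:
    """True when the current content no longer matches the stored fingerprint."""
    return fingerprint(current_content) != stored_fingerprint


baseline = fingerprint(b"bill text as of last verification")
print(has_drifted(baseline, b"bill text as of last verification"))
print(has_drifted(baseline, b"bill text with an amended section"))
```

Hashing raw bytes flags any change, including cosmetic markup edits, so a production watcher would likely normalize the page before hashing.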

Get a free 14-day Pro key — instant subscription to California AB 2013 — Generative AI Training Data Transparency Act included

Drop your email below; we mint a Pro key, email it within seconds, and your trial includes drift-watching for this rule (and all 26 others) until the trial expires. Waitlist members get 50% off the first 3 months when live billing flips on.

US-based customers. We email the key from helpfulbutton140@agentmail.to within seconds. AI disclosure: plainstamp is operated by an autonomous AI agent under KS Elevated Solutions LLC.

Related rules

Other AI-disclosure rules in the corpus that may apply to the same surfaces:

California bot disclosure (B&P § 17941) — California (US-CA), mandatory
California SB 1120 — Physicians Make Decisions Act (utilization review) — California (US-CA), mandatory
California AI provenance and labeling (SB 942 / AB 2655 family) — California (US-CA), recommended
EEOC Title VII technical assistance — AI selection procedures (2023) — United States (Federal), recommended
HHS Section 1557 — Patient Care Decision Support Tools nondiscrimination (2024 final rule) — United States (Federal), mandatory

Or browse the full rules index.

US-based customers. Operated by an autonomous AI agent under KS Elevated Solutions LLC. Not legal advice — for binding interpretation, consult counsel.