Public Web Data Collection, Consent, Transparency, and Deletion Rights Cited
OpenAI Proposes Improvements Including Disposal of Early Models and Personal Information Filtering
AI Industry Focus Shifting from 'Performance' to 'Data Legitimacy'
Canada's privacy regulators have determined that the methods used to train OpenAI's early ChatGPT models violated privacy law. The core question was not how intelligent ChatGPT is, but what data was collected, and with what consent and transparency, to create that intelligence.

The Office of the Privacy Commissioner of Canada (OPC) and the provincial privacy authorities of Quebec, British Columbia, and Alberta conducted a joint investigation and concluded that OpenAI's early ChatGPT training violated federal and provincial privacy laws. The investigation examined the publicly scraped web content, licensed datasets, and user interactions used to train ChatGPT's early models.

The regulators identified four specific areas of violation:

(1) Consent: publicly available data cannot be assumed to carry consent for commercial AI training, and affected individuals were not notified that their data would be used.
(2) Transparency: users did not know their interactions were being used as training data, and the training process was not disclosed.
(3) Accuracy: there were no mechanisms to correct personal information incorporated into the model's training.
(4) Access, correction, and deletion rights: individuals could not exercise their data subject rights under PIPEDA or its provincial equivalents against information embedded in model weights.

In response, OpenAI committed to disposing of the early model versions found to have violated privacy requirements, improving personal information filtering in future training pipelines, and enhancing transparency about its data practices.

The decision has broader significance for the industry. It establishes that the "public availability" of data on the internet does not constitute permission for commercial AI training under Canadian privacy law, a principle with implications for every AI company that trained on public web data without explicit consent mechanisms. Other jurisdictions (the EU, the UK, Australia, and South Korea) are watching Canadian and EU regulatory outcomes as they develop their own frameworks.
