Pix credit here (Bethesda Softworks 1996)
I was delighted to have had the opportunity to present a series of Lectures hosted by the East China University of Political Science and Law (ECUPL) at the end of May. My thanks to my hosts and especially to Sun Yuhua for organizing these opportunities to engage with colleagues. The overall theme (and thus the title) of the lectures was AI Governance in Comparative Perspective, Theory and Practice: China, U.S. and E.U., With a Sideways Glance at the U.N. I introduced the Lectures in an earlier post (Introduction; AI Governance in Comparative Perspective, Theory and Practice: China, U.S. and E.U.-- Lectures at the East China University of Political Science and Law (May 2026)).
The subject of the lectures requires little by way of introduction. Artificial intelligence is the broad term that has come to represent a growing cluster of non-human, digitalized processes and operations whose primary task is the constitution of non-human systems capable of performing tasks once thought to require human intelligence. Equally familiar is the impulse to manage, control, exploit, embed, understand, and regulate these processes, systems, and perhaps eventually a non-human consciousness, all of which have a huge potential to undertake many of the computational tasks (the mathematical and logical processing of data) that were once the sole domain of humans and perhaps defined what it meant to be human. That is the point where things get interesting. Human collectives and the machine-systems they have created now find themselves at the point where the development of machines capable of performing tasks once thought to require human intelligence collides with the regulatory structures, traditional or emerging, public or private, meant to manage, contain, constrain, liberate, embed, project, and exploit such systems.
The eight lectures progress sequentially from conceptual and theoretical frameworks (Lectures 1 and 2, the objects and subjects of AI regulation), through a deeper consideration of regulatory systems in three distinguishable regulatory regimes--the US, EU, and China (Lectures 3, 4, and 5). The next two lectures consider judicial efforts to embed AI within traditional legal orders (Lecture 6), and the way in which the object of regulation (in the form of the owners of the larger AI enterprises) understands the relationship between AI, the state, and society (Lecture 7). Lecture 8 summarizes and draws out larger themes going forward.
This post includes a summary of the Lecture 1 Notes, as well as the Lecture 1 PPT. Those interested may reach out to me to discuss availability of audio of the lecture and the full text of the Lecture 1 notes.
Given the nature of the project I thought it might be useful to engage with a commercially available AI service for the production of a summary of the Lecture 1 materials. After some back and forth with Google's Gemini, we came up with the following abstract of Lecture 1.
Lecture 1 Abstract
Artificial intelligence is not a singular technology but a complex, multi-layered socio-technical stack comprising data, models, optimization processes, hardware, human labor, and institutional governance. In public discourse, the term "AI" collapses these diverse layers into a single concept—a semiotic instability that presents a severe challenge for legal regulation. Effective AI governance begins with precise technical classification. Regulators cannot intelligently assign liability, duties, or rights without distinguishing between nested technical concepts—moving from the broad field of artificial intelligence to statistical machine learning, deep learning neural networks, and generative AI. Furthermore, governance must pinpoint exactly where law intersects with the technology, determining whether regulation targets a core mathematical algorithm, a trained model, an operational system, or a commercially deployed product.
Modern AI fundamentally diverges from traditional software because it is data-driven, probabilistic, learned, scalable, and institutionally embedded. Rather than executing explicit, hand-coded rules, modern models learn statistical associations from historical data. This shift creates distinct legal frictions: data quality and collection methods raise privacy and intellectual property concerns; probabilistic outputs clash with administrative demands for explicit reasoning; and the self-learned nature of deep learning creates algorithmic opacity, where even developers cannot fully interpret internal model representations.
Regulation must necessarily function around and within what might be understood as the matrix of modern AI: modalities, components, and dialectics. To fully map this technical object, governance must evaluate AI as a three-dimensional matrix defined by its functional modalities, structural components, and a core linguistic dialectic.
The first axis consists of functionally differentiated modalities, which span from primitive rule-based systems to multi-layered artificial neural networks, deep learning computer vision, and large language models (LLMs). Each modality processes information differently and introduces unique regulatory surface areas—whether it is the rigid, discriminatory potential of an explicit rule or the unpredictable, generative risks of an LLM.
The second axis maps the physical and mathematical system components that animate these modalities:
* Data: The foundational social artifact and raw material.
* Values: Human choices embedded during pre-training, parameter tuning, and data labeling.
* Weights: The internal, numerical parameters within a neural network that encode statistical patterns.
* Processes: The continuous computational workflows (such as optimization, backpropagation, and inference) that transform static code into dynamic behavior.
Binding this entire matrix together is a profound dialectic between human coding and machine language. Traditional software relies on human-written, imperative instructions that dictate exact logical pathways. Modern AI, however, shifts the human role to setting high-level frameworks (architectures, loss functions, and training boundaries). The system then computes its own "machine language": an abstract, multi-dimensional vector space of embeddings and weights that humans cannot read line-by-line.
This creates a constant tension: humans try to impose legal, ethical, and operational constraints using natural language, while the underlying technology executes via statistical optimization. This translation gap between human intent and emergent machine capability, between cognition and computation, is the ultimate challenge of modern AI governance.
Because these highly scalable systems are embedded within core societal institutions—allocating resources, credit, and power—technical risks inevitably transform into broad political and legal challenges. This governance dilemma is further complicated by the historical transition from brittle, rule-based Symbolic AI to general-purpose Foundation Models. Powered by the transformer architecture, modern foundation models can be adapted to endless downstream tasks, distributing legal responsibility across original developers, commercial deployers, and end-users. Mitigating these systemic risks requires mapping the entire machine-learning pipeline as a continuous, non-neutral process. Human judgment and institutional bias shape the pipeline long before a model is trained—specifically during data collection, preprocessing, and the assignment of subjective cultural labels. During training, optimization algorithms iteratively adjust internal parameters to minimize a loss function, yet standard post-training benchmarks often mask performance disparities among sub-populations. Finally, the deployment phase introduces inference-level risks, including data drift, security manipulation, and user-input privacy violations.
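The point about benchmarks masking disparities among sub-populations can be made concrete with a small Python sketch. Everything in it is an assumption made for illustration: the two subgroups, the record counts, and the helper names `accuracy` and `accuracy_by_group` are invented and are not drawn from any real evaluation.

```python
# Hypothetical evaluation records: (subgroup, true_label, predicted_label).
# All counts are invented for illustration only.
records = (
    [("A", 1, 1)] * 87 + [("A", 1, 0)] * 3    # group A: 87 of 90 predictions correct
    + [("B", 1, 1)] * 4 + [("B", 1, 0)] * 6   # group B: 4 of 10 predictions correct
)


def accuracy(rows):
    """Share of rows where the prediction matches the true label."""
    return sum(1 for _, truth, pred in rows if truth == pred) / len(rows)


def accuracy_by_group(rows):
    """Accuracy computed separately for each subgroup."""
    groups = {}
    for row in rows:
        groups.setdefault(row[0], []).append(row)
    return {name: accuracy(members) for name, members in groups.items()}


overall = accuracy(records)
per_group = accuracy_by_group(records)

print(f"overall: {overall:.2f}")
for name, value in sorted(per_group.items()):
    print(f"group {name}: {value:.2f}")
```

On these invented numbers the aggregate figure is 91 percent while group B sits at 40 percent, which is the sort of gap a single headline score would not reveal.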
It might follow that AI cannot be regulated as an abstract, stable entity. It is an evolving process that stretches from the initial transformation of the world into data through to real-time institutional deployment. Ultimately, comparative AI governance is a geopolitical contest over how this technical object is legally constructed. While the United States constructs AI through market innovation and national competitiveness, and the European Union frames it through product safety and fundamental rights, China regulates it through the lens of socialist modernization and state public opinion management. To navigate these conflicting regimes, legal and administrative frameworks must move past superficial definitions and directly govern the specific technical layers, pipeline choices, and institutional realities that make modern AI what it is.
Links to Lectures:
Lecture 0 -- Introduction
Lecture 1—From Algorithms to Foundation Models: What Contemporary AI is “Made of”
Lecture 2—What Are We Actually Governing When We Govern AI?
Lecture 3—The “Markets State”: U.S. Approach
Lecture 4—The “Rights State”: EU Approach
Lecture 5—The “Guided State”: The Chinese Approach
Lecture 6—Courts, Companies, and the Legal Construction of AI
Lecture 7—AI Narratives: Palantir; Anthropic; Open AI; and Leopold Aschenbrenner
Lecture 8—Putting It All Together: Trends, Trend Lines, and Regulatory Dialectics
Lecture Notes Summary
Lecture 1—From Algorithms to Foundation Models: What Contemporary AI is “Made of”
An effective framework for AI governance might start with an analytical shift away from normative judgments about whether specific technical capacities are "good" or "bad" as starting points. Instead, the focus might be rationalized around a structural classification of the objects of regulation and also on the disaggregation of the technical stack to identify exactly what can and will be the object of regulation.
The following summary provides a neutral, analytical breakdown of the text, organized by its original headings, focusing on how the technology is categorized for regulatory oversight.
Overview
Artificial intelligence is not a singular, uniform object. It is a layered socio-technical stack comprising multiple distinct layers: data, models, optimization processes, hardware infrastructure, deployment interfaces, human labor, institutional incentives, and governance choices.
For governance, treating AI as a single, abstract term is unworkable. A legal system cannot regulate AI effectively without specifying the exact object of regulation: is it data, code, a model, an interface, a platform, an institutional decision, a specific harm, or physical infrastructure?
AI governance therefore begins with classification. Different geopolitical blocs construct the object of regulation based on distinct administrative, economic, and political strategies:
The United States constructs AI as an object of markets, innovation, standards, procurement, sectoral enforcement, and national competitiveness.
The European Union constructs AI as an object of risk, product safety, fundamental rights, and supervisory administration.
The People’s Republic of China constructs AI as an object of socialist modernization, platform governance, data security, public opinion management, and national development.
Position in the Series
This lecture establishes the technical and conceptual map necessary to analyze subsequent regulatory approaches. To understand a "market-led" approach (US), a "risk-based" approach (EU), or a "platform-control" approach (China), one must first map where legal duties, liabilities, and oversight mechanisms attach within the underlying technical object. The comparative analysis of AI governance is a contest over how AI is constructed as an object of law, economics, politics, and administration.
Opening Problem: AI Is a Bad Regulatory Object
The term "AI" is historically unstable and strategically defined by different actors depending on their institutional goals:
Commercial Developers apply broad definitions for marketing (to signal value and innovation) but shift to narrow definitions when regulation appears (to reduce legal obligations).
Regulators favor broad definitions to ensure risky automated systems do not escape oversight.
States define AI strategically to justify industrial policy, infrastructure investments, or national security export controls.
Civil Rights Advocates focus the definition tightly on automated decision-making systems where discrimination or exclusion manifests.
Because law depends on precise classification, a legal duty must attach to a stable, identifiable entity: an actor, an action, a product, a process, or an infrastructure layer. The term AI is useful for public discourse but too imprecise for legal text, much less for policy discussion. Effective governance requires disaggregating the term in two ways. The first is horizontal--AI as a cover for a number of functionally differentiated systems. The second is vertical--AI as a cover for the elements, the components of each AI system.
Initial Distinctions: AI, Machine Learning, Deep Learning, Generative AI, and AI Governance
To regulate the technology, its structural layers must be categorized as a series of nested technical hierarchies:
Artificial Intelligence (AI): The broad field concerned with building systems that perform tasks associated with human intelligence (perception, classification, prediction, planning, language use, decision support, recommendation, or autonomous action).
Machine Learning (ML): A narrower technical approach within AI where systems improve performance by learning patterns from data rather than being explicitly programmed for every rule. The object of control shifts from hardcoded instructions to model architectures, training procedures, and learning objectives.
Deep Learning: A subset of machine learning utilizing multi-layered artificial neural networks to learn internal representations of data. It shifts capabilities away from human-designed features but increases technical opacity.
Generative AI: A specific family of deep learning systems that create new outputs (text, code, images, audio, video, synthetic data) rather than merely classifying or scoring existing data.
AI Governance: The complete set of rules, institutions, standards, norms, technical controls, and organizational practices used to shape the development, deployment, and effects of these systems. It encompasses formal law, technical standards, procurement audits, model evaluations, liability regimes, and physical infrastructure controls over semiconductors and cloud access.
Algorithm, Model, System, and Product
A critical disaggregation for targeting legal liability is the distinction between an algorithm, a model, a system, and a product:
Algorithm: A step-by-step computational procedure for solving a problem or producing an output (e.g., a sorting procedure or a formula used to update weights during training). Algorithms do not inherently learn or adapt.
Model: A learned representation of relationships in data containing internal numerical values (parameters) that encode statistical patterns discovered during training. Its behavior emerges from the interaction of data, architecture, and optimization objectives.
AI System: The larger technical arrangement that includes the model alongside its data pipelines, input/output interfaces, preprocessing tools, monitoring systems, deployment infrastructure, and human operational protocols. Decisions are executed at the system layer, not by the model in isolation.
AI Product or Service: The commercialized or institutionally deployed version of the system encountered by the end-user (e.g., a specific chatbot interface, a recommendation feed, or an automated hiring-screening system). It often contains multiple models, external databases, retrieval mechanisms, and content filters.
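The first three layers of this distinction can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a description of any real system: the names `gradient_step`, `Model`, `train`, and `ScoringSystem`, the toy data, and the threshold are all hypothetical. The algorithm is a fixed update procedure, the model is the pair of numbers that procedure produces from data, and the system wraps the model with input handling, a decision threshold, and a log.

```python
from dataclasses import dataclass, field


def gradient_step(weight, bias, xs, ys, learning_rate=0.1):
    """ALGORITHM: one gradient-descent update for a straight-line fit.

    A fixed procedure; it does not itself learn anything.
    """
    n = len(xs)
    errors = [weight * x + bias - y for x, y in zip(xs, ys)]
    grad_w = sum(2 * e * x for e, x in zip(errors, xs)) / n
    grad_b = sum(2 * e for e in errors) / n
    return weight - learning_rate * grad_w, bias - learning_rate * grad_b


@dataclass
class Model:
    """MODEL: the learned parameters produced by running the algorithm on data."""
    weight: float = 0.0
    bias: float = 0.0

    def predict(self, x):
        return self.weight * x + self.bias


def train(xs, ys, steps=2000):
    """Apply the algorithm repeatedly; the result is a model."""
    model = Model()
    for _ in range(steps):
        model.weight, model.bias = gradient_step(model.weight, model.bias, xs, ys)
    return model


@dataclass
class ScoringSystem:
    """SYSTEM: the model plus input handling, a decision threshold, and a log."""
    model: Model
    threshold: float
    log: list = field(default_factory=list)

    def decide(self, raw_input):
        x = float(raw_input.strip())            # preprocessing of the raw input
        score = self.model.predict(x)
        decision = "approve" if score >= self.threshold else "refer to human"
        self.log.append((x, score, decision))   # record kept for later audit
        return decision


# Toy training data following y = 2x + 1 (invented for illustration).
model = train([0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 5.0, 7.0])
system = ScoringSystem(model=model, threshold=5.0)
print(system.decide(" 3 "), "|", system.decide("1"))
```

The outcome a person experiences is produced in `decide`, where the threshold and the logging live, rather than in the model's arithmetic alone. A product or service would add a further layer on top (an interface, terms of use, content filters) that this sketch leaves out.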
Different laws target different layers of this breakdown.
What Makes Modern AI Modern?
Modern AI systems differ structurally from traditional software across five core technical characteristics, altering how they must be treated as objects of oversight:
Data-Driven: Traditional software operates on explicit, developer-written rules. Modern AI learns statistical associations from historical data. Consequently, data quality, data provenance, intellectual property rights, and data annotation labor become primary sites of legal and regulatory intervention.
Probabilistic: Modern AI systems generate probabilities, scores, rankings, and statistical predictions rather than deterministic outputs. This creates a fundamental tension with administrative and legal frameworks that demand explicit, deterministic reasons for decisions (e.g., the denial of a loan, visa, or benefit).
Learned Internal Behavior: The internal parameters of a model emerge dynamically through training. This creates structural opacity, which can be technical (mathematical complexity), organizational (proprietary trade secrets), practical (the scale of the model), or epistemic (developers not understanding how internal representations generate specific outputs).
Highly Scalable: A single foundational model can be integrated via APIs into thousands of downstream applications or distributed instantly to millions of users. A localized error or bias can scale rapidly into a systemic, population-wide failure mode.
Institutionally Embedded: AI is integrated directly into existing decision-making chains within banks, hospitals, police departments, and platforms. Regulating the tool requires regulating how it functions within these institutional resource-allocation processes.
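The contrast between explicit rules and learned, probabilistic outputs can be illustrated with a short Python sketch. The lending scenario, the rule, the parameter values, and the function names are all hypothetical and chosen only for illustration. The parameters stand in for values a training run might produce; they are not taken from any real model.

```python
import math


def rule_based_decision(income, debt):
    """Explicit, hand-written rule: the stated reason for the outcome is the rule itself."""
    return "approve" if income >= 3 * debt else "deny"


# Hypothetical parameters standing in for values a training run might produce.
WEIGHT_INCOME, WEIGHT_DEBT, BIAS = 0.8, -1.5, -0.5


def learned_score(income, debt):
    """Logistic score: a number between 0 and 1 rather than a yes or no."""
    z = WEIGHT_INCOME * income + WEIGHT_DEBT * debt + BIAS
    return 1.0 / (1.0 + math.exp(-z))


def score_to_decision(score, threshold):
    """The threshold is a policy choice made outside the model."""
    return "approve" if score >= threshold else "deny"


applicant = {"income": 2.0, "debt": 1.0}   # arbitrary units, invented
score = learned_score(**applicant)

print(rule_based_decision(**applicant))
print(round(score, 3), score_to_decision(score, 0.5), score_to_decision(score, 0.4))
```

The same applicant and the same score lead to different outcomes depending on where the threshold is set. The "reason" for a denial is therefore spread across the learned parameters and an institutional choice, neither of which reads like the single explicit rule that administrative law tends to expect.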
Historical Layers of AI
The technological evolution of AI can be categorized into four historical layers, each presenting unique characteristics for technical auditing and compliance:
1. Symbolic AI
Based on explicit rules, formal logic, and hardcoded representations (e.g., expert systems).
Regulatory Characteristic: Highly inspectable and auditable because the logic trees are transparent. However, it is technically brittle and fails when applied to unstructured, context-dependent real-world data.
2. Statistical Machine Learning
Shifts from explicit instructions to patterns learned from examples.
Regulatory Characteristic: Auditing shifts from code inspection to data validation, requiring regulators to evaluate labels, features, evaluation methods, and training data distributions.
3. Deep Learning
Utilizes multi-layered artificial neural networks to automatically learn internal representations of data, accelerated by large datasets, increased compute power (GPUs), and architectural advancements.
Regulatory Characteristic: Introduces deep opacity due to the interaction of billions of internal parameters, making post-hoc explanation technically difficult and making systems vulnerable to adversarial shifts.
4. Foundation Models
Large, general-purpose models (such as large language models built on the transformer architecture's attention mechanisms) trained on broad data and adaptable to a vast array of downstream tasks.
Regulatory Characteristic: Disrupts traditional product-liability frameworks because the primary developer cannot foresee all downstream applications, and downstream deployers lack access to or understanding of the underlying training data and infrastructure.
The Machine-Learning Pipeline
To design precise policy interventions, the lifecycle of AI development must be disaggregated into sequential stages, each serving as a distinct touchpoint for governance:
Data Collection: Capturing real-world phenomena and translating them into data points. This stage defines what is observed, what is ignored, and what categories are used.
Cleaning and Preprocessing: Filtering, standardizing, and normalizing raw inputs. Human technical choices at this stage determine what constitutes "noise" or an "outlier."
Labeling and Annotation: Assigning target answers to inputs (e.g., classifying text as "hate speech" or "dissent"). This stage explicitly encodes institutional assumptions, cultural judgments, and legal classifications into the technical object.
Data Splitting: Partitioning datasets into training data (to fit parameters), validation data (to tune hyperparameters during development), and test data (held out to evaluate final generalization capability on unseen examples).
Model Training and Optimization: The iterative adjustment of parameters (internal weights and biases) to minimize a loss function (the mathematical measure of model error). This is driven by optimization algorithms such as gradient descent (repeatedly nudging parameters in the direction that locally reduces the loss), with the required gradients computed by backpropagation (passing error signals backward through network layers). Human design choices enter through hyperparameters (learning rates, layer counts, batch sizes).
Evaluation and Benchmarking: Assessing the model using specific metrics (accuracy, precision, recall, F1 scores) or public benchmarks. Governance challenges here center on whether benchmarks accurately replicate real-world deployment risks or obscure subgroup failures under high average scores.
Deployment and Inference: Placing the validated system into an active technical environment where it runs inference (processing live, unlabeled user inputs to generate real-time outputs). This splits governance into pre-deployment training rules (intellectual property, data provenance, compute audits) and post-deployment inference rules (data privacy, output monitoring, user appeals, and liability for harms).
Monitoring and Data Drift: Oversight of the system post-deployment. Because real-world environments, user behaviors, and economic conditions change, models experience data drift (a shift in the distribution of live inputs away from the training data, which degrades performance), requiring clear protocols for retraining, restriction, or retirement.
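Two of the stages above, data splitting and drift monitoring, lend themselves to a brief Python sketch. It is a simplified illustration under stated assumptions: the split fractions, the function names `split_dataset` and `drift_score`, and the sample values are hypothetical. Comparing means is only one crude way to flag drift, and practical monitoring would usually look at more than a single summary statistic.

```python
import random
import statistics


def split_dataset(records, train_frac=0.7, val_frac=0.15, seed=0):
    """Shuffle once, then cut into training, validation, and held-out test sets."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)      # fixed seed keeps the split reproducible
    n_train = round(len(shuffled) * train_frac)
    n_val = round(len(shuffled) * val_frac)
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]          # everything left over is held out
    return train, val, test


def drift_score(train_values, live_values):
    """Distance of the live mean from the training mean, in training standard deviations."""
    mu = statistics.mean(train_values)
    sigma = statistics.stdev(train_values)
    return abs(statistics.mean(live_values) - mu) / sigma


records = list(range(100))                     # stand-in for 100 labeled examples
train_set, val_set, test_set = split_dataset(records)
print(len(train_set), len(val_set), len(test_set))

# Invented values of one input feature at training time and after deployment.
training_feature = [10.0, 12.0, 11.0, 13.0, 9.0]
live_feature = [20.0, 22.0, 21.0]
print(drift_score(training_feature, live_feature) > 3.0)
```

Where the cut-off for "too much drift" sits (3.0 here, an arbitrary choice) is itself a governance decision, since it determines when retraining, restriction, or retirement is triggered.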
Features, Representations, and Embeddings
The final layer of disaggregation concerns how information is structured and processed mathematically inside an AI model:
Features: The individual input variables utilized by a model. In traditional machine learning, these are explicitly designed by humans (e.g., income level, age, specific lab values). In deep learning, the system extracts its own internal representations of these variables.
Embeddings: The primary mathematical mechanism for deep learning representation, where discrete data points (words, images, user profiles, videos) are translated into high-dimensional numerical vectors.
Embeddings place items with contextual or semantic similarity close together within a mathematical vector space.
For regulation, embeddings are highly relevant objects of analysis because they formalize and lock in historical relationships. If the underlying training data contains structural inequalities or demographic biases, the embedding process translates those social patterns into mathematical proximity. Downstream applications optimizing for statistical similarity will systematically replicate those historical patterns under a veneer of mathematical neutrality.
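The idea that social patterns become mathematical proximity can be shown with a toy Python example. The three-dimensional vectors below are invented purely for illustration, whereas real embeddings are learned from data and typically have hundreds or thousands of dimensions. Cosine similarity is used here as one common measure of closeness in a vector space.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors (1.0 means the same direction)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


# Invented vectors that mimic a pattern a model could absorb from skewed text.
embeddings = {
    "he": [1.0, 0.0, 0.2],
    "she": [0.0, 1.0, 0.2],
    "engineer": [0.9, 0.1, 0.5],
    "nurse": [0.1, 0.9, 0.5],
}

for job in ("engineer", "nurse"):
    to_he = cosine_similarity(embeddings[job], embeddings["he"])
    to_she = cosine_similarity(embeddings[job], embeddings["she"])
    print(f"{job}: similarity to 'he' = {to_he:.2f}, to 'she' = {to_she:.2f}")
```

In this toy space "engineer" sits closer to "he" and "nurse" closer to "she". A downstream application that ranks candidates or completes sentences by vector similarity would inherit that association without any explicit rule stating it.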

