Wednesday, June 19, 2024

Shaping Algorithmic Governance in an Era of "High Quality Development": 孙凝晖《人工智能与智能计算的发展》 Sun Ninghui, "The Development of Artificial Intelligence and Intelligent Computing" (special lecture of the 14th National People's Congress Standing Committee, May 2024)

  


"It can be seen that the United States prefers the virtual economy with a higher rate of return and despises the real economy with high investment costs and low economic returns. China tends to develop the real economy and the virtual economy simultaneously, and pays more attention to the development of equipment manufacturing, new energy vehicles, photovoltaic power generation, lithium batteries, high-speed rail, 5G and other real economies."  [可见美国更倾向于回报率更高的虚拟经济,轻视投资成本高且经济回报率低的实体经济。中国倾向于实体经济与虚拟经济同步发展,更加重视发展装备制造、新能源汽车、光伏发电、锂电池、高铁、5G等实体经济。 ] (孙凝晖《人工智能与智能计算的发展》)

In a quite interesting lecture delivered at the 14th National People's Congress Standing Committee, "The Development of Artificial Intelligence and Intelligent Computing" [人工智能与智能计算的发展], Sun Ninghui [孙凝晖] suggested the contours and approaches that may be shaping Chinese perceptions and conversations about the future--and the future management--of next generation tech, including big data tech, generative intelligence, and descriptive and predictive analytics. The lecture is worth reading in its entirety and follows below in the original Chinese and in a crude English translation.

It may be worth foregrounding a number of quite interesting points made and perspectives embraced:

1. The economic-policy role of AI and related tech ought not to be underestimated in the Chinese policy universe. "General Secretary Xi Jinping pointed out that the new generation of artificial intelligence should be used as a driving force to promote the leapfrog development of science and technology, the optimization and upgrading of industries, and the overall leap of productivity, and strive to achieve high-quality development." [ 习近平总书记指出,把新一代人工智能作为推动科技跨越发展、产业优化升级、生产力整体跃升的驱动力量,努力实现高质量发展。 ] (人工智能与智能计算的发展). It is a centerpiece of the vanguard's evolution of the concept and application of "high quality development" of productive forces and thus of the New Era forms of Socialist modernization. It also has a double use--its internal success will produce external benefits--both along the Chinese Belt & Road, and in its engagement with its primary "big country" rival--the United States. It is the form of development, then, that becomes key to the evolution of internal and external policies.

2. Part of the development of the approach to policy going forward is the nature of the understanding of the history of the technology that is now at its core. Sun Ninghui [孙凝晖] advances a version of that history that is important as foundation for what is to come. He speaks here to the organization of tech in and as platforms--something quite useful, which he divides into five types ("five types of successful platform computing systems have been formed" [ 已经形成了五类成功的平台型计算系统 ]). Also important is Sun's division of the "computer age" into three distinct phases with substantial semiotic dimensions. IT1.0 was the era of machines--that is of semiotic objects or "firstness" in which the object could be signified within separable vessels. IT2.0 was the era of the network--that is of semiotic "secondness" or signification in which machine collectives could be arranged in ways that made possible common or collective signification and meaning making. IT3.0, Sun argues, is the era of intelligent computing--the point of semiotic "thirdness" or interpretation where the networked machine is transformed from a vessel to its operation, or in contemporary language, where it can achieve a measure of autonomy. The phenomenological approach is intimately connected with New Era dialectics, one first tested with "social credit" systems in China and the US. Its essence infuses the analysis and recommendations that follow.

3. However, to get to the point of intelligent analytics for policy under the coded premises of New Era Marxist Leninism, it is first necessary to try to understand the essence of generative intelligence, at least at a basic level (my take here). Its essence for Sun is tied to a core Leninist principle--the power of collectivization. In this case collectivization is focused on data, that is on intelligence or relevant data silos, each fitted to its functionally differentiated objectives. "The characteristic of the large model is to win by "big", which has three meanings: (1) large parameters, GPT-3 has 170 billion parameters; (2) large training data, ChatGPT uses about 300 billion words and 570GB training data; (3) large computing power requirements, GPT-3 uses upwards of ten thousand V100 GPUs for training." [ 大模型的特点是以“大”取胜,其中有三层含义,(1)参数大,GPT-3就有1700亿个参数;(2)训练数据大,ChatGPT大约用了3000亿个单词,570GB训练数据;(3)算力需求大,GPT-3大约用了上万块V100 GPU进行训练。 ] (人工智能与智能计算的发展). It is only from that conceptual-structural baseline that it is possible to understand the ramifications for politics, policy, and human (economic) relations. From this conceptual baseline, Sun posits AI development in five parts: (1) development of multimodal large models (to mimic the five human senses); (2) development of video generation large models (the consequences for the constitution of virtual spaces remain unclear but significant); (3) the development of embodied intelligence (though here the Japanese may have the current edge) that makes generative AI mobile, like humans; (4) the support of AI4R (AI for Research) (as a matter of public guidance in China and of private capacity in the US with serendipitous state support); and (5) consideration of the quite sensitive issue of AI consciousness as a critical task, though one whose very conceptualization remains elusive and phantasmagorical (again my take here).
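
To give a rough sense of what "big" means in the passage quoted in point 3, the following is a back-of-envelope sketch in Python, not anything taken from the lecture. It uses the parameter and training-data figures exactly as Sun quotes them. Two things are my own assumptions and are labeled as such in the code: that weights are stored at 2 bytes each (16-bit precision), and that training compute can be approximated with the common "6 x parameters x tokens" rule of thumb, treating the quoted "words" as tokens. The function names are hypothetical.

```python
# Back-of-envelope scale estimates from the figures quoted in the lecture.
# ASSUMPTIONS (not from the lecture): 16-bit weights (2 bytes per parameter)
# and the widely used 6 * N * D approximation for training compute, with the
# quoted "words" treated as tokens.

PARAMS = 170e9          # parameters, as quoted in the lecture (1700亿)
TRAIN_UNITS = 300e9     # "words" as quoted; treated as tokens here (assumption)
BYTES_PER_PARAM = 2     # assumption: 16-bit weights


def weight_storage_gb(params: float, bytes_per_param: int = BYTES_PER_PARAM) -> float:
    """Storage needed just to hold the weights, in gigabytes (10^9 bytes)."""
    return params * bytes_per_param / 1e9


def training_flops(params: float, tokens: float) -> float:
    """Approximate total training compute using the 6 * N * D heuristic."""
    return 6 * params * tokens


if __name__ == "__main__":
    print(f"weights: ~{weight_storage_gb(PARAMS):.0f} GB")
    print(f"training compute: ~{training_flops(PARAMS, TRAIN_UNITS):.2e} FLOPs")
```

Under these assumptions the weights alone come to roughly 340 GB and the training run to something on the order of 3 x 10^23 floating point operations. The point is only to make the order of magnitude tangible; different precision or a different compute heuristic would shift the numbers.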

4. It is no surprise that one of the markers of China-US competition has been the effective abandonment of the core global ordering premises of convergence based on transparency and sharing, in favor of an order in which regional centers protect their advantages through expanding notions of security. That is not "news". What is important is the way in which one transposes signification on the "idea" of security. Sun foregrounds the following.

"First, there is the proliferation of false information on the Internet * * * Second, fake videos, especially fake leaders' videos, cause international disputes, disrupt election order, or cause sudden public opinion events, * * * [which] have led to a decline in social trust in the news media industry. * * * Third, fake news, mainly through the automatic generation of false news to make illegal profits * * * Fourth, face-changing and voice-changing are used for fraud. * * * Fifth, indecent images are generated, especially for public figures."  (人工智能与智能计算的发展)

These serve to define the universe of impacts that are deemed harmful and against which the power of the State must be deployed. All of these harms produce negative consequences not just on social collectives but also on AI development. Sun Ninghui identifies these as "credibility issues" [可信问题]. Interestingly these go to the political-normative character of data, an inherently semiotic perspective that starts from the premise that all data is inherently infused with the judgments and sensibilities that define it as data (relevant) and suggests the character of that relevance. Among these are

"(1) factual errors of "serious nonsense"; (2) using Western values to narrate and export political bias and wrong speech; (3) being easily misled and exporting wrong knowledge and harmful content; (4) data security issues are aggravated, and big models become traps for important sensitive data." [(1)“一本正经胡说八道”的事实性错误;(2)以西方价值观叙事,输出政治偏见和错误言论;(3)易被诱导,输出错误知识和有害内容;(4)数据安全问题加重,大模型成为重要敏感数据的诱捕器,ChatGPT将用户输入纳入训练数据库,用于改善ChatGPT,美方能够利用大模型获得公开渠道覆盖不到的中文语料,掌握我们自己都可能不掌握的“中国知识”。因此,迫切需要发展大模型安全监管技术与自己的可信大模型。] (人工智能与智能计算的发展)

These are, in fact, critically important considerations for a system sensitive to an objective of comprehensively infusing national life with Chinese characteristics (see my take here on an aspect of this sensibility and its consequences). This ideological security is then melded with traditional security concerns against both unpatriotic forces within the State and foreign elements outside of the state.

5. The solutions to the problems of security are quite interesting. One expects part of it--the advocacy of centralized law and norm making under the guidance and leadership of the political vanguard and aligned with New Era Marxist-Leninism and its application to the challenges of the current manifestation of the general contradiction. But there is also embedded in that something quite remarkable: "promote human-machine harmony and friendship" [促进人机和谐友好] (人工智能与智能计算的发展). The meaning and scope of this notion is well worth exploring. It may signal, if only tentatively, an opening to a quite distinctive approach to generative intelligence.

6. But these solutions cannot be undertaken in a vacuum; indeed they may well be shaped by the current state of relations with the United States, and most particularly with the American exercise of what Sun Ninghui correctly identified as the national security aspects of AI--but this time directed against China. That poses a set of dilemmas for the Chinese.

"Dilemma 1 is that the United States has long been in a leading position in AI core capabilities, and China is in tracking mode. * * * Dilemma 2 is that high-end computing power products are banned from sale, and high-end chip processes have been stuck for a long time. * * * Dilemma 3 is the weak domestic intelligent computing ecosystem and insufficient penetration of AI development frameworks. * * * Dilemma 4 is that the cost and threshold of AI application in the industry remain high." [困境一为美国在AI核心能力上长期处于领先地位,中国处于跟踪模式。* * *  困境二为高端算力产品禁售,高端芯片工艺长期被卡。* * * 困境三为国内智能计算生态孱弱,AI开发框架渗透率不足。* * * 困境四为AI应用于行业时成本、门槛居高不下。] (《人工智能与智能计算的发展》Section 4)

7. Out of these dilemmas three choices emerge that are worth pursuing. Each reflects an assessment of both internal and external constraints. These are worth careful study if only because some portion of the suggestions are likely to make their way up to the higher levels of the governing apparatus. Each is designed to take advantage of perceived weaknesses in the U.S. approach (or its blindness to threat) and to leverage Chinese strengths in ways that advance its core political and operational projects, the latter based both on the dual circulation strategy and the development of a new approach to the high quality development of productive forces in the New Era. "Choice 1: Unify the technical system and take the closed-source or open-source path? * * * Choice 2: Algorithm model or new infrastructure? * * * Choice 3: Does AI+ focus on empowering the virtual economy or the real economy?" [选择一:统一技术体系走闭源封闭,还是开源开放的道路?* * * 选择二:拼算法模型,还是拼新型基础设施?* * * 选择三:AI+着重赋能虚拟经济,还是发力实体经济?]. (《人工智能与智能计算的发展》Section 5). Of these, Choices 1 and 3 are quite interesting if only for the way they mirror approaches in other policy areas. Choice 1 consists of three steps. The first is to catch up and surpass the United States on its own AI turf, the second step is to then build a closed system internal to China and applied to critical sectors, and the third step is to develop a new open system for export, one meant to supplant the current dominance of the US system. This is a model with rough echoes in other sectors. Choice 3 is important because of its perceptive analysis of the cultural context in which economic policy is undertaken, especially in the way in which its development is understood. The core insight is set out in the opening quote and ought to be burned into policy consciousness, not necessarily for the truth of it but for the power of the belief in it that drives elites in both states.



中国储能网讯:中国人大网近日刊登孙凝晖在十四届全国人大常委会专题讲座上的讲稿《人工智能与智能计算的发展》,现将全文转载如下,让我们一同走进高深莫测的人工智能世界。

  委员长、各位副委员长、秘书长、各位委员:

  人工智能领域近年来正在迎来一场由生成式人工智能大模型引领的爆发式发展。2022年11月30日,OpenAI公司推出一款人工智能对话聊天机器人ChatGPT,其出色的自然语言生成能力引起了全世界范围的广泛关注,2个月突破1亿用户,国内外随即掀起了一场大模型浪潮,Gemini、文心一言、Copilot、LLaMA、SAM、SORA等各种大模型如雨后春笋般涌现,2022年也被誉为大模型元年。当前信息时代正加快进入智能计算的发展阶段,人工智能技术上的突破层出不穷,逐渐深入地赋能千行百业,推动人工智能与数据要素成为新质生产力的典型代表。习近平总书记指出,把新一代人工智能作为推动科技跨越发展、产业优化升级、生产力整体跃升的驱动力量,努力实现高质量发展。党的十八大以来,以习近平同志为核心的党中央高度重视智能经济发展,促进人工智能和实体经济深度融合,为高质量发展注入强劲动力。

  1 计算技术发展简介

  计算技术的发展历史大致可分为四个阶段,算盘的出现标志着人类进入第一代——机械计算时代,第二代——电子计算的标志是出现电子器件与电子计算机,互联网的出现使我们进入第三代——网络计算,当前人类社会正在进入第四阶段——智能计算。

  早期的计算装置是手动辅助计算装置和半自动计算装置,人类计算工具的历史是从公元1200年的中国算盘开始,随后出现了纳皮尔筹(1612年)和滚轮式加法器(1642年),到1672年第一台自动完成四则运算的计算装置——步进计算器诞生了。

  机械计算时期已经出现了现代计算机的一些基本概念。查尔斯∙巴贝奇(Charles Babbage)提出了差分机(1822年)与分析机(1834年)的设计构想,支持自动机械计算。这一时期,编程与程序的概念基本形成,编程的概念起源于雅卡尔提花机,通过打孔卡片控制印花图案,最终演变为通过计算指令的形式来存储所有数学计算步骤;人类历史的第一个程序员是诗人拜伦之女艾达(Ada),她为巴贝奇差分机编写了一组求解伯努利数列的计算指令,这套指令也是人类历史上第一套计算机算法程序,它将硬件和软件分离,第一次出现程序的概念。

  直到在二十世纪上半叶,出现了布尔代数(数学)、图灵机(计算模型) 、冯诺依曼体系结构(架构) 、晶体管(器件)这四个现代计算技术的科学基础。其中,布尔代数用来描述程序和硬件如CPU的底层逻辑;图灵机是一种通用的计算模型,将复杂任务转化为自动计算、不需人工干预的自动化过程;冯诺依曼体系结构提出了构造计算机的三个基本原则:采用二进制逻辑、程序存储执行、以及计算机由运算器、控制器、存储器、输入设备、输出设备这五个基本单元组成;晶体管是构成基本的逻辑电路和存储电路的半导体器件,是建造现代计算机之塔的“砖块”。基于以上科学基础,计算技术得以高速发展,形成规模庞大的产业。

  从1946年世界上第一台电子计算机ENIAC诞生到二十一世纪的今天,已经形成了五类成功的平台型计算系统。当前各领域各种类型的应用,都可以由这五类平台型计算装置支撑。第一类是高性能计算平台,解决了国家核心部门的科学与工程计算问题;第二类是企业计算平台,又称服务器,用于企业级的数据管理、事务处理,当前像百度、阿里和腾讯这些互联网公司的计算平台都属于这一类;第三类是个人电脑平台,以桌面应用的形式出现,人们通过桌面应用与个人电脑交互;第四类是智能手机,主要特点是移动便携,手机通过网络连接数据中心,以互联网应用为主,它们分布式地部署在数据中心和手机终端;第五类是嵌入式计算机,嵌入到工业装备和军事设备,通过实时的控制,保障在确定时间内完成特定任务。这五类装置几乎覆盖了我们信息社会的方方面面,长期以来人们追求的以智能计算应用为中心的第六类平台型计算系统尚未形成。

  现代计算技术的发展大致可以划分为三个时代。

  IT1.0又称电子计算时代(1950-1970),基本特征是以“机”为中心。计算技术的基本架构形成,随着集成电路工艺的进步,基本计算单元的尺度快速微缩,晶体管密度、计算性能和可靠性不断提升,计算机在科学工程计算、企业数据处理中得到了广泛应用。

  IT2.0又称网络计算时代(1980-2020),以“人”为中心。互联网将人使用的终端与后台的数据中心连接,互联网应用通过智能终端与人进行交互。以亚马逊等为代表的互联网公司提出了云计算的思想,将后台的算力封装成一个公共服务租借给第三方用户,形成了云计算与大数据产业。

  IT3.0又称智能计算时代,始于2020年,与IT2.0相比增加了“物”的概念,即物理世界的各种端侧设备,被数字化、网络化和智能化,实现“人-机-物”三元融合。智能计算时代,除了互联网以外,还有数据基础设施,支撑各类终端通过端边云实现万物互联,终端、物端、边缘、云都嵌入AI,提供与ChatGPT类似的大模型智能服务,最终实现有计算的地方就有AI智能。智能计算带来了巨量的数据、人工智能算法的突破和对算力的爆发性需求。

  2 智能计算发展简介

  智能计算包括人工智能技术与它的计算载体,大致历经了四个阶段,分别为通用计算装置、逻辑推理专家系统、深度学习计算系统、大模型计算系统。

  智能计算的起点是通用自动计算装置(1946年)。艾伦·图灵(Alan Turing)和冯·诺依曼(John von Neumann)等科学家,一开始都希望能够模拟人脑处理知识的过程,发明像人脑一样思考的机器,虽未能实现,但却解决了计算的自动化问题。通用自动计算装置的出现,也推动了1956年人工智能(AI)概念的诞生,此后所有人工智能技术的发展都是建立在新一代计算设备与更强的计算能力之上的。

  智能计算发展的第二阶段是逻辑推理专家系统(1990年)。E.A.费根鲍姆(Edward Albert Feigenbaum)等符号智能学派的科学家以逻辑和推理能力自动化为主要目标,提出了能够将知识符号进行逻辑推理的专家系统。人的先验知识以知识符号的形式进入计算机,使计算机能够在特定领域辅助人类进行一定的逻辑判断和决策,但专家系统严重依赖于手工生成的知识库或规则库。这类专家系统的典型代表是日本的五代机和我国863计划支持的306智能计算机主题,日本在逻辑专家系统中采取专用计算平台和Prolog这样的知识推理语言完成应用级推理任务;我国采取了与日本不同的技术路线,以通用计算平台为基础,将智能任务变成人工智能算法,将硬件和系统软件都接入通用计算平台,并催生了曙光、汉王、科大讯飞等一批骨干企业。

  符号计算系统的局限性在于其爆炸的计算时空复杂度,即符号计算系统只能解决线性增长问题,对于高维复杂空间问题是无法求解的,从而限制了能够处理问题的大小。同时因为符号计算系统是基于知识规则建立的,我们又无法对所有的常识用穷举法来进行枚举,它的应用范围就受到了很大的限制。随着第二次AI寒冬的到来,第一代智能计算机逐渐退出历史舞台。

  直到2014年左右,智能计算进阶到第三阶段——深度学习计算系统。以杰弗里·辛顿(Geoffrey Hinton)等为代表的连接智能学派,以学习能力自动化为目标,发明了深度学习等新AI算法。通过深度神经元网络的自动学习,大幅提升了模型统计归纳的能力,在模式识别①等应用效果上取得了巨大突破,某些场景的识别精度甚至超越了人类。以人脸识别为例,整个神经网络的训练过程相当于一个网络参数调整的过程,将大量的经过标注的人脸图片数据输入神经网络,然后进行网络间参数调整,让神经网络输出的结果的概率无限逼近真实结果。神经网络输出真实情况的概率越大,参数就越大,从而将知识和规则编码到网络参数中,这样只要数据足够多,就可以对各种大量的常识进行学习,通用性得到极大的提升。连接智能的应用更加广泛,包括语音识别、人脸识别、自动驾驶等。在计算载体方面,中国科学院计算技术研究所2013年提出了国际首个深度学习处理器架构,国际知名的硬件厂商英伟达(NVIDIA)持续发布了多款性能领先的通用GPU芯片,都是深度学习计算系统的典型代表。

  智能计算发展的第四阶段是大模型计算系统(2020年)。在人工智能大模型技术的推动下,智能计算迈向新的高度。2020年,AI从“小模型+判别式”转向“大模型+生成式”,从传统的人脸识别、目标检测、文本分类,升级到如今的文本生成、3D数字人生成、图像生成、语音生成、视频生成。大语言模型在对话系统领域的一个典型应用是OpenAI公司的ChatGPT,它采用预训练基座大语言模型GPT-3,引入3000亿单词的训练语料,相当于互联网上所有英语文字的总和。其基本原理是:通过给它一个输入,让它预测下一个单词来训练模型,通过大量训练提升预测精确度,最终达到向它询问一个问题,大模型产生一个答案,与人即时对话。在基座大模型的基础上,再给它一些提示词进行有监督的指令微调,通过人类的<指令,回复>对逐渐让模型学会如何与人进行多轮对话;最后,通过人为设计和自动生成的奖励函数来进行强化学习迭代,逐步实现大模型与人类价值观的对齐。

  大模型的特点是以“大”取胜,其中有三层含义,(1)参数大,GPT-3就有1700亿个参数;(2)训练数据大,ChatGPT大约用了3000亿个单词,570GB训练数据;(3)算力需求大,GPT-3大约用了上万块V100 GPU进行训练。为满足大模型对智能算力爆炸式增加的需求,国内外都在大规模建设耗资巨大的新型智算中心,英伟达公司也推出了采用256个H100芯片,150TB海量GPU内存等构成的大模型智能计算系统。

  大模型的出现带来了三个变革。

  一是技术上的规模定律(Scaling Law),即很多AI模型的精度在参数规模超过某个阈值后模型能力快速提升,其原因在科学界还不是非常清楚,有很大的争议。AI模型的性能与模型参数规模、数据集大小、算力总量三个变量成“对数线性关系”,因此可以通过增大模型的规模来不断提高模型的性能。目前最前沿的大模型GPT-4参数量已经达到了万亿到十万亿量级,并且仍在不断增长中;

  二是产业上算力需求爆炸式增长,千亿参数规模大模型的训练通常需要在数千乃至数万GPU卡上训练2-3个月时间,急剧增加的算力需求带动相关算力企业超高速发展,英伟达的市值接近两万亿美元,对于芯片企业以前从来没有发生过;

  三是社会上冲击劳动力市场,北京大学国家发展研究院与智联招聘联合发布的《AI大模型对我国劳动力市场潜在影响研究》报告指出,受影响最大的20个职业中财会、销售、文书位于前列,需要与人打交道并提供服务的体力劳动型工作,如人力资源、行政、后勤等反而相对更安全。

  人工智能的技术前沿将朝着以下四个方向发展。

  第一个前沿方向为多模态大模型。从人类视角出发,人类智能是天然多模态的,人拥有眼、耳、鼻、舌、身、嘴(语言),从AI视角出发,视觉,听觉等也都可以建模为token②的序列,可采取与大语言模型相同的方法进行学习,并进一步与语言中的语义进行对齐,实现多模态对齐的智能能力。

  第二个前沿方向为视频生成大模型。OpenAI于2024年2月15日发布文生视频模型SORA,将视频生成时长从几秒钟大幅提升到一分钟,且在分辨率、画面真实度、时序一致性等方面都有显著提升。SORA的最大意义是它具备了世界模型的基本特征,即人类观察世界并进一步预测世界的能力。世界模型是建立在理解世界的基本物理常识(如,水往低处流等)之上,然后观察并预测下一秒将要发生什么事件。虽然SORA要成为世界模型仍然存在很多问题,但可以认为SORA学会了画面想象力和分钟级未来预测能力,这是世界模型的基础特征。

  第三个前沿方向为具身智能。具身智能指有身体并支持与物理世界进行交互的智能体,如机器人、无人车等,通过多模态大模型处理多种传感数据输入,由大模型生成运动指令对智能体进行驱动,替代传统基于规则或者数学公式的运动驱动方式,实现虚拟和现实的深度融合。因此,具有具身智能的机器人,可以聚集人工智能的三大流派:以神经网络为代表的连接主义,以知识工程为代表的符号主义和控制论相关的行为主义,三大流派可以同时作用在一个智能体,这预期会带来新的技术突破。

  第四个前沿方向是AI4R(AI for Research)成为科学发现与技术发明的主要范式。当前科学发现主要依赖于实验和人脑智慧,由人类进行大胆猜想、小心求证,信息技术无论是计算和数据,都只是起到一些辅助和验证的作用。相较于人类,人工智能在记忆力、高维复杂、全视野、推理深度、猜想等方面具有较大优势,是否能以AI为主进行一些科学发现和技术发明,大幅提升人类科学发现的效率,比如主动发现物理学规律、预测蛋白质结构、设计高性能芯片、高效合成新药等。因为人工智能大模型具有全量数据,具备上帝视角,通过深度学习的能力,可以比人向前看更多步数,如能实现从推断(inference)到推理(reasoning)的跃升,人工智能模型就有潜力具备爱因斯坦一样的想象力和科学猜想能力,极大提升人类科学发现的效率,打破人类的认知边界。这才是真正的颠覆所在。

  最后,通用人工智能③(Artificial General Intelligence,简称AGI)是一个极具挑战的话题,极具争论性。曾经有一个哲学家和一个神经科学家打赌:25年后(即2023年)科研人员是否能够揭示大脑如何实现意识?当时关于意识有两个流派,一个叫集成信息理论,一个叫全局网络工作空间理论,前者认为意识是由大脑中特定类型神经元连接形成的“结构”,后者指出意识是当信息通过互连网络传播到大脑区域时产生的。2023年,人们通过六个独立实验室进行了对抗性实验,结果与两种理论均不完全匹配,哲学家赢了,神经科学家输了。通过这一场赌约,可以看出人们总是希望人工智能能够了解人类的认知和大脑的奥秘。从物理学的视角看,物理学是对宏观世界有了透彻理解后,从量子物理起步开启了对微观世界的理解。智能世界与物理世界一样,都是具有巨大复杂度的研究对象,AI大模型仍然是通过数据驱动等研究宏观世界的方法,提高机器的智能水平,对智能宏观世界理解并不够,直接到神经系统微观世界寻找答案是困难的。人工智能自诞生以来,一直承载着人类关于智能与意识的种种梦想与幻想,也激励着人们不断探索。

  3 人工智能的安全风险

  人工智能的发展促进了当今世界科技进步的同时,也带来了很多安全风险,要从技术与法规两方面加以应对。

  首先是互联网虚假信息泛滥。这里列举若干场景:

  一是数字分身。AI Yoon是首个使用 DeepFake 技术合成的官方“候选人”,这个数字人以韩国国民力量党候选人尹锡悦(Yoon Suk-yeol)为原型,借助尹锡悦 20 小时的音频和视频片段、以及其专门为研究人员录制的 3000 多个句子,由当地一家 DeepFake 技术公司创建了虚拟形象 AI Yoon,并在网络上迅速走红。实际上 AI Yoon 表达的内容是由竞选团队撰写的,而不是候选人本人。

  二是伪造视频,尤其是伪造领导人视频引起国际争端,扰乱选举秩序,或引起突发舆情事件,如伪造尼克松宣布第一次登月失败,伪造乌克兰总统泽连斯基宣布“投降”的信息,这些行为导致新闻媒体行业的社会信任衰退。

  三是伪造新闻,主要通过虚假新闻自动生成牟取非法利益,使用ChatGPT生成热点新闻,赚取流量,截至2023年6月30日全球生成伪造新闻网站已达277个,严重扰乱社会秩序。

  四是换脸变声,用于诈骗。如由于AI语音模仿了企业高管的声音,一家香港国际企业因此被骗3500万美元。

  五是生成不雅图片,特别是针对公众人物。如影视明星的色情视频制作,造成不良社会影响。因此,迫切需要发展互联网虚假信息的伪造检测技术。

  其次,AI大模型面临严重可信问题。这些问题包括:(1)“一本正经胡说八道”的事实性错误;(2)以西方价值观叙事,输出政治偏见和错误言论;(3)易被诱导,输出错误知识和有害内容;(4)数据安全问题加重,大模型成为重要敏感数据的诱捕器,ChatGPT将用户输入纳入训练数据库,用于改善ChatGPT,美方能够利用大模型获得公开渠道覆盖不到的中文语料,掌握我们自己都可能不掌握的“中国知识”。因此,迫切需要发展大模型安全监管技术与自己的可信大模型。

  除了技术手段外,人工智能安全保障需要相关立法工作。2021年科技部发布《新一代人工智能伦理规范》,2022年8月,全国信息安全标准化技术委员会发布《信息安全技术机器学习算法安全评估规范》,2022-2023年,中央网信办先后发布《互联网信息服务算法推荐管理规定》《互联网信息服务深度合成管理规定》《生成式人工智能服务管理办法》等。欧美国家也先后出台法规,2018年5月25日,欧盟出台《通用数据保护条例》,2022年10月4日,美国发布《人工智能权利法案蓝图》,2024年3月13日,欧洲议会通过了欧盟《人工智能法案》。

  我国应加快推进《人工智能法》出台,构建人工智能治理体系,确保人工智能的发展和应用遵循人类共同价值观,促进人机和谐友好;创造有利于人工智能技术研究、开发、应用的政策环境;建立合理披露机制和审计评估机制,理解人工智能机制原理和决策过程;明确人工智能系统的安全责任和问责机制,可追溯责任主体并补救;推动形成公平合理、开放包容的国际人工智能治理规则。

  4 中国智能计算发展困境

  人工智能技术与智能计算产业处于中美科技竞争的焦点,我国在过去几年虽然取得了很大的成绩,但依然面临诸多发展困境,特别是由美国的科技打压政策带来的困难。

  困境一为美国在AI核心能力上长期处于领先地位,中国处于跟踪模式。中国在AI高端人才数量、AI基础算法创新、AI底座大模型能力(大语言模型、文生图模型、文生视频模型)、底座大模型训练数据、底座大模型训练算力等,都与美国存在一定的差距,并且这种差距还将持续很长一段时间。

  困境二为高端算力产品禁售,高端芯片工艺长期被卡。A100,H100,B200等高端智算芯片对华禁售。华为、龙芯、寒武纪、曙光、海光等企业都进入实体清单,它们芯片制造的先进工艺④受限,国内可满足规模量产的工艺节点落后国际先进水平2-3代,核心算力芯片的性能落后国际先进水平2-3代。

  困境三为国内智能计算生态孱弱,AI开发框架渗透率不足。英伟达CUDA⑤(Compute Unified Device Architecture, 通用计算设备架构)生态完备,已形成了事实上的垄断。国内生态孱弱,具体表现在:一是研发人员不足,英伟达CUDA生态有近2万人开发,是国内所有智能芯片公司人员总和的20倍;二是开发工具不足,CUDA有550个SDK(Software Development Kit, 软件开发工具包),是国内相关企业的上百倍;三是资金投入不足,英伟达每年投入50亿美元,是国内相关公司的几十倍;四是AI开发框架TensorFlow占据工业类市场,PyTorch占据研究类市场,百度飞桨等国产AI开发框架的开发人员只有国外框架的1/10。更为严重的是国内企业之间山头林立,无法形成合力,从智能应用、开发框架、系统软件、智能芯片,虽然每层都有相关产品,但各层之间没有深度适配,无法形成一个有竞争力的技术体系。

  困境四为AI应用于行业时成本、门槛居高不下。当前我国AI应用主要集中在互联网行业和一些国防领域。AI技术推广应用于各行各业时,特别是从互联网行业迁移到非互联网行业,需要进行大量的定制工作,迁移难度大,单次使用成本高。最后,我国在AI领域的人才数量与实际需求相比也明显不足。

  5 中国如何发展智能计算的道路选择

  人工智能发展的道路选择对我国至关重要,关系到发展的可持续性与最终的国际竞争格局。当前人工智能的使用成本十分高昂,微软Copilot套件要支付每月10美元的使用费用,ChatGPT每天消耗50万千瓦时的电力,英伟达B200芯片价格高达3万美元以上。总体来说,我国应发展用得起、安全可信的人工智能技术,消除我国信息贫困人口、并造福“一带一路”国家;低门槛地赋能各行各业,让我国的优势产业保持竞争力,让相对落后的产业能够大幅地缩小差距。

  选择一:统一技术体系走闭源封闭,还是开源开放的道路?

  支撑智能计算产业的是一个相互紧耦合的技术体系,即由一系列技术标准和知识产权将材料、器件、工艺、芯片、整机、系统软件、应用软件等密切联系在一起的技术整体。我国发展智能计算技术体系存在三条道路:

  一是追赶兼容美国主导的A体系。我国大多数互联网企业走的是GPGPU/CUDA兼容道路,很多芯片领域的创业企业在生态构建上也是尽量与CUDA兼容,这条道路较为现实。由于在算力方面美国对我国工艺和芯片带宽的限制,在算法方面国内生态林立很难形成统一,生态成熟度严重受限,在数据方面中文高质量数据匮乏,这些因素会使得追赶者与领先者的差距很难缩小,一些时候还会进一步拉大。  

  二是构建专用封闭的B体系。在军事、气象、司法等专用领域构建企业封闭生态,基于国产成熟工艺生产芯片,相对于底座大模型更加关注特定领域垂直类大模型,训练大模型更多采用领域专有高质量数据等。这条道路易于形成完整可控的技术体系与生态,我国一些大型骨干企业走的是这条道路,它的缺点是封闭,无法凝聚国内大多数力量,也很难实现全球化。  

  三是全球共建开源开放的C体系。用开源打破生态垄断,降低企业拥有核心技术的门槛,让每个企业都能低成本地做自己的芯片,形成智能芯片的汪洋大海,满足无处不在的智能需求。用开放形成统一的技术体系,我国企业与全球化力量联合起来共建基于国际标准的统一智能计算软件栈。形成企业竞争前共享机制,共享高质量数据库,共享开源通用底座大模型。对于全球开源生态,我国企业在互联网时代收益良多,我国更多的是使用者,是参与者,在智能时代我国企业在RISC-V⑥+AI开源技术体系上应更多地成为主力贡献者,成为全球化开放共享的主导力量。

  选择二:拼算法模型,还是拼新型基础设施?  

  人工智能技术要赋能各行各业,具有典型的长尾效应⑦。我国80%的中小微企业,需要的是低门槛、低价格的智能服务。因此,我国智能计算产业必须建立在新的数据空间基础设施之上,其中关键是我国应率先实现智能要素即数据、算力、算法的全面基础设施化。这项工作可比肩二十世纪初美国信息高速公路计划(即信息基础设施建设)对互联网产业的历史作用。  

  信息社会最核心的生产力是网络空间(Cyberspace)。网络空间的演进过程是:从机器一元连接构成的计算空间,演进到人机信息二元连接构成的信息空间,再演进到人机物数据三元连接构成的数据空间。从数据空间看,人工智能的本质是数据的百炼成钢,大模型就是对互联网全量数据进行深度加工后的产物。在数字化时代,在互联网上传输的是信息流,是算力对数据进行粗加工后的结构化抽象;在智能时代,在互联网上传输的是智能流,是算力对数据进行深度加工与精炼后的模型化抽象。智能计算的一个核心特征就是用数值计算、数据分析、人工智能等算法,在算力池中加工海量数据件,得到智能模型,再嵌入到信息世界、物理世界的各个过程中。  

  我国政府已经前瞻性地提前布局了新型基础设施,在世界各国竞争中抢占了先机。

  首先,数据已成为国家战略信息资源。数据具有资源要素与价值加工两重属性,数据的资源要素属性包括生产、获取、传输、汇聚、流通、交易、权属、资产、安全等各个环节,我国应继续加大力度建设国家数据枢纽与数据流通基础设施。  

  其次,AI大模型就是数据空间的一类算法基础设施。以通用大模型为基座,构建大模型研发与应用的基础设施,支撑广大企业研发领域专用大模型,服务于机器人、无人驾驶、可穿戴设备、智能家居、智能安防等行业,覆盖长尾应用。  

  最后,全国一体化算力网建设在推动算力的基础设施化上发挥了先导作用。算力基础设施化的中国方案,应在大幅度降低算力使用成本和使用门槛的同时,为最广范围覆盖人群提供高通量、高品质的智能服务。算力基础设施的中国方案需要具备“两低一高”,即在供给侧,大幅度降低算力器件、算力设备、网络连接、数据获取、算法模型调用、电力消耗、运营维护、开发部署的总成本,让广大中小企业都消费得起高品质的算力服务,有积极性开发算力网应用;在消费侧,大幅度降低广大用户的算力使用门槛,面向大众的公共服务必须做到易获取、易使用,像水电一样即开即用,像编写网页一样轻松定制算力服务,开发算力网应用。在服务效率侧,中国的算力服务要实现低熵高通量,其中高通量是指在实现高并发⑧度服务的同时,端到端服务的响应时间可满足率高;低熵是指在高并发负载中出现资源无序竞争的情况下,保障系统通量不急剧下降。保障“算得多”对中国尤其重要。  

  选择三:AI+着重赋能虚拟经济,还是发力实体经济?  

  “AI+”的成效是人工智能价值的试金石。次贷危机后,美国制造业增加值占GDP的比重从1950年的28%降低为2021年的11%,美国制造业在全行业就业人数占比从1979年的35%降低为2022年的8%,可见美国更倾向于回报率更高的虚拟经济,轻视投资成本高且经济回报率低的实体经济。中国倾向于实体经济与虚拟经济同步发展,更加重视发展装备制造、新能源汽车、光伏发电、锂电池、高铁、5G等实体经济。  

  相应地美国AI主要应用于虚拟经济和IT基础工具,AI技术也是“脱实向虚”,自2007年以来硅谷不断炒作虚拟现实(Virtual Reality,VR)、元宇宙、区块链、Web3.0、深度学习、AI大模型等,是这个趋势的反映。  

  我国的优势在实体经济,制造业全球产业门类最齐全,体系最完整,特点是场景多、私有数据多。我国应精选若干行业加大投入,形成可低门槛全行业推广的范式,如选择装备制造业作为延续优势代表性行业,选择医药业作为快速缩短差距的代表性行业。赋能实体经济的技术难点是AI算法与物理机理的融合。

  人工智能技术成功的关键是能否让一个行业或一个产品的成本大幅下降,从而将用户数与产业规模扩大10倍,产生类似于蒸汽机对于纺织业,智能手机对于互联网业的变革效果。

  我国应走出适合自己的人工智能赋能实体经济的高质量发展道路。

  注释:  

  ①模式识别是指用计算的方法根据样本的特征将样本划分到一定的类别中去,是通过计算机用数学方法来研究模式的自动处理和判读,把环境与客体统称为“模式”。以图像处理与计算机视觉、语音语言信息处理、脑网络组、类脑智能等为主要研究方向。  

  ②Token可翻译为词元,指自然语言处理过程中用来表示单词或短语的符号。token可以是单个字符,也可以是多个字符组成的序列。  

  ③通用人工智能是指拥有与人类相当甚至超过人类智能的人工智能类型。通用人工智能不仅能像人类一样进行感知、理解、学习和推理等基础思维能力,还能在不同领域灵活应用、快速学习和创造性思考。通用人工智能的研究目标是寻求统一的理论框架来解释各种智能现象。  

  ④芯片制造工艺指制造CPU或GPU的制程,即晶体管门电路的尺寸,单位为纳米,目前国际上实现量产的最先进工艺以台积电的3nm为代表。更先进的制造工艺可以使CPU与GPU内部集成更多的晶体管,使处理器具有更多的功能以及更高的性能,面积更小,成本更低等。  

  ⑤CUDA是英伟达公司设计研发一种并行计算平台和编程模型,包含了CUDA指令集架构以及GPU内部的并行计算引擎。开发人员可以使用C语言来为CUDA架构编写程序,所编写出的程序可以在支持CUDA的处理器上以超高性能运行。  

  ⑥RISC-V(发音为“risk-five”)是一个由美国加州大学伯克利分校发起的开放通用指令集架构,相比于其他付费指令集,RISC-V允许任何人免费地使用RISC-V指令集设计、制造和销售芯片和软件。  

  ⑦长尾效应是指那些原来不受到重视的销量小但种类多的产品或服务由于总量巨大,累积起来的总收益超过主流产品的现象。在互联网领域,长尾效应尤为显著。  

  ⑧高并发通常指通过设计保证系统能够同时并行处理很多请求。

【责任编辑:孟瑾】 
 
China Energy Storage Network News: The Chinese People's Congress website recently published Sun Ninghui's speech "The Development of Artificial Intelligence and Intelligent Computing" at the special lecture of the 14th National People's Congress Standing Committee. The full text is reproduced as follows. Let us enter the unfathomable world of artificial intelligence together.

Chairman, Vice Chairmen, Secretary-General, Members:

In recent years, the field of artificial intelligence is ushering in an explosive development led by generative artificial intelligence big models. On November 30, 2022, OpenAI launched an artificial intelligence dialogue chat robot ChatGPT. Its outstanding natural language generation ability has attracted widespread attention worldwide. It has exceeded 100 million users in 2 months. A wave of big models has been set off at home and abroad. Gemini, Wenxin Yiyan, Copilot, LLaMA, SAM, SORA and other big models have sprung up like mushrooms after rain. 2022 is also known as the first year of big models. The current information age is accelerating into the development stage of intelligent computing. Breakthroughs in artificial intelligence technology are emerging in an endless stream, gradually empowering thousands of industries and promoting artificial intelligence and data elements to become typical representatives of new quality productivity. General Secretary Xi Jinping pointed out that the new generation of artificial intelligence should be used as a driving force to promote the leapfrog development of science and technology, the optimization and upgrading of industries, and the overall leap of productivity, and strive to achieve high-quality development. Since the 18th National Congress of the Communist Party of China, the Party Central Committee with Comrade Xi Jinping as the core has attached great importance to the development of the intelligent economy, promoted the deep integration of artificial intelligence and the real economy, and injected strong impetus into high-quality development.

1 Introduction to the development of computing technology

The development history of computing technology can be roughly divided into four stages. The emergence of the abacus marks the entry of mankind into the first generation, the era of mechanical computing. The second generation, electronic computing, is marked by the emergence of electronic devices and electronic computers. The emergence of the Internet brought us into the third generation, network computing. At present, human society is entering the fourth stage, intelligent computing.

Early computing devices were manual auxiliary computing devices and semi-automatic computing devices. The history of human computing tools began with the Chinese abacus in 1200 AD, followed by Napier's bones (1612) and the wheel-based adding machine (1642). In 1672, the first computing device that automatically completed the four arithmetic operations, the stepped reckoner, was born.

Some basic concepts of modern computers had already appeared in the mechanical computing period. Charles Babbage proposed the design concept of the Difference Engine (1822) and the Analytical Engine (1834) to support automatic mechanical calculations. During this period, the concepts of programming and programs were basically formed. The concept of programming originated from the Jacquard loom, which controlled the pattern through punched cards, and eventually evolved into storing all mathematical calculation steps in the form of calculation instructions; the first programmer in human history was Ada, the daughter of the poet Byron. She wrote a set of calculation instructions for Babbage's Difference Engine to compute the Bernoulli numbers. This set of instructions was also the first computer algorithm program in human history. It separated hardware and software, and the concept of a program appeared for the first time.

It was not until the first half of the 20th century that the four scientific foundations of modern computing technology emerged: Boolean algebra (mathematics), the Turing machine (computational model), the von Neumann architecture (architecture), and the transistor (device). Among them, Boolean algebra is used to describe the underlying logic of programs and hardware such as CPUs; the Turing machine is a universal computing model that transforms complex tasks into automatic computing without human intervention; the von Neumann architecture proposes three basic principles for constructing computers: using binary logic, storing and executing programs, and composing the computer of five basic units: arithmetic unit, controller, memory, input devices, and output devices; transistors are semiconductor devices that constitute basic logic circuits and storage circuits, and are the "bricks" for building the tower of modern computing. Based on the above scientific foundations, computing technology has developed rapidly and formed a large-scale industry.

From the birth of the world's first electronic computer ENIAC in 1946 to today in the 21st century, five types of successful platform computing systems have been formed. Currently, various types of applications in various fields can be supported by these five types of platform computing devices. The first category is high-performance computing platforms, which solve scientific and engineering computing problems in core national departments; the second category is enterprise computing platforms, also known as servers, which are used for enterprise-level data management and transaction processing. Currently, computing platforms of Internet companies such as Baidu, Alibaba and Tencent belong to this category; the third category is personal computer platforms, which appear in the form of desktop applications, and people interact with personal computers through desktop applications; the fourth category is smart phones, whose main features are mobility and portability. Mobile phones connect to data centers through the network and mainly run Internet applications, which are deployed in a distributed manner across data centers and mobile terminals; the fifth category is embedded computers, which are embedded in industrial equipment and military equipment. Through real-time control, they ensure that specific tasks are completed within a definite time. These five types of devices cover almost all aspects of our information society. The sixth type of platform computing system centered on intelligent computing applications, which people have been pursuing for a long time, has not yet been formed.

The development of modern computing technology can be roughly divided into three eras.

IT1.0, also known as the electronic computing era (1950-1970), has the basic feature of being centered on "machines". The basic architecture of computing technology has been formed. With the advancement of integrated circuit technology, the scale of basic computing units has been rapidly miniaturized, and the density, computing performance and reliability of transistors have been continuously improved. Computers have been widely used in scientific and engineering computing and enterprise data processing.

IT2.0, also known as the network computing era (1980-2020), is centered on "people". The Internet connects the terminals used by people with the data center in the background, and Internet applications interact with people through smart terminals. Internet companies represented by Amazon and others have proposed the idea of cloud computing, encapsulating the computing power in the background into a public service and renting it to third-party users, forming the cloud computing and big data industry.

IT3.0, also known as the intelligent computing era, began in 2020. Compared with IT2.0, it adds the concept of "things", that is, various end-side devices in the physical world are digitized, networked and intelligent, realizing the three-way integration of "people-machine-things". In the era of intelligent computing, in addition to the Internet, there is also data infrastructure to support various terminals to achieve the interconnection of all things through the end-edge cloud. AI is embedded in terminals, objects, edges, and clouds, providing large-model intelligent services similar to ChatGPT, and finally realizing AI intelligence wherever there is computing. Intelligent computing has brought a huge amount of data, breakthroughs in artificial intelligence algorithms, and explosive demand for computing power.

2 Introduction to the development of intelligent computing

Intelligent computing includes artificial intelligence technology and its computing carriers, and has roughly gone through four stages, namely general computing devices, logical reasoning expert systems, deep learning computing systems, and large-model computing systems.

The starting point of intelligent computing is the general automatic computing device (1946). Scientists such as Alan Turing and John von Neumann initially hoped to simulate the process of human brain processing knowledge and invent machines that think like human brains. Although this was not achieved, it solved the problem of computing automation. The emergence of general automatic computing devices also promoted the birth of the concept of artificial intelligence (AI) in 1956. Since then, all the development of artificial intelligence technology has been based on a new generation of computing devices and stronger computing power.

The second stage of the development of intelligent computing is the logical reasoning expert system (1990). Scientists of the symbolic intelligence school, such as Edward Albert Feigenbaum, took the automation of logic and reasoning ability as their main goal and proposed expert systems that can perform logical reasoning on knowledge symbols. People's prior knowledge enters the computer in the form of knowledge symbols, enabling computers to assist humans in making certain logical judgments and decisions in specific fields, but expert systems rely heavily on manually built knowledge bases or rule bases. Typical representatives of this type of expert system are Japan's fifth-generation computer and the 306 intelligent computer theme supported by China's 863 Program. For logic expert systems, Japan used a dedicated computing platform and a knowledge reasoning language such as Prolog to complete application-level reasoning tasks; China adopted a different technical route, based on a general computing platform, turning intelligent tasks into artificial intelligence algorithms and connecting both hardware and system software to the general computing platform, which gave birth to a number of backbone enterprises such as Sugon, Hanwang, and iFlytek.
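To make the idea of reasoning over hand-written knowledge symbols concrete, here is a minimal sketch of forward chaining, one common way rule-based expert systems derive conclusions. It is not the design of any system named in the lecture; the rules, fact names, and the function forward_chain are hypothetical and chosen only for illustration.

```python
# Toy forward-chaining rule engine.
# All rules and fact names are hypothetical, for illustration only.

def forward_chain(facts, rules):
    """Apply if-then rules repeatedly until no new fact can be derived."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            # Fire a rule only when every premise is already known.
            if conclusion not in known and all(p in known for p in premises):
                known.add(conclusion)
                changed = True
    return known


# Each rule: (tuple of premises, conclusion), written by hand by a "domain expert".
RULES = [
    (("has_fever", "has_cough"), "suspect_flu"),
    (("suspect_flu", "short_of_breath"), "recommend_chest_exam"),
]

derived = forward_chain({"has_fever", "has_cough", "short_of_breath"}, RULES)
print(sorted(derived))
```

Every conclusion here traces back to a rule someone wrote by hand, which is the dependence on manually built rule bases that the paragraph above describes.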

The limitation of the symbolic computing system lies in its explosive computational time and space complexity, that is, the symbolic computing system can only solve linear growth problems, and cannot solve high-dimensional complex space problems, thus limiting the size of the problems that can be handled. At the same time, because the symbolic computing system is based on knowledge rules, we cannot enumerate all common sense by exhaustive enumeration, so its scope of application is greatly limited. With the arrival of the second AI winter, the first generation of intelligent computers gradually withdrew from the stage of history.

Around 2014, intelligent computing advanced to its third stage, the deep learning computing system. The connectionist school represented by Geoffrey Hinton and others aims to automate learning capabilities and has invented new AI algorithms such as deep learning. Through the automatic learning of deep neural networks, the ability of models to perform statistical induction has been greatly improved, and great breakthroughs have been made in applications such as pattern recognition①. In some scenarios the recognition accuracy even exceeds that of humans. Taking face recognition as an example, the entire neural network training process is equivalent to a process of adjusting network parameters. A large amount of labeled face image data is input into the neural network, and the network parameters are then adjusted so that the probability of the result output by the neural network comes arbitrarily close to the real result. The greater the probability of the neural network outputting the real situation, the larger the parameters, so that knowledge and rules are encoded into the network parameters. In this way, as long as there is enough data, a large amount of common sense can be learned, and versatility is greatly improved. Connectionist intelligence is applied more widely, including speech recognition, face recognition, autonomous driving, etc. In terms of computing carriers, the Institute of Computing Technology of the Chinese Academy of Sciences proposed the world's first deep learning processor architecture in 2013, and the internationally renowned hardware manufacturer NVIDIA has continuously released a number of general-purpose GPU chips with leading performance, which are typical representatives of deep learning computing systems.
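The "training as parameter adjustment" described above can be sketched with the smallest possible network: a single artificial neuron trained by gradient descent. This is an illustrative assumption, not the face-recognition system from the lecture; the data set (a logical AND of two features), the learning rate, and the function names train and predict are all hypothetical.

```python
import math


def sigmoid(z):
    """Squash a real number into a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


def predict(w, b, x):
    """Probability that sample x belongs to class 1."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)


def train(samples, labels, lr=0.5, epochs=2000):
    """Adjust weights w and bias b so predicted probabilities approach the labels."""
    w = [0.0] * len(samples[0])
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            err = predict(w, b, x) - y  # gradient of cross-entropy loss w.r.t. the pre-activation
            w = [wi - lr * err * xi for wi, xi in zip(w, x)]
            b -= lr * err
    return w, b


# Hypothetical labeled data: the label is 1 only when both features are 1.
X = [(0, 0), (0, 1), (1, 0), (1, 1)]
Y = [0, 0, 0, 1]

w, b = train(X, Y)
for x in X:
    print(x, round(predict(w, b, x), 3))
```

After training, the "rule" (both features must be present) is stored nowhere except in the learned numbers w and b, which is the sense in which knowledge is encoded into network parameters.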

The fourth stage of the development of intelligent computing is the large model computing system (2020). Driven by the large model technology of artificial intelligence, intelligent computing has reached a new height. In 2020, AI shifted from "small model + discriminant" to "large model + generative", and upgraded from traditional face recognition, target detection, and text classification to today's text generation, 3D digital human generation, image generation, speech generation, and video generation. A typical application of large language models in the field of dialogue systems is OpenAI's ChatGPT, which uses the pre-trained base large language model GPT-3 and introduces a training corpus of 300 billion words, which is equivalent to the sum of all English texts on the Internet. The basic principle is: train the model by giving it an input and letting it predict the next word, and improve the prediction accuracy through a large amount of training, so that you can eventually ask it a question, the large model will generate an answer, and you can have an instant conversation with people. On the basis of the base large model, some prompt words are given to it for supervised instruction fine-tuning. Through human <instruction, reply> pairs, the model gradually learns how to have multiple rounds of dialogue with people; finally, through the artificially designed and automatically generated reward function, reinforcement learning iteration is carried out to gradually achieve the alignment of the large model with human values.
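The "predict the next word" principle can be illustrated with a deliberately simple stand-in. The sketch below swaps the neural network for a bigram count table, so it is not how GPT-3 or ChatGPT is implemented; the tiny corpus and the function names train_bigram and predict_next are hypothetical.

```python
from collections import Counter, defaultdict


def train_bigram(text):
    """Count, for every word, which words follow it in the training text."""
    counts = defaultdict(Counter)
    words = text.lower().split()
    for prev, nxt in zip(words, words[1:]):
        counts[prev][nxt] += 1
    return counts


def predict_next(counts, word):
    """Return the most frequent follower of `word`, or None if it was never seen."""
    followers = counts.get(word.lower())
    if not followers:
        return None
    return followers.most_common(1)[0][0]


# Hypothetical toy corpus; a real model is trained on hundreds of billions of words.
corpus = "the cat sat on the mat . the cat ate the fish ."
model = train_bigram(corpus)

print(predict_next(model, "the"))  # "cat" follows "the" most often in this corpus
print(predict_next(model, "sat"))
```

A large language model replaces the count table with a neural network that conditions on a long context rather than one preceding word, but the training signal is the same: given the text so far, predict what comes next.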

The characteristic of the large model is to win by "big", which has three meanings: (1) large parameters, GPT-3 has 170 billion parameters; (2) large training data, ChatGPT uses about 300 billion words and 570GB training data; (3) large computing power requirements, GPT-3 uses about tens of thousands of V100 GPUs for training. In order to meet the explosive increase in the demand for intelligent computing power of large models, both China and foreign countries are building large-scale new intelligent computing centers at a huge cost. NVIDIA has also launched a large-model intelligent computing system composed of 256 H100 chips and 150TB of massive GPU memory.

The emergence of large models has brought three changes.

The first is the technical scaling law, which means that the accuracy of many AI models increases rapidly after the parameter scale exceeds a certain threshold. The reason is not very clear in the scientific community and is highly controversial. The performance of AI models is in a "log-linear relationship" with the three variables of model parameter scale, data set size, and total computing power. Therefore, the performance of the model can be continuously improved by increasing the scale of the model. At present, the number of parameters of the most cutting-edge large model GPT-4 has reached trillions to tens of trillions, and is still growing;
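A "log-linear relationship" means that performance grows in proportion to the logarithm of scale, so each tenfold increase in scale buys roughly the same fixed gain. The sketch below fits such a line by least squares. The data points are synthetic, generated from a made-up formula purely to show the shape of the relationship; they are not measurements of any real model, and fit_log_linear is a hypothetical helper.

```python
import math


def fit_log_linear(sizes, scores):
    """Least-squares fit of score = a + b * log10(size)."""
    xs = [math.log10(s) for s in sizes]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(scores) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, scores))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


# Synthetic points generated from score = 10 + 5 * log10(size); NOT real benchmark data.
sizes = [1e6, 1e7, 1e8, 1e9]
scores = [10 + 5 * math.log10(s) for s in sizes]

a, b = fit_log_linear(sizes, scores)
print(f"intercept a = {a:.2f}, gain per 10x scale b = {b:.2f}")

# Extrapolate one more order of magnitude under the fitted line.
print(f"extrapolated score at 1e10: {a + b * math.log10(1e10):.2f}")
```

Under such a relationship, steady gains in performance require exponentially more parameters, data, or computing power, which is consistent with the explosive demand for computing power described in the next point.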

Second, the demand for computing power in the industry has exploded. The training of large models with hundreds of billions of parameters usually requires 2-3 months of training on thousands or even tens of thousands of GPU cards. The sharply increased demand for computing power has driven the ultra-high-speed development of related computing companies. Nvidia's market value is close to two trillion US dollars, which has never happened to chip companies before;

Third, at the societal level, the labor market has been affected. The report "Research on the Potential Impact of AI Large Models on the Chinese Labor Market" jointly released by the National Development Research Institute of Peking University and Zhaopin.com pointed out that among the 20 most affected occupations, accounting, sales, and clerical work are at the forefront. Physical labor jobs that require dealing with people and providing services, such as human resources, administration, and logistics, are relatively safer.

The technological frontier of artificial intelligence will develop in the following four directions.

The first frontier direction is multimodal large models. From a human perspective, human intelligence is naturally multimodal. People have eyes, ears, nose, tongue, body, and mouth (language). From an AI perspective, vision, hearing, etc. can also be modeled as a sequence of token②, which can be learned in the same way as the large language model, and further aligned with the semantics in the language to achieve multimodal alignment intelligence.

The second frontier direction is video generation by large models. OpenAI released the text-to-video model SORA on February 15, 2024, which greatly increased the length of generated video from a few seconds to one minute, and significantly improved the resolution, picture realism, and temporal consistency. The greatest significance of SORA is that it has the basic characteristics of a world model, that is, the human ability to observe the world and then predict it. A world model is built on basic physical common sense about the world (such as water flowing downhill), and then observes and predicts what will happen in the next second. Although many problems remain before SORA can become a world model, it can be considered that SORA has learned picture imagination and minute-level future prediction, which are the basic characteristics of a world model.

The third frontier direction is embodied intelligence. Embodied intelligence refers to intelligent entities that have bodies and support interaction with the physical world, such as robots and driverless cars. They process multiple sensory data inputs through multimodal large models, and the large models generate motion instructions to drive the intelligent entities, replacing the traditional motion-driven mode based on rules or mathematical formulas, and realizing the deep integration of virtual and reality. Therefore, robots with embodied intelligence can gather the three major schools of artificial intelligence: connectionism represented by neural networks, symbolism represented by knowledge engineering, and behaviorism related to cybernetics. The three major schools can act on one intelligent body at the same time, which is expected to bring new technological breakthroughs.

The fourth frontier direction is that AI4R (AI for Research) is becoming the main paradigm for scientific discovery and technological invention. At present, scientific discovery mainly relies on experiments and human brain intelligence: humans make bold guesses and carefully verify them. Information technology, whether computing or data, only plays auxiliary and verification roles. Compared with humans, artificial intelligence has great advantages in memory, high-dimensional complexity, full vision, reasoning depth, and conjecture. Can AI be used as the main method for some scientific discoveries and technological inventions to greatly improve the efficiency of human scientific discovery, such as actively discovering physical laws, predicting protein structures, designing high-performance chips, and efficiently synthesizing new drugs? Because the AI model has the full body of data and a God's-eye view, it can see more steps ahead than humans through deep learning. If it can achieve a leap from inference to reasoning, the AI model has the potential to have the same imagination and scientific conjecture as Einstein, greatly improving the efficiency of human scientific discovery and breaking the cognitive boundaries of humans. This is the real subversion.

Finally, Artificial General Intelligence (AGI)③ is a very challenging and controversial topic. A philosopher and a neuroscientist once made a bet: 25 years later (that is, by 2023), would researchers be able to reveal how the brain achieves consciousness? At that time, there were two schools of thought about consciousness, one called integrated information theory and the other called global neuronal workspace theory. The former held that consciousness is a "structure" formed by the connection of specific types of neurons in the brain, and the latter held that consciousness is generated when information is transmitted to brain areas through interconnected networks. In 2023, six independent laboratories conducted adversarial experiments, and the results did not fully match either theory. The philosopher won and the neuroscientist lost. Through this bet, we can see that people always hope that artificial intelligence can unlock the secrets of human cognition and the brain. Physics first gained a thorough understanding of the macroscopic world and only then, starting from quantum physics, began to understand the microscopic world. The intelligent world, like the physical world, is a research object of great complexity. AI large models still use data-driven and similar methods to study the macroscopic world and raise the intelligence level of machines; the macroscopic world of intelligence is not yet well enough understood, and it is difficult to find answers directly in the microscopic world of the nervous system. Since its birth, artificial intelligence has carried all kinds of human dreams and fantasies about intelligence and consciousness, and has also inspired people to keep exploring.

3 Security risks of artificial intelligence

While the development of artificial intelligence has promoted the progress of science and technology in today's world, it has also brought many security risks, which must be dealt with from both technical and regulatory aspects.

First, there is the proliferation of false information on the Internet. Here are several scenarios:

One is digital avatars. AI Yoon is the first official "candidate" synthesized using DeepFake technology. This digital person is based on Yoon Suk-yeol, a candidate of the South Korean National Power Party. Using 20 hours of Yoon Suk-yeol's audio and video clips and more than 3,000 sentences he recorded specifically for researchers, a local DeepFake technology company created the virtual image AI Yoon, which quickly became popular on the Internet. In fact, the content expressed by AI Yoon was written by the campaign team, not the candidate himself.

Second, fake videos, especially fake leaders' videos, cause international disputes, disrupt election order, or cause sudden public opinion events, such as fake Nixon's announcement of the failure of the first moon landing and fake Ukrainian President Zelensky's announcement of "surrender". These behaviors have led to a decline in social trust in the news media industry.

Third, fake news: false news is generated automatically to make illegal profits, for example by using ChatGPT to produce trending stories that earn traffic. As of June 30, 2023, the number of websites generating fake news worldwide had reached 277, seriously disrupting social order.

Fourth, face-changing and voice-changing are used for fraud. For example, a Hong Kong international company was defrauded of $35 million because an AI voice imitated the voice of a corporate executive.

Fifth, indecent images are generated, especially for public figures. For example, pornographic videos of movie stars are produced, which have a negative social impact. Therefore, it is urgent to develop counterfeit detection technology for Internet false information.

Next, AI big models face serious credibility issues. These issues include: (1) factual errors of "serious nonsense"; (2) narratives based on Western values, outputting political bias and wrong speech; (3) easy to be misled, outputting wrong knowledge and harmful content; (4) data security issues are aggravated, and big models have become traps for important sensitive data. ChatGPT incorporates user input into the training database to improve ChatGPT. The US can use big models to obtain Chinese corpus that is not covered by public channels and master "Chinese knowledge" that we may not master ourselves. Therefore, it is urgent to develop big model security supervision technology and our own credible big models.

In addition to technical means, AI security protection requires relevant legislative work. In 2021, the Ministry of Science and Technology issued the "New Generation Artificial Intelligence Ethics Code". In August 2022, the National Information Security Standardization Technical Committee issued the "Information Security Technology Machine Learning Algorithm Security Assessment Code". In 2022-2023, the Central Cyberspace Affairs Office successively issued the "Internet Information Service Algorithm Recommendation Management Regulations", "Internet Information Service Deep Synthesis Management Regulations", "Generative Artificial Intelligence Service Management Measures", etc. European and American countries have also successively issued laws and regulations. On May 25, 2018, the European Union issued the "General Data Protection Regulation". On October 4, 2022, the United States issued the "Blueprint for the Artificial Intelligence Bill of Rights". On March 13, 2024, the European Parliament passed the EU "Artificial Intelligence Act".

China should accelerate the introduction of the "Artificial Intelligence Law", build an artificial intelligence governance system, ensure that the development and application of artificial intelligence follow the common values of mankind, and promote human-machine harmony and friendship; create a policy environment conducive to the research, development and application of artificial intelligence technology; establish a reasonable disclosure mechanism and audit evaluation mechanism, understand the principles and decision-making process of artificial intelligence mechanisms; clarify the security responsibilities and accountability mechanisms of artificial intelligence systems, trace the responsible parties and remedy them; promote the formation of fair, reasonable, open and inclusive international artificial intelligence governance rules.

4 Dilemma of China's Intelligent Computing Development

Artificial intelligence technology and intelligent computing industry are at the focus of Sino-US technological competition. Although China has made great achievements in the past few years, it still faces many development difficulties, especially the difficulties caused by the US technology suppression policy.

Dilemma 1 is that the United States has long been in a leading position in AI core capabilities, and China is in tracking mode. China has a certain gap with the United States in the number of high-end AI talents, AI basic algorithm innovation, AI base large model capabilities (large language models, text-to-image models, text-to-video models), base large model training data, base large model training computing power, etc., and this gap will continue for a long time.

Dilemma 2 is that high-end computing power products are banned from sale, and high-end chip processes have been stuck for a long time. A100, H100, B200 and other high-end intelligent computing chips are banned from sale to China. Huawei, Loongson, Cambricon, Sugon, Hygon and other companies have been placed on the Entity List, and their access to advanced chip manufacturing processes④ is limited. The process nodes that can meet large-scale mass production in China lag behind the international advanced level by 2-3 generations, and the performance of core computing power chips lags behind the international advanced level by 2-3 generations.

Dilemma 3 is the weak domestic intelligent computing ecosystem and insufficient penetration of AI development frameworks. NVIDIA's CUDA⑤ (Compute Unified Device Architecture) ecosystem is complete and has formed a de facto monopoly. The domestic ecosystem is weak, specifically manifested in: First, there is a shortage of R&D personnel. NVIDIA's CUDA ecosystem has nearly 20,000 developers, which is 20 times the total number of personnel in all domestic smart chip companies; second, there are insufficient development tools. CUDA has 550 SDKs (Software Development Kits), which is hundreds of times that of related domestic companies; third, there is insufficient capital investment. NVIDIA invests $5 billion annually, which is dozens of times that of related domestic companies; fourth, the AI development framework TensorFlow occupies the industrial market, PyTorch occupies the research market, and the number of developers of domestic AI development frameworks such as Baidu PaddlePaddle is only 1/10 of that of foreign frameworks. What is more serious is that there are many companies in China, and they cannot form a joint force. From intelligent applications, development frameworks, system software, and intelligent chips, although there are related products at each layer, there is no deep adaptation between the layers, and a competitive technical system cannot be formed.

Dilemma 4 is that the cost and threshold of AI application in the industry remain high. At present, Chinese AI applications are mainly concentrated in the Internet industry and some defense fields. When AI technology is promoted and applied to various industries, especially from the Internet industry to non-Internet industries, a lot of customization work is required, the migration is difficult, and the cost of single use is high. Finally, the number of talents in the field of AI in China is obviously insufficient compared with the actual demand.

5 How China chooses the road to develop intelligent computing

The choice of the road for the development of artificial intelligence is crucial to China, and it is related to the sustainability of development and the final international competitive landscape. The current cost of using artificial intelligence is very high. Microsoft Copilot suite costs $10 per month, ChatGPT consumes 500,000 kilowatt-hours of electricity per day, and Nvidia B200 chips cost more than $30,000. In general, China should develop affordable, safe and reliable artificial intelligence technology to seek to eliminate the size of the information-poor population in China and to benefit the countries along the Belt and Road Initiative; and empower all social strata with low usage thresholds, so that China's key advanced industries can maintain competitiveness and relatively backward industries can significantly narrow the gap.

Choice 1: Unify the technical system and take the closed-source or open-source path?

The intelligent computing industry is supported by a tightly coupled technical system, that is, a series of technical standards and intellectual property rights that closely link materials, devices, processes, chips, complete machines, system software, application software, etc. China is presented with three paths to develop an intelligent computing technology system:

The first is to catch up with and be compatible with the A system dominated by the United States. Most of China's Internet companies take the GPGPU/CUDA compatibility path, and many startups in the chip field also try to be compatible with CUDA in ecological construction. This path is more realistic, but it has limits. In terms of computing power, the United States restricts China's process technology and chip bandwidth; in terms of algorithms, it is difficult to form a unified domestic ecosystem, and the maturity of the ecosystem is severely limited; in terms of data, there is a lack of high-quality Chinese data. These factors will make it difficult to narrow the gap between the pursuers and the leaders, and sometimes the gap will widen further.

Second, build a dedicated and closed B system. Build a closed ecosystem for enterprises in special fields such as military, meteorology, and justice, produce chips based on domestic mature processes, pay more attention to vertical large models in specific fields compared to large base models, and use more domain-specific high-quality data for training large models. This path is easy to form a complete and controllable technical system and ecology. Some large Chinese backbone enterprises have taken this path. Its disadvantage is that it is closed and cannot gather most of the domestic forces, and it is difficult to achieve globalization.

Third, build a globally open-source and open C system. Use open source to break the ecological monopoly, lower the threshold for enterprises to own core technologies, and allow each enterprise to make its own chips at low cost, forming a vast ocean of smart chips to meet ubiquitous smart needs. Use openness to form a unified technical system, in which Chinese enterprises and global forces unite to build a unified intelligent computing software stack based on international standards. Form a pre-competitive sharing mechanism for enterprises, share high-quality databases, and share open source general base models. In the Internet era, Chinese enterprises benefited greatly from the global open source ecosystem, more as users than as participants. In the AI era, Chinese enterprises should become major contributors to the RISC-V⑥+AI open source technology system and a leading force in global open sharing.

Choice 2: Algorithm model or new infrastructure?

Artificial intelligence technology should empower all walks of life and has a typical long-tail effect⑦. 80% of Chinese small and medium-sized enterprises need low-threshold, low-price intelligent services. Therefore, the Chinese intelligent computing industry must be built on a new data space infrastructure. The key is that China should take the lead in realizing the comprehensive infrastructure of intelligent elements, namely data, computing power, and algorithms. This work can be compared with the historical role of the US information superhighway plan (i.e., information infrastructure construction) in the early twentieth century on the Internet industry.

The core productivity of the information society is cyberspace. The evolution process of cyberspace is: from the computing space composed of one-dimensional connection of machines, to the information space composed of two-dimensional connection of human-machine information, and then to the data space composed of three-dimensional connection of human-machine-object data. From the perspective of data space, the essence of artificial intelligence is the steel made from data, and the big model is the product of deep processing of the entire Internet data. In the digital age, what is transmitted on the Internet is information flow, which is a structured abstraction after the computing power has roughly processed the data; in the intelligent age, what is transmitted on the Internet is intelligent flow, which is a modeled abstraction after the computing power has deeply processed and refined the data. A core feature of intelligent computing is to use numerical calculations, data analysis, artificial intelligence and other algorithms to process massive data pieces in the computing power pool, obtain intelligent models, and then embed them into various processes in the information world and the physical world.

The Chinese government has proactively laid out new infrastructure in advance, seizing the initiative in the competition among countries around the world.

First, data has become a national strategic information resource. Data has the dual attributes of resource elements and value processing. The resource element attributes of data include production, acquisition, transmission, aggregation, circulation, transaction, ownership, assets, security and other links. China should continue to increase its efforts to build a national data hub and data circulation infrastructure.

Secondly, AI big models are a type of algorithm infrastructure in data space. Based on the general large model, we will build the infrastructure for the research and development and application of large models, support the large models dedicated to the research and development of enterprises, serve the industries of robots, unmanned driving, wearable devices, smart homes, smart security, etc., and cover long-tail applications.

Finally, the construction of the national integrated computing power network has played a leading role in promoting the infrastructure of computing power. While reducing the threshold cost of using computing power, it also provides high-throughput, high-quality intelligent services for the widest range of people. The Chinese solution for computing power infrastructure needs to have "two lows and one high", that is, on the supply side, it greatly reduces the total cost of computing power devices, computing power equipment, network connection, data acquisition, algorithm model call, power consumption, operation and maintenance, development and deployment, so that the majority of small and medium-sized enterprises can afford high-quality computing power services and have the enthusiasm to develop computing power network applications; on the consumer side, it greatly reduces the threshold of computing power use for the majority of users, and public services for the public must be easy to obtain and use, ready to use like water and electricity, and easy to customize computing power services and develop computing power network applications like writing web pages. On the service efficiency side, China's computing power services must achieve low entropy and high throughput, where high throughput means that while achieving high concurrency services, the response time of end-to-end services can be satisfied at a high rate; low entropy means that when there is disordered competition for resources in high concurrent loads, the system throughput is guaranteed not to drop sharply. Ensuring "more computing" is particularly important for China.

Choice 3: Does AI+ focus on empowering the virtual economy, or on empowering the real economy?
 
The effectiveness of "AI+" is the touchstone of the value of artificial intelligence. After the subprime mortgage crisis, the proportion of added value of US manufacturing in GDP decreased from 28% in 1950 to 11% in 2021, and the proportion of US manufacturing employment in the entire industry decreased from 35% in 1979 to 8% in 2022. It can be seen that the United States prefers the virtual economy with a higher rate of return and despises the real economy with high investment costs and low economic returns. China tends to develop the real economy and the virtual economy simultaneously, and pays more attention to the development of equipment manufacturing, new energy vehicles, photovoltaic power generation, lithium batteries, high-speed rail, 5G and other real economies. Correspondingly, American AI is mainly used in the virtual economy and IT basic tools, and AI technology is also "from the real to the virtual". Since 2007, Silicon Valley has been hyping virtual reality (VR), metaverse, blockchain, Web3.0, deep learning, AI big models, etc., which is a reflection of this trend.

China's advantage lies in the real economy. The manufacturing industry has the most complete industrial categories and the most complete system in the world, and is characterized by multiple scenarios and a lot of private data. China should select a number of industries to increase investment and form a paradigm that can be promoted across the industry with low barriers, such as selecting equipment manufacturing as a representative industry to continue its advantages and pharmaceutical industry as a representative industry to quickly narrow the gap. The technical difficulty of empowering the real economy is the integration of AI algorithms and physical mechanisms.

The key to the success of artificial intelligence technology is whether it can significantly reduce the cost of an industry or a product, thereby expanding the number of users and the scale of the industry tenfold and producing a transformative effect similar to that of steam engines on the textile industry and smartphones on the Internet industry. China should embark on a high-quality development path, suited to its own conditions, in which artificial intelligence empowers the real economy.

Notes:

① Pattern recognition refers to the use of computational methods to divide samples into certain categories according to the characteristics of the samples. It is the use of mathematical methods by computers to study the automatic processing and interpretation of patterns, and the environment and objects are collectively referred to as "patterns". The main research directions are image processing and computer vision, speech and language information processing, brain network groups, and brain-like intelligence.

②Token can be translated as a word unit, which refers to a symbol used to represent a word or phrase in natural language processing. A token can be a single character or a sequence of multiple characters.

③General artificial intelligence refers to the type of artificial intelligence that has intelligence that is equal to or even exceeds that of humans. General artificial intelligence can not only perform basic thinking abilities such as perception, understanding, learning and reasoning like humans, but can also flexibly apply, quickly learn and think creatively in different fields. The research goal of general artificial intelligence is to seek a unified theoretical framework to explain various intelligent phenomena.

④Chip manufacturing process refers to the process of manufacturing CPU or GPU, that is, the size of transistor gate circuits, in nanometers. Currently, the most advanced process for mass production in the world is represented by TSMC's 3nm. More advanced manufacturing processes can integrate more transistors inside the CPU and GPU, so that the processor has more functions and higher performance, smaller area, lower cost, etc.

⑤CUDA is a parallel computing platform and programming model designed and developed by NVIDIA, which includes the CUDA instruction set architecture and the parallel computing engine inside the GPU. Developers can use C language to write programs for the CUDA architecture, and the programs written can run with ultra-high performance on processors that support CUDA.

⑥RISC-V (pronounced "risk-five") is an open general-purpose instruction set architecture initiated by the University of California, Berkeley. Compared with other paid instruction sets, RISC-V allows anyone to use the RISC-V instruction set for free to design, manufacture and sell chips and software.

⑦The long tail effect refers to the phenomenon that the total revenue of products or services with small sales but many types that were not valued originally exceeds that of mainstream products due to their huge total volume. In the field of Internet, the long tail effect is particularly significant.

⑧High concurrency usually refers to the design to ensure that the system can process many requests in parallel at the same time.
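As an illustration of note ⑧, the following is a minimal sketch of processing many requests in parallel. It is not from the lecture: `handle_request` and `serve_concurrently` are hypothetical names, the 10 ms sleep merely stands in for network or disk latency, and a thread pool is only one of several ways to achieve concurrency.

```python
from concurrent.futures import ThreadPoolExecutor
import time


def handle_request(request_id):
    """Hypothetical request handler: simulates a short I/O wait, then replies."""
    time.sleep(0.01)  # stands in for network or disk latency
    return f"response-{request_id}"


def serve_concurrently(request_ids, max_workers=32):
    """Process many requests in parallel; results keep the input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(handle_request, request_ids))


responses = serve_concurrently(range(100))
print(len(responses), responses[0], responses[-1])
```

Because the simulated work is waiting rather than computing, up to 32 requests overlap their waits here instead of being handled one after another.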

【Editor: Meng Jin】
