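"I think machine learning is still in kind of a 'prototype' phase; there is still work to be done to evolve it into a mature engineering discipline," Percy Liang, professor of computer science at Stanford University, said in a recent interview with ShannonAI. In the interview, Percy Liang also made the following points: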

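  1. Many machine learning applications now involve human safety and are no trivial matter;

  2. Language is about communication with humans, an aspect that is lost in the NLP community;

  3. Reproducibility is a huge problem across all of science, and AI is no exception;

  4. Even though one might think there is no shortage of data (after all, isn't this the era of big data?), having large amounts of the right type of data is still a challenge.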

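Percy Liang, assistant professor of computer science at Stanford University and a member of the Stanford Artificial Intelligence Laboratory, works mainly on natural language processing (dialogue systems, semantic parsing, and related areas) and machine learning theory. A paper he co-authored with his students recently won the ACL 2018 short paper award, and he is the recipient of the 2016 IJCAI Computers and Thought Award.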




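SQuAD (The Stanford Question Answering Dataset) has been extremely successful in pushing forward machine reading comprehension and question answering. However, beyond helping NLP researchers develop better reading comprehension systems, do you see any other opportunities in this dataset?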

Figure 1. Example samples from SQuAD


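While SQuAD (really, any reading comprehension dataset) is nominally about reading comprehension, I think such datasets can have a broader impact in two ways. First, datasets encourage people to develop new general-purpose models. For example, neural machine translation gave rise to attention-based models, which are now among the most common models in machine learning. Second, models trained on one dataset can be of value to other tasks. For example, training convolutional neural networks on ImageNet yields general image features that can be used for all sorts of vision problems. SQuAD has had a similar impact along these two lines, though perhaps not to the same extent as the two examples above. SQuAD has reached its limit, as multiple systems have now exceeded human-level performance on this dataset. But as Robin Jia and I showed in a paper at EMNLP 2017, such systems can be easily fooled by adversarial examples. At the upcoming ACL 2018, we have a paper releasing SQuAD 2.0, which contains 50K additional questions that look like they have answers but actually do not. Hopefully this new dataset, along with the flurry of other datasets coming out (e.g., RACE, TriviaQA), will help drive the field forward.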


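In the past, you and your students have done a lot of influential work on AI safety (Raghunathan et al., ICLR 2018; Steinhardt et al., NIPS 2017). You have also studied the interpretability of neural networks, including Koh et al., the ICML 2017 best paper. Do you think improving the interpretability of deep neural networks helps address AI safety problems? Why?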


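So far, the principal driver of AI research has been obtaining models that predict more accurately. But more recently, issues of interpretability and robustness/safety have gained more attention, which I think is especially important because many machine learning applications, such as autonomous driving and healthcare, now involve human safety and are no trivial matter. However, interpretability and robustness are vague terms, and people do not share a single definition of them. At this point, I think there is still a lot of conceptual work to be done to formalize these terms so that people can make measurable progress. We have made some initial progress in formalizing them using influence functions (Koh et al., ICML 2017) and semidefinite relaxations (Raghunathan et al., ICLR 2018), both of which are classic tools from statistics and optimization. I think machine learning is still in kind of a "prototype" phase; there is still work to be done to evolve it into a mature engineering discipline.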


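Many of your NLP research projects have close connections with human language processing (e.g., Wang et al., ACL 2016 outstanding paper award: having machines learn language from scratch through human-computer interaction; He et al., ACL 2017: building symmetric collaborative chatbots by learning dynamic knowledge graph embeddings). To what extent do you think understanding human language processing will help us build better machine language processing systems?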

Figure 2. The SHRDLURN language game in Wang et al., ACL 2016. The machine must learn language from scratch by interacting with humans.





Figure 3. How CodaLab works. See the official CodaLab website, https://worksheets.codalab.org/, for details.






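When I was a PhD student, I was very much into the modeling, algorithms, and analysis of machine learning. But I realized that even strong algorithms have their limits: you look at the errors a system makes, and you realize that with only a fixed dataset you may simply be unable to get it right. Later, during my time at Stanford (partly influenced by my time at Google), I started thinking about data and modeling jointly. Even though one might think there is no shortage of data (after all, isn't this the era of big data?), having large amounts of the right type of data is still a challenge. We have proposed a number of ways of turning this problem on its head (e.g., in Wang et al., ACL 2015, we showed how to build a semantic parser by having people paraphrase sentences rather than annotate logical forms). Thinking about data and modeling together broadens the space of possible solutions and allows you to be more creative.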

Figure 4. Quickly building a semantic parser by having people paraphrase sentences rather than annotate logical forms (image from Wang et al., ACL 2015).















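Shannon Says (《香侬说》): an interview series produced by ShannonAI (香侬科技) on machine learning and natural language processing. The guest in this issue is Percy Liang, professor of computer science at Stanford University.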

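Percy Liang: assistant professor of computer science at Stanford University and a member of the Stanford Artificial Intelligence Laboratory. His main research areas are natural language processing (dialogue systems, semantic parsing, and related areas) and machine learning theory.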


ShannonAI: SQuAD has been extremely successful in pushing forward the field of reading comprehension and question answering. However, do you see any other opportunities brought by this dataset that NLP researchers could use besides developing better reading comprehension systems?

Percy: While SQuAD (really, any reading comprehension dataset) is nominally about reading comprehension, I think the impact can be broader in two ways. First, datasets encourage people to develop new general-purpose models. For example, neural machine translation gave rise to attention-based models, which are now ubiquitous in deep learning. Second, models trained on datasets can be of value to other tasks. For example, training CNNs on ImageNet gives rise to generalizable image features that are used for all sorts of vision problems. SQuAD has had some impact along these two lines, but not to the same extent as the two examples listed above. I think SQuAD has reached its limit, as multiple systems have now exceeded human-level performance on this dataset. But as Robin Jia and I showed in a paper from EMNLP 2017, such systems can be easily fooled by adversarial examples, showing that they don't really understand language at a deep level. In an upcoming ACL 2018 short paper, we are releasing SQuAD 2.0, which will contain 50K more questions that look like they have answers but don't actually have one. Hopefully this, along with the flurry of new datasets coming out (e.g., RACE, TriviaQA), will help drive the field forward.

ShannonAI: In the past you and your students have done several influential pieces of work on AI safety (Raghunathan et al., ICLR 2018; Steinhardt et al., NIPS 2017). You have also done work on the interpretability of neural networks, including Koh et al., ICML 2017 Best Paper. Do you think enhancing the interpretability of deep neural networks could help address the issues of AI safety? Why or why not?

Percy: The principal driver of AI research thus far has been trying to obtain more accurate models. But more recently, issues of interpretability and robustness/safety have gained more traction, which I think is especially important given that machine learning is making its way into serious applications such as autonomous driving, healthcare, etc. However, interpretability and robustness are vague terms and people don't necessarily agree on their meaning. At this point, I think there's still a lot of conceptual work to be done to formalize these notions so that one can make measurable progress. We have made some initial progress in trying to capture these notions using influence functions (ICML 2017) and semidefinite relaxations (ICLR 2018), which are both classic tools from statistics and optimization. I think machine learning is still in kind of a "prototype" phase; there is still work to be done to evolve it into a mature engineering discipline.

ShannonAI: Many of your research projects have close connections with human language use (e.g., "Learning language games through interaction", "Learning symmetric collaborative dialogue agents with dynamic knowledge graph embeddings", etc.). To what extent do you think understanding human language processing would inspire us to build better machine language processing?

Percy: We work a lot with crowdsourcing and humans, because fundamentally, language is about communication with humans. Sometimes I feel like this aspect of language is lost in the NLP community, where most work is on large-scale tasks - machine translation, question answering, information extraction. This is very different from how humans use language to learn and accomplish goals. I would say the relevance of understanding human language use isn't so much that it lets us mimic humans, but rather that, if we want to build systems that can interact with humans, these systems fundamentally need to understand how humans think and act, at least at a behavioral level. Communication and language are not just about the words but about the underlying agents and their goals.

ShannonAI: As mentioned on your website, you are a strong proponent of efficient and reproducible research. You have been developing CodaLab Worksheets, a platform that allows researchers to maintain the full provenance of an experiment from raw data to final results. What do you think is the biggest obstacle to reproducible research in machine learning? And how should we address it?

Percy: Reproducibility is a huge problem across all of science, including AI, though I think as AI researchers, we really have no excuse - it's all just about executing code on data. The community is definitely getting better at releasing code and datasets than in the past, but often just having the code and data isn't enough to reproduce the results of a paper, because how the code is run might not be documented. CodaLab, by keeping track of the provenance of the actual execution, certifies that the result you got in the end actually is produced by the code and data. We've tried to make CodaLab as unobtrusive as possible - people can use any programming language, dataset format, etc. However, the challenge is that even still, the incentives are not set up properly for people to aim for this level of reproducibility, even though there is a network effect - if everyone were to use it, then it would be so much easier to build on others' work and the pace of research would be vastly accelerated. I think it's a matter of time.

ShannonAI: Before joining Stanford as a faculty member, you got your PhD from Berkeley and worked at Google as a post-doc. How has your approach as a machine learning researcher changed over time?

Percy: When I was in grad school, I was very much into the modeling, algorithms, and analysis of machine learning algorithms. But I realized there's only so much that fancy methods can do - you'd look at the errors that systems make and you realize that it was just impossible to get it right given a fixed dataset. Partly influenced by my time at Google, during my time at Stanford, I've really been thinking about the data-modeling pipeline jointly. Even though one might think there is no shortage of data (after all, isn't this the era of big data?), having large amounts of the right type of data is still a challenge. We've thought about ways of turning a problem on its head (e.g., in ACL 2015, we had a paper showing how to build a semantic parser by having people paraphrase sentences rather than annotate logical forms). Thinking about data and modeling together broadens the design space of solutions and allows you to be more creative.

ShannonAI: What is the most rewarding thing about being a machine learning researcher?

Percy: Machine learning allows you to think both about the underlying mathematical principles and about how to have a real positive impact on society.

ShannonAI: What is the most frustrating thing about being a machine learning researcher?

Percy: Sometimes you're just taking stabs in the dark. You look at the errors of a system, you do something that tries to fix them, and nothing improves. In a certain sense, you use machine learning when you don't understand the underlying phenomena because it's too complicated (or else you would have just written a program directly).

ShannonAI: Do you have any advice for students just entering the field of NLP on developing good taste for research projects?

Percy: Learn the fundamentals and read broadly, especially outside NLP and AI - you never know whether ideas from programming languages, linguistics, cognitive science, optimization, statistics could be relevant to what you're doing.

It's tempting to be carried away by fancy things and try to throw in all the bells and whistles; try to do the opposite. It's more impressive to solve a problem with a simple method than with a complex one.

Pick a problem that you believe in and pursue it passionately. You'll know because you'll want to think about it all the time. Make it personal.

