Microsoft CTO Kevin Scott's Seven Insights on the Future of AI
A TransformX Highlight
At TransformX, we brought together a community of leaders, visionaries, practitioners, and researchers across industries to explore the shift from research to reality within Artificial Intelligence (AI) and Machine Learning (ML).
In this session, Kevin Scott, CTO of Microsoft, joined Scale AI CEO Alexandr Wang to discuss the most impactful recent advances in AI and how we can enable a new wave of innovation by democratizing AI for the benefit of everyone in society.
Introducing Kevin Scott
Kevin Scott is Executive Vice President of Technology & Research and Chief Technology Officer at Microsoft. Scott also hosts a podcast, Behind the Tech, and is the author of “Reprogramming the American Dream,” which explores his vision of AI being democratized so that it might benefit all.
Video: Kevin Scott, CTO of Microsoft, sits down with Alexandr Wang, CEO of Scale AI (https://fast.wistia.com/embed/medias/9lbo3y04fz)
What Are Kevin’s Key Takeaways?
After listening to Kevin, it is difficult not to be optimistic about the future of AI. Recent advances in the field of AI are leading the way towards a future where those without AI experience or technical skills can still build and deploy AI models. This promises to improve the way we live and work in ways that have yet to be imagined.
AI Is More Accessible Than Ever Before
In the early 2000s, expectations for AI weren’t high, Kevin explains. AI was largely the realm of big tech companies, which used it to address mostly back-office needs. AI-enabled applications ranged from basic classification and forecasting tasks to predicting click-through rates (CTRs) for online ads.
Today, we see AI being used in almost every industry, increasingly by those without AI experience or skills.
AI Tools Are Here To Help (Everyone!)
Kevin described how, from the invention of the first tools to today’s modern machinery, we’ve continuously reinvented the way we do physical work. With the increasing maturity of AI, we’re now beginning to see the reinvention of cognitive work.
Less than 20 years ago, building machine learning systems meant churning through graduate textbooks just to write low-level code that performed a single task. Since the deep-learning revolution of 2012, the potential applications of AI have broadened dramatically.
Now, AI tools can generate code for you from nothing more than natural language instructions. Pre-trained AI models and tools like OpenAI Codex and GitHub Copilot have transformed long-established roles and tasks, with software development being just one example. We could soon imagine a time when developers will not need to know any programming languages.
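To make this concrete, here is a minimal sketch of generating code from a natural language instruction through a hosted model API. It uses the pre-1.0 OpenAI Python client; the API key, model name, and prompt are illustrative assumptions, not details from the session.

```python
# Minimal sketch: natural-language-to-code with a hosted large language
# model (pre-1.0 OpenAI Python client; the model name is illustrative
# and may no longer be available).
import openai

openai.api_key = "YOUR_API_KEY"  # assumption: you have an API key

prompt = (
    "# Python 3\n"
    "# Write a function that returns the n-th Fibonacci number.\n"
)

response = openai.Completion.create(
    model="code-davinci-002",  # a Codex-style code model (illustrative)
    prompt=prompt,
    max_tokens=150,
    temperature=0,  # deterministic output suits code generation
)

print(response.choices[0].text)  # generated code, to be reviewed by a human
```

The instruction is plain English in a comment; the model completes it with candidate code that a developer still reviews before use.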
Today, we see AI being used by scientists running simulations and developers building intelligent applications with machine learning APIs.
The full democratization of AI might be realized when anyone, without an AI or technology background, can invent and build new applications with AI. The ability of AI to automate laborious tasks, whether physical or cognitive, will help us reimagine the division of labor between humans and machines.
If AI-based tools, such as robots, are already able to ace our standardized tests, why should we demand that our high-school students do the same? Perhaps, Kevin asks, we should focus instead on the things that make us uniquely human. We should become as comfortable with the ways AI can help us as we are with the other forms of automation already in our daily lives.
How Do We Know When We Have Democratized AI for All?
In the ongoing democratization of AI, how do we know when we have done enough to make AI accessible to all? Kevin offers one possible success metric: the number of new and novel applications created with advanced models that not even the creators of those models could have envisioned.
In other words, we will have democratized AI if we can greatly reduce the traditional skill, technology, and capital investments that have limited AI to the largest big-tech enterprises.
Kevin suggests that a relatively small group of highly skilled people could build advanced models for a much larger group of non-AI-skilled people with industry-specific domain expertise. That larger group can then re-purpose or re-train those models to solve problems that the original model creators may never have envisioned.
Transformer Models Are Large and Expensive To Build, but They Can Still Be Cost-Efficient
Powerful transformer models are well known to be expensive to build. The data and compute needed to train them are so vast that a single training run of the GPT-3 transformer model can reportedly cost over $4.6M.
Transformers that have been pre-trained on large datasets can be re-trained and re-used for particular applications through transfer learning. OpenAI reports that it could fine-tune a large GPT-3 model with a new dataset just 0.000000211% the size of the original training data. This means it’s now possible to re-purpose these models at a tiny fraction of the initial training cost.
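As an illustration of this kind of transfer learning, here is a minimal sketch using the Hugging Face transformers and datasets libraries. The base model, dataset, and sample size are illustrative assumptions, not the setup OpenAI reports; the point is that only a small task-specific dataset is needed.

```python
# Minimal sketch: fine-tuning a pre-trained transformer on a small,
# task-specific dataset (transfer learning). Names are illustrative.
from datasets import load_dataset
from transformers import (AutoModelForSequenceClassification,
                          AutoTokenizer, Trainer, TrainingArguments)

tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "distilbert-base-uncased", num_labels=2)  # reuse pre-trained weights

dataset = load_dataset("imdb")  # a small labeled dataset for the new task

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, padding="max_length")

# Use only a tiny slice of data: the pre-trained model does the heavy lifting.
train = dataset["train"].shuffle(seed=42).select(range(2000))
train = train.map(tokenize, batched=True)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="out", num_train_epochs=1),
    train_dataset=train,
)
trainer.train()  # fine-tunes on the small new dataset only
```

Nothing about the original pre-training corpus is needed here; the re-user starts from published weights and pays only for the small fine-tuning run.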
The ability to easily tune these complex models for new applications opens the field, letting anybody build new solutions without access to considerable computational resources.
Beyond re-using existing models for new use cases, can we train these models more efficiently? Kevin observes that the human brain is capable of far more than today’s transformers, yet it consumes a tiny fraction of the energy these models use in training. Is the human brain proof that model training could be much less computationally intensive? A collaborative relationship between AI and cognitive science research could one day lead to AI that approaches the energy efficiency of our brains.
Deep Learning: The Scientific Discovery Tool We Need
Using deep learning, we have made significant advances on complex problems such as drug discovery. Non-deep-learning algorithms often rely on a human understanding of the problem domain to discover possible solutions.
With deep learning systems, we can model the complexity of some problem domains in ways that are difficult for both heuristic and analytical approaches. These advances don’t mean the end of mathematical approaches to problem solving; rather, they are the next step in pursuing further progress. The ability of deep learning models to automatically discover and generalize complex patterns makes them an excellent tool in this regard.
Transformer models and reinforcement learning are now used to solve scientific problems involving simulation and combinatorial optimization. These models can learn the structure of the problem, dispensing with the inelegant heuristics of prior non-AI methods. One example is Caltech’s deep learning model for simulations of airfoil design. Researchers built the Fourier Neural Operator to solve partial differential equations (PDEs), including the famous Navier-Stokes equations for airflow. This neural operator can solve PDEs at 1,000 times the speed of traditional methods.
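The core idea behind a Fourier Neural Operator layer can be sketched briefly: transform the input to frequency space, apply learned weights to a handful of low-frequency modes, and transform back. The simplified 1-D PyTorch sketch below illustrates that idea under our own assumptions; it is not the Caltech implementation.

```python
# Minimal sketch of an FNO-style spectral convolution layer (1-D).
import torch
import torch.nn as nn

class SpectralConv1d(nn.Module):
    def __init__(self, channels: int, modes: int):
        super().__init__()
        self.modes = modes  # number of low-frequency modes to keep
        scale = 1.0 / channels
        # one learned complex channel-mixing matrix per retained mode
        self.weights = nn.Parameter(
            scale * torch.randn(channels, channels, modes, dtype=torch.cfloat))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, channels, grid_points)
        x_ft = torch.fft.rfft(x)           # to frequency space
        out_ft = torch.zeros_like(x_ft)
        # mix channels for the lowest `modes` frequencies only
        out_ft[:, :, :self.modes] = torch.einsum(
            "bim,iom->bom", x_ft[:, :, :self.modes], self.weights)
        return torch.fft.irfft(out_ft, n=x.size(-1))  # back to physical space

# Usage: one layer applied to a batch of functions sampled on a 1-D grid.
layer = SpectralConv1d(channels=8, modes=16)
u = torch.randn(4, 8, 256)  # 4 samples, 8 channels, 256 grid points
v = layer(u)                # same shape: (4, 8, 256)
```

Because the learned operator acts in frequency space rather than stepping through a fixed mesh, a trained model can produce a solution in a single forward pass, which is where the orders-of-magnitude speedups over traditional step-by-step solvers come from.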
A recently developed application of AI is pharmacological drug discovery. Deep-learning techniques helped researchers identify protein structures with extremely high binding affinity for the receptor-binding domain of SARS-CoV-2. The next step could be modeling the interactions that a synthetic protein might have across the entire human proteome.
We Need a New (AI) Moonshot
The Apollo space program introduced the term 'moonshot' into our modern vocabulary, commonly taken to mean 'an extremely ambitious project or mission undertaken to achieve a monumental goal.' While the goal of landing on the moon may have seemed arbitrary, reaching it required advances that have remained valuable in the decades since. The Apollo program cost roughly 2% of U.S. GDP each year for 10 years, yet it yielded advances that impact our daily lives even today.
Kevin shares how we could achieve our own AI moonshot for far less than 2% of GDP. We just need to pick a long-term goal that tackles one of the many hard problems facing society today: affordable healthcare, climate change, or even demographic inversion.
We Should Tackle the Fear, Uncertainty, and Doubt of AI, Head On
There is fear, in both industry and society, that adopting AI into our lives is bound to take jobs away from humans. Yet we still need AI tools, because there are some truly complex problems we can't solve without them.
As AI practitioners, we should consider any possible adverse effects of AI, but it’s important we don’t lose our optimism in the process. To do this, we should maintain a balanced and thoughtful approach.
Open and frank conversations about AI safety and bias can help us maintain that balance. With that in mind, we need teachers of AI to educate and inform the general public, in much the same way Neil deGrasse Tyson teaches astrophysics. AI needs spokespeople like him to bring transparency. Encouraging interest in and understanding of AI would allow the public to form their own opinions about the benefits and consequences of broader AI adoption. As Kevin notes, we do not resent a forklift for doing more work than us.
We as a society need to recognize AI's potential to relieve us of repetitive cognitive work. With the democratization of AI should come a recognition of our unique abilities as humans. We should home in on what it means to be human, free of unnecessary cognitive burdens, so that we might apply those abilities with greater creativity.
Would You Like to Learn More?
- See more insights from AI researchers, practitioners and leaders at Scale Exchange
About Kevin Scott
Kevin Scott is executive vice president of Technology & Research and the chief technology officer of Microsoft. He is an innovative leader driving the technical vision to achieve Microsoft’s mission and is passionate about creating technologies that benefit everyone. He focuses on helping make the company an exceptional place for engineers, developers, and researchers to work and learn.
Scott’s 20-year career in technology spans both academia and industry as a researcher, engineer, and leader. Scott holds an M.S. in computer science from Wake Forest University, a B.S. in computer science from Lynchburg College, and has completed most of his Ph.D. in computer science at the University of Virginia. He is an engineering executive, an author, a scholar, and an all-around amazing human being.
Transcript
Nika Carlson (00:23):
Next up, we're thrilled to welcome Kevin Scott. Kevin Scott is executive vice president of technology and research and the CTO of Microsoft. He is an innovative leader, driving the technical vision to achieve Microsoft's mission and is passionate about creating technologies that benefit everyone. He focuses on helping make the company an exceptional place for engineers, developers and researchers to work and learn. Scott's 20-year career in technology spans both academia and industry as a researcher, engineer and leader. Prior to joining Microsoft, he was senior vice president of engineering and operations at LinkedIn, where he helped build the technology and engineering team and led the company through an IPO and six years of rapid growth. Scott is the host of the podcast Behind the Tech, and authored the book Reprogramming the American Dream, which explores how AI can be realistically used to serve the interests of everyone and not just the privileged few. He has received a Google Founder's Award, an Intel PhD fellowship, and an ACM Recognition of Service Award. Scott holds an MS in computer science from Wake Forest University, a BS in computer science from Lynchburg College, and has completed most of his PhD in computer science at the University of Virginia. Kevin is joined by Alex Wang, CEO and founder at Scale. Alex, over to you.
Alex Wang (01:53):
Awesome. Thank you so much for sitting down with us today, Kevin. I'm super excited to be chatting with you.
Kevin Scott (01:59):
Yeah, I'm really glad to be back again.
Alex Wang (02:02):
So yeah, so first of all, yeah, welcome back to Transform and excited to do another conversation. I think that we're going to talk all about AI. You've been writing and speaking about AI for quite some time and obviously in your book, Reprogramming the American Dream, you predict that AI will radically disrupt economics and employment for generations to come. So maybe first just to start out with, what was kind of your first aha moment with AI? I know you've been working with it for some time now and when did you kind of first realize that it had incredible potential to change the way we live and work?
Kevin Scott (02:40):
Yeah. I've been in machine learning now for a while. I think the first machine learning program I wrote was in 2004, so yeah, like a really long while I guess relatively speaking. Back then, I wouldn't have expected to see as much progress as we've seen over the past 17 years. Like what I was doing was relatively arcane, like I was building a bunch of classification technology for some things in Google's ad system and eventually using a bag of machine learning techniques for doing very technical things like predicting the CTR for an ad. But what I think we've seen, especially with the deep learning revolution since 2012 or so and maybe more so over the past few years is that the machine learning technologies are becoming more broadly applicable and I think this has really accelerated over the past two to three years and I think things are finally at the point where you just see a broader and broader range of people using machine learning to solve their problems every day, which means that machine learning has a greater and greater impact on what's happening in the world and it's not just ranking search results or ranking the content that you have in your social media feed.
Kevin Scott (04:25):
It is helping scientists solve problems that they have in their particular domain, it's helping us manage global pandemics, and increasingly, it is packaging machine learning up in ways where lots and lots of developers can easily use the technologies. So things like the OpenAI API, Azure Cognitive Services, Codex, GitHub Copilot. Like there are just so many ways now that developers can actually pick these tools up where they don't have to do what I did 17 years ago and sort of stare at a stack of graduate textbooks and spend six months writing a bunch of low level C++ code to get a single thing to work.
Alex Wang (05:14):
Yeah. Totally, and I think that like ... Yeah, Microsoft [inaudible 00:05:18] partnered on a bunch of really exciting tools, you mentioned Copilot and Codex and all that stuff more recently. When we had our last conversation in March, we were kind of ... You shared your thoughts on kind of potential self-supervised models as well as the transformer networks that had recently come out. Maybe take us quickly on a quick tour of what you think are some of the most important advancements since then, whether they be sort of applications of that technology like with Codex, with additional optimizations, Codex and Copilot, and what you think the potential implications of those advancements are.
Kevin Scott (05:56):
Yeah, I think the core things that we talked about last time have continued to proceed apace. So the self-supervised models themselves, the larger they scale, they are becoming better at performing the tasks that we know they can perform and they're also admitting a broader range of tasks. So at each decade of scale it seems, you can do more and more with these models and we've gotten to the point where you can really start thinking about these models themselves. Some people call them pre-trained models, like we call them platform models at Microsoft. Stanford just published a really good survey paper, it's 160-some pages, so it's almost like a monograph that talks about the field and I think they're calling them foundation models but the general idea is that not everyone should have to replicate these big models from scratch and you can sort of use them as software engineering objects: you can take what someone else has trained and then, either fine-tuned or using no fine-tuning whatsoever, go solve a whole bunch of problems that the original model builders didn't think about at all.
Kevin Scott (07:12):
So that's really exciting, like that can unlock a ton of creativity and this broadening of what things are useful for has definitely been on display over the past six months. So the OpenAI Codex model and its use in GitHub Copilot is a pretty good example of this. So you take a model that started off being good at a bunch of natural language tasks and it turns out you can make it be good at artificial language tasks, like programming. So that's super exciting, but we've also seen these self-supervised learning techniques applied in different areas. So the same set of techniques are being applied to vision and we've had a whole bunch of breakthrough vision results over the past six months where you're building these pre-trained models for computer vision. We are seeing multimodal models where you're training in multiple domains simultaneously. So text and vision for instance, and then maybe even the most exciting things that are happening along these lines are things where we're applying the technique to things like graphs. So researchers at MSR published an interesting paper on this idea called GraphFormers which is a way to do self-supervised learning on graph structured data and the particular thing that they're using these pre-trained models for is in molecular dynamics and solving simulation problems around molecules and solving molecular structures like protein folding. Like that's super, super, super exciting.
Kevin Scott (09:01):
I will say the second interesting thing that has just continued to happen and that's accelerating and picking up pace in general over the past six months is the number of places that people are using either these self-supervised platform models or using techniques from reinforcement learning to solve problems in the sciences where you have maybe a simulation problem or a combinatorial optimization problem where it is very, very computationally expensive to get to very accurate solutions to the problem. In the case of combinatorial optimization most of these interesting things are [inaudible 00:09:46] problems and in the case of simulation problems you usually have this trade-off of complexity and time scale or time steps in your simulation versus the overall amount of compute you're spending or the amount of compute time that a calculation takes. And we have been able in a whole bunch of places to use machine learning to learn something about the structure of these problems where you can dramatically accelerate the performance of simulations. So an interesting paper from some PhD students at Caltech on using this bag of techniques for solving Navier-Stokes flow equations for airfoil design, there's just been some very interesting stuff happening in molecular dynamics, and the stunning thing here is that when you apply these techniques, things aren't getting 10% better or 50% better or twice as good, they're getting orders of magnitude better.
Kevin Scott (10:53):
This Caltech result, which is emblematic of a pattern that I've seen hundreds of times now, got 1,000x performance improvement by building what they're calling neural operator for their simulation system for Navier-Stokes. So it's just sort of an incredibly exciting thing as more people figure out how to use these techniques.
Alex Wang (11:20):
Yeah, totally. I mean one mental model I have here and I'm curious what you think about it is like ... In a lot of these real world problems, if you think about the ages of human sort of science building or knowledge building, we kind of had math and physics as our primary analysis tool and so that allowed us to get certain levels of performance and to do things really well but like now deep learning and machine learning is almost this like new kind of analysis tool that is like better at finding weak correlations in data, better at finding sort of like these subtle patterns that frankly human brains aren't that great at identifying and you can use it as an analysis tool to like ... To make significant progress on things that maybe with previous analysis tools we would say are just like ... We know are really, really hard, like the simulations, the sort of biological molecular dynamics problems, and all these things that you just referenced.
Kevin Scott (12:19):
Yeah. I completely, completely agree with that. It is almost ... You can think of it almost like an adjunct to mathematics and numerical optimization in a certain sense. I mean with these combinatorial optimization problems, like it's just been obvious for a while that ... I don't want to offend any of my computer scientist friends who work on combinatorial optimization but usually these things get solved with approximation algorithms and the approximation algorithms involve a bunch of heuristics and the heuristics are not elegant.
Kevin Scott (12:54):
So they are just ... And again, I'm going to end up offending a whole bunch of people but like they're almost like hacks where you sort of take a human understanding of some of this complexity or shortcuts that you can find based on your understanding of the problem domain and you use that to try to accelerate a solution of the problems and what we're seeing is exactly what you said, you can use machine learning systems to really understand or model the complexity of some of these problem domains in ways that are just really hard to do with heuristics or even with analytical modeling. It doesn't mean that you throw your mathematics away and that you don't care about elegance and that you're okay not understanding what these things do, it's just ... It is the next step along this path that we were already on to dealing with a world that is really, really, really complicated and you just sort of need these tools to make progress.
Alex Wang (14:02):
100%. One thing: a lot of the community is pretty focused on AGI as kind of like this great end state but there's obviously lots of applications along the way that have already changed people's lives, like AI personal assistants or obviously all the improvements to ranking systems online and there's been now improvements that have seeped into the medical industry, seeped into the financial services industry, et cetera, et cetera. And so there's lots of pretty exciting advancements that are real tangible advancements that are happening. What are some transformative use cases for AI that you think are kind of around the corner? One of them in my opinion is Copilot that GitHub released with Codex, like accelerating the innovation loop of writing code is something that's pretty exciting and will have a lot of downstream effects, but I'm curious what are the use cases you're excited about?
Kevin Scott (14:57):
Yeah, I mean the way that I have started thinking about these tools is they are here to assist us in doing cognitive work, and we have lots and lots and lots of tools that we've been using for the entire history of the human race to assist us with doing physical work and even in the 17th or 18th century, we came up with a set of physical theories and engineering principles around what physical work actually is and what the science of that is, and we're very early days in having the same thing for cognitive work. But Copilot, it's a good example. No one is going to argue that programming is not cognitive work. You're sitting down and sort of using your understanding of a machine and your understanding of the problem that you're trying to solve and your understanding of this very complicated set of tools that we have for solving programming problems and you're trying to wrestle with all of that complexity to go write a piece of code that does what you intend it to do.
Kevin Scott (16:10):
So Copilot is a tool for helping people do this sort of cognitive work. One of the things where it's been especially helpful for me is ... I am at this point what I would call an occasional programmer. So I'm not, thank goodness, I'm not on any sort of production critical path with the code that I write but I still write code because it's a great way for me to think. But I am at a point in my coding life where the ecosystem of APIs and tools is just so hard that I find myself sitting in front of an editor or a notebook or something and I'm just constantly doing searches for what's the right Pandas invocation to slice this data in this particular way.
Kevin Scott (17:08):
So the point of Copilot is to help you manage some of that complexity so you can be more productive at programming. It is absolutely not about ... It's not about removing the need that the world has for programmers. It's almost a recognition that the world has a profound need for programmers and that we want to take our precious few programmers that we have and help them be more productive at their work. Then I think the next step beyond that which is really exciting is sort of widening the aperture on who can be a programmer. You and I may have chatted about this before, but like this is one of the more exciting things to me about machine learning in general is it may help, and I think Copilot is a very big step along this path, to change what it means to be a programmer from you have to be an expert in all of these different, very complicated things in order to write any code at all to like providing some sort of entry point for people where they can write a little bit of code by trying to teach the computer what to do, not trying to tell it what it must do in very arcane language. So that I think is super, super exciting, especially in a world where we don't have enough programmers to build all of the software that the world needs.
Kevin Scott (18:45):
You touched on a couple of other like really interesting things, so these breakthroughs that we're having with these simulation systems where we're getting orders of magnitude improvement in performance, like that's not just, "Oh gee, we can do a little bit more," it sort of transforms what you're able to do. I'm really excited about what's happening in drug discovery, in protein design right now. So we already saw that using this bag of techniques allowed us to come up with structures, protein structures that have super high binding affinity for the receptor binding domain on SARS-CoV-2 but the thing that's even more exciting than that is being able to model the interactions that a synthetic protein like that might have inside of the entire human proteome and like that is computationally infeasible right now. So even if you had solved structures for the entire human proteome, being able to do all of those pair-wise interactions would be prohibitively complicated or prohibitively expensive. But if you had a system that makes the computation of these interactions in a simulation environment a million times faster than they were before, like all of a sudden, you can do just extraordinarily interesting things.
Alex Wang (20:18):
Totally. Yeah yeah yeah. And one thing that you touched on there which I think AI has the ability to really level the playing field and democratize access to technology over time, which is something that's incredibly exciting. There's sort of the famous quote which is, "The future is here, it isn't equally distributed yet." I think that for a lot of us in the tech world, we obviously get very excited because we see the future, we see these large-scale machine learning models, we see what the potential is, but I do think one thing that is maybe underappreciated is the degree to which this technology is ultimately going to democratize the future. So maybe, I'm curious to hear your thoughts a little bit on how that happens. I know this is something that you've written a lot about in the past. You grew up in rural Virginia, in rural America, and it's something that I know you think a lot about, not only from a technological sense but also a sociological sense. Because I'm curious how you think this will play out.
Kevin Scott (21:18):
Well I think this whole notion of models as platforms, of APIs, of computing systems that allow you to sort of explain in human language an effect you're trying to accomplish versus having to learn a programming language and computer science in order to be able to get a computer to do something programmatically that a programmer hasn't already coded for you is really exactly this step towards democratization. Like the way that we will know that we're being successful here is how many new businesses are being created because they have access to these technologies, because they are able to build a brand new thing that wasn't possible before because they can get access to one of these APIs or because they have their scientists and their engineers figuring out how to apply these new techniques to problems they've been working on for a while.
Kevin Scott (22:30):
I think we really do have both an opportunity and a responsibility to make sure that as we make these platforms available that they're available to everyone. I'm just a firm believer that we at tech companies, in Silicon Valley startups, can't possibly imagine all of the interesting and valuable things that can be done with technology, and that we need not just tens or hundreds of thousands of people who work at these companies to be building things but we need hundreds of millions or billions of people able to harness the power of these machines. Where they get to decide what the machines do, not have someone decide on their behalf what the machine can do for them. I really do think we have that opportunity right now. Like these tools are actually open, like you can be an entrepreneur in rural America, and like this has even changed for the better over the past year and a half with the pandemic. Like you can be anywhere and be an entrepreneur.
Kevin Scott (23:56):
You don't need a team of data scientists or AI engineers sitting right next to you in some place like Gladys, Virginia where I grew up which has a few hundred people, like you could start a business, collaborating with people, using videoconferencing technology and the internet, wherever you all are in the world and running your applications on a cloud and using all of these opensource tools and these APIs that are available to you. I think it's really exciting, and like part of what we have to do is just show people that it's possible.
Alex Wang (24:36):
Yeah. Totally. I want to drill in on one very specific component of kind of the democratization of AI which is one of the things that a lot of these new methods require is they require huge amounts of compute, and over time, I think it's been increasingly obvious that there's sort of like almost an AI compute bottleneck where we just need more and more compute to train these great algorithms. How do you think the AI community in the United States or just even globally should think about addressing this sort of like compute bottleneck where the current resources and funding are highly skewed in favor of a small subset of the community or the industry where there's access to huge amounts of compute resources, but ideally we want obviously this technology to be accessible to all?
Kevin Scott (25:29):
Yeah. So I completely agree that this is a huge problem, but before I get into like why and what we should do about it, like I will start by saying there is some good news. Like the fact that you do have models that operate as platforms and companies that are willing to put APIs around these models and to provide access to them means that not everyone necessarily needs to have a multi-hundred-million-dollar AI supercomputer to train a model from scratch. And even if we at Microsoft look at how our machine learning work is progressing over time, we're going from this mode where every team at the company has for maybe the past 10 years thought about ... It's my team and my data scientists and my data and my model and my experimentation framework and my feedback loop and my ML ops to deploy into production and my product iteration process where you're basically building everything and now with these platform models, we're able to ... For natural language, give folks big models that they then are fine-tuning and deploying and if you look at what we were spending just in terms of computation and data wrangling across the entire breadth of like all of these separate machine learning efforts, like it's actually ...
Kevin Scott (27:11):
Even though we're spending an enormous amount of resources training bigger models, we're spending less money training smaller models. So it is, I believe in the long run, especially given how fast we're able to improve the efficiency of the training process that this is going to net out and be less expensive over the long run. But you are absolutely right that ... In the moment right now, if you want to build a 10 trillion parameter model, there just aren't that many places in the world that have enough compute in one place where you can go build such a thing, and so I think it is really important in the United States for our policymakers to think about how we can provide a lot more funding to universities and independent public research labs so they can go build similar sorts of things and help there be a diversity of thought about the systems architecture aspects of this. Because it's super complicated what we're doing and I think Microsoft and Google and Facebook and some of the other folks who are building these really big models are publishing a lot and sharing some of what they're doing but it's not the same as if you had all of academia being able to actively participate in the whole process.
Kevin Scott (28:42):
So the thing also that I'm hopeful for is I don't know that it's always going to be this expensive in terms of compute. So we've had these step function breakthroughs in the efficiency of training, in the efficiency of inference over the past handful of years where I think there are just a huge amount of gains to be had and like one of the intuitive reasons that everyone should believe that is you might have a megawatt cluster for training a really big model. You have a training cluster sitting between your ears that burns tens of watts of power. So there is a biological proof point that training could be much more efficient than it is right now and so what we're doing in Microsoft is we're investing super heavily in scale-up because it's an engineering problem at this point. Like we know how to ... It's complicated, but you know how you go do the work and you spend the money and like this thing is going to scale, but like we're also investing fairly heavily in alternatives to the prevailing paradigm because we believe that it can be better.
Alex Wang (30:08):
I actually, like to your point on American policy, one thing that I'm really curious to hear your thoughts on. So the United States, we kind of led the world on internet standards and 4G standards, and right now we're kind of in this painful battle with China on 5G standards. I know this is something that Microsoft actually sort of stepped in on, did a huge amount of work on, working to make sure that the United States had viable alternatives when it came to 5G.
Alex Wang (30:41):
Something that's unreported, underappreciated and actually frankly kind of misunderstood is the growing battle on AI standards between the United States and China. What do you think that we can be doing as a country or the U.S. policymakers can be doing to ensure that we don't end up in another 5G situation, but where we really come out ahead on standards in governance for AI?
Kevin Scott (31:05):
Yeah. I think the place where it starts is having good public-private partnerships on these things and I know that we have been pretty heavily involved in the National Security Commission on AI and Eric Horvitz who is our chief scientific officer and like one of the luminaries in AI machine learning was one of the co-authors of that document along with his colleagues and peers from Google and a bunch of other places. So it is the case that I think we have a robust conversation happening between practitioners in industry, in academia and in government about where AI is headed and what some of the policies are. I think that there's a bunch of things that we should be thinking about in terms of national competitiveness and just sort of policies and incentives that we're putting in place just because we have a ... It's not just the U.S., I think it's sort of ... You have this notion of liberal democratic nations have a particular ethos around what it is they want technology to do for their citizens, and how it is they want their citizens accordingly to participate in the development of the technology, and I think by and large, that has served us extraordinarily well over the past century and arguably beyond.
Kevin Scott (32:51):
So I think it's really important for all of us to pay attention and to continue having these conversations and to invest, and like I would argue that we need to be spending more of our public resources in building a healthy set of foundations for AI, starting with making sure that we've got good silicon investments. Again, you hit the nail on the head that we are compute constrained right now and compute starts with silicon, so we need to make sure that we can build the chips that we need to ensure that we can build these AI supercomputers. It's super important that we continue to have robust pipelines of diverse students who are getting computer science degrees and PhDs in computer science and focusing and specializing in artificial intelligence and machine learning.
Kevin Scott (33:51):
Then I think, I write about this in my book, but with the Apollo program, we spent something on the order of 2% of GDP every year for 10 years to get to the moon, and it was such a transformative thing for society that now you literally ... I mean like we talk about moonshots, as like a super ambitious thing that we all want to do, and like everybody's got a moonshot. Like the moonshot came from the Apollo program, like where we literally taught the world what high ambition looks like. But in a sense, going to the moon was super arbitrary. But what it took to get to the moon was not, and it gave us our entire defense industrial base, it gave us our aerospace industry, it gave us like just an unimaginable acceleration and boost in technology and I think we could do the same thing by spending an even smaller percentage of GDP than 2% on machine learning and picking as our next moonshot something that is like a problem we really need solved. Like affordable healthcare for everyone or a set of solutions to global warming or ...
Kevin Scott (35:18):
This is the one that I think about a lot, the demographic inversion that we're in the middle of right now where we have a very rapidly aging population in almost the entirety of the industrialized world, including China, Japan and Korea, where if we don't have massive technological improvements to productivity and to problems like healthcare, we're going to be in trouble because you're going to have more retired people aging and in declining health than you have workers to do all of the work that they were doing before. So it's sort of like a two for one, like who's going to help take care of the aging and like who's going to do the work that they no longer are doing because they're not in the workforce and in my mind, that's a thing that we have to have more AI machine learning and the things that will come from that to solve.
Alex Wang (36:18):
I think one of the things that I always love when I talk to you is like you're definitely a very big AI optimist and I think actually what's even more true is you probably look towards AI as one of the few answers to some of these big societal questions that we have. What do you think ... You kind of alluded to this before with Codex and Copilot, which is that Codex is not the end of programming at all. It may be the end of programming as we know it today. It might up-level the activity and might make us more efficient. How do you think about this in a general sense? Like how do you think we can best prepare as a country for the ways in which work will change and evolve as a result of AI adoption and what does that mean for people within the AI community for what we can be doing to minimize any negative externalities of that change?
Kevin Scott (37:14):
Yeah, I think ... To answer the last question, I think all of us who are developing AI technologies ought to be thinking about both the good and the bad of what you can do with these tools as we develop them, and it's impossible to imagine all the bad in the same way that it's impossible to imagine all of the good. But you at least have to have some sort of grounding in both directions. Like you can't just be an optimist without being a little bit of a pessimist as well. I think the conversations that we're having right now about what the ... About adverse effects and safety and ethics and bias and ... I think these conversations by and large have been just extraordinarily valuable in trying to get us centered. But I do want to make sure that we don't lose the optimism as we are focused on the bad stuff that potentially can happen because we ...
Kevin Scott (38:28):
Like I really do believe that we need these tools in the world because we've got a bunch of very, very hard problems that are going to be difficult to solve without them and it's not like I'm happy with whatever way it is we ultimately solve the problems but we sort of have to solve them and as an engineer and someone whose entire life has been about looking at things that are broken and imagining how you go fix them, these are very, very promising looking tools. I mean I think the biggest thing that we can do to prepare people for what's coming is just make sure that we all who are in the field are being better teachers for everyone about where the technology is and where we see it going. I probably mentioned this to you before as well but yeah we have Neil deGrasse Tyson, who I love, going on late night TV to talk to the American public about astrophysics, which is great. Like we need more scientific literacy in the population at large, but we don't really have that corollary for computer scientists and engineers or God forbid artificial intelligence practitioners who are like really trying to make that effort to go out and talk to the public about what's going on and what they should be hopeful about and what they should be concerned about.
Kevin Scott (40:06):
The last thing that I will say is I think that in addition to all of us taking on those two responsibilities, like thinking about both the good and the bad and like doing more, talking to just folks everywhere about what it is that we're doing to try to get them more involved and inspired is I do think these tools as they become democratized will need people to be literate or numerate on several dimensions of technology but like it also reemphasizes our need to focus on like what is essentially human. Like creativity, our ability to communicate with one another, our ability to reach consensus, to build these stable social equilibria. So I think that's an important thing that we got to educate our kids for. I tell the teachers and the administrators at my kids' school this all the time, it's like they fret about test taking and like are we preparing these kids to get perfect scores on standardized tests and I'm like, "Well the robots already can get perfect scores on standardized tests." It's like that's not the thing that is going to distinguish our children in the future and prepare them to actually use the tools to go do the creative things that only they can do.
Kevin Scott (41:51):
So yeah, I mean I think just us focusing a little bit on what makes us human, and it is not doing repetitive cognitive work, any more than we lament the existence of the forklift because it can lift more weight than a human being. Like we're not going to in 10 years from now lament the existence of a machine learning system that can assist us with cognitive work. We're going to think, "Oh the crap that this thing is doing is like that was miserable stuff that I never wanted to do in the first place."
Alex Wang (42:30):
Yep. Yep. Totally. I want to actually go back to moonshots because this is always a fun conversation that we have. Obviously I think ... To your point, sort of the Apollo project was this incredible goal, this incredible mission that created all these positive externalities, all this good, and I think there's been more recent examples of maybe similar kinds of instances where there's some sort of like public-private university partnership around goals that inspire a lot of innovation, for example, the DARPA Urban Grand Challenge, Urban Driving Challenge, kind of created the modern autonomous vehicle industry which is super exciting. What are some other moonshots that ... We have a lot of really bright people, really talented people in the audience today. What are some other moonshots that you've encouraged people to be thinking about or that you want to see more people be doing?
Kevin Scott (43:31):
So I think security would be a very good one. How to use these tools, machine learning and otherwise to help defend all of us from malicious uses of the technology and the infrastructure that is in the world. I would love to see folks thinking about a grand challenge in manufacturing and making. One of the things that we're seeing right now as a consequence of the pandemic is we have just a complete and utter global snarl in our supply chains and so what do we need to do to robustify all of that stuff so that we can continue to build all of the things that society needs and maybe even build more of those things more cheaply and more flexibly than we have been. I think there is certainly a grand challenge on health in general and like that's a very broad category, it's everything from how can you have good expert companionship for the elderly for instance. There are places in rural Maine for instance where for no amount of money can you pay for the amount of elder care that is needed in that population and like that's just sort of a canary in the coalmine for what's about to come.
Kevin Scott (45:14):
So how do you use technology to help with elder care so the elderly can live a healthy, dignified life. There's probably a grand challenge on drug discovery that would be outstanding so like we know that there are an infinity of molecules out there that we can use for good therapeutic effect. So we hopefully ... Well some of us at least have seen the miracle that are these mRNA vaccines that are just ... Now have revolutionized immunology but those same technologies and ones that are even more sophisticated may be the things that are going to help us finally cure cancer and to deal with neurodegenerative disease. I could go on for another half an hour here.
Alex Wang (46:23):
Yeah no, I think the drug discovery one is such a great example, and it really speaks to when we talk about like what's the potential optimization. It's so massive, right? I remember looking, if you looked at the early coverage on how long is it going to take us to develop and launch a vaccine for COVID, the original estimates were like a decade and then with the recent advancement in biotechnology, we can move so much faster and that's not even with AI. The potential for us to just get better at all these things is massive.
Alex Wang (46:56):
I kind of want to end again, I think we're big AI optimists and so it can be fun to kind of think long-term. As you know, the quote usually goes: we overestimate what we can do in a year but underestimate what we can do in 10 years. What do you kind of think are good goals for the AI community to set over the next decade? Like what do you think, looking back, if we don't accomplish, we're going to be pretty disappointed in ourselves if we didn't make like X happen with AI?
Kevin Scott (47:31):
I think we should have objectives around real democratization of the technology. If the bulk of the value that gets created from AI accrues to a handful of companies on the West Coast of the United States, like that is a failure. I think we need to have objectives around how do you measure real human benefit so ... Like all of the folks who are practitioners at your event like understand that these AI systems are very good at optimizing objective functions and like we just need to choose objective functions that are ... That really take into consideration human benefit and sort of social good. I think we need real innovation in the fundamental algorithms here because we have like a set of things that are working but like obviously they are very inefficient relative to where they could be. So I don't know. Maybe those three things I think in my mind ... Like they are literally top of mind for me. Like it's not even 10 years, like if we can't get those things right in the next two years, we're going to have big problems.
Alex Wang (49:07):
Yeah. No I love it. Well with that, thank you so much for taking the time, Kevin. This was a super fun conversation, we covered a lot of ground and thanks for coming back to Transform.
Kevin Scott (49:17):
Yeah. Thanks for having me.