The Next Five Years of Keras & TensorFlow with Francois Chollet
François Chollet is a software engineer at Google, where he leads the Keras team. He is the author of a popular textbook on deep learning. He also does research on abstraction, reasoning, and how to achieve greater generality in artificial intelligence.
Francois Chollet discusses the latest developments in the Keras & TensorFlow ecosystem and how Google is preparing for the next generation of deep learning research and applications.
Brad Porter: Thank you Alex and thank you Drago for joining us. Our next speaker is Francois Chollet. Francois is a software engineer at Google, as well as the creator of Keras. Keras is an open source, deep learning library designed to enable rapid experimentation with deep neural networks. Francois is also the author of the popular textbook, Deep Learning with Python. Francois is a world-class engineer as well as a philosopher and thought leader in the field of AI and machine learning. Many of you in our audience are likely familiar with Keras, which serves as the interface for deep learning libraries like TensorFlow. Francois joins us today to talk about the latest developments, and the next five years at Keras and TensorFlow. Francois, thank you for joining us and please go ahead and take it away.
Francois Chollet: Hello everyone, my name is Francois, I work at Google on the Keras team. My job is to build the tools that support the workflows of machine learning engineers, mainly at Google, but outside of Google as well, since Keras is an open source project. I believe that today we are at a very interesting time in the history of machine learning. Machine learning has been around for a while, but right now we are in a transition period, where it’s becoming a ubiquitous tool that’s part of the toolbox of every developer out there, similar to web development in the late 1990s. And we are already applying machine learning to an amazing range of important problems, across domains as different as medical imaging, agriculture, autonomous driving, education, disaster prevention, and manufacturing. I think it’s still early days for machine learning, and for deep learning in particular. Deep learning has only realized a small fraction of its potential so far, and the full actualization of deep learning will be a multi-decade transformation.
Francois Chollet: Over time, deep learning will make its way into every problem where it can help, hence it will become as commonplace as web development is today. And this week is a special week. Keras is turning six, it’s Keras’s birthday, and Keras today is bigger than ever. It’s used by approximately 6% of all professional developers in the world, according to a developer survey, and six years is a long time in deep learning years. By the way, that’s our mascot character there, it’s a unicorn. So, today is a good time to reflect on where we are now and where we are going next. And that’s what this presentation will be about. I’ll talk about what Keras and TensorFlow look like today, and about how they will be evolving over the next few years.
Francois Chollet: So let’s start by talking a bit about the relationship between Keras and TensorFlow. Keras and TensorFlow have had a symbiotic relationship for many years. Keras serves as a high-level API for TensorFlow, and TensorFlow, what it really is, is an infrastructure layer for differentiable programming. So it contains functionality for building any sort of system based on large-scale distributed numerical computing, potentially leveraging (inaudible 00:03:20). So, TensorFlow is not a deep learning library, it’s actually way more general than that, and meanwhile Keras is its user-facing interface. So Keras is what makes TensorFlow accessible and productive. It contains deep learning specific concepts like layers and optimizers and so on. And while TensorFlow is an infrastructure layer for manipulating tensors and gradients and such, Keras is really this UX layer for deep learning specifically. UX is often a neglected aspect of developer tools. I often encounter this belief that tools only have to make things possible, that they don’t have to make things nice or productive, and I think that’s wrong. I think that UX design is a hugely important component of any software tool.
Francois Chollet: Here’s why UX matters. If you want to build something great, a factor that’s critical is the speed at which you can iterate on your ideas. Let’s say you want to win a data science competition, for instance. It’s not the smartest person who wins, it’s not the person who started with the best ideas. It’s actually the person who iterated the most times on their ideas, and in deep learning, iteration cycles depend on three factors. You start from an idea, and you use a deep learning framework to implement an experiment to test the idea, then you run your experiment on computing infrastructure like GPUs and TPUs, and finally you analyze and visualize your results, and your results feed back into the next idea. And developing great ideas depends entirely on your ability to go through this loop of progress as fast as possible. And these three things are critical: you want good visualization tools, you want fast GPUs or big data centers, and finally, you want a software framework that enables you to move from idea to experiment as fast as possible.
Francois Chollet: And that’s what Keras is, it’s a medium for expressing your ideas about deep learning experiments as fast as possible, as easily as possible, with the most flexibility. That’s what Keras is optimized for. So you might think, good UX means it’s easy to use, and that’s good mostly for beginners, for basic users. And it’s a take I encounter a lot, but I think it’s a very bad take. The people who stand to benefit most from good UX aren’t the beginners. It’s actually everyone, including the most advanced users, because good design makes you more productive by minimizing iteration time. People who use Keras fall into three categories, and each of these categories benefits from highly productive frameworks. There are basic users, who could be students for instance, then there are engineers who care about production needs. So for instance, the engineers at Waymo that work on autonomous driving, they are heavy users of Keras, and the engineers working on machine learning at YouTube are also heavy users of Keras. And finally, you have researchers.
Francois Chollet: So right now, there are around 250 new deep learning papers posted every month that mention Keras, and each of these user profiles needs something completely different out of it. So, the question you’re probably itching to ask is, how do you achieve good productivity across user profiles as different as a first-year grad student, or a Kaggle grandmaster, or a Waymo engineer, or a researcher in academia? And the answer to that is, the key design principle that Keras uses is progressive disclosure of complexity. So Keras doesn’t force you to follow a single true way of building and training models. Instead, it enables a wide range of different workflows, from the very high level to the very low level, corresponding to different user profiles. So you could be using Keras at a very high level (inaudible 00:07:28), just calling fit and letting the framework do its thing, writing very little code, or you could be using Keras more like NumPy, taking full control of every detail, which is the sort of tool that researchers are looking for.
Francois Chollet: And so the spectrum of workflows is structured around two axes. There’s the model-building axis and there’s the model-training axis. So let’s take a look at the model-building axis first. The simplest model you can build is the Sequential model, which only allows a plain stack of layers, so it’s a good fit for beginners. It’s very simple, very approachable, it looks like a list. Now, if you want multiple inputs, multiple outputs, if you want to share layers, if you want a more complex model topology, then you can use the functional API. And it still provides a lot of guidance and protection against user errors. So, it’s a good fit for most machine learning practitioners, it has a good balance between flexibility and ease of use. But then you can gradually extend and customize your models further by using custom layers, custom losses, custom metrics, and at the end of that spectrum, you can use model subclassing, in which case you are writing your own code, and Keras only provides a very lightweight scaffolding for structuring your code.
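The three model-building workflows described above can be sketched as follows; this is a minimal, illustrative example, and the layer sizes and names are arbitrary assumptions, not recommendations:

```python
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers

# Beginner level: the Sequential model, a plain stack of layers.
sequential_model = keras.Sequential([
    keras.Input(shape=(16,)),
    layers.Dense(32, activation="relu"),
    layers.Dense(1, activation="sigmoid"),
])

# Intermediate level: the functional API, which handles multiple
# inputs/outputs and arbitrary topologies, while still catching
# shape mismatches and other user errors early.
inputs = keras.Input(shape=(16,))
x = layers.Dense(32, activation="relu")(inputs)
outputs = layers.Dense(1, activation="sigmoid")(x)
functional_model = keras.Model(inputs, outputs)

# Expert level: model subclassing, where Keras only provides a
# lightweight scaffolding and you write the forward pass yourself.
class MyModel(keras.Model):
    def __init__(self):
        super().__init__()
        self.dense1 = layers.Dense(32, activation="relu")
        self.dense2 = layers.Dense(1, activation="sigmoid")

    def call(self, inputs):
        return self.dense2(self.dense1(inputs))

subclassed_model = MyModel()
```

All three objects are Keras models and expose the same API surface, which is what lets these workflows interoperate.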
Francois Chollet: And with all these workflows, your models are Keras models, so they’re still exposing the same API surface, which enables things like serialization, code sharing between teams, and so on. All these models can talk to each other because they’re built on top of the same classes, the same objects. Another axis of progressive disclosure of complexity is model training. The simplest way to train a model is to call the fit method, just like in scikit-learn. But the fit method only covers supervised learning, so if you’re not doing supervised learning, you need to write your own training loop, but you don’t have to write it from scratch.
Francois Chollet: An easy way to customize what fit is doing is to override a specific method on the Model class, the train_step method. And this gives you the ability to implement custom training logic that will still support all of the built-in features of fit, like callback support, or GPU and TPU distribution. And that’s great for something like generative models or unsupervised learning. And finally, if you really don’t want to use fit, then you can write your own low-level training loop entirely from scratch. Today, Keras is six years old, and we can ask, what about the next few years, like 2025, maybe even 2026? What are we going to be building next? The real question we should be answering is actually, where is the world of deep learning headed? What are the big trends that we’re seeing? And what can we do to facilitate these trends and create the most value for the industry and the research community?
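The train_step override described above can be sketched like this; it's a minimal example assuming a plain supervised setup, and the tiny regression model at the end is purely illustrative:

```python
import tensorflow as tf
from tensorflow import keras

# Overriding train_step customizes what fit() does on each batch,
# while keeping fit's built-in machinery (callbacks, distribution).
class CustomModel(keras.Model):
    def train_step(self, data):
        x, y = data
        with tf.GradientTape() as tape:
            y_pred = self(x, training=True)
            # Uses the loss configured in compile().
            loss = self.compute_loss(y=y, y_pred=y_pred)
        # Standard gradient-descent step on the trainable weights.
        grads = tape.gradient(loss, self.trainable_variables)
        self.optimizer.apply_gradients(zip(grads, self.trainable_variables))
        return {"loss": loss}

# Build it like any functional model, then call fit() as usual.
inputs = keras.Input(shape=(8,))
outputs = keras.layers.Dense(1)(inputs)
model = CustomModel(inputs, outputs)
model.compile(optimizer="adam", loss="mse")
```

For a generative model or unsupervised objective, the body of train_step would compute a custom loss instead, but the surrounding fit() machinery stays the same.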
Francois Chollet: Because, as makers of tools, that’s our job. We track trends, and then we facilitate them, in the same way that Keras and TensorFlow originally facilitated the rise of deep learning in 2015 and 2016. So now we’re facilitating the evolution of deep learning as a field, going forward. Now, here are four trends that I’m seeing. The first one is towards more reuse of parts in our workflows, not just model reuse but also feature reuse. So, I think we are going to be seeing larger ecosystems of reusable pre-trained models for many different tasks. Another trend is towards more automation and higher-level workflows. Today, engineers and researchers are still doing a lot of things by hand through trial and error, and doing things through trial and error is a machine’s job, it’s not the job of a human. And so, there’s going to be a big move towards automated hyper-parameter tuning, towards architecture search, and towards end-to-end AutoML.
Francois Chollet: Another trend, and that’s really a universal thing in computer science, is a trend towards faster specialized chips, and distributed computing at an increasingly large scale, and workflows that are moving away from local hardware and into the cloud. So that’s not exclusive to deep learning, right? You already see it across the entire software industry. And the last thing is this: I think today we are only using deep learning for a tiny fraction of the problems it could solve, maybe 10%, 15%. Yet deep learning is actually applicable to pretty much every single industry. So I think over the next 5 to 10 years, we’re going to see deep learning move out of the lab into the real world, where we’re going to tackle the remaining 85% to 90% of problems that we can solve.
Francois Chollet: So let’s talk about the first trend. Today most deep learning workflows are still very inefficient, because they involve recreating the same kinds of models over and over, and retraining these models on similar datasets over and over, from scratch, every time, pretty much. Most work we do does not get reused. And so there’s a big contrast between, on the one hand, the traditional software development world, where reusing packages is the default and where most engineering actually consists of assembling together existing functionality, and on the other hand, the deep learning world, where you have to build your own models and train them pretty much from scratch, every time, instead of just importing what you need. And if you think about it, the most successful programming platforms, like JavaScript and Python, feature strong network effects: they combine a multitude of different components and packages, where each component is making the other components in the ecosystem more useful and more valuable.
Francois Chollet: They have an ecosystem of reusable parts that cover pretty much everything you might want to do. Like if you want to write bad English, there is a package for that, if you want it in gravity, just ask for a module for it in gravity, so you can just import it to gravity and start flying. You shouldn’t have to build the things you need from scratch, you should just be able to import them. And I think that’s coming to the deep learning world. So on the Keras side, which when we enable that, we’re going to be expanding our API into domain specific packages that, particularly, KerasCV and KerasNLP, that provides reusable building blocks that you can use to quickly create, any kind of computer vision or natural language, and we’re also expanding, offering pre-trained models and the classifications package. So right now it’s really focused exclusively on image classification and that’s really about to change. We’re going to get in, more computer vision, we’re also going to be expanding it to natural language and pre-trained models as well.
Francois Chollet: The next trend is towards increasing automation. A large fraction of deep learning research today still consists of fiddling with hyper-parameters, and that’s about to change, because this is really not a job for humans. So, in the next few years, we are going to move beyond handcrafting architectures, beyond manually tuning your learning rates and constantly fiddling with trivial configuration details. Machine learning workflows will become increasingly higher-level, and if you think calling model.fit is high-level, think again, because if you stand back, you’ll realize that this remains a very specialized, fairly low-level type of workflow, that still requires significant expertise in terms of data management, model architecture choices, and so on. There are still tons of decisions that you have to know how to make, and for many use cases, that’s going to change.
Francois Chollet: So, in the relatively far future, like in 5 to 10 years, you can imagine that we’ll have APIs that look like this. The input that the user will provide will be in the form of a dataset and a metric to optimize for, as well as some way of specifying expert knowledge about the problem domain. And then, a white-box algorithm will take over and will develop a solution, via a combination of search over banks of pre-trained features, architecture search informed by banks of model architecture patterns, so you can think of things like Transformers or ResNets, and finally hyper-parameter tuning. And the output of this process will not just be a trained model, but a deployed model in the format you need: it could be an API, it could be an embedded mobile app, and so on. So, at that point, data scientists will be able to focus on the problem specification and the data curation aspects of the workflow, instead of fiddling with learning rates or whatever else.
Francois Chollet: So in this graph here, it kind of looks like a one-way process, where you’re going linearly from the problem specification to the solution. But, in fact, it will be more of a dialogue between the algorithm and the user, because the algorithm is a white box. So the user will be able to visualize what the algorithm is trying, and the algorithm will provide feedback to the user as well. So for instance, when the data is insufficient or the objective is badly specified, the search process will be able to tell you what you need to work on, in order to make the problem more solvable and improve your results.
Francois Chollet: So this is obviously a very advanced system, it’s not a system that we will build in a day, it will happen layer by layer over the years, with each layer building on top of the foundations that were established by software that previously existed. So pragmatically with the Keras project what we are doing, it’s not quite building this exact system because that’s actually not something we are quite ready to build yet. Instead, we’re focused on building solid foundations that will eventually enable the system, so that the system will be built on top of the foundations that we’re establishing today.
Francois Chollet: And an important part of these foundations is the Keras Tuner package. So Keras Tuner is a very general, flexible hyper-parameter tuning framework that we released a couple of years ago, so it’s pretty much here now, you can use it today, and we use it a lot internally. It’s built for Keras, but it’s not limited to Keras. You can actually use it for pretty much anything, you can use it with scikit-learn for instance, or with XGBoost models. It supports a number of built-in search strategies, and it’s also fairly easy to add your own strategies, if you’re a researcher working on hyper-parameter tuning. It also features tunable models, like a tunable ResNet, a tunable Xception, and so on, because it actually takes a lot of effort and skill to craft the right search space for a problem, the right hyper-parameter search space. So we’ve made these premade search spaces that are relevant to broad categories of computer vision tasks. You just add data, basically, and you get to a very good model without much thought.
Francois Chollet: And the next layer above Keras Tuner in the automation stack is AutoKeras. So AutoKeras is an AutoML library. It’s end-to-end: you start from data, you end up with a trained model, and it’s built entirely on top of Keras Tuner. So now you start to see how these different pieces of software are really infrastructure layers stacked on top of each other. TensorFlow is the lowest layer, which enables Keras, and then Keras enables Keras Tuner, and then Keras Tuner enables AutoKeras at the highest level. So, AutoKeras automates the model development process: it will analyze your dataset, determine the best model architecture templates for your data, and then it will run architecture search and hyper-parameter tuning to find the best model. And in the future, we’ll also add search over banks of pre-trained features to reduce the need for training data.
Francois Chollet: The next trend that we are anticipating is towards faster specialized chips and more of them, larger-scale distributed training, and a move towards cloud-based workflows. So, in the future, it will be as easy to start training a model on hundreds of GPUs as it is to train a model in a Colab notebook or on your laptop today. So, you’ll be able to access remote large-scale distributed resources with no more friction than accessing your local CPU.
Francois Chollet: And this will be hugely impactful for engineers and researchers, and here’s why. So, remember the loop of progress I was talking about: the quality of your ideas is a direct function of the speed at which you can iterate, going from idea to experiment results, to the next idea. And for this loop to be as fast as possible, you need three pieces. So, first you need a very productive deep learning framework, to go from idea to experiment as fast as possible. But if the speed at which you can implement your ideas is no longer the bottleneck, then the next bottleneck is the speed at which you can run your experiments. So with faster hardware and easy-to-use distributed training, we’re increasing iteration speed again, and so we are accelerating progress. Suppose you have a Python script that contains some Keras model training. So first you debug it locally, so it’s just running on your laptop, or maybe it’s running in a Colab notebook. And once it’s debugged, it’s working, you want to run it as fast as possible.
Francois Chollet: So you’re not going to run it on your laptop, or on the free GPU of colab. So what can you do? Well, you just add one line of code to your script, and it starts your script and it’s set. And now it should be running on about eight workers with four GPUs biomarkers, it will be set with GPUs, or it could be running on a CPU. So, just one line, and what this one line does is this. So, first it will collect your scripts and its dependencies, as well as any local data files that you’ve specified in the direct config. Then it will inject a distribution strategy configuration into your cast mode, so you don’t have to worry about distribution. We can still specify it manually if you want but, it’d be great if we can just turn all these things forth. All right, we’ll create a document with all of these, then it will connect to Google Cloud.
Francois Chollet: It will spin up machines corresponding to the configuration of your choice, it will start training, it will stream the logs and the saved funds like the saved models, to a location of choice. So, you can configure those coolbacks for instance, and it works for all kinds of hardware and it also works well with Keras Tuner. So imagine doing distributed work that you need, without having to worry about things like cluster configuration at work with communication. So we think this is a pretty good product. We are making scaling that easy and we are making it easier in orders of magnitude. So we think this will really help, cast the tuition time from between those families.
Francois Chollet: And one last trend I wanted to talk to you about, I think, research labs are increasingly not where the cool stuff is happening, like spending a billion dollars to train big reinforcement learning models to beat some benchmark, isn’t cool. You know what’s cool? It’s deploying machine learning into the real world. Deploying machine learning to every problem that we can solve with these technologies because, so far we’ve only scratched the surface. So we need to make deep learning possible, it should be possible to just save the Keras model, and run it anywhere, on the mobile device, in a web browser, on an embedded system, on the microcontroller, and so on. And not just for inference but also for fine tuning and retrain. So what are we doing to make models more possible?
Francois Chollet: So, one thing we’ve shared recently is a complete rework, of how we do pre-processing data counts. We are taking the stance that pre-processing should be done as part of the model, not as a step before training, because otherwise a problem you’re going to be facing when you try to load your model with JavaScript, for instance, is that you have to, reimplement the input data pipeline into JavaScript. And it has to be the exact same pipeline because, otherwise your model won’t work or maybe its performance will degrade, even if you’re just tracking off. So that’s what’s called a training setting screw, and this can get very complicated for natural language processing, for instance. So, we’re introducing a new set of layers that do pre-processing as part of your model, and embedded into your model, when you explore it. So Keras models should be, raw data in and prediction out, they can ingest strings, they can ingest raw images and that makes them fully possible.
Francois Chollet: Another thing that we are doing is that we are really betting on TensorFlow lite for mobile devices, and Tensorflow JS for deep learning browser, the ability to run deep learning browser on mobile devices at production level performance, and then also use a range of new applications. When it comes to on-device machine learning, so Tensorflow lite is borrowing machine learning on Google, and with devices across Google products already. It’s used in pretty much every on device and manufactured in large products across Google, it’s used for instance, as well as Yahoo, and Google translates, it’s used in Google Assistant, if you have an Android phone. And when it comes to machine language aspect, transactional GS is a platform for deploying and even training models directly in JavaScript, and this has been enabling some pretty cool applications.
Francois Chollet: Here are a few recent examples. For example, in the top left corner, is a Google doodle, in which users could synthesize music in the style of, Pac, using a machine learning model, running in the browser and this has been pretty popular, and when it got released, over just three days, more than 50 million people were able to create, save and share the music that they made. And the example in the top right corner, that’s Facematch, which is an open source pre-trained model, that you can start using it directly in your own applications. And it’s pretty good for formatted Realty applications, for instance.
Francois Chollet: Another pre-trained model is the body-pix model. Body-pix in the most present segmentation, again, in both single and multiple persons image, and it can identify 24 reports. That’s pretty cool. So for instance, this model can be used to draw faces or draw backgrounds in a video. Another thing which we’re doing to make models possible is the model optimization toolkit. So the model optimization toolkit makes it practical to deploy models, that should be actual models, on mobile and many devices. So it’s a set of activities that can dramatically reduce the size, the memory consumption and compute consumption for models, especially as inference. That’s basically, a way to drain your users battery on phones, for instance, whereas you need to run in a very resource sponsoring environment. And one of this optimization techniques, is post-training quantization, which is where you convert your model weights from floating-point format] to integers, and compared to three different weights, quantized model, can be up to two to four X faster on CPU.
Francois Chollet: We also, I suppose, provide weight pruning, in particular, Pruning-aware mode trainingl. And weight pruning, it’s about setting most connections in a network to zero, which means that your weights fires will become sparse, will become much smaller and to conserve more accuracy, weight pruning needs to be done during the training, according to a specific schedule. And it’s actually that hard to sift through by hand but we have an easy to use API that does it for you, and it can achieve up to 90% spot setting so we can judge smaller size by 10 X, at only a very small loss of features. So to sum up, we have four trends in the deep training world. We are looking at quite a reusable model on that next year, and modules and training features. We are looking at more automation, larger scales for training and seamless access to cloud resources.
Francois Chollet: And finally portable models and real world deployment, and there’s a lot of synergies between these trends. So for instance, to push towards automation, that’s making machine learning more accessible and likewise, making distributed cloud computing easy to use, that’s making machine learning more accessible. You don’t need to be an expert or to have your own hardware in order to train large scale image classifier. And meanwhile, if deep learning is more accessible, then it’s also going be moving to the real world faster, because right now we are leading at a transformative moment, where machine learning is moving at unprecedented rates and problems that we thought were impossible or too complex to solve just a few years ago, they’re now solved by applying deep learning. And we’re not going to realize the full potential of these technologies, if we wait for tech industry researchers like, Google, Microsoft, and Amazon to solve all the problems that need solving, because the tech industry is often not even aware of the problems that need solving.
Francois Chollet: For instance, you can apply deep learning to optimize fish farming in Norway, or to monitor the Amazon rainforest for illegal acts. So in order to realize the full potential of AI, we need to make these technologies radically accessible. We need to put them into the hands of anyone with an idea and some coding skills so that, the people who are most familiar with these problems can start solving these problems on their own, in the same way that today anyone can make websites without needing to wait for the tech industry to make a website to solve the corresponding problem. So, that’s the future that we’re building, layer by layer, incrementally. It’s a future where AI is a radically accessible tool in the hands of everyone, not just an expense. And that’s how we’ll succeed in deploying these technologies to every problem that you can solve with these technologies and realize the full potential of AI. Thank you for listening, and I’m looking forward to see the cool applications which will be with Keras and TensorFlow.