TensorFlow.js supports the following environments, Mayes said: common web browsers, the server side via Node.js, mobile via React Native or progressive web apps, desktop native via the Electron framework, and even IoT devices such as a Raspberry Pi via Node. Developers can run existing models, retrain them via transfer learning, or write their own models from scratch. TensorFlow.js is open source.
Now the tool can support the execution of other forms of TensorFlow.js models, too, Mayes said, either directly or through a converter.
“We can digest and run TensorFlow Lite, MediaPipe, TensorFlow Hub, and TensorFlow Python models, which can help get more eyes on your research with an audience far larger than before, [since] the web is designed to have high shareability with just a single link.” —Jason Mayes
Mayes discussed some of what developers are creating with TensorFlow.js. For example, the videoconferencing platform InSpace is using real-time toxicity filters in its web conferencing app. That means when a user types something inappropriate or offensive, it’s flagged before it’s sent to the server for processing, he explained. The filter alerts the user that “they might want to reconsider what they’re about to send, creating a more pleasant, conversational experience on the platform,” Mayes said.
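A client-side toxicity check like the one described above can be sketched as below. TensorFlow.js publishes a pre-trained toxicity classifier (the `@tensorflow-models/toxicity` package) whose `classify()` call returns per-label results with a `match` flag; loading the real model needs a browser or Node setup, so it is shown only in comments, and the sample prediction data here is invented for illustration.

```javascript
// Sketch: flagging a toxic message on the client before it is sent.
// The prediction shape mirrors what the @tensorflow-models/toxicity
// classifier returns from model.classify([text]):
//
//   const toxicity = require('@tensorflow-models/toxicity');
//   const model = await toxicity.load(0.85); // match threshold
//   const predictions = await model.classify([draft]);

// Return the labels the model flagged so the UI can warn the user
// before the message ever leaves their machine.
function flaggedLabels(predictions) {
  return predictions
    .filter((p) => p.results.some((r) => r.match === true))
    .map((p) => p.label);
}

// Abridged, invented sample of the classifier's output shape:
const sample = [
  { label: 'identity_attack', results: [{ match: false }] },
  { label: 'insult', results: [{ match: true }] },
  { label: 'threat', results: [{ match: null }] }, // null = model unsure
];

console.log(flaggedLabels(sample)); // ['insult']
```

Because the check runs in the browser, the draft text never has to reach a server at all, which is the privacy point Mayes returns to later.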
Healthcare company IncludeHealth is using TensorFlow.js’s pose estimation models to allow physiotherapy at scale, he added. With many people unable to leave their homes or travel these days, it's especially timely.
“This technology allows for a remote diagnosis from the comfort of their own home, using off-the-shelf technology, like a standard webcam that many people have access to.” —Jason Mayes
Another use case is for enhancing the capabilities of fashion websites. Mayes showed a body segmentation model developed with custom logic for estimating body measurements. This allows the website to automatically select a correctly sized T-shirt, for example, at checkout.
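The "custom logic" layered on top of a segmentation model could look something like the sketch below. The segmentation mask itself would come from a TensorFlow.js body-segmentation model; here we assume a shoulder-width estimate in centimeters has already been derived from that mask, and the size chart is purely hypothetical.

```javascript
// Sketch: mapping a body measurement estimated from a segmentation
// mask to a T-shirt size at checkout. The thresholds below are
// invented for illustration, not from any real size chart.
function suggestShirtSize(shoulderWidthCm) {
  if (shoulderWidthCm < 42) return 'S';
  if (shoulderWidthCm < 46) return 'M';
  if (shoulderWidthCm < 50) return 'L';
  return 'XL';
}

console.log(suggestShirtSize(44)); // 'M'
```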
Mayes cited Google Magenta Studio's tone transfer model, which turns your voice into a musical instrument and runs entirely in a browser. “You can simply sing any song or tune that you wish and hear it being played by an instrument of your choice,” he said.
TensorFlow.js apps can “make the world more accessible,” Mayes said. The partner innovation team at Google uses it to understand sign language gestures. The project is built using a combination of ML model outputs for face, hands, and body. That accommodates the fact that sign language is communicated not only with hands as signs, but also with facial expression, overall body position, and posture, he said.
In fact, he said, the library can be used for any sort of gesture recognition because it's generalizable to any human-computer interaction use case. This opens up possibilities for web-based creative experiences, fitness and health applications, and even human-computer interaction research, Mayes said.
He also demonstrated using object recognition to run the COCO-SSD model live in a browser to provide bounding boxes for 90 common objects the model has been trained on. This allows users to understand not only where in the image the objects are located, but also how many exist, which Mayes said is much more powerful than image recognition alone.
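The counting Mayes describes is straightforward post-processing of the detector's output. The detection shape below matches what the `@tensorflow-models/coco-ssd` package returns from `model.detect()`; the 0.5 confidence cutoff is our own assumption, not a library default, and the sample detections are invented.

```javascript
// Sketch: counting detected objects per class from COCO-SSD output.
// Loading the real model (needs a browser or tfjs-node setup):
//
//   const cocoSsd = require('@tensorflow-models/coco-ssd');
//   const model = await cocoSsd.load();
//   const detections = await model.detect(videoElement);

function countByClass(detections, minScore = 0.5) {
  const counts = {};
  for (const d of detections) {
    if (d.score >= minScore) {
      counts[d.class] = (counts[d.class] || 0) + 1;
    }
  }
  return counts;
}

// Invented sample in the { bbox: [x, y, w, h], class, score } shape:
const detections = [
  { bbox: [10, 20, 100, 200], class: 'person', score: 0.92 },
  { bbox: [150, 30, 80, 180], class: 'person', score: 0.81 },
  { bbox: [300, 40, 60, 60], class: 'dog', score: 0.34 }, // below cutoff
];

console.log(countByClass(detections)); // { person: 2 }
```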
The face mesh model provides high-resolution face tracking; Mayes said the model can recognize 468 points on the human face across multiple faces—all in real time. Several companies use this with existing web technologies, such as the Modiface app, which combines a face mesh with Web Graphics Library (WebGL) shaders for virtually trying on makeup in augmented reality.
Mayes said research teams at Google contributed to two recently released pose estimation models. The first, MoveNet, is a fast and accurate model that tracks 17 key points such as “left knee” and “right shoulder.” It’s optimized for diverse poses and actions and can run over 120 frames per second on an Nvidia 1070 GPU, on the client side in the browser.
The second, MediaPipe's BlazePose, provides 33 key points. It is also tailored for a diverse set of poses and enables gesture-based applications in a range of projects, Mayes said.
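The remote-physiotherapy use case mentioned earlier hints at what can be built on these key points: for example, measuring a joint angle from three of them. The `{ name, x, y }` key-point shape below mirrors what TensorFlow.js pose models expose, and the `left_hip`/`left_knee`/`left_ankle` names are assumptions based on MoveNet's key-point set; the geometry itself is standard.

```javascript
// Sketch: the angle at a joint (vertex) formed by two neighboring
// key points, in degrees -- the kind of measurement a remote
// physiotherapy app would track across video frames.
function angleAt(vertex, a, b) {
  const v1 = { x: a.x - vertex.x, y: a.y - vertex.y };
  const v2 = { x: b.x - vertex.x, y: b.y - vertex.y };
  const dot = v1.x * v2.x + v1.y * v2.y;
  const mag = Math.hypot(v1.x, v1.y) * Math.hypot(v2.x, v2.y);
  return (Math.acos(dot / mag) * 180) / Math.PI;
}

// Knee bent at a right angle: hip directly above the knee, ankle
// out in front (coordinates invented for illustration).
const hip = { name: 'left_hip', x: 0, y: 0 };
const knee = { name: 'left_knee', x: 0, y: 1 };
const ankle = { name: 'left_ankle', x: 1, y: 1 };

console.log(angleAt(knee, hip, ankle)); // ~90 degrees
```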
One developer combined WebGL shaders with TensorFlow.js to make it appear that he could shoot lasers from his eyes and mouth, Mayes said. This was done using a face mesh model running in real time in the browser, he said. A practical application of that might be to provide a creative experience for fans during a movie launch.
TensorFlow.js can also be combined with other emerging web technologies, such as WebRTC for real-time communication, AFrame for mixed reality in the browser, and Three.js for 3D. The result would be “digital teleportation” to anywhere in the world in real time.
Mayes gave examples of how to use other pre-made models, such as a text search that uses Google’s BERT question-and-answer model in TensorFlow.js in the browser. This technology lets anyone build a Q&A system in the browser without training a model from scratch.
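Selecting the best span from the Q&A model's output can be sketched as below. The answer shape (`{ text, score }`) mirrors what the `@tensorflow-models/qna` package returns from `model.findAnswers(question, passage)`; loading the real model is shown only in comments, and the sample answers are invented.

```javascript
// Sketch: picking the highest-scoring answer from BERT Q&A output.
// Loading the real model (needs a browser or tfjs-node setup):
//
//   const qna = require('@tensorflow-models/qna');
//   const model = await qna.load();
//   const answers = await model.findAnswers(question, passage);

function bestAnswer(answers) {
  if (answers.length === 0) return null;
  return answers.reduce((best, a) => (a.score > best.score ? a : best));
}

// Invented sample output:
const answers = [
  { text: 'in the browser', score: 12.3 },
  { text: 'on a server', score: 4.1 },
];

console.log(bestAnswer(answers).text); // 'in the browser'
```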
People can also write their own models from a blank canvas, he said, if they're familiar with ML. TensorFlow.js has two APIs they can work with: the high-level Layers API, which resembles Keras, and the lower-level Ops API for finer control.
Google also supports the ingestion of Keras and TensorFlow SavedModels directly within Node.js without any conversion needed. That would allow a user to directly integrate with web teams that are most likely going to be using Node.js as their preferred framework, he said.
Mayes said that to execute homemade Python models in the browser, TensorFlow’s command-line converter will convert the model to the JSON-based format needed in the browser.
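That conversion step can be sketched as below. The `tensorflowjs_converter` tool ships with the `tensorflowjs` pip package; the input and output paths here are placeholders.

```shell
# Install the converter (ships with the tensorflowjs pip package).
pip install tensorflowjs

# Convert a TensorFlow SavedModel into the web-friendly format:
# a model.json file plus sharded binary weight files.
tensorflowjs_converter \
    --input_format=tf_saved_model \
    ./my_saved_model \
    ./web_model
```

The resulting `web_model` directory is then hosted as static assets alongside the site, which ties into the cost point Mayes makes: only file hosting is paid for, not inference servers.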
In terms of performance, the Node.js implementation talks to the same C++ bindings behind the scenes as Python does, he said, “so you get the same hardware code acceleration for both CPU and GPU.”
To ensure privacy is maintained, inference is performed on the client's machine and no data is ever sent to a third-party server. This is particularly important in regulated industries that cannot transfer data to a third party, such as a cloud provider.
Regarding cost, if no data is being sent to a server, bandwidth use drops and there is no need for server-side hardware to run inference. With no server-side CPU, GPU, or RAM needed for inference, costs are kept down, he said. That means users only have to pay for hosting the website assets and model files, which Mayes said “is far cheaper” than running a full-blown ML server.
The TensorFlow.js team has also created benchmarking tools to help users understand how TensorFlow.js models perform on different devices. This tooling is now available in the TensorFlow.js GitHub repository. To find more step-by-step resources, Mayes suggested checking out Google Codelabs and searching for “TensorFlow.js.”
People can also visit the TensorFlow.js website for more detail on the library or to check out more of Google’s pre-made models. The open source code used can be found on GitHub. Mayes recommended that technical questions be tagged with “TF.js” so the team can spot them easily.
There are “far too many examples out there every single week” to fit into a single presentation, Mayes said, but you can check out what people are doing with the tool by searching on the “#TFJS” hashtag on social media, such as Twitter and LinkedIn. “And if you make something using TensorFlow.js, be sure to use the hashtag too, for a chance to be featured at future events,” he said.
Special interest groups have formed to work on key areas of research; Mayes said anyone is invited to join and help define the future of TensorFlow.js and related projects.
For more details on how to use TensorFlow.js, watch Mayes’ talk, “Building Next-Generation Web-Based Applications Using TensorFlow.js with Google,” on Scale Exchange.