Through investments in projects and ecosystems such as TensorFlow, JAX, and PyTorch, Google has used machine learning (ML) to revolutionise several of its services over the past 20 years, including Search, YouTube, Assistant, and Maps.
These OSS efforts are crucial because many AI systems rely on closed or proprietary approaches. That walled-garden strategy stifles innovation, hinders efforts to make AI explainable, ethical, and equitable, and raises entry barriers for developers.
Google says it is dedicated to open ecosystems because it firmly believes that no single company should own AI/ML innovation. In a blog post, the company explores some of its major OSS AI and ML contributions from recent years and discusses how its dedication to open technology can help businesses innovate more quickly and adaptably.
Google’s open source activities support and facilitate AI initiatives through three pillars:
- Access – Thanks to OSS, developers, researchers, and businesses of all sizes can use the latest ML technology. This is essential to democratising ML innovation: it enables customer choice and variety in software, and it reduces operating costs while speeding up scaling for all.
- Transparency – Open source data sets, ML algorithms, training models, frameworks, and compilers allow for due diligence and community evaluation. This is crucial for ML because it supports reproducibility, interpretability, equity, and increased security.
- Innovation – Increased access and transparency naturally lead to more innovation. Google’s clients and partners use open source ML frameworks and toolsets to spur further innovation in the industry.
TensorFlow, JAX, TFX, MLIR, Kubeflow, and Kubernetes are just a few of the open source software projects that Google has contributed to during the past two decades. It has also sponsored important OSS data science initiatives, including Project Jupyter and NumFOCUS. By focusing on these efforts, Google Cloud aims to be the most refined platform for the OSS AI community and ecosystem. Initiatives like these have helped Google become the top contributor to the Cloud Native Computing Foundation (CNCF).
Because the dangers of closed technology can manifest at many stages of an ML pipeline, Google’s OSS policy covers the full “idea-to-production” lifecycle, from gathering data and training models to managing infrastructure and encouraging experimentation and model improvement:
Data acquisition: starting the journey from idea to production-ready ML model
Data is the first step in turning a concept into an ML model. TensorFlow Datasets helps users acquire ready-to-use, adaptable, and highly optimised datasets (including image, audio, and text). It also offers a set of APIs that make it simple for users to organise their datasets, whether they build with TensorFlow, JAX, or other ML frameworks.
Model development and training: shortening the path from data to useful ML
OSS libraries support those who design, implement, test, and debug ML algorithms. On this front, some of Google’s contributions are:
- The TensorFlow core framework, which provides APIs to assist data scientists and programmers in creating and honing production-grade ML models on distributed and accelerated infrastructure powered by GPUs or TPUs;
- Founding membership of the PyTorch Foundation, which puts Google in a position to promote the use of ML by building an ecosystem of open source projects around PyTorch;
- Keras, a lightweight and robust ML framework that is well integrated with TensorFlow and lets developers design and train ML models quickly;
- Model Garden, which offers open source, Google-maintained implementations of numerous cutting-edge computer vision and natural language processing models as well as APIs to quicken training and experimentation;
- JAX, a lean, intuitive, and modular system that combines automatic differentiation (Autograd) and the Accelerated Linear Algebra (XLA) optimising compiler to provide high-performance ML for rapid research and production;
- TensorFlow Hub, a repository of trained ML models that are ready for fine-tuning and deployment; and
- MediaPipe, a cross-platform open source project that lets users apply customisable ML solutions to live and streaming media, including text and video.
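To make the JAX entry in the list above more concrete, here is a minimal sketch of how automatic differentiation and XLA compilation can be combined in practice. The loss function, variable names, and toy data are illustrative assumptions and do not come from Google's post; only `jax.grad` and `jax.jit` are part of JAX itself.

```python
import jax
import jax.numpy as jnp


def mean_squared_error(w, x, y):
    """Hypothetical loss: mean squared error of a linear model x @ w."""
    pred = x @ w
    return jnp.mean((pred - y) ** 2)


# jax.grad builds a function that returns d(loss)/dw (the first argument);
# jax.jit compiles that function with XLA.
grad_fn = jax.jit(jax.grad(mean_squared_error))

# Toy data, chosen only for illustration.
x = jnp.array([[1.0, 2.0], [3.0, 4.0]])
y = jnp.array([1.0, 2.0])
w = jnp.zeros(2)

g = grad_fn(w, x, y)
print(g)
```

The same pattern should carry over to larger models: write the loss as a plain function of its parameters, then wrap it with `jax.grad` and `jax.jit`.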
ML infrastructure management: scaling valuable models with powerful backends
Accessing and administering ML infrastructure, especially at scale, can be a barrier for many organisations. For this reason, Google has invested in efforts like:
- TensorFlow Extended (TFX), a platform that provides software frameworks and tooling for comprehensive MLOps deployments, assisting developers with data automation, model tracking, performance monitoring, and model retraining;
- Kubeflow, which makes deploying ML workflows on Kubernetes simple, portable, and scalable; and
- The TPU Research Cloud (TRC), which provides free access to a cluster of more than 1,000 Cloud TPU devices; selected researchers who publish peer-reviewed articles and/or open source code are eligible for admission.
Experimentation and model optimisation: encouraging discovery and iteration
Data, model training tools, and infrastructure can only go so far without robust processes for experimentation and optimisation. For this reason, Google has contributed to projects like XManager, which allows anyone to run and monitor ML experiments locally or on Vertex AI, and TensorBoard, which makes it easier to track and visualise model performance metrics.
These areas of emphasis will benefit not only Google’s clients but also the entire open source AI community.