Enabling collaborative data science with Nebari

Authors

Description

Data scientists increasingly work in teams, but the tools for effective collaboration still face significant challenges. These include:

* Easy deployment: Some DevOps experience is required to deploy any collaborative notebook editing platform on a specific infrastructure.
* Environment management: Once a platform is configured with a base environment, it can be complicated to add and use new packages and libraries.
* Efficient scalability: Once a platform is deployed on a particular set of resources (CPU/GPU/RAM), it can be difficult to quickly scale workflows to higher resource configurations when necessary.

In this talk, we will discuss the challenges in the current collaboration landscape and introduce Nebari, an open source data science platform designed to fill those gaps.

Nebari enables organizations to quickly deploy a collaborative platform on any of the major cloud providers. Once deployed, it gives teams, small and large, easy access to individual Jupyter Notebook servers in the cloud, where they can start writing and running reproducible and scalable data science workflows. Because Nebari is integrated with conda-store and Dask, users can build, share, and access conda virtual environments from their servers, and also launch clusters to handle compute-intensive tasks. Nebari also lets you build and share dashboards and data applications within the organization, and manage the platform through a GitOps approach.

We will discuss why and how Nebari was developed using open source tools such as Terraform, Kubernetes, JupyterHub, and Keycloak.

Ultimately, we hope to equip the audience with the tools and knowledge to promote better collaboration within data science teams. Those interested can adopt Nebari as a ready-to-use platform within their organization, or take it as a model for developing a custom platform based on open source tools.