Data Lakehouses 101 with PySpark, Trino, and Minio

Authors

Description

In the world of AI, data is a diamond that is often lost in Swamps when companies try to productionize their data, because of bad practices with Data Lakes. A Data Warehouse solves this problem, but it is costly and adds complexity compared with a simple Lake. This is where Data Lakehouses come in: a hybrid solution that offers the best of both worlds. This workshop introduces the Data Lakehouse pattern as a suitable and flexible solution for organizations ranging from small companies to established enterprises. It includes a hands-on component in which you implement your own Data Lakehouse locally with open-source tools that are compatible with production-grade cloud tools.