Comprehensive Hands-on Walk Through of Dremio Cloud Next Gen (Hands-on with Free Trial)
November 12, 2025
Walkthrough with the new trial of the Dremio Cloud Platform
A curated guide to mastering Apache Iceberg, data lakehouse architectures, and the emerging field of Agentic AI for data professionals.
Dive into the world of commercial Iceberg catalogs and discover how they enhance data lakehouse architectures for modern data engineering.
Exploring paths to a universal lakehouse catalog that supports multiple data formats and engines, building on Apache Iceberg's success.
Learn how to leverage Apache Iceberg with Apache Polaris and Apache Spark to build scalable and efficient data lakehouses.
What's Coming in Apache Iceberg v4: A Deep Dive into the Future of Open Table Formats
Understanding Iceberg, Delta Lake, Hudi, Paimon, and DuckLake
What is the Data Lakehouse and the Data Lakehouse Ecosystem? This comprehensive guide covers everything you need to know about the Data Lakehouse architecture, open table formats like Apache Iceberg, Delta Lake, Apache Hudi, and Apache Paimon, and the modern data ecosystem that supports them.
Learn how to automate compaction, snapshot expiration, and layout optimization in Apache Iceberg using metadata-driven triggers and orchestration tools for a self-healing lakehouse.
Learn how to scale Apache Iceberg table optimizations across large datasets using parallelism, checkpointing, and failure recovery to ensure reliability and performance.
Unlocking the Power of Agentic AI with Apache Iceberg and Dremio
Partition evolution in Apache Iceberg is a powerful feature, but if not managed carefully, it can introduce fragmentation and impact compaction performance. Learn how to handle it effectively.
Discover how to use Apache Iceberg's metadata tables to proactively detect small files, bloated manifests, and table fragmentation—so you can trigger compaction only when it's needed.
Learn how to design an effective schedule for compaction and snapshot expiration in Apache Iceberg to balance cost, performance, and data freshness.
Learn how to prevent and clean up metadata bloat in Apache Iceberg by expiring snapshots and rewriting manifests for better performance and manageability.
Improve query performance in Apache Iceberg by organizing your data layout with sorting and Z-order clustering. Learn how to reduce scan cost and improve filter effectiveness.
Learn how to design fast, incremental compaction strategies in Apache Iceberg to support high-throughput streaming pipelines without disrupting freshness or performance.
Learn how standard compaction works in Apache Iceberg and why bin packing your data files is essential for maintaining query performance and cost efficiency.
Learn how Apache Iceberg tables can degrade over time without optimization and what issues this causes for performance, cost, and governance.
Continuing the Understand Apache Iceberg series, this article delves into the Manifest, a critical component of Apache Iceberg's architecture.
Continuing the Understand Apache Iceberg series, this article delves into the Manifest List, a critical component of Apache Iceberg's architecture.
Why Apache Iceberg Works
Introductory Course to Data Engineering for Apache Iceberg Lakehouses
Benefits of Apache Iceberg Partition Evolution and Hidden Partitioning
Benefits of Apache Iceberg
How to run SQL on your Excel files easily
Learning about Apache Iceberg
Java, REST, and the expanding open lakehouse ecosystem
Ingesting Data and Building BI Dashboards
Apache Iceberg, Apache Arrow, Nessie, Ibis, Substrait
Understanding how catalogs work and which one to choose
Disrupting the Snowflake/Databricks status quo
Understanding how to choose a table format
The Future of Data Platforms
Resources for learning how to Engineer an Open Data Lakehouse
Nessie is the only open-source catalog implementation specifically for Apache Iceberg.
This is where the combined power of Dremio’s Lakehouse Management features and Project Nessie's catalog-level versioning comes into play.
Why is Dremio so useful for Apache Iceberg data lakehouses?
How to configure Spark for using Apache Iceberg