What’s new in Databricks for October 2023

BI & Data Warehousing with your Lakehouse

Lakeview Dashboards are in public preview!

Lakeview Dashboards offer a new dashboarding experience, optimized for ease of use, broad distribution, governance and security.

In addition to a brand-new UX that makes it easier to plot insights, Lakeview Dashboards can be shared with users outside of your organization.

Create your first Dashboard now (video)

Governance and Unity Catalog

Discover and Organize your data in your Lakehouse

Building your semantic layer is getting easier. AI-Generated table comments automatically describe your data assets.

These comments also improve the new semantic search capabilities, letting you ask questions about your lakehouse in plain text (e.g., "List all the tables related to football").

Track your compute resources: Clusters and node types available as System Tables

System tables offer more insight into your lakehouse usage in plain SQL. They are available for:

  • Audit logs
  • Table and column lineage
  • Billable usage
  • Pricing
  • Clusters and node types
  • Marketplace listing access
  • Predictive Optimization

For more information: Databricks Documentation or install the System tables demo with dbdemos
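As a sketch of what querying these tables looks like, here is a query against the new compute system tables, run through the Databricks SQL connector (`pip install databricks-sql-connector`). The hostname, HTTP path, and token are placeholders, and the column names are taken from the system-tables docs — verify them against your metastore.

```python
# Query joining the new compute system tables: clusters and node types.
CLUSTERS_QUERY = """
SELECT c.cluster_id,
       c.cluster_name,
       c.worker_node_type,
       n.core_count,
       n.memory_mb
FROM system.compute.clusters AS c
JOIN system.compute.node_types AS n
  ON c.worker_node_type = n.node_type
WHERE c.delete_time IS NULL   -- only clusters that still exist
"""

def fetch_clusters(server_hostname: str, http_path: str, access_token: str):
    """Run the query against a SQL warehouse and return the rows."""
    from databricks import sql  # third-party: databricks-sql-connector
    with sql.connect(server_hostname=server_hostname,
                     http_path=http_path,
                     access_token=access_token) as conn:
        with conn.cursor() as cur:
            cur.execute(CLUSTERS_QUERY)
            return cur.fetchall()
```

The same query can of course run directly in a notebook or the SQL editor without the connector.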

Ingestion and performance

10x faster DML Delta queries with Deletion Vectors (update, delete, merge)

Deletion vectors are going GA! Updating content in your tables no longer requires the engine to rewrite entire data files (avoiding write amplification). Instead, Delta Lake records deleted or updated rows as separate information, making these operations up to 10x faster!

Deletion vectors are part of Predictive I/O, bringing AI to your Lakehouse for faster queries: See Predictive I/O documentation.

Deletion vectors will be enabled by default starting with DBR 14 (the default behavior can be changed in your workspace settings).
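Until then, deletion vectors can be toggled per table through the `delta.enableDeletionVectors` table property. A small helper sketching the DDL (table name is a placeholder):

```python
def deletion_vectors_ddl(table: str, enabled: bool = True) -> str:
    """Build the ALTER TABLE statement that toggles deletion vectors
    for a single Delta table via the delta.enableDeletionVectors
    table property."""
    flag = "true" if enabled else "false"
    return (f"ALTER TABLE {table} "
            f"SET TBLPROPERTIES ('delta.enableDeletionVectors' = '{flag}')")

print(deletion_vectors_ddl("main.sales.orders"))
```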

Predictive Optimization: Faster queries and cheaper storage

Predictive Optimization leverages Unity Catalog and Lakehouse AI to determine the best optimizations to perform on your data, and then runs those operations on purpose-built serverless infrastructure (VACUUM, OPTIMIZE…). This significantly simplifies your lakehouse journey, freeing up your time to focus on getting business value from your data.

Set the Predictive optimization field in Account console > Settings > Feature Enablement

In just a click, you’ll get the power of AI-optimized data layouts across your Unity Catalog managed tables, making your data faster and more cost-effective.

Note: Predictive Optimization metrics are available as system tables (e.g., "Which tables have been optimized recently?")
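Beyond the account-level toggle, Predictive Optimization can also be enabled per schema in SQL, and its recent activity inspected via a system table. A sketch (catalog/schema names are placeholders, and the history table and column names are taken from the system-tables docs — verify them in your workspace):

```python
# Enable Predictive Optimization on one schema.
ENABLE_PO = "ALTER SCHEMA main.sales ENABLE PREDICTIVE OPTIMIZATION"

# Inspect what it has done in the last week.
PO_HISTORY_QUERY = """
SELECT table_name,
       operation_type,          -- e.g. OPTIMIZE or VACUUM
       start_time,
       end_time
FROM system.storage.predictive_optimization_operations_history
WHERE start_time >= current_date() - INTERVAL 7 DAYS
ORDER BY start_time DESC
"""
```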

For more information, see the Predictive Optimization documentation.

ML & AI + LLMs

Foundation LLM models available in the Marketplace

Llama 2 foundation chat models are now available in the Databricks Marketplace for fine-tuning and deployment on private model serving endpoints.

Each model is wrapped in MLflow and saved within Unity Catalog, making it easy to use the MLflow evaluation in notebooks and to deploy with a single click on LLM-optimized GPU model serving endpoints.
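Because the model lands in Unity Catalog, loading it is just a matter of pointing MLflow at the UC registry and using the three-level name. A minimal sketch — the catalog and schema below are placeholders for wherever the Marketplace listing was installed:

```python
# Three-level Unity Catalog name of the Marketplace model (placeholder).
MODEL_URI = "models:/marketplace_llm.models.llama_2_7b_chat/1"

def load_chat_model():
    """Load the registered model as a generic pyfunc for scoring."""
    import mlflow
    # Target the Unity Catalog model registry, not the legacy
    # workspace registry.
    mlflow.set_registry_uri("databricks-uc")
    return mlflow.pyfunc.load_model(MODEL_URI)
```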

Deploy private LLMs using Databricks Model Serving

These endpoints come pre-configured with GPUs and are optimized to serve foundation models, providing the best cost/performance ratio. This allows you to build and deploy GenAI applications from data ingestion and fine-tuning to model deployment and monitoring, all on a single platform. Watch the video.

Try deploying LLM models now!
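Deployment can also be scripted against the Serving Endpoints REST API (`POST /api/2.0/serving-endpoints`). A sketch of the request body — the endpoint name, model name, version, and workload sizes are placeholders to adapt to your registered model:

```python
import json

# Hypothetical endpoint definition for GPU-accelerated model serving.
ENDPOINT_CONFIG = {
    "name": "llama2-chat",
    "config": {
        "served_models": [
            {
                "model_name": "marketplace_llm.models.llama_2_7b_chat",
                "model_version": "1",
                "workload_type": "GPU_MEDIUM",   # GPU-backed serving
                "workload_size": "Small",
                "scale_to_zero_enabled": True,
            }
        ]
    },
}

print(json.dumps(ENDPOINT_CONFIG, indent=2))
```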

Other updates

Unity Catalog: UCX – Unity Catalog Upgrade Toolkit

Need some help upgrading your data assets to Unity Catalog? Try the new Databricks Labs project. Explore the GitHub repo or get started with the video.

Unity Catalog: Workspace-Catalog binding in Unity Catalog

While a metastore can be shared across multiple workspaces, you can now bind a catalog to specific workspaces, preventing it from being read or written from other workspaces (e.g., the "Development" workspace can only READ the "prod" catalog).
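Bindings can be managed through the UI or the Unity Catalog REST API (`PATCH /api/2.1/unity-catalog/bindings/catalog/<catalog_name>`). A sketch of the request body — the workspace ID is a placeholder, and note that the catalog's isolation mode must also be set to ISOLATED for bindings to take effect:

```python
# Grant the "Development" workspace read-only access to the catalog.
BINDING_REQUEST = {
    "add": [
        {
            "workspace_id": 1234567890,  # placeholder workspace ID
            "binding_type": "BINDING_TYPE_READ_ONLY",  # READ, not WRITE
        }
    ]
}
```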

Watch the recording to get started

Compute: Libraries are now supported in compute policies

If you are a workspace admin, you can now add libraries to compute policies. Compute resources that use the policy will automatically install those libraries, and users can't install or uninstall policy-scoped libraries on that compute. Read the cluster policies documentation.
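A sketch of what such a policy looks like as an API payload — the policy rules and package names are examples, and the field shapes follow the cluster policies API, so treat this as a sketch rather than a definitive schema:

```python
import json

# Cluster policy that pins a Spark version and ships libraries with
# every cluster created from it. The "libraries" block is the new part.
POLICY = {
    "name": "analytics-policy",
    # Policy rules are passed as a JSON-encoded string.
    "definition": json.dumps({
        "spark_version": {"type": "fixed", "value": "14.1.x-scala2.12"},
    }),
    "libraries": [
        {"pypi": {"package": "great-expectations==0.17.19"}},
        {"jar": "dbfs:/FileStore/jars/internal-utils.jar"},
    ],
}
```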

Workflows: Pass parameters in Databricks jobs and if/else conditions

You can now add parameters to your Databricks jobs that are automatically passed to all job tasks that accept key-value pairs. Additionally, you can now use an expanded set of value references to pass context and state between job tasks. Read the documentation for parameters.
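A sketch of a Jobs API settings fragment combining both features: job-level parameters inherited by every task, and an if/else condition task branching on one of them. Task keys and the notebook path are placeholders:

```python
JOB_SETTINGS = {
    "name": "nightly-load",
    # Job-level parameters, available to every task.
    "parameters": [
        {"name": "env", "default": "dev"},
        # Dynamic value reference resolved at run time.
        {"name": "run_date", "default": "{{job.start_time.iso_date}}"},
    ],
    "tasks": [
        {
            # If/else branch on a job parameter.
            "task_key": "check_env",
            "condition_task": {
                "op": "EQUAL_TO",
                "left": "{{job.parameters.env}}",
                "right": "prod",
            },
        },
        {
            # Runs only when the condition above is true.
            "task_key": "load_prod",
            "depends_on": [{"task_key": "check_env", "outcome": "true"}],
            "notebook_task": {"notebook_path": "/Repos/etl/load"},
        },
    ],
}
```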

DAB: Databricks Asset Bundles

Bundles, for short, facilitate the adoption of software engineering best practices, including source control, code review, testing, and continuous integration and delivery (CI/CD).

Demo Center: Databricks Asset Bundles Demo

In a nutshell

  • Databricks Runtime 14.1 is GA: Link
  • You can run selected cells in a notebook.
  • Structured Streaming from Apache Pulsar on Databricks: Link
  • Declare temporary variables in a session which can be set and then referred to from within queries: Link
  • Arguments are explicitly assigned to parameters using the parameter names published by the function: Link
  • Feature Engineering (Feature Store) in Unity Catalog is GA: Link
  • On-demand feature computation is GA.  ML features can be computed on-demand at inference time: Link
  • Structured Streaming can perform streaming reads from views registered with Unity Catalog: Link
  • Databricks AutoML Generated Notebooks are now saved as ML Artifacts: Link
  • Models in Unity Catalog is GA: Link
  • You can now drop some table features for Delta tables. Current support includes dropping deletionVectors and v2Checkpoint: Link
  • Partner Connect now supports Dataiku, RudderStack, and Monte Carlo.
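The session-variable bullet above is easiest to see in SQL. A sketch of the lifecycle — declare, reassign, then reference from a query (table and variable names are illustrative):

```python
# New SQL session variables: scoped to the session, settable, and
# usable anywhere an expression is allowed.
STATEMENTS = [
    "DECLARE VARIABLE cutoff DATE DEFAULT DATE'2023-10-01'",
    "SET VAR cutoff = DATE'2023-10-15'",
    "SELECT * FROM main.sales.orders WHERE order_date >= cutoff",
]

for stmt in STATEMENTS:
    print(stmt)
```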

