What’s new in Databricks for May 2023

Platform 

  • AWS fleet instance types are now available when creating a cluster or pool. A fleet instance type maps to multiple comparable AWS instance types, allowing the cluster to use whichever instance type has the best spot capacity and on-demand availability. For more information, see https://docs.databricks.com/compute/aws-fleet-instances.html (AWS only)
  • The new unified navigation experience is in Public Preview. For more information, see https://docs.databricks.com/workspace/unified-nav.html
  • Cluster-scoped init scripts stored on DBFS are deprecated; store them as workspace files instead. For more information, see https://docs.databricks.com/files/workspace-init-scripts.html
  • You can now run file-based SQL queries in a Databricks workflow.
  • Databricks Runtime 13.1 and Databricks Runtime 13.1 for ML are now available as Beta releases.
  • The compliance security profile now supports more EC2 instance types. For more information, see https://docs.databricks.com/security/privacy/security-profile.html#features
  • New regions: Europe (Paris) and South America (São Paulo) (AWS only)
  • Databricks has released version 2.6.33 of the JDBC driver.
  • Workspaces on the Enterprise pricing tier with the enhanced security and compliance add-on will be able to configure a monthly or biweekly schedule for automatic restarts of compute resources when needed to pick up the latest images and security updates (AWS only). For more information, see https://docs.databricks.com/administration-guide/clusters/scheduled-cluster-updates.html
  • You can now authenticate to Databricks REST APIs using OAuth tokens for service principals. A service principal is an identity that you create in Databricks for use with automated tools, jobs, and applications. Account admins can create a client secret for a service principal. You can then use the client secret together with the client ID (the service principal’s application ID) to request an OAuth token for the service principal. The same OAuth token works for both the account and workspaces, as long as the service principal has the correct access. For more information, see https://docs.databricks.com/dev-tools/authentication-oauth.html. A minimal token-request sketch appears after this list.
  • M7g and R7g Graviton3 instance types are now supported on Databricks (AWS only)
  • AWS provides the Instance Metadata Service (IMDS) API for reading instance metadata from your notebooks. AWS announced IMDS version 2 (IMDSv2), which adds security improvements and a session-oriented flow in which requests are protected by a session token. You can now enforce IMDSv2 with a workspace admin setting that is generally available (AWS only). A sketch of the IMDSv2 flow appears after this list.
  • Cluster-scoped Python libraries are supported on Databricks Runtime 13.1 and above. Support is also available for Python wheels that are uploaded as workspace files, but not for libraries referenced using DBFS file paths, including libraries uploaded to the DBFS root. Non-Python libraries are not supported. See the install sketch after this list.
  • Azure Databricks now supports Azure confidential computing VM types when creating clusters. Azure confidential computing helps protect data in use, preventing the cloud provider from accessing sensitive data. For more information, see https://learn.microsoft.com/en-us/azure/databricks/clusters/configure#confidential
  • You can now enable secure cluster connectivity (SCC) on an existing workspace so that your VNet has no open ports and Databricks Runtime cluster nodes have no public IP addresses. In Azure templates, this feature is configured under the name No Public IP (enableNoPublicIp).
  • You can enable or disable Azure Private Link on an existing workspace for private connectivity between users and their Databricks workspaces, and also between compute resources and the control plane of Azure Databricks infrastructure.
  • Terraform provider updated to version 1.17.0
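
To illustrate the OAuth flow for service principals described above, here is a minimal sketch in Python. It assumes an AWS account-level token endpoint of the form shown below and environment variables holding the account ID, client ID, and client secret; adjust the host and identifiers for your deployment and treat the details as an illustration rather than a definitive reference.

```python
import os
import requests

# Assumed configuration (placeholders, not part of the original announcement).
ACCOUNT_ID = os.environ["DATABRICKS_ACCOUNT_ID"]
CLIENT_ID = os.environ["DATABRICKS_CLIENT_ID"]          # the service principal's application ID
CLIENT_SECRET = os.environ["DATABRICKS_CLIENT_SECRET"]  # OAuth secret created by an account admin

# Account-level OAuth token endpoint (shape assumed for AWS deployments).
token_url = f"https://accounts.cloud.databricks.com/oidc/accounts/{ACCOUNT_ID}/v1/token"

# Client-credentials grant: exchange the client ID/secret for an OAuth access token.
resp = requests.post(
    token_url,
    auth=(CLIENT_ID, CLIENT_SECRET),
    data={"grant_type": "client_credentials", "scope": "all-apis"},
)
resp.raise_for_status()
access_token = resp.json()["access_token"]

# The token can then be sent as a bearer token to account or workspace REST APIs,
# provided the service principal has access to the target workspace.
headers = {"Authorization": f"Bearer {access_token}"}
```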
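
For reference, the session-oriented IMDSv2 flow mentioned in the list above looks roughly like the following when run from a cluster node. The metadata endpoints are standard AWS ones; the sketch only illustrates why enforcing IMDSv2 blocks unauthenticated, IMDSv1-style requests.

```python
import requests

# Step 1: obtain a short-lived IMDSv2 session token with a PUT request.
token = requests.put(
    "http://169.254.169.254/latest/api/token",
    headers={"X-aws-ec2-metadata-token-ttl-seconds": "21600"},
    timeout=2,
).text

# Step 2: present that token on every metadata read. With IMDSv2 enforced,
# a plain GET without the token header (the IMDSv1 pattern) is rejected.
instance_id = requests.get(
    "http://169.254.169.254/latest/meta-data/instance-id",
    headers={"X-aws-ec2-metadata-token": token},
    timeout=2,
).text
print(instance_id)
```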
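
As a sketch of installing a cluster-scoped Python wheel that lives in workspace files, the Libraries API can reference the workspace path directly. The workspace URL, cluster ID, and wheel path below are placeholders, and the exact path conventions should be checked against the library documentation.

```python
import os
import requests

host = os.environ["DATABRICKS_HOST"]    # e.g. https://<workspace>.cloud.databricks.com (placeholder)
token = os.environ["DATABRICKS_TOKEN"]  # token with permission to manage the target cluster

payload = {
    "cluster_id": "1234-567890-abcde123",  # placeholder cluster ID
    "libraries": [
        # Wheel uploaded as a workspace file (DBR 13.1+); DBFS paths are not supported.
        {"whl": "/Workspace/Shared/libs/mylib-0.1.0-py3-none-any.whl"}
    ],
}

resp = requests.post(
    f"{host}/api/2.0/libraries/install",
    headers={"Authorization": f"Bearer {token}"},
    json=payload,
)
resp.raise_for_status()
```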

Delta Lake

  • You can now chain multiple stateful operators together, meaning that you can feed the output of an operation such as a windowed aggregation into another stateful operation such as a join (requires DBR 13.1). For more information, see https://docs.databricks.com/structured-streaming/stateful-streaming.html. A small sketch appears after this list.
  • You can now use dropDuplicatesWithinWatermark together with a specified watermark threshold to deduplicate records in Structured Streaming (requires DBR 13.1), as illustrated after this list.
  • You can now use Trigger.AvailableNow to consume records from Kinesis as an incremental batch with Structured Streaming, as sketched after this list. For more information, see https://docs.databricks.com/structured-streaming/kinesis.html#available-now
  • You can now use CLONE and CONVERT TO DELTA with Iceberg tables that have partitions defined on truncated columns of types int, long, and string. Truncated columns of type decimal are not supported.
  • You can now use shallow clones to create new Unity Catalog managed tables from existing Unity Catalog managed tables (requires DBR 13.1). For more information, see https://docs.databricks.com/delta/clone-unity-catalog.html. A SQL sketch appears after this list.
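
A minimal sketch of chaining stateful operators, assuming a Databricks notebook where `spark` is in scope and a hypothetical streaming table named `events` with columns `user` and `eventTime`: a 5-minute windowed count is fed into a second, coarser windowed aggregation using `window_time`, which DBR 13.1 exposes for re-windowing an upstream window column.

```python
from pyspark.sql import functions as F

# Hypothetical streaming source with columns (user, eventTime).
events = (
    spark.readStream.table("events")
    .withWatermark("eventTime", "10 minutes")
)

# First stateful operator: 5-minute windowed count per user.
per_window = events.groupBy(
    F.window("eventTime", "5 minutes"), "user"
).count()

# Second stateful operator chained on top: roll the 5-minute windows up to 1 hour.
# window_time() exposes the event-time of the upstream window so it can be re-windowed.
hourly = per_window.groupBy(
    F.window(F.window_time("window"), "1 hour"), "user"
).agg(F.sum("count").alias("events_per_hour"))

query = (
    hourly.writeStream
    .format("memory")          # demo sink; use Delta in practice
    .queryName("hourly_counts")
    .outputMode("append")
    .start()
)
```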
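
A sketch of dropDuplicatesWithinWatermark, assuming a Databricks notebook where `spark` is in scope and a hypothetical streaming table `clicks` with a `guid` key and an `eventTime` timestamp. Duplicates arriving within the watermark delay of one another are dropped while the state needed to detect them is bounded by the threshold.

```python
# Placeholder source with columns (guid, eventTime, ...).
streaming_df = spark.readStream.table("clicks")

deduped = (
    streaming_df
    .withWatermark("eventTime", "10 hours")     # threshold controls how long duplicate state is kept
    .dropDuplicatesWithinWatermark(["guid"])    # dedupe on guid within the watermark window (DBR 13.1+)
)
```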
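
A sketch of consuming Kinesis as an incremental batch with Trigger.AvailableNow; in PySpark the trigger is expressed as `availableNow=True`. The stream name, region, checkpoint path, and target table are placeholders, and the Kinesis source option names shown here should be confirmed against the linked documentation.

```python
# Assumes a Databricks notebook where `spark` is in scope.
kinesis_df = (
    spark.readStream
    .format("kinesis")
    .option("streamName", "my-event-stream")   # placeholder Kinesis stream
    .option("region", "us-west-2")             # placeholder region
    .option("initialPosition", "trim_horizon")
    .load()
)

query = (
    kinesis_df.writeStream
    .format("delta")
    .option("checkpointLocation", "s3://my-bucket/checkpoints/kinesis")  # placeholder path
    .trigger(availableNow=True)                # process everything available now, then stop
    .toTable("main.default.kinesis_events")    # placeholder target table
)
```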
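
A SQL sketch of creating a new Unity Catalog managed table as a shallow clone of an existing one, run via `spark.sql` so the examples stay in one language; the catalog, schema, and table names are placeholders.

```python
# Assumes a Databricks notebook where `spark` is in scope and both schemas exist in Unity Catalog.
spark.sql("""
  CREATE TABLE IF NOT EXISTS main.analytics.orders_dev
  SHALLOW CLONE main.analytics.orders   -- source and target are both UC managed tables (DBR 13.1+)
""")
```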

Governance

Databricks SQL

  • Schema Browser is now generally available in Data Explorer.
  • The on-hover table detail panel is now less sensitive, so it appears less readily on brief mouse-overs.
  • The escape key now closes the autocomplete panel.
  • View definitions now have syntax highlighting in the Data Explorer details tab.
  • In the SQL Statement API, the EXTERNAL_LINKS disposition now supports the JSON_ARRAY format. You can extract up to 100 GiB of data in JSON format with pre-signed URLs; the INLINE limit for JSON is 16 MiB. A request sketch appears after this list.
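
A sketch of requesting results with the EXTERNAL_LINKS disposition in JSON_ARRAY format. The host, token, warehouse ID, and query below are placeholders; each external link in the response is a pre-signed URL that you fetch separately to read a JSON array of rows.

```python
import os
import requests

host = os.environ["DATABRICKS_HOST"]    # e.g. https://<workspace>.cloud.databricks.com (placeholder)
token = os.environ["DATABRICKS_TOKEN"]  # token with access to the SQL warehouse

# Submit a statement and ask for results as external links in JSON_ARRAY format.
resp = requests.post(
    f"{host}/api/2.0/sql/statements",
    headers={"Authorization": f"Bearer {token}"},
    json={
        "warehouse_id": "abcdef1234567890",  # placeholder warehouse ID
        "statement": "SELECT * FROM samples.nyctaxi.trips LIMIT 1000",  # placeholder query
        "disposition": "EXTERNAL_LINKS",
        "format": "JSON_ARRAY",
        "wait_timeout": "30s",
    },
)
resp.raise_for_status()
result = resp.json()

# Each chunk points at a pre-signed URL containing a JSON array of rows.
for chunk in result.get("result", {}).get("external_links", []):
    rows = requests.get(chunk["external_link"]).json()
    print(len(rows), "rows in this chunk")
```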

Partner Connect
