What’s new in Databricks for December 2022

Platform 

  • The workspace administrator setting to disable the upload data UI now also applies to the new upload data UI
  • Memory profiling is now available for PySpark UDFs (Runtime 12)
  • Jobs are now available in global search
  • Databricks Runtime (DBR) 12.0 and 12.0 ML are GA
  • Databricks Terraform provider updated to version 1.7.0
  • Databricks ODBC and JDBC drivers were updated to a newer version

Governance

  • Capturing lineage data with Unity Catalog is now GA
  • You can now use SQL to specify schema-level and catalog-level storage locations for Unity Catalog (UC) managed tables

Delta Lake

  • You can now selectively overwrite data matching an arbitrary expression in a Delta table using the following pattern (Runtime 12):

INSERT INTO table_name REPLACE WHERE predicate append_relation
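
To make the pattern above concrete, here is a minimal sketch that assembles such a statement in Python. The table names (sales, sales_december_corrected), the date predicate, and the helper replace_where_sql are hypothetical and exist only for illustration; actually running the statement assumes a Spark session on Runtime 12 or later.

```python
def replace_where_sql(table: str, predicate: str, source_query: str) -> str:
    """Assemble an INSERT INTO ... REPLACE WHERE statement.

    Hypothetical helper for illustration only; it just formats a string.
    """
    return f"INSERT INTO {table} REPLACE WHERE {predicate} {source_query}"


# Hypothetical example: overwrite only the December 2022 rows of a table
# with rows taken from a corrected staging table.
stmt = replace_where_sql(
    table="sales",
    predicate="order_date >= '2022-12-01' AND order_date < '2023-01-01'",
    source_query="SELECT * FROM sales_december_corrected",
)
print(stmt)

# In a Databricks notebook (assumption: a `spark` session is available),
# the statement could then be executed with: spark.sql(stmt)
```

Rows that match the predicate are replaced by the output of the query, and rows outside the predicate are left untouched.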

  • DELETE and UPDATE now use dynamic file and partition pruning in cases where it improves performance (Runtime 12)
  • Users can now audit how many rows are deleted when running data manipulation language (DML) statements with partition predicates (Runtime 12)
  • In Delta Live Tables, you can now import files from a Databricks Repo as Python modules. You can import files from the current repo path or from a specified repo path using sys.path.append().
  • You can now specify watermarks using the Delta Live Tables SQL interface and in SQL queries against streaming DataFrames.
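
The sys.path.append() approach mentioned above for importing repo files as Python modules can be sketched as follows. This is a minimal illustration, not Databricks-specific code: a temporary directory stands in for the repo path, and the module name repo_helpers_demo and its add_one function are hypothetical.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# A temporary directory stands in for a repo directory here. In a workspace
# this would be a real repo path (hypothetical example:
# "/Workspace/Repos/<user>/<repo>/helpers").
repo_dir = Path(tempfile.mkdtemp())
(repo_dir / "repo_helpers_demo.py").write_text(
    "def add_one(x):\n"
    "    return x + 1\n"
)

# Make the directory importable, avoiding duplicate entries on re-runs.
if str(repo_dir) not in sys.path:
    sys.path.append(str(repo_dir))

# Ensure the import system notices the newly written file, then import it.
importlib.invalidate_caches()
helpers = importlib.import_module("repo_helpers_demo")

print(helpers.add_one(41))  # prints 42
```

Appending the path rather than inserting it at the front keeps already-installed packages ahead of repo files in the lookup order, so a repo module cannot accidentally shadow a library with the same name.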

Workflows

  • Enhanced notifications for your Databricks jobs using webhooks or native Slack notifications.

Databricks SQL

  • Serverless SQL warehouses now support workspaces that add customer-managed keys (CMK) for managed services and for the workspace root S3 bucket.
  • Databricks SQL Driver for Go is GA. For more information, see the changelog: https://github.com/databricks/databricks-sql-go/blob/main/CHANGELOG.md
  • Dynamic pruning for MERGE INTO.
  • Improved conflict detection in Delta with dynamic file pruning.
  • CONVERT TO DELTA partition detection improvements.
  • Table schemas now support default values for columns.
  • Serverless only: You can now use 28 new built-in H3 expressions for geospatial processing in serverless SQL warehouses.

Partner Connect

  • Partner Connect now supports connecting to AtScale
