What’s new in Databricks for February 2023

Platform 

  • Serverless Real-Time Inference exposes your MLflow ML model as a REST API endpoint.
  • The Databricks Terraform provider has been updated to version 1.10.1.
  • With the variable explorer in Databricks notebooks, you can inspect current Python variables and their values directly in the notebook UI (requires Databricks Runtime 12.x).
  • The Databricks extension for Visual Studio Code lets developers use the authoring capabilities of their IDE while connecting to Databricks clusters to run code remotely.
  • Databricks Runtime 12.2 is now available as a Beta release.
  • Starting February 21, 2023, legacy global init scripts and cluster-named init scripts are deprecated and cannot be used in new workspaces.
  • The limit on notebook cell output results has increased: the lesser of 10,000 rows or 2 MB is now displayed.
  • With Databricks Runtime 12.0 and above, you can create a Ray cluster and run Ray applications in Databricks.
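
To make the "lesser of 10,000 rows or 2 MB" rule concrete, here is a minimal Python sketch of how such a cap could behave. It is an illustration only, not Databricks' implementation: the function name `truncate_output`, the reading of 2 MB as 2 MiB, and the way a row's size is measured (UTF-8 length of its JSON form) are all assumptions.

```python
import json

# Limits taken from the bullet above. How Databricks actually counts bytes
# is not stated in the source, so the accounting below is an assumption.
MAX_ROWS = 10_000
MAX_BYTES = 2 * 1024 * 1024  # assumption: "2 MB" read as 2 MiB


def truncate_output(rows, max_rows=MAX_ROWS, max_bytes=MAX_BYTES):
    """Return the leading rows that fit within both limits."""
    shown = []
    used_bytes = 0
    for row in rows:
        if len(shown) >= max_rows:
            break  # row limit reached first
        # Hypothetical size measure: UTF-8 length of the row serialized as JSON.
        row_bytes = len(json.dumps(row).encode("utf-8"))
        if used_bytes + row_bytes > max_bytes:
            break  # byte limit reached first
        shown.append(row)
        used_bytes += row_bytes
    return shown


# Many narrow rows: the row limit is the binding one.
small_rows = [{"id": i} for i in range(25_000)]
# Fewer but wide rows: the byte limit is the binding one.
wide_rows = [{"id": i, "payload": "x" * 1024} for i in range(5_000)]

print(len(truncate_output(small_rows)))
print(len(truncate_output(wide_rows)))
```

Whichever limit is hit first decides how much of the result is shown, which is why a wide table can be cut off well before 10,000 rows.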

Governance

  • You can authenticate to Power BI and Tableau using OAuth.
  • Audit logs include entries for OAuth SSO authentication to the account console.
  • Account SCIM is now GA. It lets an identity provider create users in a Databricks account, give users the proper level of access and remove access when they leave the organization.

Workflows

  • Databricks Jobs now supports continuous jobs.
  • You can use a file arrival trigger to run your Databricks job when new files arrive in an external location such as Amazon S3 or Azure storage.
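
As a rough illustration of the two job features above, the sketch below builds job settings as Python dictionaries. The field names (`continuous`, `trigger`, `file_arrival`, `pause_status`, `min_time_between_triggers_seconds`) are assumed to follow the Jobs API 2.1 request format, and the job names, notebook path, cluster ID, and bucket URL are hypothetical placeholders. Check the Jobs API reference for your workspace before relying on any of them.

```python
import json

# Hypothetical task definition; replace the path and cluster ID with your own.
base_task = {
    "task_key": "ingest",
    "notebook_task": {"notebook_path": "/Repos/example/ingest"},
    "existing_cluster_id": "0000-000000-example",
}

# Assumed shape of a continuous job: the job is kept running without a schedule.
continuous_job = {
    "name": "example-continuous-job",
    "tasks": [base_task],
    "continuous": {"pause_status": "UNPAUSED"},
}

# Assumed shape of a file arrival trigger on an external location.
file_arrival_job = {
    "name": "example-file-arrival-job",
    "tasks": [base_task],
    "trigger": {
        "pause_status": "UNPAUSED",
        "file_arrival": {
            "url": "s3://example-bucket/landing/",  # hypothetical location
            "min_time_between_triggers_seconds": 60,
        },
    },
}

# Print the payloads as JSON, the form a REST client or the CLI would send.
for settings in (continuous_job, file_arrival_job):
    print(json.dumps(settings, indent=2))
```

The two settings are shown as separate jobs because a job is either kept running continuously or started by a trigger.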

Databricks SQL

  • The editor now supports DESCRIBE DETAIL.
  • Improved schema browser loading speed.
  • For a SELECT * query, you can now view the list of possible columns in the side panel.
  • You can now selectively overwrite data matching an arbitrary expression in a Delta table using REPLACE WHERE.
  • DELETE and UPDATE now use dynamic file and partition pruning when doing so improves performance.
  • The UNPIVOT clause is now supported by Databricks SQL. Use the UNPIVOT clause to rotate columns of a table-valued expression into column values. 
  • You can now use the TIMESTAMP AS OF syntax in SELECT statements to specify the version of a Delta Sharing table that’s mounted in a catalog. This requires the table to be shared WITH HISTORY.
  • MERGE INTO now supports WHEN NOT MATCHED BY SOURCE.
  • Faster statistics collection for CONVERT TO DELTA.
  • Users can now audit how many rows are deleted when running data manipulation language (DML) operations such as DELETE, TRUNCATE, and replaceWhere with partitioned predicates.
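
The SQL additions above are easier to picture with examples. The sketch below keeps a few statements as Python strings, ready to be passed to something like `spark.sql` in a notebook, and adds a small pure-Python emulation of what UNPIVOT does to rows. All table and column names are hypothetical, and the `unpivot` helper is an illustration of the semantics, not how Databricks SQL implements the clause.

```python
# Hypothetical table and column names throughout.
STATEMENTS = {
    # Overwrite only the rows matching the predicate.
    "replace_where": """
        INSERT INTO events
        REPLACE WHERE event_date >= '2023-02-01'
        SELECT * FROM staged_events WHERE event_date >= '2023-02-01'
    """,
    # Rotate the quarter columns into (quarter, amount) pairs.
    "unpivot": """
        SELECT * FROM sales
        UNPIVOT (amount FOR quarter IN (q1, q2, q3, q4))
    """,
    # Delete target rows that no longer exist in the source.
    "merge_not_matched_by_source": """
        MERGE INTO customers AS t
        USING customer_updates AS s
        ON t.id = s.id
        WHEN MATCHED THEN UPDATE SET *
        WHEN NOT MATCHED THEN INSERT *
        WHEN NOT MATCHED BY SOURCE THEN DELETE
    """,
    # Query a shared table as of a point in time.
    "timestamp_as_of": """
        SELECT * FROM shared_catalog.retail.orders
        TIMESTAMP AS OF '2023-02-01'
    """,
}


def unpivot(rows, value_column, name_column, columns):
    """Pure-Python emulation of UNPIVOT over a list of dict rows."""
    result = []
    for row in rows:
        # Columns that are not unpivoted are carried over unchanged.
        kept = {k: v for k, v in row.items() if k not in columns}
        for col in columns:
            if row[col] is None:
                continue  # UNPIVOT excludes NULL values by default
            result.append({**kept, name_column: col, value_column: row[col]})
    return result


sales = [{"region": "east", "q1": 10, "q2": None, "q3": 7, "q4": 12}]
for out_row in unpivot(sales, "amount", "quarter", ["q1", "q2", "q3", "q4"]):
    print(out_row)
```

In a notebook, each statement could be run with `spark.sql(STATEMENTS["unpivot"])` once the placeholder names are replaced with real tables.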

Machine learning

  • Serverless Real-Time Inference exposes your ML models as REST API endpoints. This functionality uses serverless compute.
  • You can use existing feature tables in Feature Store to augment the original input dataset for AutoML forecasting problems.
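
Since a served model is reached over REST, a client only needs to send an authenticated POST with the input rows. The following is a minimal sketch using the Python standard library. The URL path (`/serving-endpoints/<name>/invocations`), the `dataframe_records` payload key, the workspace URL, the endpoint name, and the token are all assumptions or placeholders, so confirm the exact URL and input format on your endpoint's page in the workspace.

```python
import json
import urllib.request


def build_scoring_request(workspace_url, endpoint_name, token, records):
    """Build (but do not send) a scoring request for a served model."""
    # Assumed URL layout; your workspace may expose a different path.
    url = f"{workspace_url.rstrip('/')}/serving-endpoints/{endpoint_name}/invocations"
    # Assumed payload shape: a list of records, one dict per input row.
    body = json.dumps({"dataframe_records": records}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


# Hypothetical workspace, endpoint name, token, and feature columns.
request = build_scoring_request(
    "https://example.cloud.databricks.com",
    "churn-model",
    "dapi-EXAMPLE-TOKEN",
    [{"tenure_months": 12, "monthly_charges": 70.5}],
)
print(request.get_method(), request.full_url)
# To send it, call urllib.request.urlopen(request) with a real workspace and token.
```

The request is only built here, not sent, so the snippet can be run anywhere without credentials.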
