Databend Open Source Weekly Issue 106

Databend is a modern cloud data warehouse. Designed for flexibility and efficiency, it supports your large-scale analytics needs. It is free and open source. Try the cloud service right away: https://app.databend.cn.

What's On In Databend

Explore what's new in Databend this week and get to know Databend a little better.

Data Masking

Databend has added support for data masking. By setting a masking policy, you can control how sensitive data is displayed or accessed, protecting confidentiality while still allowing authorized users to work with the data.

-- Create a masking policy
CREATE MASKING POLICY email_mask
AS
  (val string)
  RETURNS string ->
  CASE
  WHEN current_role() IN ('MANAGERS') THEN
    val
  ELSE
    '*********'
  END
  COMMENT = 'hide_email';

-- Associate the masking policy with the 'email' column
ALTER TABLE user_info MODIFY COLUMN email SET MASKING POLICY email_mask;
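
To make the effect of the policy concrete, the CASE expression above can be modelled outside the database. The Rust sketch below only illustrates the decision logic and is not Databend's implementation; the function name apply_email_mask and passing the role in as a plain string are assumptions made for this example.

```rust
/// Hypothetical model of the `email_mask` policy (not Databend code):
/// returns the value unchanged for the MANAGERS role and a fixed mask
/// for every other role.
fn apply_email_mask(current_role: &str, val: &str) -> String {
    // Mirrors: CASE WHEN current_role() IN ('MANAGERS') THEN val ELSE '*********' END
    const ALLOWED_ROLES: [&str; 1] = ["MANAGERS"];
    if ALLOWED_ROLES.iter().any(|role| *role == current_role) {
        val.to_string()
    } else {
        "*********".to_string()
    }
}
```

Under this model, a caller with the MANAGERS role sees the stored email address, while any other role sees only the nine-asterisk mask.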

Data masking requires the Enterprise Edition. For upgrade information, please contact the Databend team.

If you would like to learn more, please review the resources listed below.

Code Corner

Let's explore code snippets or projects in Databend and the surrounding ecosystem.

show() Method Support for the Python Binding

The Python bindings/packages of PySpark, DuckDB, and DataFusion all provide a show() method that prints the first n rows of a result.

Databend recently implemented the same method for its Python binding through PyO3. The code snippet is as follows.

    #[pyo3(signature = (num=20))]
    fn show(&self, py: Python, num: usize) -> PyResult<()> {
        let blocks = self.collect(py)?;
        let bs = self.get_box();
        let result = blocks.box_render(num, bs.bs_max_width, bs.bs_max_width);

        // Note that println! does not print to the Python debug console and is not visible in notebooks for instance
        let print = py.import("builtins")?.getattr("print")?;
        print.call1((result,))?;
        Ok(())
    }
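
The core idea behind show(), keeping only the first num rows before rendering, can be sketched independently of PyO3. The helper below is a simplified illustration: render_first_rows is a hypothetical name, rows are modelled as vectors of strings, and the plain " | " separator stands in for the box rendering that Databend actually uses.

```rust
/// Hypothetical helper (not part of Databend): render at most `num` rows
/// as plain text, one row per line, with cells separated by " | ".
fn render_first_rows(rows: &[Vec<String>], num: usize) -> String {
    rows.iter()
        .take(num) // keep only the first `num` rows
        .map(|row| row.join(" | "))
        .collect::<Vec<_>>()
        .join("\n")
}
```

When num is larger than the number of available rows, every row is rendered, which matches the intuition behind a default such as num=20.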

If you would like to learn more, please review the resources listed below.

Highlights

Here are some notable events that you might find interesting.

  • Distributed REPLACE INTO is now supported.
  • The <-> operator for computing the Euclidean (2-norm) distance between vectors is now supported.
  • Added geolocation functions: h3_to_center_child, h3_exact_edge_length_m, h3_exact_edge_length_km, h3_exact_edge_length_rads, h3_num_hexagons, h3_line, h3_distance, h3_hex_ring, and h3_get_unidirectional_edge.
  • Read Docs | ALTER TABLE COLUMN to learn how to modify a table by adding, converting, renaming, changing, or dropping columns.
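
For readers unfamiliar with the Euclidean norm mentioned in the list above, here is a minimal Rust sketch of the calculation. It assumes that <-> follows the common convention of returning the distance between two vectors, that is, the 2-norm of their difference. The function name l2_distance and the choice to return None for mismatched lengths are assumptions of this example, not Databend behavior.

```rust
/// Euclidean (L2) distance between two vectors of equal length:
/// the square root of the sum of squared element-wise differences.
/// Returns None when the lengths differ (a choice made for this sketch).
fn l2_distance(a: &[f32], b: &[f32]) -> Option<f32> {
    if a.len() != b.len() {
        return None;
    }
    let sum_of_squares: f32 = a
        .iter()
        .zip(b)
        .map(|(x, y)| (x - y) * (x - y))
        .sum();
    Some(sum_of_squares.sqrt())
}
```

For example, the vectors [1, 2, 3] and [4, 6, 3] differ by (3, 4, 0), so the distance is the square root of 9 + 16 + 0, which is 5.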

What's Up Next

We are always open to cutting-edge technologies and innovative ideas, and you are welcome to join the community and breathe life into Databend.

Add storage backend support for Hive Catalog

Previously, Databend's Hive Catalog implementation had no storage backend configuration of its own and could only fall back to the storage backend of the Default Catalog. As a result, data could not be read when the storage service that the Hive MetaStore points to differed from the Default Catalog configuration.

The plan is to introduce a CONNECTION option for the Hive Catalog so that its storage backend can be configured separately, which addresses the problem of accelerating Hive queries over heterogeneous storage.

CREATE CATALOG hive_ctl
TYPE=HIVE
HMS_ADDRESS='127.0.0.1:9083'
CONNECTION=(
    URL='s3://warehouse/'
    AWS_KEY_ID='admin'
    AWS_SECRET_KEY='password'
    ENDPOINT_URL='http://localhost:9000'
);
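
The intended lookup order, using the catalog's own connection when one is configured and otherwise falling back to the Default Catalog, can be sketched as follows. This is only an illustration of the described behavior under stated assumptions: StorageConfig and resolve_storage are hypothetical names, and the fields merely echo two of the CONNECTION options shown above.

```rust
/// Hypothetical storage settings, named after two of the CONNECTION options.
#[derive(Debug, Clone, PartialEq)]
struct StorageConfig {
    url: String,
    endpoint_url: String,
}

/// Pick the catalog's own storage config when one was given,
/// otherwise fall back to the Default Catalog's config.
fn resolve_storage<'a>(
    catalog: Option<&'a StorageConfig>,
    default: &'a StorageConfig,
) -> &'a StorageConfig {
    catalog.unwrap_or(default)
}
```

Keeping the fallback means that existing Hive catalogs created without a CONNECTION clause would behave as before in this model.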

Issue #12407 | Feature: Add storage support for Hive catalog

If you are interested in this topic, you can try to tackle part of it or take part in the discussion and PR reviews. Alternatively, click https://link.databend.rs/im-feeling-lucky to pick a random issue. Good luck!

New Contributors

Meet the new members of the community. Databend is even better because of you.

Changelog

Head over to the changelog for Databend's daily builds to stay up to date on developments.

Address: https://github.com/datafuselabs/databend/releases

Contributors

Many thanks to the contributors for their excellent work this week.

Connect With Us

Databend is an open source, flexible, low-cost, modern data warehouse built on object storage that also supports real-time analytics. We look forward to your attention. Let's explore cloud-native data warehouse solutions together and build a new generation of open source Data Cloud.


Origin my.oschina.net/u/5489811/blog/10096065