Databend is a modern cloud data warehouse. Designed for flexibility and efficiency, it supports your large-scale analytics needs. It is free and open source. Try the cloud service now: https://app.databend.cn
What's On In Databend
Explore what's new in Databend this week and discover a Databend that better fits your needs.
Understanding Connection Parameters
Connection parameters are the authentication and configuration details required to connect to an external storage service supported by Databend, such as Amazon S3. They are enclosed in parentheses and consist of key-value pairs separated by commas or spaces. They are used when creating a stage, running COPY INTO, and querying external files.
The following SQL statement shows how to use connection parameters to create a Stage with S3 as the underlying storage.
CREATE STAGE my_s3_stage
URL = 's3://load/files/'
CONNECTION = (
ACCESS_KEY_ID = '<your-access-key-id>',
SECRET_ACCESS_KEY = '<your-secret-access-key>'
);
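The same connection parameters can also be passed directly to COPY INTO when loading from an external location. The following is a minimal sketch rather than a definitive recipe: the target table my_table is hypothetical, and the CSV file format is an assumption.

```sql
-- Sketch only: my_table is a hypothetical target table, and the CSV file
-- format is an assumption; adjust both to match your data.
COPY INTO my_table
FROM 's3://load/files/'
CONNECTION = (
    ACCESS_KEY_ID = '<your-access-key-id>',
    SECRET_ACCESS_KEY = '<your-secret-access-key>'
)
FILE_FORMAT = (TYPE = CSV);
```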
Hive Catalog supports configuring storage parameters
Over the past week, Databend added storage parameter options for Hive catalogs, so a Hive catalog can be configured with its own storage service instead of relying on the default catalog's storage backend.
The following example shows how to create a Hive Catalog with MinIO as the underlying storage service:
CREATE CATALOG hive_ctl
TYPE = HIVE
CONNECTION =(
ADDRESS = '127.0.0.1:9083'
URL = 's3://warehouse/'
AWS_KEY_ID = 'admin'
AWS_SECRET_KEY = 'password'
ENDPOINT_URL = 'http://localhost:9000/'
)
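Once the catalog exists, its tables can be addressed by a catalog-qualified name. The query below is a sketch under stated assumptions: my_db and my_table are hypothetical names standing in for a database and table already registered in the Hive metastore.

```sql
-- Sketch only: my_db and my_table are hypothetical names for a database
-- and table that already exist in the Hive metastore.
SELECT * FROM hive_ctl.my_db.my_table LIMIT 10;
```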
If you would like to learn more, please review the resources listed below.
- Issue #12407 | Feature: Add storage support for Hive catalog
- PR #12469 | feat: Add storage params in hive catalog
Code Corner
Let's explore code snippets or projects in Databend and the surrounding ecosystem.
Speed up Git dependency downloads with gitoxide
gitoxide is a high-performance, modern Git implementation written in Rust. With cargo's unstable gitoxide feature, the gitoxide crate replaces git2 for Git operations, which can make downloading crates-index and Git dependencies several times faster.
Databend recently enabled this feature in CI for cargo {build | clippy | test}. You can also try adding the -Zgitoxide option when developing locally to speed up the build process:
cargo -Zgitoxide=fetch,shallow-index,shallow-deps build
Highlights
Here are some notable events that you might find interesting.
- The VALUES clause can now be used on its own, without being combined with SELECT.
- Support for modifying default values when changing columns.
- Add virtual column support for tables in Parquet format.
- Support for automatic reclustering of tables after write operations (COPY INTO and REPLACE INTO).
What's Up Next
We are always open to cutting-edge technologies and innovative ideas, and you are welcome to join the community and breathe life into Databend.
Enhance infer_schema to support file paths
Currently, Databend can query files referenced by a file path as well as files located in a stage, for example:
select * from 'fs:///home/...';
select * from 's3://bucket/...';
select * from @stage;
However, infer_schema currently only supports files located in a stage:
select * from infer_schema(location=>'@stage/...');
Attempting to infer the schema of a file at any other location results in an error:
select * from infer_schema(location =>'fs:///home/...'); -- this will panic.
We hope to unify the behavior of the infer_schema function so that it can infer the schema of files in any location, making it more usable.
Issue #12458 | Feature: infer_schema support normal file path
If you are interested in this topic, you can try to solve some of the open issues or take part in discussions and PR reviews. Alternatively, you can click on https://link.databend.rs/im-feeling-lucky to pick a random issue. Good luck!
Changelog
Head over to the changelog for Databend's daily builds to stay up to date on developments.
Address: https://github.com/datafuselabs/databend/releases
Contributors
Many thanks to the contributors for their excellent work this week.
Connect With Us
Databend is an open source, flexible, low-cost data warehouse that can also perform real-time analytics on object storage. We look forward to hearing from you. Let's explore cloud-native data warehouse solutions together and build a new generation of open source Data Cloud.