Get ready to be blown away! The highly anticipated Microsoft Build 2023 has
finally unveiled its latest creation: Microsoft Fabric, a Data Intelligence
platform that promises to reshape how organizations work with data.
One of the most exciting things I found in Fabric is OneLake. I was amazed to
discover that OneLake makes a data lake as simple to use as
OneDrive! It's a single, unified, logical SaaS data lake for the whole
organization (no data silos). Over the past couple of months, I've had the
incredible opportunity to engage with the product team and dive into the
private preview of Microsoft Fabric.
I got OneLake installed on my PC and can easily
access the data in OneLake, just as I would in OneDrive, as shown in Fig 2:
Single unified Managed and Governed SaaS Data Lake
All Fabric items keep their data in OneLake, so there are no data silos.
OneLake is fully compatible with Azure Data Lake Storage Gen2 at the API layer,
which means it can be accessed as if it were ADLS Gen2.
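Because OneLake speaks the ADLS Gen2 API, tools that understand ADLS-style paths can address OneLake data by URI. The following is a minimal sketch of how such paths might be assembled. The host name, the `<item>.<item type>` folder convention, and the `Tables` folder are assumptions based on my understanding of the OneLake documentation, and the workspace and item names are purely hypothetical, so verify them against the current docs before relying on them.

```python
from urllib.parse import quote

# Assumed OneLake endpoint host; confirm against the current Fabric documentation.
ONELAKE_HOST = "onelake.dfs.fabric.microsoft.com"


def _clean(path: str) -> str:
    """Normalize a relative path by dropping empty segments and stray slashes."""
    return "/".join(part for part in path.split("/") if part)


def onelake_abfss_uri(workspace: str, item: str, item_type: str, path: str = "") -> str:
    """Build an ADLS Gen2 style abfss:// URI for a OneLake item (assumed layout)."""
    base = f"abfss://{workspace}@{ONELAKE_HOST}/{item}.{item_type}"
    rel = _clean(path)
    return f"{base}/{rel}" if rel else base


def onelake_https_url(workspace: str, item: str, item_type: str, path: str = "") -> str:
    """Build the equivalent https:// URL, percent-encoding each path segment."""
    base = f"https://{ONELAKE_HOST}/{quote(workspace)}/{quote(item)}.{quote(item_type)}"
    rel = quote(_clean(path))
    return f"{base}/{rel}" if rel else base
```

For example, `onelake_abfss_uri("Sales", "Orders", "Lakehouse", "Tables/orders")` produces a path that an ADLS Gen2 aware engine could be pointed at, assuming the caller is authenticated and has access to that workspace.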
Let's
investigate some of the benefits of OneLake:
- OneLake comes automatically
provisioned with every Microsoft Fabric tenant with no infrastructure to
manage.
- Any data in OneLake works with out-of-the-box governance such as data lineage, data protection, certification, catalog integration, etc. Please note that this feature is not part of the public preview.
- OneLake enables distributed ownership. Different
workspaces allow different parts of the organization to work independently
while still contributing to the same data lake.
- Each workspace can have its own administrator, access
control, region, and capacity for billing.
If you're concerned about how to manage data across multiple countries while
meeting local data residency requirements, fear not: OneLake has got you
covered! Because OneLake spans the globe, you can create different
workspaces in different regions, so that any data stored in those
workspaces also resides in its respective region. Built on top of
Azure Data Lake Storage Gen2, OneLake can use multiple storage accounts
across different regions while virtualizing them into one seamless, logical
lake. So go ahead and take the plunge: OneLake is ready to help you navigate
the global data landscape with confidence!
Data Mesh as a Service:
OneLake gives a true data mesh as a service. Business
groups can now operate autonomously within a shared data lake, eliminating the
need to manage separate storage resources. The implementation of the data mesh
pattern has become more streamlined. OneLake enhances this further by
introducing domains as a core concept. A business domain can have multiple
workspaces, which typically align with specific projects or teams.
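To make the domain and workspace relationship concrete, here is a small illustrative data model. It is only a sketch of the concepts described above, not Fabric's actual object model or API: the class names, fields, and the `workspaces_by_region` helper are all hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Workspace:
    """Hypothetical model: each workspace has its own admin, region, and capacity."""
    name: str
    admin: str
    region: str
    capacity: str


@dataclass
class Domain:
    """Hypothetical model: a business domain groups several workspaces."""
    name: str
    workspaces: list[Workspace] = field(default_factory=list)


def workspaces_by_region(domains: list[Domain]) -> dict[str, list[str]]:
    """Group 'domain/workspace' labels by region, e.g. for a data residency review."""
    grouped: dict[str, list[str]] = defaultdict(list)
    for domain in domains:
        for ws in domain.workspaces:
            grouped[ws.region].append(f"{domain.name}/{ws.name}")
    return dict(grouped)
```

Grouping by region like this is one simple way to reason about where each team's data lives while everything still belongs to the same logical lake.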
Open Data Format
Simply put, no matter which item you start
with, each one stores its data in OneLake, similar to how Word, Excel, and
PowerPoint save documents in OneDrive.
You will see files and folders just like you
would in a data lake today. Every workspace is a folder, and every data
item is a folder within it. Any tabular data is stored in Delta Lake format.
There are no new proprietary file formats for Microsoft Fabric, because
proprietary formats create data silos. Even the data warehouse natively stores
its data in Delta Parquet format. While Microsoft Fabric data items standardize
on Delta Parquet for tabular data, OneLake is still a data lake built on top of
ADLS Gen2, so it supports any file type, structured or unstructured.
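Since tables are just folders in an open format, you can inspect them with ordinary file tools. A Delta Lake table is a directory of Parquet data files plus a `_delta_log` subfolder holding JSON commit files. The sketch below checks a local folder (for example, one synced to your PC) for that layout. The function names are my own, and it only looks at file names rather than validating the table contents.

```python
from pathlib import Path


def looks_like_delta_table(folder) -> bool:
    """Return True if the folder has a _delta_log directory with JSON commit files."""
    log_dir = Path(folder) / "_delta_log"
    return log_dir.is_dir() and any(log_dir.glob("*.json"))


def list_data_files(folder) -> list[str]:
    """List Parquet data files under the folder, skipping the transaction log."""
    root = Path(folder)
    return sorted(
        str(p.relative_to(root))
        for p in root.rglob("*.parquet")
        if "_delta_log" not in p.relative_to(root).parts
    )
```

To actually read the rows you would use a Delta-aware reader such as Spark; this sketch only illustrates that nothing about the on-disk layout is proprietary.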
Shortcuts/Data Virtualization
Shortcuts virtualize data across domains and
clouds. A shortcut is nothing more than a symbolic link that points from one
data location to another. Just as with shortcuts in Windows or symbolic links
in Linux, the data appears in the shortcut location as if it were physically
there.
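The symbolic link analogy can be tried out on a local file system. The sketch below is only an analogy using operating system symlinks; it is not how Fabric shortcuts are created, and the folder names are hypothetical. It places data in an "external" folder, links to it from a OneLake-like folder tree, and reads the file through the link without copying it.

```python
import os
import tempfile
from pathlib import Path


def create_shortcut(target: Path, shortcut: Path) -> None:
    """Create a directory symlink at `shortcut` that points to `target`."""
    shortcut.parent.mkdir(parents=True, exist_ok=True)
    # Note: on Windows, creating symlinks may require elevated privileges.
    os.symlink(target, shortcut, target_is_directory=True)


def demo() -> tuple[bool, str]:
    """Read a file through a symlink; return (is_symlink, file contents)."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        # Hypothetical external lake that keeps owning the data.
        external = root / "external_lake" / "sales"
        external.mkdir(parents=True)
        (external / "2023.csv").write_text("id,amount\n1,100\n")

        # Hypothetical OneLake-like folder layout containing the shortcut.
        shortcut = root / "onelake" / "workspace" / "lakehouse" / "Files" / "sales"
        create_shortcut(external, shortcut)

        # The data is readable at the shortcut path, yet was never copied.
        return shortcut.is_symlink(), (shortcut / "2023.csv").read_text()
```

The point of the analogy is that only one physical copy of the data exists, and the shortcut location simply resolves to it.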
As shown in Fig 4 above, if you have existing data lakes stored in ADLS Gen2 or in Amazon S3 buckets, those lakes can continue to exist and be managed externally while shortcuts surface their data in OneLake in Microsoft Fabric.
Shortcuts help to avoid data movement and
duplication. It’s easy to create shortcuts from Microsoft Fabric, as shown in Figure
6:
OneLake Security
In addition, Power BI
reports will continue to work against data in OneLake, because the Analysis
Services engine can still leverage the security defined in the T-SQL engine
through DirectQuery mode, and can sometimes still optimize to Direct Lake mode,
depending on the security defined.
In summary, OneLake is a major
advancement in the data and analytics industry, surpassing my initial
expectations. It turns data lakes into user-friendly, OneDrive-like
folders, which makes them far more convenient to work with. The Delta format
is a strong choice for Data Engineering workloads. With OneLake, Data Engineers,
Data Scientists, BI professionals, and business stakeholders can collaborate
more effectively. To find out more about Microsoft Fabric and
OneLake, please visit the Microsoft Fabric documentation.