This section gives an overview of the core concepts behind Ververica Cloud. A visual representation of the basic hierarchy of these concepts is included below.
A Workspace is a logical environment in which real-time processing applications can be created, configured, deployed, and managed. Each Workspace has isolated computing resources and an independent development console. It provides a secure and dedicated container to run real-time applications, ensuring smooth and optimal performance.
To learn more about specific Workspace features and capabilities, read the Operations Guide and Development Guide in the left sidebar of the documentation website.
A Deployment is the core resource abstraction used in Ververica Cloud to manage Apache Flink jobs. It specifies the desired state and configuration of a Flink job. Ververica Cloud tracks and reports the status of each Deployment and derives other resources from it. Whenever the Deployment specification is modified, Ververica Cloud ensures that the corresponding Flink job eventually reflects this change.
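The desired-state model described above can be illustrated with a minimal sketch. This is not Ververica Cloud's implementation or API: the `DeploymentSpec` and `JobStatus` classes, their fields, the state names, and the `reconcile` function are all hypothetical, chosen only to show how a controller might compare a specification with the observed job and decide what to do.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class DeploymentSpec:
    # Hypothetical fields; a real Deployment specification has many more settings.
    desired_state: str  # assumed values: "RUNNING" or "CANCELLED"
    parallelism: int


@dataclass
class JobStatus:
    # Hypothetical view of the Flink job as currently observed.
    state: str
    parallelism: int


def reconcile(spec: DeploymentSpec, job: JobStatus) -> List[str]:
    """Return the actions needed to move the observed job toward the spec."""
    actions: List[str] = []
    if spec.desired_state == "CANCELLED":
        # The spec asks for no running job; cancel it if it is still active.
        if job.state != "CANCELLED":
            actions.append("cancel job")
        return actions
    if job.state != "RUNNING":
        # The spec asks for a running job, but none is running yet.
        actions.append(f"start job with parallelism {spec.parallelism}")
    elif job.parallelism != spec.parallelism:
        # The job is running with an outdated configuration.
        actions.append(f"restart job with parallelism {spec.parallelism}")
    return actions
```

Calling `reconcile` repeatedly until it returns an empty list mirrors the idea that the Flink job eventually reflects every change to the Deployment specification.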
You can create a Deployment using SQL, JAR, or Python. To learn more, read our Deployment Guide.
In the Ververica Cloud SQL Editor, the scripts that users create are called “Drafts”. This term applies only to SQL development, not to JAR or Python deployments.
Ververica Cloud includes a set of built-in connectors for reading, writing, and synchronizing data between upstream and downstream systems. It also supports custom connectors for systems that the built-in connectors do not cover.
Ververica Cloud provides a wide range of built-in functions and also lets you create and use custom functions.
In Ververica Cloud, “Artifacts” are the binary files generated when an application is built. They contain the compiled code and dependencies needed to run the application and are usually packaged as JAR files or Python egg files.
Metadata management is a key part of data processing. Ververica Cloud provides a metadata management system that exposes details about databases, tables, fields, and partitions, along with related information stored in databases or external systems.
In Ververica Cloud, a “Session Cluster” is a type of Flink cluster that provides a persistent, shared runtime environment for running multiple Flink applications. Unlike a traditional “Job Cluster”, which executes a single Flink job and is terminated once the job is complete, a Session Cluster remains active after a job completes, so multiple jobs can run within the same shared environment.
Session Clusters are better suited to development and testing than to production, because jobs that share a cluster can have conflicting requirements and dependencies. They are also harder to manage and monitor than Job Clusters, which are the better choice for production environments.
The basic unit of measurement in Ververica Cloud is the Compute Unit (CU). It represents the computing resources allocated to a real-time processing application. Typically, one CU consists of 1 CPU core and 4 GiB of memory.
The CU usage of a real-time processing application depends on several factors, including the queries per second (QPS) of the incoming data stream, the complexity of the computation, and the distribution of the input data. To estimate the resources a task requires, evaluate the scale of the business and the computing power needed for real-time processing.
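As a rough illustration of the CU definition above, the following sketch converts a CPU and memory requirement into a whole number of CUs. The sizing rule is an assumption made for this example (take whichever resource needs more CUs, then round up), and `estimate_cus` is a hypothetical helper, not a Ververica Cloud API; actual allocation and billing may differ.

```python
import math

CU_CPU_CORES = 1.0   # from the text: one CU typically includes 1 CPU core
CU_MEMORY_GIB = 4.0  # from the text: one CU typically includes 4 GiB of memory


def estimate_cus(cpu_cores: float, memory_gib: float) -> int:
    """Return the smallest whole number of CUs that covers both requirements.

    Assumed rule: the more demanding resource determines the CU count.
    """
    if cpu_cores < 0 or memory_gib < 0:
        raise ValueError("resource requirements must be non-negative")
    by_cpu = cpu_cores / CU_CPU_CORES
    by_memory = memory_gib / CU_MEMORY_GIB
    return math.ceil(max(by_cpu, by_memory))
```

For example, under this assumed rule a job that needs 1 CPU core and 10 GiB of memory is memory-bound: 10 / 4 = 2.5, which rounds up to 3 CUs.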
Ververica Cloud runs on Ververica Runtime (VVR), an enterprise-class engine designed to provide a scalable and efficient platform for real-time data processing. VVR is based on Apache Flink and is optimized to handle massive volumes of data in real time, with advanced features for building complex streaming applications in production environments.