
This document is open for review until October 26. Please leave your comments inline. Thanks!

Primary Reviewers: Azhar Sayeed, Vance Shipley, Sandeep Karkala, Seshu Kumar Mudiganti, Huifeng Le

Content:

→ 1.  Introduction – why telecom industry needs Cloud Native PaaS

...

→ 4.2.3.4 Telco PaaS Solution Proposed by Intel

→ 4.2.3.5 Telco PaaS Solution Proposed by China Mobile

→ 4.2.4 XGVela PaaS Workflow

...

XGVela, as a cloud native PaaS platform, runs in a container environment by default and interworks with K8S to orchestrate and manage containers. Network Functions (NFs) and applications obtain the common PaaS capabilities they need from XGVela. Upper-layer telco management systems can select XGVela PaaS capabilities to act as their sub-modules for O&M functions; XGVela PaaS capabilities can also be treated as “platform” resources that are orchestrated by management systems.

...

PaaS Service represents the capabilities and functions required by applications, developers, and operations staff; it delivers the core value of the PaaS platform. It is managed and orchestrated by PaaS Management. The three types of PaaS capabilities summarized in chapter 2 belong to PaaS Service, which is likewise divided into three categories (General PaaS, General PaaS + Adaptation Layer, Telco PaaS) according to usage scenario. The number of PaaS services keeps growing as use cases diversify, and PaaS services keep being upgraded according to customers’ needs. Some functions/services have already been identified for each category. For a detailed description and the requirements of each service, please refer to chapters 4.2.2 and 4.2.3.

...

  • API Gateway is required to support identity authentication and authorization of API calls, verifying the legitimacy and authority of the calling entity (user, other software, etc.) to realize secure access control and avoid security threats to PaaS services.
  • API Gateway is required to support forwarding user requests to the correct processing unit. The forwarding process supports load balancing and traffic control actions based on pre-defined traffic management policies, which may include health check, rate limiting, timeout & retry, circuit breaker, etc.
  • API Gateway is recommended to support protocol processing, including HTTP, HTTP/2, gRPC, WebSocket, etc.
  • API Gateway is required to support managing the service APIs of PaaS services, including API definition (path, parameters, etc.) and API publishing/suspending/online/offline/withdrawal, etc. This feature is mainly used by developers/operators to provide API-type PaaS services for external usage.
  • API Gateway is required to support monitoring API usage and analyzing API data, covering performance, usage, alarms, logs, etc.
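The forwarding requirement above can be illustrated with a minimal Python sketch that combines two of the listed traffic control actions: rate limiting (a token bucket) and health-aware round-robin load balancing. This is not how Kong, Tyk, or any other gateway is implemented; the class names, the status codes returned, and the explicit `now` timestamp argument are all assumptions made for illustration.

```python
import itertools


class TokenBucket:
    """Hypothetical rate limiter: bursts up to `capacity`, refills at `rate` tokens/second."""

    def __init__(self, rate, capacity):
        self.rate = rate
        self.capacity = capacity
        self.tokens = capacity
        self.last = 0.0

    def allow(self, now):
        # Refill in proportion to elapsed time, never above capacity.
        elapsed = max(0.0, now - self.last)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


class Gateway:
    """Hypothetical gateway core: rate limit first, then round-robin over healthy backends."""

    def __init__(self, backends, limiter):
        self.backends = list(backends)
        self.healthy = set(self.backends)
        self.limiter = limiter
        self._round_robin = itertools.cycle(self.backends)

    def mark(self, backend, is_healthy):
        # In a real gateway this would be driven by periodic health checks.
        if is_healthy:
            self.healthy.add(backend)
        else:
            self.healthy.discard(backend)

    def route(self, now):
        """Return (status, backend): 429 if rate limited, 503 if no healthy backend."""
        if not self.limiter.allow(now):
            return (429, None)
        for _ in range(len(self.backends)):
            candidate = next(self._round_robin)
            if candidate in self.healthy:
                return (200, candidate)
        return (503, None)
```

Passing the timestamp in explicitly keeps the sketch deterministic; a real gateway would read a monotonic clock and would add timeout, retry, and circuit-breaker handling around the actual forwarding call.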

...

  • General PaaS shall complete service lifecycle management through PaaS Management. There are usually two ways for users to use a PaaS service: calling the API of an API-type PaaS service, or creating an instance of an instance-type PaaS service. API-type PaaS services (such as AI services for facial recognition, image processing, etc.) are managed by the API Gateway of PaaS Management. Instance-type PaaS services (such as DB, LB, etc.) are managed by Service Lifecycle Management.
  • General PaaS services shall be packaged as an Operator or a Helm Chart. Related images and packages can be stored in the local Image & Package Repository, or in a remote repository if automatic access to the remote repository is pre-configured.
  • All General PaaS services shall generate monitoring data, logs, events, alarms, etc., and report them to the Service & Resource Monitoring function and the Service & Resource Log & Event function in PaaS Management.
  • General PaaS services shall support deployment on any Kubernetes-based CaaS (Container as a Service) layer.
  • General PaaS shall support custom configuration. The configuration parameters are designed by the PaaS Service Provider; the configuration contents are provided by users according to application requirements, following certain rules.
  • It is recommended to select General PaaS software from commonly used CNCF projects.

...

Commonly used open-source tools are Kong, Tyk, 3scale, Istio, and EMCO.

Service Discovery & Registration

General PaaS shall provide Service Discovery & Registration functions and software to help the microservices of an application/system obtain each other's access information.

  • Service Discovery & Registration functions shall maintain real-time microservice access information, which includes adding the address of a new microservice, updating the address of a microservice instance, deleting the information of a faulty microservice, etc.
  • Commonly used open-source software includes CoreDNS, etcd, ZooKeeper, Netflix, and Nacos, among which CoreDNS, etcd, and ZooKeeper are popular choices.
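The three maintenance operations above (add, update, delete) can be sketched as a small in-memory registry in Python. This is only an illustration of the behaviour, not the mechanism used by CoreDNS, etcd, or ZooKeeper; the class name, the TTL-based eviction rule, and the example service names and addresses are assumptions.

```python
class ServiceRegistry:
    """Hypothetical registry: instances must refresh within `ttl_seconds` to stay listed."""

    def __init__(self, ttl_seconds):
        self.ttl = ttl_seconds
        self._instances = {}  # service name -> {address: last refresh timestamp}

    def register(self, service, address, now):
        # Adds a new instance, or refreshes (updates) an existing one.
        self._instances.setdefault(service, {})[address] = now

    def deregister(self, service, address):
        # Explicit removal, e.g. on graceful shutdown.
        self._instances.get(service, {}).pop(address, None)

    def lookup(self, service, now):
        """Return live addresses, evicting instances whose refresh is older than the TTL."""
        instances = self._instances.get(service, {})
        expired = [addr for addr, seen in instances.items() if now - seen > self.ttl]
        for addr in expired:
            del instances[addr]  # treated as a faulty microservice instance
        return sorted(instances)


registry = ServiceRegistry(ttl_seconds=10)
registry.register("amf", "10.0.0.1:8080", now=0)
registry.register("amf", "10.0.0.2:8080", now=0)
registry.register("amf", "10.0.0.1:8080", now=8)   # refresh from the first instance
live_at_15 = registry.lookup("amf", now=15)         # second instance has timed out
```

Evicting on lookup keeps the sketch short; a production registry would typically expire entries in the background and notify subscribers of changes.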

...

However, in Kubernetes each pod typically has only one network interface (apart from a loopback), which is not enough for a production-ready telco network function. An open-source solution, Multus, addresses this problem. Multus is a container network interface (CNI) plugin for Kubernetes that enables attaching multiple network interfaces to pods. With Multus, you can create a multi-homed pod that has multiple interfaces. Multus CNI already supports many network CNIs, including Calico, Flannel, Userspace CNI (ovs-dpdk/vpp), OVN-Multi CNI, SRIOV-NIC CNI, and SmartNIC CNI. Developers can specify a different network CNI for each network plane to give containers multiple network planes. For example, the control network plane can choose traditional CNIs with lower performance, such as Calico or Flannel, while the data/user network plane can choose CNIs with better forwarding performance, such as SRIOV-NIC CNI.
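As a hedged illustration of a multi-homed pod, the Python sketch below builds a pod manifest that requests additional interfaces through the Multus annotation `k8s.v1.cni.cncf.io/networks`. The helper name, the pod and image names, and the attachment name `sriov-user-plane` are hypothetical; the sketch assumes that a NetworkAttachmentDefinition with that name already exists in the cluster and that the default cluster network (for example Calico or Flannel) provides the control-plane interface.

```python
import json


def multi_homed_pod(name, image, attachments):
    """Build a pod manifest with extra interfaces requested through Multus.

    `attachments` maps an interface name inside the pod (e.g. "net1") to the
    name of an existing NetworkAttachmentDefinition (hypothetical names here).
    """
    networks = [
        {"name": nad_name, "interface": if_name}
        for if_name, nad_name in attachments.items()
    ]
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {
            "name": name,
            # Multus reads this annotation and attaches one interface per entry.
            "annotations": {"k8s.v1.cni.cncf.io/networks": json.dumps(networks)},
        },
        "spec": {"containers": [{"name": name, "image": image}]},
    }


# Control plane uses the default cluster network; the user plane gets an
# additional SR-IOV backed interface (attachment name is an assumption).
pod = multi_homed_pod("upf", "example/upf:latest", {"net1": "sriov-user-plane"})
manifest_json = json.dumps(pod, indent=2)
```

The resulting JSON can be applied like any other pod manifest. A real SR-IOV attachment usually also needs a matching device resource request in the container spec, which is omitted here.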

As described above, Multus allows a pod to have multiple virtual network interfaces and so provides multiple network planes for containers at the infrastructure level. However, developers of network functions still need to understand the working principles and usage of the different CNIs, so the difficulty of network development has not been reduced. Therefore, based on the Multus solution, a Telco PaaS function named NMaaS is proposed in this chapter.

NMaaS, Network Management as a Service, is proposed to expose container multi-network-plane capability as a service. It has the following features:

  • Exposes NB APIs for orchestration/management systems or developers to configure the infrastructure NIC, add/delete interfaces of an NF at runtime, configure SRIOV, etc.
  • Shields the different usage of different CNIs and provides a consistent user experience for NIC management. For example, developers can simply specify the number of vNICs and the SLA requirements on these vNICs to complete container vNIC configuration.
  • Supports customized values for vNIC parameters such as latency, jitter, bandwidth, throughput, and performance, and triggers the CaaS layer to set up the required underlying driver/software based on these requirements.


Figure 4-7 Proposed NMaaS Diagram (need update)

Figure 4-7 shows the basic working principle of NMaaS. The NMaaS pod receives a vNIC request from orchestration/management systems or from the command line through the standard NMaaS NBI. NMaaS converts the vNIC request into content that can be understood by the CaaS layer, including vNIC type, amount, configuration, etc. The converted information is sent to Multus to set up the required underlying driver/software. NMaaS then triggers K8S to create the new vNIC for the target pod and updates the traffic and vNIC configuration in the pod.
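The conversion step in this flow can be sketched in Python as follows. The NMaaS NBI is not defined in this document, so everything here is hypothetical: the request fields, the function names, and in particular the policy table, whose bandwidth figures are arbitrary placeholders rather than measured CNI performance.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VnicRequest:
    """Hypothetical NBI request: what a developer asks for, with no CNI knowledge."""
    plane: str                 # e.g. "control" or "user"
    count: int                 # number of vNICs on this plane
    min_bandwidth_gbps: float  # SLA requirement per vNIC


# Hypothetical policy, simplest CNI first. The limits are placeholders chosen
# for illustration only; a real deployment would derive them from its own data.
CNI_POLICY = [("flannel", 1.0), ("userspace-ovs-dpdk", 10.0), ("sriov", 25.0)]


def select_cni(min_bandwidth_gbps):
    """Pick the first CNI in the policy table that is assumed to meet the SLA."""
    for cni, assumed_limit in CNI_POLICY:
        if min_bandwidth_gbps <= assumed_limit:
            return cni
    raise ValueError(f"no CNI in the policy satisfies {min_bandwidth_gbps} Gbps")


def convert(requests):
    """Translate NBI requests into per-interface entries for the CaaS layer."""
    plan = []
    index = 1
    for req in requests:
        cni = select_cni(req.min_bandwidth_gbps)
        for _ in range(req.count):
            plan.append({"interface": f"net{index}", "plane": req.plane, "cni": cni})
            index += 1
    return plan


plan = convert([
    VnicRequest(plane="control", count=1, min_bandwidth_gbps=0.5),
    VnicRequest(plane="user", count=2, min_bandwidth_gbps=20.0),
])
```

The point of the sketch is the separation of concerns: the caller states only the number of vNICs and their SLA, and the CNI choice stays inside the service.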

4.2.3.5 Telco PaaS Solution Proposed by China Mobile

In the telecommunication industry, network functions are designed following a common paradigm and a distributed architecture. Whether it is a PNF, VNF, or CNF, each NF is designed as a distributed architecture with multiple modules, typically including an interface module (in charge of south-north communication with other NFs), business processing modules (which handle the NF business logic), a data storage module (which stores NF data), and an operation and management module (in charge of NF operation and management). Among these functional modules, the operation and management module (OAM module) provides similar functions for different NFs and can therefore be shared as a common module.

The common operation and management functions used by NFs usually include, but are not limited to, configuration management, heartbeat management, performance management, log management, and alarm management. Open-source network function projects such as OAI and Free5GC show that operation and management functions are not related to business logic and are usually not included in the network function source code. However, these OAM functions are necessary when NFs go into production, and they are usually developed by vendors or operators. Besides, according to cloud native application design principles, observability is one of the most important features an application should have. With these OAM functions, an NF can achieve observability easily.

As OAM functions are common, reusable, non-business-logic-related, and necessary for NFs, they can be treated as PaaS capabilities provided by the cloud service provider to NF developers. Using platform-provided OAM functions helps NF developers focus on core business design and relieves them of repetitive OAM implementation. Cloud native OAM, a group of operation and management functions for NFs, can therefore be treated as a Telco PaaS capability.

Typical functions of cloud native OAM:

  • NF configuration management function: this function manages the configurations of NFs. It accepts NF configuration commands from orchestration systems, manages the current and historical configurations of an NF, and configures the NF. As NF configurations are usually stored in configuration files whose formats differ between vendors, the management target of this function is the configuration file only, without defining the file contents. For newly developed NFs that strictly follow cloud native principles and run on Kubernetes/Docker, configurations can be stored directly in etcd as ConfigMaps.
  • NF heartbeat management function: this function monitors the NF's health status and can be implemented at three levels. The easiest level collects an NF-level heartbeat from outside the NF. It relies on the NF itself to generate heartbeats in each microservice instance, monitor those internal heartbeats, and determine its own health status from the overall microservice heartbeat data. A heartbeat collected at this level shows only whether the NF as a whole is working, not whether it is healthy internally. The middle level collects microservice-level heartbeats. It relies on each NF microservice to report its heartbeat, from which the function determines whether the NF is healthy according to pre-defined rules. The hardest level, which is also the most cloud native, monitors the health status of pods through Kubernetes liveness probes and derives microservice-level and NF-level health status automatically. As current NFs are mostly self-contained, the easiest level is the most widely used, while the middle and hardest levels are the preferred direction for the future.
  • NF performance management function: this function monitors the metrics of an NF. It collects metrics data and analyzes it based on pre-defined rules. Prometheus is the most commonly selected software.
  • NF log management function: this function collects and stores the logs of an NF.
  • NF alarm management function: this function collects the alarms of an NF at both NF level and resource level. It also supports cleaning up alarm data.
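The middle-level heartbeat implementation described above can be sketched in Python as follows: each microservice instance reports its own heartbeat, and the function derives NF health from a pre-defined rule. The class name, the timeout, and the rule itself (a minimum number of live instances per microservice) are assumptions chosen for illustration, not part of the proposal.

```python
class HeartbeatMonitor:
    """Hypothetical middle-level heartbeat function for a single NF."""

    def __init__(self, timeout_seconds, min_alive):
        self.timeout = timeout_seconds
        # Pre-defined rule: microservice name -> minimum number of live instances.
        self.min_alive = dict(min_alive)
        self._last_seen = {}  # (microservice, instance id) -> last heartbeat time

    def report(self, microservice, instance, now):
        # Called whenever a microservice instance sends its heartbeat.
        self._last_seen[(microservice, instance)] = now

    def alive_counts(self, now):
        """Count instances per microservice whose heartbeat is within the timeout."""
        counts = {name: 0 for name in self.min_alive}
        for (name, _instance), seen in self._last_seen.items():
            if name in counts and now - seen <= self.timeout:
                counts[name] += 1
        return counts

    def nf_healthy(self, now):
        """The NF is healthy only if every microservice meets its minimum."""
        counts = self.alive_counts(now)
        return all(counts[name] >= need for name, need in self.min_alive.items())


monitor = HeartbeatMonitor(timeout_seconds=5, min_alive={"interface": 1, "processing": 2})
for name, instance in [("interface", "i-0"), ("processing", "p-0"), ("processing", "p-1")]:
    monitor.report(name, instance, now=0)
healthy_at_3 = monitor.nf_healthy(now=3)

# Only two of the three instances keep reporting; "p-1" goes silent.
monitor.report("interface", "i-0", now=6)
monitor.report("processing", "p-0", now=6)
healthy_at_8 = monitor.nf_healthy(now=8)
```

Unlike the easiest level, this view exposes which microservice is short of live instances, because `alive_counts` is available alongside the overall verdict.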

For more design and interface details, please go to:

4.2.4 XGVela PaaS Workflow

Having summarized the technical architecture and functional requirements, this chapter covers the XGVela-related workflows. These workflows are applicable to all PaaS platforms, including XGVela. This chapter covers only the simple, general workflows that have no telecom-specific features; the interaction with NFVO, VNFM, and other telecom systems will be considered in a future release.

Figure 4-8 PaaS Platform Workflow

Figure 4-8 simplifies the PaaS technical architecture and its relationship with external management systems into blocks: NF/application, User, Operator/Management systems, PaaS Service, PaaS Management, and CaaS. The major workflows are listed below.

...

Figure 5-1 describes typical deployment scenarios of the telecom network cloud. The telco network cloud can usually be separated into the following types: core cloud, regional cloud, edge cloud, and far edge nodes. The features of each deployment scenario are displayed in figure 5-1.


Figure 5-1 Typical Deployment Scenarios of Network Cloud

As each deployment scenario has different resources, carries different applications, and may have a different architecture, each has different HA requirements. For example, as the core cloud carries the most important telco core NFs and has sufficient resources, its HA level should be high, while the edge cloud may have a lower HA level because of its limited resources.

...