Introducing the Project Argus Datacenter-ready Secure Control Module design specification

By admin

6 min read

Historically, data center servers have used motherboards that included all key components on a single circuit board. The DC-SCM (Datacenter-ready Secure Control Module) decouples server management and security functions from a traditional server motherboard, enabling development of server management and security solutions independent of server architecture. It also provides opportunities for reducing server printed circuit board (PCB) material cost, and allows unified firmware images to be developed.


Today, Cloudflare is announcing that it has partnered with Lenovo to design a DC-SCM for our next-generation servers. The design specification has been published to the OCP (Open Compute Project) contribution database under the name Project Argus.

A brief introduction to baseboard management controllers

A baseboard management controller (BMC) is a specialized processor that can be found in virtually every server product. It allows remote access to the server through a network connection, and provides a rich set of server management features. Some of the commonly used BMC features include server power management, device discovery, sensor monitoring, remote firmware update, system event logging, and error reporting.

In a typical server design, the BMC resides on the server motherboard, along with other key components such as the processor, memory, CPLD, and so on. This was the norm for generations of server products, but that has changed in recent years as motherboards are increasingly optimized for high-speed signal bandwidth, and servers need to support specialized security requirements. This has made it necessary to decouple the BMC and its related components from the server motherboard, and move them to a smaller common form factor module known as the Datacenter-ready Secure Control Module (DC-SCM).

Figure 1 is a picture of a motherboard used on Cloudflare’s previous generation of edge servers. The BMC and its related circuit components are placed on the same printed circuit board as the host CPU.

Figure 1: Previous Generation Server Motherboard

For Cloudflare’s next generation of edge servers, we are partnering with Lenovo to create a DC-SCM based design. On the left-hand side of Figure 2 is the printed circuit board assembly (PCBA) for the Host Processor Module (HPM). It hosts the CPU, the memory slots, and other components required for the operation and features of the server design. But the BMC and its related circuits have been relocated to a separate PCBA, which is the DC-SCM.

Figure 2: Next Generation HPM and DC-SCM

Benefits of DC-SCM based server design


PCB cost reduction

As of today, DDR5 memory runs at 6400 MT/s (megatransfers per second). In the future, DDR5 speeds may increase to 7200 MT/s or 8800 MT/s. Meanwhile, PCIe Gen5 runs at 32 GT/s (gigatransfers per second), double the rate of PCIe Gen4. Both DDR5 and PCIe Gen5 are key interfaces for the processors used on our next-generation servers.

The increasing rates of high-speed IO signals and memory buses are pushing the next generation of server motherboard designs to transition from low-loss to ultra-low loss dielectric printed circuit board (PCB) materials, and to higher layer counts in the PCB. At the same time, the speed of the BMC and its related circuitry is not progressing as quickly. For example, the PCIe physical layer interface of the ASPEED AST2600 BMC only runs at PCIe Gen2 (5 GT/s).
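
To put these signaling rates side by side, the following minimal Python sketch converts them into approximate peak bandwidth. It assumes the standard line encodings (8b/10b for PCIe Gen2, 128b/130b for Gen4 and Gen5) and a 64-bit DDR5 data bus per DIMM, and it ignores protocol overhead. The function and table names are illustrative only.

```python
# Payload bits per transmitted bit for each PCIe generation discussed here.
PCIE_ENCODING = {
    2: 8 / 10,      # Gen2 uses 8b/10b encoding
    4: 128 / 130,   # Gen4 uses 128b/130b encoding
    5: 128 / 130,   # Gen5 uses 128b/130b encoding
}

# Raw signaling rate per lane in GT/s.
PCIE_GT_PER_S = {2: 5.0, 4: 16.0, 5: 32.0}


def pcie_lane_gbytes_per_s(gen: int) -> float:
    """Approximate usable bandwidth of one lane, one direction, in GB/s."""
    return PCIE_GT_PER_S[gen] * PCIE_ENCODING[gen] / 8


def ddr5_gbytes_per_s(mt_per_s: int, bus_bytes: int = 8) -> float:
    """Peak bandwidth of a 64-bit (8-byte) DDR5 data bus in GB/s."""
    return mt_per_s * bus_bytes / 1000


if __name__ == "__main__":
    for gen in (2, 4, 5):
        print(f"PCIe Gen{gen}: {pcie_lane_gbytes_per_s(gen):.2f} GB/s per lane")
    print(f"DDR5-6400: {ddr5_gbytes_per_s(6400):.1f} GB/s per DIMM")
```

Under these assumptions, a Gen5 lane carries roughly eight times the payload of the Gen2 lane on the BMC, which illustrates why the two sets of circuitry place such different demands on the PCB.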


Ultra-low loss dielectric PCB material and higher PCB layer counts both drive up PCB cost. Another driving factor of PCB cost is the size of the PCB. In a traditional server motherboard design, the server motherboard is larger, since the BMC and its related circuits are placed on the same PCB as the host CPU.

By decoupling the BMC and its related circuitry from the host processor module (HPM), we can reduce the size of the relatively more expensive PCB for the HPM. The BMC and its related circuitry can be placed on a relatively cheaper PCB, with a reduced layer count and lossier PCB dielectric materials. For example, in the design of Cloudflare’s next generation of servers, the server motherboard PCB needs to be 14 or more layers, whereas the BMC and its related components can be easily routed with 8 or 10 layers of PCB. In addition, the dielectric material used on the DC-SCM PCB is low-loss dielectric, another cost saver compared to the ultra-low loss dielectric materials used on the HPM PCB.
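
The cost argument can be sketched with a deliberately simple toy model in which board cost scales with area, layer count, and a material factor. Real PCB pricing is not this linear, and every area and material factor below is a hypothetical placeholder rather than a figure from the Project Argus design; only the layer counts (14 and 8) come from the text above.

```python
from dataclasses import dataclass


@dataclass
class Board:
    name: str
    area_cm2: float         # board area (hypothetical value)
    layers: int             # copper layer count
    material_factor: float  # relative laminate cost (hypothetical value)


def relative_cost(board: Board) -> float:
    """Toy model: cost scales with area x layer count x material factor."""
    return board.area_cm2 * board.layers * board.material_factor


# Hypothetical numbers for illustration only; layer counts follow the text.
monolithic = Board("motherboard with BMC", 1000.0, 14, 2.0)
hpm = Board("HPM", 900.0, 14, 2.0)
dc_scm = Board("DC-SCM", 100.0, 8, 1.0)

split_cost = relative_cost(hpm) + relative_cost(dc_scm)

if __name__ == "__main__":
    print(f"monolithic: {relative_cost(monolithic):.0f}")
    print(f"HPM + DC-SCM: {split_cost:.0f}")
```

In this model the saving comes entirely from moving the BMC area onto the board with fewer layers and the cheaper dielectric, which is the mechanism the paragraph above describes.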

Modularized design enables flexibility

DC-SCM modularizes server management and security components into a common add-in card form factor, enabling developers to move customer-specific solutions from the more complex components, such as motherboards, to the DC-SCM. This provides flexibility for developers to offer multiple customer-specific solutions, without the need to redesign a motherboard for each solution.

Developers are able to reuse the DC-SCM from a previous generation of server design, if the management and security requirements remain the same. This reduces the overall cost of upgrading to a new generation of servers, and has the potential to reduce e-waste when a server is decommissioned.

Likewise, management and security solution upgrades within a server generation can be carried out separately by modifying or replacing the DC-SCM. The more complex components on the HPM do not need to be redesigned. From a data center perspective, it speeds up the upgrade of management and security hardware across multiple server platforms.

Unified interoperable OpenBMC firmware development

Data center secure control interface (DC-SCI) is a standardized hardware interface between DC-SCM and the Host Processor Module (HPM). It provides a basis for electrical interoperability between different DC-SCM and host processor module (HPM) designs.

This interoperability makes it possible to have a unified firmware image across multiple DC-SCM designs, concentrating development resources on a single firmware rather than an array of them. The publicly accessible OpenBMC repository provides a perfect platform for firmware developers of different companies to collaborate and develop such unified OpenBMC images. Instead of maintaining a separate BMC firmware image for each platform, we now use a single image that can be applied across multiple server platforms. The device tree specific to each respective server is automatically loaded based on device product information.
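
As a rough illustration of the idea, and not a description of how OpenBMC or the Project Argus firmware actually implements it, the selection step can be thought of as a lookup from product information to a device tree. The product names, file names, and the fallback behavior in this sketch are all hypothetical.

```python
# Hypothetical product names and device tree file names, for illustration only.
DEVICE_TREES = {
    "example-platform-a": "platform-a.dtb",
    "example-platform-b": "platform-b.dtb",
}

# Assumption: an unrecognized product falls back to a minimal device tree.
MINIMUM_BOOTABLE_DTB = "minimum-bootable.dtb"


def select_device_tree(product_name: str) -> str:
    """Return the device tree file for a product, or the minimal fallback."""
    key = product_name.strip().lower()
    return DEVICE_TREES.get(key, MINIMUM_BOOTABLE_DTB)


if __name__ == "__main__":
    print(select_device_tree("Example-Platform-A"))
    print(select_device_tree("unknown-platform"))
```

Because the platform-specific data lives in a table rather than in the image itself, supporting a new server in this model means adding an entry instead of building another firmware image.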

Using a unified OpenBMC image significantly simplifies the process of releasing BMC firmware to multiple server platforms. Firmware updates and changes are propagated to all supported platforms in a single firmware release.


Project Argus

The DC-SCM specifications have been driven by the Open Compute Project (OCP) Foundation hardware management workstream, as a way to standardize server management, security, and control features.


Cloudflare has partnered with Lenovo on what we call Project Argus, Cloudflare’s first DC-SCM implementation that fully adheres to the DC-SCM 2.0 specification. In the DC-SCM 2.0 specification, a few design items are left open for implementers to decide on the most suitable architectural choices. With the goal of improving interoperability of Cloudflare DC-SCM designs across server vendors and server designs, Project Argus includes documentation on implementation details and design decisions on form factor, mechanical locking mechanism, faceplate design, DC-SCI pinout, BMC chip, BMC pinout, Hardware Root of Trust (HWRoT), HWRoT pinout, and minimum bootable device tree.

Figure 3: Project Argus DC-SCM 2.0

At the heart of the Project Argus DC-SCM is the ASPEED AST2600 BMC System on Chip (SoC), which, when loaded with compatible OpenBMC firmware, provides a rich set of common features necessary for remote server management.

The ASPEED AST1060 is used on the Project Argus DC-SCM as the HWRoT solution, providing secure firmware authentication, firmware recovery, and firmware update capability.

Project Argus DC-SCM 2.0 uses a Lattice MachXO3D CPLD with secure boot and dual boot capability as the DC-SCM CPLD to support a variety of IO interfaces, including LTPI, SGPIO, UART, and GPIOs.

The mechanical form factor of Project Argus DC-SCM 2.0 is the horizontal External Form Factor (EFF).


Cloudflare and Lenovo have contributed the Project Argus Design Specification and reference design files to the OCP contribution database. Below is a detailed list of our contributions:

SPI, I2C/I3C, UART, LTPI/SGPIO block diagrams

DC-SCM PCB stackup

DC-SCM Board placements (TOP and BOTTOM layers)

DC-SCM schematic PDF file

DC-SCI pin definition PDF file

Power sequence PDF file

DC-SCM bill of materials Excel spreadsheet

Minimum bootable device tree requirements

Mechanical Drawings PDF files, including card assembly drawing and interlock rail drawing

The security foundation for our Gen 12 hardware

Cloudflare has been innovating around server design for many years, delivering increased performance per watt and reduced carbon footprints. We are excited to integrate Project Argus DC-SCM 2.0 into our next-generation, Cloudflare Gen 12 servers. Stay tuned for more exciting updates on Cloudflare Gen 12 hardware design!