Mercury Inference Cluster


Cluster Overview

Basic Information

Cluster Type

AI Inference

Status

Healthy

Total Racks

4

Location

Datacenter

US-EAST-1

Halls

Hall A

Description

Low-latency inference cluster optimized for serving AI models

Cluster Utilization

5.2%

87.1%

390 active jobs

Power Consumption

19.5 MW

PUE: 1 • 610.2 kW/node

GPU Health

96%

All GPUs operational

Compute Performance

390.0K TFLOPS

195 jobs queued

Cluster Specifications

Compute Resources

Total Nodes

32

CPU Cores

2,048

Memory

16 TB

Storage

0.5 PB

GPU Configuration

Total GPUs

128

GPU Models

L4, L40S

Topology

CUSTOM

Interconnect

ETHERNET

GPU Utilization

87%

Network Configuration

Compute Fabric

INFINIBAND_NDR

Topology

DRAGONFLY

Bandwidth

39 Tbps

Latency

1980 ns

Management Subnet

10.27.8.0/24
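
As a rough sanity check on the figures above, the sketch below uses Python's standard `ipaddress` module to count the usable host addresses in the management subnet and compare that with the listed node count. It assumes one management address per node, which the dashboard does not state; `usable_hosts` and `fits` are hypothetical helper names.

```python
import ipaddress

# Management subnet and node count as listed on this page.
MGMT_SUBNET = ipaddress.ip_network("10.27.8.0/24")
TOTAL_NODES = 32  # assumption: one management address per node


def usable_hosts(net):
    """Count host addresses, excluding the network and broadcast addresses."""
    return sum(1 for _ in net.hosts())


def fits(net, node_count):
    """Return True if the subnet has at least node_count usable addresses."""
    return node_count <= usable_hosts(net)


if __name__ == "__main__":
    print(usable_hosts(MGMT_SUBNET), fits(MGMT_SUBNET, TOTAL_NODES))
```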

Rack Composition

Rack R1-1

COMPUTE

GPUs

48

92% healthy

Power

78.4 / 55 kW

Cooling

rear door

Temps

19°C → 35°C

Space

28/48U (20U free)

Rack R1-2

COMPUTE

GPUs

48

100% healthy

Power

75.9 / 55 kW

Cooling

rear door

Temps

14°C → 42°C

Space

28/48U (20U free)

Rack R1-3

NETWORK

Power

11.5 / 35 kW

Cooling

rear door

Temps

10°C → 50°C

Space

18/48U (30U free)

Rack R1-4

STORAGE

Power

12.3 / 35 kW

Cooling

rear door

Temps

26°C → 22°C

Space

16/48U (32U free)
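
The rack power readings above can be checked programmatically. The following is a minimal sketch, assuming each "Power" value reads as current draw / rated capacity in kW (the dashboard does not label the two numbers); `over_capacity` is a hypothetical helper, and the values are copied from the rack entries on this page.

```python
# Rack power readings copied from the Rack Composition entries.
# Assumption: each pair is (current draw in kW, rated capacity in kW).
RACKS = {
    "R1-1": (78.4, 55.0),
    "R1-2": (75.9, 55.0),
    "R1-3": (11.5, 35.0),
    "R1-4": (12.3, 35.0),
}


def over_capacity(racks):
    """Return {rack: draw/capacity ratio} for racks drawing more than rated power."""
    return {
        name: round(draw / cap, 2)
        for name, (draw, cap) in racks.items()
        if draw > cap
    }


if __name__ == "__main__":
    for name, ratio in over_capacity(RACKS).items():
        print(f"{name}: {ratio:.0%} of rated power")
```

Under that reading, a check like this would flag the two COMPUTE racks, which may be worth confirming against the power-capping configuration.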

Workload Scheduler

Type

PBS

Endpoint

https://annual-transparency.biz/

Version

1.0.19

Jobs Running

390

Jobs Queued

195

Configuration

Auto Scaling

Enabled

Power Capping

Enabled

Power Limit

19.5 MW

Maintenance Window

Tue 9:00 (4h)

Connected Storage Systems

VAST Data Platform

HEALTHY

VAST

Capacity

600 / 1200 TB

Performance

IOPS

950.0K

Throughput

95 GB/s

1 degraded drive

NetApp ONTAP

HEALTHY

NETAPP

Capacity

1000 / 2000 TB

Performance

IOPS

1.3M

Throughput

125 GB/s

1 degraded drive

Metadata

Created

7/5/2025, 8:29:55 AM

Last Updated

7/5/2025, 8:29:55 AM

Tags

production