Rig Architecture

A GeForce NOW "rig" is a Windows VM running on bare-metal NVIDIA GPU servers. Each rig hosts a single game session at a time.

Hardware Layer

Rigs run on NVIDIA's nvmetal bare-metal infrastructure. The example rig recovered from the backup has these properties:

Property        Value
Hostname        [REDACTED_HOSTNAME]
Zone            [REDACTED_ZONE] (Amsterdam)
GPU             NVIDIA L40 (full)
Instance Type   [REDACTED_INSTANCE_TYPE]
Platform        NGN Platform v2.1
Hypervisor      Xen
OS              Windows 11 (QCOW2 image)

Virtualization

Rigs run as Xen VMs. The framework includes:

  • LocalXen / RemoteXen: VM lifecycle management (start, stop, restart)
  • PCI passthrough: GPU pinned directly to the VM
  • Xen tools: xenstore_client.exe for reading VM metadata
  • Hostname sync: the Windows hostname is set from the Xen VM name via wmic computersystem
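The hostname sync step can be sketched as follows. This is a minimal illustration, not the framework's actual code: the function names are hypothetical, and the VM name is taken as a parameter because the exact xenstore_client.exe invocation is not shown in the source. The wmic rename syntax itself is the standard Windows form.

```python
import subprocess


def build_rename_command(current_name: str, vm_name: str) -> str:
    """Build the wmic command line that renames the computer to vm_name."""
    return (
        f'wmic computersystem where name="{current_name}" '
        f'call rename name="{vm_name}"'
    )


def sync_hostname(current_name: str, vm_name: str, dry_run: bool = True) -> str:
    """Rename the Windows host to match the Xen VM name if they differ.

    Returns the command that was (or would be) run. An empty string means
    the names already match (Windows hostnames are case-insensitive).
    """
    if current_name.lower() == vm_name.lower():
        return ""
    cmd = build_rename_command(current_name, vm_name)
    if not dry_run:
        # Windows only: the string is passed to the process as-is.
        # Needs administrator rights; the new name applies after a reboot.
        subprocess.run(cmd, check=True)
    return cmd
```

With the default dry_run=True the function only reports what it would do, which makes it safe to try outside a rig.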

Zone Architecture

Zones are geographic clusters of rigs:

Zone: [REDACTED_ZONE] (Amsterdam)
├── Provision Managers (PM): [REDACTED_IP], [REDACTED_IP], ...
├── Game Seat Gateway (GSG): gsg.[REDACTED_ZONE].svc.cluster.local:443
├── DNS Cache: [REDACTED_IP]
├── KMS Host: consumerkms.nvidiangn.net:1688
├── Seat Pool: [REDACTED_POOL]
└── Logging Server: [REDACTED_IP]

Zone Properties

Zone properties are stored in Redis and accessible via asgard_util_zone_properties. They are grouped into these sections:

Section          Contents
ZoneProperties   Zone name, mode (gaming/pro), region
NetworkTopology  IP ranges, subnets, routing
GameMachine      GPU type, instance config
NATVM            NAT virtual machine settings

Machine Roles

Role      Description
awsseats  Game seat VMs
natvm     NAT/gateway VMs
pm        Provision Manager nodes
redis     State database nodes
storage   Storage servers

State Database

Every rig connects to a Redis instance (port 6399) for:

  • Configuration storage
  • Session state tracking
  • Zone property caching
  • Service registration
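A quick way to confirm that the state database is reachable is a raw PING over the Redis wire protocol (RESP). The sketch below uses only the standard library; the port 6399 comes from the text above, while the host argument and function names are placeholders. A server that requires authentication answers with an error instead of +PONG, which this check reports as False.

```python
import socket


def encode_resp(*args: str) -> bytes:
    """Encode a command as a RESP array of bulk strings."""
    parts = [f"*{len(args)}\r\n".encode()]
    for arg in args:
        data = arg.encode()
        parts.append(b"$%d\r\n%s\r\n" % (len(data), data))
    return b"".join(parts)


def redis_ping(host: str, port: int = 6399, timeout: float = 2.0) -> bool:
    """Return True if the server answers PING with +PONG."""
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            sock.sendall(encode_resp("PING"))
            return sock.recv(64).startswith(b"+PONG")
    except OSError:
        # Connection refused, timeout, DNS failure, and similar.
        return False
```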

Seat Pool & Instance Types

The seat pool [REDACTED_POOL] determines rig capabilities:

Instance Type                     GPU Fraction  Use Case
[REDACTED_INSTANCE_TYPE_PATTERN]  Full L40      Premium tier (4K, HDR, 120fps)
[REDACTED_INSTANCE_TYPE_PATTERN]  Half L40      Standard tier
[REDACTED_INSTANCE_TYPE_PATTERN]  Quarter L40   Free tier
ga10g_2.*_large                   Half A10G     Standard tier
gt10_2.*_medium                   Half T10      Legacy tier

Instances with a half or quarter GPU carry restrictions such as:

  • AV1 encoding disabled on half GPUs
  • Video encoder perf check skipped
  • Lower resolution caps
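The mapping from instance type to capabilities can be sketched as a pattern lookup. Several things here are assumptions: the patterns are treated as regular expressions (they could equally be globs), only the two unredacted rows are included, the instance names in the usage note are invented, and the names SeatProfile, classify and av1_allowed are hypothetical. The source states the AV1 restriction only for half GPUs, so treating quarter GPUs the same way is a conservative guess.

```python
import re
from typing import NamedTuple, Optional


class SeatProfile(NamedTuple):
    gpu: str
    fraction: str  # "full", "half" or "quarter"
    tier: str


# Only the unredacted rows of the table; redacted patterns are omitted.
PROFILES = [
    (re.compile(r"ga10g_2.*_large"), SeatProfile("A10G", "half", "Standard")),
    (re.compile(r"gt10_2.*_medium"), SeatProfile("T10", "half", "Legacy")),
]


def classify(instance_type: str) -> Optional[SeatProfile]:
    """Return the first profile whose pattern matches the whole name."""
    for pattern, profile in PROFILES:
        if pattern.fullmatch(instance_type):
            return profile
    return None


def av1_allowed(profile: SeatProfile) -> bool:
    """AV1 is documented as disabled on half GPUs.

    Assumption: quarter GPUs are restricted as well, so only full GPUs pass.
    """
    return profile.fraction == "full"
```

For example, classify("ga10g_2.x_large") returns the half A10G profile, and av1_allowed on that profile is False.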

L1 Validation Tests

Before a rig accepts sessions, l1test.py validates:

  1. Environment variables (AG_LOGS, AG_HOME)
  2. Required services installed:
    • nvcloudinit
    • KioskPwdChanger
    • seatinitservice
    • NvContainerLocalSystem
  3. User accounts (kiosk, xen)
  4. GPU present and drivers installed
  5. Network connectivity (DHCP/ping)
  6. Rollback state clean
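The first two checks in the list above might look roughly like this. It is a sketch of the idea, not the contents of l1test.py: the function names are hypothetical, and the service probe is injected so the logic can be exercised off-rig. The sc query probe relies on sc returning a non-zero exit code for a service that does not exist.

```python
import subprocess
from typing import Callable, Dict, List, Mapping

REQUIRED_ENV = ("AG_LOGS", "AG_HOME")
REQUIRED_SERVICES = (
    "nvcloudinit",
    "KioskPwdChanger",
    "seatinitservice",
    "NvContainerLocalSystem",
)


def missing_env(environ: Mapping[str, str]) -> List[str]:
    """Required variables that are unset or empty."""
    return [name for name in REQUIRED_ENV if not environ.get(name)]


def missing_services(is_installed: Callable[[str], bool]) -> List[str]:
    """Required services that the probe reports as absent."""
    return [svc for svc in REQUIRED_SERVICES if not is_installed(svc)]


def sc_query_installed(service: str) -> bool:
    """Windows-only probe: `sc query <name>` fails for unknown services."""
    result = subprocess.run(["sc", "query", service], capture_output=True)
    return result.returncode == 0


def run_l1(environ: Mapping[str, str],
           is_installed: Callable[[str], bool]) -> Dict[str, List[str]]:
    """Collect failures per check; empty lists mean the check passed."""
    return {
        "env": missing_env(environ),
        "services": missing_services(is_installed),
    }
```

On a rig this would be called as run_l1(os.environ, sc_query_installed) after importing os; a stub probe works anywhere else.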

Crash Detection & Recovery

A Windows Scheduled Task (CrashDetector.xml) runs detector.exe at boot:

  • Monitors for crash dumps from nvstreamer.exe, nvcontainer.exe, and NvRtcStreamer
  • Collects .dmp files from:
    • C:\ProgramData\NVIDIA Corporation\Crashdumps\
    • C:\Windows\MEMORY.DMP
    • C:\Windows\minidump\
    • C:\asgard\logs\AutoOnboarder\
  • Reports crashes via telemetry events:
    • NGS_NvStreamerCrash
    • NGS_NvContainerCrash
    • NGS_KernelCrash
    • NGS_CTMTCrash
    • NGS_TasCrash
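The collection step can be illustrated with a small scanner over the locations listed above. This is a guess at the behaviour of detector.exe, not a reconstruction of it. The mapping from dump to telemetry event is inferred from the names alone, it assumes dump file names start with the crashing process name, and events whose trigger is unclear (NGS_CTMTCrash, NGS_TasCrash) are left unmapped.

```python
from pathlib import Path
from typing import Iterable, Iterator, Optional

DUMP_LOCATIONS = [
    r"C:\ProgramData\NVIDIA Corporation\Crashdumps",
    r"C:\Windows\MEMORY.DMP",
    r"C:\Windows\minidump",
    r"C:\asgard\logs\AutoOnboarder",
]

# Assumed mapping, inferred from the process and event names.
EVENT_BY_PREFIX = {
    "nvstreamer": "NGS_NvStreamerCrash",
    "nvcontainer": "NGS_NvContainerCrash",
}


def find_dumps(locations: Iterable[str]) -> Iterator[Path]:
    """Yield dump files; a location may be a single file or a directory."""
    for loc in locations:
        path = Path(loc)
        if path.is_file():
            yield path
        elif path.is_dir():
            yield from sorted(
                p for p in path.iterdir() if p.suffix.lower() == ".dmp"
            )


def event_for(dump: Path) -> Optional[str]:
    """Pick a telemetry event name for a dump, or None if unknown."""
    name = dump.name.lower()
    for prefix, event in EVENT_BY_PREFIX.items():
        if name.startswith(prefix):
            return event
    # Assumption: full memory dumps and minidumps indicate a kernel crash.
    if name == "memory.dmp" or dump.parent.name.lower() == "minidump":
        return "NGS_KernelCrash"
    return None
```

Missing locations are skipped silently, so list(find_dumps(DUMP_LOCATIONS)) is safe to run on a machine that has none of them.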

Windows Licensing

Rigs use KMS activation:

KMS Host: consumerkms.nvidiangn.net:1688

The rearmWindows.py script:

  • Checks license grace period via slmgr.vbs
  • Rearms when fewer than 15 days of the grace period remain
  • Disables Windows Activation Technologies (WatAdminSvc)
  • Triggers reboot after rearm
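The rearm decision reduces to a threshold check on the remaining grace period. The sketch below is not taken from rearmWindows.py. It assumes the slmgr.vbs license report contains a figure of the form "N minute(s)", which should be verified against real output on the target Windows build, and all function names are hypothetical. The /rearm switch is a standard slmgr.vbs option.

```python
import re
from typing import Optional

REARM_THRESHOLD_DAYS = 15
MINUTES_PER_DAY = 24 * 60

# Standard slmgr.vbs invocation for resetting the licensing state.
REARM_COMMAND = r"cscript //nologo C:\Windows\System32\slmgr.vbs /rearm"


def parse_minutes_remaining(slmgr_output: str) -> Optional[int]:
    """Extract the first 'N minute(s)' figure (assumed output format)."""
    match = re.search(r"(\d+)\s+minute\(s\)", slmgr_output)
    return int(match.group(1)) if match else None


def should_rearm(minutes_remaining: Optional[int]) -> bool:
    """True when fewer than 15 days remain; unknown values never rearm."""
    if minutes_remaining is None:
        return False
    return minutes_remaining < REARM_THRESHOLD_DAYS * MINUTES_PER_DAY
```

Returning False for unparseable output is a deliberate choice here: a rearm triggers a reboot, so the sketch avoids acting on data it could not read.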

Performance Monitoring

GridPerf Templates

Windows Performance Monitor templates (GridPerfMonTemplate.xml) track:

  • Process metrics: nvstreamer, NvRtcStreamer, NvGridSvc, apptracer
  • GPU counters: Temperature, clock speeds, utilization, frame buffer usage
  • System: Memory, disk, CPU, network

GPU Performance Counters

Custom NVIDIA counter manifest (nvPerfProvider.man):

Counter                   Unit
Temperature               Degrees Celsius
Graphics Clock            MHz
Memory Clock              MHz
Graphics Utilization      %
Frame Buffer Utilization  %
Video Utilization         %
Fan/Cooler Level          %
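For spot checks outside Performance Monitor, several of these values can also be read with nvidia-smi. This is a different mechanism from the nvPerfProvider counters, shown only as a convenient cross-check; it covers a subset of the table (temperature, clocks, graphics utilization, fan level), and the function names are hypothetical. Fields a GPU does not report, such as fan speed on passively cooled datacenter cards, come back as non-numeric text and are mapped to None.

```python
import subprocess
from typing import Dict, Optional

# nvidia-smi --query-gpu field names for a subset of the counters.
FIELDS = (
    "temperature.gpu",   # degrees Celsius
    "clocks.gr",         # MHz
    "clocks.mem",        # MHz
    "utilization.gpu",   # %
    "fan.speed",         # %
)


def parse_sample(csv_line: str) -> Dict[str, Optional[float]]:
    """Parse one CSV row produced with --format=csv,noheader,nounits."""
    values = [v.strip() for v in csv_line.split(",")]
    if len(values) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} values, got {len(values)}")
    sample: Dict[str, Optional[float]] = {}
    for field, value in zip(FIELDS, values):
        try:
            sample[field] = float(value)
        except ValueError:
            sample[field] = None  # unsupported or unavailable on this GPU
    return sample


def read_gpu_sample() -> Dict[str, Optional[float]]:
    """Query the first GPU; requires nvidia-smi on PATH."""
    out = subprocess.run(
        ["nvidia-smi", f"--query-gpu={','.join(FIELDS)}",
         "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    ).stdout
    return parse_sample(out.splitlines()[0])
```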

admindesk.top — Reversed & documented from Asgard rig backups and GCIS plugin binaries.