Temperature Monitoring System Feature Request
Is your feature request related to a problem?
Currently, the Unraid API provides limited temperature monitoring capabilities. While disk temperatures are available through the DisksService using smartctl, there's no comprehensive temperature monitoring system that covers:
- CPU temperatures (package and per-core)
- Motherboard temperatures
- GPU temperatures
- NVMe drive temperatures
- Chipset temperatures
- System-wide temperature aggregation and alerts
Users need a unified API to monitor all temperature sensors for system health monitoring, alerting, and integration with third-party monitoring solutions like Grafana, Home Assistant, or custom dashboards.
Describe the solution you'd like
A comprehensive temperature monitoring system integrated into the Unraid API that:
- Provides real-time temperature data for all available sensors
- Supports multiple temperature sources:
  - CPU (package temp, per-core temps)
  - Motherboard sensors (chipset, VRM, ambient)
  - GPU temperatures
  - Storage devices (HDDs, SSDs, NVMe)
  - Custom/additional sensors via IPMI or USB sensors
- Features GraphQL queries and subscriptions for:
  - Current temperature readings
  - Historical temperature data (with configurable retention)
  - Temperature alerts and thresholds
  - Real-time temperature updates via subscriptions
- Supports multiple monitoring tools:
  - lm-sensors for motherboard/CPU sensors
  - smartctl for disk temperatures (already partially implemented)
  - nvidia-smi for NVIDIA GPU temperatures
  - ipmitool for IPMI sensors (if available)
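To make the lm-sensors item more concrete, here is a minimal sketch of turning the JSON output of `sensors -j` into flat readings. The `RawReading` shape and the `parseSensorsJson` name are hypothetical and not part of the existing Unraid API. The sketch assumes the usual layout of that output: one object per chip, an `Adapter` string, and one object per feature holding `tempN_input`, `tempN_max`, and `tempN_crit` fields.

```typescript
// Hypothetical shape for one parsed reading (an assumption for illustration).
interface RawReading {
    chip: string; // e.g. "coretemp-isa-0000"
    label: string; // e.g. "Package id 0"
    celsius: number; // value of the tempN_input field
    max?: number; // tempN_max, when the chip reports it
    crit?: number; // tempN_crit, when the chip reports it
}

type SensorsJson = Record<string, Record<string, unknown>>;

// Parse the stdout of `sensors -j` and keep only temperature features.
export function parseSensorsJson(stdout: string): RawReading[] {
    const parsed = JSON.parse(stdout) as SensorsJson;
    const readings: RawReading[] = [];
    for (const [chip, features] of Object.entries(parsed)) {
        for (const [label, feature] of Object.entries(features)) {
            // "Adapter" is a plain string rather than a feature object; skip it.
            if (typeof feature !== 'object' || feature === null) continue;
            const fields = feature as Record<string, unknown>;
            const inputKey = Object.keys(fields).find((key) => /^temp\d+_input$/.test(key));
            if (!inputKey) continue; // fan, voltage, or power feature
            const value = fields[inputKey];
            if (typeof value !== 'number') continue;
            const prefix = inputKey.replace(/_input$/, '');
            const max = fields[`${prefix}_max`];
            const crit = fields[`${prefix}_crit`];
            readings.push({
                chip,
                label,
                celsius: value,
                max: typeof max === 'number' ? max : undefined,
                crit: typeof crit === 'number' ? crit : undefined,
            });
        }
    }
    return readings;
}
```

Keeping the parser as a pure function of the tool's stdout means it can be unit tested with captured output from different boards, without lm-sensors installed on the test machine.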
Describe alternatives you've considered
- External monitoring stacks (Telegraf/InfluxDB/Grafana) - Requires additional containers and configuration complexity
- SNMP monitoring - Limited sensor support and requires additional SNMP configuration
- Dynamix System Temp plugin - WebGUI only, no API access
- Custom scripts - No standardized API, difficult to maintain
Additional context
Binary Management Strategy
IMPORTANT: Temperature monitoring binaries should be downloaded from safe sources during the plugin build process and included in the plugin's TXZ package:
// Enhancement to the plugin build process
// Location: plugin/builder/build-txz.ts
import { createHash } from 'node:crypto';
import { chmod, mkdir, writeFile } from 'node:fs/promises';
import { join } from 'node:path';
// Add function to download monitoring tools during build
const downloadMonitoringTools = async (targetDir: string) => {
    console.log('Downloading temperature monitoring tools from safe sources...');
    // URLs and checksums below are placeholders; pin real artifacts and hashes before use
    const tools = [
        {
            name: 'sensors',
            url: 'https://github.com/lm-sensors/lm-sensors/releases/download/v3.6.0/sensors-3.6.0-x86_64',
            sha256: 'abc123...', // Verify integrity
        },
        {
            name: 'smartctl',
            url: 'https://sourceforge.net/projects/smartmontools/files/smartmontools/7.4/smartctl-7.4-x86_64',
            sha256: 'def456...', // Verify integrity
        },
        {
            name: 'nvidia-smi',
            url: 'https://developer.nvidia.com/downloads/nvidia-smi-545.29.06-x86_64',
            sha256: 'ghi789...', // Verify integrity
        },
    ];
    const monitoringDir = join(targetDir, 'usr/local/emhttp/plugins/unraid-api/monitoring');
    await mkdir(monitoringDir, { recursive: true });
    for (const tool of tools) {
        console.log(`Downloading ${tool.name}...`);
        const response = await fetch(tool.url);
        // Fail early on HTTP errors instead of hashing an error page
        if (!response.ok) {
            throw new Error(`Download failed for ${tool.name}: HTTP ${response.status}`);
        }
        const data = Buffer.from(await response.arrayBuffer());
        // Verify SHA256 checksum
        const digest = createHash('sha256').update(data).digest('hex');
        if (digest !== tool.sha256) {
            throw new Error(`Checksum verification failed for ${tool.name}`);
        }
        // Save binary and mark it executable
        const toolPath = join(monitoringDir, tool.name);
        await writeFile(toolPath, data);
        await chmod(toolPath, 0o755);
        console.log(`✓ ${tool.name} downloaded and verified`);
    }
};
// Call during TXZ build process
await downloadMonitoringTools(sourceDir);
This approach:
- Downloads binaries from trusted, official sources during build time
- Verifies integrity using SHA256 checksums
- Includes binaries in the plugin TXZ package
- Ensures consistent versions across all installations
- Avoids runtime downloads or system package dependencies
- Maintains security by verifying all downloaded binaries
Integration with Existing Metrics Module
Temperature monitoring should be integrated into the existing MetricsResolver rather than creating a separate module:
// Extend existing MetricsResolver
// Location: api/src/unraid-api/graph/resolvers/metrics/
metrics/
├── metrics.module.ts              # Update to include temperature service
├── metrics.resolver.ts            # Extend with temperature fields
├── metrics.model.ts               # Add temperature types
├── temperature/
│   ├── temperature.service.ts     # Core temperature service
│   ├── temperature.model.ts       # Temperature-specific models
│   └── sensors/
│       ├── sensor.interface.ts
│       ├── lm-sensors.service.ts
│       ├── smartctl.service.ts
│       └── gpu.service.ts
└── __tests__/
    └── temperature.service.spec.ts
NestJS Model Extensions (Integrated with Metrics)
// Location: api/src/unraid-api/graph/resolvers/metrics/temperature/temperature.model.ts
import { Field, Float, Int, ObjectType, registerEnumType } from '@nestjs/graphql';
import { Node } from '@unraid/shared/graphql.model.js';
import { IsEnum, IsNumber, IsOptional, IsString } from 'class-validator';
export enum TemperatureUnit {
    CELSIUS = 'CELSIUS',
    FAHRENHEIT = 'FAHRENHEIT',
}
registerEnumType(TemperatureUnit, {
    name: 'TemperatureUnit',
});
export enum TemperatureStatus {
    NORMAL = 'NORMAL',
    WARNING = 'WARNING',
    CRITICAL = 'CRITICAL',
    UNKNOWN = 'UNKNOWN',
}
registerEnumType(TemperatureStatus, {
    name: 'TemperatureStatus',
});
export enum SensorType {
    CPU_PACKAGE = 'CPU_PACKAGE',
    CPU_CORE = 'CPU_CORE',
    MOTHERBOARD = 'MOTHERBOARD',
    CHIPSET = 'CHIPSET',
    GPU = 'GPU',
    DISK = 'DISK',
    NVME = 'NVME',
    AMBIENT = 'AMBIENT',
    VRM = 'VRM',
    CUSTOM = 'CUSTOM',
}
registerEnumType(SensorType, {
    name: 'SensorType',
    description: 'Type of temperature sensor',
});
@ObjectType()
export class Temperature {
    @Field(() => Float, { description: 'Temperature value' })
    @IsNumber()
    value!: number;
    @Field(() => TemperatureUnit, { description: 'Temperature unit' })
    @IsEnum(TemperatureUnit)
    unit!: TemperatureUnit;
    @Field(() => Date, { description: 'Timestamp of reading' })
    timestamp!: Date;
    @Field(() => TemperatureStatus, { description: 'Temperature status' })
    @IsEnum(TemperatureStatus)
    status!: TemperatureStatus;
}
@ObjectType({ implements: () => Node })
export class TemperatureSensor extends Node {
    @Field(() => String, { description: 'Sensor name' })
    @IsString()
    name!: string;
    @Field(() => SensorType, { description: 'Type of sensor' })
    @IsEnum(SensorType)
    type!: SensorType;
    @Field(() => String, { nullable: true, description: 'Physical location' })
    @IsOptional()
    @IsString()
    location?: string;
    @Field(() => Temperature, { description: 'Current temperature' })
    current!: Temperature;
    @Field(() => Temperature, { nullable: true, description: 'Minimum recorded' })
    @IsOptional()
    min?: Temperature;
    @Field(() => Temperature, { nullable: true, description: 'Maximum recorded' })
    @IsOptional()
    max?: Temperature;
    @Field(() => Float, { nullable: true, description: 'Warning threshold' })
    @IsOptional()
    @IsNumber()
    warning?: number;
    @Field(() => Float, { nullable: true, description: 'Critical threshold' })
    @IsOptional()
    @IsNumber()
    critical?: number;
}
@ObjectType()
export class TemperatureSummary {
    @Field(() => Float, { description: 'Average temperature across all sensors' })
    @IsNumber()
    average!: number;
    @Field(() => TemperatureSensor, { description: 'Hottest sensor' })
    hottest!: TemperatureSensor;
    @Field(() => TemperatureSensor, { description: 'Coolest sensor' })
    coolest!: TemperatureSensor;
    @Field(() => Int, { description: 'Count of sensors at warning level' })
    @IsNumber()
    warningCount!: number;
    @Field(() => Int, { description: 'Count of sensors at critical level' })
    @IsNumber()
    criticalCount!: number;
}
@ObjectType({ implements: () => Node })
export class TemperatureMetrics extends Node {
    @Field(() => [TemperatureSensor], { description: 'All temperature sensors' })
    sensors!: TemperatureSensor[];
    @Field(() => TemperatureSummary, { description: 'Temperature summary' })
    summary!: TemperatureSummary;
}
// Extend existing Metrics model
// Location: api/src/unraid-api/graph/resolvers/metrics/metrics.model.ts
import { TemperatureMetrics } from './temperature/temperature.model.js';
@ObjectType({ implements: () => Node })
export class Metrics extends Node {
    // ... existing fields ...
    @Field(() => TemperatureMetrics, {
        nullable: true,
        description: 'Temperature metrics',
    })
    temperature?: TemperatureMetrics;
}
Binary/Tool Setup Requirements
// temperature.service.ts - Use plugin-bundled binaries
import { constants } from 'fs';
import { access } from 'fs/promises';
import { join } from 'path';
import { Injectable, Logger, OnModuleInit } from '@nestjs/common';
import { ConfigService } from '@nestjs/config';
import { execa } from 'execa';
@Injectable()
export class TemperatureService implements OnModuleInit {
    private readonly logger = new Logger(TemperatureService.name);
    private readonly binPath: string;
    private availableTools: Map<string, string> = new Map();
    constructor(private readonly configService: ConfigService) {
        // Use binaries bundled with the plugin
        this.binPath = this.configService.get<string>(
            'API_MONITORING_BIN_PATH',
            '/usr/local/emhttp/plugins/unraid-api/monitoring'
        );
    }
    async onModuleInit() {
        // Use bundled binaries instead of system tools
        await this.initializeBundledTools();
        // Initialize sensor detection for available tools
        if (this.availableTools.has('sensors')) {
            await this.initializeLmSensors();
        }
        if (this.availableTools.has('smartctl')) {
            // Already available through DisksService
        }
        if (this.availableTools.has('nvidia-smi')) {
            await this.initializeNvidiaMonitoring();
        }
    }
    private async initializeBundledTools(): Promise<void> {
        const tools = [
            'sensors', // lm-sensors
            'smartctl', // smartmontools
            'nvidia-smi', // NVIDIA driver
            'ipmitool', // IPMI tools
        ];
        for (const tool of tools) {
            const toolPath = join(this.binPath, tool);
            try {
                // Check that the bundled file exists and is executable. Probing with
                // `--version` is avoided because the flag differs between tools
                // (ipmitool, for example, uses `-V`).
                await access(toolPath, constants.X_OK);
                this.availableTools.set(tool, toolPath);
                this.logger.log(`Temperature tool available: ${tool} at ${toolPath}`);
            } catch {
                this.logger.warn(`Temperature tool not found: ${tool}`);
            }
        }
    }
    // Sensor-specific setup is left to the implementation
    private async initializeLmSensors(): Promise<void> {}
    private async initializeNvidiaMonitoring(): Promise<void> {}
    // Use bundled binary paths for all executions
    private async execTool(toolName: string, args: string[]): Promise<string> {
        const toolPath = this.availableTools.get(toolName);
        if (!toolPath) {
            throw new Error(`Tool ${toolName} not available`);
        }
        const { stdout } = await execa(toolPath, args);
        return stdout;
    }
}
Integration with Existing MetricsResolver
// Extend existing MetricsResolver (don't create separate resolver)
// Location: api/src/unraid-api/graph/resolvers/metrics/metrics.resolver.ts
@Resolver(() => Metrics)
export class MetricsResolver implements OnModuleInit {
    constructor(
        private readonly cpuService: CpuService,
        private readonly memoryService: MemoryService,
        private readonly temperatureService: TemperatureService, // Add temperature service
        private readonly subscriptionTracker: SubscriptionTrackerService,
        private readonly subscriptionHelper: SubscriptionHelperService
    ) {}
    onModuleInit() {
        // Existing CPU and Memory polling...
        // Add temperature polling with 5 second interval
        this.subscriptionTracker.registerTopic(
            PUBSUB_CHANNEL.TEMPERATURE_METRICS,
            async () => {
                const payload = await this.temperatureService.getMetrics();
                pubsub.publish(PUBSUB_CHANNEL.TEMPERATURE_METRICS, {
                    systemMetricsTemperature: payload,
                });
            },
            5000
        );
    }
    // Add temperature field to Metrics type
    @ResolveField(() => TemperatureMetrics, { nullable: true })
    public async temperature(): Promise<TemperatureMetrics> {
        return this.temperatureService.getMetrics();
    }
    // Add temperature subscription following existing pattern
    @Subscription(() => TemperatureMetrics, {
        name: 'systemMetricsTemperature',
        resolve: (value) => value.systemMetricsTemperature,
    })
    @UsePermissions({
        action: AuthActionVerb.READ,
        resource: Resource.INFO,
        possession: AuthPossession.ANY,
    })
    public async systemMetricsTemperatureSubscription() {
        return this.subscriptionHelper.createTrackedSubscription(
            PUBSUB_CHANNEL.TEMPERATURE_METRICS
        );
    }
}
// Update MetricsModule
@Module({
    imports: [ServicesModule],
    providers: [
        MetricsResolver,
        CpuService,
        MemoryService,
        TemperatureService, // Add temperature service
    ],
    exports: [MetricsResolver],
})
export class MetricsModule {}
Configuration Options
// api/dev/configs/api.json additions
{
    "temperature": {
        "enabled": true,
        "polling_interval": 5000,
        "history_retention": 86400,
        "default_unit": "celsius",
        "thresholds": {
            "cpu_warning": 70,
            "cpu_critical": 85,
            "disk_warning": 50,
            "disk_critical": 60,
            "gpu_warning": 80,
            "gpu_critical": 90
        },
        "sensors": {
            "lm_sensors": {
                "enabled": true,
                "config_path": "/etc/sensors3.conf"
            },
            "smartctl": {
                "enabled": true
            },
            "gpu": {
                "enabled": true,
                "nvidia": true,
                "amd": false
            },
            "ipmi": {
                "enabled": false,
                "host": "localhost",
                "username": "",
                "password": ""
            }
        }
    }
}
Environment (if relevant)
Unraid OS Version: 6.12+ (requires lm-sensors support)
Pre-submission Checklist
- I have searched existing issues to ensure this feature hasn't already been requested
- This is not an Unraid Connect related feature
- I have provided clear examples and implementation details for the feature
Bounty Development Guidelines
For developers interested in implementing this feature:
- Download binaries during plugin build - Use build-txz.ts to fetch from safe sources
- Integrate with existing MetricsResolver - Don't create a separate temperature module
- Start with the core TemperatureService that aggregates data from bundled tools
- Implement lm-sensors integration first as it provides the most sensor coverage
- Enhance the existing DisksService temperature monitoring with history tracking
- Add GPU temperature support (NVIDIA first, AMD if feasible)
- Extend MetricsResolver with temperature fields and subscriptions
- Add comprehensive unit tests for all services
- Document the API endpoints and configuration options
- Consider performance - use caching where appropriate to avoid excessive tool invocations
- Follow existing patterns in the codebase (especially systemMetricsCpu/systemMetricsMemory)
- Make the feature modular - gracefully handle missing tools
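As one possible reading of the caching guideline above, the sketch below wraps an expensive read (for example, a tool invocation) in a small time-to-live cache, so that a GraphQL query and the polling loop arriving within the same interval share one invocation. `TtlCache` is a hypothetical helper and not existing Unraid API code. The injectable `now` clock is an assumption added to make the class testable.

```typescript
// Hypothetical helper (an assumption for illustration, not existing Unraid API code).
export class TtlCache<T> {
    private entry?: { value: T; expiresAt: number };
    private inFlight?: Promise<T>;

    constructor(
        private readonly ttlMs: number,
        private readonly load: () => Promise<T>,
        private readonly now: () => number = Date.now
    ) {}

    get(): Promise<T> {
        // Serve the cached value while it is still fresh.
        if (this.entry && this.now() < this.entry.expiresAt) {
            return Promise.resolve(this.entry.value);
        }
        // Share one pending load between concurrent callers.
        if (this.inFlight) {
            return this.inFlight;
        }
        const pending = Promise.resolve()
            .then(() => this.load())
            .then((value) => {
                this.entry = { value, expiresAt: this.now() + this.ttlMs };
                return value;
            });
        this.inFlight = pending;
        // Clear the pending marker on success or failure so the next call can retry.
        const clear = () => {
            if (this.inFlight === pending) this.inFlight = undefined;
        };
        pending.then(clear, clear);
        return pending;
    }
}
```

A service could hold one cache per tool, for instance `new TtlCache(5000, () => readAllSensors())`, where `readAllSensors` stands in for whatever function actually invokes the bundled binary. Failed loads are not cached, so a tool that is temporarily unavailable is retried on the next call.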
Binary Management
The plugin build process (build-txz.ts) should:
- Download monitoring tools from official, trusted sources
- Verify SHA256 checksums for security
- Include binaries in the plugin TXZ at /usr/local/emhttp/plugins/unraid-api/monitoring/
- Set proper executable permissions
- Ensure compatibility across different Unraid versions
Testing Requirements
- Unit tests for all services
- Integration tests for GraphQL resolvers
- Mock data for systems without temperature sensors
- Performance tests to ensure polling doesn't impact system performance
- Test with bundled binaries on different Unraid versions
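The mock-data requirement is easier to meet if status classification and summary aggregation are pure functions over plain readings. The following is a minimal sketch under that assumption. `SensorReading`, `classify`, and `summarize` are hypothetical names that loosely mirror the proposed `TemperatureStatus` and `TemperatureSummary` models. Treating a reading at or above a threshold as exceeding it is also an assumption.

```typescript
type Status = 'NORMAL' | 'WARNING' | 'CRITICAL' | 'UNKNOWN';

// Hypothetical plain-data reading, values in degrees Celsius.
interface SensorReading {
    name: string;
    value: number;
    warning?: number;
    critical?: number;
}

// Classify one reading against its optional thresholds.
export function classify(reading: SensorReading): Status {
    if (!Number.isFinite(reading.value)) return 'UNKNOWN';
    if (reading.critical !== undefined && reading.value >= reading.critical) return 'CRITICAL';
    if (reading.warning !== undefined && reading.value >= reading.warning) return 'WARNING';
    return 'NORMAL';
}

// Aggregate readings; returns null when the system exposes no sensors.
export function summarize(readings: SensorReading[]) {
    if (readings.length === 0) return null;
    let hottest = readings[0];
    let coolest = readings[0];
    let total = 0;
    let warningCount = 0;
    let criticalCount = 0;
    for (const reading of readings) {
        total += reading.value;
        if (reading.value > hottest.value) hottest = reading;
        if (reading.value < coolest.value) coolest = reading;
        const status = classify(reading);
        if (status === 'WARNING') warningCount += 1;
        if (status === 'CRITICAL') criticalCount += 1;
    }
    return { average: total / readings.length, hottest, coolest, warningCount, criticalCount };
}
```

The empty case matters for the schema: `TemperatureSummary.hottest` and `coolest` are declared non-nullable in the proposed model, so a system without sensors would need either a nullable `summary` or a nullable `temperature` field, which the `Metrics` extension already allows.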
Deliverables
- Temperature service implementation within metrics module
- Enhancement to build-txz.ts for downloading monitoring tools
- NestJS models with GraphQL decorators for temperature types
- Unit and integration tests
- Documentation (API docs and configuration guide)
- Example GraphQL queries for temperature data
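For the last deliverable, the example documents might look roughly like the sketch below. The field selections follow the proposed models, and the subscription name `systemMetricsTemperature` comes from the resolver sketch in this issue. The root `metrics` query field is an assumption about how the existing `Metrics` type is exposed, and the constant names are hypothetical.

```typescript
// Example documents as plain strings so they can be passed to any GraphQL client.
// Assumption: the existing Metrics type is reachable through a root `metrics` field.
export const TEMPERATURE_QUERY = `
    query TemperatureOverview {
        metrics {
            temperature {
                summary {
                    average
                    warningCount
                    criticalCount
                }
                sensors {
                    id
                    name
                    type
                    warning
                    critical
                    current {
                        value
                        unit
                        status
                    }
                }
            }
        }
    }
`;

// Subscription name taken from the proposed resolver.
export const TEMPERATURE_SUBSCRIPTION = `
    subscription TemperatureUpdates {
        systemMetricsTemperature {
            summary {
                average
                criticalCount
            }
            sensors {
                name
                current {
                    value
                    status
                }
            }
        }
    }
`;
```

Dashboards that only need alert counts could select `summary` alone, which keeps each subscription payload small at the 5 second polling interval.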