Virtual Machine Monitoring
Monitor virtual machines across Proxmox, KVM, QEMU, Xen, and Hyper-V with comprehensive metrics and insights
Netwarden monitors virtual machines on Proxmox, libvirt-managed hypervisors (KVM/QEMU/Xen), and Hyper-V, giving you visibility into resource usage, performance, and health across your virtualization infrastructure.
Overview
VM monitoring collects detailed metrics about virtual machine resource usage, performance, and health status. The agent automatically detects your hypervisor and begins collecting metrics from the host side, without requiring a guest agent in each VM. A few guest-level details, such as patch status, do depend on a guest agent.
Supported Hypervisors
Proxmox VE
- Versions: 6.0+ supported
- API Integration: REST API v2
- Authentication: User/password or API tokens
- Cluster Support: Yes, multi-node monitoring
KVM/QEMU (via libvirt)
- Detection: libvirt daemon presence
- Connection: Local or remote libvirt
- Metrics: Via libvirt API
- Live Migration: Tracked automatically
Xen
- Versions: Xen 4.0+ supported
- XenServer/XCP-ng: Full support
- Dom0 Metrics: Included
- PV and HVM: Both supported
Hyper-V
- Versions: Windows Server 2016+
- Detection: Hyper-V role enabled
- Integration: WMI/PowerShell
- Cluster Support: Yes
Configuration
Enable VM monitoring in /etc/netwarden/netwarden.conf:
```ini
# Enable VM monitoring
enable_vms = true

# Hypervisor type: auto, proxmox, libvirt, kvm, xen, qemu, hyperv
vm_hypervisor = "auto"

# Stats collection interval
vm_stats_interval = "60s"

# Include/exclude specific VMs (regex patterns)
# vm_include = ["prod-*", "app-*"]
# vm_exclude = ["test-*", "dev-*"]

# Performance settings
vm_parallel_stats = 10   # Query 10 VMs in parallel
vm_cache_timeout = "5m"  # Cache VM list for 5 minutes
```
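The interval and timeout settings above are written as a number followed by a unit suffix (`"60s"`, `"5m"`). As a rough illustration of how such values map to seconds, here is a minimal sketch. The helper name `parse_duration` and the accepted suffixes (`s`, `m`, `h`) are assumptions made for this example, not Netwarden's documented parser.

```python
import re

# Hypothetical helper: the suffix table is an assumption for illustration.
_UNITS = {"s": 1, "m": 60, "h": 3600}


def parse_duration(text: str) -> int:
    """Convert strings such as "60s" or "5m" to a number of seconds."""
    match = re.fullmatch(r"(\d+)([smh])", text.strip())
    if match is None:
        raise ValueError(f"unsupported duration: {text!r}")
    value, unit = match.groups()
    return int(value) * _UNITS[unit]


if __name__ == "__main__":
    for setting in ("60s", "5m", "15m", "120s"):
        print(setting, "->", parse_duration(setting), "seconds")
```

Rejecting unknown formats with an error, rather than guessing, makes a typo in the config file visible at startup.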
Proxmox Configuration
```ini
# Proxmox-specific settings
proxmox_api = "https://proxmox.example.com:8006"
proxmox_username = "monitoring@pve"
proxmox_password = "your_password"

# OR use API token (recommended)
proxmox_token_id = "monitoring@pve!monitor-token"
proxmox_token_secret = "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx"

# Optional settings
proxmox_node = ""                # Empty for all nodes, or specify node name
proxmox_skip_tls_verify = false  # Set true for self-signed certs
```
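With token authentication, the Proxmox API expects the token ID and secret combined into one `Authorization` header of the form `PVEAPIToken=USER@REALM!TOKENID=SECRET`. The sketch below shows how the two token settings could be joined into that header. The function name `proxmox_auth_header` is hypothetical and the secret is a placeholder. How the Netwarden agent builds the header internally is not documented here.

```python
def proxmox_auth_header(token_id: str, token_secret: str) -> dict:
    """Build the Authorization header for a Proxmox API token.

    token_id is the full ID, for example "monitoring@pve!monitor-token".
    """
    user_part = token_id.split("!", 1)[0]
    if "!" not in token_id or "@" not in user_part:
        raise ValueError("token_id must look like user@realm!tokenname")
    return {"Authorization": f"PVEAPIToken={token_id}={token_secret}"}


if __name__ == "__main__":
    headers = proxmox_auth_header(
        "monitoring@pve!monitor-token",
        "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx",  # placeholder secret
    )
    print(headers["Authorization"])
```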
Libvirt Configuration (KVM/QEMU/Xen)
```ini
# Libvirt-specific settings
libvirt_uri = "qemu:///system"  # Local connection

# Remote connection examples:
# libvirt_uri = "qemu+ssh://root@host/system"
# libvirt_uri = "qemu+tcp://host/system"
# libvirt_uri = "xen+ssh://root@xenhost/system"

# Custom socket path (if needed)
# libvirt_socket = "/var/run/libvirt/libvirt-sock"
```
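A libvirt URI packs the driver, an optional transport, and the target host into one string. To make that structure explicit, the following sketch splits the example URIs above into their parts using Python's standard URL parser. The function name and the `"local"` label for URIs without a transport are illustrative choices for this example.

```python
from urllib.parse import urlsplit


def describe_libvirt_uri(uri: str) -> dict:
    """Split a libvirt URI into driver, transport, user, host, and path."""
    parts = urlsplit(uri)
    driver, _, transport = parts.scheme.partition("+")
    return {
        "driver": driver,                   # e.g. "qemu" or "xen"
        "transport": transport or "local",  # "ssh", "tcp", or no transport
        "user": parts.username,             # None when not given
        "host": parts.hostname,             # None for local connections
        "path": parts.path,                 # e.g. "/system"
    }


if __name__ == "__main__":
    for uri in ("qemu:///system", "qemu+ssh://root@host/system"):
        print(uri, "->", describe_libvirt_uri(uri))
```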
Metrics Collected
VM Resource Metrics
- CPU: Usage %, vCPU count, CPU time, ready time
- Memory: Used, available, balloon, swap usage
- Disk I/O: Read/write IOPS, throughput, latency
- Network: Packets, bytes, errors per interface
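Usage percentages are typically derived from cumulative counters rather than read directly. As an illustration, the sketch below computes CPU usage from two samples of a VM's cumulative CPU time (libvirt reports this counter in nanoseconds), normalized by vCPU count. The function name and sample values are illustrative, and the formula Netwarden actually uses may differ.

```python
def cpu_usage_percent(cpu_time_prev_ns, cpu_time_now_ns, interval_s, vcpus):
    """CPU usage over an interval, as a percentage of all vCPUs."""
    if interval_s <= 0 or vcpus <= 0:
        raise ValueError("interval_s and vcpus must be positive")
    used_ns = cpu_time_now_ns - cpu_time_prev_ns
    available_ns = interval_s * 1e9 * vcpus
    # Clamp to 0..100 because counters can reset when a VM restarts.
    return max(0.0, min(100.0, 100.0 * used_ns / available_ns))


if __name__ == "__main__":
    # A 2-vCPU VM that consumed 30 s of CPU time during a 60 s interval.
    print(cpu_usage_percent(0, 30_000_000_000, 60, 2))
```

Dividing by the vCPU count keeps the result on a 0 to 100 scale regardless of VM size, which makes a single threshold usable across VMs.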
VM State Information
- Power State: Running, stopped, paused, suspended
- Uptime: VM running duration
- Guest OS: Detected operating system
- Configuration: vCPUs, RAM, disk sizes
Hypervisor Metrics
- Host CPU: Overall usage, per-VM allocation
- Host Memory: Total, used, available, overcommit
- Storage: Datastore usage, thin provisioning
- Network: Virtual switch statistics
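The overcommit figure listed under host memory is a ratio of what has been allocated to VMs against what the host physically has. A minimal sketch of that arithmetic, with made-up VM sizes and a hypothetical function name:

```python
def overcommit_ratio(allocated, physical):
    """Ratio of resources promised to VMs versus what the host has."""
    if physical <= 0:
        raise ValueError("physical capacity must be positive")
    return sum(allocated) / physical


if __name__ == "__main__":
    # Illustrative numbers: memory in GiB for three VMs on a 64 GiB host.
    print("memory:", overcommit_ratio([8, 16, 32], 64))
    # vCPUs for four VMs on a 16-core host.
    print("cpu:", overcommit_ratio([4, 4, 8, 2], 16))
```

A ratio above 1.0 means the VMs have been promised more than the host can supply at once.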
Proxmox Monitoring Setup
1. Create Monitoring User
```bash
# On Proxmox host
pveum user add monitoring@pve
pveum passwd monitoring@pve
pveum aclmod / -user monitoring@pve -role PVEAuditor
```
2. Create API Token (Recommended)
```bash
# Create API token for monitoring
pveum user token add monitoring@pve monitor-token --privsep=0

# Output will show:
# Token: monitoring@pve!monitor-token
# Secret: xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx
```
3. Configure Agent
```ini
# /etc/netwarden/netwarden.conf
enable_vms = true
vm_hypervisor = "proxmox"
proxmox_api = "https://192.168.1.100:8006"
proxmox_token_id = "monitoring@pve!monitor-token"
proxmox_token_secret = "your-token-secret-here"
```
KVM/QEMU Setup via Libvirt
1. Install Required Packages
```bash
# Debian/Ubuntu
sudo apt-get install libvirt-clients

# RHEL/CentOS
sudo yum install libvirt-client
```
2. Grant Access to Libvirt
```bash
# Add netwarden user to libvirt group
sudo usermod -aG libvirt netwarden

# Or use polkit rule for more control
# (sudo tee is used because the rules directory is writable only by root)
sudo tee /etc/polkit-1/rules.d/50-netwarden.rules > /dev/null << 'EOF'
polkit.addRule(function(action, subject) {
    if (action.id == "org.libvirt.unix.monitor" &&
        subject.user == "netwarden") {
        return polkit.Result.YES;
    }
});
EOF
```
3. Configure Agent
```ini
# /etc/netwarden/netwarden.conf
enable_vms = true
vm_hypervisor = "libvirt"
libvirt_uri = "qemu:///system"
```
VM Dashboards
Infrastructure Overview
- Total VMs by state (running, stopped, paused)
- Resource allocation vs capacity
- Top VMs by CPU/memory usage
- VM distribution across hosts
- Storage utilization by VM
Individual VM Details
- Real-time CPU and memory graphs
- Disk I/O performance metrics
- Network traffic per interface
- Console screenshot (if available)
- Configuration changes history
- Snapshot information
Alerting on VM Metrics
Common VM Alerts
```ini
# High CPU Usage
Alert: VM CPU Critical
Metric: vm.cpu.usage.percent
VM: prod-*
Threshold: > 90%
Duration: 5 minutes

# Memory Pressure
Alert: VM Memory High
Metric: vm.memory.usage.percent
VM: *
Threshold: > 85%
Duration: 10 minutes

# Disk I/O Latency
Alert: VM Disk Latency High
Metric: vm.disk.latency.ms
VM: database-*
Threshold: > 20
Duration: 5 minutes

# VM State Change
Alert: VM Unexpected Shutdown
Metric: vm.power.state
VM: prod-*
Value: != running
Duration: 1 minute

# Snapshot Age
Alert: Old VM Snapshot
Metric: vm.snapshot.age.days
VM: *
Threshold: > 7
Duration: 1 hour
```
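Each alert above pairs a threshold with a duration. One plausible reading is that the condition must hold for every sample across the whole duration before the alert fires, which filters out short spikes. The sketch below implements that reading. It is an assumption about the semantics, not a description of Netwarden's alert engine, and the function name and sample layout are hypothetical.

```python
def alert_fires(samples, threshold, duration_s):
    """Return True if every sample in the trailing window exceeds threshold.

    samples: list of (timestamp_s, value) tuples, oldest first.
    """
    if not samples:
        return False
    window_start = samples[-1][0] - duration_s
    # The history must reach back at least as far as the window start.
    if samples[0][0] > window_start:
        return False
    window = [value for ts, value in samples if ts >= window_start]
    return all(value > threshold for value in window)


if __name__ == "__main__":
    # One sample per minute for five minutes, all above 90 percent.
    sustained = [(t, 95.0) for t in range(0, 301, 60)]
    print(alert_fires(sustained, threshold=90, duration_s=300))
```

Requiring enough history before firing avoids alerting right after the agent starts, when only one or two samples exist.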
Multi-Hypervisor Environments
In environments with multiple hypervisors, run an agent on each host and configure it for that host's hypervisor:
```ini
# Agent on Proxmox host
enable_vms = true
vm_hypervisor = "proxmox"
proxmox_api = "https://localhost:8006"

# Agent on KVM host
enable_vms = true
vm_hypervisor = "libvirt"
libvirt_uri = "qemu:///system"

# Central dashboard shows all VMs across both
```
Performance Optimization
Reduce API Load
```ini
# Increase cache timeout for stable environments
vm_cache_timeout = "15m"

# Reduce parallel queries for busy hosts
vm_parallel_stats = 5

# Increase collection interval
vm_stats_interval = "120s"
```
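To show why these two settings reduce API load, here is a minimal sketch of the general techniques they suggest: a time-based cache for the VM list and a bounded worker pool for per-VM queries. The class and function names are hypothetical, and the agent's real implementation is not documented here.

```python
import time
from concurrent.futures import ThreadPoolExecutor


class VMListCache:
    """Cache the VM list and refetch it only after `ttl_s` seconds."""

    def __init__(self, fetch_vm_list, ttl_s, clock=time.monotonic):
        self._fetch = fetch_vm_list
        self._ttl_s = ttl_s
        self._clock = clock
        self._vms = None
        self._fetched_at = None

    def get(self):
        now = self._clock()
        if self._fetched_at is None or now - self._fetched_at >= self._ttl_s:
            self._vms = list(self._fetch())
            self._fetched_at = now
        return self._vms


def collect_stats(vm_names, fetch_stats, parallel):
    """Query per-VM stats with at most `parallel` requests in flight."""
    with ThreadPoolExecutor(max_workers=parallel) as pool:
        return dict(zip(vm_names, pool.map(fetch_stats, vm_names)))


if __name__ == "__main__":
    cache = VMListCache(lambda: ["prod-web", "prod-db"], ttl_s=900)
    print(collect_stats(cache.get(), lambda name: {"vm": name}, parallel=5))
```

A longer cache lifetime trades freshness of the VM list for fewer list calls, and a smaller pool trades collection speed for lower peak load on the hypervisor API.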
Exclude Non-Critical VMs
```ini
# Only monitor production VMs
vm_include = ["prod-*", "staging-*"]

# Exclude templates and test VMs
vm_exclude = ["template-*", "test-*", "tmp-*"]
```
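The main configuration comment describes these patterns as regex, while examples such as `prod-*` read like shell-style globs. The sketch below assumes glob matching, lets an exclude match win over an include match, and treats an empty include list as "all VMs". All three behaviors are assumptions made for illustration, so check them against your agent before relying on them.

```python
from fnmatch import fnmatchcase


def is_monitored(vm_name, include=(), exclude=()):
    """Decide whether a VM name passes the include/exclude filters."""
    # Assumption: an exclude match always wins.
    if any(fnmatchcase(vm_name, pattern) for pattern in exclude):
        return False
    # Assumption: an empty include list means every VM is included.
    if not include:
        return True
    return any(fnmatchcase(vm_name, pattern) for pattern in include)


if __name__ == "__main__":
    include = ["prod-*", "staging-*"]
    exclude = ["template-*", "test-*", "tmp-*"]
    for name in ("prod-web01", "test-db", "dev-box"):
        print(name, is_monitored(name, include, exclude))
```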
Troubleshooting
No VMs Detected
- Verify hypervisor is running:
```bash
# Proxmox
systemctl status pveproxy

# KVM/libvirt
systemctl status libvirtd

# Xen
xl list
```
- Test API connectivity:
```bash
# Proxmox
curl -k https://proxmox.local:8006/api2/json/version

# Libvirt
virsh -c qemu:///system list --all
```
- Check agent logs:
```bash
sudo journalctl -u netwarden -n 50 | grep -E "vm|hypervisor"
```
Authentication Failures
For Proxmox:
```bash
# Test authentication
curl -k -d "username=monitoring@pve&password=yourpass" \
  https://proxmox:8006/api2/json/access/ticket

# Verify token (single quotes stop an interactive shell from expanding "!")
curl -k -H 'Authorization: PVEAPIToken=monitoring@pve!monitor-token=your-token-secret-here' \
  https://proxmox:8006/api2/json/version
```
For libvirt:
```bash
# Test connection
virsh -c qemu:///system version

# Check permissions
groups netwarden  # Should include 'libvirt'
```
Missing Metrics
- Verify guest tools/agents:
```bash
# Proxmox - QEMU Guest Agent
qm agent <vmid> ping

# VMware - VMware Tools
vim-cmd vmsvc/tools.status <vmid>
```
- Check VM state:
```bash
# VM must be running for most metrics
virsh list --all
qm list
```
Best Practices
- Use API Tokens: More secure than passwords for Proxmox
- Read-Only Access: Agent only needs read permissions
- Monitor Host Resources: Track host capacity alongside VMs
- Set Resource Limits: Define CPU/memory limits on VMs
- Regular Snapshots: Monitor snapshot age and size
- Network Segmentation: Use dedicated monitoring VLAN if possible
Security Considerations
Proxmox Security
- Use API tokens instead of passwords
- Create dedicated monitoring user with minimal permissions
- Enable TLS certificate verification in production
- Restrict API access by IP if possible
Libvirt Security
- Use polkit rules for fine-grained access control
- Avoid root access for libvirt connections
- Use SSH keys for remote libvirt connections
- Enable SASL authentication for TCP connections
Integration with Cloud Platforms
Azure VM Monitoring
For Azure VMs, Netwarden can collect metrics via:
- Azure Monitor API integration
- Guest agent installation
- Resource Graph queries
AWS EC2 Monitoring
For AWS EC2 instances:
- CloudWatch metrics integration
- Direct agent installation
- VPC endpoint connection
Google Cloud VMs
For GCP Compute Engine:
- Cloud Monitoring API
- Ops Agent compatibility
- Service account authentication
Advanced Features
Live Migration Tracking
Automatically follows VMs during live migration:
```ini
# Metrics continue during migration
vm.migration.status = "active"
vm.migration.progress = 45%
vm.migration.source = "host1"
vm.migration.destination = "host2"
```
Capacity Planning
Track resource trends for capacity planning:
- VM growth rate
- Resource utilization trends
- Overcommit ratios
- Storage growth projection
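A storage growth projection can be as simple as fitting a straight line through recent usage samples and extrapolating to the capacity limit. The following sketch does this with an ordinary least-squares slope over evenly spaced daily samples. The function name and numbers are illustrative, and a linear trend is an assumption that will not suit every workload.

```python
def days_until_full(usage_gb, capacity_gb):
    """Estimate days until capacity is reached from daily usage samples.

    Fits a least-squares line through (day_index, usage) points and
    returns None when usage is flat or shrinking.
    """
    n = len(usage_gb)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_x = (n - 1) / 2
    mean_y = sum(usage_gb) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(usage_gb))
    var = sum((x - mean_x) ** 2 for x in range(n))
    slope = cov / var  # growth in GB per day
    if slope <= 0:
        return None
    return max(0.0, (capacity_gb - usage_gb[-1]) / slope)


if __name__ == "__main__":
    # Usage grew 10 GB per day over four days on a 200 GB datastore.
    print(days_until_full([100, 110, 120, 130], 200))
```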
Compliance Monitoring
Ensure VMs meet compliance requirements:
- Resource allocation compliance
- Backup verification
- Patch status (with guest agent)
- Configuration drift detection
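Configuration drift detection amounts to comparing a VM's current settings against a recorded baseline. As a minimal sketch of that comparison, the function below reports every key whose value differs. The key names in the example (`vcpus`, `memory_mb`, `disk_gb`) and the function name are hypothetical.

```python
def config_drift(baseline, current):
    """Return {key: (baseline_value, current_value)} for keys that differ.

    A key present on only one side is reported with None for the other.
    """
    keys = sorted(baseline.keys() | current.keys())
    return {
        key: (baseline.get(key), current.get(key))
        for key in keys
        if baseline.get(key) != current.get(key)
    }


if __name__ == "__main__":
    baseline = {"vcpus": 4, "memory_mb": 8192, "disk_gb": 100}
    current = {"vcpus": 8, "memory_mb": 8192, "disk_gb": 100}
    print(config_drift(baseline, current))
```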