Introduction: Network Observability
You can't manage what you can't measure. Network management protocols provide visibility into device health, traffic patterns, and system events—essential for troubleshooting and capacity planning.
Series Context: This is Part 16 of 20 in the Complete Protocols Master series. These protocols operate at the Application Layer but monitor all layers of the network stack.
Concepts
Three Pillars of Network Monitoring
Network Observability Pillars:
1. METRICS (SNMP)
• Device health (CPU, memory, uptime)
• Interface statistics (bytes, errors)
• Configuration state
2. FLOWS (NetFlow/IPFIX)
• Traffic patterns (who talks to whom)
• Bandwidth usage per application
• Security anomaly detection
3. LOGS (Syslog)
• Events and alerts
• Configuration changes
• Security incidents
Together they answer:
• SNMP: "Is the device healthy?"
• NetFlow: "What traffic is flowing?"
• Syslog: "What events occurred?"
SNMP: Simple Network Management Protocol
SNMP is the universal language for device monitoring. Query router interfaces, check printer ink, monitor server CPU—if it's networked, SNMP can probably monitor it.
Components
SNMP Architecture
SNMP Components:
1. MANAGER (NMS)
• Polls agents for data
• Receives traps/informs
• Examples: Nagios, Zabbix, PRTG
2. AGENT
• Runs on managed device
• Responds to queries
• Sends traps on events
3. MIB (Management Information Base)
• Database of manageable objects
• Hierarchical tree structure
• Standard + vendor-specific
4. OID (Object Identifier)
• Unique address for each object
• Numeric: 1.3.6.1.2.1.1.1.0
• Named: iso.org.dod.internet.mgmt.mib-2.system.sysDescr.0
Operations:
• GET: Read single value
• GETNEXT: Walk through tree
• GETBULK: Efficient bulk read (v2c+)
• SET: Write value (configuration)
• TRAP: Unsolicited alert
• INFORM: Acknowledged trap (v2c+)
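To make the OID tree a little more concrete, here is a minimal sketch that treats an OID as a tuple of integers. The helper names `parse_oid` and `in_subtree` are made up for this illustration and are not part of any SNMP library. Comparing tuples rather than raw strings matters: as a string, `1.3.6.1.2.1.10` looks like it starts with `1.3.6.1.2.1.1`, but it belongs to a different branch.

```python
def parse_oid(text):
    """Turn a dotted OID string such as '1.3.6.1.2.1.1.1.0' into a tuple of ints."""
    return tuple(int(part) for part in text.strip(".").split("."))


def in_subtree(oid, root):
    """Return True when `oid` sits at or below `root` in the OID tree."""
    oid_parts, root_parts = parse_oid(oid), parse_oid(root)
    return oid_parts[:len(root_parts)] == root_parts


SYSTEM_GROUP = "1.3.6.1.2.1.1"    # mib-2.system
SYS_DESCR = "1.3.6.1.2.1.1.1.0"   # sysDescr.0

print(in_subtree(SYS_DESCR, SYSTEM_GROUP))            # True
print(in_subtree("1.3.6.1.2.1.10.1", SYSTEM_GROUP))   # False: branch 10, not branch 1

# Sorting by the parsed tuple gives the lexicographic order a GETNEXT walk follows.
print(sorted(["1.3.6.1.2.1.10", "1.3.6.1.2.1.2"], key=parse_oid))
```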
Versions
SNMP Versions
| Version | Security | Features | Use Case |
| v1 | Community string (plaintext) | Basic | Legacy only |
| v2c | Community string (plaintext) | GETBULK, INFORM | Most common |
| v3 | Auth + Encryption | User-based security | Secure deployments |
# SNMP queries with snmpwalk and snmpget
# Get system description
snmpget -v2c -c public 192.168.1.1 1.3.6.1.2.1.1.1.0
# Response: STRING: "Cisco IOS Software, C2900 Software..."
# Get system uptime
snmpget -v2c -c public 192.168.1.1 sysUpTime.0
# Walk entire system tree
snmpwalk -v2c -c public 192.168.1.1 system
# Walk interface table
snmpwalk -v2c -c public 192.168.1.1 ifTable
# Get interface stats (index 1)
snmpget -v2c -c public 192.168.1.1 \
ifInOctets.1 ifOutOctets.1 ifInErrors.1
# SNMPv3 with authentication
snmpget -v3 -l authPriv \
-u admin \
-a SHA -A "authpassword" \
-x AES -X "privpassword" \
192.168.1.1 sysDescr.0
# SNMP monitoring with Python (pysnmp)
# Note: this assumes the synchronous hlapi from pysnmp 4.x;
# newer releases expose an asyncio-based API instead.
from pysnmp.hlapi import *

def snmp_get(host, community, oid):
    """Get single SNMP value"""
    iterator = getCmd(
        SnmpEngine(),
        CommunityData(community, mpModel=1),  # mpModel=1 selects v2c
        UdpTransportTarget((host, 161)),
        ContextData(),
        ObjectType(ObjectIdentity(oid))
    )
    errorIndication, errorStatus, errorIndex, varBinds = next(iterator)
    if errorIndication:
        # Transport-level problem (timeout, unreachable host, ...)
        print(f"Error: {errorIndication}")
        return None
    elif errorStatus:
        # The agent answered with an SNMP error (e.g. noSuchName)
        print(f"Error: {errorStatus.prettyPrint()}")
        return None
    else:
        for varBind in varBinds:
            return varBind[1].prettyPrint()

def snmp_walk(host, community, oid):
    """Walk SNMP tree"""
    results = []
    for (errorIndication, errorStatus, errorIndex, varBinds) in nextCmd(
        SnmpEngine(),
        CommunityData(community, mpModel=1),
        UdpTransportTarget((host, 161)),
        ContextData(),
        ObjectType(ObjectIdentity(oid)),
        lexicographicMode=False  # stop at the end of the requested subtree
    ):
        if errorIndication:
            print(f"Error: {errorIndication}")
            break
        elif errorStatus:
            print(f"Error: {errorStatus.prettyPrint()}")
            break
        else:
            for varBind in varBinds:
                results.append((varBind[0].prettyPrint(),
                                varBind[1].prettyPrint()))
    return results

# Example usage
print("System Description:", snmp_get('192.168.1.1', 'public',
                                      '1.3.6.1.2.1.1.1.0'))
print("\nInterfaces:")
for oid, val in snmp_walk('192.168.1.1', 'public', '1.3.6.1.2.1.2.2.1.2'):
    print(f"  {oid}: {val}")
Common OIDs
Essential SNMP OIDs
Common SNMP OIDs (MIB-2):
System Group (1.3.6.1.2.1.1):
• sysDescr.0 - System description
• sysUpTime.0 - Uptime (timeticks)
• sysContact.0 - Admin contact
• sysName.0 - Hostname
• sysLocation.0 - Physical location
Interface Group (1.3.6.1.2.1.2):
• ifNumber.0 - Number of interfaces
• ifDescr.X - Interface name
• ifSpeed.X - Bandwidth (bps)
• ifAdminStatus.X - Admin up/down
• ifOperStatus.X - Operational status
• ifInOctets.X - Bytes received
• ifOutOctets.X - Bytes sent
• ifInErrors.X - Input errors
• ifOutErrors.X - Output errors
CPU/Memory (outside the System and Interface groups):
• Cisco (CISCO-PROCESS-MIB, vendor-specific): 1.3.6.1.4.1.9.9.109
• Linux (HOST-RESOURCES-MIB, standard): 1.3.6.1.2.1.25
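The raw values behind these OIDs usually need a little arithmetic before they are useful. The sketch below shows one way to turn two ifInOctets samples into a utilization percentage and to convert sysUpTime timeticks into seconds. The function names and sample numbers are invented for illustration; the sketch assumes a 32-bit counter that wrapped at most once between polls.

```python
def counter_delta(previous, current, bits=32):
    """Difference between two counter samples, allowing for one wraparound."""
    modulus = 1 << bits
    return (current - previous) % modulus


def utilization_percent(prev_octets, curr_octets, interval_s, if_speed_bps):
    """Average utilization over the polling interval, as a percentage of ifSpeed."""
    bits_transferred = counter_delta(prev_octets, curr_octets) * 8
    return 100.0 * bits_transferred / (interval_s * if_speed_bps)


def timeticks_to_seconds(ticks):
    """sysUpTime is reported in hundredths of a second."""
    return ticks / 100


# Two polls of ifInOctets.1 taken 300 seconds apart on a 100 Mbps interface
print(utilization_percent(1_000_000, 38_500_000, 300, 100_000_000))  # 1.0

# A counter that wrapped past 2**32 between polls still yields a small delta
print(counter_delta(4_294_967_000, 704))  # 1000

print(timeticks_to_seconds(8_640_000))  # 86400.0, i.e. one day of uptime
```

On fast links a 32-bit octet counter can wrap more than once per polling interval, in which case the delta is silently wrong; shorter polling intervals or 64-bit counters avoid that.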
NetFlow: Traffic Analysis
NetFlow captures metadata about network conversations—who's talking to whom, on what ports, how much data. Essential for capacity planning and security monitoring.
Key Insight: NetFlow doesn't capture packet contents—just flow metadata. Like phone records vs call recordings.
Flow Data
What's in a Flow?
NetFlow v5 Flow Key (7-Tuple):
1. Source IP
2. Destination IP
3. Source Port
4. Destination Port
5. Protocol (TCP/UDP/ICMP)
6. Type of Service (ToS)
7. Input Interface
Plus:
• Packet count
• Byte count
• Start/end timestamps
• TCP flags
• Next-hop IP
NetFlow v9 / IPFIX:
• Template-based (flexible fields)
• IPv6 support
• Variable length fields
• Vendor extensions
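The idea of a flow can be sketched in a few lines: every packet that shares the same seven key fields is folded into one record that carries counters and timestamps. This is only a conceptual model with made-up names (`FlowKey`, `aggregate`) and sample packets; a real exporter also expires flows on timeouts and TCP flags, which is left out here.

```python
from collections import defaultdict
from typing import NamedTuple


class FlowKey(NamedTuple):
    src_ip: str
    dst_ip: str
    src_port: int
    dst_port: int
    protocol: int   # IP protocol number: 6 = TCP, 17 = UDP
    tos: int
    input_if: int


def aggregate(packets):
    """Group (key, size_bytes, timestamp) observations into per-flow records."""
    flows = defaultdict(lambda: {"packets": 0, "bytes": 0, "first": None, "last": None})
    for key, size, ts in packets:
        rec = flows[key]
        rec["packets"] += 1
        rec["bytes"] += size
        rec["first"] = ts if rec["first"] is None else min(rec["first"], ts)
        rec["last"] = ts if rec["last"] is None else max(rec["last"], ts)
    return dict(flows)


https = FlowKey("10.0.0.5", "10.0.0.100", 51000, 443, 6, 0, 1)
dns = FlowKey("10.0.0.9", "10.0.0.53", 40000, 53, 17, 0, 1)

observed = [(https, 1500, 10.0), (dns, 80, 10.2), (https, 400, 12.5)]
for key, rec in aggregate(observed).items():
    print(key.src_ip, "->", key.dst_ip, key.dst_port, rec)
```

Three packets collapse into two flow records, which is why flow export is so much cheaper than full packet capture.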
# NetFlow architecture
[Router]
|
| NetFlow export (UDP 2055)
↓
[Collector]
|
| Store flows
↓
[Analyzer]
|
| Reports, dashboards
↓
[Admin UI]
Popular Collectors:
• ntopng (open source)
• Elasticsearch + Logstash
• SolarWinds NTA
• PRTG
# Enable NetFlow on Cisco
interface GigabitEthernet0/0
 ip flow ingress
 ip flow egress
ip flow-export version 9
ip flow-export destination 10.0.0.100 2055
ip flow-export source Loopback0
# Parse NetFlow with Python
def parse_netflow_concepts():
    """NetFlow analysis concepts"""
    print("NetFlow Analysis Use Cases")
    print("=" * 50)
    print("""
    # Using ntopng API or custom collector

    Common Analysis Queries:

    1. TOP TALKERS (bandwidth hogs)
       SELECT src_ip, SUM(bytes) as total
       FROM flows
       GROUP BY src_ip
       ORDER BY total DESC
       LIMIT 10

    2. APPLICATION BREAKDOWN
       SELECT dst_port, SUM(bytes) as total
       FROM flows
       GROUP BY dst_port
       ORDER BY total DESC

       Port 443: HTTPS
       Port 80: HTTP
       Port 22: SSH

    3. SECURITY: Unusual traffic
       SELECT src_ip, dst_ip, dst_port
       FROM flows
       WHERE dst_port NOT IN (80, 443, 22, 53)
       AND bytes > 1000000

    4. BANDWIDTH OVER TIME
       SELECT
         DATE_TRUNC('hour', timestamp) as hour,
         SUM(bytes) as total_bytes
       FROM flows
       GROUP BY hour
       ORDER BY hour
    """)
    print("\nNetFlow Sampling:")
    print("• 1:100 sampling = analyze 1% of packets")
    print("• Reduces collector load")
    print("• Statistically accurate for top-N analysis")

parse_netflow_concepts()
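The top-talkers query can also be expressed directly in Python when flows are already in memory. The sketch below assumes each flow is a plain dict with `src_ip` and `bytes` keys (a made-up shape, not the output format of any particular collector) and scales byte counts by the sampling rate to estimate real volume.

```python
from collections import Counter


def top_talkers(flows, n=10, sampling_rate=1):
    """Return the n source IPs with the most bytes, scaled up for 1:N sampling."""
    totals = Counter()
    for flow in flows:
        totals[flow["src_ip"]] += flow["bytes"] * sampling_rate
    return totals.most_common(n)


sampled_flows = [
    {"src_ip": "10.0.0.5", "bytes": 1200},
    {"src_ip": "10.0.0.9", "bytes": 300},
    {"src_ip": "10.0.0.5", "bytes": 800},
]

# With 1:100 sampling, each sampled byte stands in for roughly 100 real bytes
print(top_talkers(sampled_flows, n=2, sampling_rate=100))
# [('10.0.0.5', 200000), ('10.0.0.9', 30000)]
```

The scaled figures are estimates: they hold up well for heavy hitters but say little about short flows that sampling may have missed entirely.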
Syslog: Centralized Logging
Syslog is the standard for forwarding log messages. Every Unix system, router, and firewall speaks syslog—send logs to a central server for analysis and retention.
Format
Syslog Message Structure
Syslog Components:
FACILITY (source type):
0 kernel    4 auth      8 uucp        12 ntp
1 user      5 syslog    9 cron        13 audit
2 mail      6 lpr       10 authpriv   14 alert
3 daemon    7 news      11 ftp        15 clock
16-23 local0-local7
SEVERITY (0-7, lower=worse):
0 Emergency System unusable
1 Alert Immediate action needed
2 Critical Critical conditions
3 Error Error conditions
4 Warning Warning conditions
5 Notice Normal but significant
6 Info Informational
7 Debug Debug messages
MESSAGE FORMAT (RFC 5424):
<PRI>VERSION TIMESTAMP HOSTNAME APP-NAME PROCID MSGID STRUCTURED-DATA MSG
Example:
<34>1 2026-01-31T10:30:00Z router1 sshd 12345 - - Failed login from 10.0.0.5
PRI = Facility * 8 + Severity
<34> = 4*8 + 2 = auth.critical
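The PRI arithmetic and the header layout can be checked with a short parser. This is a simplified sketch: the regular expression and the names `encode_pri` and `parse_rfc5424` are illustrative, the RFC 5424 header includes a STRUCTURED-DATA field before the message text, and escaped `]` characters inside structured data are not handled.

```python
import re

SEVERITIES = ["emerg", "alert", "crit", "err", "warning", "notice", "info", "debug"]

# <PRI>VERSION TIMESTAMP HOSTNAME APP-NAME PROCID MSGID STRUCTURED-DATA [MSG]
RFC5424_LINE = re.compile(
    r"^<(?P<pri>\d{1,3})>(?P<version>\d{1,2}) "
    r"(?P<timestamp>\S+) (?P<hostname>\S+) (?P<app>\S+) "
    r"(?P<procid>\S+) (?P<msgid>\S+) "
    r"(?P<sd>-|(?:\[.*?\])+)"
    r"(?: (?P<msg>.*))?$"
)


def encode_pri(facility, severity):
    """PRI = facility * 8 + severity."""
    return facility * 8 + severity


def parse_rfc5424(line):
    """Split an RFC 5424 line into named fields, or return None if it does not match."""
    match = RFC5424_LINE.match(line)
    if match is None:
        return None
    fields = match.groupdict()
    facility, severity = divmod(int(fields.pop("pri")), 8)
    fields["facility"] = facility
    fields["severity"] = severity
    fields["severity_name"] = SEVERITIES[severity]
    return fields


example = "<34>1 2026-01-31T10:30:00Z router1 sshd 12345 - - Failed login from 10.0.0.5"
parsed = parse_rfc5424(example)
print(parsed["facility"], parsed["severity_name"], parsed["msg"])
# 4 crit Failed login from 10.0.0.5

print(encode_pri(23, 6))  # 190, i.e. local7.info
```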
# Syslog configuration
# Linux rsyslog.conf - send to remote server
*.* @syslog.example.com:514 # UDP
*.* @@syslog.example.com:514 # TCP
# Filter by facility/severity
auth.* @syslog.example.com:514
*.err @syslog.example.com:514
# Cisco IOS
logging host 10.0.0.100
logging trap informational
logging facility local7
# Test syslog
logger -p local0.info "Test message from $(hostname)"
# View logs
tail -f /var/log/syslog
journalctl -f
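For a rough Python equivalent of the `logger` test, the sketch below builds an RFC 5424 style line and sends it in a single UDP datagram. The function names, the `demo` app name, and the default collector address are placeholders chosen for this example; PROCID, MSGID, and structured data are all left as `-`.

```python
import socket
from datetime import datetime, timezone


def build_message(facility, severity, hostname, app, text, timestamp=None):
    """Format a minimal RFC 5424 line with empty PROCID, MSGID and structured data."""
    pri = facility * 8 + severity
    if timestamp is None:
        timestamp = datetime.now(timezone.utc).isoformat()
    return f"<{pri}>1 {timestamp} {hostname} {app} - - - {text}"


def send_syslog(message, host="127.0.0.1", port=514):
    """Send one message as a single UDP datagram (fire and forget, no acknowledgement)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(message.encode("utf-8"), (host, port))


# Facility 16 = local0, severity 6 = info, matching `logger -p local0.info`
line = build_message(16, 6, socket.gethostname(), "demo", "Test message")
print(line)
# send_syslog(line, host="syslog.example.com")  # uncomment to actually send
```

Because UDP gives no delivery guarantee, a missing message on the collector does not by itself prove the sender is misconfigured.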
# Syslog server with Python
import socketserver
import re
from datetime import datetime

# Facility codes 0-15; codes 16-23 map to local0-local7
FACILITIES = ['kern', 'user', 'mail', 'daemon', 'auth', 'syslog', 'lpr', 'news',
              'uucp', 'cron', 'authpriv', 'ftp', 'ntp', 'audit', 'alert', 'clock']
SEVERITIES = ['emerg', 'alert', 'crit', 'err',
              'warning', 'notice', 'info', 'debug']

class SyslogHandler(socketserver.BaseRequestHandler):
    """Simple syslog receiver"""

    def handle(self):
        # For a UDP server, self.request is a (data, socket) pair
        data = self.request[0].strip().decode('utf-8', errors='replace')
        # Parse syslog message: <PRI> followed by the rest of the line
        match = re.match(r'<(\d+)>(.+)', data)
        if not match:
            return
        pri = int(match.group(1))
        message = match.group(2)
        # PRI = facility * 8 + severity
        facility, severity = divmod(pri, 8)
        if facility < len(FACILITIES):
            fac_name = FACILITIES[facility]
        elif facility <= 23:
            fac_name = f'local{facility - 16}'
        else:
            fac_name = f'facility{facility}'  # outside the defined range
        sev_name = SEVERITIES[severity]
        print(f"[{datetime.now()}] {self.client_address[0]} "
              f"{fac_name}.{sev_name}: {message}")

def start_syslog_server(port=514):
    """Start UDP syslog server (binding to port 514 usually requires root)"""
    server = socketserver.UDPServer(('0.0.0.0', port), SyslogHandler)
    print(f"Syslog server listening on UDP {port}")
    server.serve_forever()

# Production: Use rsyslog, syslog-ng, or ELK stack
# start_syslog_server()
print("""
Production Syslog Stack:
[Devices] → [rsyslog/syslog-ng] → [Kafka] → [Logstash] → [Elasticsearch]
                                                               ↓
                                                           [Kibana]
""")
NMS Integration
Network Management Systems (NMS) combine SNMP, NetFlow, and syslog into unified dashboards. Popular options include Zabbix, Nagios, PRTG, and cloud-native solutions like Datadog.
Tools
NMS Options
| Tool | Type | Best For |
| Zabbix | Open source | Enterprise monitoring |
| Nagios | Open source | Alert-focused |
| PRTG | Commercial | Windows environments |
| Grafana | Open source | Visualization |
| Datadog | SaaS | Cloud-native |
| LibreNMS | Open source | Autodiscovery |
# Zabbix SNMP monitoring example
# 1. Add host with SNMP interface
# 2. Link template (e.g., Template Net Cisco IOS SNMPv2)
# 3. Items automatically created from MIB
# Zabbix agent configuration (alternative)
Server=zabbix.example.com
ServerActive=zabbix.example.com
Hostname=web-server-01
# Custom user parameter
UserParameter=custom.metric,/path/to/script.sh
# Prometheus metrics (modern alternative)
# SNMP Exporter + Prometheus + Grafana stack
Summary & Next Steps
Key Takeaways:
- SNMP: Universal device monitoring (metrics)
- NetFlow: Traffic analysis (flows)
- Syslog: Centralized logging (events)
- SNMPv3: Use for security (auth + encryption)
- NMS: Unified monitoring dashboards
Quiz
Test Your Knowledge
- SNMP v2c vs v3 security? (Community string vs auth+encryption)
- What's an OID? (Object Identifier, unique address)
- NetFlow captures what? (Flow metadata, not content)
- Syslog severity 0? (Emergency, worst)
- SNMP TRAP vs INFORM? (Unacknowledged vs acknowledged)