Overview
Observability is the ability to understand a complex system's behavior from the data it emits: logs, traces, and metrics. Build systems that are easy to debug and monitor.
Three Pillars
- Logs: Records of discrete events
- Traces: End-to-end request tracking
- Metrics: Numeric measurements over time
Logging with Winston
npm install winston
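import winston from 'winston'

// Default level 'info': debug and more verbose messages are dropped
const logger = winston.createLogger({
  level: 'info',
  format: winston.format.json(),
  transports: [
    // Errors only
    new winston.transports.File({ filename: 'error.log', level: 'error' }),
    // Everything at 'info' and above
    new winston.transports.File({ filename: 'combined.log' }),
    // Human-readable output for local development
    new winston.transports.Console({
      format: winston.format.simple()
    })
  ]
})

// Pass context as structured metadata instead of interpolating it into the message
logger.info('Server started', { port: 3000 })
logger.error('Database connection failed', { error: 'ECONNREFUSED' })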
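To make the effect of level: 'info' and the JSON format concrete, here is a minimal sketch of level filtering and one-object-per-line output in plain JavaScript. It is not Winston's implementation: createJsonLogger and sketchLogger are hypothetical names, and the exact field order of real Winston output may differ. The level table matches the npm-style levels Winston uses by default.

```javascript
// Severity levels in the npm style that Winston uses by default:
// a lower number means a more severe message.
const LEVELS = { error: 0, warn: 1, info: 2, http: 3, verbose: 4, debug: 5, silly: 6 }

// Hypothetical helper (not part of Winston): builds a logger that emits
// one JSON object per line through the supplied `write` function.
function createJsonLogger(minLevel, write = (line) => console.log(line)) {
  const threshold = LEVELS[minLevel]
  if (threshold === undefined) {
    throw new Error(`Unknown log level: ${minLevel}`)
  }

  function log(level, message, meta = {}) {
    const severity = LEVELS[level]
    if (severity === undefined) {
      throw new Error(`Unknown log level: ${level}`)
    }
    // Drop anything less severe than the configured threshold.
    if (severity > threshold) return false
    write(JSON.stringify({ level, message, ...meta }))
    return true
  }

  return {
    log,
    error: (message, meta) => log('error', message, meta),
    warn: (message, meta) => log('warn', message, meta),
    info: (message, meta) => log('info', message, meta),
    debug: (message, meta) => log('debug', message, meta)
  }
}

// Usage: collect lines in memory instead of printing them.
const lines = []
const sketchLogger = createJsonLogger('info', (line) => lines.push(line))
sketchLogger.info('Server started', { port: 3000 })
sketchLogger.debug('Cache warmed', { entries: 42 }) // dropped: debug is below info
console.log(lines)
```

Keeping context such as port in separate fields, not inside the message string, is what lets a log aggregator filter and group on it later.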
Distributed Tracing with OpenTelemetry
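npm install @opentelemetry/api @opentelemetry/sdk-trace-node
import { SpanStatusCode } from '@opentelemetry/api'
import { NodeTracerProvider } from '@opentelemetry/sdk-trace-node'

// No span processor or exporter is configured here, so spans are created but not sent anywhere
const provider = new NodeTracerProvider()
provider.register()

const tracer = provider.getTracer('my-app')
const span = tracer.startSpan('processOrder')
try {
  // Do work
  span.addEvent('order_processed', { orderId: 123 })
} catch (err) {
  // Record the failure on the span, then let the caller handle it
  span.recordException(err)
  span.setStatus({ code: SpanStatusCode.ERROR })
  throw err
} finally {
  // Always end the span so its duration is recorded
  span.end()
}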
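A trace becomes distributed when each service passes its trace context to the next one. OpenTelemetry's default propagator does this with the W3C Trace Context traceparent header. The sketch below only illustrates that header format (version, trace ID, parent span ID, flags) and is not OpenTelemetry's code: parseTraceparent, formatTraceparent, and continueTrace are hypothetical helpers, the sketch accepts version 00 only, and the span ID is supplied by the caller where a real tracer would generate it.

```javascript
// W3C Trace Context header: version-traceId-parentId-flags, all lowercase hex.
// This sketch only accepts version 00.
const TRACEPARENT_RE = /^00-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$/
const SPAN_ID_RE = /^[0-9a-f]{16}$/
const ALL_ZEROS_RE = /^0+$/

// Hypothetical helper: parse a traceparent value, or return null if it is invalid.
function parseTraceparent(header) {
  const match = TRACEPARENT_RE.exec(String(header).trim())
  if (!match) return null
  const [, traceId, parentId, flags] = match
  // All-zero trace or span IDs are invalid.
  if (ALL_ZEROS_RE.test(traceId) || ALL_ZEROS_RE.test(parentId)) return null
  // The lowest flag bit marks the trace as sampled.
  return { traceId, parentId, sampled: (parseInt(flags, 16) & 1) === 1 }
}

// Hypothetical helper: build a traceparent value for an outgoing request.
function formatTraceparent({ traceId, spanId, sampled }) {
  return `00-${traceId}-${spanId}-${sampled ? '01' : '00'}`
}

// Hypothetical helper: keep the caller's trace ID, but identify the current
// span as the parent of whatever the downstream service does next.
function continueTrace(incomingHeader, newSpanId) {
  if (!SPAN_ID_RE.test(newSpanId) || ALL_ZEROS_RE.test(newSpanId)) {
    throw new Error('newSpanId must be 16 lowercase hex characters and not all zeros')
  }
  const parent = parseTraceparent(incomingHeader)
  if (!parent) return null // caller would start a new trace instead
  return formatTraceparent({ traceId: parent.traceId, spanId: newSpanId, sampled: parent.sampled })
}

// Usage: the trace ID is preserved, the span ID changes at each hop.
const incoming = '00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01'
const outgoing = continueTrace(incoming, 'b7ad6b7169203331')
console.log(outgoing)
```

Because every hop shares one trace ID, a tracing backend can stitch the spans from different services into a single end-to-end view of the request.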
Metrics with Prometheus
npm install prom-client express
import express from 'express'
import { register, Counter, Histogram, Gauge } from 'prom-client'

const app = express()

const requestCounter = new Counter({
  name: 'http_requests_total',
  help: 'Total HTTP requests',
  labelNames: ['method', 'route', 'status']
})
const requestDuration = new Histogram({
  name: 'http_request_duration_seconds',
  help: 'HTTP request latency',
  labelNames: ['method', 'route']
})
const activeConnections = new Gauge({
  name: 'active_connections',
  help: 'Number of active connections'
})

// Express middleware: register it before the routes it should measure
app.use((req, res, next) => {
  const start = Date.now()
  activeConnections.inc()
  res.on('finish', () => {
    const duration = (Date.now() - start) / 1000
    // req.route is undefined when no route matched (for example, a 404)
    const route = req.route?.path ?? 'unmatched'
    requestCounter.labels(req.method, route, res.statusCode).inc()
    requestDuration.labels(req.method, route).observe(duration)
  })
  // 'close' also fires when the client disconnects early, so the gauge cannot leak
  res.on('close', () => activeConnections.dec())
  next()
})

// Expose metrics for Prometheus to scrape
app.get('/metrics', async (req, res) => {
  res.set('Content-Type', register.contentType)
  // register.metrics() returns a Promise in current prom-client versions
  res.end(await register.metrics())
})
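Each observe() call on a histogram updates a set of cumulative buckets plus a running sum and count, which is what Prometheus later uses to estimate latency quantiles. The sketch below illustrates that bookkeeping in plain JavaScript. SketchHistogram is a hypothetical class, not prom-client's implementation; the bucket bounds are prom-client's defaults, and the observed values are made up for illustration.

```javascript
// prom-client's default histogram bucket upper bounds.
const DEFAULT_BUCKETS = [0.005, 0.01, 0.025, 0.05, 0.1, 0.25, 0.5, 1, 2.5, 5, 10]

// Hypothetical class: tracks cumulative bucket counts, sum, and count.
class SketchHistogram {
  constructor(buckets = DEFAULT_BUCKETS) {
    this.bounds = [...buckets].sort((a, b) => a - b)
    this.counts = new Array(this.bounds.length).fill(0)
    this.count = 0
    this.sum = 0
  }

  observe(value) {
    this.count += 1
    this.sum += value
    // Buckets are cumulative: a value counts toward every bucket whose
    // upper bound ("le", less than or equal) is at least that value.
    this.bounds.forEach((bound, i) => {
      if (value <= bound) this.counts[i] += 1
    })
  }

  snapshot() {
    const buckets = this.bounds.map((bound, i) => ({ le: String(bound), count: this.counts[i] }))
    // The +Inf bucket always equals the total number of observations.
    buckets.push({ le: '+Inf', count: this.count })
    return { buckets, sum: this.sum, count: this.count }
  }
}

// Usage: three illustrative request durations in seconds.
const histogram = new SketchHistogram()
histogram.observe(0.03)
histogram.observe(0.25) // lands in the 0.25 bucket because bounds are inclusive
histogram.observe(7)
const snapshot = histogram.snapshot()
console.log(snapshot)
```

Since every observation increments several buckets, each extra label value multiplies the number of stored series, which is one reason to label by route pattern and not by raw URL.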
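Proper observability can substantially reduce mean time to resolution (MTTR): when logs, traces, and metrics point to the failing component, less time is spent reproducing the problem.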