Monitoring

This guide covers the monitoring capabilities of the Flowise service: checking system health, collecting performance metrics, retrieving logs, and configuring logging and alerts.

Monitoring Overview

Flowise provides comprehensive monitoring capabilities to help you ensure the reliability, performance, and availability of your AI workflows. These monitoring tools enable you to track system health, identify issues, and optimize resource usage.

Monitoring Endpoints

Health Check

Get the current health status of the Flowise service.
GET /api/v1/health

curl "https://flowise.moodmnky.com/api/v1/health" \
  -H "x-api-key: your_api_key"
Response:
{
  "status": "healthy",
  "version": "1.3.8",
  "uptime": 259200,
  "startTime": "2024-03-29T12:00:00Z",
  "environment": "production",
  "memoryUsage": {
    "rss": 150540288,
    "heapTotal": 115343360,
    "heapUsed": 89457664,
    "external": 3688524
  },
  "cpuUsage": 12.5
}

System Metrics

Get detailed system metrics for the Flowise service.
GET /api/v1/metrics

curl "https://flowise.moodmnky.com/api/v1/metrics" \
  -H "x-api-key: your_api_key"
Response:
{
  "system": {
    "cpu": {
      "usage": 12.5,
      "cores": 4,
      "processes": 24
    },
    "memory": {
      "total": 8589934592,
      "free": 3221225472,
      "used": 5368709120,
      "usage": 62.5
    },
    "uptime": 259200
  },
  "application": {
    "requests": {
      "total": 15824,
      "success": 15542,
      "failed": 282,
      "averageResponseTime": 325
    },
    "chatflows": {
      "active": 18,
      "total": 25
    },
    "tools": {
      "active": 12,
      "failed": 2
    },
    "database": {
      "connections": 8,
      "queryLatency": 45
    }
  }
}

Logs Retrieval

Retrieve application logs, optionally filtered by level, time range, or chatflow.
GET /api/v1/logs?limit=100&level=error

curl "https://flowise.moodmnky.com/api/v1/logs?limit=100&level=error" \
  -H "x-api-key: your_api_key"
Response:
{
  "logs": [
    {
      "timestamp": "2024-04-01T12:34:56Z",
      "level": "error",
      "message": "Failed to connect to external API",
      "metadata": {
        "chatflowId": "chatflow-1234",
        "nodeId": "node-5678",
        "statusCode": 503
      }
    },
    {
      "timestamp": "2024-04-01T10:23:45Z",
      "level": "error",
      "message": "Memory cache overflow",
      "metadata": {
        "chatflowId": "chatflow-5678",
        "memoryUsage": 95.2
      }
    }
  ],
  "pagination": {
    "total": 282,
    "limit": 100,
    "offset": 0,
    "hasMore": true
  }
}
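
The pagination block in the response makes it easy to page through a large result set. The sketch below assumes the same client.flowise.getLogs helper used in the usage examples later in this guide and keeps requesting pages until hasMore is false:

// Page through all error logs using the documented pagination fields
async function fetchAllErrorLogs() {
  const pageSize = 100;
  let offset = 0;
  const allLogs = [];

  while (true) {
    const page = await client.flowise.getLogs({ level: "error", limit: pageSize, offset });
    allLogs.push(...page.logs);

    if (!page.pagination.hasMore) break;
    offset += pageSize;
  }

  return allLogs;
}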

Configure Logging

Update the logging configuration, including level, format, destination, and log rotation.
POST /api/v1/admin/logging

curl -X POST "https://flowise.moodmnky.com/api/v1/admin/logging" \
  -H "x-api-key: your_api_key" \
  -H "Content-Type: application/json" \
  -d '{
    "level": "info",
    "format": "json",
    "destination": "file",
    "filePath": "/var/log/flowise.log",
    "maxSize": 10485760,
    "maxFiles": 5
  }'
Response:
{
  "success": true,
  "config": {
    "level": "info",
    "format": "json",
    "destination": "file",
    "filePath": "/var/log/flowise.log",
    "maxSize": 10485760,
    "maxFiles": 5,
    "updatedAt": "2024-04-01T12:00:00Z"
  }
}
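
During an incident you may want to raise verbosity temporarily and then restore the normal level. Below is a minimal fetch-based sketch against the endpoint above; the file path and rotation values simply mirror the example request, and reading the API key from an environment variable is an assumption:

// Switch the log level while keeping the other settings from the example above
async function setLogLevel(level) {
  const response = await fetch("https://flowise.moodmnky.com/api/v1/admin/logging", {
    method: "POST",
    headers: {
      "x-api-key": process.env.FLOWISE_API_KEY, // assumed to hold your API key
      "Content-Type": "application/json"
    },
    body: JSON.stringify({
      level,
      format: "json",
      destination: "file",
      filePath: "/var/log/flowise.log",
      maxSize: 10485760,
      maxFiles: 5
    })
  });
  return response.json();
}

// Raise verbosity while investigating, then restore the default
await setLogLevel("debug");
// ... reproduce the issue ...
await setLogLevel("info");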

Alert Configuration

Configure alerting channels, event subscriptions, and thresholds.
POST /api/v1/admin/alerts

curl -X POST "https://flowise.moodmnky.com/api/v1/admin/alerts" \
  -H "x-api-key: your_api_key" \
  -H "Content-Type: application/json" \
  -d '{
    "enabled": true,
    "channels": [
      {
        "type": "email",
        "recipients": ["[email protected]", "[email protected]"],
        "events": ["system.error", "chatflow.failure"]
      },
      {
        "type": "webhook",
        "url": "https://alerts.example.com/webhook",
        "headers": {
          "Authorization": "Bearer your-token"
        },
        "events": ["system.critical", "chatflow.failure"]
      }
    ],
    "thresholds": {
      "cpuUsage": 90,
      "memoryUsage": 85,
      "errorRate": 5,
      "responseTime": 2000
    }
  }'
Response:
{
  "success": true,
  "config": {
    "enabled": true,
    "channels": [
      {
        "type": "email",
        "recipients": ["[email protected]", "[email protected]"],
        "events": ["system.error", "chatflow.failure"]
      },
      {
        "type": "webhook",
        "url": "https://alerts.example.com/webhook",
        "events": ["system.critical", "chatflow.failure"]
      }
    ],
    "thresholds": {
      "cpuUsage": 90,
      "memoryUsage": 85,
      "errorRate": 5,
      "responseTime": 2000
    },
    "updatedAt": "2024-04-01T12:00:00Z"
  }
}

Monitoring Parameters

Logs Parameters

| Parameter | Type | Description | Default |
| --- | --- | --- | --- |
| limit | number | Maximum number of logs to return | 100 |
| offset | number | Offset for pagination | 0 |
| level | string | Log level (debug, info, warn, error, critical) | info |
| startDate | string | Start date in ISO format | 24 hours ago |
| endDate | string | End date in ISO format | Current time |
| chatflowId | string | Filter logs by chatflow ID | All chatflows |
| search | string | Search term within log messages | None |
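
These parameters can be combined in a single query. Here is a minimal sketch using the client.flowise.getLogs helper from the usage examples below; the chatflow ID and search term are illustrative:

// Fetch error logs for one chatflow within a one-hour window
const scopedLogs = await client.flowise.getLogs({
  level: "error",
  chatflowId: "chatflow-1234",       // illustrative chatflow ID
  search: "external API",            // matches against the log message text
  startDate: "2024-04-01T12:00:00Z",
  endDate: "2024-04-01T13:00:00Z",
  limit: 50
});

console.log(`Matched ${scopedLogs.pagination.total} log entries`);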

Alert Thresholds

| Threshold | Type | Description | Default |
| --- | --- | --- | --- |
| cpuUsage | number | CPU usage percentage threshold | 90 |
| memoryUsage | number | Memory usage percentage threshold | 85 |
| errorRate | number | Error rate percentage threshold | 5 |
| responseTime | number | Average response time threshold in ms | 2000 |
| diskUsage | number | Disk usage percentage threshold | 85 |
| concurrentRequests | number | Concurrent requests threshold | 100 |
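
The alert configuration example earlier in this guide sets only four of these thresholds; diskUsage and concurrentRequests can be sent in the same request body. A minimal fetch-based sketch follows (whether omitted fields such as channels are preserved by the endpoint is not specified here, so include your channel configuration in the same request if in doubt):

// Update alert thresholds, including disk usage and concurrent request limits
await fetch("https://flowise.moodmnky.com/api/v1/admin/alerts", {
  method: "POST",
  headers: {
    "x-api-key": process.env.FLOWISE_API_KEY, // assumed to hold your API key
    "Content-Type": "application/json"
  },
  body: JSON.stringify({
    enabled: true,
    // add your "channels" array here if the endpoint replaces the whole config
    thresholds: {
      cpuUsage: 90,
      memoryUsage: 85,
      errorRate: 5,
      responseTime: 2000,
      diskUsage: 85,
      concurrentRequests: 100
    }
  })
});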

Alert Event Types

| Event Type | Description |
| --- | --- |
| system.warning | System warning events |
| system.error | System error events |
| system.critical | System critical events |
| chatflow.failure | Chatflow execution failures |
| chatflow.timeout | Chatflow execution timeouts |
| resource.low | Resource usage approaching thresholds |
| security.violation | Security policy violations |
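
On the receiving end of a webhook channel, alerts can be routed by these type names. A minimal sketch follows; the assumption that the delivered payload exposes the event type as a plain string is illustrative:

// Map an incoming alert event type to a severity for downstream routing
function classifyAlertEvent(eventType) {
  switch (eventType) {
    case "system.critical":
    case "security.violation":
      return "page";      // notify on-call immediately
    case "system.error":
    case "chatflow.failure":
    case "chatflow.timeout":
      return "ticket";    // create an issue for follow-up
    case "system.warning":
    case "resource.low":
    default:
      return "log";       // record and review later
  }
}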

Usage Examples

Basic Health Monitoring

// Check system health
async function checkSystemHealth() {
  try {
    const health = await client.flowise.getHealth();
    
    console.log(`System status: ${health.status}`);
    console.log(`Uptime: ${Math.floor(health.uptime / 3600)} hours`);
    console.log(`Memory usage: ${Math.round(health.memoryUsage.heapUsed / 1024 / 1024)} MB`);
    
    // Alert if not healthy
    if (health.status !== "healthy") {
      sendAlert("System health check failed", health);
      return false;
    }
    
    return true;
  } catch (error) {
    console.error("Health check failed:", error.message);
    sendAlert("Health check failed", { error: error.message });
    return false;
  }
}

// Schedule regular health checks
setInterval(checkSystemHealth, 300000); // Every 5 minutes
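
The examples in this section call sendAlert and updateDashboard as placeholders for your own notification and dashboard code. A minimal sendAlert could simply forward to the webhook channel configured earlier; the URL and token below mirror that illustrative configuration:

// Placeholder alert sender: forwards alerts to the webhook channel
async function sendAlert(title, details) {
  try {
    await fetch("https://alerts.example.com/webhook", {
      method: "POST",
      headers: {
        "Authorization": "Bearer your-token",
        "Content-Type": "application/json"
      },
      body: JSON.stringify({ title, details, timestamp: new Date().toISOString() })
    });
  } catch (error) {
    console.error("Failed to send alert:", error.message);
  }
}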

Performance Monitoring Dashboard

// Collect metrics for dashboard
async function collectMetricsForDashboard() {
  try {
    const metrics = await client.flowise.getMetrics();
    
    // Extract key performance indicators
    const kpis = {
      systemHealth: {
        cpuUsage: metrics.system.cpu.usage,
        memoryUsage: metrics.system.memory.usage,
        uptime: formatUptime(metrics.system.uptime)
      },
      applicationPerformance: {
        requestsPerMinute: metrics.application.requests.total / (metrics.system.uptime / 60),
        successRate: (metrics.application.requests.success / metrics.application.requests.total) * 100,
        averageResponseTime: metrics.application.requests.averageResponseTime
      },
      resourceUtilization: {
        activeChatflows: metrics.application.chatflows.active,
        databaseConnections: metrics.application.database.connections,
        activeTools: metrics.application.tools.active
      }
    };
    
    // Update dashboard with KPIs
    updateDashboard(kpis);
    
    // Check for threshold violations
    checkThresholds(metrics);
    
    return kpis;
  } catch (error) {
    console.error("Failed to collect metrics:", error.message);
    return null;
  }
}

function formatUptime(seconds) {
  const days = Math.floor(seconds / 86400);
  const hours = Math.floor((seconds % 86400) / 3600);
  const minutes = Math.floor((seconds % 3600) / 60);
  return `${days}d ${hours}h ${minutes}m`;
}

function checkThresholds(metrics) {
  if (metrics.system.cpu.usage > 80) {
    sendAlert("High CPU Usage", { usage: metrics.system.cpu.usage });
  }
  
  if (metrics.system.memory.usage > 80) {
    sendAlert("High Memory Usage", { usage: metrics.system.memory.usage });
  }
  
  if (metrics.application.requests.failed / metrics.application.requests.total > 0.05) {
    sendAlert("High Error Rate", { 
      errorRate: (metrics.application.requests.failed / metrics.application.requests.total) * 100 
    });
  }
}

// Schedule regular metrics collection
setInterval(collectMetricsForDashboard, 60000); // Every minute

Error Log Analysis

// Analyze error logs for patterns
async function analyzeErrorLogs() {
  try {
    // Get recent error logs
    const logs = await client.flowise.getLogs({
      level: "error",
      limit: 500,
      startDate: new Date(Date.now() - 24 * 60 * 60 * 1000).toISOString() // Last 24 hours
    });
    
    // Group by error type
    const errorGroups = logs.logs.reduce((groups, log) => {
      const messageKey = log.message.split(':')[0]; // Simple grouping by message prefix
      groups[messageKey] = groups[messageKey] || [];
      groups[messageKey].push(log);
      return groups;
    }, {});
    
    // Sort by frequency
    const errorFrequency = Object.entries(errorGroups)
      .map(([messageKey, logs]) => ({
        type: messageKey,
        count: logs.length,
        samples: logs.slice(0, 3), // Include a few examples
        chatflows: [...new Set(logs.map(log => log.metadata.chatflowId).filter(Boolean))]
      }))
      .sort((a, b) => b.count - a.count);
    
    console.log("Error analysis results:");
    console.log(`Total errors in last 24 hours: ${logs.pagination.total} (analyzed ${logs.logs.length})`);
    console.log("Top error types:");
    errorFrequency.slice(0, 5).forEach(error => {
      console.log(`- ${error.type}: ${error.count} occurrences (${error.chatflows.length} chatflows affected)`);
    });
    
    // Return detailed analysis
    return {
      totalErrors: logs.pagination.total,
      errorTypes: errorFrequency,
      timeRange: {
        start: logs.logs[logs.logs.length - 1]?.timestamp,
        end: logs.logs[0]?.timestamp
      }
    };
  } catch (error) {
    console.error("Failed to analyze error logs:", error.message);
    return null;
  }
}

// Run error analysis daily
setInterval(analyzeErrorLogs, 86400000); // Every 24 hours

Integration with Monitoring Tools

Prometheus Integration

// Configure Prometheus metrics endpoint
const prometheusConfig = {
  enabled: true,
  path: "/metrics",
  collectDefaultMetrics: true,
  labels: {
    environment: "production",
    instance: "flowise-1"
  }
};

await client.flowise.configureMonitoring({
  prometheus: prometheusConfig
});

// Example Prometheus configuration
/* 
scrape_configs:
  - job_name: 'flowise'
    scrape_interval: 15s
    metrics_path: /metrics
    static_configs:
      - targets: ['flowise.moodmnky.com']
    basic_auth:
      username: 'prometheus'
      password: 'your-secure-password'
*/

ELK Stack Integration

// Configure log forwarding to ELK Stack
const elkConfig = {
  enabled: true,
  elasticUrl: "https://elasticsearch.example.com:9200",
  indexPrefix: "flowise-logs",
  apiKey: "your-elasticsearch-api-key",
  logLevels: ["warn", "error", "critical"]
};

await client.flowise.configureMonitoring({
  elkStack: elkConfig
});

Custom Webhook Integration

// Configure webhook notifications for monitoring events
const webhookConfig = {
  enabled: true,
  url: "https://monitoring.example.com/webhook",
  headers: {
    "Authorization": "Bearer your-token",
    "Content-Type": "application/json"
  },
  events: [
    "system.health",
    "system.metrics",
    "chatflow.failure"
  ],
  includeMetadata: true,
  batchInterval: 300 // Send batched events every 5 minutes
};

await client.flowise.configureMonitoring({
  webhook: webhookConfig
});

Best Practices

  1. Health Monitoring Strategy
    • Implement regular health checks
    • Set up automated alerting for unhealthy status
    • Monitor key system resources
    • Establish baseline performance metrics
  2. Log Management
    • Configure appropriate log levels
    • Implement log rotation to manage storage
    • Centralize logs for easier analysis
    • Use structured logging format (JSON)
  3. Alert Configuration
    • Define meaningful alert thresholds
    • Avoid alert fatigue with proper prioritization
    • Configure multiple notification channels
    • Implement escalation procedures for critical issues
  4. Performance Optimization
    • Monitor response time trends against an established baseline (see the sketch after this list)
    • Track resource utilization patterns
    • Identify and address performance bottlenecks
    • Implement resource-based scaling
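
As noted above, one way to establish a baseline and monitor response time trends is to keep a rolling average of averageResponseTime from the metrics endpoint and alert on significant deviation. Below is a minimal sketch; the sample window and the 20% deviation factor are illustrative:

// Track a rolling baseline of average response time and alert on deviation
const responseTimeSamples = [];

async function trackResponseTimeBaseline() {
  const metrics = await client.flowise.getMetrics();
  const current = metrics.application.requests.averageResponseTime;

  responseTimeSamples.push(current);
  if (responseTimeSamples.length > 60) responseTimeSamples.shift(); // keep roughly the last hour of samples

  const baseline = responseTimeSamples.reduce((sum, value) => sum + value, 0) / responseTimeSamples.length;

  // Alert once enough samples exist and the latest reading exceeds the baseline by more than 20%
  if (responseTimeSamples.length >= 10 && current > baseline * 1.2) {
    sendAlert("Response time above baseline", { current, baseline: Math.round(baseline) });
  }
}

setInterval(trackResponseTimeBaseline, 60000); // Every minute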

Support & Resources