The AlienApp for Cloudflare presents a number of diagnostic statistics and related feedback, enabling you to assess the status of the AlienApp without having to delve into the sensor log.
The statistics reported on the AlienApp's page are as follows:
- Zone Activity: Each zone's status indicates whether logs are available to be monitored, or whether enterprise log share is disabled for that zone
- Error Rate: The number of errors the app detects in its logic, displayed as Errors per Second
  Important: The app will retry potentially recoverable errors three times before giving up. See Error Recovery for more information.
- Throttled Events: The percentage of events that are ignored during throttling mode
- Orchestration Action Count: The number of orchestration actions invoked since the last time the sensor was restarted
- Average Event Age: The average age of events coming from Cloudflare
Throttling Mode
If your sensor is overloaded by an unusually high number of events per second (EPS), the app may enter throttling mode to reduce strain on the sensor or lower the bandwidth it consumes. Throttling mode is enabled automatically whenever the app detects that more than 1000 EPS are being generated. Once the actual EPS has remained under 1000 for a minute, the app disengages throttling mode.
While the app is in throttling mode, it limits the amount of data pulled to the sensor. This helps the app keep the event rate below the 1000 EPS threshold.
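The enter and exit rule described above can be sketched as a small state machine. This is only an illustrative model, not the app's actual implementation: the `ThrottleController` class, its `observe` method, and the idea of feeding it periodic EPS measurements are all assumptions made for the example. Only the 1000 EPS threshold and the one-minute cooldown come from the text.

```python
from dataclasses import dataclass
from typing import Optional

EPS_THRESHOLD = 1000     # from the text: throttling engages above 1000 EPS
COOLDOWN_SECONDS = 60.0  # from the text: disengage after a minute under the threshold


@dataclass
class ThrottleController:
    """Hypothetical model of the throttling rule; not the app's real code."""
    throttling: bool = False
    below_since: Optional[float] = None  # when EPS first dropped under the threshold

    def observe(self, eps: float, now: float) -> bool:
        """Record one EPS measurement taken at `now` (seconds); return the mode."""
        if eps > EPS_THRESHOLD:
            self.throttling = True
        if self.throttling:
            if eps >= EPS_THRESHOLD:
                # Not under the threshold, so the cooldown timer restarts.
                self.below_since = None
            elif self.below_since is None:
                self.below_since = now
            elif now - self.below_since >= COOLDOWN_SECONDS:
                self.throttling = False
                self.below_since = None
        return self.throttling
```

In this sketch, a spike above the threshold during the cooldown resets the timer, so the mode only disengages after a full uninterrupted minute below 1000 EPS.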
When the app is in throttling mode, the Status page indicates this and displays the approximate percentage of data being skipped.
Error Recovery
If a scheduled job receives a potentially recoverable error, the app retries that job up to three times before giving up. If it still cannot collect the data after the third retry, the failure is noted in the scheduler history, and the next scheduled job tries to collect the failed job's data in addition to its own.
When this happens, you may see some jobs labeled "already running". This label means that the preceding job took more than a minute to complete, so the scheduled job was skipped because the previous job was still running. The job after a skipped job then collects both its own data and the skipped job's data, and this cycle continues until the app has caught up.
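The retry and catch-up behavior can be modeled roughly as follows. This is a simplified sketch under stated assumptions, not the app's scheduler: `RecoverableError`, `run_job`, `run_schedule`, and the notion of a time-window "cursor" are hypothetical names introduced for illustration. The sketch covers only the carry-over of a failed job's data and does not model skipped "already running" jobs.

```python
from typing import Callable, List, Tuple

MAX_RETRIES = 3  # from the text: up to three retries before giving up


class RecoverableError(Exception):
    """Hypothetical marker for errors that are worth retrying."""


def run_job(collect: Callable[[float, float], List[dict]],
            window_start: float, window_end: float) -> Tuple[bool, List[dict]]:
    """Attempt the collection once, then retry up to MAX_RETRIES times."""
    for _attempt in range(1 + MAX_RETRIES):
        try:
            return True, collect(window_start, window_end)
        except RecoverableError:
            continue
    return False, []


def run_schedule(collect: Callable[[float, float], List[dict]],
                 ticks: List[float], start: float) -> List[Tuple[float, float, bool]]:
    """Run one job per tick; a failed window is carried into the next job."""
    history = []
    cursor = start  # end of the last window that was collected successfully
    for tick in ticks:
        ok, _events = run_job(collect, cursor, tick)
        history.append((cursor, tick, ok))
        if ok:
            # Advance only on success, so the next job also covers failed data.
            cursor = tick
    return history
```

Because the cursor advances only after a successful collection, the job that follows a failure requests a wider window that includes the data the failed job missed.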
Average Event Age
This metric represents the latency between an event's timestamp in Cloudflare and the moment the app processes it. The app takes the age of each zone's most recent event and averages those ages to produce the average event age.
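As a rough illustration of the calculation described above, the following sketch averages the age of each zone's most recent event. The function name `average_event_age` and the dictionary keyed by zone name are assumptions made for the example; the app's internal data structures are not documented here.

```python
from datetime import datetime
from typing import Dict, Optional


def average_event_age(latest_event_by_zone: Dict[str, datetime],
                      processed_at: datetime) -> Optional[float]:
    """Return the mean age, in seconds, of each zone's most recent event.

    Returns None when no zone has reported an event yet.
    """
    if not latest_event_by_zone:
        return None
    ages = [(processed_at - timestamp).total_seconds()
            for timestamp in latest_event_by_zone.values()]
    return sum(ages) / len(ages)
```

For example, if two zones have most recent events that are 30 and 90 seconds old at processing time, the average event age is 60 seconds.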