Observability & monitoring

Observability and monitoring on GLBNXT Platform gives your development team continuous visibility into the health, performance, and behaviour of your platform environment and the AI applications running within it. GLBNXT manages the monitoring infrastructure, collects data across the full stack, and ensures that your team always has the information needed to understand what is happening in the environment, identify issues early, and maintain confidence in production workloads.

This section explains what is monitored, what visibility your team has, and how observability data supports both operational and compliance requirements.

What Observability Covers

Observability on GLBNXT Platform spans four interconnected layers, each providing a different view of the environment.

Infrastructure observability covers the health and performance of the underlying compute, networking, storage, and orchestration components that GLBNXT manages. This includes CPU and GPU utilisation, memory consumption, disk usage, network throughput, and the operational status of Kubernetes workloads and platform services.

Application observability covers the behaviour of the AI applications and services your team deploys on the platform. This includes request volumes, response times, error rates, and availability metrics for the endpoints and interfaces your applications expose.

Model observability covers the performance and behaviour of AI models serving inference requests within your environment. This includes request latency, token throughput, model error rates, and usage patterns across the models available in your Model Hub.

Audit observability covers the complete record of user activity, data access events, API calls, and administrative actions within your environment. Audit data is captured for compliance purposes and is kept separate from operational monitoring data to ensure its integrity.

Infrastructure Monitoring

GLBNXT monitors the infrastructure layer of your environment continuously. Compute health, storage availability, network performance, and Kubernetes cluster status are all observed in real time. Alerting is configured at the platform level to detect infrastructure anomalies and trigger remediation before they affect application availability.

Your team has visibility into infrastructure metrics through the Monitoring and Observability area of the platform console. This gives you the context needed to understand whether application behaviour is driven by platform conditions or application logic, without requiring you to manage the monitoring infrastructure itself.

Infrastructure incidents are managed by GLBNXT. When a platform-level issue is detected, GLBNXT's operational team responds directly. Your team is notified of incidents that affect your environment through the support and communication channels agreed during onboarding.

Model Performance Monitoring

Model observability is particularly important in production AI environments, where inference quality, latency, and throughput directly affect the end-user experience. GLBNXT Platform captures detailed metrics for every model serving inference requests in your environment.

Key model metrics available through the platform console include:

  • Inference request volume over time

  • Average and percentile response latency

  • Token input and output volumes

  • Error rates and timeout frequencies

  • Compute utilisation per model instance

These metrics allow your team to understand how models are performing under real usage conditions, identify models that may need configuration adjustments, and plan for capacity as your application usage grows.
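To illustrate why both average and percentile latency appear in the list above, the following is a minimal sketch that summarises a handful of latency samples. The sample values, the function names, and the nearest-rank percentile method are assumptions made for illustration; the source does not state how the platform computes its percentiles.

```python
import math
from statistics import mean


def percentile(samples, p):
    """Nearest-rank percentile: the smallest sample with at least p% of values at or below it."""
    if not samples:
        raise ValueError("samples must not be empty")
    if not 0 < p <= 100:
        raise ValueError("p must be in the range (0, 100]")
    ordered = sorted(samples)
    rank = math.ceil(p / 100 * len(ordered))
    return ordered[rank - 1]


def summarise_latency(latencies_ms):
    """Return the average alongside the median and tail percentiles."""
    return {
        "avg_ms": mean(latencies_ms),
        "p50_ms": percentile(latencies_ms, 50),
        "p95_ms": percentile(latencies_ms, 95),
        "p99_ms": percentile(latencies_ms, 99),
    }


# Hypothetical inference latencies in milliseconds, including one slow outlier.
latencies_ms = [120, 135, 110, 980, 125, 140, 130, 115, 150, 128]
summary = summarise_latency(latencies_ms)
print(summary)
```

In this sample a single slow request pulls the average well above the median, which is why tail percentiles are usually read together with the average when judging the latency end users actually experience.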

LLM Tracing and Evaluation

Beyond infrastructure and model metrics, GLBNXT Platform supports deep observability into the behaviour of AI workflows and language model chains through integration with LLM tracing tools available in your environment. These tools capture the full trace of an AI workflow execution, including each step in a chain, the inputs and outputs at every stage, latency at each step, and the final output delivered to the application.

LLM tracing is particularly valuable for debugging complex RAG pipelines, multi-agent workflows, and any AI application where intermediate steps affect the quality of the final output. Traces give your team the ability to inspect exactly what happened inside a workflow execution, identify where quality or performance issues originate, and iterate with precision.
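As a rough picture of what a trace contains, the sketch below records the input, output, and latency of each step in a two-step workflow. The class names, the `run_step` helper, and the stand-in `retrieve` and `generate` functions are all hypothetical; this is not the data model or API of any specific tracing tool available on the platform.

```python
import time
from dataclasses import dataclass, field
from typing import Any, Callable, List


@dataclass
class StepRecord:
    name: str
    step_input: Any
    step_output: Any
    latency_ms: float


@dataclass
class WorkflowTrace:
    workflow: str
    steps: List[StepRecord] = field(default_factory=list)

    def run_step(self, name: str, fn: Callable[[Any], Any], step_input: Any) -> Any:
        """Run one step, timing it and recording its input and output."""
        start = time.perf_counter()
        step_output = fn(step_input)
        latency_ms = (time.perf_counter() - start) * 1000
        self.steps.append(StepRecord(name, step_input, step_output, latency_ms))
        return step_output

    def slowest_step(self) -> StepRecord:
        """Return the step that took longest, a common starting point for debugging."""
        return max(self.steps, key=lambda step: step.latency_ms)


# Stand-in steps for a simple RAG pipeline (hypothetical, no real retrieval or model call).
def retrieve(question: str) -> List[str]:
    return ["doc-1", "doc-2"]


def generate(documents: List[str]) -> str:
    return f"Answer based on {len(documents)} documents"


trace = WorkflowTrace("rag-demo")
docs = trace.run_step("retrieve", retrieve, "What is monitored?")
answer = trace.run_step("generate", generate, docs)

for step in trace.steps:
    print(step.name, step.step_input, step.step_output, f"{step.latency_ms:.3f} ms")
```

Because every intermediate input and output is kept, a poor final answer can be traced back to the step that introduced the problem, for example a retrieval step that returned the wrong documents.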

Evaluation capabilities allow your team to measure model and pipeline quality systematically, running assessments against defined criteria and tracking quality metrics over time as models are updated or pipeline configurations change.
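A systematic evaluation can be as simple as running a fixed set of cases through a pipeline and tracking the pass rate over time. The sketch below assumes a hypothetical criterion (required terms must appear in the output) and a toy pipeline; real evaluation tooling will typically offer richer criteria, and none of these names come from the platform.

```python
from typing import Callable, Dict, List, Tuple


def evaluate(pipeline: Callable[[str], str], cases: List[Dict]) -> Tuple[float, List[Dict]]:
    """Run each case through the pipeline and check that required terms appear in the output."""
    if not cases:
        raise ValueError("cases must not be empty")
    results = []
    for case in cases:
        output = pipeline(case["input"])
        missing = [term for term in case["must_contain"] if term.lower() not in output.lower()]
        results.append({"input": case["input"], "passed": not missing, "missing": missing})
    pass_rate = sum(result["passed"] for result in results) / len(results)
    return pass_rate, results


# Toy pipeline standing in for a model or RAG chain (hypothetical).
def toy_pipeline(question: str) -> str:
    answers = {
        "What is monitored?": "Infrastructure, applications, models, and audit events.",
    }
    return answers.get(question, "I do not know.")


cases = [
    {"input": "What is monitored?", "must_contain": ["infrastructure", "audit"]},
    {"input": "Who handles incidents?", "must_contain": ["GLBNXT"]},
]

pass_rate, results = evaluate(toy_pipeline, cases)
print(f"pass rate: {pass_rate:.0%}")
```

Re-running the same cases after a model update or a pipeline configuration change gives a comparable quality figure from one version to the next.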

Application Logs

All application workloads running on GLBNXT Platform produce logs that are captured and made available through the observability layer. Application logs give your team visibility into the runtime behaviour of the services you deploy, including errors, warnings, and diagnostic output generated by your own application code.

Logs are searchable and filterable through the platform console. For teams that need to integrate log data with existing monitoring or SIEM tooling, log export is available through the platform API. Your GLBNXT contact can advise on the appropriate integration pattern for your organisation's tooling.

Audit Trails

Audit trails on GLBNXT Platform provide a complete, immutable record of all significant events within your environment. Audit data includes user login and access events, model inference requests, data access operations, administrative changes to environment configuration, secrets vault access events, and API calls made by platform services and applications.

Audit logs are timestamped, tamper-evident, and retained according to the compliance requirements agreed for your environment. They are available through the Monitoring and Observability area of the platform console and can be exported for integration with your organisation's compliance, audit, and reporting processes.
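For readers unfamiliar with the term, one common way to make a log tamper-evident is a hash chain, in which each entry stores a hash that covers both its own content and the previous entry's hash. The sketch below illustrates that general technique only. The source does not describe how GLBNXT implements tamper evidence, and the event fields and function names here are hypothetical.

```python
import hashlib
import json

GENESIS_HASH = "0" * 64  # Placeholder "previous hash" for the first entry.


def entry_hash(prev_hash: str, event: dict) -> str:
    """Hash the previous entry's hash together with a canonical form of the event."""
    payload = json.dumps(event, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()


def append_event(chain: list, event: dict) -> None:
    """Append an event, linking it to the entry before it."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS_HASH
    chain.append({"event": event, "prev_hash": prev_hash, "hash": entry_hash(prev_hash, event)})


def verify_chain(chain: list) -> bool:
    """Recompute every hash; any edited, removed, or reordered entry breaks the chain."""
    prev_hash = GENESIS_HASH
    for entry in chain:
        if entry["prev_hash"] != prev_hash:
            return False
        if entry["hash"] != entry_hash(prev_hash, entry["event"]):
            return False
        prev_hash = entry["hash"]
    return True


# Hypothetical audit events.
chain = []
append_event(chain, {"ts": "2024-01-01T09:00:00Z", "actor": "alice", "action": "login"})
append_event(chain, {"ts": "2024-01-01T09:05:00Z", "actor": "alice", "action": "vault.read"})
intact_before = verify_chain(chain)

chain[0]["event"]["actor"] = "mallory"  # Simulate tampering with an earlier entry.
intact_after = verify_chain(chain)
print(intact_before, intact_after)
```

The practical consequence for exported audit data is that integrity can be checked independently of the system that produced it, provided the export carries the chained hashes or an equivalent integrity proof.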

For organisations subject to GDPR, ISO 27001, NIS2, or sector-specific regulatory requirements, the audit trail provides the evidence base needed to demonstrate that data access and processing activities comply with applicable obligations. See the Compliance Frameworks section for further guidance on how audit data supports your compliance posture.

Alerting and Incident Notification

GLBNXT configures alerting across the platform stack to detect and respond to conditions that could affect your environment. Alerts are triggered by infrastructure health events, resource threshold breaches, service availability changes, and security-relevant events such as repeated authentication failures.
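A trigger such as "repeated authentication failures" is typically expressed as a count within a time window. The following is a minimal sketch of that idea using a sliding window per user. The thresholds, class name, and timestamps are assumptions chosen for illustration and do not reflect the platform's actual alert rules.

```python
from collections import defaultdict, deque


class RepeatedFailureDetector:
    """Flags a user once max_failures occur within window_seconds of each other."""

    def __init__(self, max_failures: int = 5, window_seconds: float = 60.0):
        self.max_failures = max_failures
        self.window_seconds = window_seconds
        self._failures = defaultdict(deque)  # user -> timestamps of recent failures

    def record_failure(self, user: str, timestamp: float) -> bool:
        """Record a failed login and return True if the alert condition is met."""
        window = self._failures[user]
        window.append(timestamp)
        # Drop failures that have fallen out of the sliding window.
        while window and timestamp - window[0] > self.window_seconds:
            window.popleft()
        return len(window) >= self.max_failures


detector = RepeatedFailureDetector(max_failures=3, window_seconds=60)

# Three failures within 20 seconds: the third one meets the threshold.
burst = [detector.record_failure("alice", t) for t in (0, 10, 20)]

# Three failures spread over several minutes: the threshold is never met.
spread = [detector.record_failure("bob", t) for t in (0, 100, 200)]
print(burst, spread)
```

Windowing keeps occasional mistyped passwords from raising alerts while still catching a rapid burst of failures against a single account.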

Infrastructure-level alerts are handled directly by GLBNXT's operational team. Where an alert has potential impact on your applications or your team's work, GLBNXT communicates through the incident notification channels agreed during onboarding.

Your team can configure application-level alerting for conditions relevant to your specific workloads through the platform console. If your organisation requires integration with an existing alerting or incident management platform, your GLBNXT contact can advise on available integration options.

What Your Team Can See

All observability data relevant to your environment and your applications is accessible to your team through the Monitoring and Observability area of the platform console. What individual users can see is governed by the role-based access controls configured for your environment. Developers typically have access to application and model metrics relevant to their work, while administrators have full visibility across infrastructure, audit, and security monitoring data.
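The role split described above can be pictured as a mapping from roles to the categories of observability data they may view. The role names and category labels below are hypothetical and loosely mirror the description in this section; the actual roles and permissions are whatever is configured for your environment.

```python
# Hypothetical role-to-visibility mapping; real roles are defined per environment.
ROLE_VISIBILITY = {
    "developer": {"application", "model"},
    "administrator": {"infrastructure", "application", "model", "audit", "security"},
}


def can_view(role: str, category: str) -> bool:
    """Return True if the role may view the given category; unknown roles see nothing."""
    return category in ROLE_VISIBILITY.get(role, set())


print(can_view("developer", "model"))      # developer viewing model metrics
print(can_view("developer", "audit"))      # developer viewing audit data
print(can_view("administrator", "audit"))  # administrator viewing audit data
```

Defaulting unknown roles to an empty set means a misconfigured or missing role results in no access rather than accidental full visibility.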

If your team needs additional observability capabilities or custom dashboards for specific workloads, discuss your requirements with your GLBNXT contact. The platform observability layer is designed to be extensible and can be configured to surface the data that matters most for your specific use cases.
