Blog

semver-utils: Streamlined semantic versioning from pipelines

Posted on March 30, 2025 by Derek Mortimer

At CECG, a recurring task is managing semantic version tags in Git repositories from automated pipelines. It’s straightforward in principle: just follow the Semantic Versioning rules! In practice, though, we often saw a range of implementations with varying degrees of testing. To address this task in a repeatable and reusable manner, we built semver-utils, which we’re now releasing as open source.

Multi-Node LLM Serving Using sig LWS and vLLM

Posted on March 24, 2025 by Jingkai He

This article provides a guide on how to serve large language models on multiple nodes running on Kubernetes. Challenges: Large Language Model (LLM) serving is a challenging task. Namely:

Deploying Local LLMs for Sentiment Analysis in Platform Engineering

Posted on March 19, 2025 by Senna Semakula

Introduction: Sentiment analysis is widely used in support and incident management, often relying on cloud-based AI services from OpenAI, AWS, or Google. However, self-hosting LLMs is becoming an attractive option for platform teams looking to reduce costs, keep data on-premises, and gain more control over model behaviour.

Exploring AIOps

Posted on March 19, 2025 by CECG

Our client’s exploration into AIOps, leveraging Grafana Cloud, marks an exciting new approach to IT operations management. This deeper dive into our journey with AIOps aims to demystify the process, share our learnings, and outline the concrete steps we’ve taken towards a more intelligent IT system.

Comparison: Kubeadmiral and Karmada

Posted on March 17, 2025 by Derek Mortimer

Kubeadmiral, Karmada, and Multi-cluster Federation Standards: A client recently asked us for a comparison of two multi-cluster Kubernetes management technologies, specifically Kubeadmiral and Karmada. In this post we’ll introduce these two technologies for multi-cluster workload orchestration, along with the Kubernetes standards that influenced their architecture. The focus will be entirely on managing the manifests inside existing Kubernetes clusters, specifically excluding discussion of standards such as the Cluster API, which are concerned with provisioning new clusters from scratch.

Evaluating Large Scale Solutions for Multi Tenant Metrics System

Posted on November 4, 2024 by Korhan Ozturk

In our work with a client, we encountered a challenge with their multi-tenant Kubernetes platform. The platform was designed to provide a flexible environment where each tenant could independently manage their own services and infrastructure. As part of this setup, tenants were encouraged to create and maintain their own monitoring stacks using Prometheus and Alertmanager.

Supporting private service access in GCP from a multi-tenanted Kubernetes platform

Posted on October 25, 2024 by Tiago Alves

Having a developer platform that works out of the box is great, but integrating it with other cloud provider resources outside the platform can be tricky:

Serverless Exodus to GKE Autopilot

Posted on September 13, 2024 by Jingkai He

Over the last year CECG has been working on an engagement within a client’s Advertising Technology division to deliver an Ad decision server solution. It comes with the following requirements:

Automated Landing Zones in GCP Organizations

Posted on August 12, 2024 by Derek Mortimer

What is a Landing Zone? As cloud usage increases across organizations and more teams deploy resources, it becomes increasingly important for platform operators to stay organized, so that they can ensure security best practices are being applied and attribute resources to their owners (e.g., for cost attribution, or to discover the responsible people or teams).

How We Execute Greenfield Projects

Posted on July 15, 2024 by Senna Semakula-Buuza

Planning and executing greenfield projects is no easy feat. It requires meticulous planning and flawless execution. Read on to see our strategy for achieving the utmost client satisfaction, with critical decisions agreed upon within 15 minutes.