Configure Portable Object Localisation in .NET 8 Web API

Localisation is an important feature when building apps that cater to users from different countries, allowing them to interact with our app in their native language. In this article, we will walk you through how to set up and configure Portable Object (PO) Localisation in an ASP.NET Core Web API project.

Localisation is about adapting an app for a specific culture or language: translating user-facing text and content into the target language and customising resources accordingly.

While .NET localisation normally uses resource files (.resx) to store localised texts for different cultures, Portable Object files (.po) are another popular choice, especially in apps that use open-source tools or frameworks.

About Portable Object (PO)

PO files are a standard format used for storing localised text. They are part of the gettext localisation framework, which is widely used across different programming ecosystems.

A PO file contains translations in the form of key-value pairs, where:

  • Key: The original text in the source language.
  • Value: The translated text in the target language.

Because PO files are simple, human-readable text files, they are easily accessible and editable by translators. This flexibility makes PO files a popular choice for many open-source projects and apps across various platforms.

You might wonder why we should use PO files instead of the traditional .resx files for localisation. Here are some advantages of PO files:

  • Unlike .resx files, PO files have built-in support for plural forms. This makes it much easier to handle situations where the translation changes based on the quantity, like “1 item” vs. “2 items” (see the sample plural entry after this list).
  • While .resx files require compilation, PO files are plain text files. Hence, we do not need any special tooling or complex build steps to use PO files.
  • PO files work great with collaborative translation tools. For those who are working with crowdsourcing translations, they will find that PO files are much easier to manage in these settings.
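For reference, here is a minimal sketch of what such a plural entry could look like in a PO file, following the standard gettext msgid_plural / msgstr[n] syntax (the French strings below are purely illustrative placeholders):

msgid "1 item"
msgid_plural "{0} items"
msgstr[0] "{0} article"
msgstr[1] "{0} articles"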

SHOW ME THE CODE!

The complete source code of this project can be found at https://github.com/goh-chunlin/Experiment.PO.

Project Setup

Let’s begin by creating a simple ASP.NET Web API project. We can start by generating a basic template with the following command.

dotnet new webapi

This will set up a minimal API with a weather forecast endpoint.

The default /weatherforecast endpoint generated by .NET Web API boilerplate.

The default endpoint in the boilerplate returns a JSON object that includes a summary field. This field describes the weather using terms like freezing, bracing, warm, or hot. Here’s the array of possible summary values:

var summaries = new[]
{
    "Freezing", "Bracing", "Chilly", "Cool",
    "Mild", "Warm", "Balmy", "Hot", "Sweltering", "Scorching"
};

As you can see, currently, it only supports English. To extend support for multiple languages, we will introduce localisation.

Prepare PO Files

Let’s start by adding a translation for the weather summary in Chinese. Below is a sample PO file that contains the Chinese translation for the weather summaries.

#: Weather summary (Chinese)
msgid "weather_Freezing"
msgstr "寒冷"

msgid "weather_Bracing"
msgstr "冷冽"

msgid "weather_Chilly"
msgstr "凉爽"

msgid "weather_Cool"
msgstr "清爽"

msgid "weather_Mild"
msgstr "温和"

msgid "weather_Warm"
msgstr "暖和"

msgid "weather_Balmy"
msgstr "温暖"

msgid "weather_Hot"
msgstr "炎热"

msgid "weather_Sweltering"
msgstr "闷热"

msgid "weather_Scorching"
msgstr "灼热"

In most cases, PO file names are tied to locales, as they represent translations for specific languages and regions. The naming convention typically includes both the language and the region, so the system can easily identify and use the correct file. For example, the PO file above should be named zh-CN.po, which represents the Chinese translation for the China region.

In some cases, if our app supports a language without being region-specific, we could have a PO file named only with the language, such as ms.po for Malay. This serves as a fallback for all Malay speakers, regardless of their region.

We have prepared three Malay PO files: one for Malaysia (ms-MY.po), one for Singapore (ms-SG.po), and one fallback file (ms.po) for all Malay speakers, regardless of region.
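For this demo, all of these PO files live together in a single folder (named Localisation, as we will see in the next step), so the layout looks something like this:

Localisation/
├── ms.po
├── ms-MY.po
├── ms-SG.po
└── zh-CN.po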

After that, since our PO files are placed in the Localisation folder, please do not forget to include them in the .csproj file, as shown below.

<Project Sdk="Microsoft.NET.Sdk.Web">

  ...

  <ItemGroup>
    <Folder Include="Localisation\" />
    <Content Include="Localisation\**">
      <CopyToOutputDirectory>PreserveNewest</CopyToOutputDirectory>
    </Content>
  </ItemGroup>

</Project>

Adding this <ItemGroup> ensures that the localisation files from the Localisation folder are included in our app output. This helps the application find and use the proper localisation resources when running.

Configure Localisation Option in .NET

In an ASP.NET Core Web API project, we have to install a NuGet package from Orchard Core called OrchardCore.Localization.Core (version 2.1.3).

Once the package is installed, we need to tell the application where to find the PO files. This is done by configuring the localisation options in the Program.cs file.

builder.Services.AddMemoryCache();
builder.Services.AddPortableObjectLocalization(options =>
    options.ResourcesPath = "Localisation");

The AddMemoryCache method is necessary here because Orchard Core's LocalizationManager uses the IMemoryCache service. This caching mechanism helps avoid repeatedly parsing and loading the PO files, improving performance by keeping the localised resources in memory.

Supported Cultures and Default Culture

Now, we need to configure how the application will select the appropriate culture for incoming requests.

In .NET, we need to specify which cultures our app supports. While .NET is capable of supporting multiple cultures out of the box, it still needs to know which specific cultures we are willing to support. By defining only the cultures we actually support, we can avoid unnecessary overhead and ensure that our app is optimised.

We have two separate things to manage when making an app available in different languages and regions in .NET:

  • SupportedCultures: This is about how the app displays numbers, dates, and currencies. For example, how a date is shown (like MM/dd/yyyy in the US);
  • SupportedUICultures: This is where we specify the languages our app supports for displaying text (the content inside the PO files).

To keep things consistent and handle both text translations and regional formatting properly, it is a good practice to configure both SupportedCultures and SupportedUICultures.

We also need to set up the DefaultRequestCulture. It is the fallback culture that our app uses when it does not have any explicit culture information from the request.

The following code shows how we configure all of these. To keep our demo simple, we assume the locale that the user wants is passed via a query string.

builder.Services.Configure<RequestLocalizationOptions>(options =>
{
    var supportedCultures = LocaleConstants.SupportedAppLocale
        .Select(cul => new CultureInfo(cul))
        .ToArray();

    options.DefaultRequestCulture = new RequestCulture(
        culture: "en", uiCulture: "en");
    options.SupportedCultures = supportedCultures;
    options.SupportedUICultures = supportedCultures;
    options.AddInitialRequestCultureProvider(
        new CustomRequestCultureProvider(async httpContext =>
        {
            var currentCulture = CultureInfo.InvariantCulture.Name;
            var requestUrlPath = httpContext.Request.Path.Value;

            if (httpContext.Request.Query.ContainsKey("locale"))
            {
                currentCulture = httpContext.Request.Query["locale"].ToString();
            }

            return await Task.FromResult(
                new ProviderCultureResult(currentCulture));
        })
    );
});
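Note that LocaleConstants.SupportedAppLocale above is assumed to be a simple constants class in the demo project that lists the locales we are willing to support; a minimal sketch of it could look like the following (the exact list of locales is up to your app).

public static class LocaleConstants
{
    // Locales that this demo is prepared to serve, including the regionless Malay fallback.
    public static readonly string[] SupportedAppLocale =
    {
        "en", "zh-CN", "ms", "ms-MY", "ms-SG"
    };
}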

Next, we need to add the RequestLocalizationMiddleware in Program.cs to automatically set culture information for requests based on information provided by the client.

app.UseRequestLocalization();

After setting up the RequestLocalizationMiddleware, we can now move on to localising the API endpoint by using IStringLocalizer to retrieve translated text based on the culture information set for the current request.

About IStringLocalizer

IStringLocalizer is a service in ASP.NET Core used for retrieving localised resources, such as strings, based on the current culture of our app. In essence, IStringLocalizer acts as a bridge between our code and the language resources (like PO files) that contain translations. If the localised value of a key is not found, then the indexer key is returned.

We first need to inject IStringLocalizer into our API controllers or any services where we want to retrieve localised text.

app.MapGet("/weatherforecast", (IStringLocalizer<WeatherForecast> stringLocalizer) =>
{
    var forecast = Enumerable.Range(1, 5).Select(index =>
        new WeatherForecast
        (
            DateOnly.FromDateTime(DateTime.Now.AddDays(index)),
            Random.Shared.Next(-20, 55),
            stringLocalizer["weather_" + summaries[Random.Shared.Next(summaries.Length)]]
        ))
        .ToArray();
    return forecast;
})
.WithName("GetWeatherForecast")
.WithOpenApi();

The reason we use IStringLocalizer<WeatherForecast> instead of just IStringLocalizer is that we are relying on the Orchard Core package to handle the PO files. According to Sébastien Ros, the Orchard Core maintainer, we cannot resolve IStringLocalizer directly; we need IStringLocalizer<T>. Using IStringLocalizer<T> instead of plain IStringLocalizer is also in line with how localisation is typically scoped in .NET applications.

Running on Localhost

Now, if we run the project using dotnet run, the Web API should compile successfully. Once the API is running on localhost, visiting the endpoint with zh-CN as the locale should return the weather summary in Chinese, as shown in the screenshot below.

The summary is getting the translated text from zh-CN.po now.
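For reference, the request is simply the boilerplate endpoint with the locale passed as a query string; the port below is only an example of what dotnet run may assign on your machine.

GET http://localhost:5000/weatherforecast?locale=zh-CN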

Dockerisation

Now that the Web API is verified to be working, we can proceed to dockerise it.

We will first create a Dockerfile as shown below to define the environment our Web API will run in. Then we will build the Docker image, using the Dockerfile. After building the image, we will run it in a container, making our Web API available for use.

## Build Container
FROM mcr.microsoft.com/dotnet/sdk:8.0-alpine AS builder
WORKDIR /app

# Copy the project file and restore any dependencies (use .csproj for the project name)
COPY *.csproj ./
RUN dotnet restore

# Copy the rest of the application code
COPY . .

# Publish the application
RUN dotnet publish -c Release -o out

## Runtime Container
FROM mcr.microsoft.com/dotnet/aspnet:8.0-alpine AS runtime

ENV ASPNETCORE_URLS=http://*:80

WORKDIR /app
COPY --from=builder /app/out ./

# Expose the port your application will run on
EXPOSE 80

ENTRYPOINT ["dotnet", "Experiment.PO.dll"]

As shown in the Dockerfile, we are using .NET Alpine images. Alpine is a lightweight Linux distribution often used in Docker images because it is much smaller than other base images. It is a best practice when we want a minimal image with fewer security vulnerabilities and faster performance.

Globalisation Invariant Mode in .NET

When we run our Web API as a Docker container on our local machine, we will soon realise that the container has stopped because the Web API inside it crashed with a System.Globalization.CultureNotFoundException.

Our Web API crashes due to System.Globalization.CultureNotFoundException, as shown in docker logs.

As pointed out in the error message, only the invariant culture is supported in globalization-invariant mode.

The globalization-invariant mode was introduced in .NET Core 2.0 in 2017. It allows our apps to run without using the full globalization data, which can significantly reduce the runtime size and improve the performance of our application, especially in environments like Docker or microservices.

In globalization-invariant mode, only the invariant culture is used. This culture is based on English (United States) but it is not specifically tied to en-US. It is just a neutral culture used to ensure consistent behaviour across environments.

Before .NET 6, globalization-invariant mode allowed us to create any custom culture, as long as its name conformed to the BCP-47 standard. BCP-47 stands for Best Current Practice 47, and it defines a way to represent language tags that include the language, region, and other relevant cultural data. A BCP-47 language tag typically follows the pattern language-region, for example en-US or zh-CN; it can also carry a script subtag, as in zh-Hans.

Thus, before .NET 6, if an app created a culture that was not the invariant culture, the operation still succeeded.

However, starting from .NET 6, an exception is thrown if we create any culture other than the invariant culture in globalization-invariant mode. This explains why our app throws System.Globalization.CultureNotFoundException.

We thus need to disable the globalization-invariant mode in the .csproj file, as shown below, so that we can use the full globalization data, which will allow .NET to properly handle localisation.

<Project Sdk="Microsoft.NET.Sdk.Web">

  <PropertyGroup>
    ...
    <InvariantGlobalization>false</InvariantGlobalization>
  </PropertyGroup>

  ...

</Project>

Missing ICU in Alpine

Since Alpine is a very minimal Linux distribution, it does not include many libraries, tools, or system components that are present in more standard distributions like Ubuntu.

In terms of globalisation, Alpine does not come pre-installed with ICU (International Components for Unicode), which .NET uses for localisation in our case.

Hence, after we turn off globalization-invariant mode, we encounter another issue: our Web API is not able to locate a valid ICU package.

Our Web API crashes due to the missing ICU package, as shown in docker logs.

As suggested in the error message, we need to install the ICU libraries (icu-libs).

In .NET, icu-libs provides the necessary ICU libraries that allow our Web API to handle globalisation. However, the ICU libraries rely on culture-specific data to function correctly. This culture-specific data is provided by icu-data-full, which includes the full set of localisation and globalisation data for different languages and regions. Therefore, we need to install both icu-libs and icu-data-full, as shown below.

...

## Runtime Container
FROM mcr.microsoft.com/dotnet/aspnet:8.0-alpine AS runtime

# Install cultures
RUN apk add --no-cache \
    icu-data-full \
    icu-libs

...

After installing the ICU libraries, our weather forecast Web API container should be running successfully now. Now, when we visit the endpoint, we will realise that it is able to retrieve the correct value from the PO files, as shown in the following screenshot.

Yay, we can get the translated texts now!

One last thing I would like to share is that, as shown in the screenshot above, since we do not have a PO file for ms-BN (Malay for Brunei), the fallback mechanism automatically uses the ms.po file instead.

Additional Configuration

If you still cannot get the translation with PO files to work, perhaps you can try out some of the suggestions from my teammates below.

Firstly, you may need to set up the AppLocalIcu property in the .csproj file. This setting specifies whether the app should use a local copy of ICU or rely on the system-installed ICU libraries, which is particularly useful in containerised environments like Docker.

<Project Sdk="Microsoft.NET.Sdk.Web">

  <PropertyGroup>
    ...
    <AppLocalIcu>true</AppLocalIcu>
  </PropertyGroup>

</Project>

Secondly, even though we have installed icu-libs and icu-data-full in our Alpine container, some .NET apps rely on data beyond just having the libraries available. In such cases, we need to turn on the IncludeNativeLibrariesForSelfExtract setting in the .csproj as well.

<Project Sdk="Microsoft.NET.Sdk.Web">

  <PropertyGroup>
    ...
    <IncludeNativeLibrariesForSelfExtract>true</IncludeNativeLibrariesForSelfExtract>
  </PropertyGroup>

</Project>

Thirdly, please check if you need to configure DOTNET_SYSTEM_GLOBALIZATION_PREDEFINED_CULTURES_ONLY as well. However, please take note that this setting only makes sense when globalization-invariant mode is enabled.
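If you do decide to configure it, the setting can be supplied as an environment variable; below is a minimal sketch of how it could be added to the runtime stage of our Dockerfile (the value false is only an illustration, adjust it to your scenario).

## Runtime Container (excerpt)
ENV DOTNET_SYSTEM_GLOBALIZATION_PREDEFINED_CULTURES_ONLY=false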

Finally, you may also need to include the runtime ICU libraries with the Microsoft.ICU.ICU4C.Runtime NuGet package (Version 72.1.0.3), enabling your app to use culture-specific data for globalisation features.
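Installing that package is a normal NuGet reference; a minimal sketch of the corresponding .csproj entry, using the version mentioned above, is shown below.

<ItemGroup>
  <PackageReference Include="Microsoft.ICU.ICU4C.Runtime" Version="72.1.0.3" />
</ItemGroup>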

From Zero to Gemini: Building an AI-Powered Game Helper

On a chilly November morning, I attended the Google DevFest 2024 in Singapore. Together with my friends, we attended a workshop titled “Gemini Masterclass: How to Unlock Its Power with Prompting, Functions, and Agents.” The session was led by two incredible speakers, Martin Andrews and Sam Witteveen.

Martin holds a PhD in Machine Learning and has been an Open Source advocate since 1999. Sam is a Google Developer Expert in Machine Learning. Both of them are also organisers of the Machine Learning Singapore Meetup group. Together, they delivered an engaging and hands-on workshop about Gemini, the advanced LLM from Google.

Thanks to their engaging Gemini Masterclass, I have taken my first steps into the world of LLMs. This blog post captures what I learned and my journey into the fascinating world of Gemini.

Martin Andrews presenting in Google DevFest 2024 in Singapore.

About LLM and Gemini

LLM stands for Large Language Model. To most people, an LLM is like a smart friend who can answer almost all our questions with responses that are often accurate and helpful.

As an LLM, Gemini is trained on a large amount of text data and can perform a wide range of tasks: answering questions, writing stories, summarising long documents, or even helping to debug code. What makes LLMs special is their ability to “understand” and generate language in a way that feels natural to us.

Many of my developer friends have started using Gemini as a coding assistant in their IDEs. While it is good at that, Gemini is much more than just a coding tool.

Gemini is designed to not only respond to prompts but also act as an assistant with an extra set of tools. To make the most of Gemini, it is important to understand how it works and what it can (and cannot) do. With the knowledge gained from the DevFest workshop, I decided to explore how Gemini could assist with optimising relic choices in a game called Honkai: Star Rail.

Honkai: Star Rail and Gemini for Its Relic Recommendations

Honkai: Star Rail (HSR) is a popular RPG that has captured the attention of players worldwide. One of the key features of the game is its relic system, where players equip their characters with relics like hats, gloves, or boots to boost stats and unlock special abilities. Each relic has unique attributes, and selecting the right sets of relics for a character can make a huge difference in gameplay.

An HSR streamer, MurderofBirds, browsing through thousands of relics. (Image Source: MurderofBirds Twitch)

As a casual player, I often found myself overwhelmed by the number of options and the subtle synergies between different relic sets. Finding a good relic combination for each character was time-consuming.

This is where LLMs like Gemini come into play. With the ability to process and analyse complex data, Gemini can help players make smarter decisions.

In this blog post, I will briefly show how this Gemini-powered relic recommendation system can analyse a player’s current characters to suggest the best options for them. It will also explain the logic behind its recommendations, helping us understand why certain relics are ideal.

Setup the Project

To make my project code available to everyone, I used Google Colab, a hosted Jupyter Notebook service that requires no setup to use and provides free access to computing resources, including GPUs and TPUs. You can access my code by clicking on the button below.

Open In Colab

In my project, I used the google-generativeai Python library, which is pre-installed in Colab. This library serves as a user-friendly API for interacting with Google LLMs, including Gemini. It makes it easy for us to integrate Gemini capabilities directly into our code.

Next, we will need to import the necessary libraries.

Importing the libraries and setting up the Gemini client.

The first library to import is, of course, google.generativeai. Without it, we cannot interact with Gemini easily. Then we have google.colab.userdata, which securely retrieves sensitive data, like our API key, directly from the Colab notebook environment.

We will also use IPython.display for displaying results in a readable format, such as Markdown.

In the Colab Secrets section, we will have two records:

  • HONKAI_STAR_RAIL_PLAYER_ID: Your HSR player UID. It is used later to personalise relic recommendations.
  • GOOGLE_API_KEY: The API key that we can get from Google AI Studio to authenticate with Gemini.
Creating and retrieving our API keys in Google AI Studio.

Once we have initialised the google.generativeai library with the GOOGLE_API_KEY, we can proceed to specify the Gemini model we will be using.

The choice of model is crucial in LLM projects. Google AI Studio offers several options, each representing a trade-off between accuracy and cost. For my project, I chose models/gemini-1.5-flash-8b-001, which provided a good balance for this experiment. Larger models might offer slightly better accuracy but at a significant cost increase.

Google AI Studio offers a range of models, from smaller, faster models suitable for quick tasks to larger, more powerful models capable of more complex processing.

Hallucination and Knowledge Limitation

We often think of LLMs like Gemini as our smart friends who can answer any question. But just like even our smartest friend can sometimes make mistakes, LLMs have their limits too.

Gemini’s knowledge is based on the data it was trained on, which means it does not actually know everything. Sometimes, it might hallucinate, i.e. the model invents information that sounds plausible but is not actually true.

Kiana is not a character from Honkai: Star Rail but she is from another game called Honkai Impact 3rd.

While Gemini is trained on a massive dataset, its knowledge is not unlimited. As a responsible AI, it acknowledges its limitations. So, when it cannot find the answer, it will tell us that it lacks the necessary information rather than fabricating a response. This is how Google builds safer AI systems, as part of its Secure AI Framework (SAIF).

Knowledge cutoff in action.

To overcome these constraints, we need to employ strategies to augment the capabilities of LLMs. Techniques such as integrating Retrieval-Augmented Generation (RAG) and leveraging external APIs can help bridge the gap between what the model knows and what it needs to know to perform effectively.

System Instructions

Leveraging system instructions is a way to improve the accuracy and reliability of Gemini’s responses.

System instructions are prompts given before the main query in order to guide Gemini. These instructions provide crucial context and constraints, significantly enhancing the accuracy and reliability of the generated output.

System Instruction with contextual information about HSR characters ensures Gemini has the necessary background knowledge.

The specific design and phrasing of the system instructions provided to Gemini is crucial. Effective system instructions give Gemini the necessary context and constraints to generate accurate and relevant responses. Without carefully crafted system instructions, even the most well-designed prompt can yield poor results.

Context Framing

As we can see from the example above, writing clear and effective system instructions requires careful thought and a lot of testing.

This is just one part of a much bigger picture called Context Framing, which includes preparing data, creating embeddings, and deciding how the system retrieves and uses that data. Each of these steps needs expertise and planning to make sure the solution works well in real-world scenarios.

You might have heard the term “Prompt Engineering”, and it sounds kind of technical, but it is really about figuring out how to ask the LLM the right questions in the right way to get the best answers.

While context framing and prompt engineering are closely related and often overlap, they emphasise different aspects of the interaction with the LLM.

Stochasticity

While experimenting with Gemini, I noticed that even if I use the exact same prompt, the output can vary slightly each time. This happens because LLMs like Gemini have a built-in element of randomness, known as stochasticity.

Lingsha, an HSR character released in 2024. (Image Credit: Game8)

For example, when querying for DPS characters, Lingsha was inconsistently included in the results. While this might seem like a minor variation, it underscores the probabilistic nature of LLM outputs and suggests that running multiple queries might be needed to obtain a more reliable consensus.

Lingsha was inconsistently included in the response to the query about multi-target DPS characters.
According to the official announcement, even though Lingsha is a healer, she can cause significant damage to all enemies too. (Image Source: Honkai: Star Rail YouTube)

Hence, it is important to treat writing effective system instructions and prompts as an iterative process, so that we can experiment with different phrasings to find what works best and yields the most consistent results.

Temperature Tuning

We can also reduce the stochasticity of Gemini’s responses by adjusting parameters like temperature. Lower temperatures typically reduce randomness, leading to more consistent outputs, but may also reduce creativity and diversity.

Temperature is an important parameter for balancing predictability and diversity in the output. Temperature, a number in the range 0.0 to 2.0 with a default of 1.0 in the gemini-1.5-flash model, controls how the model samples from its probability distribution over the vocabulary when generating text. A lower temperature makes the model more likely to select words with higher probabilities, resulting in more predictable and focused text.

Having Temperature=0 means that the model will always select the most likely word at each step. The output will be highly deterministic and repetitive.

Function Calls

A major limitation of using system instructions alone is their static nature.

For example, my initial system instructions included a list of HSR characters, but this list is static. The list does not include newly released characters or characters specific to the player’s account. In order to dynamically access a player’s character database and provide personalised recommendations, I integrated Function Calls to retrieve real-time data.

For fetching the player’s HSR character data, I leveraged the open-source Python library mihomo. This library provides an interface for accessing game data, enabling dynamic retrieval of a player’s characters and their attributes. This dynamic data retrieval is crucial for generating truly personalised relic recommendations.

Using the mihomo library, I retrieve five of my Starfaring Companions.

Defining the functions in my Python code was only the first step. To use function calls, Gemini needed to know which functions were available. We can provide this information to Gemini as shown below.

model = genai.GenerativeModel('models/gemini-1.5-flash-8b-001', tools=[get_player_name, get_player_starfaring_companions])

After we pass a query to Gemini, the model returns a structured object that includes the names of relevant functions and their arguments based on the prompt, as shown in the screenshot below.

The correct function call is picked up by Gemini based on the prompt.

Using descriptive function names is essential for successful function calling with LLMs because the accuracy of function calls depends heavily on well-designed function names in our Python code. Inaccurate naming can directly impact the reliability of the entire system.

If our Python function is named incorrectly, for example, a function named get_age that actually returns the name of the person, Gemini might select that function wrongly when the prompt is asking for an age.

As shown in the screenshot above, the prompt requested information about all the characters of the player. Gemini simply determines which function to call and provides the necessary arguments. Gemini does not directly execute the functions. The actual execution of the function needs to be handled by us, as demonstrated in the screenshot below.

After Gemini tells us which function to call, our code needs to call the function to get the result.

Grounding with Google Search

Function calls are a powerful way to access external data, but they require pre-defined functions and APIs.

To go beyond these limits and gather information from many online sources, we can use Gemini's grounding feature with Google Search. This feature allows Gemini to perform Google searches and include what it finds in its answers. This makes it easier to get up-to-date information and handle questions that need real-time data.

If you are getting HTTP 429 errors when using the Google Search feature, please make sure you have set up a billing account with enough quota.

With this feature enabled, we thus can ask Gemini to get some real-time data from the Internet, as shown below.

The upcoming v2.7 patch of HSR is indeed scheduled to be released on 4th December.

Building a Semantic Knowledge Base with Pinecone

System instructions and Google search grounding provide valuable context, but a structured knowledge base is needed to handle the extensive data about HSR relics.

We need a way to store and quickly retrieve this information, enabling the system to generate timely and accurate relic recommendations. For this, we will use a vector database, which is ideally suited for managing the vast dataset of relic information.

Vector databases, unlike traditional databases that rely on keyword matching, store information as vectors, enabling efficient similarity searches. This allows relevant relic sets to be retrieved based on the semantic meaning of a query, rather than relying solely on keywords.

There are many vector database options, but I chose Pinecone. As a managed service, Pinecone offers the scalability needed to handle the HSR relic dataset and a robust API essential for reliable data access. The availability of a free tier is also a significant factor because it allows me to keep costs low during the development of my project.

API keys in Pinecone dashboard.

Pinecone’s well-documented API and straightforward SDK make integration surprisingly easy. To get started, simply follow the Pinecone documentation to install the SDK in our code and retrieve the API key.

# Import the Pinecone library
from pinecone.grpc import PineconeGRPC as Pinecone
from pinecone import ServerlessSpec
import time

# Initialize a Pinecone client with your API key
pc = Pinecone(api_key=userdata.get('PINECONE_API_KEY'))

Next, I prepared my Honkai: Star Rail relic data, which I had previously organised into a JSON structure. This data includes information on each relic set’s two-piece and four-piece effects. Here’s a snippet to illustrate the format:

[
  {
    "name": "Sacerdos' Relived Ordeal",
    "two_piece": "Increases SPD by 6%",
    "four_piece": "When using Skill or Ultimate on one ally target, increases the ability-using target's CRIT DMG by 18%, lasting for 2 turn(s). This effect can stack up to 2 time(s)."
  },
  {
    "name": "Scholar Lost in Erudition",
    "two_piece": "Increases CRIT Rate by 8%",
    "four_piece": "Increases DMG dealt by Ultimate and Skill by 20%. After using Ultimate, additionally increases the DMG dealt by the next Skill by 25%."
  },
  ...
]

With the relic data organised, the next challenge is to enable similarity searches with vector embeddings. A vector embedding captures the semantic meaning of the text, allowing Pinecone to identify similar relic sets based on their inherent properties and characteristics.

Vector embedding representations (Image Credit: Pinecone)

Now, we can generate vector embeddings for the HSR relic data using Pinecone. The following code snippet illustrates this process which is to convert textual descriptions of relic sets into numerical vector embeddings. These embeddings capture the semantic meaning of the relic set descriptions, enabling efficient similarity searches later.

# Load relic set data from the JSON file
with open('/content/hsr-relics.json', 'r') as f:
    relic_data = json.load(f)

# Prepare data for Pinecone
relic_info_data = [
    {"id": relic['name'], "text": relic['two_piece'] + " " + relic['four_piece']}  # Combine relic set descriptions
    for relic in relic_data
]

# Generate embeddings using Pinecone
embeddings = pc.inference.embed(
    model="multilingual-e5-large",
    inputs=[d['text'] for d in relic_info_data],
    parameters={"input_type": "passage", "truncate": "END"}
)

print(embeddings)

As shown in the code above, we use the multilingual-e5-large model, a text embedding model from Microsoft Research, to generate a vector embedding for each relic set. The multilingual-e5-large model works well on messy data and is good for short queries.

Pinecone’s ability to perform fast similarity searches relies on its indexing mechanism. Without an index, searching for similar relic sets would require comparing each relic set’s embedding vector to every other one, which would be extremely slow, especially with a large dataset. I chose a Pinecone serverless index hosted on AWS for its automatic scaling and reduced infrastructure management.

# Create a serverless index
index_name = "hsr-relics-index"

if not pc.has_index(index_name):
    pc.create_index(
        name=index_name,
        dimension=1024,
        metric="cosine",
        spec=ServerlessSpec(
            cloud='aws',
            region='us-east-1'
        )
    )

# Wait for the index to be ready
while not pc.describe_index(index_name).status['ready']:
    time.sleep(1)

The dimension parameter specifies the dimensionality of the vector embeddings. Higher dimensionality generally allows for capturing more nuanced relationships between data points. For example, two relic sets might both increase ATK, but one might also increase SPD while the other increases Crit DMG. A higher-dimensional embedding allows the system to capture these subtle distinctions, leading to more relevant recommendations.

For the metric parameter, which defines how the similarity between two vectors (representing relic sets) is measured, we use the cosine metric, which is well suited to vector embeddings generated from text. This is crucial for understanding how similar two relic descriptions are.

With the vector embeddings generated, the next step was to upload them into my Pinecone index. Pinecone uses the upsert function to add or update vectors in the index. The following code snippet shows how we can upsert the generated embeddings into the Pinecone index.

# Target the index where you'll store the vector embeddings
index = pc.Index("hsr-relics-index")

# Prepare the records for upsert
# Each contains an 'id', the embedding 'values', and the original text as 'metadata'
records = []
for r, e in zip(relic_info_data, embeddings):
    records.append({
        "id": r['id'],
        "values": e['values'],
        "metadata": {'text': r['text']}
    })

# Upsert the records into the index
index.upsert(
    vectors=records,
    namespace="hsr-relics-namespace"
)

The code uses the zip function to iterate through both the list of prepared relic data and the list of generated embeddings simultaneously. For each pair, it creates a record for Pinecone with the following attributes.

  • id: Name of the relic set to ensure uniqueness;
  • values: The vector representing the semantic meaning of the relic set effects;
  • metadata: The original description of the relic effects, which will be used later for providing context to the user’s recommendations. 

Implementing Similarity Search in Pinecone

With the relic data stored in Pinecone now, we can proceed to implement the similarity search functionality.

def query_pinecone(query: str) -> dict:

    # Convert the query into a numerical vector that Pinecone can search with
    query_embedding = pc.inference.embed(
        model="multilingual-e5-large",
        inputs=[query],
        parameters={
            "input_type": "query"
        }
    )

    # Search the index for the three most similar vectors
    results = index.query(
        namespace="hsr-relics-namespace",
        vector=query_embedding[0].values,
        top_k=3,
        include_values=False,
        include_metadata=True
    )

    return results

The function above takes a user’s query as input, converts it into a vector embedding using Pinecone’s inference endpoint, and then uses that embedding to search the index, returning the top three most similar relic sets along with their metadata.

Relic Recommendations with Pinecone and Gemini

With the Pinecone integration in place, we design the initial prompt to pick relevant relic sets from Pinecone. After that, we take the results from Pinecone and combine them with the initial prompt to create a richer, more informative prompt for Gemini, as shown in the following code.

from google.generativeai.generative_models import GenerativeModel

async def format_pinecone_results_for_prompt(model: GenerativeModel, player_id: int) -> dict:
    character_relics_mapping = await get_player_character_relic_mapping(player_id)

    result = {}

    for character_name, (character_avatar_image_url, character_description) in character_relics_mapping.items():
        print(f"Processing Character: {character_name}")

        additional_character_data = character_profile.get(character_name, "")

        character_query = f"Suggest some good relic sets for this character: {character_description} {additional_character_data}"

        pinecone_response = query_pinecone(character_query)

        prompt = f"User Query: {character_query}\n\nRelevant Relic Sets:\n"
        for match in pinecone_response['matches']:
            prompt += f"* {match['id']}: {match['metadata']['text']}\n"  # Extract relevant data
        prompt += "\nBased on the above information, recommend two best relic sets and explain your reasoning. Each character can only equip with either one 4-piece relic or one 2-piece relic with another 2-piece relic. You cannot recommend a combination of 4-piece and 2-piece together. Consider the user's query and the characteristics of each relic set."

        response = model.generate_content(prompt)

        result[character_avatar_image_url] = response.text

    return result

The code shows that we are doing both prompt engineering (designing the initial query to get relevant relics) and context framing (combining the initial query with the retrieved relic information to get a better overall recommendation from Gemini).

First, the code retrieves data about the player’s characters, including their descriptions, images, and the relics the characters are currently wearing. The code then gathers potentially relevant data about each character from a separate data source, character_profile, which has more information, such as gameplay mechanics that we got from the Game8 Character List. With this character data, the query then finds similar relic sets in the Pinecone database.

After Pinecone returns matches, the code constructs a detailed prompt for the Gemini model. This prompt includes the character’s description, relevant relic sets found by Pinecone, and crucial instructions for the model. The instructions emphasise the constraints of choosing relic sets: either a 4-piece set, or two 2-piece sets, not a mix. Importantly, it also tells Gemini to consider the character’s existing profile and to prioritise fitting relic sets.

Finally, the code sends this detailed prompt to Gemini, receiving back the recommended relic sets.

Knight of Purity Palace, is indeed a great option for Gepard!
Enviosity, a popular YouTuber known for his in-depth Honkai: Star Rail strategy guides, introduced Knight of Purity Palace for Gepard too. (Source: Enviosity YouTube)

Langtrace

Using LLMs like Gemini is certainly exciting, but figuring out what is happening “under the hood” can be tricky.

If you are a web developer, you are probably familiar with Grafana dashboards. They show you how your web app is performing, highlighting areas that need improvement.

Langtrace is like Grafana, but specifically for LLMs. It gives us a similar visual overview, tracking our LLM calls, showing us where they are slow or failing, and helping us optimise the performance of our AI app.

Traces for the Gemini calls are displayed individually.

Langtrace is not only useful for tracing our LLM calls; it also offers metrics on token counts and costs, as shown in the following screenshot.

Beyond tracing calls, Langtrace collects metrics too.

Wrap-Up

Building this Honkai: Star Rail (HSR) relic recommendation system is a rewarding journey into the world of Gemini and LLMs.

I am incredibly grateful to Martin Andrews and Sam Witteveen for their inspiring Gemini Masterclass at Google DevFest in Singapore. Their guidance helped me navigate the complexities of LLM development, and I learned firsthand the importance of careful prompt engineering, the power of system instructions, and the need for dynamic data access through function calls. These lessons underscore the complexities of developing robust LLM apps and will undoubtedly inform my future AI projects.

Building this project is an enjoyable journey of learning and discovery. I encountered many challenges along the way, but overcoming them deepened my understanding of Gemini. If you’re interested in exploring the code and learning from my experiences, you can access my Colab notebook through the button below. I welcome any feedback you might have!

Open In Colab

[KOSD] Change of FromQuery Model Binding from .NET 6 to .NET 8

Recently, while migrating our project from .NET 6 to .NET 8, my teammate Jeremy Chan uncovered an undocumented change in model binding behaviour that appears to have been introduced in .NET 7. This change is not clearly explained in the official .NET documentation, so it is something developers can easily overlook.

To illustrate the issue, let’s begin with a simple Web API project and explore a straightforward controller method that highlights the change.

[ApiController]
[Route("foo")]
public class FooController : ControllerBase
{
    [HttpGet]
    public IActionResult Get([FromQuery] string value = "Hello")
    {
        Console.WriteLine($"Value is {value}");

        return new JsonResult(null) { StatusCode = StatusCodes.Status200OK };
    }
}

Then we assume that we have nullable enabled in both .NET 6 and .NET 8 projects.

<Project Sdk="Microsoft.NET.Sdk.Web">

  <PropertyGroup>
    <Nullable>enable</Nullable>
    ...
  </PropertyGroup>

  ...

</Project>

Situation in .NET 6

In .NET 6, when we call the endpoint with /foo?value=, we shall receive the following error.

{
  "type": "https://tools.ietf.org/html/rfc7231#section-6.5.1",
  "title": "One or more validation errors occurred.",
  "status": 400,
  "traceId": "00-5bc66c755994b2bba7c9d2337c1e5bc4-e116fa61d942199b-00",
  "errors": {
    "value": [
      "The value field is required."
    ]
  }
}

However, if we change the method to be as follows, the error will no longer occur.

public IActionResult Get([FromQuery] string? value)
{
    if (value is null)
        Console.WriteLine("Value is null!!!");
    else
        Console.WriteLine($"Value is {value}");

    return new JsonResult(null) { StatusCode = StatusCodes.Status200OK };
}

The log when calling the endpoint with /foo?value= will then be “Value is null!!!”.

Hence, we know that a query string parameter without a value is interpreted as null. That is why there is a validation error when value is not nullable.

Thus, in order to make the endpoint work in .NET 6, we need to change the signature as follows to make value optional, so that value is no longer treated as a required field.

public IActionResult Get([FromQuery] string? value = "Hello")

Now, if we call the endpoint with /foo?value=, we shall see the log “Value is Hello” printed.

Situation in .NET 8 (and .NET 7)

Then how about .NET 8 with the same original setup, as shown below?

public IActionResult Get([FromQuery] string value = "Hello")

In .NET 8, when we call the endpoint with /foo?value=, we shall see the log “Value is Hello” printed.

So, what is happening here?

In .NET 7, a new interface, IParsable<TSelf>, was introduced. Starting from .NET 7, the IParsable<TSelf>.TryParse API is used for binding controller action parameter values.

Initial research shows that, under the hood, from .NET 7 onwards, a new model binding implementation is used, and this is what causes the change in behaviour.
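To give a feel for the shape of this interface, below is a minimal sketch of a hypothetical Temperature type implementing IParsable<TSelf>; the static TryParse method is what the newer binding infrastructure can call when such a type appears as an action parameter.

// Hypothetical example type for illustration only.
public readonly struct Temperature : IParsable<Temperature>
{
    public double Celsius { get; }

    public Temperature(double celsius) => Celsius = celsius;

    public static Temperature Parse(string s, IFormatProvider? provider)
        => new Temperature(double.Parse(s, provider));

    public static bool TryParse(string? s, IFormatProvider? provider, out Temperature result)
    {
        if (double.TryParse(s, provider, out var value))
        {
            result = new Temperature(value);
            return true;
        }

        result = default;
        return false;
    }
}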

KOSD, or Kopi-O Siew Dai, is a type of Singapore coffee that I enjoy. It is basically a cup of coffee with a little bit of sugar. This series is meant to blog about technical knowledge that I gained while having a small cup of Kopi-O Siew Dai.

From Legacy to .NET 8: Migrating with NDepend

Quick note: I received a free license for NDepend to try it out and share my experience. All opinions in this blog post are my own.

From O2DES.Net to Ea

In 2019, I had the honour of working closely with the team behind O2DES.NET during my time at the C4NGP research centre in NUS, where I spent around two and a half years. Since I left the team in 2022, O2DES.NET has not been actively updated on its public GitHub repository, and it still targets .NET Standard 2.1.

While .NET Standard 2.1 is not as old as the .NET Framework, it is considered somewhat outdated compared to the latest .NET versions. As explained in the article “The Future of .NET Standard” written by Immo Landwerth, .NET Standard has been largely superseded by .NET 5 (and later versions), which unify the platforms into a single runtime. Hence, moving to .NET 8 is a forward-looking decision that aligns with current and future software development trends.

Immo Landwerth, program manager on the .NET Framework team at Microsoft, talked about .NET Standard 2.0 back in 2016. (Image Credit: dotnet – YouTube Channel)

Hence, in this article, I will walk you through the process of migrating O2DES.NET from targeting .NET Standard 2.1 to supporting .NET 8. To prevent any confusion, I’ve renamed the project to ‘Ea’ because I am no longer the active developer of O2DES.NET. Throughout this article, ‘Ea’ will refer to the version of the project updated to .NET 8.

In this migration journey, I will be relying on NDepend, a static code analysis tool for .NET developers.

Show Me the Code!

The complete source code of my project after migrating O2DES.NET to target .NET 8 can be found on GitHub at https://github.com/gcl-team/Ea.

About NDepend: Why Do We Need a Static Code Analysis?

Why do we need NDepend, a static code analysis tool?

Static code analysis is a way of automatically checking our code for potential issues without actually running our apps. Think of it like a spell-checker, but for programming, scanning our codebase to find bugs, performance issues, and security vulnerabilities early in the development process.

During the migration of an older library, such as moving O2DES.NET from .NET Standard 2.1 to .NET 8, the challenges can add up. We can expect to run into outdated code patterns, performance bottlenecks, or even compatibility issues.

The O2DES.NET on GitHub has some of its NuGet references outdated too.

NDepend is designed to help with this by performing a deep static analysis of the entire codebase. It gives us detailed reports on code quality, shows where our dependencies are, and highlights areas that need attention. We can then focus on modernising the code with confidence, knowing that we are unlikely to introduce new bugs or performance issues as we update the codebase.

NDepend also helps enforce good coding practices by pointing out issues like overly complex methods, dead code, or potential security vulnerabilities. With features like code metrics, dependency maps, and rule enforcement, it acts as a guide to help us write better, more maintainable code.

Bringing Down Debt from 6.22% to 0.35%

One of the standout features of NDepend is its comprehensive dashboard, which I heavily rely on to get an overview of the entire O2DES.NET codebase.

Right after targeting the O2DES.NET library to .NET 8, a lot of issues surfaced.

From code quality metrics to technical debt, the dashboard presents critical insights in a visual and easy-to-understand format. Having all this information in one place is indeed invaluable to us during the migration project.

To help us better understand how much effort is needed to fix or improve the codebase, NDepend uses the Debt Ratio and Debt Rating, both of which are part of the SQALE method.

We can configure the SQALE Debt Ratio and Debt Rating.

In the book The SQALE Method for Managing Technical Debt, written by Jean-Louis Letouzey, SQALE stands for Software Quality Assessment based on Lifecycle Expectations. SQALE is a method used to assess and manage technical debt in software projects. In the context of NDepend, the SQALE method is used to calculate the Debt Ratio and Debt Rating:

  • Debt Ratio: The percentage of effort needed to fix the technical debt compared to rewriting the code from scratch (see the worked example after this list).
  • Debt Rating: A letter-based rating (A to E) derived from the Debt Ratio to give a quick overview of the severity of technical debt.
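As a quick illustrative calculation (the numbers here are made up): if NDepend estimates 12 man-days of technical debt in a codebase that it estimates would take 400 man-days to develop from scratch, the Debt Ratio is 12 / 400 = 3%, and the Debt Rating is whichever letter that percentage maps to under the configured thresholds.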

As shown in one of the earlier screenshots, Ea has a Debt Ratio of 6.22% and a B rating. This means that its technical debt is considered moderate and manageable. Nevertheless, it is a signal that we should start addressing the identified issues before they accumulate.

After just two weeks of code cleanup, we successfully reduced Ea’s Debt Ratio from 6.22% to an impressive 0.35%, elevating its rating to an A. This significant improvement not only enhances the overall quality of the codebase but also positions Ea for better maintainability.

The most recent analysis shows that the Debt Ratio of Ea is down to just 0.35%.

Issues and Trends

In Visual Studio, NDepend also provides an interactive UI which indicates the number of critical rules violated and critical issues to solve. Unlike most static code analysis tools, which show an overwhelming number of issues, NDepend has the concept of a baseline.

When we first set up an NDepend project, the very first analysis of our code becomes the “baseline”. This baseline serves as a starting point, capturing the current state of our code. As we continue to work on the project, future analyses will be compared against this baseline. The idea is to track how our code changes over time so that we can see whether we are improving the codebase or introducing more issues as we change it.

At some point during the code change, we fixed 31 “High” issues (shown in green) while introducing 42 new “High” issues (shown in red).

As shown in the screenshot above, those new issues added since the baseline need to be our priority to fix. This is to make sure the newly written code and refactored code will remain clean.

In fact, when fixing the issues, I get to learn from the NDepend rules. When we click on the numbers, we are shown the corresponding issues, and clicking on each issue shows us detailed information about it. For example, as shown in the screenshot below, when we click on one of the green numbers, it shows us a list of issues that have been fixed by us.

As indicated, the issue is one which has been fixed since the baseline.

When we click on the red numbers, as shown in the following screenshot, we get to see the new issues that we need to fix. The following example shows how the original O2DES.NET has some methods declared with unnecessarily high visibility.

This is an issue that has been newly added since the baseline.

By default, the dashboard also comes with some helpful trend charts. These charts give us a visual overview of how our codebase is evolving over time.

We have made significant progress in Ea library development over the past half month.

For those new to static code analysis, think of these charts as the “health check” of the project. During the migration, they help us track important metrics, like code coverage, issues, or technical debt, and show how they change with each analysis.

Code Dependency Graphs

NDepend offers a Dependency Graph. It is used to visually represent the relationships between different components such as namespaces and classes within our codebase. The graph helps us understand how tightly coupled our code is and how different parts of our codebase depend on each other.

When refactoring Ea during the migration, we depend on the Dependency Graph to visually show us how the different parts of the codebase are connected. We use the insight it provides to plan how to split components, which in turn makes the code easier to manage.

A dependency diagram made of all classes in the Ea project.

As shown in the diagram above, we can see a graph of entangled classes connected with red bi-directional arrows. This is because, in the original O2DES.NET library, some classes have circular dependencies. This makes parts of the code heavily reliant on each other, reducing modularity and making it harder to unit test the code independently.

To further investigate the classes, we can double click the edge between those two classes. Doing so will generate a graph made of methods and fields involved in the dependency between the two classes, as shown in the screenshot below.

The coupling graph between two classes.

This coupling graph is a powerful tool for us as it offers detailed insights into how the two classes interact. This level of detail allows us to focus on the exact code causing the coupling, making it easier to assess whether the dependency is necessary or can be refactored. For instance, if multiple methods are too intertwined, it might be time to extract common logic into a new class or interface.

In addition, the Dependency Matrix is another way to visualise the dependencies between namespaces, classes, or methods. A number in a cell at the intersection of two elements indicates how many times the element in the row depends on the element in the column. This gives us an overview of the dependencies within our codebase.

The Dependency Matrix.

From the Dependency Matrix above, we should first look for cells with large numbers, because a large number indicates that the two elements are highly dependent on each other. We should review those methods to understand why there is so much interaction and to make sure they are not tightly coupled.

If there is a cycle in the codebase, there will be a red square shown on the Dependency Matrix. We then can refactor by breaking the cycle, possibly by introducing new interfaces or decoupling responsibilities between the methods.

Code Metrics View

In the Code Metrics View, each rectangle represents a method. The area of a rectangle is proportional to a metric, such as the number of lines of code (LOC) or the cyclomatic complexity (CC), of the corresponding method, field, type, namespace, or assembly.

This treemap shows the # lines of code (LOC) of the methods in our project.

During the migration, the tree view format enables us to navigate our codebase and prioritise areas that require refactoring by spotting those methods that are too big and too complex. In addition, to help quickly identify problem areas, NDepend uses colour coding in the tree view. For example, red may indicate high complexity or large size, while green might indicate simpler, more maintainable code.

The tree view is interactive. Right-clicking on the rectangles provides options such as opening the source code declaration for the selected element, allowing us to navigate directly to the method.

Right-clicking on the rectangles will show the available actions to perform.

Integrating with GitHub Actions

NDepend integrates well with several CI/CD pipelines, making it a valuable tool for maintaining code quality throughout the development lifecycle. It can automatically analyse our code after each build. This ensures that every change in our codebase adheres to defined quality standards before the merge to the main branch.

NDepend comes with Quality Gates that enforce standards such as unfixed critical issues. If the code fails to meet the required thresholds, the build can fail in the pipelines.

In NDepend, Quality Gates are predefined sets of code quality criteria that our project must meet before it is considered acceptable for deployment. They serve as automated checkpoints to help ensure that our code maintains a certain standard of quality, reducing technical debt and promoting maintainability.

One of our builds failed because there was code violating a critical rule in our codebase.

As shown in the screenshot above, NDepend provides detailed reports on issues and violations after each build. We can also download the detailed report from the CI servers, such as GitHub Actions. These reports help us quickly identify where issues exist in our code.

NDepend report of the build can be found in the Artifacts of the pipeline.

The NDepend report is divided into seven sections, each providing detailed insights into various aspects of your codebase:

  • Overview: It gives a high-level view of the overall code quality and metrics, similar to what is displayed in the NDepend Dashboard within Visual Studio.
  • Issues: A list of source files with unresolved issues. Along with the number of issues, it also shows the “Debt” for each file, which represents the estimated man-time required to resolve the issues.
  • Projects: Similar to the Issues section but focuses on projects instead of individual files. It displays the total issues and associated debt at the project level.
  • Rules: This section highlights the violated rules, showing the issues and debt in terms of the rules that have been broken. It’s another way to assess code quality by focusing on adherence to coding standards.
  • Quality Gates: This section mirrors the Quality Gates you might have seen earlier in the CI/CD pipelines, such as in GitHub Actions.
  • Trend: The Trend section provides a visualisation of trends over time, similar to the trend charts found in the NDepend Dashboard in Visual Studio.
  • Logs: This section contains the logs generated during NDepend analysis.
Number of unresolved issues and debt of the files in our project.

As described in the NDepend documentation, it has complete support for Azure DevOps, meaning it can be seamlessly integrated into CI/CD pipelines without a lot of manual setup. We can thus easily configure NDepend to run as part of our Azure Pipelines, generating code quality reports after each build.

For our Ea project, since it is an open-source project hosted on GitHub, we can also integrate NDepend with our GitHub Actions instead.

To integrate with GitHub Actions, firstly, we need to associate our NDepend license (or a copy of the 28-day trial activation data) with our GitHub account. To link the NDepend license (e.g. ABC012345) with our GitHub account, we will need to visit the link https://www.ndepend.com/activation_githubaction?license=ABC012345, as demonstrated in the screenshot below.

Linking our NDepend license with our GitHub account.

To introduce NDepend to our GitHub Actions workflow, the minimal configuration that we need to add is as follows.

- name: NDepend
  uses: ndepend/ndepend-action@ndependv1.0
  with:
    license: ${{ secrets.NDependLicense }}
    GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}

Read More: Complete Build YAML of Ea

Wrap-Up

In conclusion, NDepend has proven to be an invaluable tool in our journey to modernise and maintain the Ea library.

By offering comprehensive static code analysis, insightful metrics, and seamless integration with CI/CD pipelines like GitHub Actions, it empowers us to catch issues early, reduce technical debt, and ensure a high standard of code quality.

NDepend provides the guidance and clarity needed to ensure our code remains clean, efficient, and maintainable. For any .NET individual or development team serious about improving code quality, NDepend is definitely a must-have in the toolkit.