Kaizen Journey to be Microsoft Certified

In rapidly evolving fields like software development, staying static in terms of technical skills and knowledge can quickly lead to obsolescence. Hence, the ability to learn independently is crucial in a rapidly changing world. Self-learning allows software developers to acquire new skills and deepen their knowledge in specific areas of interest.

Renewing My Azure Developer Associate Certification

In September, I was on a business trip to Hanoi, Vietnam. I decided to take advantage of my time in the hotel after work to prepare for my Microsoft certification renewal assessment.

To Hanoi, from Singapore!

Well, it took me some time to hit refresh on the latest updates in Microsoft Azure because at Samsung I do not work with it daily. Fortunately, thanks to Microsoft Learn, I was able to quickly pick up the new knowledge after going through the online resources on the platform.

As usual, I took notes on what I learned from Microsoft Learn. This year, the exam focuses on the following topics.

  • Microsoft Identity Platform;
  • Azure Key Vault;
  • Azure App Configuration and Monitoring;
  • Azure Container Apps;
  • Azure Cosmos DB.

I did pretty well in all the topics above with the exception of Azure Container Apps, where my responses to questions related to Azure Container Registry were unfortunately incorrect. However, I am pleased to share that despite this challenge, I successfully passed the renewal assessment on my first attempt.

Achieving success in my Azure exam at midnight in Hanoi.

Participating in the AI Skills Challenge

Last month, I also participated in an online Microsoft event, the Microsoft Learn AI Skills Challenge, where we could choose to complete one of four challenges: the Machine Learning Challenge, the Cognitive Services Challenge, the Machine Learning Operations (MLOps) Challenge, or the AI Builder Challenge.

The AI Builder Challenge introduces us to AI Builder, a Microsoft Power Platform capability that provides AI models designed to optimise business processes.

The challenge shows us how to build models, and explains how we can use them in Power Apps and Power Automate. Throughout the online course, we can learn how to create topics, custom entities, and variables to capture, extract, and store information in a bot.

Why Take the Microsoft AI Challenge?

Users log in to the Samsung app using face recognition technology from Microsoft AI (Image Credit: cyberlink.com)

Since last year, I have been working on the AI module in a Samsung app. I am proud to have the opportunity to learn about Microsoft AI and use it in our project, for example to allow users to log in to our app using its face recognition feature.

Therefore, embracing this challenge gave me a valuable opportunity to gain a deeper understanding of Microsoft AI, with a specific focus on AI Builder. AI Builder empowers us to create models tailored to our business requirements or to use prebuilt models designed to address a wide array of common business scenarios.

In August, I finally completed the challenge and received my certificate from Microsoft.

Wrap-Up

By adopting a growth mindset, applying Kaizen principles, and following a structured learning plan, we can embark on our self-learning journey and emerge as certified professionals.

Besides Microsoft Learn, depending on what you’d like to learn, you can enroll in online courses on platforms like Coursera, Udemy, and edX, which offer comprehensive courses with video lectures, quizzes, and labs.

Once you have chosen your certification, create a structured learning plan. You can then proceed to outline the topics covered in the exam objectives and allocate specific time slots for each.

Anyway, remember that continuous learning is the path to excellence, and getting certified is only one step in that direction. Just as software development involves iterations, so does our learning journey. We should continuously refine our technical skills and knowledge.

[KOSD] Solving SQL File Encoding Issues on Git with PowerShell

A few days ago, some of our teammates discovered that the SQL files they tried to pull from our GitHub repo had an encoding issue. When they did git pull, there would be an error saying “fatal: failed to encode ‘…/xxxxx.sql’ from UTF-16-LE-BOM to UTF-8”.

In addition, the SQL files we committed to GitHub were all marked as binary files. Thus, we could not view the changes we made to those files in the commits.

Cause of the Issue

It turns out that those SQL files were generated from SQL Server Management Studio (SSMS).

Default file encoding of SSMS is Western European (Windows) – Codepage 1252.

By default, SSMS saves SQL files with a UTF-16 encoding, and in my case the default codepage configured is “Western European (Windows) – Codepage 1252”. Codepage 1252 is a single-byte character encoding of the Latin alphabet that was used in Windows for English and many Romance and Germanic languages. Because the files are not saved as UTF-8, Git ends up treating them as binary files.

Solution

The way to resolve this issue is to force the files to use UTF-8 encoding. We can run the following PowerShell script to change the encoding of all SQL files in a given directory and its subdirectories.

$Utf8NoBomEncoding = New-Object System.Text.UTF8Encoding $False

Get-ChildItem "<absolute directory path>" -Recurse *.sql | foreach {
    $FilePath = $_.FullName
    $FileContent = Get-Content $FilePath
    [System.IO.File]::WriteAllLines($FilePath, $FileContent, $Utf8NoBomEncoding)
}

The BOM (Byte Order Mark) is a sequence of bytes at the start of a text stream (0xEF, 0xBB, 0xBF for UTF-8) that signals the encoding and, for multi-byte encodings, its endianness. Since endianness is irrelevant to UTF-8, the BOM is unnecessary there. This is why we pass $False to the constructor of UTF8Encoding to indicate that a BOM is not needed.
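As a quick illustration (a minimal C# sketch of the same .NET type used in the script above, not part of the script itself), the boolean passed to the UTF8Encoding constructor simply controls whether the encoder emits this three-byte preamble:

// Requires System and System.Text.
// UTF8Encoding(true) emits the three-byte BOM as its preamble; UTF8Encoding(false) does not.
UTF8Encoding withBom = new UTF8Encoding(true);
UTF8Encoding withoutBom = new UTF8Encoding(false);

Console.WriteLine(BitConverter.ToString(withBom.GetPreamble()));   // EF-BB-BF
Console.WriteLine(withoutBom.GetPreamble().Length);                // 0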

Wrap-Up

That’s all for a short little PowerShell script we used to solve the encoding issue of our SQL files.

There is an interesting discussion on Stack Overflow about this issue; please give it a read too.

KOSD, or Kopi-O Siew Dai, is a type of Singapore coffee that I enjoy. It is basically a cup of coffee with a little bit of sugar. This series is meant to blog about technical knowledge that I gained while having a small cup of Kopi-O Siew Dai.

Improve Life with Codes

In the realm of software development and business practice, not automating a process when automation could bring significant benefits is normally considered a missed opportunity or an inefficient use of resources. It leads to wasted time, more errors, and reduced productivity.

Background Story

My teammate encountered a strange issue where a third-party core component in the system, which runs as a Windows service, would stop randomly. The service listens on a certain TCP port. When the service was down, telnet to that port would show that the connection was unsuccessful.

After weeks of intensive log investigation, my teammate still could not figure out the reason why it would stop working. However, a glimmer of insight emerged: restarting the Windows service would consistently bring the component back online.

Hence, his solution was to create an alert system which would email him and the team to restart the Windows service whenever it went down. The alert system is basically a scheduled job that checks the health of the TCP port the service listens on.

Since my teammate was one of the few who could log in to the server, he had to stand by during weekends as well to restart the Windows service. Not long after that, he submitted his resignation and left the company. Other teammates thus had to take over this manual task of restarting the Windows service.

Auto-Restarting a Windows Service with C#

To avoid teammates getting burnt out from frequently restarting the Windows service manually, even at night and during weekends, I decided to develop a C# programme that is executed every 10 minutes on the server. The programme makes a connection to the port the Windows service listens on to check whether the service is running. If it is not, the programme restarts it.

The code is as follows. It uses TcpClient from System.Net.Sockets and ServiceController from System.ServiceProcess, and assumes that serverIpAddress, port, and targetService are defined elsewhere in the programme.

try
{
    using (TcpClient tcpClient = new())
    {
        tcpClient.Connect(serverIpAddress, port);
    }

    Console.WriteLine("No issue...");
}
catch (Exception)
{
    int timeoutMilliseconds = 120000;

    ServiceController service = new(targetService);

    try
    {
        Console.WriteLine("Restarting...");
        int millisec1 = Environment.TickCount;

        TimeSpan timeout = TimeSpan.FromMilliseconds(timeoutMilliseconds);

        if (service.Status != ServiceControllerStatus.Stopped) 
        {
            Console.WriteLine("Stopping...");
            service.Stop();
            service.WaitForStatus(ServiceControllerStatus.Stopped, timeout);
        }

        Console.WriteLine("Stopped!");
        int millisec2 = Environment.TickCount;
        timeout = TimeSpan.FromMilliseconds(timeoutMilliseconds - (millisec2 - millisec1));

        Console.WriteLine("Starting...");
        service.Start();
        service.WaitForStatus(ServiceControllerStatus.Running, timeout);

        Console.WriteLine("Restarted!");
    }
    catch (Exception ex) 
    {
        Console.WriteLine(ex.Message);
    }
}

In the programme above, we implement a timeout of 2 minutes. After waiting for the Windows service to stop, we use whatever time remains to wait for the service to return to the Running status. For example, if stopping the service takes 45 seconds, the programme waits up to the remaining 75 seconds for it to start.

After the team launched this programme as a scheduled job, no one has had to wake up at midnight just to log in to the server and restart the Windows service.

Converting Comma-Delimited CSV to Tab-Delimited CSV

Soon, we realised another issue. The input files sent to the Windows service for processing had invalid content. The service expects tab-delimited CSV files, but the actual content was comma-delimited. The problem had been there since last year, so there were hundreds of files that had not been processed.

In order to save my teammate’s time, I wrote a PowerShell script to do the conversion.

Get-ChildItem "<directory contains the files>" -Filter *.csv | 
Foreach-Object {
    Import-Csv -Path $_.FullName -Header 1,2,3,4,5,6,7,8,9 | Export-Csv -Path ('<output directory>' + $_.BaseName + '_out.tmp') -Delimiter `t -NoTypeInformation 

    Get-Content ('<output directory>' + $_.BaseName + '_out.tmp') | % {$_ -replace '"', ''} | Select-Object -Skip 1 | out-file -FilePath ('<output directory>' + $_.BaseName + '.csv')

    Remove-Item ('<output directory>' + $_.BaseName + '_out.tmp')
}

The CSV files do not have a header row and they all have 9 columns. That is why I use “-Header 1,2,3,4,5,6,7,8,9” to add a temporary header. Otherwise, the script would treat the first line of each file as the header, and if that line had multiple columns with the same value, Import-Csv would fail. This is why we need to add a temporary header with unique column values.

When using Export-Csv, all fields in the CSV are enclosed in quotation marks. Hence, we need to remove the quotation marks and the temporary header row before generating the tab-delimited CSV file as the output.

With this, my teammate easily transformed all the files into the correct format in less than 5 minutes.

Searching File Content with PowerShell

A few days after that, I found out that another teammate was reading log files manually to find the lines containing the keyword “access”. I was shocked because there were hundreds of logs every day, which meant he would need to spend hours or even days on the task.

Hence, I wrote him another simple PowerShell script to do the job.

Get-ChildItem "<directory contains the files>" -Filter *.log | 
Foreach-Object {
    Get-Content $_.FullName | % { if($_ -match "access") {write-host $_}}
}

With this, my teammate could finally finish his task early.

Wrap-Up

Automating software development processes is a common practice in the industry because of the benefits it offers. It saves time, reduces errors, improves productivity, and allows the team to focus on more challenging and creative tasks.

From a broader perspective, not automating the process but doing it manually might not be a tragic event in the traditional sense, as it does not involve loss of life or extreme suffering. However, it could be seen as a missed chance for improvement and growth.

Multipart Form Upload using WebClient in .NET Framework 3.5

Prior to his resignation, my team lead assigned me the task of updating a legacy codebase written in .NET Framework 3.5. The task itself involved submitting a multipart form request to a Web API, which made it relatively straightforward.

However, despite potential alternatives, my team lead insisted that I continue using .NET Framework 3.5. Furthermore, I was not allowed to incorporate any third-party JSON library into the changes I made.

SHOW ME THE CODE!

The complete source code of this project can be found at https://github.com/goh-chunlin/gcl-boilerplate.csharp/tree/master/console-application/HelloWorld.MultipartForm.

Multipart Form

A multipart form request is a type of form submission used to send data that includes binary or non-textual files, such as images, videos, or documents, along with other form field values. In a multipart form, the data is divided into multiple parts or sections, each representing a different form field or file.

The multipart form data format allows the data to be encoded and organised in a way that can be transmitted over HTTP and processed by web servers. For the challenge given by my team lead, the data consists of two sections: main_message, a field containing a JSON object, and a list of files.

The JSON object is very simple. It has only two fields, message and documentType, and documentType can only take the value “PERSONAL” for now.

{
    "message": "...",
    "documentType": "PERSONAL"
}

When the API server handles a multipart form request, it parses the data by separating the different parts based on the specified boundary and retrieves the values of each field or file for further processing or storage.

About the Boundaries

In a multipart form, the boundary is a unique string that serves as a delimiter between different parts of the form data. It helps in distinguishing one part from another. The following is an example of what a multipart form data request might look like with the boundary specified.

POST /submit-form HTTP/1.1
Host: example.com
Content-Type: multipart/form-data; boundary=--------------------------8db792beb8632a9

----------------------------8db792beb8632a9
Content-Disposition: form-data; name="main_message"

{
    "message": "...",
    "documentType": "PERSONAL"
}
----------------------------8db792beb8632a9
Content-Disposition: form-data; name="attachments"; filename="picture.jpg"
Content-Type: application/octet-stream

In the example above, the boundary is set to ----------------------------8db792beb8632a9. Each part of the form data is separated by this boundary, as indicated by the dashed lines. When processing the multipart form data on the server, the server uses the boundary to identify the different parts and extract the values of each form field or file.

The format of the boundary string used in a multipart form data request comes from the MIME specification (RFC 2046). The boundary must meet the following requirements:

  1. In the request body, each delimiter line consists of two leading dashes “--” followed by the boundary string, and the closing delimiter has two more trailing dashes.
  2. The boundary itself may contain letters, digits, and a limited set of punctuation characters (including dashes), but no ASCII control characters, and it must not end with a space.
  3. It should be unique and not appear in the actual data being transmitted.
  4. It should not exceed a length of 70 characters.

Beyond the two leading dashes of each delimiter line, the number of dashes we put in the boundary itself is up to us. It can even be zero; more dashes simply make the boundary easier to spot. To make it unique, we use the current timestamp converted into hexadecimal.

Boundary = "----------------------------" + DateTime.Now.Ticks.ToString("x");

Non-Binary Fields

In the example above, besides sending the files to the server, we also have to send the data in JSON. Hence, we need code to generate a section in the following format.

----------------------------8db792beb8632a9
Content-Disposition: form-data; name="main_message"

{
    "message": "...",
    "documentType": "PERSONAL"
}

To do so, we have a form data template variable defined as follows.

static readonly string FormDataTemplate = "\r\n--{0}\r\nContent-Disposition: form-data; name=\"{1}\";\r\n\r\n{2}";
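As an illustration only (a minimal sketch, assuming requestStream, Boundary, and mainMessage are variables defined elsewhere), the template can be filled in with string.Format and written to the request stream as UTF-8 bytes:

// Hypothetical usage of FormDataTemplate; requestStream, Boundary, and mainMessage are assumed.
string formField = string.Format(FormDataTemplate, Boundary, "main_message", mainMessage);
byte[] formFieldBytes = Encoding.UTF8.GetBytes(formField);
requestStream.Write(formFieldBytes, 0, formFieldBytes.Length);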

Binary Fields

For the sections containing binary files, we need code to generate something like the following.

----------------------------8db792beb8632a9
Content-Disposition: form-data; name="attachments"; filename="picture.jpg"
Content-Type: application/octet-stream

Thus, we have the following variable defined.

static readonly string FileHeaderTemplate = "Content-Disposition: form-data; name=\"{0}\"; filename=\"{1}\"\r\nContent-Type: application/octet-stream\r\n\r\n";
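Since FileHeaderTemplate does not include the boundary line itself, a file section could be written roughly as follows (again a hedged sketch; filePath is a hypothetical variable, and CopyTo is the helper method described later in this post):

// Hypothetical usage of FileHeaderTemplate; requestStream, Boundary, and filePath are assumed.
string fileHeader = "\r\n--" + Boundary + "\r\n" +
    string.Format(FileHeaderTemplate, "attachments", Path.GetFileName(filePath));
byte[] fileHeaderBytes = Encoding.UTF8.GetBytes(fileHeader);
requestStream.Write(fileHeaderBytes, 0, fileHeaderBytes.Length);

using (FileStream fileStream = File.OpenRead(filePath))
{
    CopyTo(fileStream, requestStream, 4096);
}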

Ending Boundary

The ending boundary is required in a multipart form data request to indicate the completion of the form data transmission. It serves as a signal to the server that there are no more parts to process and that the entire form data has been successfully transmitted.

The ending boundary is constructed in a similar way to the regular boundary but with an additional two trailing dashes “--” at the end to indicate its termination. Hence, we have the following code.

void WriteTrailer(Stream stream)
{
    byte[] trailer = Encoding.UTF8.GetBytes("\r\n--" + Boundary + "--\r\n");
    stream.Write(trailer, 0, trailer.Length);
}

Stream.CopyTo Method

As you may have noticed in our code, we have a method called CopyTo as shown below.

void CopyTo(Stream source, Stream destination, int bufferSize)
{
    byte[] array = new byte[bufferSize];
    int count;
    while ((count = source.Read(array, 0, array.Length)) != 0)
    {
        destination.Write(array, 0, count);
    }
}

The reason we have this code is that the Stream.CopyTo method, which reads the bytes from the current stream and writes them to another stream, was only introduced in .NET Framework 4.0. Hence, we have to write our own CopyTo method.

JSON Handling

To handle the JSON object, the JSON serialisation options that we would normally rely on are not available to us in a .NET Framework 3.5 project.

Hence, if we are not allowed to use any third-party JSON library such as Json.NET from Newtonsoft, then we can only write our own in C#.

string mainMessage =
    "{ " +
        "\"message\": \"" + ConvertToUnicodeString(message) + "\",  " +
        "\"documentType\": \"PERSONAL\" " +
    "}";

The ConvertToUnicodeString method escapes every character as a \uXXXX sequence so that Unicode characters are represented safely in our JSON.

private static string ConvertToUnicodeString(string text)
{
    StringBuilder sb = new StringBuilder();

    foreach (var c in text)
    {
        sb.Append("\\u" + ((int)c).ToString("X4"));
    }

    return sb.ToString();
}
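Finally, to give a sense of how these pieces fit together, below is a minimal sketch (based on my own assumptions rather than the project’s actual code) of posting the assembled body with WebClient, which is available in .NET Framework 3.5. The variable serviceUrl is hypothetical, and Boundary, FormDataTemplate, FileHeaderTemplate, WriteTrailer, and CopyTo refer to the members shown earlier.

using (MemoryStream bodyStream = new MemoryStream())
{
    // Write the main_message field and each file section into bodyStream,
    // using FormDataTemplate and FileHeaderTemplate as shown in the earlier sections.

    // Close the form data with the ending boundary.
    WriteTrailer(bodyStream);

    using (WebClient webClient = new WebClient())
    {
        // Declare the boundary in the Content-Type header so that the server can split the parts.
        webClient.Headers[HttpRequestHeader.ContentType] = "multipart/form-data; boundary=" + Boundary;

        // serviceUrl is a hypothetical variable holding the Web API endpoint.
        byte[] responseBytes = webClient.UploadData(serviceUrl, "POST", bodyStream.ToArray());
        Console.WriteLine(Encoding.UTF8.GetString(responseBytes));
    }
}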

Wrap-Up

This concludes the challenge my team lead presented to me before his resignation.

He did tell me that migrating the project to modern .NET would require significant resources, including development time, training, and potentially infrastructure changes. Hence, he foresaw that, with a limited budget, the team would prioritise other business needs over framework upgrades.

It has been quite some time since my team lead resigned after I completed this challenge. With his departure in mind, I have written this blog post in the hope that it may help other developers facing similar difficulties in their careers.

The complete source code of this sample can be found on my GitHub project: https://github.com/goh-chunlin/gcl-boilerplate.csharp/tree/master/console-application/HelloWorld.MultipartForm.
