
Admiration Tech News


Author: Maq Verma

Reverse engineering a Chrome extension

Posted on August 4, 2024 by Maq Verma

I’ve been using Revue for my 123dev newsletter and wanted an easier way to save URLs to include in future emails.
If you’re not familiar with it, Revue has a chrome extension so you can send URLs to a queue which shows up next to the editor.

Revue sidebar showing staged links

It’s a really handy feature and I wanted to use it without the extension.
Ideally, I could send these URLs from my phone via a Siri Shortcut (I haven’t figured this part out yet).

The functionality wasn’t exposed in their API docs so I’d have to figure out another way.
I learned some new things exploring the extension so I thought I’d share how I did it.

Reverse engineering the extension

The first thing I needed was to figure out what URLs the extension was calling.
I tried watching Chrome dev tools for network calls, watching DNS requests, and running tcpdump.

Without a man-in-the-middle proxy to decrypt HTTPS, it wasn’t going to work.
Thankfully, someone pointed out the code is available if you have the extension installed.

First, we need to get the extension ID from the installation URL.

Revue extension in the app store

The long string in the URL, fdnhneinocoonabhfbmelgkcmilaokcg, is the extension ID, and it matches the folder in our home directory that holds the source code.

On my computer it’s under $HOME/.config/google-chrome/Default/Extensions/fdnhneinocoonabhfbmelgkcmilaokcg.
I opened the folder in vscode and looked at the main.*.chunk.js file.

It was minified, so first I had to unminify it as best as possible. Pretty-printing the JavaScript was as good as I could get it.

From there I looked for POST requests to see what URLs it was calling.
I found this relevant code which looked like what I needed.
It’s calling https://www.getrevue.co/extension/add.

A snippet of code from the Revue extension

You’ll see from the code the only thing it’s sending is a POST with a body.
At this point I don’t know what the body should be, but I’ll try to figure that out later.

Getting session cookie

Now I need to jump over to Chrome to get my session cookie.
Open getrevue.co in a tab and open dev tools.

Go to the Application tab and then find Cookies in the left sidebar.
Copy the value for _revue_session.

a partial view of my user session cookie

Send a curl request

Now we need to send our request and see if it works.

We still don’t know what the body data should look like, but looking at the API objects that are documented I’m going to guess it needs a title and url.

export COOKIE="your cookie session here"

curl -X POST -b "_revue_session=$COOKIE" \
    -H "Content-Type: application/json" \
    -d '{"title": "TESTING", "url": "https://justingarrison.com"}' \
    https://www.getrevue.co/extension/add

Sure enough that worked!

Here’s a snippet of the response

json output from the Revue API

The response gives us a much better idea of the full body data we can use.
Adding a description is about the minimum extra information that would be useful.
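If you want to script this instead of pasting curl commands, the same request can be built with nothing but the Python standard library. This is just a sketch of what the extension appears to send; the optional description field is my guess based on the response, not something confirmed in their docs.

```python
import json
import urllib.request

# Build the same request the curl command sends, without firing it.
# NOTE: the body schema is a guess from the documented API objects and
# the response above; the optional "description" field is an assumption.
def build_add_request(session_cookie, title, url, description=None):
    body = {"title": title, "url": url}
    if description is not None:
        body["description"] = description
    return urllib.request.Request(
        "https://www.getrevue.co/extension/add",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Cookie": "_revue_session=" + session_cookie,
        },
        method="POST",
    )

req = build_add_request("your cookie session here", "TESTING",
                        "https://justingarrison.com")
# urllib.request.urlopen(req) would actually send it
```

Calling urllib.request.urlopen(req) fires the request exactly as curl did.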

Now we can send items from the CLI but what about from iOS?

[WIP] Siri Shortcut

Siri shortcuts are very powerful but also very cryptic.

I was able to make a shortcut with the “Get contents of URL” function which is able to make a POST call.

A screenshot from an ios siri shortcut

I can put in the URL, change the method to POST, and add a body with the required title and url variables.

Unfortunately, when I try to use this shortcut from the share sheet I don’t think it uses my session token so I never get authenticated to the API.

If anyone knows a way to either open a Safari page and perform the action or a way to store an authentication token in the shortcut please reach out on twitter and let me know.

Posted in Programming. Tagged: Cyber Attacks, Data Security, Reverse Engineering.

The SOLID Principles in Software Design Explained

Posted on August 4, 2024 by Maq Verma

I planned on getting this finished earlier in the week but ended up shearing some of our sheep 🐑. A few days late, whatever, let’s get cracking.

In this post you’re going to learn about each of the 5 SOLID Principles. I’ve included some code examples to make them a bit more real. I’ve also added some thought exercises / mental models that helped me understand these principles in the hope that they’ll help you too.

These principles are a subset of the principles promoted by Robert C. Martin.

  1. S – The Single Responsibility Principle (SRP)
  2. O – The Open/Closed Principle (OCP)
  3. L – The Liskov Substitution Principle (LSP)
  4. I – The Interface Segregation Principle (ISP)
  5. D – The Dependency Inversion Principle (DIP)

1. The Single Responsibility Principle (SRP)

The single responsibility is defined as follows:

Each class should have RESPONSIBILITY over a single part of the functionality provided by the program.

What does this mean practically though? As a beginner programmer this isn’t very helpful. Let’s expand on the concept.

Examples of single responsibilities:

  • Validating inputs.
  • Performing business logic.
  • Saving and retrieving information to / from a database.
  • Formatting a document.
  • Performing calculations for the document.

So if you see a class that is validating inputs, logging events, reading and writing information to the database and performing business logic, you have a class with A LOT of responsibilities; violating the Single Responsibility Principle.

How can you spot a class that may be violating the Single Responsibility Principle?

The class may have:

  • Tight coupling
  • Low cohesion
  • No separation of concerns

Tight coupling

Changing one class results in having to change a lot of other classes to get the program working again. Sound familiar? 😁

Low Cohesion

The class contains fields and methods/functions that are unrelated to each other in any meaningful way.

A good way to spot this is if methods in a class don’t reuse the same fields. Each method is using different fields from the class.

No Separation Of Concerns

Should my class that deals with validating an input be performing business logic and saving the data to the database? Not likely. Separate the program out into sections that deal with each concern.

A classic real-world example of something having too many responsibilities is the multi-function knife. It tries to do too much and ends up doing nothing well.
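The original post illustrates these ideas with C# screenshots; since those don’t survive here, here’s a rough Python sketch of the same idea with made-up names. The first class mixes validation, business logic, and persistence; the split classes each own one responsibility.

```python
# Hypothetical names: one god-class vs. three focused classes.

class OrderManager:
    """Violates SRP: validates, calculates, AND persists."""
    def place_order(self, order):
        if not order.get("items"):                        # validating input
            raise ValueError("empty order")
        total = sum(i["price"] for i in order["items"])   # business logic
        print("INSERT INTO orders ... total=%s" % total)  # talking to the DB
        return total

class OrderValidator:
    """Single responsibility: validating inputs."""
    def validate(self, order):
        if not order.get("items"):
            raise ValueError("empty order")

class PriceCalculator:
    """Single responsibility: performing the calculation."""
    def total(self, order):
        return sum(i["price"] for i in order["items"])

class OrderRepository:
    """Single responsibility: saving to the database (stubbed here)."""
    def save(self, order, total):
        print("INSERT INTO orders ... total=%s" % total)
```

A change to validation rules now touches only OrderValidator, not the pricing or persistence code.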


2. The Open/Closed Principle (OCP)

Software entities (classes, methods, modules) should be open for extension but closed for modification

What does this mean in a practical sense?

You should be able to change the behaviour of a method without changing its source code.

For simple methods, adding / changing the logic in the method is perfectly reasonable. If you have to revisit this method 3+ times (not a hard number) due to requirements changing, you should start to think about the Open/Closed Principle.

Closing code to modification, why would you want to do this?

Code that we don’t alter is less likely to create bugs due to unforeseen side effects.

Here’s an example of some code that is not closed for modification. We’ll use a switch statement that will perform something different for each transport type.

If a new transport type needs to be handled by our program then we need to modify the switch statement; violating the Open/Closed Principle.

[Image: A method that is violating the Open/Closed Principle]

So how can we achieve the Open Closed Principle in our code?

Typically we’d use

  • Parameters
  • Inheritance
  • Composition / Injection

Using Parameters:

The AddNumbersClosed method is not Closed for modification. If we have to alter the numbers that it’s adding we have to change the method.

[Image: A method that is not closed for modification. Violating the Open/Closed Principle.]

The AddNumbersOpen method is Open and extensible for situations where any two numbers need to be added. We don’t need to modify the method as long as we’re adding two numbers. We can say that this method is closed for modification but open for extension.

[Image: A method that is closed for modification. It adheres to the Open/Closed Principle.]
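The screenshots show C# methods named AddNumbersClosed and AddNumbersOpen; here’s a rough Python equivalent of what they demonstrate.

```python
# Hard-coded: any change to the numbers means modifying the method.
def add_numbers_closed():
    return 2 + 3

# Open for extension: works for any two numbers without modification.
def add_numbers_open(a, b):
    return a + b
```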

Using Inheritance:

The MakeSound() Method here is open for many different animals to make many different sounds.

[Image: Abiding by the Open/Closed Principle using object inheritance.]
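A minimal Python sketch of the inheritance approach (the article’s screenshots are C#; the animal names here are my own):

```python
class Animal:
    def make_sound(self):
        raise NotImplementedError

class Dog(Animal):
    def make_sound(self):
        return "Woof"

class Cat(Animal):
    def make_sound(self):
        return "Meow"

# Adding a new animal extends behaviour without touching existing code.
sounds = [animal.make_sound() for animal in (Dog(), Cat())]
```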

Using Composition / Injection:

In the following example, the responsibility for making the sound has been moved to the SoundMaker Class.

To add new behaviour, we could add a new class. This new class could provide some new behaviour to the Dog class.

[Image: The Dog class showing how new behaviour can be added by using composition.]
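Here’s a rough Python rendering of the composition idea. SoundMaker comes from the article; LoudSoundMaker is a made-up class showing how new behaviour arrives as a new class:

```python
class SoundMaker:
    """The responsibility for making the sound lives here."""
    def make_sound(self):
        return "Woof"

class LoudSoundMaker:
    """New behaviour = a new class; nothing existing is modified."""
    def make_sound(self):
        return "WOOF!!!"

class Dog:
    def __init__(self, sound_maker):
        self._sound_maker = sound_maker  # behaviour injected, not hard-coded

    def speak(self):
        return self._sound_maker.make_sound()
```

Dog(LoudSoundMaker()).speak() changes the dog’s behaviour without editing Dog, SoundMaker, or any caller.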

Why would you create a new class for new behaviour?

  • We know that stuff we’ve already built isn’t affected.
  • We can design the class to perfectly suit the new requirement.
  • New behaviour can be added without interfering with old code.

3. The Liskov Substitution Principle (LSP)

The Liskov Substitution Principle states that:

Subtypes must be substitutable for their base types.

Ok great, but what does that mean practically?

You may have learned about the ‘is-a’ relationship related to OOP inheritance.

e.g. A dog ‘is-a’ animal (I know it should be an animal, cut me some slack for demo purposes).

The Liskov Substitution Principle is basically stating that this ‘is-a’ relationship is not good enough for maintaining clean code. We should examine the relationship further and explore if we can slot in ‘is-substitutable-for’ instead.

e.g. A dog ‘is-substitutable-for’ an animal. Can we say that we can substitute our dog for the animal?

I’ll show a classic example showing how the ‘is-a’ relationship can break down and cause some problems.

It has a fantastic name: the rectangle-square problem.

Rectangle: 4 sides and 4 right angles.

Square: 4 equal sides and 4 right angles.

So… a square ‘is-a’ rectangle.

[Image: A square and a rectangle. Wow]

We have a Rectangle class that could look something like this:

[Image: The Rectangle class showing its two properties; height and width.]

And we have a square class that inherits from the rectangle class, because a square ‘is-a’ rectangle. This is what the square class looks like.

[Image: The Square class that inherits from the Rectangle class as it obeys the ‘is-a’ relationship.]

Now say we have a method that calculates the area of a rectangle. Should be pretty straightforward. We’ll pass our rectangle as a parameter and return the width multiplied by the height.

[Image: The AreaCalculator class]

This won’t work when we have code like this:

We create a new Rectangle.

It has a Height of 3 and a Width of 2.

Our expected result is an area of 6, however, the actual result is 4.

[Image: Showing how the ‘is-a’ relationship check can let you down.]

Why did this happen?

If you look at the code for the Square class, when the width property is set it also overwrites the height. So we actually created a square with a width of 2 AND a height of 2.

This example is trivial and you can see that we instantiated a square. In real-world programs this may not be as easy to spot. You might be receiving the object as a parameter from another class and not know its type. This could lead to unintended results as shown above.

It comes down to our Square not being ‘substitutable-for’ a rectangle. A square’s sides must be of equal length, but a rectangle’s width and height can differ. We didn’t perform the check before inheriting from the Rectangle class.
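To make the failure concrete, here’s the rectangle-square problem sketched in Python (the original uses C# properties; setter methods play the same role here):

```python
class Rectangle:
    def __init__(self):
        self._width = 0
        self._height = 0

    def set_width(self, w):
        self._width = w

    def set_height(self, h):
        self._height = h

    def area(self):
        return self._width * self._height

class Square(Rectangle):
    # To stay a square, setting one side overwrites the other.
    def set_width(self, w):
        self._width = w
        self._height = w

    def set_height(self, h):
        self._height = h
        self._width = h

def area_of(rect):
    rect.set_height(3)
    rect.set_width(2)
    return rect.area()   # we expect 6 for any rectangle

result = area_of(Square())   # 4, not 6: Square is not substitutable
```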

Some clues as to when your code is violating the LSP:

  • Type checking
  • Null checks
  • NotImplementedExceptions

You can also perform the duck test:

If it looks like a duck and quacks like a duck but it needs batteries, you probably have the wrong abstraction – Derick Bailey


4. The Interface Segregation Principle (ISP)

What is the interface segregation principle?

Clients should not be forced to depend upon interfaces that they do not use – Bob Martin

The client in this case is any calling code.

Take a look at this interface named IPersonService.

It has three methods. Any client that implements this interface will have to implement these methods.

[Image: The IPersonService Interface]

Now let’s take this Child class that implements the IPersonService Interface.

For a child, the SetSalary() method and the Salary property do not make sense!

The client (child class) depends on an interface that it does not use. ❌

The interface is only partially implemented 😒.

[Image: The Child class implementing the IPersonService. It’s depending on code that it doesn’t use, namely the members related to Salary]

See that method throwing the NotImplementedException()? It’s a good sign that you’re violating the Interface Segregation Principle.

This isn’t so bad here, but it will become a problem with larger interfaces. It introduces higher coupling (I like to think of high coupling as classes being super-glued together and tougher to separate). Future changes to the code will be more difficult.
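A quick Python sketch of the violation (the article’s IPersonService is a C# interface; an abstract base class plays that role here, and NotImplementedError stands in for NotImplementedException):

```python
from abc import ABC, abstractmethod

class PersonService(ABC):
    """The 'fat' interface: every implementer gets all three members."""
    @abstractmethod
    def get_name(self): ...
    @abstractmethod
    def set_name(self, name): ...
    @abstractmethod
    def set_salary(self, amount): ...

class Child(PersonService):
    def __init__(self, name):
        self._name = name

    def get_name(self):
        return self._name

    def set_name(self, name):
        self._name = name

    def set_salary(self, amount):
        # Forced to implement a member that makes no sense for a child.
        raise NotImplementedError("a child has no salary")
```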

How do we remedy this?

Split the interface into more cohesive interfaces.

[Image: Two new cohesive interfaces created by splitting up the IPersonService.]

The IPersonSalaryService is an interface that defines members related to a person’s salary.

The IPersonNameService does the same for members related to a person’s name. Both are more cohesive than the original IPersonService.

Now our client code (Child class) can depend on code that it actually uses. Much better. 😎

[Image: Child class depending on a more cohesive IPersonNameService]

We can easily implement multiple interfaces in C# using this syntax. Take a look at this Adult class. It depends on code (the two interfaces) that it actually uses. ✅
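And here’s a rough Python version of the fix, with the fat interface split into two cohesive ones; Child depends only on the name contract, while Adult implements both:

```python
from abc import ABC, abstractmethod

class PersonNameService(ABC):
    """Cohesive: only the members related to a person's name."""
    @abstractmethod
    def get_name(self): ...

class PersonSalaryService(ABC):
    """Cohesive: only the members related to a person's salary."""
    @abstractmethod
    def set_salary(self, amount): ...

class Child(PersonNameService):
    """Depends only on code it actually uses."""
    def __init__(self, name):
        self._name = name

    def get_name(self):
        return self._name

class Adult(PersonNameService, PersonSalaryService):
    """Implements both interfaces, and uses everything it depends on."""
    def __init__(self, name):
        self._name = name
        self.salary = 0

    def get_name(self):
        return self._name

    def set_salary(self, amount):
        self.salary = amount
```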

[Image: The Adult class implementing both the IPersonNameService and the IPersonSalaryService.]

5. The Dependency Inversion Principle (DIP)

The Dependency Inversion Principle states that

“High-level modules should not depend on low level modules. Both should depend on abstractions.

Abstractions should not depend on details. Details should depend on abstractions”


In a C# project, your project references will point in the direction of your dependencies.

In domain-driven design (or onion architecture, as it’s sometimes called), the references will point away from low-level implementation code and towards your business logic / domain layer.

You can think of high-level code as being more process-oriented. It’s more abstract and more concerned with the business rules.

Low level code is the plumbing code.

Here’s an example showing what high-level code and low-level code might look like in a program.

[Image: Diagram showing what high-level and low-level code might look like.]

The Domain layer is concerned only with publishing a course. As we move to lower-level code we get more concrete: saving changes to the course status ID is low-level code (in the context of a language like C#).

Abstractions are generally achieved using interfaces and abstract base classes.

@ardalis put it really well when he said that they’re generally types that can’t be instantiated (read: you can’t make a new object from them).

Abstractions define a contract, they don’t do the work

Abstractions specify WHAT should be done without telling us HOW they should be done.

Again, thinking of abstractions in terms of a contract is a useful exercise.

I like to think of an abstraction speaking to any class that depends on the abstraction as saying something like:

“This is what you must do, I don’t care how you do it.” - Abstraction speaking to a class that depends on it.

This may seem stupid, but it helps me understand abstractions and interfaces and how they can be used to make programs easier to design and manage.

If it’s stupid, but works, it ain’t stupid!
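To tie the DIP together, here’s a hypothetical Python sketch loosely based on the course-publishing diagram; all class names here are my own, not from the original code:

```python
from abc import ABC, abstractmethod

# The abstraction defines the contract: WHAT must be done, not HOW.
class CourseRepository(ABC):
    @abstractmethod
    def mark_published(self, course_id): ...

# High-level policy: depends only on the abstraction.
class PublishCourse:
    def __init__(self, repository):
        self._repository = repository  # injected, never constructed here

    def execute(self, course_id):
        # The business rule lives here, free of persistence details.
        self._repository.mark_published(course_id)
        return "course %s published" % course_id

# Low-level detail: depends on (implements) the abstraction.
class InMemoryCourseRepository(CourseRepository):
    def __init__(self):
        self.published = set()

    def mark_published(self, course_id):
        self.published.add(course_id)

repo = InMemoryCourseRepository()
message = PublishCourse(repo).execute(42)
```

Swapping InMemoryCourseRepository for a database-backed class changes nothing in PublishCourse: both depend on the abstraction.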


So that’s it. By now you’ll have a better understanding of these 5 principles and you can start incorporating them into your work. You’ll be aware of them if nothing else, and being aware of them is half the battle.

I first learned about these principles at university, but what really drove them home was Steve Smith’s Pluralsight course, ‘SOLID Principles for C# Developers’. I highly recommend it.

If you have any questions pop them into the comment section below or reach out to me on twitter where I post coding tips regularly.

Posted in Programming. Tagged: Software Design, SOLID, SOLID Principles.

WebAssembly and Kubernetes Go Better Together: Matt Butcher

Posted on August 4, 2024 by Maq Verma

In this interview, we delve into details about the WebAssembly (WASM) format and Kubernetes with Matt Butcher, co-founder and CEO of Fermyon. Butcher is passionate about WebAssembly, distributed systems, and artificial intelligence.

Butcher is one of the original creators of, or had a hand in creating, Helm, Brigade, Cloud Native Application Bundles, Open Application Model, Glide, and Krustlet. He has written or co-written many books, including “Learning Helm” and “Go in Practice.”

In this interview, Butcher talks about the motivation and evolution of serverless functions and how Wasm complements it. He touches upon the use cases of Wasm in edge computing, AI and also on the server.

He also spoke about how SpinKube, a recently introduced Wasm-powered serverless platform for Kubernetes, grew out of Kubernetes adoption and a similar orchestrator used in Fermyon Cloud. Finally, Butcher paints a realistic picture of the challenges facing the Wasm community, especially from established ecosystems such as Java.

Can you please introduce yourself, and provide a brief background as to how Wasm came about? Any particular pain point in distributed systems app development that you were intent on solving with Wasm?

I spent my career in software at the intersection of distributed computing, developer tooling, and supply chain security; it wasn’t really one benefit that drew me to Wasm, but how the characteristics of the technology mapped to the problems I was trying to solve (and more generally, the problems we’ve been trying to solve as an industry): portability; startup and execution speed; strong capability-based sandboxing; and artifact size.

At Microsoft, we were looking at Azure Functions and other similar technologies and were seeing a lot of waste. Systems were sitting idle 80% of the time (or more), just waiting for inbound requests. Virtual machines were sitting pre-warmed in a queue consuming memory and CPU, waiting to have a serverless function installed and executed. And all this wastage was done in the name of performance. We asked ourselves: What if we could find a much more efficient form of compute that could cold start much faster? Then we wouldn’t have to pre-warm virtual machines. We could drastically improve density, also, because resources would be freed more often.

So it was really all about finding a better runtime for the developer pattern that we call “serverless functions.” There are too many definitions of serverless floating around. When we at Fermyon say that, we mean the software design pattern in which a developer does not write a server, but just an event handler. The code is executed when a request comes in and shuts down as soon as the request has been handled.

Yes. The first generation of serverless gave us a good design pattern (FaaS, or serverless functions) based on event handling. But from Lambda to KNative, the implementations were built atop runtimes that were not optimized for that style of workload. Virtual machines can take minutes to start, and containers tend to take a few dozen seconds. That means you either have to pre-warm workloads or take large cold-start penalties. Pre-warming is costly, involving paying for CPU and memory while something sits idle. Cold starts, though, have a direct user impact: slow response time.

We were attracted to WebAssembly specifically because we were looking for a faster form of compute that would reduce cold start time. And at under one millisecond, WebAssembly’s cold start time is literally faster than the blink of an eye. But what surprised us was that by reducing cold start, utilizing WebAssembly’s small binary sizes, and writing an effective serverless functions executor, we have been able to also boost density. That, in turn, cuts cost and improves sustainability since this generation of serverless is doing more with fewer resources.

Delving into the intersection of Wasm and edge computing: is Wasm primarily tailored for edge applications where efficiency and responsiveness are paramount?

Serverless functions are the main way people do edge programming. Because of its runtime profile, Wasm is a phenomenal fit for edge computing — and we have seen content delivery networks like Fastly, and more recently Cloudflare, demonstrate that. And we are beginning to see more use cases of Wasm on constrained and disconnected devices — from 5G towers to the automotive industry (with cars running Spin applications because of that efficiency and portability).

One way to think about this problem is in terms of how many applications you can run on an edge node. Containers require quite a bit of system resources. On a piece of hardware (or virtual machine) that can run about 30-50 containers, we can run upwards of 5,000 serverless Wasm apps. (Fermyon Cloud runs 3,000 user apps per node on our cluster.)

The other really nice thing about WebAssembly is that it is OS- and architecture-neutral. It can run on Windows, Linux, Mac, and even many exotic operating systems. Likewise, it can execute on Intel architecture or ARM. That means the developer can write and compile the code without having to know what the destination is. And that’s important for edge computing where oftentimes the location is decided at execution time.

Navigating the realm of enterprise software: Many enterprises still use Java. Keeping Java developers and architects in mind, can you compare and contrast Java and Wasm? Can you talk about the role of WASI in this context?

Luke Wagner, the creator of Wasm, once said to me that Java broke new ground (along with .NET) in the late 90s and over decades they refined, optimized, and improved. WebAssembly was an opportunity to start afresh on a foundation of 20 years of research and development.

Wasm is indebted to earlier bytecode languages, for sure. But there’s more to it than that. WebAssembly was built with very stringent security sandboxing needs. Deny-by-default and capabilities-based system interfaces are essential to a technology that assumes it is running untrusted code. Java and .NET take the default disposition that the execution context can “trust” the code it is running. Wasm is the opposite.

Secondly, Wasm was built with the stated goal that many different languages (ideally any language) can be compiled to Wasm. JVM and .NET both started from a different perspective: that specific languages would be built to accommodate the runtime. In contrast, the very first language to get Wasm support was C. Rust, Go, Python, and JavaScript all followed along soon after. In other words, WebAssembly demonstrated that it could be a target for existing programming languages, while after decades neither Java nor .NET has been able to do this broadly.

Moreover, WebAssembly was designed for speed both in terms of cold start and runtime performance. While Java has always had notoriously slow startup times, WebAssembly binaries take less than a millisecond to cold start.

But the greatest thing about WebAssembly is that if it is actually successful, it won’t be long until both Java and .NET can compile to WebAssembly. The .NET team has said they’ll fully support WebAssembly late in 2024. Hopefully, Java won’t be too far behind.

Exploring the synergy between Wasm and Kubernetes, SpinKube was released at the Cloud Native Computing Foundation‘s Kubecon EU 2024 in Paris. Was Kubernetes an afterthought? How does it impact cloud native app development including addressing deficiencies, and redefining containerized applications?

Most of us from Fermyon worked in Kubernetes for a long time. We built Helm, Brigade, OAM, OSM, and many other Kubernetes projects. When we switched to WebAssembly, we knew we’d have to start with developer tools and work our way toward orchestrators. So we started with Spin to get developers building useful things. We built Fermyon Cloud so that developers would be able to deploy their apps to our infrastructure. Behind the scenes, Fermyon Cloud runs HashiCorp’s Nomad. It has been a spectacular orchestrator for our needs. As I said earlier, we’ve been able to host 3,000 user-supplied apps per worker node in our cluster, orders of magnitude higher than what we could have achieved with containers.

But inevitably we felt the pull of Kubernetes’ gravity. So many organizations are operating mature Kubernetes clusters. It would be hard to convince them to switch to Nomad. So we worked together with Microsoft, SUSE, Liquid Reply, and others to build an excellent Kubernetes solution. Microsoft did some of the heaviest lifting when they wrote the containerd support for Spin.

SpinKube’s immediate popularity surprised even us. People are eager to try a serverless runtime that outperforms Knative and other early attempts at serverless in Kubernetes.

Let’s chat about Wasm and Artificial Intelligence. There’s a buzz about Wasm and AI. Can you expand on this? Does this mean that Large Language Models (LLMs), etc. can be moved to the edge with Wasm?

There are two parts to working with LLMs. There’s training a model, which takes huge amounts of resources and time. And there’s using (or “inferring against”) a model. This latter part can benefit a lot from runtime optimization. So we have focused exclusively on serverless inferencing with LLMs.

At its core, WebAssembly is a compute technology. We usually think of that in terms of CPU usage. But it’s equally applicable to the GPUs that AI uses for inferencing. So we have been able to build SDKs inside of the WebAssembly host that allow developers to do LLM inferencing without having to concern themselves with the underlying GPU architecture or resources.

Most importantly, Fermyon makes it possible to time-slice GPU usage at a very fine level of granularity. While most AI inferencing engines lock the GPU for the duration of a server process (e.g. hours, days, or months at a time), we lock the GPU only during an inferencing operation, which is seconds or maybe a few minutes at a time. This means you can get much higher density, running more AI applications on the same GPU.

This is what makes Wasm viable on the edge, where functions run for only a few eye blinks. Those edge nodes will need only a modest number of GPUs to serve thousands of applications.

WASM roadmap: What are the gaps hampering Wasm adoption and the community-driven roadmap to address them? Anything else you want to add?

WebAssembly is still gaining new features. The main one in flight right now is asynchronous function calls across components. This would allow WebAssembly modules to talk to each other asynchronously, and this will vastly improve the parallel computing potential of Wasm.

But always the big risk with WebAssembly comes from its most audacious design goal. Wasm is only successful if it can run all major programming languages, and run them in a more or less uniform way (e.g. with support for WASI, the WebAssembly System Interface). We’ve seen many languages add first-tier WebAssembly support. Some have good implementations that are not included in the main language and require separate toolchains. That’s a minor hindrance. Others, like Java, haven’t made any notable progress. While I think things are progressing well for all major languages other than Java, until those languages have complete support, they have not reached their full potential.

Posted in Vulnerability. Tagged: Cyber Attacks, Data Security.

The 5 Worst Anti-Patterns in API Management

Posted on August 4, 2024 by Maq Verma

Imagine this: you are working in a company named DonutGPT as Head of Platform Engineering, and you sell millions of donuts online every year with AI-generated recipes. You need to make your critical services available to hundreds of resellers through secured APIs. Since nobody on earth wants to see his donut order fail, your management is putting on the pressure to ensure a highly available service.

Your current production environment consists mostly of VMs, but you are in the process of progressively migrating to a cloud native platform. Most of the production services you handle expose APIs, but your team has very little control and visibility over them. Each service is owned by a different developer team, and there is no consistency in languages, deployed artifacts, monitoring, observability, access control, encryption, etc.

Some services are Java-based, “secured” with old TLS 1.1 certificates, and behind a JWT access control policy. Other services are Python-based, use TLS 1.3 certificates, and are behind a custom-made access control policy. This diversity can be extended to the rest of the services in your production environment.

You (reasonably) think that this situation is far from ideal, and you plan to rationalize APIs at DonutGPT with the help of an API Management solution. Your requirements include:

  • Your APIs should be strongly governed with centralized and consistent security policies
  • You need advanced traffic management like rate limiting or canary releases
  • You need real-time observability and usage metrics on all public endpoints

Simply put, you want predictable operations, peace of mind, and better sleep.

It looks like your plan is right, and you are on track for better days (or nights). However, an API journey is long, and the road ahead is full of obstacles. Here are the top five worst anti-patterns you should avoid when you start your API odyssey.

Anti-Pattern 1: Monolith-Microservices

You are about to invest time, money, and effort in setting up an API management solution. In this process, you will centralize many aspects of your exposed services like traffic management, connectivity security, and observability. It’s easy to think, “The more centralized everything is, the more control I have, and the better I will sleep.” Why not use this API management solution to intercept every API call and transform the HTTP body to sanitize it from sensitive data (like private information)?

This would ensure that every API call is clean across the whole system. That’s true, but only in the short term.

Let’s fast forward three years. Your API management platform is now mature and manages hundreds of APIs across dozens of different teams. The initial quick win to sanitize the HTTP body within the API management workflow gradually became a white elephant:

  • The first quick patch inevitably evolved into more complex requirements, needing to be adapted to every API. Your ten stylish lines of code quickly grow to an unmaintainable 5000-line script.
  • No one wants to take ownership of this custom script now that it operates on many teams’ APIs.
  • Every new version of an API may require updating and testing this piece of code, which is located in the API platform and separated from the services’ code repositories.
  • It takes a lot of work to test this custom script. If you have any issues, you will first learn of them from live traffic, and you will have a hard time debugging it.
  • Your API solution is highly resource-intensive. Parsing and transforming the whole HTTP body in your reverse proxy consumes most of the CPU allocated to your platform, leaving little headroom for security processing and making it a super expensive approach.

In short, it’s best to avoid short-term decision-making. What seems like a good idea at the time may not hold up several years down the road. API management is designed to discover, secure, organize, and monitor your APIs. It should not be used as a shortcut to execute application code.
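
As a contrast, here is a hedged sketch of what separation of concerns looks like in practice: sanitization lives in the service's own codebase as a small, testable function rather than in a gateway script. Field names are purely illustrative.

```javascript
// Sanitization owned by the service, not the gateway: it is versioned,
// reviewed, and tested alongside the API it protects.
const SENSITIVE_FIELDS = ['ssn', 'creditCard', 'email'];

function sanitizeResponse(body) {
  const clean = { ...body };
  for (const field of SENSITIVE_FIELDS) {
    if (field in clean) clean[field] = '[REDACTED]';
  }
  return clean;
}

// The service applies this right before serializing its own response.
sanitizeResponse({ order: 'dozen glazed', email: 'a@b.co' });
// → { order: 'dozen glazed', email: '[REDACTED]' }
```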

→ Separation of concerns is critical when designing your next production platform.

Anti-Pattern 2: Cart Before the Horse

Another interesting anti-pattern is a laser focus on the long-term, possibly idealized, outcome without recognizing or understanding the steps to get there. Your API transformation project is so expensive that you want to ensure everything runs smoothly. So, you choose the most feature-rich API management solution to cover all possible future needs, despite being unable to use most of its capabilities today.

Sure, it’s more expensive, but it seems like a safe bet if it spares you a potential migration in three years. This may seem risk-free, but you only see the tip of the API project iceberg.

Fast forward three years with this top-notch & expensive solution:

  • The transition from the legacy platform took way longer than expected.
  • This new solution required paid training sessions from the vendor for your team and many developers throughout the company
  • You STILL have yet to use many features of the solution.
  • Many developer teams avoided adopting this new platform due to its complexity.
  • Your initial goal of controlling all API calls within the company has yet to be reached.
  • You still have inadequate sleep.

At this point, you acknowledge that the most complete (and complex) solution might not be the best option, so you bite the bullet and decide to migrate to a simpler solution that fits your existing needs. In your attempt to avoid an API management migration three years after starting your project, you ended up causing it anyway, only sooner than initially anticipated.

The point here is that while you should aim for your long-term vision (and choose a solution that aligns with it), address your needs today and strategically build towards that vision. This includes planning for progressive training and adoption by the teams. If the product cannot provide you with a progressive learning curve and deployment journey, then you won’t be able to stick to your plan.

Here is an example of a progressive journey with the same product:

  1. Start small with basic ingress resources on Kubernetes.
  2. Then introduce an API gateway that brings API traffic management and security.
  3. Finally, once you have a much better understanding of the features that matter for your business, transition to an API management platform.

In a nutshell, don’t pick a product because of all the bells and whistles. No amount of cool features will solve your challenges if they never get used. Evaluate products based on how well they meet your needs today and whether they provide a progressive transition to more advanced use cases in the future.

→ Don’t get ahead of yourself when transitioning to your API management platform.

Anti-Pattern 3: Good Enough as Code

As a modern Head of Platform Engineering, you strongly believe in Infrastructure as Code (IaC). Managing and provisioning your resources in declarative configuration files is a modern and great design pattern for reducing costs and risks. Naturally, you will make this a strong foundation while designing your infrastructure.

During your API journey, you will be tempted to take some shortcuts because it can be quicker in the short term to configure a component directly in the API management UI than setting up a clean IaC process. Or it might be more accessible, at first, to change the production runtime configuration manually instead of deploying an updated configuration from a Git commit workflow. Of course, you can always fix it later, but deep inside, those kludges stay there forever.

Or worse, your API management product fails to provide a consistent IaC user experience. Some components must be configured in the UI. Some parts use YAML, others use XML, and some even use proprietary configuration formats. This diversity makes a consistent process impossible.

You say, “Infrastructure as Code is great, but exceptions are ok. Almost Infrastructure as Code is good enough.”

Fast forward three years:

  • 60% of the infrastructure is fully declared in configuration files and sits in a git repository
  • Those configuration files are written in five formats: YAML, INI, XML, JSON, and a custom format.
  • The remaining 40% requires manual operations in some dashboards or files.
  • There is such diversity in configuration formats and processes that your team cannot get the platform under control and constantly needs to be rescued by the teams that know each format or process.
  • The rate of human error is so high that your release process is slow and unreliable. Any change to the infrastructure takes several days to deploy to production, and that is the best-case scenario.
  • In the worst-case scenario, a change is deployed in production, creating a major outage. As your team is not able to troubleshoot the issue quickly, the time to recovery is measured in hours. Your boss anxiously looks at the screen over your shoulder, waiting for the miraculous fix to be deployed. Thousands of donut orders are missed in the process.
  • You don’t even try to sleep tonight.

The conclusion is obvious — setting up API Management partially as code defeats the purpose of reducing costs and risks. It’s only when your API Management solution is 100% as code that you can benefit from a reliable platform, a blazing fast time to market, and fast recovery.

Exceptions to the process will always bring down your platform’s global efficiency and reliability.

→ Never settle for half-baked processes.

Anti-Pattern 4: Chaotic Versioning System

When you start your API journey, planning for and anticipating every use case is difficult. Change is inevitable, but how you manage it is not. As we’ll see in this section, the effects of poor change management can snowball over the years.

Let’s go back to the very beginning: You are launching your brand new API platform and have already migrated hundreds of APIs into production. You are pretty happy with the results; you feel under control and are getting better sleep.

After one year, your state-of-the-art monitoring alerts flood your notifications, pointing to a bunch of API calls from one of your biggest customers with 404 errors. 404 errors are widespread, so you pay little attention to them and quickly forward the issue to the developer team in charge of the API.

During the following months, you see the number of 404 errors and 500 errors rising significantly, affecting dozens of different APIs. You start to feel concerned about this issue and gather your team to troubleshoot and find the root cause.

Your analysis uncovers a more significant problem: your APIs lack a consistent versioning system. You designed your platform as if your API contracts would never change, as if your APIs would last forever.

As a result, when it came to change management and releasing new versions of their APIs, each team followed its own process:

  • Some teams did not bother dealing with compatibility checks and kept pushing breaking changes.
  • Some teams tried to keep their APIs backward compatible at all costs. Not only did this make the codebase a nightmare to maintain, but it slowly became obvious that it discouraged teams from innovating, as they wanted to avoid breaking changes and maintaining compatibility with all versions.
  • Some teams followed a more robust process with the use of URL versioning, like https://donutgpt.com/v1/donuts and https://donutgpt.com/v2/donuts. They were able to maintain multiple versions at the same time, with different codebases for each version. The problem was that other teams were using different strategies, like query parameter versioning (https://donutgpt.com/donuts?version=v1) or even header versioning.
  • Some teams consistently followed a specific versioning strategy like URL versioning but did not provide versioned documentation.
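
The URL-versioning strategy above can be enforced mechanically at the edge. A hedged sketch (the function name and rejection behavior are illustrative, not from any particular product):

```javascript
// Enforce one convention: every path must start with /v<N>/ or the call
// is rejected before it ever reaches a backend service.
function extractApiVersion(pathname) {
  const match = pathname.match(/^\/v(\d+)\//);
  return match ? `v${match[1]}` : null;
}

extractApiVersion('/v1/donuts'); // 'v1'
extractApiVersion('/v2/donuts'); // 'v2'
extractApiVersion('/donuts');    // null, so reject or redirect
```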

This study makes you realize how creative the human brain is — the developers chose so many different options!

The result is that customers were:

  • Using outdated documentation
  • Calling outdated or dead APIs
  • Calling different APIs with different versioning strategies
  • Calling unreliable APIs
  • Receiving donuts with your new “experimental recipe” when they ordered your classic “Legend GPT Donut”

The key takeaways are apparent: No code lasts forever, and change is a natural part of API development. Given this truth, you must have a strong, reliable, and repeatable foundation for your release process and API lifecycle management.

Your choice of API management solution can help, too. Choose a solution that provides a flexible versioning strategy that fits your needs and can be enforced on every API of DonutGPT.

Additionally, ensure teams maintain several versions of their APIs that are easily accessible, as part of a broader change management best practice. This is the only way to maintain a consistent and reliable user experience for your customers.

→ Enforce a uniform versioning strategy for all your APIs.

Anti-Pattern 5: YOLO Dependencies Management

Now that you’ve learned why managing your API versioning strategy is critical, let’s discuss dependency management for APIs — a topic that is often underestimated, for a good reason: it’s pretty advanced.

After the miserable no-versioning-strategy episode, you were reassured to see versioning policies enforced on every piece of code at DonutGPT. You were even starting to sleep better, but if you’ve read this far, you know this can’t last.

After two more months, your state-of-the-art monitoring again alerts you to several API calls from one of your biggest customers, resulting in 404 errors! You’ve got to be kidding me! You know the rest of the story: task force, troubleshooting, TooManyDonutsErrors, root cause analysis, and (drum roll) …

All your APIs indeed followed the enforced versioning strategy: https://donutgpt.com/v1/donuts. So, what happened?

The versioning strategy was enforced only on the routes published on the API management platform. The services behind those APIs followed their own, different versioning strategies, and there was no dependency management between your API routes and backend services.

In other words, https://donutgpt.com/v1/donuts and https://donutgpt.com/v2/donuts were able to call the same version of a service, which led to a situation similar to the no-versioning-strategy episode, with a terrible customer experience. It gets even more complex if some services call other services.

You start to see my point: you need dependency policies enforced on all your APIs and services. Every API needs to be versioned and call a specific service version (or range), and this same approach should be applied to every service. To achieve this, your API management solution must provide a flexible way to express dependencies in API definitions. Furthermore, it should check for dependencies at the deployment phase through intelligent linters to avoid publishing a broken API dependency chain.
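
A deployment-time dependency check could look something like the following. This is a hypothetical sketch; the registry shape, field names, and route format are all invented for illustration.

```javascript
// Every API definition declares which backend service version it depends
// on; the linter verifies the declared version exists before publishing.
const serviceRegistry = {
  'donut-service': ['1.2.0', '2.0.1'],
};

function lintApiDependencies(apiDefinition) {
  const errors = [];
  for (const dep of apiDefinition.dependencies) {
    const known = serviceRegistry[dep.service] ?? [];
    if (!known.includes(dep.version)) {
      errors.push(`${apiDefinition.route}: unknown ${dep.service}@${dep.version}`);
    }
  }
  return errors; // a non-empty list blocks the deployment
}

lintApiDependencies({
  route: '/v2/donuts',
  dependencies: [{ service: 'donut-service', version: '3.0.0' }],
});
// → ['/v2/donuts: unknown donut-service@3.0.0']
```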

These capabilities are uncommon in API management products, so you must choose wisely.

→ Enforce dependency checks at deployment.

Wrap Up

You dedicated most of your career to building DonutGPT’s infrastructure, solving countless challenges during this adventure. The outcome has been quite rewarding: DonutGPT disrupted the donut market thanks to its state-of-the-art AI technology, producing breathtaking donut recipes.

You are proud to be part of this success story; however, while accelerating, the company now faces more complex problems. The biggest problem, by far, is the industrialization of DonutGPT’s APIs consumed by customers and resellers. During this journey, you tried multiple solutions, started over, and made some great decisions… and some debatable ones. DonutGPT messed up a few donut orders while exploring the API world.

Now that you can stand back and see the whole project, you realize that you have hit what you consider today to be anti-patterns. Of course, you learned a lot during this process, and you started thinking it would be a great idea to give that knowledge back to the community through a detailed blog post, for example.

Of course, this story, the character, and the company are fictitious, even though AI-generated donut recipes might be the next big thing. However, these anti-patterns are very real and have been observed repeatedly during our multiple conversations at Traefik Labs with our customers, prospects, and community members.

While planning your API journey, you should consider these five principles to maximize your return and minimize your effort:

  • Design your API platform with a strong separation of concerns. Avoid delegating business logic to the platform.
  • Do not set the bar too high or too fast. Proceed step by step. Start with more straightforward concepts like ingresses and progressively move to more advanced API use cases once you understand them better.
  • While industrializing your processes, tolerating exceptions will defeat the purpose, and you won’t gain all the expected benefits of a fully automated platform.
  • Versioning becomes a critical challenge in the long run. Starting your API journey with a strong and consistent versioning strategy across all your APIs will make your platform more scalable, reliable, and predictable.
  • Within complex infrastructures with many moving parts, controlling and certifying runtime dependencies for all components is crucial to achieving a high level of trust and stability for your platform.

Of course, this list is not exhaustive, but it covers the most common practices. All that said, these recommendations should not prevent you from drifting away and trying different processes. Innovation is built on top of others’ feedback, but still requires creativity.

Posted in Vulnerability | Tagged Cyber Attacks, Data Security, vulnerability

Internal Developer Platforms: The Heart of Platform Engineering

Posted on August 4, 2024 by Maq Verma

Platform engineering involves creating a supportive, scalable and efficient infrastructure that allows developers to focus on delivering high-quality software quickly. Many of us have been following this approach for years, just without a proper name for it.

To get platform engineering right, many have turned to the concept of platforms as products (PaaP), where you put the emphasis on the user experience of the developers themselves and view your internal developers as customers. This implies platforms should have a clear value proposition, a roadmap and dedicated resources so that our internal developers have all the resources we would arm our external customers with if preparing them to onboard into a new product.

However, we can’t discuss the popular trend of treating PaaP without discussing what lies at the heart of this conversation. The PaaP approach is particularly pivotal in the realm of internal developer platforms (IDPs), which are central to the platform engineering craze because you can’t get your external platform right if your internal one is a mess. Traditional approaches often overlook the necessity of aligning the platform’s capabilities with developers’ needs, leading to poor adoption and suboptimal outcomes.

This is where internal developer platforms come into play, serving as the backbone of this engineering paradigm. These platforms are not just about providing tools and services; they are about crafting an experience that empowers developers to perform their best work. When platforms are designed with a deep understanding of what developers truly need, they can significantly enhance productivity and satisfaction.

IDPs are usually referred to as developer-focused infrastructure platforms (not to be confused with a developer control plane) and were made popular by the well-known “Team Topologies” book (they’re something we’ve prioritized for a long time here at Ambassador). “Team Topologies” focuses on structuring business and technology teams for peak efficiency, and a big focus is highlighting the need for platform teams to offer platforms as an internal product to enable and accelerate other teams.

The benefit of internal platforms is that they enable teams to spend more time on delivering business value, provide guardrails for security and compliance, standardize across teams and create an ease of deployment. Here’s why IDPs are critical to building a solid foundation and perfecting your platform strategy as a whole.


Why Internal Developer Platforms Are Critical

Enhanced Developer Experience (DX)

Internal developer platforms focus on improving the overall developer experience, making it easier for developers to access the tools and resources they need. Your developers should not be dreading their experience; instead, they should be able to focus on the things that matter most: good development.

The more friction-free you make your internal platform, the more efficiency and creativity you’ll see, as developers are able to focus on solving business problems rather than grappling with infrastructural complexities. With easier access to tools and fewer operational hurdles, developers can experiment and innovate more freely. This environment encourages the exploration of new technologies and approaches, which can lead to breakthroughs in product development.

Friction-free IDPs include well-documented processes, standardized tools and removing manual work where possible (automation is your friend). If you’ve built your IDP to meet these requirements, then your devs will be happier and more productive.

Streamlined Operations and Resource Management

Speaking of standardization — by standardizing development environments, internal platforms reduce variability and streamline operations across the development life cycle. This not only speeds up the development process but also reduces the likelihood of errors, leading to more stable releases.

Having components and tools centralized in an internal developer platform streamlines the foundation for developer self-service, success and responsibility. A developer platform empowers both developers and platform engineers to focus on and excel in their core business areas, and enables faster development cycles and the ability to ship software with speed and safety.

A strong IDP allows organizations to optimize resource usage, ensuring that developers have access to necessary resources without overprovisioning. This can lead to cost savings and more efficient use of infrastructure.

And as a bonus, a comprehensive IDP helps you not just attract new talent but retain it as well. In a competitive tech landscape, devs are looking for environments where they can increase their skills and work on exciting projects that don’t compromise their intelligence or threaten their ability to innovate freely. Well-designed internal developer platforms can be a key differentiator in whether future devs will want to work on your team.

Avoiding Common Anti-Patterns That Undermine Your API Strategy

A recent example made this recommendation very clear. Anti-patterns that can undermine API management are largely the result of a lack of a cohesive strategy and plan, and biting off more than a team can chew. This is where we see the opportunity for a platform approach to API development.

IDPs help you craft your API platform with a clear division of responsibilities, ensuring that business logic remains separate from the platform itself.

How Do I Implement a Successful IDP?

Note that if you don’t take a proactive approach to your team’s IDP, your developers will find a way to build one anyway.

“I’d argue that anyone who develops software these days has a platform. Accidental platforms, in many cases, are more difficult to manage. Try to take an active role in managing your platform to avoid issues later on,” said Erik Wilde, an industry influencer and recent guest on the “Living on the Edge” podcast.

Therefore, to truly make internal developer platforms a centerpiece of platform engineering and get it right the first time, organizations need to adopt a few strategic practices:

  • Understand and anticipate developer needs: Implement continuous feedback mechanisms so the platform evolves in line with developer requirements. Select every tool with your developers’ needs in mind, applying a holistic lens across your platform. Recognize that in complex environments with numerous components, managing and certifying runtime dependencies is essential to maintaining your developers’ trust and your platform’s stability.
  • Be aware that versioning is a significant challenge over time. Implementing a consistent versioning strategy from the start for all your APIs will enhance your platform’s scalability, reliability and predictability.
  • Invest in scalability: As the organization grows, the platform should be able to scale seamlessly to accommodate more developers and increase workload without performance dips. Ensure the tools you’re building your platform on come with the proper flexibility, room for integrations and composability to expand with your future anticipated growth.
  • Ensure robust security and compliance: The platform should incorporate strong security measures and compliance controls to protect sensitive data and meet regulatory requirements. Standardization and proper governance help promote the security of your IDP, but ensure the proper code reviews, access controls and safeguards against security risks are in place before you socialize your IDP.
  • Promote internal adoption: Through internal promotion and demonstrating clear value propositions, you can encourage widespread adoption of the platform. Involve your own devs early and often in the process, and consider involving other relevant stakeholders as well (think product managers, business leadership, someone from your operations team, etc.). And remember: While it might be obvious to developers and their managers that an IDP could increase developer productivity and focus, it’s not always obvious to other stakeholders. Chances are you’re going to have to prove that your IDP can unlock developer effectiveness and efficiency by building a business case.

There Is No Platform Engineering Without IDPs

In the end, internal developer platforms (IDPs) are not merely a component of platform engineering; they are its core. As platform engineering evolves, placing IDPs at the heart of this transformation is essential for organizations aspiring to lead in the digital age. With the ongoing migration to the cloud and the customization of platforms atop this infrastructure, a deep understanding of IDPs and their pivotal role in platform engineering is becoming increasingly crucial.

Posted in Vulnerability | Tagged Cyber Attacks, Data Security, vulnerability

What’s New for JavaScript Developers in ECMAScript 2024

Posted on August 4, 2024 - August 4, 2024 by Maq Verma

The ECMAScript standard for JavaScript continues to add new language features in a deliberate way. This year there’s a mix of APIs that standardize common patterns that developers have been writing by hand or importing from third-party libraries — including some aimed specifically at library authors — plus improvements in string handling, regular expressions, multithreading and WebAssembly interoperability.

Meanwhile, the TC39 committee that assesses proposals is also making progress on some of the much larger proposals, like the long-awaited Temporal and Decorators that may be ready for ECMAScript 2025, Ecma vice president Daniel Ehrenberg told The New Stack.

“Looking at what we’ve done over the past year, ECMAScript 2024 is a little similar to ECMAScript 2023 in that it sees smaller features; but meanwhile, we’re building very strongly towards these big features.” Many of those only need “the last finishing touches.”

“You need to access the WebAssembly heap reasonably efficiently and frequently from the JavaScript side, because in real applications you will not have communication between the two.”
– Daniel Ehrenberg, Ecma vice president

In fact, since the completed feature proposals for ECMAScript 2024 were signed off in March of this year, ready for approval of the standard in July, at least one important proposal — Set Methods — has already reached stage four, ready for 2025.

Making Developers Happier With Promises

Although promises are a powerful JavaScript feature introduced in ECMAScript 2015, the pattern the promise constructor uses isn’t common elsewhere in JavaScript, and turned out not to be the way developers want to write code, Ehrenberg explained. “It takes some mental bandwidth to use these weird idioms.”

The hope was that over time on the web platform, enough APIs would natively return promises instead of callbacks that developers wouldn’t often need to use the promise constructor. However, existing APIs haven’t changed to be more ergonomic.

“It comes up at least once in every project. Almost every project was writing this same little helper so it being in the language is one of those really nice developer happiness APIs.”
– Ashley Claymore, Bloomberg software engineer

Instead, developers are left with a cumbersome workaround that many libraries, frameworks and other tools — from React to TypeScript — have implemented different versions of: it’s in jQuery as the deferred function. “People have this boilerplate pattern that they have to write over and over again, where they call the promise constructor, they get the resolve and reject callbacks, they write those to a variable, and then they inevitably do something else [with them]. It’s just an annoying piece of code to write,” said Ehrenberg.

Libraries that implemented promises before ECMAScript 2015 typically covered this, but the feature didn’t make it into the language; Chrome briefly supported and then removed a similar option. But developers still need this often enough that the Promise.withResolvers proposal to add a static method made it through the entire TC39 process in the twelve months between ECMAScript 2023 being finalized and the cutoff date for this year’s update to the language — an achievement so unusual that TC39 co-chair Rob Palmer referred to it as a speedrun.

“Previously, when you created a promise, the ways that you resolve it and you give it its final state were APIs only accessible inside the function that you built the promise with,” Ehrenberg continued. “Promise.withResolvers gives you a way to create a promise and it gives you direct access to those resolution functions.”
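
The boilerplate helper the article describes, and its new built-in replacement, can be sketched side by side. The hand-rolled `deferred` below is the common pattern; `Promise.withResolvers` itself requires a recent runtime (Node 22+ or a current browser).

```javascript
// The "deferred" helper many projects wrote by hand before ES2024:
function deferred() {
  let resolve, reject;
  const promise = new Promise((res, rej) => { resolve = res; reject = rej; });
  return { promise, resolve, reject };
}

// With ES2024 this is built in:
// const { promise, resolve, reject } = Promise.withResolvers();

const { promise, resolve } = deferred();
// `resolve` can now be handed to any other code that completes the work.
resolve('donuts ready');
promise.then(value => console.log(value)); // logs 'donuts ready'
```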

Other functions in your code might depend on whether a promise is resolved or rejected, or you might want to pass the function to something else that can resolve the promise for you, reflecting the complex ways promises are used for orchestration in modern JavaScript, Ashley Claymore (a Bloomberg software engineer who has worked on multiple TC39 proposals) suggested.

“The classical way of creating a promise works well when it’s a small task that’s asynchronous; taking something that was purely callback based or something that was promise-like, and then wrapping it up so it was actually a promise,” Claymore said. “In any code where I start doing lots of requests and need to line them up with IDs from elsewhere, so I’m putting promises or resolve functions inside a map because I’m orchestrating lots of async things that aren’t promise based, you’re always having to do this. I need to pull these things out because I’m sending them to different places.”

“It comes up at least once in every project. Almost every project was writing this same little helper so it being in the language is one of those really nice developer happiness APIs.”

Other improvements to promises are much further down the line; Claymore is involved in a proposal to simplify combining multiple promises without using an array — which involves keeping track of which order all the promises are in. “That works fine for like one, two or three things: after that, it can start getting harder to follow the code,” he said. “What was the fifth thing? You’re counting lines of code to make sure you’ve got the right thing.”

Having an Await dictionary of Promises would let developers name promises: particularly helpful when they cover different areas — like gathering user information, database settings and network details that likely return at very different times. This wouldn’t be a difficult feature to develop: the delay is deciding whether it’s useful enough to be in the language because the TC39 committee wants to avoid adding too many niche features that could confuse new developers.
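
A hedged sketch of what such an "await dictionary" could look like, built from today's Promise.all; the helper name is invented, and the proposal's final API may differ.

```javascript
// Resolve a map of named promises into a map of named results, so callers
// destructure by name instead of counting positions in an array.
async function allNamed(promiseMap) {
  const entries = Object.entries(promiseMap);
  const values = await Promise.all(entries.map(([, p]) => p));
  return Object.fromEntries(entries.map(([key], i) => [key, values[i]]));
}

allNamed({
  user: Promise.resolve({ id: 1 }),
  settings: Promise.resolve({ theme: 'dark' }),
}).then(({ user, settings }) => console.log(user.id, settings.theme));
```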

Keeping Compatibility and Categorizing Data

That’s not a concern for the second major feature in ECMAScript 2024: a new way to group objects into categories using Array grouping, something common in other languages (including SQL) that developers frequently import the Lodash userland library for.

You can pass in different items and classify them by some property, like color. “The result is a key value dictionary that is indexed by ‘here’s all your green things, here are your orange things’ and that dictionary can be expressed either as an object or a map”, Palmer explained. Use a map if you want to group keys that aren’t only strings and symbols; to extract multiple data values at the same time (known as destructuring), you need an object.

“As a standards committee we shouldn’t be asking them to incur the cost of risking outages when we already know that something is highly risky.”
– Ehrenberg

That’s useful for everything from bucketing performance data about your web site to grouping a list of settled promises by status, a common use with Promise.allSettled, Claymore said. “You give it an array of promises, it will wait for all of them to settle, then you get back an array of objects that says, ‘did this reject or did it resolve?’ They’re in the same order as you started, but it’s quite common to have one bit of code I want to give all the promises that were successful and resolved, and another bit of code I want to give rejected [promises].” For that you can pass the result of Promise.allSettled to groupBy to group by promise status, which groups all the resolved promises and all the rejected promises separately.
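
The allSettled use case reads naturally with grouping. Below is a hand-rolled equivalent of the Lodash-style helper that ES2024 standardizes as Object.groupBy (the native static method needs Node 21+ or a recent browser), applied to settled-promise results:

```javascript
// Minimal groupBy: bucket items under the key returned by keyFn.
function groupBy(items, keyFn) {
  const groups = {};
  for (const item of items) {
    (groups[keyFn(item)] ??= []).push(item);
  }
  return groups;
}

// The shape Promise.allSettled produces:
const results = [
  { status: 'fulfilled', value: 'glazed' },
  { status: 'rejected', reason: 'out of sprinkles' },
  { status: 'fulfilled', value: 'jelly' },
];
groupBy(results, r => r.status);
// → { fulfilled: [two entries], rejected: [one entry] }
```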

Building the new grouping functionality also delivered a useful lesson about web compatibility.

The utilities in Lodash are functionality that developers could write in 5-10 lines of code, Palmer noted. “But when you look at the frequency at which they’re used, they’re so widely used by so many programs that at some point it’s worth taking the top usage ones and then putting them in the platform, so people don’t have to write their own.” A number of them have now ended up as native constructs.

“This functionality being in the language is a really nice convenience for projects that are trying not to have a large set of dependencies while still having access to these really common things,” Claymore agreed. “They’re not the hardest things to write by hand, but it’s no fun rewriting them by hand and they can subtly get them wrong.”

Unusually, the new Map.groupBy and Object.groupBy methods are static methods rather than array methods, the way Lodash functionality has previously been added to JavaScript. That’s because two previous attempts to add this functionality as array methods both clashed (in different ways) with existing code on websites already using the same two names the proposal came up with, including the popular Sugar library.

This problem could recur any time TC39 proposals try to add new prototype methods to arrays or instance methods, Palmer warned. “Whenever you try and think of any reasonable English language verb you might want to add, it seems it triggers web compatibility problems somewhere on the internet.”

Ideally, good coding standards would avoid that, but part of the reason it takes time to add new features to JavaScript is the need to test for exactly these kinds of issues and work around them when they crop up.

“We can say that the best practice for the web is that users should not be polluting the global prototypes: people should not be adding properties to array or prototype [in their code], because it can lead to web compatibility issues. But it doesn’t matter how much we say that; these sites are already up there, and we have a responsibility to not break them.”

Shipping then withdrawing implementations in browsers makes it more expensive to add new features, Ehrenberg added. “As a standards committee, we shouldn’t be asking them to incur the cost of risking outages when we already know that something is highly risky.” That means similar proposals might use static methods more in the future to avoid the issue.

Updating JavaScript for Better Unicode Handling

JavaScript already has a /u flag for regexp that needs to handle Unicode (introduced in ECMAScript 2015), but that turned out to have some oddities and missing features. The new /v flag fixes some of those (like getting different results if you use an upper or lowercase character when matching, even if you specify that you don’t care about the case) and forces developers to escape special characters. It also allows you to do more complex pattern matching and string manipulation using a new unicodeSets mode, which lets you name Unicode sets so that you can refer to the ASCII character set or the emoji character set.

The new options will simplify internationalization and make it easier to support features for diversity. The /u flag already lets you refer to emoji, but only those made up of a single character, which excludes emoji that combine multiple characters to create a new emoji or to specify the gender or skin tone of an emoji representing a person, and even some country flags.

It also simplifies operations like sanitizing or verifying inputs, by adding more set operations including intersections and nesting, making complex regular expressions more readable. “It adds subtraction so you could say, for example, ‘all the ASCII characters’, but then subtract the digits zero to nine, and that would match a narrower range than all of the ASCII characters,” Palmer explained. You could remove invisible characters or convert numbers expressed as words into digits.

“It’s easy to make assumptions about Unicode, and it’s such a big topic; the number of people in the world [who] understand these things well enough to not make mistakes is very small.”
– Claymore

You can also match against various properties of strings, such as what script they’re written in, so that you can find characters like π and treat them differently from p in another language.

You can’t use the /u flag and the /v flag together, and you will probably always want to use /v. Palmer described the choice as “the /v flag enables all the good parts of the /u flag with new features and improvements, but some of them are backwards incompatible with the /u flag.”

ECMAScript 2025 will add another useful improvement for regexp: being able to use the same names in different branches of a regexp. Currently, if you’re writing a regular expression to match something that can be expressed in multiple ways, like the year in a date that might be /2024 or just /24, you can’t use ‘year’ in both branches of the regular expression, even though only one branch can ever match, so you have to say ‘year1’ and ‘year2’ or ‘shortyear’ and ‘longyear’.

“Now we say that’s no longer an error, and you can have multiple parts of the regular expression given the same name, as long as they are on different branches and as long as only one of them can ever be matched,” Claymore explained.

Another new feature in ECMAScript 2024 improves Unicode handling by ensuring that code is using well-formed Unicode strings.

Strings in JavaScript are technically UTF-16 encoded: in practice, JavaScript (like a number of other languages) doesn’t enforce that those encodings are valid UTF-16 even though that’s important for the URI API and the WebAssembly Component Model, for example. “There are various APIs in the web platform that need well-formed strings and they might throw an error or silently replace the string if they get bad data,” Palmer explained.

Because it’s possible for valid JavaScript code to use strings that are invalid UTF-16 sequences, developers need ways to check for that. The new isWellFormed method checks that a JavaScript string is correctly encoded; if not, the new toWellFormed method fixes the string by replacing anything that isn’t correctly encoded with the 0xFFFD replacement character �.

While experienced Unicode developers could already write checks for this, “It’s very easy to get wrong,” Claymore noted. “It’s easy to make assumptions about Unicode, and it’s such a big topic; the number of people in the world that actually properly understand these things well enough to not make mistakes is very small. This encourages people to fall into the pit of success rather than try and do these things by hand and make mistakes because of not knowing all the edge cases.”

Having it in the language itself might even prove more efficient, Palmer suggested. “Potentially, one of the advantages of delegating this to the JavaScript engine is that it might be able to find faster ways to do this check. You could imagine, for example, it might just cache a single bit of information with every string to say ’I’ve already checked this, this string is not good’ so that every time you pass it somewhere that needs a good string, it doesn’t need to walk every character to check it again, but just look at that one bit.”

Adding Locks With Async Code

“On the main thread, where you’re providing interactivity to the user, it’s one of the rules of the web: thou shalt not halt the main thread!”
– Rob Palmer, TC-39 co-chair

JavaScript is technically a single-threaded language that supports multithreading and asynchronous code. That’s because as well as having web workers and service workers that are isolated from the main thread that provides the user interface, it’s the nature of the web that quite often you’re waiting for something from the network or the operating system, so the main thread can run other code.

The tools in JavaScript for managing this continue to get more powerful with a new option in ECMAScript 2024, Atomics.waitAsync.

“If you want to do multithreading in JavaScript, we have web workers, you can spin up another thread, and the model that was originally based on is message passing, which is nice and safe,” Palmer explained. “But for people [who] want to go faster, where that becomes a bottleneck, shared memory is the more raw, lower-level way of having these threads cooperate on shared data, and SharedArrayBuffer was introduced long ago to permit this memory sharing. And when you’ve got a multithreaded system with shared memory, you need locks to make sure that you got orderly operations.”

“When you wait, you lock up the thread, it can’t make any progress. And that’s fine on a worker thread. On the main thread, where you’re providing interactivity to the user, it’s one of the rules of the web: thou shalt not halt the main thread!”

Async avoids blocking the main thread because it can move on to any other tasks that are ready to go, like loading data that has come in from the network, and the Atomics.wait API offers event-based waiting when you’re not on the main thread. But sometimes you do want the main thread to wait for something.

“Even if you’re not on the main thread, you shouldn’t be blocking most of the time,” Ehrenberg warned, while noting that “it was important for game engines to be allowed to block when they’re not on the main thread, to be able to recreate what they could do in C++ code bases.”

Developers who need this have created workarounds, again using message passing, but that added overhead and slowed things down. Atomics.waitAsync can be used on the main thread and provides a first-class way of waiting on a lock. “The key thing is that it doesn’t stall the main thread,” Palmer said.

“If you call it and the lock is not ready, it will instead give you a backup promise so [that] you can use regular async/await and treat it just like any other promise. This solves how to have high-performance access operations on locks.”
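The shape of the API can be sketched like this, assuming Node.js (where SharedArrayBuffer is available without cross-origin isolation); in a real program the store/notify half would live in a worker rather than inline:

```javascript
const sab = new SharedArrayBuffer(4);
const shared = new Int32Array(sab);

// Wait, without blocking, while shared[0] is still 0.
// If the value already differs, you get { async: false, value: "not-equal" };
// otherwise you get { async: true, value: Promise }.
const waitResult = Atomics.waitAsync(shared, 0, 0);
if (waitResult.async) {
  waitResult.value.then((outcome) => {
    // outcome is "ok" once another agent stores a new value and notifies.
  });
}

// Normally another worker would do this part.
Atomics.store(shared, 0, 1);
Atomics.notify(shared, 0);
```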

Another proposal still in development takes a slightly different approach to waiting on the main thread and would be useful for making multithreaded applications written in Emscripten more efficient. Atomics.pause promises ‘microwaits’ that can be called from the main thread or worker threads. “It does block, but it’s limited in how long it can block for,” Claymore told us.

Most JavaScript developers won’t use either of these options directly, Palmer pointed out: “It’s very hard to write threaded code.” But mutex libraries would likely rely on it, as might tools for compiling to WebAssembly.

“We can all benefit from this, even if we’re not using it directly.”

Easier WebAssembly Integration

Another proposal for adding a JavaScript API for features previously handled in DOM APIs or available to WebAssembly bytecode, but not JavaScript, is resizable array buffers.

WebAssembly and JavaScript programs need to share memory. “You need to access the WebAssembly heap reasonably efficiently and frequently from the JavaScript side, because in real applications you will not have communication between the two sides; they’re largely sharing their information via the heap,” Ehrenberg explained.

“If you’ve got a WebAssembly toolchain like Emscripten, it means it can do this without creating wrapper objects.”
– Palmer

WebAssembly memory can grow if required: but if it does and you want to access it from JavaScript, that means detaching the ArrayBuffer you’re using for handling binary data in memory and building a new TypedArray over the underlying ArrayBuffer next time you need to access the heap. That’s extra work that fragments the memory space on 32-bit systems.

Now you can create a new type of array buffer that’s resizable, so it can grow or shrink without needing to be detached.

“If you’ve got a WebAssembly toolchain like Emscripten, it means it can do this without creating wrapper objects, which just add inefficiencies,” Palmer added. Again, while few JavaScript developers will use this directly, libraries will likely use resizable arrays for efficiency and no longer need to work around a missing part of the language to do it — making them smaller and faster to load by reducing the amount of code that needs to be downloaded.

Developers have to explicitly choose this, because array buffers and typed arrays are the most common way hackers try to attack browsers, and making the existing ArrayBuffer resizable would mean changing a lot of code that’s been extensively tested and hardened, possibly creating bugs and vulnerabilities.

“That detach operation was initially created to enable transfer to and from workers,” Ehrenberg explained. “When you send an ArrayBuffer via postMessage to transfer it to another worker, the ownership transfers and all the typed arrays that are closing over that ArrayBuffer become detached.” A companion proposal lets developers transfer the ownership of array buffers so they can’t be tampered with.

“If you pass it into a function, the receiver can use this transfer function to acquire ownership, so anyone who used to have a handle to it no longer has it,” Palmer explained. “That’s good for integrity.”

As well as being part of ECMAScript 2024, resizable array buffers have been integrated into the WebAssembly JS API; Ehrenberg called this out as an example of collaboration between different communities in the web ecosystem that’s working well.

“These are efforts that span multiple different standards committees and require explicit cooperation among them. That can be complicated because you have to bring a lot of people in the loop, but ultimately, I think it leads to [a] good design process. You get a lot of vetting.”

Posted in VulnerabilityTagged Cyber Attacks, Data Security, vulnerabilityLeave a comment

Jumpstart AI Workflows With Kubernetes AI Toolchain Operator

Posted on August 4, 2024 - August 4, 2024 by Maq Verma

Generative AI is booming as industries and innovators look for ways to transform digital experiences. From AI chatbots to language translation tools, people are interacting with a common set of AI/ML models, known as language models, in everyday scenarios. As a result, new language models have rapidly grown in size, performance and use cases in recent years.

As an application developer, you may want to integrate a high-performance model by simply making a REST call and pointing your app to an inferencing endpoint for options like Falcon, Mistral, Phi and other models. Just like that, you’ve unlocked the doors to the AI kingdom and all kinds of intelligent applications.

Open source language models are a cost-effective way to experiment with AI, and Kubernetes has emerged as the open source platform best suited for scaling and automating AI/ML applications without compromising the security of user and company data.

“Let’s make Kubernetes the engine of AI transformation,” said Jorge Palma, Microsoft principal product manager lead. He gave the keynote at KubeCon Europe 2024, where AI use cases were discussed everywhere. Palma talked about the number of developers he’s met who are deploying models locally in their own infrastructure, putting them in containers and using Kubernetes clusters to host them.

“Container images are a great format for models. They’re easy to distribute,” Palma told KubeCon. “Then you can deploy them to Kubernetes and leverage all the nice primitives and abstractions that it gives you — for example, managing that heterogeneous infrastructure, and at scale.”

Containers also help you avoid the annoying “but it runs fine on my machine” issue. They’re portable, so your models run consistently across environments. They simplify version control to better maintain iterations of your model as you fine-tune for performance improvements. Containers provide resource isolation, so you can run different AI projects without mixing up components. And, of course, running containers in Kubernetes clusters makes it easy to scale out — a crucial factor when working with large models.

“If you aren’t going to use Kubernetes, what are you going to do?” asked Ishaan Sehgal, a Microsoft software engineer. He is a contributor to the Kubernetes AI Toolchain Operator (KAITO) project and has helped develop its major components to simplify AI deployment on a given cluster. KAITO is a Kubernetes operator and open source project developed at Microsoft that runs in your cluster and automates the deployment of large AI models.

As Sehgal pointed out, Kubernetes gives you the scale and resiliency you need when running AI workloads. Otherwise, if a virtual machine (VM) fails or your inferencing endpoint goes down, you must attach another node and set everything up again. “The resiliency aspect, the data management — Kubernetes is great for running AI workloads, for those reasons,” he said.

Kubernetes Makes It Easier, KAITO Takes It Further

Kubernetes makes it easier to scale out AI models, but it’s not exactly easy. In my article “Bring your AI/ML workloads to Kubernetes and leverage KAITO,” I highlight some of the hurdles that developers face with this process. For example, just getting started is complicated. Without prior experience, you might need several weeks to correctly set up your environment. Downloading and storing the large model weights, upwards of 200 GB in size, is just the beginning. There are storage and loading time requirements for model files. Then you need to efficiently containerize your models and host them — choosing the right GPU size for your model while keeping costs in mind. And then there’s troubleshooting pesky quota limits on compute hardware.

Using KAITO, a workflow that previously could span weeks now takes only minutes. This tool streamlines the tedious details of deploying, scaling, and managing AI workloads on Kubernetes, so you can focus on other aspects of the ML life cycle. You can choose from a range of popular open source models or onboard your custom option, and KAITO tunes the deployment parameters and automatically provisions GPU nodes for you. Today, KAITO supports five model families and over 10 containerized models, ranging from small to large language models.

For an ML engineer like Sehgal, KAITO overcomes the hassle of managing different tools, add-ons and versions. You get a simple, declarative interface “that encapsulates all the requirements you need for running your inferencing model. Everything gets set up,” he explained.

How KAITO Works

Using KAITO is a two-step process. First, install KAITO on your cluster, and then select a preset that encapsulates all the requirements needed for inference with your model. Within the associated workspace custom resource definition (CRD), a minimum GPU size is recommended so you don’t have to search for the ideal hardware. You can always customize the CRD to your needs. After deploying the workspace, KAITO uses the node provisioner controller to automate the rest.

“KAITO is basically going to provision GPU nodes and add them to your cluster on your behalf,” explained Microsoft senior cloud evangelist Paul Yu. “As a user, I just have to deploy my workspace into the AKS cluster, and that creates the additional CR.”

As shown in the following KAITO architecture, the workspace invokes a provisioner to create and configure the right-sized infrastructure for you, and it even distributes your workload across smaller GPU nodes to reduce costs. The project uses open source Karpenter APIs for the VM configuration based on your requested size, installing the right drivers and device plug-ins for Kubernetes.

Graphic of Kubernetes AI Toolchain Operator


For applications with compliance requirements, KAITO provides granular control over data security and privacy. You can ensure that models are ring-fenced within your organization’s network and that your data never leaves the Kubernetes cluster.

Check out this tutorial on how to bring your own AI models to intelligent apps on Azure Kubernetes Service, where Sehgal and Yu integrate KAITO in a common e-commerce application in a matter of minutes.

Working With Managed Kubernetes

Currently, you can use KAITO to provision GPUs on Azure, but the project is evolving quickly. The roadmap includes support for other managed Kubernetes providers. When you use a managed Kubernetes service, you can interact with other services from that cloud platform more easily to add capabilities to your workflow or applications.

Earlier this year, at Microsoft Build 2024, BMW talked about its use of generative AI and Azure OpenAI Service in the company’s connected car app, My BMW, which runs on AKS.

Brendan Burns, co-founder of the Kubernetes open source project, introduced the demo. “What we’re seeing is, as people are using Azure OpenAI Service, they’re building the rest of the application on top of AKS,” he told the audience. “But, of course, just using OpenAI Service isn’t the only thing you might want to use. There are a lot of reasons why you might want to use open source large language models, including situations like data and security compliance, and fine-tuning what you want to do. Maybe there’s just a model out there that’s better suited to your task. But doing this inside of AKS can be tricky so we integrated the Kubernetes AI Toolchain Operator as an open source project.”

Make Your Language Models Smarter with KAITO

Open-source language models are trained on extensive amounts of text from a variety of sources, so the output for domain-specific prompts may not always meet your needs. Take the pet store application, for example. If a customer asks the integrated pre-trained model for dog food recommendations, it might give different pricing options across a few popular dog breeds. This is informative but not necessarily useful as the customer shops. After fine-tuning the model on historical data from the pet store, the recommendations can instead be tailored to well-reviewed, affordable options available in the store.

As a step in your ML lifecycle, fine-tuning helps customize open source models to your own data and use cases. The latest release of KAITO, v0.3.0, supports fine-tuning, inference with adapters and a broader range of models. You can simply define your tuning method and data source in a KAITO workspace CR and see your intelligent app become more context-aware while maintaining data security and compliance requirements in your cluster.

To stay up to date on the project roadmap and test these new features, check out KAITO on GitHub.


Open Source or Closed? The AI Dilemma

Posted on August 4, 2024 - August 4, 2024 by Maq Verma

Artificial intelligence is in the middle of a perfect storm in the software industry, and now Mark Zuckerberg is calling for open sourced AI.

Three powerful perspectives are colliding on how to control AI:

  1. All AI should be open source for sharing and transparency.
  2. Keep AI closed-source and allow big tech companies to control it.
  3. Establish regulations for the use of AI.

There are a few facts that make this debate tricky. First, if you have a model’s source code, you know absolutely nothing about how the model will behave. Openness in AI requires far more than providing source code. Second, AI comes in many different flavors and can be used to solve a broad range of problems.

AI ranges from traditional models for fraud detection and targeted advertising to generative AI that powers chatbots which, on the surface, produce human-like results, pushing us closer and closer to the ultimate (and scary) goal of Artificial General Intelligence (AGI). Finally, the ideas listed above for controlling AI all have a proven track record of improving software in general.

Understanding the Different Perspectives

Let’s discuss the different perspectives listed above in more detail.

Perspective #1 — All AI should be open source for sharing and transparency: This comes from a push for transparency with AI. Open source is a proven way to share and improve software. It provides complete transparency when used for conventional software. Open source software has propelled the software industry forward by leaps and bounds.

Perspective #2 — Keep AI closed-source and allow big tech companies to control it: Closed-source, or proprietary software, is the idea that an invention can be kept a secret, away from the competition, to maximize financial gain. To open source idealists, this sounds downright evil; however, it is more of a philosophical choice than one that exists on the spectrum of good and evil. Most software is proprietary, and that is not inherently bad — it is the foundation of a competitive and healthy ecosystem. It is a fundamental right of any innovator who creates something new to choose the closed-source path. The question becomes, if you operate without transparency, what guarantees can there be around responsible AI?

Perspective #3 — Establish regulations for using AI: This comes from lawmakers and elected officials pushing for regulation. The basic idea is that if a public function or technology is so powerful that bad actors or irresponsible management could hurt the general public, a government agency should be appointed to develop controls and enforce those controls. A school of thought suggests that incumbent and current leaders in AI also want regulation, but for reasons that are less pure — they want to freeze the playing field with them in the lead. We will primarily focus on the public good area.

The True Nature of Open Source

Before generative AI burst onto the scene, most software running in data centers was conventional. You can determine precisely what it does if you have the source code for traditional software. An engineer fluent in the appropriate programming language can review the code and determine its logic. You can even modify it and alter its behavior. Open source (or open source code) is another way of saying — I am going to provide everything needed to determine behavior and change behavior. In short, the true nature of open source software is to provide everything you need to understand the software’s behavior and change it.

For a model to be fully open, you need the training data, the source code of the model, the hyperparameters used during training, and, of course, the trained model itself, which is composed of the billions (and soon trillions) of parameters that store the model’s knowledge — also known as parametric memory. Now, some organizations only provide the model, keep everything else to themselves, and claim it is “open source.” This practice is known as “open-washing” and is generally frowned upon by both the open and closed-source communities as disingenuous. I would like to see a new term used for AI models that are partially shared. Maybe “partially open model” or “model from an open washing company.”

There is one final rub when it comes to fully shared models. Let’s say an organization wants to do the right thing and shares everything about a model — the training data, the source code, the hyperparameters, and the trained model. You still can’t determine precisely how it will behave unless you test it extensively. The parametric memory that determines behavior is not human-readable. Again, the industry needs a different term for fully open models. A term that is different from “open source,” which should only be used for non-AI software because the source code of a model does not help determine the behavior of the model. Perhaps “open model.”

Common Arguments

Let’s look at some common arguments that endorse using only one of the previously described perspectives. These are passionate defenders of their perspective, but that passion can cloud judgment.

Argument: Closed AI supporters claim that big tech companies have the means to guard against potential dangers and abuse. Therefore, AI should be kept private and out of the open source community.

Rebuttal: Big tech companies have the means to guard against potential abuse, but that does not mean they will do it judiciously or at all. Furthermore, there are other objectives besides this. Their primary purpose is making money for their shareholders, which will always take precedence.

Argument: Those who think that AI could threaten humanity like to ask, “Would you open source the Manhattan Project?”

Rebuttal: This is an argument for governance. However, it is an unfair and incorrect analogy. The purpose of the Manhattan Project was to build a bomb during wartime by using radioactive materials to produce nuclear fusion. Nuclear fusion is not a general-purpose technology that can be applied to different tasks. You can make a bomb and generate power — that’s it. The ingredients and the results are dangerous to the general public, so all aspects should be regulated. AI is much different. As described above, it comes in varying flavors with varying risks.

Argument: Proponents of open sourcing AI say that open source facilitates the sharing of science, provides transparency, and is a means to prevent a few from monopolizing a powerful technology.

Rebuttal: This is primarily true, but it is not entirely true. Open source does provide sharing. For an AI model, it is only going to provide some transparency. Finally, whether “open models” will prevent a few from monopolizing their power is debatable. Running a model like ChatGPT at scale requires compute that only a few companies can acquire.

Needs of the Many Outweigh the Needs of the Few

In “Star Trek II: The Wrath of Khan,” Spock dies from radiation poisoning. Spock realizes that the ship’s main engines must be repaired to facilitate an escape, but the engine room is flooded with lethal radiation. Despite the danger, Spock enters the radiation-filled chamber to make the necessary repairs. He successfully restores the warp drive, allowing the Enterprise to reach a safe distance. Unfortunately, Vulcans are not immune to radiation. His dying words to Captain Kirk explain the logic behind his actions, “The needs of the many outweigh the needs of the few or the one.”

This is perfectly sound logic and will have to be used to control AI. Specific models pose a risk to the general public. For these models, the general public’s needs outweigh innovators’ rights.

Should All AI Be Open Source?

Let’s review the axioms established thus far:

  • Open source should remain a choice.
  • Open models are not as transparent as non-AI software that is open sourced.
  • Closed source is a right of the innovator.
  • There is no guarantee that big tech will correctly control their AI.
  • The needs of the general public must take precedence over all others.

The five bullets above represent everything I tried to make clear about open source, closed source, and regulations. If you believe them to be accurate, the answer to the question, “Should All AI Be Open Source?” is no, because open source alone will not control AI, and neither will closed source. Furthermore, in a fair world, open source and open models should remain a choice, and closed source should remain a right.

We can go one step further and talk about the actions the industry can take as a whole to move toward effective control of AI:

  • Determine the types of models that pose a risk to the general public. Models with high risk, because they control information (chatbots) or dangerous resources (automated cars), should be regulated.
  • Organizations should be encouraged to share their models as fully open models. The open source community will need to step up and either prevent or label models that are only partially shared. The open source community should also put together tests that can be used to rate models.
  • Closed models should still be allowed if they do not pose a risk to the general public. Big Tech should develop its controls and tests that it funds and shares. This may be a chance for Big Tech to work closely with the open source community to solve a common problem.


How To Run an Agent on Federated Language Model Architecture

Posted on August 4, 2024 - August 4, 2024 by Maq Verma

In the first part of this series, I introduced the idea of federated language models, where we take advantage of a capable cloud-based large language model (LLM) and a small language model (SLM) running at the edge.

To recap, an agent sends the user query (1) along with the available tools (2) to a cloud-based LLM to map the prompt into a set of functions and arguments (3). It then executes the functions to generate appropriate context from a database (4a). If there are no tools involved, it leverages the simple RAG mechanism to perform a semantic search in a vector database (4b). The context gathered from either of the sources is then sent (5) to an edge-based SLM to generate a factually correct response. The response (6) generated by the SLM is sent as the final answer (7) to the user query.

This tutorial focuses on the practical implementation of a federated LM architecture based on the following components:

  • OpenAI GPT-4 Omni as an LLM.
  • Microsoft Phi-3 as an SLM.
  • Ollama as the inference engine for Phi-3.
  • Nvidia Jetson AGX Orin as the edge device to run Ollama.
  • MySQL database and a Flask API server running locally.
  • Chroma as the local vector store for semantic search.

Refer to the tutorial on setting up Ollama on Jetson Orin and implementing the RAG agent for additional context and details.

Start by cloning the Git repository https://github.com/janakiramm/federated-llm.git, which has the scripts, data, and Jupyter Notebooks. This tutorial assumes that you have access to OpenAI and an Nvidia Jetson Orin device. You can also run Ollama on your local workstation and change the IP address in the code.

Step 1: Run Ollama on Jetson Orin

SSH into Jetson Orin and run the commands listed in the setup_ollama.sh file.

Verify that you can connect to and access the model by running the following command from the workstation where you run Jupyter Notebook.

curl http://localhost:11434/v1/chat/completions \
    -H "Content-Type: application/json" \
    -d '{
        "model": "phi3:mini",
        "messages": [
            {
                "role": "system",
                "content": "You are a helpful assistant."
            },
            {
                "role": "user",
                "content": "What is the capital of France?"
            }
        ]
    }'


Replace localhost with the IP address of your Jetson Orin device. If everything goes well, you should be able to get a response from the model.

Congratulations, your edge inference server is now ready!

Step 2: Run MySQL DB and Flask API Server

The next step is to run the API server, which exposes a set of functions that will be mapped to the prompt. To make this simple, I built a Docker Compose file that exposes the REST API endpoint for the MySQL database backend.

Switch to the api folder and run the following command:

start_api_server.sh


Check if two containers are running on your workstation with the docker ps command.

If you run the command curl "http://localhost:5000/api/sales/trends?start_date=2023-05-01&end_date=2023-05-30", you should see the response from the API.

Step 3: Index the PDF and Ingest the Embeddings in Chroma DB

With the API in place, it’s time to generate the embeddings for the datasheet PDF and store them in the vector database.

For this, run the Index_Datasheet.ipynb Jupyter Notebook, which is available in the notebooks folder.

A simple search retrieves the phrases that semantically match the query.
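Under the hood, a semantic search like this scores the query embedding against each stored chunk embedding and returns the closest matches. Chroma does this with real learned embeddings; the pure-Python sketch below uses made-up 3-dimensional vectors and hypothetical datasheet chunks purely to show the mechanics, not the repository's actual code.

```python
import math

# Hypothetical datasheet chunks paired with made-up 3-d "embeddings".
store = {
    "The device delivers 275 TOPS of AI performance.": [0.9, 0.1, 0.0],
    "Power consumption is configurable from 15W to 60W.": [0.1, 0.9, 0.2],
}

def cosine(a, b):
    # Cosine similarity: dot product over the product of the vector norms.
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def semantic_search(query_embedding, n_results=1):
    # Rank stored chunks by similarity to the query embedding.
    ranked = sorted(store, key=lambda doc: cosine(store[doc], query_embedding),
                    reverse=True)
    return ranked[:n_results]

# A query embedding close to the "power" chunk retrieves that phrase.
print(semantic_search([0.0, 1.0, 0.1]))
```

In the real notebook the embeddings come from a model and the ranking happens inside Chroma, but the retrieval step reduces to this nearest-neighbor lookup.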

Step 4: Run the Federated LM Agent

The Jupyter Notebook, Federated-LM.ipynb, has the complete code to implement the logic. Let’s understand the key sections of the code.

We will import the API client that exposes the tools to the LLM.

First, we initialize two LLMs: GPT-4o (Cloud) and Phi3:mini (Edge)

After creating a Python list with the signatures of the tools, we will let GPT-4o map the prompt to appropriate functions and their arguments.

For example, passing the prompt What was the top selling product in Q2 based on revenue? to GPT-4o results in the model responding with the function get_top_selling_products and the corresponding arguments. Notice that a capable model is able to translate Q2 into a date range from April 1st to June 30th. This is exactly the power we want to exploit from the cloud-based LLM.

Once we enumerate the tool(s) suggested by GPT-4o, we execute, collect, and aggregate the output to form the context.
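That execute-and-aggregate step amounts to dispatching the function name and JSON arguments returned by the model to a local registry of Python functions. A minimal sketch follows; get_top_selling_products, its argument names, and its stubbed return value are assumptions based on the example above rather than the repository's actual code.

```python
import json

def get_top_selling_products(start_date: str, end_date: str) -> list:
    # Stand-in for the real call to the local Flask API.
    return [{"product": "Widget A", "revenue": 120000}]

# Registry mapping tool names to callables.
TOOLS = {"get_top_selling_products": get_top_selling_products}

def execute_tool_call(call: dict) -> str:
    """Dispatch one {'name': ..., 'arguments': ...} tool call and return JSON."""
    fn = TOOLS[call["name"]]
    args = json.loads(call["arguments"])
    return json.dumps(fn(**args))

# GPT-4o has already translated "Q2" into an explicit date range:
context = execute_tool_call({
    "name": "get_top_selling_products",
    "arguments": '{"start_date": "2024-04-01", "end_date": "2024-06-30"}',
})
```

Multiple suggested calls would each be executed this way and their JSON outputs concatenated to form the context string.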

If the prompt doesn’t translate to tools, we attempt to use the retriever based on the semantic search from the vector database.

To avoid sending sensitive context to the cloud-based LLM, we leverage the model (edge_llm) at the edge for generation.

Finally, we implement the agent that orchestrates the calls between the cloud-based LLM and the edge-based LLM. It checks if the tools list is empty and then moves to the retriever to generate the context. If both are empty, the agent responds with the phrase “I don’t know.”
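Stripped of the model calls, the orchestration described above reduces to a small routing function. In this sketch, map_to_tools, run_tool, retrieve, and edge_generate are hypothetical stand-ins for the GPT-4o mapping call, tool execution, the Chroma search, and the Phi-3 generation.

```python
def answer(query, map_to_tools, run_tool, retrieve, edge_generate):
    # 1-3: the cloud LLM maps the prompt to zero or more tool calls.
    calls = map_to_tools(query)
    if calls:
        # 4a: execute the tools and aggregate their output as context.
        context = "\n".join(run_tool(c) for c in calls)
    else:
        # 4b: fall back to semantic search in the vector store.
        context = retrieve(query)
    if not context:
        # Neither source produced context.
        return "I don't know."
    # 5-6: the edge SLM generates the final answer from the context.
    return edge_generate(query, context)

# Toy stand-ins: no tool matches and the vector store is empty.
reply = answer(
    "What was the top selling product?",
    map_to_tools=lambda q: [],
    run_tool=lambda c: "",
    retrieve=lambda q: "",
    edge_generate=lambda q, ctx: f"Answer from context: {ctx}",
)
```

With these stand-ins the agent returns "I don't know.", matching the fallback behavior described above.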

Below is the response from the agent based on tools, retriever, and unknown context.

To summarize, we implemented a federated LLM approach where an agent sends the user query along with available tools to a cloud-based LLM, which maps the prompt into functions and arguments. The agent executes these functions to generate context from a database. If no tools are involved, a simple RAG mechanism is used for semantic search in a vector database. The context is then sent to an edge-based SLM to generate a factually correct response, which is provided as the final answer to the user query.

Posted in Vulnerability | Tagged: Cyber Attacks, Data Security, Model Architecture

SAP AI Core Vulnerabilities Expose Customer Data to Cyber Attacks

Posted on August 4, 2024 - August 4, 2024 by Maq Verma

Cybersecurity researchers have uncovered security shortcomings in the SAP AI Core cloud-based platform for creating and deploying predictive artificial intelligence (AI) workflows that could be exploited to get hold of access tokens and customer data.

The five vulnerabilities have been collectively dubbed SAPwned by cloud security firm Wiz.

“The vulnerabilities we found could have allowed attackers to access customers’ data and contaminate internal artifacts – spreading to related services and other customers’ environments,” security researcher Hillai Ben-Sasson said in a report shared with The Hacker News.

Following responsible disclosure on January 25, 2024, the weaknesses were addressed by SAP as of May 15, 2024.

In a nutshell, the flaws make it possible to obtain unauthorized access to customers’ private artifacts and credentials to cloud environments like Amazon Web Services (AWS), Microsoft Azure, and SAP HANA Cloud.

They could also be used to modify Docker images on SAP’s internal container registry, SAP’s Docker images on the Google Container Registry, and artifacts hosted on SAP’s internal Artifactory server, resulting in a supply chain attack on SAP AI Core services.

Furthermore, the access could be weaponized to gain cluster administrator privileges on SAP AI Core’s Kubernetes cluster by taking advantage of the fact that the Helm package manager server was exposed to both read and write operations.

“Using this access level, an attacker could directly access other customer’s Pods and steal sensitive data, such as models, datasets, and code,” Ben-Sasson explained. “This access also allows attackers to interfere with customer’s Pods, taint AI data and manipulate models’ inference.”

Wiz said the issues arise due to the platform making it feasible to run malicious AI models and training procedures without adequate isolation and sandboxing mechanisms.

“The recent security flaws in AI service providers like Hugging Face, Replicate, and SAP AI Core highlight significant vulnerabilities in their tenant isolation and segmentation implementations,” Ben-Sasson told The Hacker News. “These platforms allow users to run untrusted AI models and training procedures in shared environments, increasing the risk of malicious users being able to access other users’ data.”

“Unlike veteran cloud providers who have vast experience with tenant-isolation practices and use robust isolation techniques like virtual machines, these newer services often lack this knowledge and rely on containerization, which offers weaker security. This underscores the need to raise awareness of the importance of tenant isolation and to push the AI service industry to harden their environments.”

As a result, a threat actor could create a regular AI application on SAP AI Core, bypass network restrictions, and probe the Kubernetes Pod’s internal network to obtain AWS tokens and access customer code and training datasets by exploiting misconfigurations in AWS Elastic File System (EFS) shares.

“People should be aware that AI models are essentially code. When running AI models on your own infrastructure, you could be exposed to potential supply chain attacks,” Ben-Sasson said.

“Only run trusted models from trusted sources, and properly separate between external models and sensitive infrastructure. When using AI services providers, it’s important to verify their tenant-isolation architecture and ensure they apply best practices.”

The findings come as Netskope revealed that the growing enterprise use of generative AI has prompted organizations to use blocking controls, data loss prevention (DLP) tools, real-time coaching, and other mechanisms to mitigate risk.

“Regulated data (data that organizations have a legal duty to protect) makes up more than a third of the sensitive data being shared with generative AI (genAI) applications — presenting a potential risk to businesses of costly data breaches,” the company said.

They also follow the emergence of a new cybercriminal threat group called NullBulge that has trained its sights on AI- and gaming-focused entities since April 2024 with an aim to steal sensitive data and sell compromised OpenAI API keys in underground forums while claiming to be a hacktivist crew “protecting artists around the world” against AI.

“NullBulge targets the software supply chain by weaponizing code in publicly available repositories on GitHub and Hugging Face, leading victims to import malicious libraries, or through mod packs used by gaming and modeling software,” SentinelOne security researcher Jim Walter said.

“The group uses tools like AsyncRAT and XWorm before delivering LockBit payloads built using the leaked LockBit Black builder. Groups like NullBulge represent the ongoing threat of low-barrier-of-entry ransomware, combined with the evergreen effect of info-stealer infections.”

Posted in Cyber Attacks, Vulnerability | Tagged: Cyber Attacks, Data Security, vulnerability

SolarWinds Patches 8 Critical Flaws in Access Rights Manager Software

Posted on August 4, 2024 - August 4, 2024 by Maq Verma

SolarWinds has addressed a set of critical security flaws impacting its Access Rights Manager (ARM) software that could be exploited to access sensitive information or execute arbitrary code.

Of the 13 vulnerabilities, eight are rated Critical in severity and carry a CVSS score of 9.6 out of 10.0. The remaining five weaknesses have been rated High in severity, with four of them having a CVSS score of 7.6 and one scoring 8.3.

The most severe of the flaws are listed below –

  • CVE-2024-23472 – SolarWinds ARM Directory Traversal Arbitrary File Deletion and Information Disclosure Vulnerability
  • CVE-2024-28074 – SolarWinds ARM Internal Deserialization Remote Code Execution Vulnerability
  • CVE-2024-23469 – Solarwinds ARM Exposed Dangerous Method Remote Code Execution Vulnerability
  • CVE-2024-23475 – Solarwinds ARM Traversal and Information Disclosure Vulnerability
  • CVE-2024-23467 – Solarwinds ARM Traversal Remote Code Execution Vulnerability
  • CVE-2024-23466 – Solarwinds ARM Directory Traversal Remote Code Execution Vulnerability
  • CVE-2024-23470 – Solarwinds ARM UserScriptHumster Exposed Dangerous Method Remote Command Execution Vulnerability
  • CVE-2024-23471 – Solarwinds ARM CreateFile Directory Traversal Remote Code Execution Vulnerability

Successful exploitation of the aforementioned vulnerabilities could allow an attacker to read and delete files and execute code with elevated privileges.

The shortcomings have been addressed in version 2024.3 released on July 17, 2024, following responsible disclosure as part of the Trend Micro Zero Day Initiative (ZDI).

The development comes after the U.S. Cybersecurity and Infrastructure Security Agency (CISA) added a high-severity path traversal flaw in SolarWinds Serv-U (CVE-2024-28995, CVSS score: 8.6) to its Known Exploited Vulnerabilities (KEV) catalog following reports of active exploitation in the wild.

The network security company was the victim of a major supply chain attack in 2020 after the update mechanism associated with its Orion network management platform was compromised by Russian APT29 hackers to distribute malicious code to downstream customers as part of a high-profile cyber espionage campaign.

The breach prompted the U.S. Securities and Exchange Commission (SEC) to file a lawsuit against SolarWinds and its chief information security officer (CISO) last October alleging the company failed to disclose adequate material information to investors regarding cybersecurity risks.

However, much of the claims pertaining to the lawsuit were thrown out by the U.S. District Court for the Southern District of New York (SDNY) on July 18, stating “these do not plausibly plead actionable deficiencies in the company’s reporting of the cybersecurity hack” and that they “impermissibly rely on hindsight and speculation.”

Posted in Vulnerability | Tagged: Cyber Attacks, Data Security, malware, vulnerability

SocGholish Malware Exploits BOINC Project for Covert Cyberattacks

Posted on August 4, 2024 - August 4, 2024 by Maq Verma

The JavaScript downloader malware known as SocGholish (aka FakeUpdates) is being used to deliver a remote access trojan called AsyncRAT as well as a legitimate open-source project called BOINC.

BOINC, short for Berkeley Open Infrastructure Network Computing Client, is an open-source “volunteer computing” platform maintained by the University of California with an aim to carry out “large-scale distributed high-throughput computing” using participating home computers on which the app is installed.

“It’s similar to a cryptocurrency miner in that way (using computer resources to do work), and it’s actually designed to reward users with a specific type of cryptocurrency called Gridcoin, designed for this purpose,” Huntress researchers Matt Anderson, Alden Schmidt, and Greg Linares said in a report published last week.

These malicious installations are designed to connect to an actor-controlled domain (“rosettahome[.]cn” or “rosettahome[.]top”), essentially acting as a command-and-control (C2) server to collect host data, transmit payloads, and push further commands. As of July 15, 10,032 clients are connected to the two domains.

The cybersecurity firm said while it hasn’t observed any follow-on activity or tasks being executed by the infected hosts, it hypothesized that the “host connections could be sold off as initial access vectors to be used by other actors and potentially used to execute ransomware.”

SocGholish attack sequences typically begin when users land on compromised websites, where they are prompted to download a fake browser update that, upon execution, triggers the retrieval of additional payloads to the infiltrated machines.

The JavaScript downloader, in this case, activates two disjointed chains, one that leads to the deployment of a fileless variant of AsyncRAT and the other resulting in the BOINC installation.

SocGholish Malware

The BOINC app, which is renamed as “SecurityHealthService.exe” or “trustedinstaller.exe” to evade detection, sets up persistence using a scheduled task by means of a PowerShell script.

The misuse of BOINC for malicious purposes hasn’t gone unnoticed by the project maintainers, who are currently investigating the problem and finding a way to “defeat this malware.” Evidence of the abuse dates back to at least June 26, 2024.

“The motivation and intent of the threat actor by loading this software onto infected hosts isn’t clear at this point,” the researchers said.

“Infected clients actively connecting to malicious BOINC servers present a fairly high risk, as there’s potential for a motivated threat actor to misuse this connection and execute any number of malicious commands or software on the host to further escalate privileges or move laterally through a network and compromise an entire domain.”

The development comes as Check Point said it’s been tracking the use of compiled V8 JavaScript by malware authors to sidestep static detections and conceal remote access trojans, stealers, loaders, cryptocurrency miners, wipers, and ransomware.

“In the ongoing battle between security experts and threat actors, malware developers keep coming up with new tricks to hide their attacks,” security researcher Moshe Marelus said. “It’s not surprising that they’ve started using V8, as this technology is commonly used to create software as it is very widespread and extremely hard to analyze.”

Posted in Vulnerability | Tagged: Cyber Attacks, Data Security, malware, vulnerability

Chinese Hackers Target Taiwan and U.S. NGO with MgBot and MACMA Malware

Posted on August 4, 2024 - August 4, 2024 by Maq Verma

Organizations in Taiwan and a U.S. non-governmental organization (NGO) based in China have been targeted by a Beijing-affiliated state-sponsored hacking group called Daggerfly using an upgraded set of malware tools.

The campaign is a sign that the group “also engages in internal espionage,” Symantec’s Threat Hunter Team, part of Broadcom, said in a new report published today. “In the attack on this organization, the attackers exploited a vulnerability in an Apache HTTP server to deliver their MgBot malware.”

Daggerfly, also known by the names Bronze Highland and Evasive Panda, was previously observed using the MgBot modular malware framework in connection with an intelligence-gathering mission aimed at telecom service providers in Africa. It's known to have been operational since 2012.

“Daggerfly appears to be capable of responding to exposure by quickly updating its toolset to continue its espionage activities with minimal disruption,” the company noted.

The latest set of attacks are characterized by the use of a new malware family based on MgBot as well as an improved version of a known Apple macOS malware called MACMA, which was first exposed by Google’s Threat Analysis Group (TAG) in November 2021 as distributed via watering hole attacks targeting internet users in Hong Kong by abusing security flaws in the Safari browser.

The development marks the first time the malware strain, which is capable of harvesting sensitive information and executing arbitrary commands, has been explicitly linked to a particular hacking group.

“The actors behind macOS.MACMA at least were reusing code from ELF/Android developers and possibly could have also been targeting Android phones with malware as well,” SentinelOne noted in a subsequent analysis at the time.

MACMA’s connections to Daggerfly also stem from source code overlaps between the malware and MgBot, and the fact that it connects to a command-and-control (C2) server (103.243.212[.]98) that has also been used by a MgBot dropper.

Another new malware in its arsenal is Nightdoor (aka NetMM and Suzafk), an implant that uses Google Drive API for C2 and has been utilized in watering hole attacks aimed at Tibetan users since at least September 2023. Details of the activity were first documented by ESET earlier this March.

“The group can create versions of its tools targeting most major operating system platform,” Symantec said, adding it has “seen evidence of the ability to trojanize Android APKs, SMS interception tools, DNS request interception tools, and even malware families targeting Solaris OS.”

The development comes as China’s National Computer Virus Emergency Response Center (CVERC) claimed Volt Typhoon – which has been attributed by the Five Eyes nations as a China-nexus espionage group – to be an invention of the U.S. intelligence agencies, describing it as a misinformation campaign.

“Although its main targets are U.S. congress and American people, it also attempt[s] to defame China, sow discords [sic] between China and other countries, contain China’s development, and rob Chinese companies,” the CVERC asserted in a recent report.

Posted in Vulnerability | Tagged: Cyber Attacks, Data Security, malware, vulnerability

CISA Adds Twilio Authy and IE Flaws to Exploited Vulnerabilities List

Posted on August 4, 2024 - August 4, 2024 by Maq Verma

The U.S. Cybersecurity and Infrastructure Security Agency (CISA) has added two security flaws to its Known Exploited Vulnerabilities (KEV) catalog, based on evidence of active exploitation.

The vulnerabilities are listed below –

  • CVE-2012-4792 (CVSS score: 9.3) – Microsoft Internet Explorer Use-After-Free Vulnerability
  • CVE-2024-39891 (CVSS score: 5.3) – Twilio Authy Information Disclosure Vulnerability

CVE-2012-4792 is a decade-old use-after-free vulnerability in Internet Explorer that could allow a remote attacker to execute arbitrary code via a specially crafted site.

It’s currently not clear if the flaw has been subjected to renewed exploitation attempts, although it was abused as part of watering hole attacks targeting the Council on Foreign Relations (CFR) and Capstone Turbine Corporation websites back in December 2012.

On the other hand, CVE-2024-39891 refers to an information disclosure bug in an unauthenticated endpoint that could be exploited to “accept a request containing a phone number and respond with information about whether the phone number was registered with Authy.”

Earlier this month, Twilio said it resolved the issue in versions 25.1.0 (Android) and 26.1.0 (iOS) after unidentified threat actors took advantage of the shortcoming to identify data associated with Authy accounts.

“These types of vulnerabilities are frequent attack vectors for malicious cyber actors and pose significant risks to the federal enterprise,” CISA said in an advisory.

Federal Civilian Executive Branch (FCEB) agencies are required to remediate the identified vulnerabilities by August 13, 2024, to protect their networks against active threats.

Posted in Vulnerability | Tagged: Cyber Attacks, Data Security, vulnerability

Microsoft Defender Flaw Exploited to Deliver ACR, Lumma, and Meduza Stealers

Posted on August 4, 2024 - August 4, 2024 by Maq Verma

A now-patched security flaw in the Microsoft Defender SmartScreen has been exploited as part of a new campaign designed to deliver information stealers such as ACR Stealer, Lumma, and Meduza.

Fortinet FortiGuard Labs said it detected the stealer campaign targeting Spain, Thailand, and the U.S. using booby-trapped files that exploit CVE-2024-21412 (CVSS score: 8.1).

The high-severity vulnerability allows an attacker to sidestep SmartScreen protection and drop malicious payloads. Microsoft addressed this issue as part of its monthly security updates released in February 2024.

“Initially, attackers lure victims into clicking a crafted link to a URL file designed to download an LNK file,” security researcher Cara Lin said. “The LNK file then downloads an executable file containing an [HTML Application] script.”

The HTA file serves as a conduit to decode and decrypt PowerShell code responsible for fetching a decoy PDF file and a shellcode injector that, in turn, either leads to the deployment of Meduza Stealer or Hijack Loader, which subsequently launches ACR Stealer or Lumma.

ACR Stealer, assessed to be an evolved version of the GrMsk Stealer, was advertised in late March 2024 by a threat actor named SheldIO on the Russian-language underground forum RAMP.

“This ACR stealer hides its [command-and-control] with a dead drop resolver (DDR) technique on the Steam community website,” Lin said, calling out its ability to siphon information from web browsers, crypto wallets, messaging apps, FTP clients, email clients, VPN services, and password managers.

ACR, Lumma, and Meduza Stealers

It’s worth noting that recent Lumma Stealer attacks have also been observed utilizing the same technique, making it easier for the adversaries to change the C2 domains at any time and render the infrastructure more resilient, according to the AhnLab Security Intelligence Center (ASEC).

The disclosure comes as CrowdStrike has revealed that threat actors are leveraging last week’s outage to distribute a previously undocumented information stealer called Daolpu, making it the latest example of the ongoing fallout stemming from the faulty update that has crippled millions of Windows devices.

The attack involves the use of a macro-laced Microsoft Word document that masquerades as a Microsoft recovery manual listing legitimate instructions issued by the Windows maker to resolve the issue, leveraging it as a decoy to activate the infection process.

The DOCM file, when opened, runs the macro to retrieve a second-stage DLL file from a remote server that’s decoded to launch Daolpu, a stealer malware equipped to harvest credentials and cookies from Google Chrome, Microsoft Edge, Mozilla Firefox, and other Chromium-based browsers.

It also follows the emergence of new stealer malware families such as Braodo and DeerStealer, even as cyber criminals are exploiting malvertising techniques promoting legitimate software such as Microsoft Teams to deploy Atomic Stealer.

“As cyber criminals ramp up their distribution campaigns, it becomes more dangerous to download applications via search engines,” Malwarebytes researcher Jérôme Segura said. “Users have to navigate between malvertising (sponsored results) and SEO poisoning (compromised websites).”

Posted in Vulnerability | Tagged: Cyber Attacks, Data Security, vulnerability

CISA Warns of Exploitable Vulnerabilities in Popular BIND 9 DNS Software

Posted on August 4, 2024 - August 4, 2024 by Maq Verma

The Internet Systems Consortium (ISC) has released patches to address multiple security vulnerabilities in the Berkeley Internet Name Domain (BIND) 9 Domain Name System (DNS) software suite that could be exploited to trigger a denial-of-service (DoS) condition.

“A cyber threat actor could exploit one of these vulnerabilities to cause a denial-of-service condition,” the U.S. Cybersecurity and Infrastructure Security Agency (CISA) said in an advisory.

The four vulnerabilities are listed below –

  • CVE-2024-4076 (CVSS score: 7.5) – Due to a logic error, lookups that triggered serving stale data and required lookups in local authoritative zone data could have resulted in an assertion failure.
  • CVE-2024-1975 (CVSS score: 7.5) – Validating DNS messages signed using the SIG(0) protocol could cause excessive CPU load, leading to a denial-of-service condition.
  • CVE-2024-1737 (CVSS score: 7.5) – It is possible to craft excessively large numbers of resource record types for a given owner name, which has the effect of slowing down database processing.
  • CVE-2024-0760 (CVSS score: 7.5) – A malicious DNS client that sent many queries over TCP but never read the responses could cause a server to respond slowly or not at all for other clients.

Successful exploitation of the aforementioned bugs could cause a named instance to terminate unexpectedly, deplete available CPU resources, slow down query processing by a factor of 100, and render the server unresponsive.

The flaws have been addressed in BIND 9 versions 9.18.28, 9.20.0, and 9.18.28-S1 released earlier this month. There is no evidence that any of the shortcomings have been exploited in the wild.

The disclosure comes months after the ISC addressed another flaw in BIND 9 called KeyTrap (CVE-2023-50387, CVSS score: 7.5) that could be abused to exhaust CPU resources and stall DNS resolvers, resulting in a denial-of-service (DoS).

Posted in Vulnerability | Tagged: Cyber Attacks, Data Security, vulnerability

Critical Docker Engine Flaw Allows Attackers to Bypass Authorization Plugins

Posted on August 4, 2024 - August 4, 2024 by Maq Verma

Docker is warning of a critical flaw impacting certain versions of Docker Engine that could allow an attacker to sidestep authorization plugins (AuthZ) under specific circumstances.

Tracked as CVE-2024-41110, the bypass and privilege escalation vulnerability carries a CVSS score of 10.0, indicating maximum severity.

“An attacker could exploit a bypass using an API request with Content-Length set to 0, causing the Docker daemon to forward the request without the body to the AuthZ plugin, which might approve the request incorrectly,” the Moby Project maintainers said in an advisory.

Docker said the issue is a regression: it was originally discovered in 2018 and addressed in Docker Engine v18.09.1 in January 2019, but the fix was never carried over to subsequent versions (19.03 and later).

The issue has been resolved in versions 23.0.14 and 27.1.0 as of July 23, 2024, after the problem was identified in April 2024. The following versions of Docker Engine are impacted assuming AuthZ is used to make access control decisions –

  • <= v19.03.15
  • <= v20.10.27
  • <= v23.0.14
  • <= v24.0.9
  • <= v25.0.5
  • <= v26.0.2
  • <= v26.1.4
  • <= v27.0.3, and
  • <= v27.1.0

“Users of Docker Engine v19.03.x and later versions who do not rely on authorization plugins to make access control decisions and users of all versions of Mirantis Container Runtime are not vulnerable,” Docker’s Gabriela Georgieva said.

“Users of Docker commercial products and internal infrastructure who do not rely on AuthZ plugins are unaffected.”

It also affects Docker Desktop up to version 4.32.0, although the company said the likelihood of exploitation is limited and it requires access to the Docker API, necessitating that an attacker already has local access to the host. A fix is expected to be included in a forthcoming release (version 4.33).

“Default Docker Desktop configuration does not include AuthZ plugins,” Georgieva noted. “Privilege escalation is limited to the Docker Desktop [virtual machine], not the underlying host.”

Although Docker makes no mention of CVE-2024-41110 being exploited in the wild, it’s essential that users update their installations to the latest version to mitigate potential threats.

Earlier this year, Docker moved to patch a set of flaws dubbed Leaky Vessels that could enable an attacker to gain unauthorized access to the host filesystem and break out of the container.

“As cloud services rise in popularity, so does the use of containers, which have become an integrated part of cloud infrastructure,” Palo Alto Networks Unit 42 said in a report published last week. “Although containers provide many advantages, they are also susceptible to attack techniques like container escapes.”

“Sharing the same kernel and often lacking complete isolation from the host’s user-mode, containers are susceptible to various techniques employed by attackers seeking to escape the confines of a container environment.”

Posted in Vulnerability | Tagged: Cyber Attacks, Data Security, vulnerability

Researchers Reveal ConfusedFunction Vulnerability in Google Cloud Platform

Posted on August 4, 2024 - August 4, 2024 by Maq Verma

Cybersecurity researchers have disclosed a privilege escalation vulnerability impacting Google Cloud Platform’s Cloud Functions service that an attacker could exploit to access other services and sensitive data in an unauthorized manner.

Tenable has given the vulnerability the name ConfusedFunction.

“An attacker could escalate their privileges to the Default Cloud Build Service Account and access numerous services such as Cloud Build, storage (including the source code of other functions), artifact registry and container registry,” the exposure management company said in a statement.

“This access allows for lateral movement and privilege escalation in a victim’s project, to access unauthorized data and even update or delete it.”

Cloud Functions refers to a serverless execution environment that allows developers to create single-purpose functions that are triggered in response to specific Cloud events without the need to manage a server or update frameworks.

The problem discovered by Tenable has to do with the fact that a Cloud Build service account is created in the background and linked to a Cloud Build instance by default when a Cloud Function is created or updated.

This service account opens the door for potential malicious activity owing to its excessive permissions, thereby permitting an attacker with access to create or update a Cloud Function to leverage this loophole and escalate their privileges to the service account.

This access could then be abused to reach other Google Cloud services that are also created in tandem with the Cloud Function, including Cloud Storage, Artifact Registry, and Container Registry. In a hypothetical attack scenario, ConfusedFunction could be exploited to leak the Cloud Build service account token via a webhook.

ConfusedFunction Vulnerability

Following responsible disclosure, Google has updated the default behavior such that Cloud Build uses the Compute Engine default service account to prevent misuse. However, it’s worth noting that these changes do not apply to existing instances.
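For existing projects, one place to start auditing is the set of roles held by the default Cloud Build service account, whose email follows a fixed naming pattern. A minimal sketch (the `cloudbuild_sa` helper is our own name, and the `gcloud` invocation is shown for illustration only):

```shell
# Sketch: derive the default Cloud Build service account email from a
# project number (pattern: PROJECT_NUMBER@cloudbuild.gserviceaccount.com).
cloudbuild_sa() {
  printf '%s@cloudbuild.gserviceaccount.com\n' "$1"
}

cloudbuild_sa 123456789   # hypothetical project number

# With access to the project, you could then list that account's roles, e.g.:
#   gcloud projects get-iam-policy PROJECT_ID \
#     --flatten="bindings[].members" \
#     --filter="bindings.members:serviceAccount:123456789@cloudbuild.gserviceaccount.com" \
#     --format="table(bindings.role)"
```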

“The ConfusedFunction vulnerability highlights the problematic scenarios that may arise due to software complexity and inter-service communication in a cloud provider’s services,” Tenable researcher Liv Matan said.

“While the GCP fix has reduced the severity of the problem for future deployments, it didn’t completely eliminate it. That’s because the deployment of a Cloud Function still triggers the creation of the aforementioned GCP services. As a result, users must still assign minimum but still relatively broad permissions to the Cloud Build service account as part of a function’s deployment.”

The development comes as Outpost24 detailed a medium-severity cross-site scripting (XSS) flaw in the Oracle Integration Cloud Platform that could be weaponized to inject malicious code into the application.

The flaw, which is rooted in the handling of the “consumer_url” parameter, was resolved by Oracle in its Critical Patch Update (CPU) released earlier this month.

“The page for creating a new integration, found at https://<instanceid>.integration.ocp.oraclecloud.com/ic/integration/home/faces/link?page=integration&consumer_url=<payload>, did not require any other parameters,” security researcher Filip Nyquist said.

ConfusedFunction Vulnerability

“This meant that an attacker would only need to identify the instance-id of the specific integration platform to send a functional payload to any user of the platform. Consequently, the attacker could bypass the requirement of knowing a specific integration ID, which is typically accessible only to logged-in users.”

It also follows Assetnote’s discovery of three security vulnerabilities in the ServiceNow cloud computing platform (CVE-2024-4879, CVE-2024-5178, and CVE-2024-5217) that could be fashioned into an exploit chain in order to gain full database access and execute arbitrary code within the context of the Now Platform.

The ServiceNow shortcomings have since come under active exploitation by unknown threat actors as part of a “global reconnaissance campaign” designed to gather database details, such as user lists and account credentials, from susceptible instances.

The activity, targeting companies in various industry verticals such as energy, data centers, software development, and government entities in the Middle East, could be leveraged for “cyber espionage and further targeting,” Resecurity said.

ServiceNow, in a statement shared with The Hacker News, said it has “not observed evidence that the activity […] is related to instances that ServiceNow hosts.

“We have encouraged our self-hosted and ServiceNow-hosted customers to apply relevant patches if they have not already done so. We will also continue to work directly with customers who need assistance in applying those patches. It is important to note that these are not new vulnerabilities, but rather were previously addressed and disclosed in CVE-2024-4879, CVE-2024-5217, and CVE-2024-5178.”

(The story was updated after publication to include details about active exploitation of ServiceNow flaws.)

Posted in Vulnerability. Tagged: Cyber Attacks, Data Security, Vulnerability.

SAP AI Core Vulnerabilities Expose Customer Data to Cyber Attacks

Posted on August 4, 2024 by Maq Verma

Cybersecurity researchers have uncovered security shortcomings in SAP AI Core, SAP’s cloud-based platform for creating and deploying predictive artificial intelligence (AI) workflows, that could be exploited to get hold of access tokens and customer data.

The five vulnerabilities have been collectively dubbed SAPwned by cloud security firm Wiz.

“The vulnerabilities we found could have allowed attackers to access customers’ data and contaminate internal artifacts – spreading to related services and other customers’ environments,” security researcher Hillai Ben-Sasson said in a report shared with The Hacker News.

Following responsible disclosure on January 25, 2024, the weaknesses were addressed by SAP as of May 15, 2024.

In a nutshell, the flaws make it possible to obtain unauthorized access to customers’ private artifacts and credentials to cloud environments like Amazon Web Services (AWS), Microsoft Azure, and SAP HANA Cloud.

They could also be used to modify Docker images on SAP’s internal container registry, SAP’s Docker images on the Google Container Registry, and artifacts hosted on SAP’s internal Artifactory server, resulting in a supply chain attack on SAP AI Core services.

Furthermore, the access could be weaponized to gain cluster administrator privileges on SAP AI Core’s Kubernetes cluster by taking advantage of the fact that the Helm package manager server was exposed to both read and write operations.

“Using this access level, an attacker could directly access other customer’s Pods and steal sensitive data, such as models, datasets, and code,” Ben-Sasson explained. “This access also allows attackers to interfere with customer’s Pods, taint AI data and manipulate models’ inference.”

Wiz said the issues arise due to the platform making it feasible to run malicious AI models and training procedures without adequate isolation and sandboxing mechanisms.

“The recent security flaws in AI service providers like Hugging Face, Replicate, and SAP AI Core highlight significant vulnerabilities in their tenant isolation and segmentation implementations,” Ben-Sasson told The Hacker News. “These platforms allow users to run untrusted AI models and training procedures in shared environments, increasing the risk of malicious users being able to access other users’ data.”

“Unlike veteran cloud providers who have vast experience with tenant-isolation practices and use robust isolation techniques like virtual machines, these newer services often lack this knowledge and rely on containerization, which offers weaker security. This underscores the need to raise awareness of the importance of tenant isolation and to push the AI service industry to harden their environments.”

As a result, a threat actor could create a regular AI application on SAP AI Core, bypass network restrictions, and probe the Kubernetes Pod’s internal network to obtain AWS tokens and access customer code and training datasets by exploiting misconfigurations in AWS Elastic File System (EFS) shares.

“People should be aware that AI models are essentially code. When running AI models on your own infrastructure, you could be exposed to potential supply chain attacks,” Ben-Sasson said.

“Only run trusted models from trusted sources, and properly separate between external models and sensitive infrastructure. When using AI services providers, it’s important to verify their tenant-isolation architecture and ensure they apply best practices.”

The findings come as Netskope revealed that the growing enterprise use of generative AI has prompted organizations to use blocking controls, data loss prevention (DLP) tools, real-time coaching, and other mechanisms to mitigate risk.

“Regulated data (data that organizations have a legal duty to protect) makes up more than a third of the sensitive data being shared with generative AI (genAI) applications — presenting a potential risk to businesses of costly data breaches,” the company said.

They also follow the emergence of a new cybercriminal threat group called NullBulge that has trained its sights on AI- and gaming-focused entities since April 2024 with an aim to steal sensitive data and sell compromised OpenAI API keys in underground forums while claiming to be a hacktivist crew “protecting artists around the world” against AI.

“NullBulge targets the software supply chain by weaponizing code in publicly available repositories on GitHub and Hugging Face, leading victims to import malicious libraries, or through mod packs used by gaming and modeling software,” SentinelOne security researcher Jim Walter said.

“The group uses tools like AsyncRAT and XWorm before delivering LockBit payloads built using the leaked LockBit Black builder. Groups like NullBulge represent the ongoing threat of low-barrier-of-entry ransomware, combined with the evergreen effect of info-stealer infections.”

Posted in Vulnerability. Tagged: Cyber Attacks, Data Security, Vulnerability.

Ongoing Cyberattack Targets Exposed Selenium Grid Services for Crypto Mining

Posted on August 4, 2024 by Maq Verma

Cybersecurity researchers are sounding the alarm over an ongoing campaign that’s leveraging internet-exposed Selenium Grid services for illicit cryptocurrency mining.

Cloud security firm Wiz is tracking the activity under the name SeleniumGreed. The campaign, which targets older versions of Selenium (3.141.59 and prior), is believed to have been underway since at least April 2023.

“Unbeknownst to most users, Selenium WebDriver API enables full interaction with the machine itself, including reading and downloading files, and running remote commands,” Wiz researchers Avigayil Mechtinger, Gili Tikochinski, and Dor Laska said.

“By default, authentication is not enabled for this service. This means that many publicly accessible instances are misconfigured and can be accessed by anyone and abused for malicious purposes.”
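One low-impact way to see whether a hub of yours answers anonymously is to query its /status endpoint, which reports `"ready": true` when the hub is up. A hedged sketch follows; the `grid_exposed` helper and the sample body are our own, and in practice you would fetch the body with curl from your own host:

```shell
# Sketch: decide from a /status JSON body whether a Grid hub answered
# without credentials. grid_exposed is our own illustrative helper.
grid_exposed() {
  printf '%s' "$1" | grep -Eq '"ready"[[:space:]]*:[[:space:]]*true'
}

# In practice, fetch the body from your own hub, e.g.:
#   body=$(curl -s "http://YOUR_GRID_HOST:4444/status")
sample='{"value":{"ready":true,"message":"Selenium Grid ready."}}'
if grid_exposed "$sample"; then
  echo "hub responded without credentials - restrict network access"
fi
```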

Selenium Grid, part of the Selenium automated testing framework, enables parallel execution of tests across multiple workloads, different browsers, and various browser versions.

Selenium Grid Services

“Selenium Grid must be protected from external access using appropriate firewall permissions,” the project maintainers warn in the support documentation, stating that failing to do so could allow third parties to run arbitrary binaries and access internal web applications and files.

Exactly who is behind the attack campaign is currently not known. However, it involves the threat actor targeting publicly exposed instances of Selenium Grid and making use of the WebDriver API to run Python code responsible for downloading and running an XMRig miner.

https://youtube.com/watch?v=Pn7_MkAToe4

It starts with the adversary sending a request to the vulnerable Selenium Grid hub with an aim to execute a Python program containing a Base64-encoded payload that spawns a reverse shell to an attacker-controlled server (“164.90.149[.]104”) in order to fetch the final payload, a modified version of the open-source XMRig miner.

“Instead of hardcoding the pool IP in the miner configuration, they dynamically generate it at runtime,” the researchers explained. “They also set XMRig’s TLS-fingerprint feature within the added code (and within the configuration), ensuring the miner will only communicate with servers controlled by the threat actor.”

The IP address in question is said to belong to a legitimate service that has been compromised by the threat actor, as it has also been found to host a publicly exposed Selenium Grid instance.

Wiz said it’s possible to execute remote commands on newer versions of Selenium and that it identified more than 30,000 instances exposed to remote command execution, making it imperative that users take steps to close the misconfiguration.

“Selenium Grid is not designed to be exposed to the internet and its default configuration has no authentication enabled, so any user that has network access to the hub can interact with the nodes via API,” the researchers said.

“This poses a significant security risk if the service is deployed on a machine with a public IP that has inadequate firewall policy.”

Update

Selenium, in an advisory released on July 31, 2024, urged users to upgrade their instances to the latest version to mitigate against the threat.

“Selenium Grid by default doesn’t have any authentication as the assumption has always been that we want you to put this behind a secure network to prevent people from abusing your resources,” it said. “Another way to combat this is to use a cloud provider to run your Selenium Grid.”

Posted in Vulnerability. Tagged: Cyber Attacks, Data Security, Vulnerability.

Linux: Mount Remote Directories With SSHFS

Posted on August 4, 2024 by Maq Verma

The Secure Shell (SSH) isn’t just about allowing you to remote into servers to tackle admin tasks. Thanks to this secure networking protocol, you can also mount remote directories with the help of the SSH File System (SSHFS).

SSHFS uses SFTP (SSH File Transfer Protocol) to mount remote directories to a local machine over an encrypted connection, which makes it far more secure than standard FTP. In addition, once a remote directory is mounted, it can be used as if it were on the local machine.

Consider SSHFS a more secure way of creating network shares. The only difference is that SSHFS must be installed on every machine that connects to the share (whereas with Samba, it only has to be installed on the machine hosting the share).

Let’s walk through the process of getting SSHFS up and running, so you can securely mount remote directories to your local machine.

What You’ll Need

To make this work, you’ll need at least two Linux machines. They can run Ubuntu, Fedora, or nearly any other distribution, because SSHFS is found in the standard repositories for most of them. You’ll also need a user with sudo privileges.

Installing SSHFS

Since SSHFS is found in the standard repositories, the installation is quite simple. Log into the server (which will house the directory to share) and install SSHFS with one of the following commands:

  • Ubuntu-based distributions – sudo apt-get install sshfs -y
  • Fedora-based distributions – sudo dnf install fuse-sshfs -y
  • Arch-based distributions – sudo pacman -S sshfs
  • openSUSE-based distributions – sudo zypper -n in sshfs

Next, log into your local machine and install the package as well.
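If you manage a mix of distributions, the per-distro commands above can be dispatched from the `ID`/`ID_LIKE` fields of /etc/os-release. A minimal sketch, where the `sshfs_install_cmd` helper is our own name:

```shell
# Sketch: map a distro family (from /etc/os-release) to the matching
# install command listed above. sshfs_install_cmd is our own helper.
sshfs_install_cmd() {
  case " $1 " in
    *debian*|*ubuntu*)        echo "sudo apt-get install sshfs -y" ;;
    *fedora*|*rhel*|*centos*) echo "sudo dnf install fuse-sshfs -y" ;;
    *arch*)                   echo "sudo pacman -S sshfs" ;;
    *suse*)                   echo "sudo zypper -n in sshfs" ;;
    *)                        echo "unknown distribution" ;;
  esac
}

# Example: read the real values if /etc/os-release is present
if [ -r /etc/os-release ]; then
  . /etc/os-release
  sshfs_install_cmd "$ID $ID_LIKE"
fi
```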

Once installed, you’ll need to set user_allow_other in the SSHFS config file on the local machine. For that, open the file with:

sudo nano /etc/fuse.conf


In that file, locate the line:

#user_allow_other


Change that to:

user_allow_other


Save and close the file.

Creating a Directory for Mounting

Back on the server, we must create a directory that will be mounted on the client machines. We’ll place our new directory in /srv with the command:

sudo mkdir /srv/data


With the new directory created, we need to give it ownership, such that either a user or group can access it. If you only have one user who needs to access it, you can change the ownership with the command:

sudo chown -R USERNAME:USERNAME /srv/data


If you want to allow more than one user to access the directory, you’d need to first create a group with the command:

sudo groupadd GROUP


Where GROUP is the name of the new group.

Next, add the necessary users to the group (one at a time) with the command:

sudo usermod -aG GROUP USERNAME


Where GROUP is the name of the group and USERNAME is the name of the user to be added.

You would then need to change the ownership of the new directory to the new group with:

sudo chown -R USERNAME:GROUP /srv/data


On the local machine, you’ll have to create a directory that will house the mounted remote directory. We’ll create this in a user’s home directory with:

mkdir ~/data_mount

Mount the Directory

It’s now time to mount our remote directory. Remember, we’re mounting the remote directory /srv/data to the local directory ~/data_mount. This is done with the command:

sshfs USER@SERVER:/srv/data ~/data_mount


Where USER is the remote username and SERVER is the IP address of the remote server. You’ll be prompted for the remote user’s password. On successful authentication, the remote directory will be mounted to the local directory and you can access it as if it were native to the local machine. If you save or edit a file in ~/data_mount, it will be reflected in /srv/data on the remote machine.
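To confirm the mount took effect, and to detach it cleanly when you’re done, `mountpoint` and `fusermount` can be used. A small sketch, assuming the ~/data_mount directory from above (on some newer distributions the unmount tool is named fusermount3):

```shell
# Sketch: check whether ~/data_mount is currently a mount point, and
# unmount it cleanly if so.
MOUNTPOINT="$HOME/data_mount"
if mountpoint -q "$MOUNTPOINT"; then
  echo "mounted"
  fusermount -u "$MOUNTPOINT"
else
  echo "not mounted"
fi
```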

This method of mounting is temporary. Let’s make it permanent.

Permanently Mount the Remote Drive

To permanently mount the SSHFS drive, you have to jump through a few hoops before it’ll work. First, you must create an SSH key pair (on the local machine) with the command:

ssh-keygen -t rsa


Make sure to give the key a strong, unique passphrase. (Note that a key with a passphrase can’t be used for unattended mounts from /etc/fstab unless an SSH agent supplies it; if you want the permanent mount below to work without a prompt, leave the passphrase empty and protect the private key file itself.)

Once the key is generated, copy it to the server with the command:

ssh-copy-id USER@SERVER


Where USER is the remote user name and SERVER is the IP address of the remote server.

Let’s test the connection to ensure it’s working properly. From the local machine, SSH to the server with:

ssh USER@SERVER


Where USER is the remote username and SERVER is the IP address of the remote server. You should be prompted for the SSH key passphrase and not your user password. Once you’ve successfully authenticated, exit from the connection with the exit command.

To make this mount permanent, you need to modify the /etc/fstab file on the local machine. Open that file for editing with:

sudo nano /etc/fstab


At the bottom of the file, paste the following line:

USER1@SERVER:/srv/data /home/USER1/data_mount fuse.sshfs x-systemd.automount,_netdev,user,idmap=user,transform_symlinks,identityfile=/home/USER2/.ssh/id_rsa,allow_other,default_permissions,uid=USER_ID_N,gid=USER_GID_N 0 0


Where USER1 is the remote username, SERVER is the IP address of the server, USER2 is the username on the local machine, and USER_ID_N and USER_GID_N are the local user and group IDs, which are unique to the local machine. You can locate the IDs with the command:

id


You should see entries like this:

uid=1000(jack) gid=1000(jack)


In the above example, the user ID is 1000 and the group ID is also 1000.
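The placeholders in that fstab line can also be filled in programmatically. A small sketch that assembles the entry from your own values; the variable names are ours, and the remote user and server values are hypothetical:

```shell
# Sketch: assemble the fstab entry from local id(1) output plus the
# remote details. REMOTE_USER and SERVER are hypothetical placeholders.
REMOTE_USER="jack"
SERVER="192.168.1.50"
LOCAL_USER="$(id -un)"
UID_N="$(id -u)"
GID_N="$(id -g)"

FSTAB_LINE="${REMOTE_USER}@${SERVER}:/srv/data /home/${LOCAL_USER}/data_mount fuse.sshfs x-systemd.automount,_netdev,user,idmap=user,transform_symlinks,identityfile=/home/${LOCAL_USER}/.ssh/id_rsa,allow_other,default_permissions,uid=${UID_N},gid=${GID_N} 0 0"
echo "$FSTAB_LINE"
```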

Save the file and test the mount with:

mount -a


If you receive no errors, all is well.

There is one caveat to this. During the boot process, the mount will fail because it will be attempted before networking is brought up. Because of this, after a reboot on the local machine, you’ll have to open a terminal window and mount the SSHFS directory with the command:

mount -a


Once you’ve done that, you’re ready to use the remote directory as if it were local.

Posted in Cyber Attacks. Tagged: Cyber Attacks, Data Security, malware, Reverse Engineering.

Are Mobile Devices Less Secure than PCs?

Posted on August 4, 2024 by Maq Verma

Are smartphones less secure than PCs? The answer to that is, they’re different. They face different security threats. Yet they certainly share one thing in common — they both need protection.

So, what makes a smartphone unique when it comes to security? And how do you go about protecting it? We’ll cover both here.

Apps, spam texts, and other smartphone vulnerabilities

Several facts of life about smartphones set them apart when it comes to keeping your devices safer. A quick rundown looks like this:

First off, people keep lots of apps on their phones. Old ones, new ones, ones they practically forgot they had. The security issue that comes into play there is that any app on a phone is subject to vulnerabilities.

A vulnerability in just one of the dozens of apps on a phone can lead to problems. The adage of “the weakest link” applies here. The phone is only as secure as its least secure app. And that goes for the phone’s operating system as well.

Additionally, app permissions can also introduce risks. Apps often request access to different parts of your phone to work — such as when a messenger app asks for access to contacts and photos. In the case of malicious apps, they’ll ask for far more permissions than they need. A classic example involves the old “flashlight apps” that invasively asked for a wide swath of permissions. That gave the hackers all kinds of info on users, including things like location info. Today, the practice of malicious, permission-thirsty apps continues with wallpaper apps, utility apps, games, and more.

As for other malicious apps, sometimes people download them without knowing. This often happens when shopping in third-party app stores, yet it can happen in legit app stores as well — despite rigorous review processes from Apple and Google. Sometimes, hackers sneak them through the review process for approval. These apps might include spyware, ransomware, and other forms of malware.

Many people put their smartphones to personal and professional use. That might mean the phone has access to corporate apps, networks, and data. If the phone gets compromised, those corporate assets might get compromised too. And it can work in the other direction. A corporate compromise might affect an employee’s smartphone.

More and more, our phones are our wallets. Digital wallets and payment apps have certainly gained popularity. They speed up checkout and make splitting meals with friends easy. That makes the prospect of a lost or stolen phone all the more serious. An unsecured phone in the hands of another is like forking over your wallet.

Lastly, spam texts. Unique to phones are the sketchy links that crop up in texting and messaging apps. These often lead to scam sites and other sites that spread malware.

With a good sense of what makes securing your smartphone unique, let’s look at several steps you can take to protect it.

How to protect your smartphone

  1. Update your phone’s apps and operating system

Keeping your phone’s apps and operating system up to date can greatly improve your security. Updates can fix vulnerabilities that hackers rely on to pull off their malware-based attacks. It’s another tried-and-true method of keeping yourself safer — and of keeping your phone running great too.

  2. Lock your phone

With all that you keep and conduct on your phone, a lock is a must. Whether you have a PIN, passcode, or facial recognition available, put it into play. The same goes for things like your payment, banking, and financial apps. Ensure you have them locked too.

  3. Avoid third-party app stores

As mentioned above, app stores have measures in place to review and vet apps that help ensure they’re safe and secure. Third-party sites might very well not, and they might intentionally host malicious apps as part of a front. Further, legitimate app stores are quick to remove malicious apps from their stores once discovered, making shopping there safer still.

  4. Review apps carefully

Check out the developer — have they published several other apps with many downloads and good reviews? A legit app typically has many reviews. In contrast, malicious apps might have only a handful of (phony) five-star reviews. Lastly, look for typos and poor grammar in both the app description and screenshots. They could be a sign that a hacker slapped the app together and quickly deployed it.

  5. Go with a strong recommendation

Yet better than combing through user reviews yourself is getting a recommendation from a trusted source, like a well-known publication or app store editors themselves. In this case, much of the vetting work has been done for you by an established reviewer. A quick online search like “best fitness apps” or “best apps for travelers” should turn up articles from legitimate sites that can suggest good options and describe them in detail before you download.

  6. Keep an eye on app permissions

Another way hackers weasel their way into your device is by getting permissions to access things like your location, contacts, and photos — and they’ll use malicious apps to do it. If an app asks for way more than you bargained for, like a simple puzzle game that asks for access to your camera or microphone, it might be a scam. Delete the app.

  7. Learn how to remotely lock or erase your smartphone

So what happens if your phone ends up getting lost or stolen? A combination of device tracking, device locking, and remote erasing can help protect your phone and the data on it. Different device manufacturers have different ways of going about it, but the result is the same — you can prevent others from using your phone. You can even erase it if you’re truly worried that it’s gone for good. Apple provides iOS users with a step-by-step guide, and Google offers a guide for Android users as well.

  8. Protect your phone and block sketchy links

Comprehensive online protection software can secure your phone in the same ways that it secures your laptops and computers. Installing it can protect your privacy, and keep you safe from attacks on public Wi-Fi, just to name a few things it can do. Ours also includes Text Scam Detector that blocks sketchy links in texts, messages, and email before they do you any harm. And if you tap that link by mistake, Text Scam Detector still blocks it.

Posted in Vulnerability. Tagged: Cyber Attacks, Data Security, malware, Spyware.

Microsoft Says Azure Outage Caused by DDoS Attack Response

Posted on August 4, 2024 by Maq Verma

Microsoft’s response to a distributed denial-of-service (DDoS) attack appears to have caused Azure service outages that impacted many customers.

Microsoft explained on its Azure status page that a “subset of customers” experienced issues connecting to services such as Azure App Services, Application Insights, Azure IoT Central, Azure Log Search Alerts, and Azure Policy, as well as the Azure portal and some Microsoft 365 and Purview services.

According to the BBC, the outage, which lasted roughly 10 hours, impacted water utilities, courts, banks, and other types of organizations. 

Microsoft said it initially saw an unexpected usage spike that resulted in Azure Front Door and Azure Content Delivery Network components “performing below acceptable thresholds”, which led to errors, timeouts and latency issues. 

An investigation showed that a DDoS attack launched against its systems triggered protection mechanisms, but an implementation bug in those defenses caused the attack’s impact to be amplified rather than mitigated. 

The tech giant has promised to publish a preliminary incident review within 72 hours and a more detailed review within two weeks. 

It’s unclear who is behind the DDoS attack on Microsoft services, but it would not be surprising if multiple hacktivist groups take credit for it in an effort to boost their reputation. 

The incident comes just days after millions of computers worldwide were disrupted by a bad update rolled out by cybersecurity firm CrowdStrike. 

A vast majority of devices impacted by the CrowdStrike incident were restored within one week, but insurers predict billions in losses for the security firm’s major customers. CrowdStrike is also facing lawsuits over the incident.  

Posted in Cyber Attacks. Tagged: Cyber Attacks, Data Security, DDOS, malware.

Cost of Data Breach in 2024: $4.88 Million, Says Latest IBM Study

Posted on August 4, 2024 by Maq Verma

The bald figure of $4.88 million tells us little about the state of security. But the detail contained within the latest IBM Cost of Data Breach Report highlights areas we are winning, areas we are losing, and the areas we could and should do better.

“The real benefit to industry,” explains Sam Hector, IBM’s cybersecurity global strategy leader, “is that we’ve been doing this consistently over many years. It allows the industry to build up a picture over time of the changes that are happening in the threat landscape and the most effective ways to prepare for the inevitable breach.”

IBM goes to considerable lengths to ensure the statistical accuracy of its report (PDF). More than 600 companies were queried across 17 industry sectors in 16 countries. The individual companies change year on year, but the size of the survey remains consistent (the major change this year is that ‘Scandinavia’ was dropped and ‘Benelux’ added). The details help us understand where security is winning, and where it is losing. Overall, this year’s report leads toward the inevitable assumption that we are currently losing: the cost of a breach has increased by approximately 10% over last year.

While this generality may be true, it is incumbent on each reader to effectively interpret the devil hidden within the detail of statistics – and this may not be as simple as it seems. We’ll highlight this by looking at just three of the many areas covered in the report: AI, staff, and ransomware.

AI is given detailed discussion, but it is a complex area that is still only nascent. AI currently comes in two basic flavors: machine learning built into detection systems, and the use of proprietary and third party gen-AI systems. The first is the simplest, most easy to implement, and most easily measurable. According to the report, companies that use ML in detection and prevention incurred an average $2.2 million less in breach costs compared to those who did not use ML.

The second flavor – gen-AI – is more difficult to assess. Gen-AI systems can be built in house or acquired from third parties. They can also be used by attackers and attacked by attackers – but it is still primarily a future rather than current threat (excluding the growing use of deepfake voice attacks that are relatively easy to detect).

Nevertheless, IBM is concerned. “As generative AI rapidly permeates businesses, expanding the attack surface, these expenses will soon become unsustainable, compelling business to reassess security measures and response strategies. To get ahead, businesses should invest in new AI-driven defenses and develop the skills needed to address the emerging risks and opportunities presented by generative AI,” comments Kevin Skapinetz, VP of strategy and product design at IBM Security.

But we don’t yet understand the risks (although nobody doubts, they will increase). “Yes, generative AI-assisted phishing has increased, and it’s become more targeted as well – but fundamentally it remains the same problem we’ve been dealing with for the last 20 years,” said Hector.

Part of the problem for in-house use of gen-AI is that accuracy of output is based on a combination of the algorithms and the training data employed. And there is still a long way to go before we can achieve consistent, believable accuracy. Anyone can check this by asking Google Gemini and Microsoft Co-pilot the same question at the same time. The frequency of contradictory responses is disturbing.

The report calls itself “a benchmark report that business and security leaders can use to strengthen their security defenses and drive innovation, particularly around the adoption of AI in security and security for their generative AI (gen AI) initiatives.” This may be an acceptable conclusion, but how it is achieved will need considerable care.

Our second ‘case-study’ is around staffing. Two items stand out: the need for (and lack of) adequate security staffing levels, and the constant need for user security awareness training. Both are long-term problems, and neither is solvable. “Cybersecurity teams are consistently understaffed. This year’s study found more than half of breached organizations faced severe security staffing shortages, a skills gap that increased by double digits from the previous year,” notes the report.

Security leaders can do nothing about this. Staff levels are imposed by business leaders based on the current financial state of the business and the wider economy. The ‘skills’ part of the skills gap continually changes. Today there is a greater need for data scientists with an understanding of artificial intelligence – and there are very few such people available.

User awareness training is another intractable problem. It is undoubtedly necessary – and the report cites ‘employee training’ as the #1 factor in decreasing the average cost of a breach, “specifically for detecting and stopping phishing attacks”. The problem is that training always lags the threats, which change faster than employees can be trained to detect them. Right now, users might need additional training in how to detect the growing number of more convincing gen-AI phishing attacks.

Our third case study revolves around ransomware. IBM says there are three types: destructive attacks (costing $5.68 million), data exfiltration ($5.21 million), and encryption ransomware ($4.91 million). Notably, all three are above the overall mean figure of $4.88 million.

The biggest increase in cost has been in destructive attacks. It is tempting to link destructive attacks to global geopolitics since criminals focus on money while nation states focus on disruption (and also theft of IP, which incidentally has also increased). Nation state attackers can be hard to detect and prevent, and the threat will probably continue to expand for as long as geopolitical tensions remain high.

But there is one potential ray of hope found by IBM for encryption ransomware: “Costs dropped dramatically when law enforcement investigators were involved.” Without law enforcement involvement, the cost of such a ransomware breach is $5.37 million, while with law enforcement involvement it drops to $4.38 million.

These costs do not include any ransom payment. However, 52% of encryption victims reported the incident to law enforcement, and 63% of those did not pay a ransom. The argument in favor of involving law enforcement in a ransomware attack is compelling by IBM’s figures. “That’s because law enforcement has developed advanced decryption tools that help victims recover their encrypted files, while it also has access to expertise and resources in the recovery process to help victims perform disaster recovery,” commented Hector.

Our analysis of aspects of the IBM study is not intended as criticism of the report, which is a valuable and detailed study of the cost of a breach. Rather, we hope to highlight the complexity of finding specific, pertinent, and actionable insights within such a mountain of data. It is worth reading for pointers on where individual infrastructure might benefit from the experience of recent breaches. The simple fact that the cost of a breach has increased by 10% this year suggests that doing so is urgent.

Posted in Data Breaches | Tagged: Cyber Attacks, Data Breach, Data Security

DigiCert Revoking 83,000 Certificates of 6,800 Customers

Posted on August 4, 2024 - August 4, 2024 by Maq Verma

DigiCert has started revoking thousands of certificates impacted by a recently discovered verification issue, but some customers in critical infrastructure and other sectors are asking for more time. 

The certificate authority (CA) informed customers on July 29 of an incident related to domain validation, saying that it needs to revoke some certificates within 24 hours due to strict CA/Browser Forum (CABF) rules. 

The company initially said roughly 0.4% of applicable domain validations were impacted. A DigiCert representative clarified in discussions with stakeholders that 83,267 certificates and 6,807 subscribers are affected.

DigiCert said some of the impacted customers were able to quickly reissue their certificates, but others would not be able to do so within the 24-hour time frame. 

“Unfortunately, many other customers operating critical infrastructure, vital telecommunications networks, cloud services, and healthcare industries are not in a position to be revoked without critical service interruptions. While we have deployed automation with several willing customers, the reality is that many large organizations cannot reissue and deploy new certificates everywhere in time,” said Jeremy Rowley, CISO at DigiCert.

DigiCert said in an updated notification that it has been working with browser representatives and customers in an effort to delay revocations under exceptional circumstances in order to avoid disruption to critical services. 

However, the company highlighted that “all certificates impacted by this incident, regardless of circumstances, will be revoked no later than Saturday, August 3rd 2024, 19:30 UTC.”

Rowley noted that some customers have initiated legal action against DigiCert in an attempt to block the revocation of certificates.

The certificates are being revoked due to an issue related to the process used by DigiCert to validate that a customer requesting a TLS certificate for a domain is actually the owner or administrator of that domain. 

One option is for customers to add a DNS CNAME record with a random value provided by DigiCert to their domain. The random value provided by DigiCert is prefixed by an underscore character to prevent collisions between the value and the domain name. However, the underscore prefix was not added in some cases since 2019.
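To illustrate why the underscore matters, here is a minimal sketch of the collision-safety property described above. The helper names are hypothetical (DigiCert's actual validation code is not public); the point is simply that a label beginning with an underscore can never collide with an ordinary hostname, because hostnames may not start with one.

```python
import secrets

def make_validation_label(random_value: str) -> str:
    """Prefix the random value with an underscore so the resulting
    DNS CNAME label cannot collide with a real hostname."""
    return f"_{random_value}"

def is_collision_safe(label: str) -> bool:
    """Check the property that was missing in some cases since 2019:
    a validation label must carry the leading underscore."""
    return label.startswith("_")

token = secrets.token_hex(16)          # e.g. a random hex string
label = make_validation_label(token)
assert is_collision_safe(label)        # correctly prefixed label
assert not is_collision_safe(token)    # the bug: raw value, no prefix
```

Hex tokens contain only `0-9a-f`, so an unprefixed value is always indistinguishable from a plausible hostname label, which is exactly the collision risk the prefix rule exists to prevent.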

In order to comply with CABF rules, DigiCert has to revoke certificates with an issue in their domain validation within 24 hours, without exception. 

Andrew Ayer, founder of SSLMate and an expert in digital certificates, believes that DigiCert’s public notification about this incident “gets the security impact of the noncompliance completely wrong”.

“[…] this is truly a security-critical incident, as there is a real risk […] that this flaw could have been exploited to get unauthorized certificates. Revocation of the improperly validated certificates is security-critical,” Ayer said.

Posted in Vulnerability | Tagged: Cyber Attacks, Data Security, malware, Spyware

The European Union’s World-First Artificial Intelligence Rules Are Officially Taking Effect

Posted on August 4, 2024 - August 4, 2024 by Maq Verma

The European Union’s world-first artificial intelligence law formally took effect on Thursday, marking the latest milestone in the bloc’s efforts to regulate the technology.

Officials say the Artificial Intelligence Act will protect the “fundamental rights” of citizens in the 27-nation bloc while also encouraging investment and innovation in the booming AI industry.

Years in the making, the AI Act is a comprehensive rulebook for governing AI in Europe, but it could also act as a guidepost for other governments still scrambling to draw up guardrails for the rapidly advancing technology.

The AI Act covers any product or service offered in the EU that uses artificial intelligence, whether it’s a platform from a Silicon Valley tech giant or a local startup. The restrictions are based on four levels of risk, and the vast majority of AI systems are expected to fall under the low-risk category, such as content recommendation systems or spam filters.

“The European approach to technology puts people first and ensures that everyone’s rights are preserved,” European Commission Executive Vice President Margrethe Vestager said. “With the AI Act, the EU has taken an important step to ensure that AI technology uptake respects EU rules in Europe.”

The provisions will come into force in stages, and Thursday’s implementation date starts the countdown for when they’ll kick in over the next few years.

AI systems that pose “unacceptable risk,” such as social scoring systems that influence how people behave, some types of predictive policing and emotion recognition systems in schools and workplaces, will face a blanket ban by February.

Rules covering so-called general-purpose AI models like OpenAI’s GPT-4 system will take force by August 2025.

Brussels is setting up a new AI Office that will act as the bloc’s enforcer for the general-purpose AI rules.

OpenAI said in a blog post that it’s “committed to complying with the EU AI Act and we will be working closely with the new EU AI Office as the law is implemented.”

By mid-2026, the complete set of regulations, including restrictions on high-risk AI such as systems that decide who gets a loan or that operate autonomous robots, will be in force.

There’s also a fourth category for AI systems that pose a limited risk and face transparency obligations. Users must be informed when they’re interacting with a chatbot, and AI-generated content like deepfakes will need to be labelled.

Companies that don’t comply with the rules face fines worth as much as 7% of their annual global revenue.

Posted in Vulnerability | Tagged: Cyber Attacks, Data Security, malware

Sitting Ducks DNS attacks let hackers hijack over 35,000 domains

Posted on August 4, 2024 - August 4, 2024 by Maq Verma

Threat actors have hijacked more than 35,000 registered domains in so-called Sitting Ducks attacks that allow claiming a domain without having access to the owner’s account at the DNS provider or registrar.

In a Sitting Ducks attack, cybercriminals exploit configuration shortcomings at the registrar level and insufficient ownership verification at DNS providers.

Researchers at DNS-focused security vendor Infoblox and at firmware and hardware protection company Eclypsium discovered that there are more than a million domains that can be hijacked every day via the Sitting Ducks attacks.

Multiple Russian cybercriminal groups have been using this attack vector for years and leveraged the hijacked domains in spam campaigns, scams, malware delivery, phishing, and data exfiltration. 

Sitting Ducks details

Although the issues that make Sitting Ducks possible were first documented in 2016 [1, 2] by Matthew Bryant, a security engineer at Snap, the attack vector continues to be an easier way to hijack domains than other better-known methods.

For the attack to be possible, the following conditions are required:

– the registered domain either uses or delegates authoritative DNS services to a provider other than the registrar

– the authoritative name server of the record cannot resolve queries because it lacks the info about the domain (lame delegation)

– the DNS provider needs to allow claiming a domain without properly verifying ownership or requiring access to the owner’s account

Variations of the attack include partially lame delegation (not all name servers are configured incorrectly) and redelegation to another DNS provider. However, if lame delegation and exploitable provider conditions are met, the domain can be hijacked.
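The three preconditions above can be expressed as a simple predicate. This is an illustrative sketch only (the class and field names are mine, not Infoblox's), and real detection would require live DNS queries against the registrar and the provider rather than pre-populated flags:

```python
from dataclasses import dataclass

@dataclass
class DomainDelegation:
    """Simplified model of one domain's delegation state."""
    registrar: str
    dns_provider: str            # where the NS records actually point
    provider_knows_zone: bool    # False => lame delegation
    provider_verifies_owner: bool

def is_sitting_duck(d: DomainDelegation) -> bool:
    """True only when all three Sitting Ducks preconditions hold:
    1. authoritative DNS is delegated away from the registrar,
    2. the delegation is lame (the provider can't answer for the zone),
    3. the provider lets anyone claim the zone without ownership checks."""
    return (
        d.dns_provider != d.registrar
        and not d.provider_knows_zone
        and not d.provider_verifies_owner
    )

vulnerable = DomainDelegation("RegistrarA", "HostB", False, False)
safe = DomainDelegation("RegistrarA", "HostB", True, False)
assert is_sitting_duck(vulnerable)
assert not is_sitting_duck(safe)
```

Note that removing any one condition breaks the attack, which is why the defense tips later in the article focus on eliminating lame delegations and on providers verifying ownership before a zone can be claimed.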

Prerequisites diagram (Source: Infoblox)

Infoblox explains that attackers can use the Sitting Ducks method on domains that use authoritative DNS services from a provider that is different from the registrar, such as a web hosting service.

If the authoritative DNS or web hosting service for the target domain expires, an attacker can simply claim it after creating an account with the DNS service provider.

The threat actor can then set up a malicious website under the domain and configure DNS settings to resolve requests for the domain to attacker-controlled IP addresses, while the legitimate owner is unable to modify the DNS records.

“Sitting Ducks” overview (Source: Infoblox)

Attacks in the wild

Infoblox and Eclypsium report that they have observed multiple threat actors exploiting the Sitting Ducks (or Ducks Now Sitting – DNS) attack vector since 2018 and 2019.

Since then, there have been at least 35,000 domain hijacking cases using this method. Typically, the cybercriminals held the domains for a short period, but in some instances they kept them for up to a year.

There have also been occurrences where the same domain was hijacked by multiple threat actors successively, who used it in their operations for one to two months and then passed it on.

GoDaddy is confirmed as a victim of Sitting Ducks attacks, but the researchers say six DNS providers are currently vulnerable.

The observed clusters of activity leveraging Sitting Ducks are summarized as follows:

  • “Spammy Bear” – Hijacked GoDaddy domains in late 2018 for use in spam campaigns.
  • “Vacant Viper” – Started using Sitting Ducks in December 2019 and has hijacked roughly 2,500 domains each year since, using them in the 404TDS system that distributes IcedID and to set up command and control (C2) domains for malware.
  • “VexTrio Viper” – Started using Sitting Ducks in early 2020 to utilize the domains in a massive traffic distribution system (TDS) that facilitates the SocGholish and ClearFake operations.
  • Unnamed actors – Several smaller and unknown threat actors creating TDS, spam distribution, and phishing networks.

Defense tips

Domain owners should regularly review their DNS configurations for lame delegations, especially on older domains, and update the delegation records at the registrar or authoritative name server with proper, active DNS services.

Registrars are advised to perform proactive checks for lame delegations and alert owners. They should also ensure that a DNS service is established before propagating name server delegations.

Ultimately, regulators and standards bodies must develop long-term strategies to address DNS vulnerabilities and press DNS providers under their jurisdictions to take more action to mitigate Sitting Ducks attacks.

Posted in Cyber Attacks | Tagged: Cyber Attacks, Data Security

DuckDuckGo blocked in Indonesia over porn, gambling search results

Posted on August 4, 2024 - August 4, 2024 by Maq Verma

Privacy-focused search engine DuckDuckGo has been blocked in Indonesia by its government after citizens reportedly complained about pornographic and online gambling content in its search results.

The government’s choice to block DuckDuckGo isn’t surprising considering the cultural and religious context: Indonesia is a Muslim-majority country where gambling is prohibited and pornography is viewed as morally unacceptable.

In January 2024, Indonesia announced it blocked nearly 600,000 online gambling portals and took action against 5,000 bank accounts that engaged with them.

The government has previously blocked numerous pornography sites, Reddit, and Vimeo, and imposed temporary or partial restrictions on Tumblr, Telegram, TikTok, Netflix, and Badoo.

DuckDuckGo has now confirmed to BleepingComputer that Indonesia has blocked its search engine in the country and that there is currently no path to getting unblocked.

“We can confirm that DuckDuckGo has been blocked in Indonesia due to their censorship policies. Unfortunately, there is no current path to being unblocked, similar to how we’ve been blocked in China for about a decade now,” DuckDuckGo told BleepingComputer.

At the same time, Google Search remains accessible in Indonesia, which suggests that either the tech giant has implemented effective self-censorship mechanisms for its local search engine or its size makes blocking too disruptive for internet usage in the country.

Indonesians have resorted to using VPN software to bypass the government’s restrictions. However, the Indonesian government plans to block free VPNs, making gaining access to blocked sites costly.

Free VPNs next

Virtual Private Network (VPN) tools are commonly used to bypass censorship imposed by governments and internet service providers.

When using VPNs, users can make connections from other countries to once again access DuckDuckGo, but free offerings may soon be removed.

Minister of Communication and Information Budi Arie Setiadi stated that the government intends to restrict access to free VPN tools, as they know these are used to access blocked online gambling portals.

“Yesterday, Mr. Hokky (Ministry’s Director General of Informatics Applications) had a meeting with Mr. Wayan (Ministry’s Director General of Postal and Information Technology Operations), and we will shut down free VPNs to reduce access to networks for the general public to curb the spread of online gambling,” stated Setiadi on June 31, 2024.

“I specifically have to include the issue of online gambling to make it clear that this is the darkest side of digitalization.”

The same ministry announcement highlighted the risks of free VPN services, underlining personal data theft, malware infections, and making internet connectivity slow or unreliable.

Posted in Data Breaches, Vulnerability | Tagged: Cyber Attacks, Data Security, malware

CrowdStrike sued by investors over massive global IT outage

Posted on August 4, 2024 - August 4, 2024 by Maq Verma

Cybersecurity company CrowdStrike has been sued by investors who say it made false claims about its Falcon platform, after a bad security update led to a massive global IT outage that sent the stock price tumbling almost 38%.

The plaintiffs claim that the massive IT outage that occurred on July 19, 2024, proves CrowdStrike’s claims that their cybersecurity platform is thoroughly tested and validated are false.

As a result of this incident and its aftermath, CrowdStrike’s stock price has tumbled almost 38% from $343 on July 18 to $214, causing significant financial losses to investors.
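The quoted decline checks out arithmetically; a one-line sanity check:

```python
# Verify the reported ~38% fall from $343 (July 18) to $214.
drop = (343 - 214) / 343
assert round(drop * 100, 1) == 37.6  # "almost 38%"
```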

The class action lawsuit, filed by the Plymouth County Retirement Association in the U.S. District Court in Austin, Texas, seeks compensatory damages for these losses.

A bad update causes a global IT outage

On July 19, CrowdStrike pushed a faulty Falcon sensor update to Windows devices running the security software. The update slipped past CrowdStrike’s internal tests due to a bug in its content validator and inadequate testing procedures.

The update was received by at least 8.5 million Windows devices, causing an out-of-bounds memory read when processed by Falcon and crashing the operating system with a Blue Screen of Death (BSOD).

CrowdStrike is widely used in enterprises, including airports, hospitals, government organizations, the media, and financial firms, so the faulty update caused catastrophic, costly, and even dangerous IT outages.

As restoring systems required staff to remove the faulty update manually, it took days for some companies to resume normal operations, leading to extended outages and delays.

While most have returned to normal operations, the fallout from the incident continues to unfold on multiple levels, including elevated cybercrime activity, loss of trust, and litigation threats.

According to the plaintiffs, the faulty Falcon update proved that contrary to CrowdStrike’s assurances around the diligence in its procedures and the efficacy and reliability of the Falcon platform, updates were inadequately tested and controlled, and the risk of outages is high.

The class action alleges that stockholders were defrauded by CrowdStrike’s knowingly false statements about the quality of its products and procedures.

“Because of their positions and access to material, nonpublic information, the Individual Defendants knew or recklessly disregarded that the adverse facts specified herein had not been disclosed to and were being concealed from the investing public and that the positive representations that were being made were false and misleading.” – Class action document.

To reflect the extent of the losses, the lawsuit notes that CrowdStrike’s stock price fell by 11% on the day of the incident, another 13.5% on July 22 when Congress called CEO George Kurtz to testify, and another 10% on July 29 following news that Delta Airlines, one of the impacted entities, had hired an attorney to seek damages.

The plaintiff alleges violations of Sections 10(b) and 20(a) of the Exchange Act and seeks compensation.

Financial impact

The IT outage caused by the CrowdStrike Falcon update has caused massive financial losses to impacted organizations, with many of them exploring litigation pathways to get some of it back.

Delta Airlines CEO Ed Bastian previously stated that the outage forced the company to cancel 2,200 flights, resulting in losses estimated at $500 million.

The firm has already hired a law firm that will seek compensation from CrowdStrike and Microsoft, which is now in the crosshairs despite not being responsible for the incident.

Market analysts estimate that the outage has caused big enterprises $5.4 billion in losses.

A report by Guy Carpenter projects the estimated insured losses resulting from the bad Falcon update to be between $300 million and $1 billion, while CyberCube has raised the figure to $1.5 billion.

Posted in Data Breaches | Tagged: Data Security, Safe Delete, Safe Erase, Spyware

Fake AI editor ads on Facebook push password-stealing malware

Posted on August 4, 2024 by Maq Verma

A Facebook malvertising campaign targets users searching for AI image editing tools and steals their credentials by tricking them into installing fake apps that mimic legitimate software.

The attackers exploit the popularity of AI-driven image-generation tools by creating malicious websites that closely resemble legitimate services, tricking potential victims into infecting themselves with information-stealing malware, according to the Trend Micro researchers who analyzed the campaign.

The attacks start with phishing messages sent to Facebook page owners or administrators, directing them to fake account-protection pages designed to trick them into providing their login information.

After stealing their credentials, the threat actors hijack their accounts, take control of their pages, publish malicious social media posts, and promote them via paid advertising.

“We discovered a malvertising campaign involving a threat actor that steals social media pages (typically related to photography), changing their names to make them seem connected to popular AI photo editors,” said Trend Micro threat researcher Jaromir Horejsi.

“The threat actor then creates malicious posts with links to fake websites made to resemble the actual website of the legitimate photo editor. To increase traffic, the perpetrator then boosts the malicious posts via paid ads.”

Fake AI photo editor website (Trend Micro)

Facebook users who click the URL promoted in the malicious ad are sent to a fake web page impersonating legitimate AI photo editing and generating software, where they are prompted to download and install a software package.

However, instead of AI image editing software, the victims install the legitimate ITarian remote desktop tool configured to launch a downloader that automatically deploys the Lumma Stealer malware.

The malware then quietly infiltrates their system, allowing the attackers to collect and exfiltrate sensitive information like credentials, cryptocurrency wallet files, browser data, and password manager databases.

This data is later sold to other cybercriminals or used by the attackers to compromise the victims’ online accounts, steal their money, and promote further scams.

Attack flow (Trend Micro)

“Users should enable multi-factor authentication (MFA) on all social media accounts to add an extra layer of protection against unauthorized access,” Horejsi advised.

“Organizations should educate their employees on the dangers of phishing attacks and how to recognize suspicious messages and links. Users should always verify the legitimacy of links, especially those asking for personal information or login credentials.”

In April, a similar Facebook malvertising campaign promoted a malicious page impersonating Midjourney to target almost 1.2 million users with the Rilide Stealer Chrome browser extension.

Posted in Cyber Attacks | Tagged: Cyber Attacks, malware, Ransomware, Spyware


Proudly powered by Admiration Tech News | Copyright ©2023 Admiration Tech News | All Rights Reserved