Self-healing services in MaaS applications

Automated communication between digital actors can lead to unexpected errors. These can be caused by inconsistent or improperly formatted data, or by network issues such as an unreachable service or other failures that result in error codes. How to handle these errors depends on the specific situation. The question is often whether the error should simply be passed on, for example to users, or whether an attempt should be made to correct it automatically, for example by retrying a failed server request.
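The choice between passing an error on and retrying can be sketched in a few lines of Python. This is a minimal illustration, not the platform's actual code: the names `TransientError` and `call_with_retry`, the three attempts and the backoff delays are all assumptions made for this example.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class TransientError(Exception):
    """Hypothetical error type for failures worth retrying (e.g. a timeout)."""


def call_with_retry(
    request: Callable[[], T],
    max_attempts: int = 3,
    base_delay_s: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run `request`, retrying transient failures with exponential backoff.

    Non-transient exceptions are passed on immediately. The last transient
    failure is passed on once all attempts are used up.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return request()
        except TransientError:
            if attempt == max_attempts:
                raise  # give up and pass the error on to the caller
            # Wait 0.5 s, 1 s, 2 s, ... before the next attempt.
            sleep(base_delay_s * 2 ** (attempt - 1))
    raise ValueError("max_attempts must be at least 1")
```

Errors that a retry cannot fix, such as improperly formatted data, would not be raised as `TransientError` here and therefore reach the caller right away.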

Implementing self-healing mechanisms can be a strategic decision to mitigate the effects of errors and ensure that the system continues to work reliably even when failures occur. However, designing such self-healing systems requires careful planning and thorough testing to ensure that they recover from failures as expected. A variety of mechanisms are available, from automatic failover and redundancy checks to automatic retry mechanisms.

Typical sources of error in MaaS applications

In Mobility as a Service (MaaS) applications, dealing with unexpected errors is a challenge. In particular, vehicles that are on the move or parked in places with poor network coverage are a potential source of error. A reliable MaaS platform must be able to handle these cases without degrading the user experience. For example, it can be frustrating when a customer tries to interact with a vehicle via an app (to retrieve information or complete a booking) and simply receives an error message asking them to try again later.

Use cases in the SmartMove platform that are susceptible to such problems include starting or ending a car-sharing booking. In these cases, the system tries to retrieve information from the vehicle in order to collect data or to check whether the vehicle has been returned correctly. For example, it must check whether the vehicle is in the right place when it is returned. However, some of these parking spaces are located in rural areas or in underground car parks, where an Internet connection may not be available. Asking the customer to try returning the vehicle again later is inadequate, particularly when there is a risk of late penalties and higher costs. Instead, the risk of errors should be minimized. If an error does occur, the customer should be informed in detail and appropriate steps should be taken for manual intervention. The vehicle must always end up correctly returned, and the customer should never feel left alone.

Half-full parking lot seen from a bird's-eye view, with a fleet management logo
Connectivity problems when communicating with vehicles can quickly become a challenge.

Our solution approach in the SmartMove platform

Based on the vehicle return scenario outlined above, we would like to show how we react to potential errors in our SmartMove platform. As mentioned, our platform tries to contact the vehicle before completing the return in order to check the parking location, charge level and other contractually agreed return criteria.

Before we talk about self-healing functionality, we'd like to show what happens when an error can't be fixed automatically. For this case, we have designed our services to accompany and support a manual intervention process. If our system cannot reach the car, we ask the user for help: we ask for photos of the car, e.g. to document the parking location and mileage. At the same time, the associated booking is marked as faulty, which alerts our support team. They can then use the submitted images to manually check the return criteria and complete the booking. All other processes on the SmartMove platform that depend on correct booking information are paused until the booking has been completed.

Even though an error can be handled with manual help and appropriate platform mechanisms, it is still advisable to reduce the probability of errors in the first place. That is why we have taken measures that correct errors automatically, i.e. let the platform heal itself. To this end, we created a new service that periodically retrieves data from vehicles and stores it. As a result, we know at least some of the characteristics of a vehicle from the recent past. If a car suddenly can no longer be reached and its position is unknown, we can look at the previously stored data. If the last stored data set is not too old, we can use it instead of live information.
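The fallback from live data to stored data can be sketched as follows. This is a simplified illustration under stated assumptions: the names `VehicleSnapshot`, `SnapshotStore`, `VehicleUnreachable` and `get_vehicle_state`, the snapshot fields and the 15-minute age limit are hypothetical and not taken from the SmartMove code base.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Callable, Dict, Optional


@dataclass(frozen=True)
class VehicleSnapshot:
    vehicle_id: str
    latitude: float
    longitude: float
    charge_percent: float
    recorded_at: datetime


class VehicleUnreachable(Exception):
    """Raised by the (hypothetical) live lookup when the car cannot be contacted."""


class SnapshotStore:
    """Keeps the most recent snapshot per vehicle, as written by a periodic poller."""

    def __init__(self) -> None:
        self._latest: Dict[str, VehicleSnapshot] = {}

    def save(self, snapshot: VehicleSnapshot) -> None:
        self._latest[snapshot.vehicle_id] = snapshot

    def latest(self, vehicle_id: str) -> Optional[VehicleSnapshot]:
        return self._latest.get(vehicle_id)


def get_vehicle_state(
    vehicle_id: str,
    fetch_live: Callable[[str], VehicleSnapshot],
    store: SnapshotStore,
    max_age: timedelta = timedelta(minutes=15),  # assumed freshness limit
    now: Optional[datetime] = None,
) -> Optional[VehicleSnapshot]:
    """Prefer live data; fall back to a stored snapshot that is not too old."""
    now = now or datetime.now(timezone.utc)
    try:
        snapshot = fetch_live(vehicle_id)
        store.save(snapshot)  # every successful read refreshes the store
        return snapshot
    except VehicleUnreachable:
        cached = store.latest(vehicle_id)
        if cached is not None and now - cached.recorded_at <= max_age:
            return cached
        return None  # nothing usable: the manual process has to take over
```

What counts as "not too old" depends on the use case. A parked vehicle's position stays valid longer than the position of a vehicle in motion, so a real system would probably choose the limit per check rather than globally.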

We can also let other services know when a previously unreachable vehicle reconnects. All services that depend on live information can then continue their operations, which were pending due to the vehicle's connectivity issues.
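One way to model this notification is a small publish/subscribe component that fires only on the transition from unreachable to reachable. The sketch below is an assumption about how such a mechanism could look. `ConnectivityMonitor` and its methods are hypothetical names, and in a distributed platform the handlers would more likely be messages on a broker than in-process callbacks.

```python
from typing import Callable, List, Set

ReconnectHandler = Callable[[str], None]


class ConnectivityMonitor:
    """Tracks unreachable vehicles and notifies subscribers when one reconnects."""

    def __init__(self) -> None:
        self._unreachable: Set[str] = set()
        self._handlers: List[ReconnectHandler] = []

    def on_reconnect(self, handler: ReconnectHandler) -> None:
        """Register a callback that receives the ID of a reconnected vehicle."""
        self._handlers.append(handler)

    def report(self, vehicle_id: str, reachable: bool) -> None:
        """Record the result of the latest contact attempt for a vehicle."""
        if not reachable:
            self._unreachable.add(vehicle_id)
            return
        # Notify only on the unreachable -> reachable transition, not on
        # every successful poll of a vehicle that was never offline.
        if vehicle_id in self._unreachable:
            self._unreachable.discard(vehicle_id)
            for handler in self._handlers:
                handler(vehicle_id)
```

A service with pending work would register a handler via `on_reconnect` and resume its operation for the given vehicle ID when the handler is called.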

So if a booking is marked as faulty due to connection problems, we automatically check, based on stored data that is not too old, whether the car meets all requirements for completing the booking. If all requirements are met, we remove the error mark from the booking and end it. Otherwise, the manual intervention process described above is used.
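The check itself can be sketched roughly like this. All names, the status values and both thresholds (maximum data age and minimum charge level) are assumptions made for illustration. The real return criteria are contractually defined and may include more than location and charge level.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from enum import Enum
from typing import Optional


class BookingStatus(Enum):
    ACTIVE = "active"
    FAULTY = "faulty"        # carries the error mark, support team is alerted
    COMPLETED = "completed"


@dataclass
class Booking:
    booking_id: str
    status: BookingStatus


@dataclass(frozen=True)
class StoredVehicleData:
    in_return_zone: bool     # hypothetical result of the parking location check
    charge_percent: float
    recorded_at: datetime


def try_self_heal(
    booking: Booking,
    data: Optional[StoredVehicleData],
    now: datetime,
    max_age: timedelta = timedelta(minutes=15),  # assumed freshness limit
    min_charge_percent: float = 20.0,            # assumed return criterion
) -> bool:
    """Complete a faulty booking if recent stored data meets the return criteria.

    Returns True if the booking was healed, False if the manual process is
    still required.
    """
    if booking.status is not BookingStatus.FAULTY:
        return False
    if data is None or now - data.recorded_at > max_age:
        return False  # no stored data, or the data is too old to trust
    if not data.in_return_zone or data.charge_percent < min_charge_percent:
        return False  # return criteria not met
    booking.status = BookingStatus.COMPLETED  # remove the error mark and end it
    return True
```

When the function returns `False`, the booking keeps its error mark and remains with the support team.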


Experience from implementing self-healing services

Storing vehicle data regularly in order to recover from connection problems has brought further benefits. Above all, the user experience has improved considerably: when an error occurs, it is often no longer necessary to contact the support team, because the problem can resolve itself.

Even when vehicles are parked in garages with poor network coverage, the connection is often restored for a short while, which is more than enough to retrieve important information about the car. As a result, vehicles only very rarely have to be moved manually to get them back into operation. In the end, manual work was reduced and the experience improved for both customers and our staff.

Kevin
Senior Backend Engineer
