The free European energy market has gone through big changes in recent years. In Europe today, the owner of the energy-transport infrastructure may not be the same party as the contractual supplier of energy to end users (consumers).
Energy companies are merging, and their internal IT landscapes are now a mélange of different hardware and software solutions. Recently a supplier called KT for help. Their IT consolidation project had left them unable to provide contract owners with daily energy consumption data by the mandatory deadline.
The threat of multi-million Euro penalties focused attention.
A workaround met the external requirements, but not the internal company standards for production. The root cause had to be found in order to drive a high-quality technical resolution.
Although some staff in this company are trained on KT Resolve for troubleshooting in high-tech environments, they decided to have a KT consultant facilitate the troubleshooting process so they could focus completely on the technical details.
Situation Appraisal and Problem Analysis were performed to unravel why the migrated application environment was not performing fast enough.
Situation Appraisal found that the issue to be solved was slowness in the interface sending energy end-user connection and consumption information to other (external) parties in the energy sector.
The processing of this information typically goes through several phases, with intermediate information saved on disk. To guarantee data integrity if the application stops in the middle of processing, a lot of effort goes into properly saving the intermediate information, so that processing can continue once the application is up and running again, potentially even on another machine. To achieve this redundant solution, the application uses “filer” functionality to avoid unnecessary disk writing. Unfortunately, the filer is involved in the slow performance: the issue disappears when the filer is worked around. The interesting detail is that this filer works properly in other environments, at acceptable performance levels.
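The phased processing with saved intermediate results described above can be illustrated with a minimal sketch. This is not the supplier's application; the function names (save_checkpoint, load_checkpoint, run_pipeline), the JSON checkpoint file, and the phase functions are all assumptions made for illustration. It shows one common way to make each intermediate save durable (write to a temporary file, force it to storage, then rename) so that a restarted process can resume at the first unfinished phase.

```python
import json
import os
import tempfile


def save_checkpoint(path, state):
    """Write state to path atomically so a crash never leaves a half-written file."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as handle:
            json.dump(state, handle)
            handle.flush()
            os.fsync(handle.fileno())  # force the data to storage before renaming
        os.replace(tmp_path, path)  # atomic rename within one filesystem
    except BaseException:
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise


def load_checkpoint(path):
    """Return the saved state, or a fresh state if no checkpoint exists yet."""
    if not os.path.exists(path):
        return {"next_phase": 0, "data": None}
    with open(path) as handle:
        return json.load(handle)


def run_pipeline(phases, initial_data, checkpoint_path):
    """Run the phases in order, saving the intermediate result after each one."""
    state = load_checkpoint(checkpoint_path)
    if state["next_phase"] == 0:
        state["data"] = initial_data
    for index in range(state["next_phase"], len(phases)):
        state["data"] = phases[index](state["data"])
        state["next_phase"] = index + 1
        save_checkpoint(checkpoint_path, state)
    return state["data"]
```

If the checkpoint file lives on shared storage, a second machine can call run_pipeline with the same path and continue where the first one stopped. The price is one synchronous write per phase, which is exactly the kind of operation that becomes visible when the storage path is slow.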
During the facilitation session, KT focused on documenting the factual data for this issue, looking for helpful contrasts between the working and non-working environments that were available. Based on available knowledge and experience, a list of possible causes was put together, including network performance and network driver issues. In the end it was concluded that the most probable cause was a delay in the handling of disk-writing actions between (and excluding) the application and the network driver. Because in-depth Unix expertise was not available in this session, the session outcome was a clear model of the problem, and an understanding of which experts were required and which area required technical focus in order to find the root cause.
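One way to gather comparable facts about disk-writing delays in a working and a non-working environment is a small timing probe. The sketch below is an assumption-laden illustration, not the tooling used in the session: the function names and the default sample count and payload size are hypothetical. It times a write followed by fsync, so it measures the whole path below the application and cannot by itself tell which layer adds the delay. Running it against both a local directory and a directory on the filer in each environment gives the contrast.

```python
import os
import statistics
import tempfile
import time


def measure_sync_write_latency(directory, samples=20, payload_size=4096):
    """Return the latency in seconds of each write+fsync to a temp file in directory."""
    payload = b"x" * payload_size
    latencies = []
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        for _ in range(samples):
            start = time.perf_counter()
            os.write(fd, payload)
            os.fsync(fd)  # wait until the storage layer reports the write as done
            latencies.append(time.perf_counter() - start)
    finally:
        os.close(fd)
        os.remove(path)
    return latencies


def summarize(latencies):
    """Condense raw latencies into median and worst case, in milliseconds."""
    return {
        "median_ms": statistics.median(latencies) * 1000,
        "max_ms": max(latencies) * 1000,
    }
```

A call such as summarize(measure_sync_write_latency("/path/to/mount")) can then be repeated per environment and per mount point; the path shown is a placeholder.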
A few weeks later, KT was informed that the root cause had been found in the underlying storage. To resolve the issue, the architecture of this IT solution, which was in its roll-out phase, had to go through a substantial redesign.
By Berrie Schuurhuis, KT Consultant
KT has a powerful toolkit for root cause analysis and preventing future problems