By Govindaraj Rangan
The IT industry underwent a rapid and inorganic growth period over the last two decades. The sourcing, fulfillment, and operations of enterprise IT happened discretely in different islands as the different business units had varied amounts of technology leverage to accelerate their business growth.
The resultant state is what we see as distributed and siloed IT with a complex cobweb of integrations. In most cases, it is hugely costly to undo the complexity that we developed, simplify the infra, application, and data landscape, and become more agile to take up new business opportunities. This cost of change is due to the “Technical Debt” that accrued as we kept postponing the sanity and hygiene checks for future readiness. As more Web-scale companies leverage the Social Mobile Analytics Cloud (SMAC) revolution with fail-fast architectures and demonstrate newer ways of doing business, it is no longer a choice but a necessity for enterprises to break the rigidity of their legacy and build an agile technology platform. Here, we guide customers with approaches that we learnt, developed, and tested.
“Technical Debt,” as Wikipedia defines it, citing Refactoring for Software Design Smells: Managing Technical Debt (1st edition) by Girish Suryanarayana, Ganesh Samarthyam, and Tushar Sharma, refers to the eventual consequences of a system architecture design, especially the left-out or postponed work, which make future changes expensive or impossible. Though the term was originally used in the context of software development, it is equally applicable to enterprise IT architectures.
Technical Debt introduces “Cost of Change” challenges that many enterprises fail to appropriately address. As Web-scale companies continue to build and operate their businesses on change-ready, fail-fast architectures, traditional enterprises struggle to do the same and, as a consequence, jeopardize their ability to compete and, ultimately, survive. The SMAC trend is pushing enterprises to change and making it inevitable that they work on their Technical Debt.
The Field, from the Bird’s Eye
The transformation can be overwhelming to execute. It can take months or even years to achieve agility across the IT landscape; business units, though, cannot wait until an agile infrastructure is completely ready before seeking out new business opportunities. Industry experts recommend a bimodal or two-speed architecture to get going. Creating another high-speed infrastructure will only increase the Technical Debt if it is built with our conventional design patterns, thinking in terms of the various integration points that need to be developed and maintained. The bimodal or two-speed infrastructure and operating model must be designed with the highest degree of decoupling, with proper care and diligence to avoid increasing complexity any further.
While designing the new architecture and operating model, business processes, application landscape, infrastructure design, unified operations model, and organizational culture are key factors to consider. The technology components need to be built with transient architecture patterns and must operate at the highest level of decoupling.
Where to Begin
It’s business. It’s recommended to start with service maps correlating business processes with the application and infrastructure components. Service maps are built starting with each business process and tracking down the dependencies layer-after-layer below the applications serving the business process. Representatives from both enterprise IT and business IT need to do business process understanding/re-engineering exercises to identify unused application modules/features. This should not be a full-fledged application platform rationalization exercise, but a simple business-IT handshake.
The representatives should base their discussions on service maps and utilization data, obtained through auto-generated configuration management database (CMDB) and application performance management (APM) tools. The objectives are two-fold: first, start with a business process and identify all the applications it depends on; and second, using the utilization data, segment the features/modules based on criticality and identify those actually being used.
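As a minimal sketch of the first objective, a service map can be treated as a dependency graph and walked layer by layer, starting from a business process. The component names and the `SERVICE_MAP` dictionary below are hypothetical; in practice the dependency edges would come from whatever the CMDB tool exports.

```python
# Hypothetical service map: each key depends on the components listed as its value.
SERVICE_MAP = {
    "order-to-cash": ["billing-app", "crm-app"],
    "billing-app": ["billing-db", "payment-gateway"],
    "crm-app": ["crm-db"],
    "billing-db": ["san-storage-01"],
    "crm-db": ["san-storage-01"],
    "payment-gateway": [],
    "san-storage-01": [],
}


def dependencies_by_layer(service_map, business_process):
    """Return the components a business process depends on, grouped by depth."""
    layers = []
    seen = {business_process}
    frontier = [business_process]
    while frontier:
        next_frontier = []
        for component in frontier:
            for dependency in service_map.get(component, []):
                # Record each shared component only once, at its shallowest layer.
                if dependency not in seen:
                    seen.add(dependency)
                    next_frontier.append(dependency)
        if next_frontier:
            layers.append(next_frontier)
        frontier = next_frontier
    return layers


if __name__ == "__main__":
    layers = dependencies_by_layer(SERVICE_MAP, "order-to-cash")
    for depth, layer in enumerate(layers, start=1):
        print(depth, layer)
```

Shared components such as the storage array appear once, which makes it easier to see where several business processes converge on the same infrastructure.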
Gathering Insight
CMDB and APM tools simplify the task of identifying dependencies and gathering insights into patterns and behavior of utilization. It’s necessary to have or identify a service owner for each component of the service map who will own the lifecycle of that component. As the ownerships and dependencies become mapped, there will be dependable insight into what is being used, how it is used, and who is using it. Monitoring tools can also reveal the amount of CPU, memory, storage, and network capacity that an application demands during normal operation and peak loads.
Get the Foundation Right with a Two-Speed or Bimodal IT
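To illustrate the last point, the sketch below summarizes normal and peak demand from a series of utilization samples. The sample values and application names are hypothetical, and using the median for "normal" and the maximum for "peak" is an assumption made for illustration, not something a particular monitoring tool prescribes.

```python
from statistics import median

# Hypothetical CPU utilization samples (percent), e.g. exported from an APM tool.
CPU_SAMPLES = {
    "billing-app": [22, 25, 24, 80, 23, 26],
    "crm-app": [10, 12, 11, 13, 12, 11],
}


def demand_profile(samples):
    """Summarize normal (median) and peak (maximum) demand for one application."""
    return {"normal": median(samples), "peak": max(samples)}


if __name__ == "__main__":
    for app, samples in CPU_SAMPLES.items():
        print(app, demand_profile(samples))
```

The same summary could be repeated for memory, storage, and network samples to size the target infrastructure for each workload.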
As businesses grew over time, more focus was placed on the applications necessary to support a business process, while the infrastructure agility was often overlooked. Most of the technical debt that enterprises accrued was a result of changes postponed in the infrastructure space. Later, as the public cloud providers built a robust, highly available, scalable, reconfigurable, and programmable infrastructure, it became clear that the infrastructure agility is as important to business as the actual application and data architecture. Since moving from rigid systems into agile cloud infrastructure is going to take time, experts propose a bimodal or two-speed infrastructure foundation.
Legacy Mode
Application systems that must remain on a rigid infrastructure, and are critical to business, will continue to be so for a considerable amount of time. Such infrastructure has to be maintained and operated in an isolated manner. However, it is important that enterprises evaluate its total cost of ownership and weigh that against the benefits it would bring to business in the future. Even so, it is important to build a retirement or replacement plan to eliminate it over a period of time. In the interim, it has to be made manageable by the unified operations platform without the operations platform needing to change to accommodate it. These systems can have an isolated management and monitoring toolset, and a selective subset of their data can be taken to the unified operations platform for consolidated visibility into IT operations.
Cloud Mode
A separate infrastructure needs to be implemented to operate in the cloud mode. It has to be a policy that all new applications or workloads are built cloud ready, assuming the transient nature of infrastructure, so that changes can easily be made to the infrastructure. This infrastructure can leverage the public cloud providers’ assets or can be built within the enterprise’s data centers. It is recommended to use hyper-converged architectures built on commodity hardware elements, which can be started small and expanded as the business grows. While the cloud is built or leveraged from a third party, it is important to plan for hybrid IT to operate across multiple providers. The infrastructure must also be made programmable by exposing every element in it through APIs.
Unified Operations Platform
The next important step is to have a unified operations platform, which can manage legacy, private clouds, public clouds, and partner hosted clouds. The incident aggregation and handling may be kept modular in such a way that the tools or modules used for legacy can be discarded without affecting the cloud mode systems. Apart from the operational aspects of IT, this platform should also include capabilities like service brokering, fulfillment orchestration, reporting, and financial accountability.
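One way to picture this modularity is a platform that aggregates incidents from independently registered sources, so that the legacy module can be discarded later without touching the cloud-mode one. This is a minimal sketch under that assumption; the class, function names, and incident fields are all hypothetical and do not represent any specific product.

```python
from typing import Callable, Dict, List

Incident = Dict[str, str]
IncidentSource = Callable[[], List[Incident]]


class UnifiedOperationsPlatform:
    """Aggregates incidents from independently pluggable sources."""

    def __init__(self) -> None:
        self._sources: Dict[str, IncidentSource] = {}

    def register(self, name: str, source: IncidentSource) -> None:
        self._sources[name] = source

    def discard(self, name: str) -> None:
        # Removing one module leaves every other registered source untouched.
        self._sources.pop(name, None)

    def incidents(self) -> List[Incident]:
        collected: List[Incident] = []
        for name, source in self._sources.items():
            for incident in source():
                # Tag each incident with the module it came from.
                collected.append({**incident, "source": name})
        return collected


def legacy_monitor() -> List[Incident]:
    # Hypothetical adapter forwarding a selective subset of legacy alerts.
    return [{"id": "L-1", "summary": "Mainframe batch job delayed"}]


def cloud_monitor() -> List[Incident]:
    # Hypothetical adapter for cloud-mode workloads.
    return [{"id": "C-1", "summary": "Autoscaling group at capacity"}]


if __name__ == "__main__":
    platform = UnifiedOperationsPlatform()
    platform.register("legacy", legacy_monitor)
    platform.register("cloud", cloud_monitor)
    print(platform.incidents())
    platform.discard("legacy")  # retire the legacy toolset
    print(platform.incidents())
```

Because the platform only knows each source by name and a callable, retiring the legacy toolset is a single `discard` call rather than a change to the platform itself.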
Treating a Workload and Placing it in the Right Infra
Modernizing and “right-targeting” a workload needs sufficient due diligence, and hence can be overwhelming if the number of applications is in the hundreds. It’s recommended to do this in two steps. In the first step, segment the applications based on application platform, development methodology used, commercial off-the-shelf vendor, and so on, to arrive at a list of the different archetypes that the applications are built on. This would help distinguish the easy ones from the complex ones and identify patterns for which treatment would be similar. In the second step, deal with application specifics.
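The first step can be sketched as a simple grouping of the application inventory by a tuple of attributes. The inventory, attribute values, and vendor name below are hypothetical, and the choice of exactly three attributes as the archetype key is an assumption for illustration.

```python
from collections import defaultdict

# Hypothetical application inventory; attribute values are illustrative only.
APPLICATIONS = [
    {"name": "payroll", "platform": "Java EE", "methodology": "waterfall", "vendor": None},
    {"name": "hr-portal", "platform": "Java EE", "methodology": "waterfall", "vendor": None},
    {"name": "crm", "platform": "COTS", "methodology": "n/a", "vendor": "VendorA"},
    {"name": "quotes", "platform": ".NET", "methodology": "agile", "vendor": None},
]


def group_by_archetype(applications):
    """Group application names by (platform, methodology, vendor)."""
    archetypes = defaultdict(list)
    for app in applications:
        key = (app["platform"], app["methodology"], app["vendor"])
        archetypes[key].append(app["name"])
    return dict(archetypes)


if __name__ == "__main__":
    for archetype, names in group_by_archetype(APPLICATIONS).items():
        print(archetype, names)
```

Each resulting group is a candidate for a shared treatment pattern, which leaves only the application specifics for the second step.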
Treatment Patterns
There are two phases to the treatment patterns: identifying the right R-lane, and building or outsourcing the set of work packages needed to treat the applications in each of the R-lanes. The R-lanes are an extrapolation of Gartner’s 5-R model.
Identify the R-lane. There are eight possible lanes that an application can be passed through. One of them, “reduce,” is more of a sanity check than a full-fledged lane.
Reduce. Are there any unused modules in an application that I can discard? If yes, look at retiring those portions, minimize the scope of the application, and then look at putting them on the lanes.
Retire. Can I retire the application? Archive the data and decommission the application.
Rehost. Can I move from physical to virtual x86 platforms? Can applications on mainframe, Unix, and other non-x86 systems be moved to x86 systems? Use P2V migration or platform simulation tools to move the application “as-is” from legacy hardware to x86 cloud.
Replace. Can I replace the application with a cloud-ready alternative or SaaS service? If an application has reached its end-of-life and there is a cloud-ready alternative available, either as an in-house installation or offered as a SaaS service, it would be a good candidate to replace. Employee engagement systems such as email, collaboration, and human capital management solutions are good candidates to be replaced with a SaaS service.
Revise. Can I upgrade the application to a newer version available from the supplier? Verify that the newer version is cloud ready and pass through the upgrade lane.
Refactor. Can I use the majority of the code and make minor changes to make it cloud ready? Applies to home-grown applications which may need minor tweaks to the codebase to make them cloud ready.
Rebuild. Should I completely rebuild the application, since it needs a lot of redesign and added functionality? These are great candidates for microservices-based, cloud-ready, Web-scale architectures. It may sound expensive to consider a “rebuild,” but using an agile DevOps platform, the must-have features can be addressed first in a short period of time, and additional features can be added on an ongoing basis.
Retain. Cannot be moved and has to remain on the existing physical servers. Agree on a retirement plan with the business users and figure out a way to manage these systems without tightly integrating them into the unified operations model.
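The lane questions above can be sketched as a small decision helper. The field names are hypothetical, and evaluating the questions in the order the lanes are listed, with “Retain” as the fallback, is an assumption made for illustration; a real assessment would weigh cost and risk rather than stop at the first “yes.”

```python
from dataclasses import dataclass


@dataclass
class Assessment:
    """Answers to the lane questions for one application (hypothetical fields)."""
    has_unused_modules: bool = False
    can_retire: bool = False
    can_rehost: bool = False
    can_replace: bool = False
    can_revise: bool = False
    can_refactor: bool = False
    should_rebuild: bool = False


def identify_r_lane(assessment):
    """Return (reduce_first, lane) for an application.

    Assumption: questions are checked in the order the lanes are listed,
    and "Retain" is used when no other lane applies.
    """
    checks = [
        ("Retire", assessment.can_retire),
        ("Rehost", assessment.can_rehost),
        ("Replace", assessment.can_replace),
        ("Revise", assessment.can_revise),
        ("Refactor", assessment.can_refactor),
        ("Rebuild", assessment.should_rebuild),
    ]
    for lane, applies in checks:
        if applies:
            return assessment.has_unused_modules, lane
    return assessment.has_unused_modules, "Retain"


if __name__ == "__main__":
    print(identify_r_lane(Assessment(has_unused_modules=True, can_refactor=True)))
    print(identify_r_lane(Assessment()))
```

The first element of the result records the “reduce” sanity check, so unused modules can be discarded before the application enters its lane.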
What happens on an R-lane? The R-lane is similar to a manufacturing production line, consisting of various industrialized and standardized work packages, and measurable checkpoints or milestones.
Placement targets. The placement targets can be based on two dimensions—the datacenter and the service model. The datacenter can be any of the cloud environments, private, public, hosted private, community, or a non-cloud dedicated physical server farm. The service model can be one of IaaS, PaaS, or SaaS.
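The two dimensions can be modeled as a pair of enumerations whose combinations form the candidate placement targets. This is only a sketch: the type names are hypothetical, and not every pairing it enumerates is necessarily meaningful in practice.

```python
from dataclasses import dataclass
from enum import Enum
from itertools import product


class Datacenter(Enum):
    PRIVATE = "private cloud"
    PUBLIC = "public cloud"
    HOSTED_PRIVATE = "hosted private cloud"
    COMMUNITY = "community cloud"
    DEDICATED_PHYSICAL = "non-cloud dedicated physical server farm"


class ServiceModel(Enum):
    IAAS = "IaaS"
    PAAS = "PaaS"
    SAAS = "SaaS"


@dataclass(frozen=True)
class PlacementTarget:
    datacenter: Datacenter
    service_model: ServiceModel


# Every datacenter/service-model pairing; filter out the ones that do not apply.
ALL_TARGETS = [PlacementTarget(d, s) for d, s in product(Datacenter, ServiceModel)]

if __name__ == "__main__":
    for target in ALL_TARGETS:
        print(target.datacenter.value, "/", target.service_model.value)
```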
People and Cultural Transformation
It is equally important to bring about a cultural change within the organization, both within the IT organization and among the end consumers of IT. The IT team should consist of unicorns with a mix of infra and app development expertise. They should make APIs a necessity for every system that they develop or implement. They should be willing to acquire new skills at a much faster rate, fail fast, and remain nimble. The whole notion of subject matter expert should transform to subject learning expert.
On the other hand, consumers need to self-serve based on business needs. They need to be held accountable for their IT usage, and they need to be willing to relinquish control of those systems that can be managed and operated by a third party at a lower cost. The consumers should rely more on “knowledge networks” within the organization for support and nurture the culture of contributing to the knowledge networks.
Summary
Clearing Technical Debt can be a mammoth exercise, but not an impossible one. With sufficient planning, this can be achieved over a period of time with little to no disruption to current business processes while allowing pursuit of new business opportunities.
Govindaraj Rangan is the head of data center and cloud innovation at Wipro Limited. He has 19 years of industry experience across the breadth of the technology spectrum from application development to IT operations, UX design to IT security controls, presales to implementation, converged systems to Internet of things, and strategy to hands-on. Before joining Wipro, Govindaraj spent 10-plus years at Microsoft as a technology strategist, working with some of the largest enterprise customers in India. He has also worked in the CIO/CTO organizations of Texas Instruments, Automatic Data Processing, D. E. Shaw & Co., and PCL Mindware. Govindaraj has an M.B.A. from ICFAI University specializing in Finance, an M.S. in Software Systems from BITS Pilani, and a B.E. (EEE) from Madras University. Professionally, Govindaraj is MCSE, CISSP, PMP, and ITIL Foundation certified.