Service Operations
6. Organizing for Service Operation
6.1 Functions
A function is a logical concept that refers to the people and automated measures that execute a defined process, an activity or a combination of processes or activities. In larger organizations a function may be broken up and performed by several departments, teams and groups, or it may be embodied within a single organizational unit.
Figure 6.1 Service Operation functions
The Service Operation functions given in Figure 6.1 are needed to manage the 'steady state' operational IT environment. These are logical functions and do not necessarily have to be performed by an equivalent organizational structure. This means that Technical and Application Management can be organized in any combination and into any number of departments. The second-level groupings in Figure 6.1 are examples of typical groups of activities performed by Technical Management (see Chapter 5) and are not a suggested organization structure.
The following is an overview of the Service Operation functions in Figure 6.1:
- The Service Desk is the primary point of contact for users when there is a service disruption, for service requests or even for some categories of Request for Change. The Service Desk provides a point of communication to the users and a point of coordination for several IT groups and processes. To enable them to perform these actions effectively the Service Desk is usually separate from the other Service Operation functions. In some cases, e.g. where detailed technical support is offered to users on the first call, it may be necessary for Technical or Application Management staff to be on the Service Desk. This does not mean that the Service Desk becomes part of the Technical Management function. In fact, while they are on the Service Desk, they cease to be a part of the Technical Management or Application Management functions and become part of the Service Desk, even if only temporarily.
- Technical Management provides detailed technical skills and resources needed to support the ongoing operation of the IT Infrastructure. Technical Management also plays an important role in the design, testing, release and improvement of IT services. In small organizations, it is possible to manage this expertise in a single department, but larger organizations are typically split into a number of technically specialized departments (see later in this chapter). In many organizations, the Technical Management departments are also responsible for the daily operation of a subset of the IT Infrastructure. Figure 6.1 shows that, although they are part of a Technical Management department, staff who perform these activities are logically part of the IT Operations Management function.
- IT Operations Management is the function responsible for the daily operational activities needed to manage the IT Infrastructure. This is done according to the Performance Standards defined during Service Design. In some organizations this is a single, centralized department, while in others some activities and staff are centralized and some are provided by distributed or specialized departments. This is illustrated in Figure 6.1 by the overlap with the Technical and Application Management functions. IT Operations Management has two functions that are unique and which are generally formal organizational structures. These are:
- IT Operations Control, which is generally staffed by shifts of operators and which ensures that routine operational tasks are carried out. IT Operations Control will also provide centralized monitoring and control activities, usually using an Operations Bridge or Network Operations Centre.
- Facilities Management refers to the management of the physical IT environment, usually Data Centres or computer rooms. In many organizations Technical and Application Management are co-located with IT Operations in large Data Centres. In some organizations many physical components of the IT Infrastructure have been outsourced and Facilities Management may include the management of the outsourcing contracts.
- Application Management is responsible for managing applications throughout their lifecycle. The Application Management function supports and maintains operational applications and also plays an important role in the design, testing and improvement of applications that form part of IT services. Application Management is usually divided into departments based on the application portfolio of the organization (see the examples in Figure 6.1), thus allowing easier specialization and more focused support. In many organizations Application Management departments have staff who perform daily operations for those applications. As with Technical Management, these staff logically form part of the IT Operations Management function.
Special note on Information Security Management
Although most would agree that Information Security Management is a function, it is highly specialized and spans several phases of the lifecycle. It is also responsible for the oversight of many activities within all Service Operation functions. For a more in-depth description of Information Security Management, please refer to the Service Design publication and to section 5.13 of this publication.
6.1.1 Functions And Activities
Chapter 5 of this publication introduced a number of common Service Operation activities. Due to the technical nature and specialization of these activities, the teams, groups or departments that perform them are often given names that correspond to the particular activities. For example, Network Management could be performed by a 'Network Management Department'. This, however, is by no means a rule. There are a number of options available in mapping activities to a team or department, for example:
- One activity could be performed by several teams or departments, e.g. if an organization has five major Application Support departments, each supporting a different set of applications, each of these departments could perform Database Administration for 'its' applications
- One department could perform several activities, e.g. the Network Management Department could be responsible for managing the network, Directory Services Management and Server Management
- An activity could be performed by groups, e.g. Security Administration can be performed by any person with responsibility for managing an application server, middleware or desktop.
These organizational decisions are influenced by a number of factors, such as:
- The size and location of the organization. Smaller, less distributed organizations will tend to combine these functions, whereas large, decentralized organizations may have several teams or departments performing the same activity (e.g. per region).
- The complexity of technology used in the organization. The higher the number of different technologies used, the more likely there are to be several different teams, each doing something similar, but in a different context (e.g. UNIX Server Management and Windows Server Management).
- The availability of skills. Where technical skills are scarce, it is common for organizations to use generalists to perform multiple groups of activities - although, in some cases, security considerations make this very difficult. For example, an organization working on classified or secret projects may have to hire expensive, specialized resources even when that means relocating them or contracting through security-cleared vendors.
- The culture of the organization. Some organizations prefer to work in highly specialized environments, while others tend to prefer the flexibility of generalist staff.
- The financial situation of the organization will determine how many people, with what type of skill, can be employed and how they will be organized.
As a result of these factors, it is impossible for this publication to prescribe an appropriate organizational structure that will fit every situation. However, the following sections list the required activities under the functional groups most likely to be involved in their operation. Please note that this does not mean that all organizations have to use these divisions.
Smaller organizations will tend to combine these activities into single departments, or even individuals - if they are even needed at all.
Special note on outsourcing
These organizational considerations are likely to be most relevant to internal IT organizations. The situation becomes even more complex when some or all of a particular activity or function is outsourced. Prime opportunities for outsourcing have been the Service Desk and Network Operations. This will be covered in more detail in ITIL Complementary Guidance, but some of the key points to remember are:
- Regardless of who is performing the activity, the company contracting the outsourcer is still responsible for ensuring that it is performed to a standard that will support the delivery of services to their customers and users.
- Outsourcing to solve an organization's problems or as an alternative to good Service Management processes rarely works. The best results are obtained if these are in place before outsourcing.
- Outsourcing works best when there is active involvement by both organizations. If the staff and managers of the customer organization disengage, the outsourcer is unlikely to be successful, simply because nobody understands the organization better than the people who work there.
- The outsourcer should not determine their outputs or how they are measured. These are determined by understanding the business requirements of users and customers and ensuring that they can be met by the outsourcer's capabilities.
- Although the outsourcer's services become an integral part of the organization, they are still a third-party organization, with a different set of business objectives, policies and practices. Security standards must be upheld and both parties must clearly understand their respective roles and contributions.
6.2 Service Desk
A Service Desk is a functional unit made up of a dedicated number of staff responsible for dealing with a variety of service events, often made via telephone calls, web interface, or automatically reported infrastructure events.
The Service Desk is a vitally important part of an organization's IT Department and should be the single point of contact for IT users on a day-by-day basis - and will handle all incidents and service requests, usually using specialist software tools to log and manage all such events.
The value of an effective Service Desk should not be underrated - a good Service Desk can often compensate for deficiencies elsewhere in the IT organization, but a poor Service Desk (or the lack of a Service Desk) can give a poor impression of an otherwise very effective IT organization!
It is therefore very important that the correct calibre of staff is used on the Service Desk and that IT Managers do their best to make the desk an attractive place to work to improve staff retention.
The exact nature, type, size and location of a Service Desk will vary, depending upon the type of business, number of users, geography, complexity of calls, scope of services and many other factors.
In alignment with customer and business requirements, the IT organization's senior managers should decide the exact nature of the Service Desk they require (and whether it should be internal or outsourced to a third party) as part of the overall ITSM strategy (see the Service Strategy publication). Subsequent planning must then be done to prepare for and implement the appropriate Service Desk function, either as a new function or, more likely these days, as necessary amendments to an existing function (see the Service Design and Service Transition publications).
6.2.1 Justification And Role Of The Service Desk
Very little justification is needed today for a Service Desk, as many organizations have become convinced that this is by far the best approach for dealing with first-line IT support issues. One only needs ask the question 'What is the alternative?' to make a compelling case for the Service Desk concept. Where further justification is needed, the following benefits should be considered:
- Improved customer service, perception and satisfaction
- Increased accessibility through a single point of contact, communication and information
- Better-quality and faster turnaround of customer or user requests
- Improved teamwork and communication
- Enhanced focus and a proactive approach to service provision
- A reduced negative business impact
- Better-managed infrastructure and control
- Improved usage of IT Support resources and increased productivity of business personnel
- More meaningful management information for decision support
It is common practice that the Service Desk provides 'entry-level' positions for ITSM staff. Working on the Service Desk is an excellent 'grounding' for anyone who wishes to pursue a career in Service Management. However, this could also present challenges with people who do not understand the business or technology. Users calling the Service Desk should be able to speak to someone who is able to address their needs, and Service Desk Analysts should not be burned out in less than a year because of undue stress. Care should be taken to select appropriately skilled individuals with a good understanding of the business and to provide adequate training - thus preventing reduction in levels of support due to a lack of knowledge at the first line.
6.2.2 Service Desk Objectives
The primary aim of the Service Desk is to restore the 'normal service' to the users as quickly as possible. In this context 'restoration of service' is meant in the widest possible sense. While this could involve fixing a technical fault, it could equally involve fulfilling a service request or answering a query - anything that is needed to allow the users to return to working satisfactorily.
Specific responsibilities will include:
- Logging all relevant incident/service request details, allocating categorization and prioritization codes
- Providing first-line investigation and diagnosis
- Resolving those incidents/service requests that they are able to resolve
- Escalating incidents/service requests that they cannot resolve within agreed timescales
- Keeping users informed of progress
- Closing all resolved incidents, requests and other calls
- Conducting customer/user satisfaction call backs/surveys as agreed
- Communication with users - keeping them informed of incident progress, notifying them of impending changes or agreed outages, etc.
- Updating the CMS under the direction and approval of Configuration Management if so agreed.
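The logging and escalation responsibilities listed above can be illustrated with a minimal sketch. The `Ticket` class, the priority codes and the target times in `TARGETS` are all hypothetical and chosen only for illustration; real values would come from the service level targets agreed with the business.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional

# Hypothetical priority codes mapped to agreed resolution timescales.
TARGETS = {
    1: timedelta(hours=1),
    2: timedelta(hours=4),
    3: timedelta(hours=8),
}


@dataclass
class Ticket:
    """One logged incident or service request."""
    summary: str
    category: str          # categorization code allocated at logging
    priority: int          # prioritization code allocated at logging
    logged_at: datetime
    resolved_at: Optional[datetime] = None
    escalated: bool = False

    def due_at(self) -> datetime:
        """Time by which the agreed timescale expires."""
        return self.logged_at + TARGETS[self.priority]


def needs_escalation(ticket: Ticket, now: datetime) -> bool:
    """True when an open, not yet escalated ticket has passed its timescale."""
    return (
        ticket.resolved_at is None
        and not ticket.escalated
        and now >= ticket.due_at()
    )
```

In this sketch a desk would run `needs_escalation` periodically over its open tickets; a real tool would usually escalate before the deadline rather than at it.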
6.2.3 Service Desk Organizational Structure
There are many ways of structuring Service Desks and locating them - and the correct solution will vary for different organizations. The primary options are detailed below, but in reality an organization may need to implement a structure that combines a number of these options in order to fully meet the business needs.
6.2.3.1 Local Service Desk
This is where a desk is co-located within or physically close to the user community it serves. This often aids communication and gives a clearly visible presence, which some users like, but can often be inefficient and expensive to resource as staff are tied up waiting to deal with incidents when the volume and arrival rate of calls may not justify this.
There may, however, be some valid reasons for maintaining a local desk, even where call volumes alone do not justify this.
Reasons might include:
- Language and cultural or political differences
- Different time zones
- Specialized groups of users
- The existence of customized or specialized services that require specialist knowledge
- VIP/criticality status of users.
Figure 6.2 Local Service Desk
6.2.3.2 Centralized Service Desk
It is possible to reduce the number of Service Desks by merging them into a single location (or into a smaller number of locations) by drawing the staff into one or more centralized Service Desk structures. This can be more efficient and cost-effective, allowing fewer overall staff to deal with a higher volume of calls, and can also lead to higher skill levels through greater familiarization gained from more frequent occurrence of events. It might still be necessary to maintain some form of 'local presence' to handle physical support requirements, but such staff can be controlled and deployed from the central desk.
Figure 6.3 Centralized Service Desk
6.2.3.3 Virtual Service Desk
Through the use of technology, particularly the Internet, and the use of corporate support tools, it is possible to give the impression of a single, centralized Service Desk when in fact the personnel may be spread or located in any number or type of geographical or structural locations. This brings in the option of 'home working', secondary support group, off-shoring or outsourcing - or any combination necessary to meet user demand. It is important to note, however, that safeguards are needed in all of these circumstances to ensure consistency and uniformity in service quality and cultural terms.
Figure 6.4 Virtualized Service Desk
6.2.3.4 Follow the Sun
Some global or international organizations may wish to combine two or more of their geographically dispersed Service Desks to provide a 24-hour follow-the-sun service. For example, a Service Desk in Asia-Pacific may handle calls during its standard office hours and at the end of this period it may hand over responsibility for any open incidents to a European-based desk. That desk will handle these calls alongside its own incidents during its standard day and then hand over to a USA-based desk - which finally hands back responsibility to the Asia-Pacific desk to complete the cycle.
This can give 24-hour coverage at relatively low cost, as no desk has to work more than a single shift. However, the same safeguards of common processes, tools, shared database of information and culture must be addressed for this approach to proceed - and well-controlled escalation and hand-over processes are needed.
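The hand-over cycle described above can be sketched as a simple schedule lookup. The UTC hour boundaries in `SCHEDULE` are hypothetical; the text gives only the order of the desks, not their hours.

```python
from datetime import datetime, timezone

# Hypothetical hand-over schedule in UTC: (start_hour, end_hour, desk).
SCHEDULE = [
    (0, 8, "Asia-Pacific"),
    (8, 16, "Europe"),
    (16, 24, "USA"),
]


def active_desk(moment: datetime) -> str:
    """Return the desk responsible at a timezone-aware moment."""
    hour = moment.astimezone(timezone.utc).hour
    for start, end, desk in SCHEDULE:
        if start <= hour < end:
            return desk
    raise ValueError(f"schedule does not cover hour {hour}")


def next_desk(desk: str) -> str:
    """Return the desk that takes over open incidents from the given desk."""
    names = [name for _, _, name in SCHEDULE]
    return names[(names.index(desk) + 1) % len(names)]
```

The lookup only decides who is responsible. The safeguards named in the text (common processes, tools and a shared database) are what make the hand-over itself workable.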
6.2.3.5 Specialized Service Desk Groups
For some organizations it might be beneficial to create 'specialist groups' within the overall Service Desk structure, so that incidents relating to a particular IT service can be routed directly (normally via telephony selection or a web-based interface) to the specialist group. This can allow faster resolution of these incidents, through greater familiarity and specialist training.
The selection would be made using a script along the lines of 'If your call is about the X Service, please press 1 now, otherwise please hold for a Service Desk analyst'.
Care is needed not to over-complicate the selection, so specialist groups should only be considered for a very small number of key services where these exist, and where call rates about that service justify a separate specialist group.
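A telephony selection of this kind amounts to a small lookup with a default. The sketch below assumes a hypothetical single-entry mapping; the group and queue names are illustrative only.

```python
from typing import Optional

# Hypothetical mapping of keypad selection to specialist group.
# Kept deliberately small, in line with the caution above.
SPECIALIST_GROUPS = {"1": "X Service specialist group"}
DEFAULT_QUEUE = "Service Desk analyst"


def route_call(key_pressed: Optional[str]) -> str:
    """Route to a specialist group if one was selected, else to the main desk."""
    if key_pressed is None:
        return DEFAULT_QUEUE
    return SPECIALIST_GROUPS.get(key_pressed, DEFAULT_QUEUE)
```

Callers who press nothing, or press an unrecognized key, fall through to a Service Desk analyst, matching the 'otherwise please hold' branch of the script.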
6.2.3.6 Environment
The environment where the Service Desk is to be located should be carefully chosen. Where possible, the following facilities should be provided:
- A location where the entire function can be positioned with sufficient natural light and overall space - to allow adequate desk and storage-space, and room to move around if necessary
- A quiet environment with adequate acoustic control so that one telephone conversation is not disrupted by another
- Pleasant surroundings and comfortable furniture so as to lighten the mood (the Service Desk can be a very stressful place to work, so every little helps!)
- A separate rest-room and refreshment area nearby so that staff can take short breaks as appropriate when necessary without being away for too long.
Anecdote
One company found that there was a 'them and us' culture existing between the Service Desk and the other support teams. The third-line teams often believed themselves to be better than the Service Desk. Hiding the Service Desk away in an isolated room helped to reinforce this culture. The company found that creating an open-plan office with the Service Desk in the middle encouraged closer working and helped to break down these barriers.
6.2.3.7 Building A Single Point Of Contact
Regardless of the combination of options chosen to fulfill an organization's overall Service Desk structure, individual users should be in no doubt about who to contact if they need assistance. A single telephone number (or a single number for each group if separate desks are chosen) should be provided and well publicized - as well as a single e-mail address and a single web Service Desk contact page.
Ideas that can be used to publicize the Service Desk telephone number and e-mail address, and to keep them close to hand when users are likely to need them, include:
- Including the Service Desk telephone number on hardware CI labels, attached to the components the user is likely to be calling about
- Printing Service Desk contact details on telephones
- For PCs and laptops, using a customized background or desktop with the Service Desk contact details, together with information read from the system that will be needed when calling (such as IP address, OS build number, etc.) in one corner
- Printing the Service Desk number on 'freebies' (pens, pencils, mugs, mouse-mats, etc.)
- Prominently placing these details on Service Desk Internet/intranet sites
- Including them on any calling cards or satisfaction survey cards left with users when a desk visit has been necessary
- Repeating the details on all correspondence sent to the users (together with call reference numbers)
- Placing the details on notice boards or physical locations that users are likely to regularly visit (entrances, canteens, refreshment areas, etc.).
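The customized-desktop idea in the list above relies on reading a few details from the system. The following is a minimal sketch using only the Python standard library; the function names and the banner layout are assumptions, and rendering the text onto an actual desktop background is left out.

```python
import platform
import socket


def support_details() -> dict:
    """Collect details a user may be asked for when calling the desk."""
    hostname = socket.gethostname()
    try:
        # May return a loopback address on some configurations.
        ip_address = socket.gethostbyname(hostname)
    except OSError:
        ip_address = "unknown"
    return {
        "hostname": hostname,
        "ip_address": ip_address,
        "os": platform.system(),
        "os_version": platform.version(),
    }


def format_banner(details: dict, contact: str) -> str:
    """Build the text block to show in one corner of the desktop."""
    lines = [f"Service Desk: {contact}"]
    lines += [f"{key}: {value}" for key, value in details.items()]
    return "\n".join(lines)
```

For example, `format_banner(support_details(), "ext. 1234")` yields a short block of text, where "ext. 1234" stands in for the real contact details.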
6.2.4 Service Desk Staffing
The issues involved in, and criteria for, establishing the appropriate staffing model and levels are discussed in this section. Details about typical Service Desk roles and responsibilities can be found in paragraph 6.6.1 below. They include the Service Desk Manager, Supervisor and Analysts; in some organizations these roles are complemented by business users ('Super Users') who provide first-line support.
6.2.4.1 Staffing Levels
An organization must ensure that the correct number of staff are available at any given time to match the demand being placed upon the desk by the business. Call rates can be very volatile and often in the same day the arrival rate may go from very high to very low and back again. An organization planning a new desk should attempt to predict the call arrival rate and profile - and to staff accordingly. Statistical analysis of call arrival rates under current support arrangements must be undertaken and then closely monitored and adjusted as necessary.
Many organizations will find that call rates peak during the start of the office day and then fall off quickly, perhaps with another burst in the early part of the afternoon - this obviously varies depending upon the organization's business but is a frequently occurring pattern. In such circumstances it may be possible to utilize part-time staff, home-workers, second-line support staff or third parties to cover the peaks.
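A first step in analysing call arrival rates is to build an hourly profile from logged call times. The sketch below assumes the call timestamps are available as `datetime` objects; the function names and the notion of a 'peak' as a fraction of the busiest hour are illustrative choices, not part of the guidance.

```python
from collections import Counter
from datetime import datetime
from typing import Dict, Iterable, List


def hourly_profile(call_times: Iterable[datetime]) -> Dict[int, int]:
    """Count calls per hour of day (0-23), including empty hours."""
    counts = Counter(t.hour for t in call_times)
    return {hour: counts.get(hour, 0) for hour in range(24)}


def peak_hours(profile: Dict[int, int], fraction: float) -> List[int]:
    """Hours whose volume is at least `fraction` of the busiest hour."""
    busiest = max(profile.values())
    if busiest == 0:
        return []
    return [hour for hour, n in profile.items() if n >= fraction * busiest]
```

Hours returned by `peak_hours` are the candidates for cover by part-time staff or other supplementary resources.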
The following factors should be considered when deciding staffing levels:
- Customer service expectations
- Business requirements, such as budget, call response times, etc.
- Size, relative age, design and complexity of the IT Infrastructure and Service Catalogue - for example, the number and type of incidents, the extent of customized versus standard off-the-shelf software deployed, etc.
- The number of customers and users to support, and associated factors such as:
- Number of customers and users speaking a different language
- Skill level
- Incident and Service Request types (and types of RFC if appropriate):
- Duration of time required for call types (e.g. simple queries, specialist application queries, hardware, etc.)
- Local or external expertise required
- The volume and types of incidents and Service Requests
- The period of support cover required, based on:
- Hours covered
- Out-of-hours support requirements
- Time zones to be covered
- Locations to be supported (particularly if Service Desk staff also conduct desk-side support)
- Travel time between locations
- Workload pattern of requests (e.g. daily, month-end, etc.)
- The service level targets in place (response levels)
- The type of response required:
- Telephone
- E-mail/fax/voicemail/video
- Physical attendance
- Online access/control
- The level of training required
- The support technologies available (e.g. phone systems, remote support tools, etc.)
- The existing skill levels of staff
- The processes and procedures in use.
All these items should be carefully considered before making any decision on staffing levels. This should also be reflected in the levels of documentation required. Remember that the better the service, the more the business will use it.
A number of tools are available to help determine the appropriate number of staff for the Service Desk. These workload modelling tools are dependent on detailed 'local knowledge' of the organization such as call volumes and patterns, service and user profiles, etc.
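The text does not name a particular modelling technique. One widely used approach in call-centre planning is the Erlang C formula, sketched below as an illustration only. It assumes random (Poisson) call arrivals, exponentially distributed handling times and callers who wait indefinitely, so its results are a starting estimate rather than a staffing decision. All function and parameter names are hypothetical.

```python
import math


def erlang_c(agents: int, load: float) -> float:
    """Probability that a call has to wait, for a load given in Erlangs."""
    if agents <= load:
        return 1.0  # queue is unstable: every call waits
    top = (load ** agents / math.factorial(agents)) * (agents / (agents - load))
    bottom = sum(load ** k / math.factorial(k) for k in range(agents)) + top
    return top / bottom


def service_level(agents: int, calls_per_hour: float,
                  handle_secs: float, target_secs: float) -> float:
    """Fraction of calls answered within target_secs."""
    load = calls_per_hour * handle_secs / 3600.0
    if agents <= load:
        return 0.0
    p_wait = erlang_c(agents, load)
    return 1.0 - p_wait * math.exp(-(agents - load) * target_secs / handle_secs)


def agents_needed(calls_per_hour: float, handle_secs: float,
                  target_secs: float, target_level: float) -> int:
    """Smallest number of agents that meets the target service level."""
    if not 0.0 < target_level < 1.0:
        raise ValueError("target_level must be between 0 and 1 (exclusive)")
    load = calls_per_hour * handle_secs / 3600.0
    agents = max(1, math.floor(load) + 1)
    while service_level(agents, calls_per_hour, handle_secs, target_secs) < target_level:
        agents += 1
    return agents
```

A call such as `agents_needed(100, 300, 20, 0.8)` asks how many analysts are needed for 100 calls per hour at five minutes each, with 80% answered within 20 seconds. The inputs still depend on the 'local knowledge' described above.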
6.2.4.2 Skill Levels
An organization must decide on the level and range of skills it requires of its Service Desk staff - and then ensure that these skills are available at the appropriate times. A range of skill options are possible, starting from a 'call logging' service only - where staff need only very basic technical skills - right through to a 'technical' Service Desk where the organization's most technically skilled staff are used. In the case of the former, there will be a high handling but low resolution rate, while in the latter case this will be reversed.
The decision on the required skills level will often be driven by target resolution times (agreed with the business and captured in service level targets), the complexity of the systems supported and 'what the business is prepared to pay'.
There is a strong correlation between response and resolution targets and costs - generally speaking, the shorter the target times, the higher the cost because more resources are required.
While there may be instances when business dependency or criticality make a highly technically skilled desk an imperative, the optimum and most cost-effective approach is generally to have a 'call-logging' first line of support via the Service Desk, with quick and effective escalations to more skilled second-line and third-line resolution groups where skilled staff can be concentrated and more effectively utilized (see Incident Management, section 4.2, for more details and guidance on end-to-end support structures). However, this basic starting point can be improved over time by providing the first-line staff with an effective knowledge-base, diagnostic scripts and integrated support tools (including a CMS), as well as ongoing training and awareness, so that first-line resolution rates can gradually be increased.
Note that first-line resolution rates can be reduced by effective Problem Management, which will reduce a number of the simpler, repetitive incidents. In such cases, although the resolution rates appear to be going down, the overall service quality will have improved by the complete removal of many incidents. While this is good, if Service Desk staff are paid incentives or bonuses for first-call resolution, it could prove disastrous for morale and process effectiveness unless the bonus threshold is reviewed.
This can also be achieved by locating second-level staff on the Service Desk, effectively creating a two-tier structure. This has advantages of making second-level staff available to help deal with peak call periods and to train more junior personnel, and it will often increase the first-call resolution rate. However, second-line staff often have duties outside of the Service Desk - resulting in rosters having to be managed or second-line staff positions being duplicated. In addition, having to deal with routine calls may be demotivating for more experienced staff. A further potential drawback is that the Service Desk becomes really good at resolving calls, whereas second-line staff should be focused on removing the root cause instead.
Another factor to consider when deciding on the skills requirements for Service Desk staff is the level of customization or specialization of the supported services. Standardized services require less specific knowledge to provide quality customer support. The more specialized the service, the more likely specialist knowledge will be required on the first call.
Improvements in resolution times/rates should not be left to chance, but should instead be part of an ongoing Service Improvement Plan (see the Continual Service Improvement publication for fuller details).
Once the required skill levels have been identified, there is an ongoing task to ensure that the Service Desk is operated in such a way that the necessary staff obtain and maintain the necessary skills - and that staff with the correct balance of skills are on duty at appropriate times so that consistency is maintained.
This will involve an ongoing training and awareness programme which should cover:
- Interpersonal skills, such as telephony skills, communication skills, active listening and customer care training
- Business awareness: specific knowledge of the organization's business areas, drivers, structure, priorities, etc.
- Service awareness of all the organization's key IT services for which support is being provided
- Technical awareness (and deeper technical training to the appropriate level, depending upon the resolution rate sought)
- Depending on level of support provided, some diagnosis skills (e.g. Kepner and Tregoe)
- Support tools and techniques
- Awareness training and tutorials in new systems and technologies, prior to their introduction
- Processes and procedures (most particularly Incident, Change and Configuration Management - but an overview of all ITSM processes and procedures)
- Typing skills to ensure quick and accurate entry of incident or Service Request details.
For such a programme to be effective, skill requirements and levels should be evaluated periodically and training records maintained. Careful formulation of staffing rotations or schedules should be maintained so that a consistent balance of staff experience and appropriate skill levels is present during all critical operational periods. It is not sufficient to have only the right number of staff on duty - the correct blend of skills should also be available.
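Checking a roster for the right blend of skills, and not only headcount, can be sketched as follows. The data shapes are assumptions made for illustration: a roster maps each shift to the skill sets of the staff on duty, and the requirements map each shift to a minimum number of staff per skill.

```python
from typing import Dict, List, Set


def coverage_gaps(
    roster: Dict[str, List[Set[str]]],
    required: Dict[str, Dict[str, int]],
) -> Dict[str, Dict[str, int]]:
    """Return, per shift, how many staff are missing for each required skill."""
    gaps: Dict[str, Dict[str, int]] = {}
    for shift, needed in required.items():
        on_duty = roster.get(shift, [])
        missing = {}
        for skill, minimum in needed.items():
            have = sum(1 for skills in on_duty if skill in skills)
            if have < minimum:
                missing[skill] = minimum - have
        if missing:
            gaps[shift] = missing
    return gaps
```

An empty result means every shift meets its stated minimums; it says nothing about experience levels, which would need their own measure.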
6.2.4.3 Training
It is vital that all Service Desk staff are adequately trained before they are called upon to staff the Service Desk. A formal induction programme should be undertaken by all new staff, the exact content of which will vary depending upon the existing skill levels and experience of the new recruit, but is likely to include many of the required skills as described above.
Where possible, a business awareness programme, including short periods of secondment into key business areas, should be provided for new staff who do not already have this level of business awareness.
Note: Investment should also be made in the professional development of Service Desk staff. Internal mentoring and shadowing of second- and third-level support staff is a good start, but best-of-breed Service Desks benefit from a formalized programme of staff development. Organizational commitment to professional development helps instil a sense of accomplishment and opportunity in staff. This often leads to innovation in Service Desk operation (such as specialized services), which in turn drives operational efficiencies at all tiers of support. It helps staff to build skills that can be used in their current role and jump-starts their training for a new role. While it is important to develop their core competencies in their current role, having a clear career path and recognizing future requirements and development needs is also important.
When starting on the Service Desk, new staff should initially 'shadow' experienced staff - sit with them and listen in on calls - before starting to take calls themselves with a mentor listening in and able to intervene and provide support where necessary. The mentor should initially review each call with the trainee after it concludes to learn any lessons. The frequency of such reviews should be gradually reduced as experience and confidence grow, but the mentor should still be available to provide ongoing support even when the trainee has reached the stage of going solo.
Mentors may need to be trained on how to mentor. Service Desk experience and technical skills are not the only requirements for mentoring. Effective knowledge transfer skills and the ability to teach without being condescending or threatening are equally important.
A programme will be necessary to keep Service Desk staff's knowledge up to date - and to make them aware of new developments, services and technologies. The timing of such events is critical so as not to impact upon the normal duties. Many Service Desks find that it is best to organize short 'tutorials' during quiet periods when staff are less likely to be needed for call handling.
6.2.4.4 Staff Retention
It is very important that all IT Managers recognize the importance of the Service Desk and the staff who work on it, and give this special attention. Any significant loss of staff can be disruptive and lead to inconsistency of service - so efforts should be made to make the Service Desk an attractive place to work.
Ways in which this can be done include proper recognition of the role, with reward packages that reflect it, team-building exercises and staff rotation onto other activities (projects, second-line support, etc.).
The Service Desk can often be used as a stepping stone into other more technical or supervisory/managerial roles. If this is done, care is needed to ensure that proper succession planning takes place so that the desk does not lose all of its key expertise in any area at one time. Also, good documentation and cross-training can mitigate this risk.
6.2.4.5 Super Users
Many organizations find it useful to appoint or designate a number of 'Super Users' throughout the user community, to act as liaison points with IT in general and the Service Desk in particular.
Super Users can be given some additional training and awareness and used as a conduit for communications flow in both directions. They can be asked to filter requests and issues raised by the user community (in some cases even going as far as to have incidents or requests raised by the Super User) - this can help prevent 'incident storms' when a key service or component fails, affecting many users.
They can also be used to cascade information from the Service Desk outwards throughout their local user community, which can be very useful in disseminating service details to all users very quickly.
It is important to note that Super Users should log all calls that they deal with, and not just those that they pass on to IT. This will mean access to, and training on how to use, the Incident logging tools. This will help to measure the activity of the Super User and also to ensure that their position is not abused. In addition, it will ensure that valuable history regarding incidents and service quality is not lost.
It may also be possible for Super Users to be involved in:
- Staff training for users in their area
- Providing support for minor incidents or simple request fulfilment
- Involvement with new releases and rollouts.
Super Users do not necessarily provide support for the whole of IT. In many cases a Super User will only provide support for a specific application, module or business unit area. As a business user the Super User often has in-depth knowledge of how key business processes run and how services work in practice. This is very useful knowledge to share with the Service Desk, so that it can provide higher quality services in future.
It should be noted that a firm commitment is needed from potential Super Users, and specifically their management, that they will have the time and interest to perform this role before selection and training commences.
A Super User, while a valuable interface to the business and the Service Desk, must be given proper training, accountability and expectations. Super Users can be vulnerable to misuse if their role, responsibilities and the process governing these are not clearly communicated to the users. It is imperative that a Super User is not seen as a replacement for, or a means to circumvent, the Service Desk.
6.2.5 Service Desk Metrics
Metrics should be established so that performance of the Service Desk can be evaluated at regular intervals. This is important to assess the health, maturity, efficiency, effectiveness and any opportunities to improve Service Desk operations.
Metrics for Service Desk performance must be realistic and carefully chosen. It is common to select those metrics that are easily available and that may seem to be a possible indication of performance; however, this can be misleading. For example, the total number of calls received by the Service Desk is not in itself an indication of either good or bad performance and may in fact be caused by events completely outside the control of the Service Desk - for example a particularly busy period for the organization, or the release of a new version of a major corporate system.
An increase in the number of calls to the Service Desk can indicate less reliable services over that period of time - but may also indicate increased user confidence in a Service Desk that is maturing, resulting in a higher likelihood that users will seek assistance rather than try to cope alone. For this type of metric to support either conclusion, it must be compared with previous periods, taking into account any Service Desk improvements implemented since the last measurement baseline, as well as service reliability changes, problems, etc., in order to isolate the true cause of the increase.
Further analysis and more detailed metrics are therefore needed and must be examined over a period of time. These will include the call-handling statistics previously mentioned under telephony, and additionally:
- The first-line resolution rate: the percentage of calls resolved at first line, without the need for escalation to other support groups. This is the figure often quoted by organizations as the primary measure of the Service Desk's performance - and used for comparison purposes with the performance of other desks - but care is needed when making any comparisons. For greater accuracy and more valid comparisons this can be broken down further as follows:
- The percentage of calls resolved during the first contact with the Service Desk, i.e. while the user is still on the telephone to report the call
- The percentage of calls resolved by the Service Desk staff themselves without having to seek deeper support from other groups. Note: some desks will choose to co-locate or embed more technically skilled second-line staff with the Service Desk (see Incident Management for further details). In such cases it is important when making comparisons to also separate out (i) the percentage resolved by the Service Desk staff alone; and (ii) the percentage resolved by the first-line Service Desk staff and second-line support staff combined.
- Average time to resolve an incident (when resolved at first line)
- Average time to escalate an incident (where first-line resolution is not possible)
- Average Service Desk cost of handling an incident.
Two metrics should be considered here:
- Total cost of the Service Desk divided by the number of calls. This will provide an average figure which is useful as an index and for planning purposes but does not accurately represent the relative costs of different types of calls
- By taking the total call duration time on the desk and working out a cost per minute (total costs for the period divided by total call duration minutes), the cost of individual calls can be calculated, giving a more accurate figure.
By evaluating the types of incidents with call duration, a more refined picture of 'cost per call' by types arises and gives an indication of which incident types tend to cost more to resolve and possible targets for improvements.
- Percentage of customer or user updates conducted within target times, as defined in SLA targets
- Average time to review and close a resolved call
- The number of calls broken down by time of day and day of week, combined with the average call-time metric, is critical in determining the number of staff required.
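The breakdown of the first-line resolution rate described above can be sketched in a few lines of Python. This is only an illustration: the `Call` record, its field names and the sample data are hypothetical and are not taken from any particular Service Desk tool.

```python
from dataclasses import dataclass


@dataclass
class Call:
    """One closed call; all field names here are hypothetical."""
    resolved_on_first_contact: bool  # resolved while the user was still on the telephone
    resolved_by_desk_alone: bool     # resolved by first-line Service Desk staff only
    resolved_with_second_line: bool  # resolved with help from co-located second-line staff


def percentage(part: int, whole: int) -> float:
    """Return part/whole as a percentage, or 0.0 if there were no calls."""
    return 100.0 * part / whole if whole else 0.0


def resolution_rates(calls: list[Call]) -> dict[str, float]:
    """Break the first-line resolution rate down into three separate figures."""
    total = len(calls)
    first_contact = sum(c.resolved_on_first_contact for c in calls)
    desk_alone = sum(c.resolved_by_desk_alone for c in calls)
    # Combined figure: resolved by the desk alone or together with second-line staff.
    combined = sum(c.resolved_by_desk_alone or c.resolved_with_second_line for c in calls)
    return {
        "first_contact_pct": percentage(first_contact, total),
        "desk_alone_pct": percentage(desk_alone, total),
        "desk_plus_second_line_pct": percentage(combined, total),
    }


# Hypothetical sample: four calls, one of which had to be escalated further.
sample_calls = [
    Call(True, True, False),
    Call(False, True, False),
    Call(False, False, True),
    Call(False, False, False),
]
rates = resolution_rates(sample_calls)
print(rates)
```

Keeping the 'desk alone' and 'desk plus second line' figures separate is what allows a like-for-like comparison between a desk with embedded second-line staff and one without.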
Further general details on metrics and how they should be used to drive forward service quality are included in the Continual Service Improvement publication.
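As an illustration of the two cost metrics described in this section, the following Python sketch computes the simple average cost per call and a duration-based cost per incident type. The function names, incident types and figures are hypothetical, and the sketch assumes that every call has a recorded duration in minutes.

```python
from collections import defaultdict


def average_cost_per_call(total_cost: float, call_count: int) -> float:
    """Total cost of the Service Desk divided by the number of calls."""
    if call_count <= 0:
        raise ValueError("call_count must be positive")
    return total_cost / call_count


def cost_per_minute(total_cost: float, total_call_minutes: float) -> float:
    """Total costs for the period divided by total call duration minutes."""
    if total_call_minutes <= 0:
        raise ValueError("total_call_minutes must be positive")
    return total_cost / total_call_minutes


def average_cost_by_type(total_cost: float,
                         calls: list[tuple[str, float]]) -> dict[str, float]:
    """Average cost per call for each incident type, weighted by call duration.

    `calls` is a list of (incident_type, duration_in_minutes) pairs.
    """
    rate = cost_per_minute(total_cost, sum(minutes for _, minutes in calls))
    minutes_by_type: dict[str, list[float]] = defaultdict(list)
    for incident_type, minutes in calls:
        minutes_by_type[incident_type].append(minutes)
    return {
        incident_type: rate * sum(durations) / len(durations)
        for incident_type, durations in minutes_by_type.items()
    }


# Hypothetical period: total cost of 1200 spread over four calls (60 minutes in all).
period_cost = 1200.0
period_calls = [
    ("password reset", 5.0),
    ("password reset", 5.0),
    ("printer fault", 20.0),
    ("application error", 30.0),
]
flat_average = average_cost_per_call(period_cost, len(period_calls))
by_type = average_cost_by_type(period_cost, period_calls)
print(flat_average, by_type)
```

In this made-up data the flat average hides a sixfold difference between the cheapest and the most expensive incident type, which is the kind of distinction the per-minute approach is meant to expose.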
6.2.5.1 Customer/User Satisfaction Surveys
As well as tracking the 'hard' measures of the Service Desk's performance (via the metrics described above), it is also important to assess 'soft' measures - such as how well the customers and users feel their calls have been answered, whether they feel the Service Desk operator was courteous and professional, and whether the operator instilled confidence in the user.
This type of measure is best obtained from the users themselves. This can be done as part of a wider customer/user satisfaction survey covering all of IT or can be specifically targeted at Service Desk issues alone.
One effective way of achieving the latter is through a callback telephone survey, where an independent Service Desk Operator or Supervisor rings back a small percentage of users shortly after their incident has been resolved, to ask the specific questions needed.
Care should be taken to keep the number of questions to a minimum (five to six at the most) so that the users will have the time to cooperate. Also survey questions should be designed so that the user or customer knows what area or subject questions are about and which incident or service they are referring to. The Service Desk must act on low satisfaction levels and any feedback received.
To allow adequate comparisons, the same percentage of calls should be selected in each period, and the surveys should be rigorously carried out despite any other time pressures.
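One way to keep the selection consistent from period to period is to draw the callback sample programmatically. The sketch below is a minimal illustration under stated assumptions: random selection, rounding the sample size up, and the `INC` identifier format are all choices made for the example and are not prescribed by this publication.

```python
import math
import random
from typing import Optional


def select_callback_sample(incident_ids: list[str],
                           percentage: float,
                           seed: Optional[int] = None) -> list[str]:
    """Pick a fixed percentage of resolved incidents for a callback survey.

    The sample size is rounded up so that a small period still yields
    at least one callback (an assumption of this sketch).
    """
    if not 0 < percentage <= 100:
        raise ValueError("percentage must be in the range (0, 100]")
    sample_size = math.ceil(len(incident_ids) * percentage / 100)
    rng = random.Random(seed)  # a seed makes the selection reproducible for audit
    return sorted(rng.sample(list(incident_ids), sample_size))


# Hypothetical period with 200 resolved incidents and a 5% callback rate.
resolved = [f"INC{n:04d}" for n in range(1, 201)]
callbacks = select_callback_sample(resolved, percentage=5, seed=1)
print(len(callbacks), callbacks)
```

Using the same `percentage` value in every period keeps the survey results comparable, while a random draw avoids the callbacks drifting towards calls that are convenient to follow up.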
Surveys are a complex and specialized area, requiring a good understanding of statistics and survey techniques. This publication will not attempt to provide an overview of all of these, but a summary of some of the more widely used techniques and tools is listed in Table 6.1.
Technique/Tool | Advantages | Disadvantages
After-call survey: callers are asked to remain on the phone after the call and then asked to rate the service they were provided | High response rate since the caller is already on the phone; caller is surveyed immediately after the call so their experience is recent | People may feel pressured into taking the survey, resulting in a negative service experience; the surveyor is seen as part of the Service Desk being surveyed, which may discourage open answers
Outbound telephone survey: customers and users who have previously used the Service Desk are contacted some time after their experience with the Service Desk | Higher response rate since the caller is interviewed directly; specific categories of user or customer can be targeted for feedback (e.g. people who requested a specific service, or people who experienced a disruption to a particular service) | This method could be seen as intrusive, if the call disrupts the user or customer from their work; the survey is conducted some time after the user or customer used the Service Desk, so their perception may have changed
Personal interviews: customers and users are interviewed personally by the person doing the survey. This is especially effective for customers or users who use the Service Desk extensively or who have had a very negative experience | The interviewer is able to observe non-verbal signals as well as listening to what the user or customer is saying; users and customers feel a greater degree of personal attention and a sense that their answers are being taken seriously | Interviews are time-consuming for both the interviewer and the respondent; users and customers could turn the interviews into complaint sessions
Group interviews: customers and users are interviewed in small groups. This is good for gathering general impressions and for determining whether there is a need to change certain aspects of the Service Desk, e.g. service hours or location | A larger number of users and customers can be interviewed; questions are more generic and therefore more consistent between interviews | People may not express themselves freely in front of their peers or managers; people's opinions can easily be changed by others in the group during the interview
Postal/e-mail surveys: survey questionnaires are mailed to a target set of customers and users. They are asked to return their responses by e-mail | Specific or all customers or users can be targeted; postal surveys can be anonymous, allowing people to express themselves more freely; e-mail surveys are not anonymous, but can be created using automated forms that make it convenient and easy for the user to reply and increase the likelihood it will be completed | Postal surveys are labour intensive to process; the percentage of people responding to postal surveys tends to be small; misinterpretation of a question could affect the result
Online surveys: questionnaires are posted on a website and users and customers encouraged via e-mail or links from a popular site to participate in the survey | The potential audience of these surveys is fairly large; respondents can complete the questionnaire in their own time; the links on popular websites are good reminders without being intrusive | The percentage of respondents cannot be predicted
Table 6.1 Survey techniques and tools
6.2.6 Outsourcing the Service Desk
The decision to outsource is a strategic issue for senior managers - and is addressed in detail in the Service Strategy and Service Design publications. Many of the guidelines in this section are not unique to the Service Desk and can be applied to any function, support area or service being outsourced (or out-tasked).
Regardless of the reasons for, or the extent of, the outsourcing contract, it is vital that the organization retains responsibility for the activities and services provided by the Service Desk. The organization is ultimately responsible for the outcomes of the decision and must therefore determine what service the outsourcer provides, not the other way round.
If the outsourcing route is chosen, there are some safeguards that are needed to ensure that the outsourced Service Desk works effectively and efficiently with the organization's other IT teams and departments and that end-to-end Service Management control is maintained (this is particularly important for organizations seeking ISO/IEC 20000 certification as overall management control has to be demonstrated). Some of these safeguards are set out below.
6.2.6.1 Common Tools And Processes
The Service Desk does not have responsibility for all the processes and procedures that it initiates. For example, a Service Request is received by the Service Desk but the request is fulfilled by the internal IT Operational team.
If the Service Desk is outsourced, care must be taken that the tools are consistent with those still being used in the customer organization. Outsourcing is often seen as an opportunity to replace outdated or inadequate tools, only to find that there are severe integration problems between the new tool and the legacy tools and processes.
For this reason it is important to ensure that these issues are properly researched and the customer's requirements are adequately scoped and specified before the outsourcing contract. Service Desk tools must not only support the outsourced Service Desk, but they must support the customer organization's processes and business requirements as well.
Ideally the outsourced desk should use the same tools and processes (or, as a minimum, interfacing tools and processes) to allow smooth process flow between the Service Desk and second- and third-line support groups.
In addition, the outsourced Service Desk should have access to:
- All incident records and information
- Problem Records and information
- Known Error Data
- Change Schedule
- Sources of internal knowledge (especially technical or application experts)
- SKMS
- CMS
- Alerts from monitoring tools.
It is often a challenge integrating processes and tools in a less mature organization with those in a more mature organization. A common but incorrect assumption is that the maturity of the one organization will somehow result in higher maturity in the other. Active involvement to ensure alignment of processes and tools is essential to a smooth transition and ongoing management of services between the internal and external organizations. In fact, if this is not directly addressed, it could result in the failure of the contract.
It is also often incorrectly assumed that the proof of Service Management quality and maturity in an external outsource partner can be guaranteed by stating requirements in the procurement process for 'ITIL conformance' and / or 'ISO/IEC 20000 certification'. These statements may indicate that a potential supplier uses the ITIL Framework in its delivery of services to customers, or that they have achieved standards certification for their internal practices, but it is equally important to have the enabling technology in place and being used that demonstrates a service provider's capability to manage services and interface to internal practices harmoniously. There is no standard of compliance that ensures this and so procurement efforts should include specific queries to satisfy this requirement. More information on outsource provider acquisition can be found in the Service Design publication.
6.2.6.2 SLA Targets
The SLA targets for overall incident-handling and resolution times need to be agreed with the customers and between all teams and departments - and OLA/UC targets need to be coordinated and agreed with individual support groups so that they underpin and support the SLA targets.
Examples of these can be seen in the section on metrics above (see section 6.2.5).
6.2.6.3 Good Communications
The lines of communication between the outsourced Service Desk and the other support groups need to work very effectively. This can be assisted by some or all of the following steps:
- Close physical co-location
- Regular liaison/review meetings
- Cross-training tutorials between the teams and departments
- 'Partnership' arrangements where staff from both organizations are used jointly to staff the desk
- Communication Plans and performance targets documented in a consistent manner in OLAs and UCs.
In cases where the Service Desk is located off-shore, not all of these measures will be possible. However, the need for training and communication of the Service Desk staff is still critical, even more so in cases where there are language and cultural differences.
This will be covered in more detail in ITIL complementary publications, but, as a rule, outsourcing companies who offer off-shore Service Desk solutions should take the following into account:
- Training programmes focused on cultural understanding of the customer market
- Language skills - especially the understanding of idiomatic use of the language in the customer market. This is not so that the Service Desk staff sound like natives of the customer's country (that type of insincerity is very quickly detected by customers), but to facilitate better understanding of the customer and a better appreciation of their priorities
- Regular visits by representatives of the customer organization to provide training and appropriate feedback directly to the Service Desk management and staff
- Training in the use of the customer organization's tools and methods of work. This is especially effective if similar training materials are presented by the same instructors as those used by the customer organization.
6.2.6.4 Ownership Of Data
Clear ownership of the data collected by the outsourced Service Desk must be established. Ownership of all data relative to users, customers, affected CIs, services, incidents, Service Requests, changes, etc. must remain with the organization that is outsourcing the activity - but both organizations will require access to it.
Data that is related specifically to performance of employees of the outsourcing company will remain the property of that company, which is often legally prevented from sharing the data with the customer organization. This may also be true of other data that is used purely for the internal management of the Service Desk, such as head count, optimization activities, Service Desk cost information, etc.
All reporting requirements and issues around ownership of data must be specified in the underpinning contract with the company providing the outsourcing service.
6.3 Technical Management
Technical Management refers to the groups, departments or teams that provide technical expertise and overall management of the IT Infrastructure.
6.3.1 Technical Management Role
Technical Management plays a dual role:
- It is the custodian of technical knowledge and expertise related to managing the IT Infrastructure. In this role, Technical Management ensures that the knowledge required to design, test, manage and improve IT services is identified, developed and refined.
- It provides the actual resources to support the ITSM Lifecycle. In this role Technical Management ensures that resources are effectively trained and deployed to design, build, transition, operate and improve the technology required to deliver and support IT services.
By performing these two roles, Technical Management is able to ensure that the organization has access to the right type and level of human resources to manage technology and, thus, to meet business objectives. Defining the requirements for these roles starts in Service Strategy and is expanded in Service Design, validated in Service Transition and refined in Continual Service Improvement (see other ITIL publications in this series).
Part of this role is also to ensure a balance between the skill level, utilization and the cost of these resources. For example, hiring a top-level resource at the higher end of the salary scale and then only using that skill for 10% of the time is not effective. A better Technical Management strategy would be to identify the times that the skill is needed and then hire a contractor for only those tasks.
Another strategy in larger organizations is to leverage specialist staff out of 'central' pools so that specialists can be well utilized and provide an economy of scale to the organization and minimize the need to hire in contractors. Specialized skills should be identified among resources in the IT organization, then leveraged for specific needs as they arise, analogous to a special tactical unit, whose members also perform regular duties but who are assigned to tasks needing their specialized skills. This type of resource utilization is particularly useful both for project teams and problem resolution.
An additional, but very important role played by Technical Management is to provide guidance to IT Operations about how best to carry out the ongoing operational management of technology. This role is partly carried out during the Service Design process, but it is also a part of everyday communication with IT Operations Management as they seek to achieve stability and optimum performance.
The objectives, activities and structures that enable Technical Management to perform these roles effectively are discussed below.
6.3.2 Technical Management Objectives
The objectives of Technical Management are to help plan, implement and maintain a stable technical infrastructure to support the organization's business processes through:
- Well designed and highly resilient, cost-effective technical topology
- The use of adequate technical skills to maintain the technical infrastructure in optimum condition
- Swift use of technical skills to diagnose and resolve any technical failures that do occur.
6.3.3 Generic Technical Management Activities
Technical Management is involved in two types of activity:
- Activities that are generic to the Technical Management function as a whole are discussed in this section as they enable Technical Management as a function to execute its role.
- A set of discrete activities and processes, which are performed by all three functions of Technical, Application and IT Operations Management, are covered in Chapter 5.
Generic Technical Management activities are highlighted as follows:
- Identifying the knowledge and expertise required to manage and operate the IT Infrastructure and to deliver IT services. This process starts during the Service Strategy phase, is expanded in detail in Service Design and is executed in Service Operation. Ongoing assessment and updating of these skills is done during Continual Service Improvement.
- Documentation of the skills that exist in the organization, as well as those skills that need to be developed. This will include the development of Skills Inventories and the performance of Training Needs Analyses.
- Initiating training programmes to develop and refine the skills in the appropriate technical resources and maintaining training records for all technical resources.
- Design and delivery of training for users, the Service Desk and other groups. Although training requirements must be defined in Service Design, they are executed in Service Operation. Where Technical Management does not deliver training, it is responsible for identifying organizations that can provide it.
- Recruiting or contracting resources with skills that cannot be developed internally, or where there are insufficient people to perform the required Technical Management activities.
- Procuring skills for specific activities where the required skills are not available internally or in the open market, or where it is more cost-efficient to do so.
- Definition of standards used in the design of new architectures and participation in the definition of technology architectures during the Service Strategy and Design phases.
- Research and development of solutions that can help expand the Service Portfolio or which can be used to simplify or automate IT Operations, reduce costs or increase levels of IT service.
- Involvement in the design and building of new services. Technical Management will contribute to the design of the Technical Architecture and Performance standards for IT services. In addition, it will also be responsible for specifying the operational activities required to manage the IT Infrastructure on an ongoing basis.
- Involvement in projects, not only during Service Design and Service Transition, but also for Continual Service Improvement or operational projects, such as Operating System upgrades, server consolidation projects or physical moves.
- Availability and Capacity Management are dependent on Technical Management for engineering IT services to meet the levels of service required by the business. This means that modelling and workload forecasting are often done with Technical Management resources.
- Assistance in assessing risk, identifying critical service and system dependencies and defining and implementing countermeasures.
- Designing and performing tests for the functionality, performance and manageability of IT services.
- Managing vendors. Many Technical Management departments or groups are the only ones who know exactly what is required of a vendor and how to measure and manage them. For this reason, many organizations rely on Technical Management departments to manage contracts with vendors of specific CIs. If this is the case it is important to ensure that these relationships are managed as part of the SLM process.
- Definition and management of Event Management standards and tools. Technical Management will also monitor and respond to many categories of events.
- Technical Management departments or groups are integral to the performance of Incident Management. They receive incidents through Functional Escalation and provide second- and higher-level support. They are also involved in maintaining categories and defining the escalation procedures that are executed in Incident Management.
- Technical Management as a function provides the resources that execute the Problem Management process. It is its technical expertise and knowledge that is used to diagnose and resolve problems. It is also its relationship with the vendors that is used to escalate and follow up with vendor support teams.
- Technical Management resources will be involved in defining coding systems that are used in Incident and Problem Management (e.g. Incident Categories).
- Technical Management resources are used to support Problem Management in validating and maintaining the KEDB.
- Change Management relies on the technical knowledge and expertise to evaluate changes, and many changes will be built by Technical Management.
- Releases are frequently deployed using Technical Management resources.
- Technical Management will provide information for, and operationally maintain, the Configuration Management system and its data. This will be done in cooperation with Application Management to ensure that the correct CI attributes and relationships are created from the deployment of services and the ongoing maintenance over the life of CIs.
- Technical Management is involved in the Continual Service Improvement processes, particularly in identifying opportunities for improvement and then in helping to evaluate alternative solutions.
- As a custodian of technical knowledge and expertise, Technical Management ensures that all system and operating documentation is up to date and properly utilized. This includes ensuring that all management, administration and user manuals are up to date and complete and that technical staff are familiar with their contents.
- Updating and maintaining data used for reporting on technical and service capabilities, e.g. Capacity and Performance Management, Availability Management, Problem Management, etc.
- Assisting IT Financial Management to identify the cost of technology and IT human resources used to manage IT services.
- Involvement in defining the operational activities performed as part of IT Operations Management. Many Technical Management departments, groups or teams also perform the operational activities as part of an organization's IT Operations Management function.
6.3.4 Technical Management Organization
Technical Management is not normally provided by a single department or group. One or more Technical Support teams or departments will be needed to provide technical management and support for the IT Infrastructure. In all but the smallest organizations, where a single combined team or department may suffice, separate teams or departments will be needed for each type of infrastructure being used.
IT Operations Management consists of a number of technological areas. Each of these requires a specific set of skills to manage and operate it. Some skill sets are related and can be performed by generalists, whereas others are specific to a component, system or platform.
The primary criterion of Technical Management organizational structure is that of specialization or division of labour. The principle is that people are grouped according to their technical skill sets, and that these skill sets are determined by the technology that needs to be managed.
Sections 6.6 and 6.7 cover the organizational aspects of Technical Management in detail, but this list provides some examples of typical Technical Management teams or departments:
- Mainframe team or department - if one or more mainframe types are still being used by the organization
- Server team or department - often split again by technology types (e.g. Unix server, Wintel server)
- Storage team or department, responsible for the management of all data storage devices and media
- Network Support team or department, looking after the organization's internal WANs/LANs and managing any external network suppliers
- Desktop team or department, responsible for all installed desktop equipment
- Database team or department, responsible for the creation, maintenance and support of the organization's databases
- Middleware team or department, responsible for the integration, testing and maintenance of all middleware in use in the organization
- Directory Services team or department, responsible for maintaining access and rights to service elements in the infrastructure
- Internet or Web team or department, responsible for managing the availability and security of access to servers and content by external customers, users and partners
- Messaging team or department, responsible for e-mail services
- IP-based Telephony team or department (e.g. VoIP).
6.3.5 Technical Design and Technical Maintenance and Support
Technical Management consists of specialist technical architects and designers (who are primarily involved during Service Design) and specialist maintenance and support staff (who are primarily involved during Service Operation).
In this publication, they are viewed as being part of the same function, but many organizations see them as two separate teams or even departments. The problem with this approach is that good design needs input from the people who are required to manage the solution - and good operation requires involvement from the people who designed the solution.
The problems that need to be overcome are similar to those faced in managing the Application Lifecycle (see section 6.5 for a more detailed discussion). The solution will include the following elements:
- Support staff should be involved during the design or architecture of a solution. Design staff should be involved in setting maintenance objectives and resolving support issues.
- A change in how both Design and Support staff are measured. Designers should be held partly accountable for design flaws that create operational outages. Support staff should be held partly accountable for contribution to the technical architecture.
6.3.6 Technical Management Metrics
Metrics for Technical Management will largely depend on which technology is being managed, but some generic metrics include:
- Measurement of agreed outputs. These could include:
  - Contribution to achievement of services to the business. Although many of the Technical Management teams will not be in direct contact with the business, the technology they manage impacts the business. Metrics should reflect both negative (incidents traced to their team) and positive (system performance and availability) contributions
  - Transaction rates and availability for critical business transactions
  - Service Desk training
  - Recording problem resolutions into the KEDB
  - User measures of the quality of outputs as defined in the SLAs
  - Installation and configuration of components under their control.
- Process metrics. Technical Management teams execute many Service Management process activities. Their ability to do so will be measured as part of the process metrics where appropriate (see section on each process for more details). Examples include:
  - Response time to events and event completion rates
  - Incident resolution times for second- and third-line support
  - Problem resolution statistics
  - Number of escalations and reason for those escalations
  - Number of changes implemented and backed out
  - Number of unauthorized changes detected
  - Number of releases deployed, total and successful
  - Security issues detected and resolved
  - Actual system utilization against Capacity Plan forecasts (where the team has contributed to the development of the plan)
  - Tracking against SIPs
  - Expenditure against budget.
- Technology performance. These metrics are based on Service Design specifications and technical performance standards set by vendors, and will typically be contained in OLAs or Standard Operating Procedures. Actual metrics will vary by technology, but are likely to include:
  - Utilization rates (e.g. memory or processor for server, bandwidth for networks, etc.)
  - Availability (of systems, network, devices, etc.), which is helpful for measuring team or system performance, but is not to be confused with Service Availability - which requires the ability to measure the overall availability of the service and may use the availability figures for a number of individual systems or components
  - Performance (e.g. response times, queuing rates, etc.).
- Mean Time Between Failures of specified equipment. This metric is used to ensure that good purchasing decisions are being made and, when compared with maintenance schedules, whether the equipment is being properly maintained
- Measurement of maintenance activity, including:
  - Maintenance performed per schedule
  - Number of maintenance windows exceeded
  - Maintenance objectives achieved (number and percentage).
- Training and skills development. These metrics ensure that staff have the skills and training to manage the technology that is under their control, and will also identify areas where training is still required.
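The difference between component availability and Service Availability, and the Mean Time Between Failures metric, can be illustrated with a small calculation. The following is a minimal sketch rather than a prescribed formula: it assumes that a service depends on every one of its components (so the component availabilities multiply) and that MTBF is taken as operating time divided by the number of failures. The function names and the figures in the usage note are hypothetical.

```python
from math import prod


def availability(agreed_hours: float, downtime_hours: float) -> float:
    """Fraction of the agreed service time during which a component was up."""
    if agreed_hours <= 0:
        raise ValueError("agreed_hours must be positive")
    return (agreed_hours - downtime_hours) / agreed_hours


def serial_service_availability(component_availabilities: list[float]) -> float:
    """Assumption: the service needs every component, so availabilities multiply."""
    return prod(component_availabilities)


def mtbf(operating_hours: float, failures: int) -> float:
    """Mean Time Between Failures: operating time divided by failure count."""
    if failures == 0:
        return float("inf")  # no failures observed in the period
    return operating_hours / failures
```

For example, under these assumptions a server with 3.6 hours of downtime in a 720-hour month and a network with 7.2 hours of downtime would each report a higher availability than the service that depends on both of them, which is why the component figures should not be reported as Service Availability.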
6.3.7 Technical Management Documentation
Technical Management is involved in drafting and maintaining several documents as part of other processes (e.g. Capacity Planning, Change Management, Problem Management, etc.). These documents are discussed in some detail in the relevant process descriptions. However, there are some documents that are specific to the Technical Management groups or teams who will provide document management and control for documents relating to the technology under their control. Technical Management documentation includes the following.
6.3.7.1 Technical Documentation
The sourcing and maintenance of technical documentation for all CIs is the responsibility of Technical Management. These include:
- Technical manuals
- Management and administration manuals
- User manuals for CIs. These will typically exclude application user manuals, which are maintained by Application Management.
6.3.7.2 Maintenance Schedules
These schedules are drawn up and agreed during the Service Design phase as part of Availability and Capacity Management, but they are essentially the property of the various Technical Management departments, groups or teams. This is because these teams have the technical expertise for specific technologies and are most likely to know what is needed to keep them in working order.
For more details on the definition of Maintenance Schedules and Service Maintenance Objectives, refer to the ITIL Service Design publication.
6.3.7.3 Skills Inventory
A Skills Inventory is a system or tool that identifies the skills required to deliver and support IT services and also the individuals who possess those skills. Skills Inventories are most effective if they are aligned with processes, architectures and performance standards.
In addition, Skills Inventories should identify the training available to cultivate each skill should existing staff leave the organization.
Skills Inventories can also be used as part of the Service Portfolio to assess whether a new service can be delivered with existing staff and skill sets, or whether an investment needs to be made in new staff or training. Skills Inventories can therefore contribute significantly to Capacity Planning.
The definition and maintenance of Skills Inventories requires a good interface with Human Resource processes and tools in the organization.
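As an illustration of how a Skills Inventory could support the assessment of a new service, the sketch below records who holds each skill and which training is available, and reports the shortfall against a stated requirement. It is a simplified assumption of what such a tool might hold. The class, its fields and the example names are hypothetical, and a real inventory would normally draw on Human Resource systems.

```python
from dataclasses import dataclass, field


@dataclass
class SkillsInventory:
    # skill name -> staff members who hold it (hypothetical structure)
    holders: dict[str, set[str]] = field(default_factory=dict)
    # skill name -> training available to cultivate it
    training: dict[str, list[str]] = field(default_factory=dict)

    def add_holder(self, skill: str, person: str) -> None:
        """Record that a person possesses a skill."""
        self.holders.setdefault(skill, set()).add(person)

    def gaps(self, required: dict[str, int]) -> dict[str, int]:
        """Return, per skill, how many more people are needed than are available."""
        shortfalls = {}
        for skill, needed in required.items():
            have = len(self.holders.get(skill, set()))
            if have < needed:
                shortfalls[skill] = needed - have
        return shortfalls
```

A non-empty result from `gaps` indicates where investment in new staff or training would be needed before the service could be supported.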
6.4 IT Operations Management
In business, the term 'Operations Management' is used to mean the department, group or team of people responsible for performing the organization's day-to-day operational activities - such as running the production line in a manufacturing environment or managing the distribution centres and fleet movements within a logistics organization.
Operations Management generally has the following characteristics:
- There is work to ensure that a device, system or process is actually running or working (as opposed to strategy or planning)
- This is where plans are turned into actions
- The focus is on daily or shorter-term activities, although it should be noted that these activities will generally be performed and repeated over a relatively long period (as opposed to one-off project type activities)
- These activities are executed by specialized technical staff, who often have to undergo technical training to learn how to perform each activity
- There is a focus on building repeatable, consistent actions that - if repeated frequently enough at the right level of quality - will ensure the success of the operation
- This is where the actual value of the organization is delivered and measured
- There is a dependency on investment in equipment or human resources or both
- The value generated must exceed the cost of the investment and all other organizational overheads (such as management and marketing costs) if the business is to succeed.
In a similar way, IT Operations Management can be defined as the function responsible for the ongoing management and maintenance of an organization's IT Infrastructure to ensure delivery of the agreed level of IT services to the business.
IT Operations can be defined as the set of activities involved in the day-to-day running of the IT Infrastructure for the purpose of delivering IT services at agreed levels to meet stated business objectives.
6.4.1 IT Operations Management Role
The role of IT Operations Management is to execute the ongoing activities and procedures required to manage and maintain the IT Infrastructure so as to deliver and support IT Services at the agreed levels. These have already been described in Chapter 5, but are summarized here for completeness:
- Operations Control, which oversees the execution and monitoring of the operational activities and events in the IT Infrastructure. This can be done with the assistance of an Operations Bridge or Network Operations Centre. In addition to executing routine tasks from all technical areas, Operations Control also performs the following specific tasks:
  - Console Management, which refers to defining central observation and monitoring capability and then using those consoles to exercise monitoring and control activities
  - Job Scheduling, or the management of routine batch jobs or scripts
  - Backup and Restore on behalf of all Technical and Application Management teams and departments and often on behalf of users
  - Print and Output management for the collation and distribution of all centralized printing or electronic output
  - Performance of maintenance activities on behalf of Technical or Application Management teams or departments.
- Facilities Management, which refers to the management of the physical IT environment, typically a Data Centre or computer rooms and recovery sites together with all the power and cooling equipment. Facilities Management also includes the coordination of large-scale consolidation projects, e.g. Data Centre consolidation or server consolidation projects. In some cases the management of a data centre is outsourced, in which case Facilities Management refers to the management of the outsourcing contract.
As with many IT Service Management processes and functions, IT Operations Management plays a dual role.
- IT Operations Management is responsible for executing the activities and performance standards defined during Service Design and tested during Service Transition. In this sense IT Operations' role is primarily to maintain the status quo. The stability of the IT infrastructure and consistency of IT Services is a primary concern of IT Operations. Even operational improvements are aimed at finding simpler and better ways of doing the same thing.
- At the same time, IT Operations is part of the process of adding value to the different lines of business and to support the value network (see the ITIL Service Strategy publication). The ability of the business to meet its objectives and to remain competitive depends on the output and reliability of the day-to-day operation of IT. As such, IT Operations Management must be able to continually adapt to business requirements and demand. The Business does not care that IT Operations complied with a standard procedure or that a server performed optimally. As business demand and requirements change, IT Operations Management must be able to keep pace with them, often challenging the status quo.
IT Operations must achieve a balance between these roles, which will require the following:
- An understanding of how technology is used to provide IT services
- An understanding of the relative importance and impact of those services on the business
- Procedures and manuals that outline the role of IT Operations in both the management of technology and the delivery of IT services
- A clearly differentiated set of metrics to report to the business on the achievement of Service objectives; and to report to IT managers on the efficiency and effectiveness of IT Operations
- All IT Operations staff understand exactly how the performance of the technology affects the delivery of IT services
- A cost strategy aimed at balancing the requirements of different business units with the cost savings available through optimization of existing technology or investment in new technology
- A value, rather than cost, based Return on Investment strategy.
6.4.2 IT Operations Management Objectives
The objectives of IT Operations Management include:
- Maintenance of the status quo to achieve stability of the organization's day-to-day processes and activities
- Regular scrutiny and improvements to achieve improved service at reduced costs, while maintaining stability
- Swift application of operational skills to diagnose and resolve any IT operations failures that occur.
6.4.3 IT Operations Management Organization
Figure 6.1 in the introduction to Chapter 6 illustrated that IT Operations Management is seen as a function in its own right but that, in many cases, staff from Technical and Application Management groups form part of this function.
This means that some Technical and Application Management departments or groups will manage and execute their own operational activities. Others will delegate these activities to a dedicated IT Operations department.
There is no single method for assigning activities, as it depends on the maturity and stability of the infrastructure being managed. For example, Technical and Application Management areas that are fairly new and unstable tend to manage their own operations. Groups where the technology or application is stable, mature and well understood tend to have standardized their operations more and will therefore feel more comfortable delegating these activities.
Some options of how to structure IT Operations are discussed in detail in section 6.7 of this publication.
6.4.4 IT Operations Management Metrics
IT Operations Management is measured in terms of its effective execution of specified activities and procedures, as well as its execution of process activities. Examples of these are as follows:
- Successful completion of scheduled jobs
- Number of exceptions to scheduled activities and jobs
- Number of data or system restores required
- Equipment installation statistics, including number of items installed by type, successful installations, etc.
- Process metrics. IT Operations Management executes many Service Management process activities. Their ability to do so will be measured as part of the process metrics where appropriate (see section on each process for more details). Examples include:
  - Response time to events
  - Incident resolution times
  - Number of security-related incidents
  - Number of escalations and reason for those escalations
  - Number of changes implemented and backed out
  - Number of unauthorized changes detected
  - Number of releases deployed, total and successful
  - Tracking against SIPs
  - Expenditure against budget.
- If maintenance activities have been delegated, then metrics related to these activities will also be appropriate:
  - Maintenance performed per schedule
  - Number of maintenance windows exceeded
  - Maintenance objectives achieved (number and percentage).
- Metrics related to Facilities Management are extensive, but typically include:
  - Costs versus budget related to maintenance, construction, security, shipping, etc.
  - Incidents related to the building, e.g. repairs needed to the facility
  - Reports on access to the facility
  - Number of security events and Incidents and their resolution
  - Power usage statistics, especially as related to changes in layout and environmental conditioning strategies
  - Events or incidents related to shipping and distribution.
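The maintenance metrics listed above (maintenance performed per schedule, maintenance windows exceeded, and objectives achieved as a number and percentage) could be derived from simple records of scheduled activities. The sketch below is one possible way to do this, under stated assumptions: the record layout and function name are hypothetical, and the percentage is taken against all scheduled activities.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class MaintenanceRecord:
    """One scheduled maintenance activity (hypothetical record layout)."""
    window_end: datetime                 # agreed end of the maintenance window
    finished: Optional[datetime] = None  # None: the activity was not performed
    objective_met: bool = False


def maintenance_metrics(records: list[MaintenanceRecord]) -> dict[str, float]:
    performed = [r for r in records if r.finished is not None]
    exceeded = [r for r in performed if r.finished > r.window_end]
    achieved = [r for r in performed if r.objective_met]
    total = len(records)
    return {
        "scheduled": total,
        "performed_per_schedule": len(performed),
        "windows_exceeded": len(exceeded),
        "objectives_achieved": len(achieved),
        # Assumption: percentage is measured against everything scheduled.
        "objectives_achieved_pct": 100.0 * len(achieved) / total if total else 0.0,
    }
```

An organization might equally measure the percentage against performed activities only. The choice should be stated wherever the metric is reported.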
6.4.5 IT Operations Management Documentation
A number of documents are produced and used during IT Operations Management. This list is a summary of some of the most important and does not include reports that are produced by IT Operations Management on behalf of other processes or functions.
6.4.5.1 Standard Operating Procedures
The SOPs are a set of documents containing detailed instructions and activity schedules for every IT Operations Management team, department or group.
These documents represent the routine work that needs to be done for every device, system or procedure. They also outline the procedures to be followed if an exception is detected or if a change is required.
SOP documents could also be used to define standard levels of performance for devices or procedures. In some organizations the SOP documents are referred to in the OLA. Instead of listing detailed performance measures in the OLA, a clause is inserted to refer to the performance standards in the SOP and how these will be measured and reported.
6.4.5.2 Operations Logs
Any activity that is conducted as part of IT Operations should be recorded for a number of reasons, including:
- They can be used to confirm the successful completion of specific jobs or activities
- They can be used to confirm that an IT service was delivered as agreed
- They can be used by Problem Management to research the root cause of incidents
- They are the basis for reports on the performance of the IT Operations Management teams and departments.
The format of these logs is as varied as the number of systems and Operations Management teams or departments. Examples of Operations Logs include the following:
- Operating System Logs stored on each device
- Application Activity Logs stored in a file on the application server
- Event Logs stored on the monitoring tool server
- Utilization Logs for key devices
- Physical access logs recording who accessed secure buildings and when
- Handwritten logs of actions performed by operators. These must be kept in a formal logbook or binder that is numbered and stored in a secure environment. Checks should ensure that pages are not removed.
A policy needs to be established as part of the SOPs to state how long logs need to be kept, how they are archived and when they can be deleted. These policies will take into account statutory and compliance requirements. Policies should also specify the parameters for adequate storage and backup strategies to store and retrieve log files.
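A retention policy of this kind can be expressed as a small rule that decides, from the age of a log, whether it should be kept online, archived or deleted. The sketch below is illustrative only: the two thresholds, the class name and the function name are hypothetical, and actual values would come from the organization's statutory and compliance requirements.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class RetentionPolicy:
    archive_after_days: int  # hypothetical threshold for moving logs to archive
    delete_after_days: int   # hypothetical threshold after which logs may be deleted


def log_action(created: date, today: date, policy: RetentionPolicy) -> str:
    """Return 'keep', 'archive' or 'delete' for a log created on the given date."""
    age_days = (today - created).days
    if age_days >= policy.delete_after_days:
        return "delete"
    if age_days >= policy.archive_after_days:
        return "archive"
    return "keep"
```

Keeping the thresholds in one policy object makes it straightforward to hold different policies for different log types, for example physical access logs versus utilization logs.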
6.4.5.3 Shift Schedules and Reports
Shift Schedules are documents that outline the exact activities that need to be carried out during the shift. They will also list all dependencies and activity sequences. There will probably be more than one Shift Schedule, where each team will have a version for its own systems. It is important that all schedules are coordinated before the start of the shift. This is usually done by a person who is specialized in Shift Scheduling, with the help of scheduling tools.
A Shift Schedule could consist of a number of routine items that are included in the SOP. In this case the items could simply be listed briefly with a reference to the section or page in the SOP.
Most Shift Schedules take the form of a checklist where operators can check off the item as it is completed, together with the time of completion. This makes it easy to see the progress of activities and also helps to identify any potential issues where jobs are taking too long.
Shift Reports are a form of Operations Log, but have the following additional functions:
- To record major events and actions that occurred during the shift
- To form part of the hand-over between shift leaders
- To report exceptions to Service Maintenance Objectives
- To identify any uncompleted activity that could result in degraded performance on any service during the next service hours.
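The checklist form of a Shift Schedule, and the Shift Report's role in identifying uncompleted activity, can be sketched as a simple data structure. This is a minimal illustration, not a description of any particular scheduling tool: the field names, the notion of an expected duration and the example jobs in the usage note are assumptions.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class ChecklistItem:
    name: str
    started: datetime
    expected: timedelta                   # expected run time (hypothetical field)
    completed: Optional[datetime] = None  # set when the operator checks the item off


def running_long(items: list[ChecklistItem], now: datetime) -> list[str]:
    """Names of items whose elapsed time exceeds the expected duration."""
    late = []
    for item in items:
        end = item.completed or now  # still running: measure up to 'now'
        if end - item.started > item.expected:
            late.append(item.name)
    return late


def uncompleted(items: list[ChecklistItem]) -> list[str]:
    """Names of items to carry into the Shift Report for hand-over."""
    return [item.name for item in items if item.completed is None]
```

For instance, a backup job that started three hours ago with an expected duration of two hours would appear in both lists, flagging it as a potential issue during the shift and as an open item at hand-over.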
6.4.5.4 Operations Schedule
The Operations Schedules are similar to Shift Schedules but cover all aspects of IT Operations at a high level. This schedule will include an overview of all planned changes, maintenance, routine jobs and additional work, together with information about upcoming business or vendor events. The Operations Schedule is used as the basis for the Daily Operations Meeting and is the master reference for all IT Operations managers to track progress and detect exceptions.
6.5 Application Management
Application Management is responsible for managing applications throughout their lifecycle. The Application Management function is performed by any department, group or team involved in managing and supporting operational applications. Application Management also plays an important role in the design, testing and improvement of applications that form part of IT services. As such, it may be involved in development projects, but is not usually the same as the Applications Development teams.
6.5.1 Application Management Role
Application Management is to applications what Technical Management is to the IT Infrastructure. Application Management plays a role in all applications, whether purchased or developed in-house. One of the key decisions that they contribute to is the decision of whether to buy an application or build it (this is discussed in detail in the Service Design publication). Once that decision is made, Application Management will play a dual role:
- It is the custodian of technical knowledge and expertise related to managing applications. In this role Application Management, working together with Technical Management, ensures that the knowledge required to design, test, manage and improve IT services is identified, developed and refined.
- It provides the actual resources to support the ITSM Lifecycle. In this role, Application Management ensures that resources are effectively trained and deployed to design, build, transition, operate and improve the technology required to deliver and support IT services.
By performing these two roles, Application Management is able to ensure that the organization has access to the right type and level of human resources to manage applications and thus to meet business objectives. This starts in Service Strategy and is expanded in Service Design, tested in Service Transition and refined in Continual Service Improvement (see other ITIL publications in this series).
Part of this role is to ensure a balance between the skill level and the cost of these resources.
In addition to these two high-level roles, Application Management also performs the following two specific roles:
- Providing guidance to IT Operations about how best to carry out the ongoing operational management of applications. This role is partly carried out during the Service Design process, but it is also a part of everyday communication with IT Operations Management as they seek to achieve stability and optimum performance.
- The integration of the Application Management Lifecycle into the ITSM Lifecycle. This is discussed below.
The objectives, activities and structures that enable Application Management to play these roles effectively are discussed below.
6.5.2 Application Management Objectives
The objectives of Application Management are to support the organization's business processes by helping to identify functional and manageability requirements for application software, and then to assist in the design and deployment of those applications and the ongoing support and improvement of those applications.
These objectives are achieved through:
- Applications that are well designed, resilient and cost-effective
- Ensuring that the required functionality is available to achieve the required business outcome
- The organization of adequate technical skills to maintain operational applications in optimum condition
- Swift use of technical skills to diagnose and resolve any technical failures that do occur.
6.5.3 Application Management Principles
6.5.3.1 Build or Buy?
One of the key decisions in Application Management is whether to buy an application that supports the required functionality, or whether to build the application specifically for the organization's requirements. These decisions are often made by a Chief Technical Officer (CTO) or Steering Committee, but they are dependent on information from a number of sources. These are discussed in detail in Service Design, but are summarized here from an Application Management function perspective.
Application Management will assist in this decision during Service Design as follows:
- Application sizing and workload forecasts (see section 4.6.4)
- Specification of manageability requirements
- Identification of ongoing operational costs
- Data access requirements for reporting or integration into other applications
- Investigating to what extent the required functionality can be met by existing tools - and how much customization will be required to achieve this
- Estimating the cost of customization
- Identifying what skills will be required to support the solution (e.g. if an application is purchased, will it require a new set of employees, or can existing employees be trained to support it?)
- Administration requirements
- Security requirements.
If the decision is to build the application, a further decision needs to be made on whether the development will be outsourced or built using employees. This is detailed in the Service Strategy and Service Design publications, but there are some important considerations affecting Service Operation, for example:
- How will manageability requirements be specified and agreed (e.g. designing application and transaction monitoring)? These are sometimes forgotten when the operational teams or departments are not represented in the project
- What are the Acceptance Criteria for operational performance; how and where will the solution be tested and who will perform the tests?
- Who will own and manage the Definitive Library for that application?
- Who will design and maintain the operational management and administration scripts for these applications?
- Who is responsible for environment set-up and owning and maintaining the different infrastructure components?
- How will the solution be instrumented so that it is capable of generating the required events?
6.5.3.2 Operational Models
An Operational Model is the specification of the operational environment in which the application will eventually run when it goes live. This will be used during testing and transition phases to simulate and evaluate the live environment. This is a way of ensuring that the application can be sized correctly and the required environmental conditions can be documented and understood by all. The Operational Model should be defined and used in testing during the Service Design and Service Transition phases respectively (see Service Design and Service Transition publications).
6.5.4 Application Management Lifecycle
The lifecycle followed to develop and manage applications has been referred to by many names, including the Software Lifecycle (SLC) and Software Development Lifecycle (SDLC). These are generally used by Applications Development teams and their Project Managers to define their involvement in designing, building, testing, deploying and supporting applications. Examples of these approaches are Structured Systems Analysis and Design Methodology (SSADM), Dynamic Systems Development Method (DSDM), Rapid Application Development (RAD), etc.
ITIL is primarily interested in the overall management of applications as part of IT Services, whether they are developed in-house or purchased from a third party. For this reason, the term Application Management Lifecycle has been used, as it implies a more holistic view.
This should not replace the SDLC, which is still a valid approach used by developers, especially by third-party software companies. However, it does mean that there should be greater alignment between the development view of applications and the 'live' management of those applications.
This is more difficult in large-scale purchased applications, such as e-mail, since the developers do not typically interact individually with their application's users. However, the basic lifecycle still holds true in that the application needs requirements, design, customization, operation and deployment. Optimization is achieved through better management, improvements to customization and upgrades.
The Application Management Lifecycle is illustrated in Figure 6.5.
Figure 6.5 Application Management Lifecycle
ITSM processes and Applications Development processes have to be aligned as part of the overall strategy of delivering IT services in support of the business.
Applications Development and Operations are part of the same overall lifecycle and both should be involved at all stages, although their level of involvement will vary depending on the stage of the lifecycle.
Relationship between the Application Management and Service Management Lifecycles
The Application Management Lifecycle should not be seen as an alternative to the Service Management Lifecycle. Applications are part of services and have to be managed as such. Nevertheless, applications are a unique blend of technology and functionality and this requires a specialized focus at each stage of the Service Management Lifecycle.
Each stage of the Application Management Lifecycle has its own specific set of objectives, activities, deliverables and dedicated teams. Each stage also has a clear responsibility to ensure that their outputs match up to the specific objectives of the Service Management Lifecycle. Different aspects of Application Management are covered in detail in each of the ITIL publications, as follows:
- Service Strategy: Defines the overall architecture of applications and infrastructure. This will include defining the criteria for developing in-house, outsourcing development, or purchasing and customizing applications. Service Strategy will also assist in defining the Service Portfolio (including applications) which also includes information about the Return on Investment of applications and the services they support. Thus high-level requirements are set during this phase.
- Service Design: Helps to establish requirements for functionality and manageability of applications and works with Development teams to ensure that they meet these objectives. Service Design covers most of the Requirements phase and is involved during the Build phase of the Application Management Lifecycle.
- Service Transition: Application Development and Management teams are involved in testing and validating what has been built and deploying it operationally.
- Service Operation: This covers the Operate phase of the Application Management Lifecycle. These processes and structures are discussed in detail in this publication.
- Continual Service Improvement: Covers the Optimize phase of the Application Management Lifecycle. Continual Service Improvement measures the quality and relevance of applications in operation and provides recommendations on how to improve applications if there is a clear Return on Investment for doing so.
6.5.4.1 Requirements
This is the phase during which the requirements for a new application are gathered, based on the business needs of the organization. This phase is active primarily during the Service Design phase of the ITSM Lifecycle.
There are six types of requirements for any application, whether being developed in-house, outsourced or purchased:
- Functional requirements are those specifically required to support a particular business function
- Manageability requirements, looked at from a Service Management perspective, address the need for a responsive, available and secure service, and deal with such issues as deployment, operations, system management and security
- Usability requirements are those that address the needs of the end user, and result in features of the system that facilitate its ease of use
- Architectural requirements, especially if this requires a change to existing architecture standards
- Interface requirements, where there are dependencies between existing applications or tools and the new application
- Service Level Requirements, which specify how the service should perform, the quality of its output and any other qualitative aspects measured by the user or customer.
6.5.4.2 Design
This is the phase during which requirements are translated into specifications. Design includes the design of the application itself, and the design of the environment, or operational model that the application has to run on. Architectural considerations are the most important aspect of this phase, since they can impact on the structure and content of both application and operational model. Architectural considerations for the application (design of the application architecture) and architectural considerations for the operation model (design of the system architecture) are strongly related and need to be aligned.
In the case of purchased software, most organizations will not be allowed direct input to the design of the software (which has already been built). However, it is important that Application Management is able to provide feedback to the software vendor about the functionality, manageability and performance of the software. This will, in turn, be taken up by the software vendor as part of the continual improvement of the software.
Part of the evaluation process for purchased software should include an evaluation of whether the vendor is responsive to such feedback. At the same time, the vendor should strike a balance between being responsive and changing the software so much that it becomes disruptive or alters some basic functionality.
Design for purchased software will also include the design of any customization that is required. Of special importance here is an evaluation of whether future versions of the software will support the customization.
6.5.4.3 Build
In the Build phase, both the application and the operational model are made ready for deployment. Application components are coded or acquired, integrated and tested.
Please note that Test is not a separate stage in the lifecycle, even though it is a discrete activity, and even though tests are conducted independently of both the development and operational activities. Without the Build and Deploy phases, there would be nothing to test and, without testing, there would be no control over what is developed and deployed.
Testing is an integral component of both the Build and Deploy phases as a validation of the activity and output of those phases - even if it uses different environments and staff. Testing in the Build phase focuses on whether the application meets its functionality and manageability specifications. Often the distinction is made between a development and test environment. The test environment allows for testing the combination of application and operational model. Testing is covered in the ITIL Service Transition publication.
For purchased software, this will involve the actual purchase of the application, any required middleware and the related hardware and networking equipment. Any customization that is required will need to be done here, as will the creation of tables, categories, etc. that will be used. This is often done as a pilot implementation by the relevant Application Management team or department.
6.5.4.4 Deploy
In this phase, both the operational model and the application are deployed. The operational model is incorporated in the existing IT environment and the application is installed on top of the operational model, using the Release and Deployment Management process described in the ITIL Service Transition publication.
Testing also takes place during this phase, although here the emphasis is on ensuring that the deployment process and mechanisms work effectively, e.g. testing whether the application still functions to specification after it has been downloaded and installed. This is known as Early Life Support and covers a pre-defined guarantee period during which testing, validation and monitoring of a new application or service take place. Early Life Support is covered in detail in the Service Transition publication.
6.5.4.5 Operate
In the Operate phase, the IT services organization operates the application as part of delivering a service required by the business. The performance of the application in relation to the overall service is measured continually against the Service Levels and key business drivers. It is important to recognize that applications themselves do not equate to a service. It is common in many organizations to refer to applications as 'services'; however, applications are but one component of many needed to provide a business service.
The Operate phase is not exclusive to applications and is discussed throughout this publication, with a more detailed list of activities given in section 6.5.5 below.
6.5.4.6 Optimize
In the Optimize phase, the results of the Service Level performance measurements are measured, analysed and acted upon. Possible improvements are discussed and developments initiated if necessary. The two main strategies in this phase are to maintain and/or improve the Service Levels and to lower cost. This could lead to iteration in the lifecycle or to justified retirement of an application.
One important thing to remember about the Application Management Lifecycle is that, because it is circular, the same application can reside in different phases of the lifecycle at the same time. For example, when the next version of an application is being designed, and the current version is being deployed, the previous version might still be in operation in parts of an organization. This obviously requires strong version, configuration and release control.
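The point about several versions occupying different phases at once can be made concrete with a small sketch. The following Python is illustrative only: the `Phase` enumeration mirrors the six phases named in this section, while `AppVersion`, `phases_in_use` and the 'payroll' sample data are hypothetical names introduced for the example and are not part of ITIL.

```python
from dataclasses import dataclass
from enum import Enum


class Phase(Enum):
    """The six Application Management Lifecycle phases described above."""
    REQUIREMENTS = 1
    DESIGN = 2
    BUILD = 3
    DEPLOY = 4
    OPERATE = 5
    OPTIMIZE = 6


@dataclass(frozen=True)
class AppVersion:
    """Hypothetical record: one version of one application and its current phase."""
    application: str
    version: str
    phase: Phase


def phases_in_use(versions, application):
    """Return the set of phases currently occupied by any version of the application."""
    return {v.phase for v in versions if v.application == application}


# Sample data (assumed): next version in design, current version being
# deployed, previous version still in operation.
versions = [
    AppVersion("payroll", "3.0", Phase.DESIGN),
    AppVersion("payroll", "2.0", Phase.DEPLOY),
    AppVersion("payroll", "1.0", Phase.OPERATE),
]

active = phases_in_use(versions, "payroll")
print(sorted(p.name for p in active))
```

A record of this kind is only useful if it is kept accurate, which is one reason the text stresses version, configuration and release control.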
Particular phases might take longer or seem more significant than others, but they are all crucial. Every application must go through all of them at least once and, because of the circular nature of the lifecycle, will go through some more than once.
This approach also supports iterative development approaches, where software is continually being developed in incremental steps. Each step follows the lifecycle and the application is built in increments, using business priorities as a driver.
Good communication is the key as an application works its way through the phases of the lifecycle. It is critical that high-quality information is passed along by those handling the application in one phase of its existence to those handling it in the next phase. It is also important that an organization monitors the quality of the Application Management Lifecycle. Changes in the lifecycle, for example in the way an organization passes information between the different phases, will affect its quality. Understanding the characteristics of every phase in the Application Management Lifecycle is crucial to improving the quality of the whole. Methods and tools used in one phase might have an impact on others, while optimization of one phase might sub-optimize the whole.
6.5.5 Application Management Generic Activities
While most Application Management teams or departments are dedicated to specific applications or sets of applications, there are a number of activities which they have in common. These include:
- Identifying the knowledge and expertise required to manage and operate applications in the delivery of IT services. This process starts during the Service Strategy phase, is expanded in detail in Service Design and is executed in Service Operation. Ongoing assessment and updating of these skills are done during Continual Service Improvement.
- Initiating training programmes to develop and refine the skills in the appropriate Application Management resources and maintaining training records for these resources.
- Recruiting or contracting resources with skills that cannot be developed internally, or where there are insufficient people to perform the required Application Management activities.
- Design and delivery of end-user training. Training may be developed and delivered by either the Application Development or Application Management groups, or by a third party, but Application Management is responsible for ensuring that training is conducted as appropriate.
- Insourcing for specific activities where the required skills are not available internally or in the open market, or where it is more cost-efficient to do so.
- Definition of standards used in the design of new architectures and participation in the definition of application architectures during the Service Strategy processes.
- Research and Development of solutions that can help expand the Service Portfolio or which can be used to simplify or automate IT Operations, reduce costs or increase levels of IT service.
- Involvement in the design and building of new services. All Application Management teams or departments will contribute to the design of the Technical Architecture and Performance standards for IT Services. In addition they will also be responsible for specifying the operational activities required to manage applications on an ongoing basis.
- Involvement in projects, not only during the Service Design process, but also for Continual Service Improvement or operational projects, such as Operating System upgrades, server consolidation projects or physical moves.
- Designing and performing tests for the functionality, performance and manageability of IT Services (bearing in mind that testing should be controlled and performed by an independent tester - see Service Transition publication).
- Availability and Capacity Management are dependent on Application Management for contributing to the design of applications to meet the levels of service required by the business. This means that modelling and workload forecasting are often done together with Technical and Application Management resources.
- Assistance in assessing risk, identifying critical service and system dependencies and defining and implementing countermeasures.
- Managing vendors. Many Application Management departments or groups are the only ones who know exactly what is required of a vendor and how to measure and manage them. For this reason, many organizations rely on Application Management to manage contracts with vendors of specific applications. If this is the case it is important to ensure that these relationships are managed as part of the SLM process.
- Involvement in definition of Event Management standards and especially in the instrumentation of applications for the generation of meaningful events.
- Application Management as a function provides the resources that execute the Problem Management process. It is their technical expertise and knowledge that is used to diagnose and resolve problems. It is also their relationship with the vendors that is used to escalate and follow up with vendor support teams or departments.
- Application Management resources will be involved in defining coding systems that are used in Incident and Problem Management (e.g. Incident Categories).
- Application Management resources are used to support Problem Management in validating and maintaining the KEDB together with the Application Development teams.
- Change Management relies on the technical knowledge and expertise to evaluate changes and many changes will be built by Application Management teams.
- Successful Release Management is dependent on involvement from Application Management staff. In fact they are frequently the drivers of the Release Management process for their applications.
- Application Management will define, manage and maintain attributes and relationships of application CIs in the CMS.
- Application Management is involved in the Continual Service Improvement processes, particularly in identifying opportunities for improvement and then in helping to evaluate alternative solutions.
- Application Management ensures that all system and operating documentation is up to date and properly utilized. This includes ensuring that all design, management and user manuals are up to date and complete and that Application Management staff and users are familiar with their contents.
- Collaboration with Technical Management on performing Training Needs Analysis and maintaining Skills Inventories.
- Assisting IT Financial Management to identify the cost of the ongoing management of applications.
- Involvement in defining the operational activities performed as part of IT Operations Management. Many Application Management departments, groups or teams also perform the operational activities as part of an organization's IT Operations Management function.
- Input into, and maintenance of, software configuration policies.
- Together with Software Development teams, the definition and maintenance of documentation related to applications. These will include user manuals, administration and management manuals, as well as any SOPs required to manage operational aspects of the application.
Application Management teams or departments will be needed for all key applications. The exact nature of the role will vary depending upon the applications being supported, but generic responsibilities are likely to include:
- Third-level support for incidents related to the application(s) covered by that team or department
- Involvement in operation testing plans and deployment issues
- Application bug tracking and patch management (coding fixes for in-house code, transports/patches for third-party code)
- Involvement in application operability and supportability issues such as error code design, error messaging, event management hooks
- Application sizing and performance; volume metrics and load testing etc. This is in support of Capacity and Availability Management processes
- Involvement in developing Release Policies
- Identification of enhancements to existing software, both from a functionality and manageability perspective.
6.5.6 Application Management Organization
Although all Application Management departments, groups or teams perform similar activities, each application or set of applications has a different set of management and operational requirements. Examples of these differences include:
- The purpose of the application. Each application was developed to meet a specific set of objectives, usually business objectives. For effective support and improvement, the group that manages that application needs to have a comprehensive understanding of the business context and how the application is used to meet its objectives. This is often achieved by Business Analysts who are close to the business and responsible for ensuring that business requirements are effectively translated into application specifications. Business Analysts should recognize that business requirements must be translated into both functional and manageability specifications.
- The functionality of the application. Each application is designed to work in a different way and to perform different functions at different times.
- The platform on which the application runs. Although the platform is usually managed by a Technical Management team or department, each of them affects the way in which an application needs to be managed and operated.
- The type or brand of technology used. Even applications that have similar functionality operate differently on different databases or platforms. These differences have to be understood in order to manage the application effectively.
Even though the activities to manage these applications are generic, the specific schedule of activities and the way they are performed will be different. For this reason, Application Management teams and departments tend to be organized according to the categories of applications that they support. Typical examples of Application Management organizations include:
- Financial applications. In larger organizations where a number of different applications are used for different aspects of Financial Management, there may be several departments, groups or teams managing these applications, e.g. Debtors and Creditors, Age Analysis, General Ledger, etc.
- Messaging and collaboration applications
- HR applications
- Manufacturing support applications
- Sales force automation
- Sales order processing applications
- Call centre and marketing applications
- Business-specific applications (e.g. health care, insurance, banking, etc.)
- IT applications, such as Service Desk, Enterprise System Management, etc.
- Web portals
- Online shopping.
6.5.6.1 Organizational Roles
Traditionally, Application Development and Management teams and departments have been autonomous units. Each one manages its own environment in its own way and each has a separate interface to the business. This is illustrated in Table 6.2.
| | Application Development | Application Management |
|---|---|---|
| Primary focus | Building functionality for their customer. What the application does is more important to them than how it is operated | Focus on what the functionality is as well as how to deliver it. Manageability aspects of the application, i.e. how to ensure stability and performance of the application |
| Management mode | Most development work is done in projects where the focus is on delivering specific units of work to specification, on time and within budget. This means that it is often difficult for developers to understand and build for ongoing operations, especially since they are not available for support of the application once they have moved on to the next project | Most work is done as part of repeatable, ongoing processes. A relatively small number of people work in projects. This means that it is very difficult for operational staff to get involved in development projects, as that takes them away from their 'real jobs' |
| Measurement | Staff are rewarded for creativity and for completing one project so that they can move on to the next project | Staff are rewarded for consistency and for preventing unexpected events and unauthorized functionality (e.g. 'bells and whistles' added by developers) |
| Cost | Development projects are relatively easy to quantify since the resources are known and it is easy to link their expenses to a specific application or IT Service | Ongoing management costs are often mixed in with the costs of other IT services since resources are often shared across multiple IT services and applications |
| Lifecycles | Development staff focus on Software Development Lifecycles, which highlight the dependencies for successful operation, but do not assign accountability for these | Staff involved in ongoing management typically only control one or two phases of these lifecycles - Operation and Improvement |

Table 6.2 Organizational roles
Over the last several years, these two worlds have been brought together by moves to Object Oriented and SOA approaches, together with growing pressure from the Business to be more responsive and easy to work with.
This means that Application Development will have greater accountability for the successful operation of applications they design, while Application Management will have greater involvement in the development of applications.
This does not change the fundamental role of each group, but it does require a more integrated approach to the SLC. It will also mean that the output of Application Development will be more commoditized and that Application Management will be more involved in Development projects.
Figure 6.6 Role of teams in the Application Management Lifecycle
This will require the following changes:
- A single interface to the business for all stages of the lifecycle and a common requirements and specification-setting process.
- A change in how both Development and Management staff are measured. Development teams should be held partly accountable for design flaws that create operational outages. Management staff should be held partly accountable for contribution to the technical architecture and manageability design of applications.
- A single Change Management process for both groups, with Change Control in each group being subordinate to the overall authority of Change Management (see Service Transition publication).
- A clear mapping of Development and Management activities in the lifecycle, which is illustrated at a high level in Figure 6.6. The exact activities and how they interact should be defined in each organization, although some generic guidelines are given in each of the ITIL publications.
- Greater focus on integrating functionality and manageability requirements early in the project.
Figure 6.6 shows a common Application Management Lifecycle with involvement from both groups. In this diagram it is clear that Application Development will be driving some phases with input from Application Management. In other cases Application Management will be driving the phase with input and support from Application Development. Both groups are subordinated to the IT Service Strategy of the organization and their efforts are coordinated through Service Transition mechanisms and processes.
6.5.7 Application Management Roles And Responsibilities
6.5.7.1 Applications Managers/Team Leaders
An Applications Manager or Team Leader (depending upon the size and/or importance of the team or department and the application they support, and the organization's structure and culture) will be needed for each of the applications teams or departments.
The role will:
- Take overall responsibility for leadership, control and decision-making for the applications team or department
- Provide technical knowledge and leadership in the specific applications support activities covered by the team or department
- Ensure necessary technical training, awareness and experience levels are maintained within the team or department relevant to the applications being supported and processes being used
- Maintain ongoing communication with users and customers regarding application performance and evolving requirements of the business
- Report to senior management on all issues relevant to the applications being supported
- Perform line-management for all team or department members.
6.5.7.2 Applications Analyst/Architect
Application Analysts and Architects are responsible for matching requirements to application specifications. Specific activities include:
- Working with users, sponsors and all other stakeholders to determine their evolving needs
- Working with Technical Management to determine the highest-level system requirements needed to meet the business requirements within budget and technology constraints
- Performing cost-benefit analyses to determine the most appropriate means to meet the stated requirement
- Developing Operational Models that will ensure optimal use of resources and the appropriate level of performance
- Ensuring that applications are designed to be effectively managed given the organization's technology architecture, available skills and tools
- Developing and maintaining standards for application sizing, performance modelling, etc
- Generating a set of acceptance test requirements, together with the designers, test engineers and the user, which determine that all of the high-level requirements have been met, both functional and with regard to manageability
- Input into the design of configuration data required to manage and track the application effectively.
An appropriate number of Application Analysts will be needed for each of the Application Management teams or departments to perform the generic activities described in section 6.5.5.
The ways in which Application Management groups can be organized, and the options available, are discussed in some detail in section 6.7 below.
6.5.8 Application Management Metrics
Metrics for Application Management will largely depend on which applications are being managed, but some generic metrics include:
- Measurement of agreed outputs. These could include:
  - Ability of users to access the application and its functionality
  - Reports and files are transmitted to the users
  - Transaction rates and availability for critical business transactions
  - Service Desk training
  - Recording problem resolutions into the KEDB
  - User measures of the quality of outputs as defined in the SLAs.
- Process metrics. Application Management teams execute many Service Management process activities. Their ability to do so will be measured as part of the process metrics where appropriate (see section on each process for more details). Examples include:
  - Response time to events and event completion rates
  - Incident resolution times for second- and third-line support
  - Problem resolution statistics
  - Number of escalations and reason for those escalations
  - Number of changes implemented and backed out
  - Number of unauthorized changes detected
  - Number of releases deployed, total and successful, including ensuring adherence to the Release Policies of the organization
  - Security issues detected and resolved
  - Actual system utilization against Capacity Plan forecasts (where the team has contributed to the development of the plan)
  - Tracking against SIPs
  - Expenditure against budget.
- Application performance. These metrics are based on Service Design specifications and technical performance standards set by vendors and will typically be contained in OLAs or SOPs. Actual metrics will vary by application, but are likely to include:
  - Response times
  - Application availability, which is helpful for measuring team or application performance but is not to be confused with Service Availability - which requires the ability to measure the overall availability of the service, and may use the availability figures for a number of individual systems or components
  - Integrity of data and reporting.
- Measurement of maintenance activity, including:
  - Maintenance performed per schedule
  - Number of maintenance windows exceeded
  - Maintenance objectives achieved (number and percentage).
- Application Management teams are likely to work closely with Application Development teams on projects, and appropriate metrics should be used to measure this, including:
  - Time spent on projects
  - Customer and user satisfaction with the output of the project
  - Cost of involvement in the project.
- Training and skills development. These metrics ensure that staff have the skills and training to manage the technology that is under their control, and will also identify areas where training is still required.
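The distinction between application availability and Service Availability, and the 'number and percentage' form of the maintenance metric, can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a prescribed calculation: the function names and all figures are hypothetical, and the service figure assumes that every component is required for the service and that component failures are independent, which will not hold for every architecture.

```python
from math import prod


def availability(uptime_minutes, agreed_minutes):
    """Fraction of the agreed service time during which a component was available."""
    if agreed_minutes <= 0:
        raise ValueError("agreed_minutes must be positive")
    return uptime_minutes / agreed_minutes


def serial_service_availability(component_availabilities):
    """Combine component figures into a service figure.

    Assumption: all components are needed (no redundancy) and they fail
    independently, so the individual availabilities are multiplied.
    """
    return prod(component_availabilities)


def percent_achieved(achieved, planned):
    """Maintenance objectives achieved, as a percentage of those planned."""
    return 100.0 * achieved / planned if planned else 0.0


# Illustrative figures only: a 30-day month of agreed service time.
app = availability(uptime_minutes=43_100, agreed_minutes=43_200)
service = serial_service_availability([app, 0.999, 0.995])  # app, network, database

print(f"application availability: {app:.4%}")
print(f"service availability:     {service:.4%}")
print(f"maintenance objectives:   {percent_achieved(9, 12):.0f}%")
```

Under these assumptions the service figure is always at or below the figure for the weakest component, which is why a good application availability result cannot be reported as Service Availability.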
6.5.9 Application Management Documentation
A number of documents are produced and used during Application Management. This list is a summary of some of the most important and does not include reports or documents that are produced by Application Management on behalf of other process or functions (e.g. RfC, Known Error documentation, Release Records, etc.).
6.5.9.1 Application Portfolio
The Application Portfolio is used primarily as part of Service Strategy, but is referenced here for completeness. The Application Portfolio is a list (more accurately a system or database) of all applications in use within the organization, together with the following information:
- Key attributes of the application
- Customers and users
- Business purpose
- Level of business criticality
- Architecture (including the IT Infrastructure dependencies)
- Developers, support groups, suppliers or vendors
- The investment made in the application to date. In this respect the Application Portfolio can be used as an asset register for applications.
The purpose of the Application Portfolio is to analyse the need for and use of applications in the organization. It can be used to link functionality and investment to business activity and is therefore an important part of ongoing IT planning and control. Another benefit of the Application Portfolio is that it can be used to identify duplication and excessive licensing of applications.
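Because the Application Portfolio is described as a system or database rather than a simple list, its entries can be pictured as structured records. The sketch below is one possible shape, assuming a Python dataclass whose fields follow the attributes listed above; the field names, the criticality scale, the sample applications and the rule that two applications with the same stated business purpose are candidates for duplication are all assumptions made for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class PortfolioEntry:
    """Hypothetical Application Portfolio record based on the attributes above."""
    name: str
    business_purpose: str
    criticality: str                      # e.g. "high", "medium", "low" (assumed scale)
    customers: list = field(default_factory=list)
    suppliers: list = field(default_factory=list)
    investment_to_date: float = 0.0       # supports use as an asset register


def find_duplicates(entries):
    """Group application names by business purpose and keep groups with more than one."""
    by_purpose = defaultdict(list)
    for entry in entries:
        by_purpose[entry.business_purpose.strip().lower()].append(entry.name)
    return {purpose: names for purpose, names in by_purpose.items() if len(names) > 1}


# Invented sample data.
portfolio = [
    PortfolioEntry("ExpenseTrack", "Expense reporting", "medium", investment_to_date=120_000),
    PortfolioEntry("ClaimIt", "expense reporting", "low", investment_to_date=45_000),
    PortfolioEntry("LedgerPro", "General ledger", "high", investment_to_date=900_000),
]

duplicates = find_duplicates(portfolio)
total_investment = sum(e.investment_to_date for e in portfolio)
print(duplicates)
print(total_investment)
```

In practice, matching on a free-text purpose field would be too crude; the sketch only shows that duplication analysis depends on the portfolio recording business purpose in a comparable form.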
The Application Portfolio forms part of the overall IT Service Portfolio, which is discussed in detail in the Service Strategy publication.
The Application Portfolio and the Service Catalogue
The Application Portfolio should not be mistaken for the Service Catalogue and should not be advertised as a list of services to customers or users. Applications are one of the components used to provide IT services, usually not the service itself.
The Application Portfolio should therefore be used as a planning document only by those managers and staff who are involved with the development and management of the organization's IT Strategy, as well as IT staff who are tasked with managing the applications or the platforms on which the applications run.
The Service Catalogue should focus on listing the services that are available, rather than simply listing applications and assuming that users and customers can make the link. Having said that, there are times when the application is synonymous with the service, e.g. word-processing applications are typically known by their name; an application hosting service will mention the names of the application hosted, etc.
6.5.9.2 Application Requirements
There are two sets of documents containing requirements for applications:
- Business Requirements outline the Business Case for the required application, in other words what the business will do with the application. This will include the Return on Investment for the application as well as all related improvements to the business. Business requirements will also include the Service Level Requirements as defined by the service customers and users.
- Application Requirements documents are based on the Business Requirements and specify exactly how the application will meet those requirements. In short, Application Requirements documents gather information that will be used to commission new applications or changes to existing applications, for example:
  - To design the architecture of the application (specification of the different components of the system, how they relate to one another and how they will be managed)
  - To specify a Request for Proposal (RFP) for a Commercial, Off the Shelf (COTS) application
  - To initiate the design and building of an application in-house.
Requirements documents are normally owned by a project leader, either of a development project team, or for a team drawing up specifications for an RFP. Requirements documents are subject to document control for the project as they form part of the overall scope of the project.
Four different types of Application Requirements need to be defined (for more detailed information, please refer to the ITIL Service Design and Service Transition publications):
- Functional Requirements describe the things an application is intended to do, and can be expressed as services, tasks or functions the application is required to perform.
- Manageability Requirements are used to define what is needed to manage the application or to ensure that it performs the required functions consistently and at the right level. Manageability requirements also identify constraints on the IT system. These requirements serve as a basis for early system sizing and estimates of cost, and can support the assessment of the viability of the proposed IT system. Most importantly, they drive design of the operational models and performance standards used in IT Operations Management.
- Usability Requirements are normally specified by the users of the application and refer to its ease of use. Any special requirements for users with disabilities also need to be specified here.
- Test Requirements specify what is required to ensure that the test environment is representative of the operational environment and that the test is valid (i.e. that it actually tests what it is supposed to).
6.5.9.3 Use and Change Cases
Use and Change Cases are managed as part of the Service Design and Continual Service Improvement processes, but are maintained by Application Management. For purchased software, it is common for the team that develops the functional specifications to maintain the Use Case for that application.
- Use Cases document the intended use of the application with real-life scenarios to demonstrate its boundaries and its full functionality. Use Cases can also be used as modelling and sizing scenarios and for facilitating communication between users, Developers and Application Management staff.
- Change Cases use scenarios to predict the impact of potential changes to utilization, architecture or functionality, and project the impact of specific change scenarios. Change Cases are used to clarify scope and direction with the sponsor. Extra architecture and design work will be needed at this point to ensure the Change Cases can be met in the future at reasonable cost. The sponsor must be prepared to pay the extra cost. If not, the Change Cases should be reduced to what the sponsor is prepared to pay for. Change Cases are also used to evaluate the architecture. They influence the development process enabling the design of appropriate architectural features to minimize the impact of future changes.
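As a rough illustration of how a Change Case can project the impact of a change in utilization, the following Python sketch scales the current workload by a growth factor and compares the result with an agreed utilization ceiling. The function names, the figures and the 80% ceiling are assumptions made for this example and are not taken from this publication.

```python
def projected_utilization(current_load, capacity, growth_factor):
    """Fraction of capacity used once the current load is scaled by the growth factor."""
    return current_load * growth_factor / capacity


def change_case_fits(current_load, capacity, growth_factor, ceiling=0.8):
    """True if the projected utilization stays at or below the ceiling.

    The default ceiling of 0.8 is a hypothetical planning threshold.
    """
    return projected_utilization(current_load, capacity, growth_factor) <= ceiling


# Hypothetical Change Case: the user base grows by 50% on a system that
# currently handles 400 transactions per minute out of a capacity of 1000.
fits = change_case_fits(current_load=400, capacity=1000, growth_factor=1.5)
```

A Change Case that does not fit within the ceiling indicates where extra architecture and design work, and therefore extra cost to the sponsor, would be needed.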
For more information, refer to the ITIL Service Design and Continual Service Improvement publications.
6.5.9.4 Design Documentation
This is not one specific document, but refers to any document produced by Application Development or Management staff that specifies how an application will be built. As these documents are generally owned and managed by the Development teams, this publication will not cover them in detail. However, to ensure successful operation, Application Management must ensure that design documentation contains:
- Sizing specifications
- Workload profiles and utilization forecasts
- Technical Architecture
- Data models
- Coding standards
- Performance standards
- Software Configuration Management definitions
- Environment definitions and building considerations (if appropriate).
For COTS applications, these documents take the form of Application Specifications that are used as input into the writing of RFPs. In these cases the documents are owned and managed by Application Management.
For more information on Design Documentation, refer to the ITIL Service Design publication.
6.5.9.5 Manuals
Application Management is responsible for the management of manuals for all applications. Although these are normally developed by the Application Development teams or third party suppliers, Application Management is responsible for ensuring that the manuals are relevant to the operational versions of the applications.
Three types of manuals are generally maintained by Application Management:
- Design manuals contain information about the structure and architecture of the application. These are helpful for creating reports or defining event correlation rules. They could also help in diagnosing problems.
- Administration or management manuals describe the activities required to maintain and operate the application at the levels of performance specified in the Design phase. These manuals will also provide detailed troubleshooting, Known Error and Fault descriptions, and step-by-step instructions for common maintenance tasks.
- User manuals describe the application functionality as it is used by an end-user. These manuals contain step-by-step instructions on how to use the application, as well as descriptions of what should typically be entered into certain fields, or what to do if there is an error.
Manuals and Standard Operating Procedures
Manuals should not be seen as a replacement for SOPs, but as input into the SOPs.
SOPs should contain all aspects of applications that need to be managed as part of standard operations. If they are not extracted from the manuals, there is a high likelihood that they will be ignored or performed in a non-standard manner. Application Management should ensure that any such instructions are extracted from the manuals and inserted into separate SOP documentation for Operations. It is also responsible for ensuring that these instructions are updated with every change or new release of the software.
6.6 Service Operation Roles And Responsibilities
The key to effective ITSM is ensuring that there is clear accountability and that roles are defined to carry out the practice of Service Operation. A role is often tied to a job description or work group description but does not necessarily need to be filled by one individual. The size of an organization, how it is structured, the existence of external partners and other factors will influence how roles are assigned. Whether a particular role is filled by a single individual or shared between two or more, what matters is the consistency of accountability and execution, along with the interaction with other roles in the organization.
6.6.1 Service Desk Roles
The following roles are needed for the Service Desk.
6.6.1.1 Service Desk Manager
In larger organizations where the Service Desk is of a significant size, a Service Desk Manager role may be justified with the Service Desk Supervisor(s) reporting to him or her. In such cases this role may take responsibility for some of the Supervisor activities listed in paragraph 6.6.1.2 and may additionally perform the following activities:
- Manage the overall desk activities, including the supervisors
- Act as a further escalation point for the supervisor(s)
- Take on a wider customer-services role
- Report to senior managers on any issue that could significantly impact the business
- Attend Change Advisory Board meetings
- Take overall responsibility for incident and Service Request handling on the Service Desk. This could also be expanded to any other activity taken on by the Service Desk - e.g. monitoring certain classes of event.
6.6.1.2 Service Desk Supervisor
In very small desks it is possible that the senior Service Desk Analyst will also act as the Supervisor - but in larger desks it is likely that a dedicated Service Desk Supervisor role will be needed. Where shift hours dictate it, there may be two or more post-holders who fulfil the role, usually on an overlapping basis. The Supervisor's role is likely to include:
- Ensuring that staffing and skill levels are maintained throughout operational hours by managing shift staffing schedules, etc.
- Undertaking HR activities as needed
- Acting as an escalation point where difficult or controversial calls are received
- Production of statistics and management reports
- Representing the Service Desk at meetings
- Arranging staff training and awareness sessions
- Liaising with senior management
- Liaising with Change Management
- Performing briefings to Service Desk staff on changes or deployments that may affect volumes at the Service Desk
- Assisting analysts in providing first-line support when workloads are high, or where additional experience is required.
6.6.1.3 Service Desk Analysts
The primary Service Desk Analyst role is that of providing first-level support through taking calls and handling the resulting incidents or Service Requests using the Incident Reporting and Request Fulfilment processes, in line with the objectives described earlier. The exact number of staff required is discussed in paragraph 6.2.4.1.
6.6.1.4 Super Users
Super Users are discussed in detail in the section on Service Desk staffing in paragraph 6.2.4. In summary, this role will consist of business users who act as liaison points with IT in general and the Service Desk in particular. The role of the Super User can be summarized as follows:
- To facilitate communication between IT and the business at an operational level
- To reinforce expectations of users regarding what Service Levels have been agreed
- To provide training for users in their area
- To provide support for minor incidents or simple Request Fulfilment
- To be involved with new releases and rollouts.
6.6.2 Technical Management Roles
The following roles are needed in the Technical Management areas.
6.6.2.1 Technical Managers/Team Leaders
A Technical Manager or Team-leader (depending upon the size and/or importance of the team and the organization's structure and culture) may be needed for each of the technical teams or departments. The role will:
- Take overall responsibility for leadership, control and decision-making for the technical team or department
- Provide technical knowledge and leadership in the specific technical areas covered by the team or department
- Ensure necessary technical training, awareness and experience levels are maintained within the team or department
- Report to senior management on all technical issues relevant to their area of responsibility
- Perform line-management for all team or department members.
6.6.2.2 Technical Analysts/Architects
This term refers to any staff member in Technical Management who performs the activities listed in paragraph 6.3.3, excluding the daily operational actions, which are performed by Operators in either Technical or IT Operations Management. Based on the list of generic activities in paragraph 6.3.3, the role of Technical Analysts and Architects includes:
- Working with users, sponsors, Application Management and all other stakeholders to determine their evolving needs
- Working with Application Management and other areas in Technical Management to determine the highest level of system requirements required to meet the requirements within budget and technology constraints
- Defining and maintaining knowledge about how systems are related and ensuring that dependencies are understood and managed accordingly
- Performing cost-benefit analyses to determine the most appropriate means to meet the stated requirements
- Developing Operational Models that will ensure optimal use of resources and the appropriate level of performance
- Ensuring that the infrastructure is configured to be effectively managed given the organization's technology architecture, available skills and tools
- Ensuring the consistent and reliable performance of the infrastructure to deliver the required level of service to the business
- Defining all tasks required to manage the infrastructure and ensuring that these tasks are performed appropriately
- Input into the design of configuration data required to manage and track the application effectively.
The ways in which Technical Management can be organized, and the options available, are discussed in some detail in section 6.7.
6.6.2.3 Technical Operator
This term is used to refer to any staff who performs day-to-day operational tasks in Technical Management. Usually, these tasks are delegated to a dedicated IT Operations team, and this role is therefore discussed in paragraph 6.6.3.4 on IT Operators.
6.6.3 IT Operations Management Roles
The following roles are needed in the IT Operations Management area:
6.6.3.1 IT Operations Manager
An IT Operations Manager will be needed to take overall responsibility for all of the IT Operations Management activities, which include:
- Operations Control, which oversees the execution and monitoring of the operational activities in the IT Infrastructure. This can be done with the assistance of an Operations Bridge or Network Operations Centre. In addition to executing routine tasks from all technical areas, Operations Control also performs the following specific tasks:
- Console Management, which refers to defining central observation and monitoring capability and then using those consoles to exercise monitoring and control activities
- Job Scheduling, or the management of routine batch jobs or scripts
- Backup and Restore on behalf of all Technical and Application Management teams or departments and often on behalf of users
- Print and Output management for the collation and distribution of all centralized printing or electronic output.
- Facilities Management, which refers to the management of the physical IT environment, typically a Data Centre or computer rooms and recovery sites together with all the power and cooling equipment. Facilities Management also includes the coordination of large-scale consolidation projects, e.g. data centre consolidation or server consolidation projects. In some cases the management of a Data Centre is outsourced, in which case Facilities Management refers to the management of the outsourcing contract.
The role of the IT Operations Manager is to:
- Provide overall leadership, control and decision-making and take responsibility for the IT Operations Management teams or departments
- Report to senior management on all IT Operations issues
- Perform line-management for all IT Operations team or department managers/supervisors.
6.6.3.2 Shift Leaders
Many IT Operations areas will work extended hours - on either a two- or three-shift basis. In such cases a shift leader will be needed on each of the shifts, to perform the following activities:
- Take overall responsibility for leadership, control and decision-making during the shift period
- Ensure that all operational activities are satisfactorily performed within agreed timescales and in accordance with company policies and procedures
- Liaise with the other shift leader(s) to ensure handover, continuity and consistency between the shifts
- Act as line-manager for all Operations Analysts on his/her shift
- Assume overall health and safety, and security responsibility for the shift (unless specifically designated to other staff members).
6.6.3.3 IT Operations Analysts
IT Operations Analysts are senior IT Operations staff who are able to determine the most effective and efficient way to conduct a series of operations, usually in high-volume, diverse environments.
This role is normally performed as part of Technical Management, but large organizations may find that the volume and diversity of operational activities requires some more in-depth planning and execution. Examples include Job Scheduling and the definition of a Backup strategy and schedule.
6.6.3.4 IT Operators
IT Operators are the staff who perform the day-to-day operational activities that are defined by Technical or Application Management and, in some cases, by IT Operations Analysts. Typical Operator roles include:
- Performing backups
- Console operations, i.e. monitoring the status of specific systems, job queues, etc. and providing first level intervention if appropriate
- Managing print devices, restocking with paper, toner, etc.
- Ensuring that batch jobs, archiving, etc. are performed
- Running scheduled housekeeping jobs, such as database maintenance, file clean-up, etc.
- Burning images for distribution and installation on new servers, desktops or laptops
- Physical installation of standard equipment in the Data Centre.
6.6.4 Application Management Roles
6.6.4.1 Applications Managers/Team Leaders
An Applications Manager or Team-leader should be considered for each of the applications teams or departments. The role will:
- Take overall responsibility for leadership, control and decision-making for the applications team or department
- Provide technical knowledge and leadership in the specific applications support activities covered by the team or department
- Ensure necessary technical training, awareness and experience levels are maintained within the team or department relevant to the applications being supported and processes being used
- Maintain ongoing communication with users and customers regarding application performance and evolving requirements of the business
- Report to senior management on all issues relevant to the applications being supported
- Perform line-management for all team or department members.
6.6.4.2 Applications Analyst/Architect
Application Analysts and Architects are responsible for matching requirements to application specifications. Specific activities include:
- Working with users, sponsors and all other stakeholders to determine their evolving needs
- Working with Technical Management to determine the highest level of system requirements required to meet the requirements within budget and technology constraints
- Performing cost-benefit analyses to determine the most appropriate means to meet the stated requirement
- Developing Operational Models that will ensure optimal use of resources and the appropriate level of performance
- Ensuring that applications are designed to be effectively managed given the organization's technology architecture, available skills and tools
- Developing and maintaining standards for application sizing, performance modelling, etc.
- Generating a set of acceptance test requirements, together with the designers, test engineers and the user, which determine that all of the high-level requirements have been met, both functional and with regard to manageability
- Input into the design of configuration data required to manage and track the application effectively.
An appropriate number of Application Analysts will be needed for each of the Application Management teams or departments to perform the activities described elsewhere in this publication, primarily in paragraph 6.5.5.
The ways in which Application Management groups can be organized, and the options available, are discussed in some detail in section 6.7.
6.6.5 Event Management Roles
It is unusual for an organization to appoint an 'Event Manager', as events tend to occur in multiple contexts and for many different reasons. However, it is important that Event Management procedures are coordinated to prevent duplication of effort and tools. The roles of the Service Operation functions in Event Management are as follows.
6.6.5.1 The Role of the Service Desk
The Service Desk is not typically involved in Event Management as such, unless an event requires some response that is within the scope of the Service Desk's defined activity, for example notifying a user that a report is ready. Generally, though, this type of activity is performed by the Operations Bridge, unless the Service Desk and Operations Bridge have been combined.
The investigation and resolution of events that have been identified as being Incidents will initially be undertaken by the Service Desk and then escalated to the appropriate Service Operation team(s).
The Service Desk is also responsible for communicating information about this type of incident to the relevant Technical or Application Management team and, where appropriate, the user.
6.6.5.2 The Role of Technical and Application Management
Technical and Application Management plays several important roles as follows:
- During Service Design, they will participate in the instrumentation of the service, classify events, update correlation engines and ensure that any auto responses are defined
- During Service Transition they will test the service to ensure that events are properly generated and that the defined responses are appropriate
- During Service Operation these teams will typically perform Event Management for the systems under their control. It is unusual for teams to have a dedicated person to manage Event Management, but each manager or team leader will ensure that the appropriate procedures are defined and executed according to the process and policy requirements
- Technical and Application Management will also be involved in dealing with incidents and problems related to events
- If Event Management activities are delegated to the Service Desk or IT Operations Management, Technical and Application Management must ensure that the staff are adequately trained and that they have access to the appropriate tools to enable them to perform these tasks.
6.6.5.3 The Role of IT Operations Management
Where IT Operations is separated from Technical or Application Management, it is common for Event Monitoring and first-line response to be delegated to IT Operations Management. Operators for each area will be tasked with monitoring events, responding as required, or ensuring that Incidents are created as appropriate. The instructions for how to do so must be included in the SOPs for those teams.
Event Monitoring is commonly delegated to the Operations Bridge where it exists. The Operations Bridge can initiate and coordinate, or even perform, the responses required by the service, or provide first-level support for those events which generate an incident.
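As an illustration only, the SOP instructions described above could be captured as a lookup from event type to the required response. The following Python sketch assumes three hypothetical event types and assumes that any event not covered by the SOP results in an Incident being created; neither the names nor that default come from this publication.

```python
from enum import Enum


class Action(Enum):
    LOG_ONLY = "log only"
    OPERATOR_RESPONSE = "operator response"
    RAISE_INCIDENT = "raise incident"


# Hypothetical extract from an SOP: event type -> required response.
SOP_ACTIONS = {
    "backup_completed": Action.LOG_ONLY,
    "disk_usage_warning": Action.OPERATOR_RESPONSE,
    "server_unreachable": Action.RAISE_INCIDENT,
}


def handle_event(event_type, sop_actions=SOP_ACTIONS):
    """Return the response the SOP prescribes for an event type.

    Assumption: events that the SOP does not cover are raised as Incidents
    so that they are not silently ignored.
    """
    return sop_actions.get(event_type, Action.RAISE_INCIDENT)
```

Keeping the mapping in one place reflects the point that Operators should act on documented instructions rather than on individual judgement.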
6.6.6 Incident Management Roles
The following roles are needed for the Incident Management process.
6.6.6.1 Incident Manager
An Incident Manager has the responsibility for:
- Driving the efficiency and effectiveness of the Incident Management process
- Producing management information
- Managing the work of incident support staff (first- and second-line)
- Monitoring the effectiveness of Incident Management and making recommendations for improvement
- Developing and maintaining the Incident Management systems
- Managing Major Incidents
- Developing and maintaining the Incident Management process and procedures.
In many organizations the role of Incident Manager is assigned to the Service Desk Supervisor - though in larger organizations with high volumes a separate role may be necessary. In either case it is important that the Incident Manager is given the authority to manage incidents effectively through first, second and third line.
6.6.6.2 First-Line
This is covered in detail under the Service Desk (section 6.2) and will not be repeated here.
6.6.6.3 Second-Line
Many organizations will choose to have a second-line support group, made up of staff with greater (though still general) technical skills than the Service Desk - and with additional time to devote to incident diagnosis and resolution without interference from telephone interruptions.
Such a group can handle many of the less complicated incidents, leaving more specialist (third-line) support groups to concentrate on dealing with more deep-rooted incidents and/or new developments etc.
Where a second-line group is used, there are often advantages in locating this group close to the Service Desk to aid good communications and to ease movement of staff between the groups, which may be helpful for training/awareness and during busy periods or staff shortages. A second-line support manager (or supervisor if just a small group) will normally head this group.
It is conceivable that this group may be outsourced - and this is more likely and practical if the Service Desk itself has been outsourced.
6.6.6.4 Third-Line
Third-line support will be provided by a number of internal technical groups and/or third-party suppliers/maintainers. The list will vary from organization to organization but is likely to include:
- Network Support
- Voice Support (if separate)
- Server Support
- Desktop Support
- Application Management - there are likely to be separate teams for different applications or application types, some of which may be external suppliers/maintainers. In many cases the same team will be responsible for Application Development as well as support, and it is therefore important that resources are prioritized so that support is given adequate prominence
- Database Support
- Hardware Maintenance Engineers
- Environmental Equipment Maintainers/Suppliers.
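The progression from first-line through second-line to third-line support can be sketched as an ordered search for the first group able to resolve an incident. The following Python example is a minimal sketch under stated assumptions: the group names follow the lists above, but the incident categories and the mapping of categories to groups are hypothetical.

```python
# Support lines in escalation order: (name, incident categories it can resolve).
# The categories and their assignment to groups are hypothetical examples.
SUPPORT_LINES = [
    ("First-line (Service Desk)", {"password reset"}),
    ("Second-line support", {"password reset", "desktop configuration"}),
    ("Third-line (Network Support)", {"network fault"}),
    ("Third-line (Database Support)", {"database fault"}),
]


def first_capable_line(category, support_lines=SUPPORT_LINES):
    """Return the first group, in escalation order, able to resolve the category."""
    for name, categories in support_lines:
        if category in categories:
            return name
    # No listed group covers the category; in practice this might mean
    # referral to a third-party supplier or maintainer.
    return None
```

Because the list is searched in order, an incident is handled at the lowest line that can resolve it, leaving the more specialist groups free for the more deep-rooted incidents.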
6.6.7 Request Fulfilment Roles
Initial handling of Service Requests will be undertaken by the Service Desk and Incident Management staff.
Eventual fulfilment of the request will be undertaken by the appropriate Service Operation team(s) or departments and/or by external suppliers, as appropriate. Often, Facilities Management, Procurement and other business areas aid in the fulfilment of the Service Request. In most cases there will be no need for additional roles or posts to be created.
In exceptional cases where a very high number of Service Requests are handled, or where the requests are of critical importance to the organization, it may be appropriate to have one or more of the Incident Management team dedicated to handling and managing Service Requests.
6.6.8 Problem Management Roles
The following roles are needed for the Problem Management process.
6.6.8.1 Problem Manager
There should be a designated person (or, in larger organizations, a team) responsible for Problem Management. Smaller organizations may not be able to justify a full-time resource for this role, and it can be combined with other roles in such cases, but it is essential that it is not just left to technical resources to perform. There needs to be a single point of coordination and an owner of the Problem Management process. This role will coordinate all Problem Management activities and will have specific responsibility for:
- Liaison with all problem resolution groups to ensure swift resolution of problems within SLA targets
- Ownership and protection of the KEDB
- Gatekeeper for the inclusion of all Known Errors and management of search algorithms
- Formal closure of all Problem Records
- Liaison with suppliers, contractors, etc. to ensure that third parties fulfil their contractual obligations, especially with regard to resolving problems and providing problem-related information and data
- Arranging, running, documenting and all follow-up activities relating to Major Problem Reviews.
6.6.8.2 Problem-Solving Groups
The actual solving of problems is likely to be undertaken by one or more technical support groups and/or suppliers or support contractors - under the coordination of the Problem Manager.
Where an individual problem is serious enough to warrant it, a dedicated problem management team should be formed to work together in overcoming that particular problem. The Problem Manager has a role to play in making sure that the correct number and level of resources is available in the team and for escalation and communication up the management chain of all organizations concerned.
6.6.9 Access Management Roles
Since Access Management is an execution of Security and Availability Management, these two areas will be responsible for defining the appropriate roles. It is unusual for an organization to appoint an 'Access Manager', although it is important that there is a single Access Management process and a single set of policies related to managing rights and access. This process and the related policies are likely to be defined and maintained by Information Security Management and executed by the various Service Operation functions. Their activities can be summarized as follows.
6.6.9.1 The Role of the Service Desk
The Service Desk is typically used as a means to request access to a service. This is normally done using a Service Request. The Service Desk will validate the request by checking that the request has been approved at the appropriate level of authority, that the user is a legitimate employee, contractor or customer and that they qualify for access.
Once it has performed these checks (usually by accessing the relevant databases and Service Level Management documents) it will pass the request to the appropriate team to provide access. It is quite common for the Service Desk to be delegated responsibility for providing access for simple services during the call.
The Service Desk will also be responsible for communicating with the user to ensure that they know when access has been granted and to ensure that they receive any other required support.
The Service Desk is also well situated to detect and report incidents related to access, for example users attempting to access services without authority, or users reporting incidents that indicate that a system or service has been used inappropriately, e.g. by a former employee who used an old username to gain access and make unauthorized changes.
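The three validation checks that the Service Desk performs on an access request (approval at the appropriate level, legitimacy of the user, and qualification for access) can be sketched as follows. This Python example is illustrative only: the record fields and the three lookup structures are assumptions standing in for whatever databases and Service Level Management documents an organization actually consults.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class AccessRequest:
    """Hypothetical access request record; field names are illustrative."""
    user_id: str
    service: str
    approved_by: Optional[str]  # None if no approval has been recorded


def validate_access_request(request, approvers, active_users, entitlements):
    """Apply the three Service Desk checks and return (ok, reasons).

    approvers:    service -> set of people authorized to approve access to it
    active_users: set of current employees, contractors and customers
    entitlements: user -> set of services the user qualifies for
    """
    reasons = []
    # Check 1: the request was approved at the appropriate level of authority.
    if request.approved_by not in approvers.get(request.service, set()):
        reasons.append("not approved at the appropriate level of authority")
    # Check 2: the user is a legitimate employee, contractor or customer.
    if request.user_id not in active_users:
        reasons.append("user is not a current employee, contractor or customer")
    # Check 3: the user qualifies for access to the service.
    if request.service not in entitlements.get(request.user_id, set()):
        reasons.append("user does not qualify for access to this service")
    return (not reasons, reasons)
```

Returning the reasons alongside the result lets the Service Desk tell the user why a request was rejected, and a failed second check is the kind of access-related incident the Service Desk is well placed to report.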
6.6.9.2 The Role of Technical and Application Management
Technical and Application Management play several important roles as follows:
- During Service Design, they will ensure that mechanisms are created to simplify and control Access Management on each service that is designed. They will also specify ways in which abuse of rights can be detected and stopped
- During Service Transition they will test the service to ensure that access can be granted, controlled and prevented as designed
- During Service Operation these teams will typically perform Access Management for the systems under their control. It is unusual for teams to have a dedicated person to manage Access Management, but each manager or team leader will ensure that the appropriate procedures are defined and executed according to the process and policy requirements
- Technical and Application Management will also be involved in dealing with Incidents and Problems related to Access Management
- If Access Management activities are delegated to the Service Desk or IT Operations Management, Technical and Application Management must ensure that the staff are adequately trained and that they have access to the appropriate tools to enable them to perform these tasks.
6.6.9.3 The Role of IT Operations Management
Where IT Operations is separated from Technical or Application Management, it is common for operational Access Management tasks to be delegated to IT Operations Management. Operators for each area will be tasked with providing or revoking access to key systems or resources. The circumstances under which they may do so, and the instructions for how to do so, must be included in the SOPs for those teams.
The Operations Bridge, if it exists, can be used to monitor events related to Access Management and can even provide first-line support and coordination in the resolution of those events where appropriate.
6.7 Service Operation Organization Structures
Some general information has already been provided about organizational considerations for each function (see paragraphs 6.2.3, 6.3.4 and 6.5.6). This section considers some specific organizational structures for all functions. There are a number of ways of organizing Service Operation functions, and each organization will have to make its own decisions, based upon its scale, geography, culture and business environment. Some options are discussed in the rest of this section.
6.7.1 Organization By Technical Specialization
In this type of organization, departments are created according to technology and the skills and activities needed to manage that technology. IT Operations will follow the structure of the Technical and Application Management departments. The implication of this is that IT Operations is geared toward the operational agendas of the Technical and Application Management departments.
This structure can work well, provided that these groups are fully represented in the Service Design, Testing and Improvement processes, which will ensure that their agendas are aligned with the requirements of the business.
This structure also assumes that all Technical and Application Management departments have clearly distinguished between their Management activity and operations activity. It also requires that they have standardized these operational activities so that they can be effectively managed by the IT Operations Manager without undue interference from the Technical and Application Management teams or departments.
An example of an IT Operations organization structure based on technical expertise is given in Figure 6.7.
The advantages of this type of organizational structure include:
- It is easier to set internal performance objectives since all staff in a single department have a similar set of tasks on a similar technology
- Individual devices, systems or platforms can be managed more effectively since people with the appropriate skills are dedicated to manage these and measured according to their performance
- Managing training programmes is easier since skill sets are clearly defined and separated into specific groups.
The disadvantages of this type of organizational structure include the following:
- When people are divided into separate departments the priorities of their own group tend to override the priorities of other departments. An example of this is when departments refuse to accept ownership of an incident, each one blaming the other while the business continues to be disrupted.
- Knowledge about the infrastructure and the relationships between components is fragmented and difficult to collect. Individual groups tend to collect and maintain only the data that is required to support their own function, and do not give access to it very easily.
- Each technology managed by a group is seen as a separate entity. This becomes a problem on systems that consist of components managed by different teams, e.g. an application, managed by the Application Management team, runs on a server managed by the Server Management department, using a network segment managed by the Local Area Networking department. If a change is made by one team or department without consulting the others, this could be disastrous for the service.
- It is more difficult to understand the impact of a single department's poor performance on the IT Service since there are many different groups contributing to the same service, each with its own set of performance objectives.
- It is more difficult to track overall IT Service performance since each group is being measured on an individual basis.
- Coordinating Change Assessments and Schedules is more difficult since many different departments have to provide input for each change.
- Work requiring knowledge of multiple technologies is difficult since most resources are only trained for and concerned with the management of a single technology. Projects therefore have to include cross-training, which is time-consuming and expensive.
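The cross-department dependency described in the disadvantages above (an application on a server on a network segment, each owned by a different department) can be sketched as a small lookup. This is only an illustration under stated assumptions: the department names come from the example in the text, while the component names, the dependency chain and the function departments_to_consult are hypothetical and not part of any published method.

```python
# Hypothetical ownership map: component -> owning department.
# Department names follow the example in the text; component names are invented.
SERVICE_COMPONENTS = {
    "application": "Application Management",
    "server": "Server Management",
    "network segment": "Local Area Networking",
}

# Assumed dependency chain: each component lists what it runs on or uses.
DEPENDS_ON = {
    "application": ["server"],
    "server": ["network segment"],
    "network segment": [],
}


def departments_to_consult(changed_component):
    """Return the departments (other than the owner) whose components
    depend, directly or indirectly, on the changed component."""
    owner = SERVICE_COMPONENTS[changed_component]
    affected = set()
    seen = {changed_component}
    stack = [changed_component]
    while stack:
        current = stack.pop()
        # Find every component that depends on the one being examined.
        for component, deps in DEPENDS_ON.items():
            if current in deps and component not in seen:
                seen.add(component)
                stack.append(component)
                affected.add(SERVICE_COMPONENTS[component])
    affected.discard(owner)
    return sorted(affected)


if __name__ == "__main__":
    print(departments_to_consult("network segment"))
```

In this sketch a change to the network segment returns both the Server Management and Application Management departments, which is the consultation that the technically specialized structure makes easy to overlook.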
Figure 6.7 IT Operations organized according to technical specialization (sample)
6.7.2 Organization by Activity
This type of organization structure focuses on the fact that similar activities have to be performed on all technologies in the organization. This means that people who perform similar activities, regardless of the technology, should be grouped together, although within each department there may be teams focusing on a specific technology, application, etc.
In this type of organization, there is no clear differentiation between the different Technical and Application Management areas. Similar activities from many different areas can be grouped into a single department.
Examples of departments that have been set up to perform a specific set of activities across multiple technologies include:
- Maintenance (this implies that one team will coordinate and perform all maintenance across all technologies)
- Contract Management or Third Party Management
- Monitoring and Control
- Operations Bridge
- Network Operations Centre
- Operations Strategy and Planning (which, as part of the Service Design processes, normally defines the standards to be used in IT Operations) - this department can set strategy or standards for every type of Technical and Application Management area.
The Operations Strategy and Planning department is used to illustrate this type of structure in Figure 6.8.
The advantages of this type of organizational structure include the following:
- It is easier to manage groups of related activities since all the people involved in these activities report to the same manager
- Measurement of teams or departments is based more on output than on isolated activities. This helps to build higher levels of assurance that a service can be delivered.
The disadvantages of this type of organizational structure include the following:
- Resources with similar skills may be duplicated across different functions, which results in higher costs
- Although measurement is more output-based, it is still focused on the performance of internal activities rather than driven by the experience of the customer or end user.
Figure 6.8 A department based on executing a set of activities
6.7.3 Organizing To Manage Processes
It is not a good idea to structure the whole organization according to processes. Processes are used to overcome the 'silo effect' of departments, not to create silos. However, there are a number of processes that will need a dedicated organization structure to support and manage them. For example, it will be very difficult for Financial Management to be successful without a dedicated Finance department - even if that department consists of a small number of staff.
In process-based organizations people are organized into groups or departments that perform or manage a specific process. This is similar to the activity-based structure, except that its departments focus on end-to-end sets of activities rather than on one individual type of activity.
It should be noted that this type of organization structure should only be used if IT Operations Management is responsible for more than just IT Operations. In some organizations, for example, IT Operations is responsible for defining SLAs and negotiating Underpinning Contracts (UCs).
In addition, processes specifically exist to link the activities of different groups to achieve a specific outcome. Using processes as the basis to create departments can defeat the purpose of having processes in the first place. Process-based departments are really only effective when they are able to coordinate the execution of the process through the entire organization.
This means that process-based departments should only be considered if IT Operations Management is to play the role of Process Owner for a specific process.
Examples of process-based groups or departments include:
- Capacity Operations
- Availability Monitoring and Control
- IT Financial Management
- Security Administration
- Asset and Configuration Management (including equipment installation and deployment).
The advantages of this organizational structure include the following:
- Processes are easier to define
- There is less role conflict as job descriptions and process role descriptions are the same. In other structures a single job description will typically include activities for several roles
- Metrics of team or department performance and process performance are the same, effectively aligning 'internal' and 'external' metrics.
The disadvantages of this organizational structure include the following:
- A basic principle of processes is that they are a means of linking the activities of various departments and groups. By using processes as a basis for organizational design, additional processes need to be defined to ensure that the departments work together.
- Even if a department is responsible for executing a process, there will still be external dependencies. Groups may not view process activities outside of their own process as being important, resulting in processes that cannot be fully executed because dependencies cannot be met.
- While some aspects of a process can be centralized, there will always be a number of activities that will have to be performed by other groups. The relationship between the dedicated team or department and the people performing the decentralized activities is often difficult to define and manage.
6.7.4 Organizing IT Operations by Geography
IT Operations can be physically distributed and in some cases each location needs to be organized according to its own particular context.
This structure is typically used in the following circumstances:
- Data Centres are geographically distributed
- Different regions or countries have different technologies or provide a different set of services
- There are different business models or organizational structures in the different regions, i.e. the business is decentralized by geography and each Business Unit is fairly autonomous
- Different legislation applies to different countries or regions (e.g. safety regulations)
- Different standards apply to different countries or regions
- Cultural or language differences exist between staff managing IT.
An example of this type of structure is given in Figure 6.9. Note that in this example each geographical department is structured internally using Technical Specialization. This could be different in each region. For example, one region may be structured in this way, while another region uses a process- or activity-based structure.
Figure 6.9 also illustrates that one location could perform centralized operations for all regions if they are similar enough. In this example, the American Server Operations Department manages all server operations in all locations, Brussels manages all database operations and Singapore manages all storage operations.
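The centralized arrangement described for Figure 6.9 amounts to a routing rule: some types of operation are handled by one named location for all regions. A minimal sketch follows. The three assignments are taken from the text; the function name responsible_location and the fallback rule (any operation not listed stays with the local site) are assumptions made for illustration only.

```python
# Centralized operations as described for Figure 6.9.
CENTRALIZED_OPERATIONS = {
    "server": "America",
    "database": "Brussels",
    "storage": "Singapore",
}


def responsible_location(operation, local_site):
    """Return the location that performs the given type of operation.

    Assumption: operations that are not centralized are performed
    by the site where the work arises.
    """
    return CENTRALIZED_OPERATIONS.get(operation, local_site)


if __name__ == "__main__":
    print(responsible_location("database", "Singapore"))
    print(responsible_location("network", "Singapore"))
```

Under these assumptions a database task raised in Singapore is routed to Brussels, while a network task raised in Singapore stays in Singapore.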
The advantages of this type of organizational structure include the following:
- Organization structure can be customized to meet local conditions
- IT Operations can be customized to meet differing levels of IT service from region to region.
The disadvantages of this type of organizational structure include the following:
- Reporting lines and authority structures can be confusing. For example, does Network Operations report into the local Data Centre Manager or to a centralized Network Operations Manager?
- Operational standards are difficult to impose, which leads to inconsistent and duplicated activities and tools. This reduces economies of scale, which in turn increases the overall cost of operations.
- Duplication of roles, activities, tools and facilities across multiple locations could be very costly.
- Shared services, such as e-mail, are more difficult to deliver as each regional organization operates differently.
- Communication with customers and inside IT will be more difficult as they are not co-located and it may be difficult for staff in one location to understand the priorities of customers or staff in another location.
Figure 6.9 IT Operations organized according to geography
6.7.5 Hybrid Organization Structures
It is unlikely that IT Operations Management will be structured using only one type of organization structure. Most organizations use technical specialization, with some additional activity- or process-based departments.
The type of structure used and the exact combination of technical specialization, activity-based and process-based departments will depend on a number of organizational variables.
Organizational structure variables
The exact criteria chosen and the resulting organizational structure will depend on a number of variables, which may include:
- The nature of the business
- Business requirements and expectations
- The technological and technical architecture
- The stability of the current IT Infrastructure and the availability of skills to manage it
- The governance of the organization (i.e. the way in which authority is assigned and decisions are made - as well as any formal governance framework that is used, such as COBIT or SOX)
- The legislative, political and socio-economic environment of the organization
- The type and level of skills available to the organization
- The size, age and maturity of the organization
- The management style of the organization
- Dependence on IT for business-critical activities, processes and functions
- The way in which IT participates in the value network (i.e. the way IT interacts with the business and its partners, suppliers and customers)
- The relationship between IT and its vendors.
For a more complete description of how these factors influence organizational design, please refer to the 'Organizational Development' section of the Service Strategy publication.
Figure 6.10 Centralized IT Operations, Technical and Application Management structure
6.7.5.1 Combined Functions
One last type of organization should be discussed. This structure incorporates IT Operations, Technical and Application Management departments into a single structure. This is sometimes the case where all groups are co-located in a single data centre. Here, the Data Centre Manager takes responsibility for all Technical, Application and IT Operations Management.
This type of organization structure is illustrated in Figure 6.10.
In this structure, IT Operations Management is responsible for the Technical and Application Management functions, which in turn are responsible for managing their own operational activities. Each department is able to delegate some of these activities to the Operations Control department.
The advantages of this organization structure are:
- There is greater consistency and control between the more tactical and more operational Technical Management activities
- It is easier to enforce the performance standards and technical architectures that are created in Service Design, since the people who were involved in design are managing the activities of the people who are executing those activities
- As there is no duplication across locations or activities, this structure is often more cost-effective.
The disadvantage of this organization structure is:
- The scope of this structure makes it very difficult to manage effectively in large organizations or in organizations with multiple Data Centres.
6.7.5.2 Organizing Application and Technical Management
Technical and Application Management organizations tend to be fairly straightforward. As stated in paragraphs 6.3.4 and 6.5.6, Technical Management departments are usually based on the technology they manage and Application Management departments on the applications and sets of applications they manage.
However, there are some alternative organization structures and variations, which are discussed in this section.
6.7.5.3 Geography
In organizations with multiple locations, it is common for the Technical and Application Management departments to be represented in each physical location. However, this does not mean that each location will have all the same departments, or that they are all responsible for the same actions.
As support and management tools mature, more and more IT Infrastructure and application CIs can be managed remotely. This means that each department can have a strong, centralized Technical or Application Management team, with local members to provide specialized, on-site activities or support.
For example, in Server Management, the central team will help to create standards for server configuration, monitor and control remote devices, perform backups, perform Operating System upgrades, etc. The local teams will provide basic on-site support, hardware maintenance and repair, and configuration and installation of new servers.
In Application Management, the central team could participate in the ongoing design and testing of the application, carry out monitoring and control, and perform backups, data integrity checks, etc. The local team could provide on-site support and education to end users and work with the local Technical Management team to resolve more complex problems involving local equipment.
There is one potential issue that needs to be resolved, however: who the local team reports to. In some organizations they report to the manager of the centralized team. This has the added advantage of consistent performance and management across the whole enterprise.
In other organizations the local teams report to the most senior IT Manager at that site. This has the added advantage that IT Services can be customized to meet local conditions, but it creates a lot of confusion about who the local teams should take direction from.
The advantages of this type of organizational structure include the following:
- Organization structure can be customized to meet local conditions
- Technical and Application Management can be customized to meet differing levels of IT service from region to region.
The disadvantages of this type of organizational structure include the following:
- Reporting lines and authority structures can be confusing
- Standards are difficult to impose, which leads to inconsistent and duplicated activities and tools. This reduces economies of scale, which in turn increases the overall cost of operations
- Duplication of roles, activities, tools and facilities across multiple locations could be very costly.
6.7.5.4 Combined Technical And Application Management Structure
Some organizations organize their Technical and Application Management functions according to systems. This means that each department will consist of application specialists and IT Infrastructure technical specialists, all geared towards managing the services based on that set of systems. Components that are shared across all these systems, such as the network, will be managed by dedicated Technical Management departments.
The advantage of this organization structure is:
- It is easier to produce high-quality output to the end user because all department members are focused on the success of the system as a whole, rather than the performance of an individual technology component or application.
The disadvantages of this organization structure are:
- Duplication of skills and resources across several departments will increase the cost of the organization. For example, each group is likely to have an individual or team dedicated to managing servers - each of which will be doing very similar tasks.
- Communication between staff who are managing similar technology is reduced. This reduces the amount of learning by experience and increases reliance on collaborative knowledge management tools.
- When people with similar skills are in the same department, the department can compensate for members with lower skill and competency levels. This safety net is lost in a system-based department: when there is only one person with Server Management skills, and their competency is minimal, it will affect the performance of the entire department.