Tools for Monitoring Complex Applications across the Enterprise
Author: David Junkin, Senior Enterprise Architect
Customers demand consistent, quick response times. Unfortunately, physical assets like storage or networking can fail. And even when best practices are followed, software can ship with bugs. To minimize the impact of these issues, monitoring tools allow organizations to quickly identify and react to them; the most prized capability is being able to predict and prevent outages.
Choosing the right tools is imperative. Most layers in an enterprise stack (networking, middleware, server and database) have their own monitors. While these are adequate for a specific layer, solving a problem often requires people from many departments, each with their own monitoring tools, to work together. An alternative is for a few people on a centralized team to learn many disparate tools. Neither approach is optimal. As a result, there has been a growth of tools that span the enterprise. These advanced tools can not only pinpoint where an issue is, but also show how the transaction traversed the various layers. Often it is hard to know whether the items flagged by a monitoring tool are the cause of the issue or the result. This multi-layered capability shows how an issue on one layer can lead to issues on another, resulting in quicker identification of the cause.
When choosing an enterprise monitoring tool, it is important to know the tool’s core approach: top down or bottom up. The top down approach takes events and relates them to business processes. Examples are "logging in" or "clicking on an update button" in a particular application. Some tools make the user define these, which takes extra time and risks missing key steps. Others are more liberal, automatically pulling in everything and allowing you to group activity by the user actions they detect. Tools that are good at this top down approach are useful in finding chronic issues, such as network latency, poorly designed SQL and improper architecture. A long history of events may be key in identifying the problem. Another benefit of this approach is that after an issue is resolved, determining the actual impact to the business is easier. Often internal and external stakeholders will ask for details around the scope and duration of any outage. Without a top down tool this can be difficult and time consuming.
The bottom up approach is useful for dealing with acute issues, where a system can go from running smoothly to a halt in minutes. Examples would be a disk failure, a bad network card or backed up messages in a queue. Rarely is there time to drill down through business processes to determine the issue. Bottom up solutions look at key measures like thread counts, memory thresholds and queue sizes and automatically warn of an issue. They can often let you know of an issue before the users are aware of it. As with the top down approach, some tools make you define these measures, which can be time consuming. Tools that apply these rules automatically can save time and make sure items aren’t missed.
Finally, look at a tool's ability to export data to external data stores, such as a Hadoop ecosystem, where you can run elastic-type searches, mine data, and find relationships and correlations that may not be visible to the naked eye.
Choosing the right tool is key to being able to predict issues, react quickly, identify business impact and prevent future issues. When working with vendors, ask them how they would identify both chronic and acute issues. Also ask about the configuration time required when setting up the monitoring. Ask about exporting data and any solutions they have for finding correlations. These steps will help you get the most out of the money you are spending and result in happier customers.
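The bottom up checks described above can be sketched in a few lines. This is only an illustration of the idea, not how any particular monitoring product works; the metric names, the threshold values and the `check_metrics` function are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Threshold:
    """Upper limits for one metric (hypothetical names and values)."""
    metric: str
    warn_at: float
    critical_at: float


# Assumed default rules; a real tool would ship or learn its own.
DEFAULT_RULES = (
    Threshold("thread_count", warn_at=400, critical_at=500),
    Threshold("memory_used_pct", warn_at=80.0, critical_at=95.0),
    Threshold("queue_depth", warn_at=1_000, critical_at=5_000),
)


def check_metrics(sample, rules=DEFAULT_RULES):
    """Return (metric, severity, value) for every rule the sample breaches."""
    alerts = []
    for rule in rules:
        value = sample.get(rule.metric)
        if value is None:
            continue  # metric not reported in this sample
        if value >= rule.critical_at:
            alerts.append((rule.metric, "critical", value))
        elif value >= rule.warn_at:
            alerts.append((rule.metric, "warning", value))
    return alerts


if __name__ == "__main__":
    sample = {"thread_count": 120, "memory_used_pct": 83.5, "queue_depth": 7_200}
    for metric, severity, value in check_metrics(sample):
        print(f"{severity.upper()}: {metric} = {value}")
```

Shipping a default rule set is what separates the tools that "apply these rules automatically" from those that make you define every threshold yourself.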
MANAGING BUSINESS PROCESS MANAGEMENT (BPM) THROUGH A DYNAMIC DECISION MAKING MODEL
Author: Vijay Yarkala, Architect
Business Process Management (BPM) is about automating individual tasks or business processes and providing workflow management. It can then be expanded with IT infrastructure to respond to business needs. Companies can recognize, fix and address business issues as they occur; they will be better equipped to adapt to change. BPM is the technology that enables organizations to adjust and prosper.
Business users need to re-examine business processes by focusing on the core competency of BPM and process automation. A dynamic decision model enables business users to do more with streamlining their key business processes. BPM has changed from simple process automation and management into a more complex, infrastructure-type commitment that requires resources and research while addressing issues.
To construct a dynamic decision making model, people, processes and data need to be considered. People define processes to present data to the users. Data can be spread across enterprise systems (including legacy systems). Decision-based business rules need to be defined to configure processes for changing business needs. Rules are stored in a central place and applied across the company. This gives the ability to adapt to variations with minimal coding changes.
Most BPM solutions integrate with content management systems to fetch documents and images. Defined processes capture analytics for user actions, such as what kind of data has been requested at a specific point and the location to which the information is being retrieved. Once more analytics are captured, data and analytics can be combined to define dynamic decision rules that respond to changing business needs in real time. These analytics need to be directly integrated with application processes and should generate proactive alerts. This helps in responding to potential problems with service level agreements before they occur.
BPM can increase the agility and automation of critical business processes. It is critical to understand process behavior over time to better handle ongoing application support and future optimization. We need to identify constraints and dependencies and define the process accordingly for future adjustments. Predictive analysis needs to be performed on whether a specific process will meet its objective within the defined timeline. This type of process analytics helps in anticipating problems before they occur, enabling better human and automated decision-making.
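As a rough illustration of predicting whether a process will meet its objective within the defined timeline, the sketch below extrapolates linearly from progress so far. The function names and the linear assumption are hypothetical; real process analytics would likely draw on historical step durations rather than a straight-line projection.

```python
def projected_finish_hours(steps_done, steps_total, elapsed_hours):
    """Linearly extrapolate total duration from progress so far (assumption)."""
    if steps_done <= 0:
        raise ValueError("need at least one completed step to project")
    return elapsed_hours * steps_total / steps_done


def will_meet_deadline(steps_done, steps_total, elapsed_hours, deadline_hours):
    """True if the projected total duration fits within the deadline."""
    projected = projected_finish_hours(steps_done, steps_total, elapsed_hours)
    return projected <= deadline_hours


if __name__ == "__main__":
    # Hypothetical process instance: 3 of 6 steps done after 12 hours.
    if not will_meet_deadline(3, 6, 12.0, deadline_hours=20.0):
        print("Proactive alert: process is projected to miss its timeline")
```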
IS YOUR DATA MODELING MAXIMIZING YOUR PRODUCTIVITY?
Author: Glenn Child, Senior Data Modeler Architect
Our company purchased a data modeling tool to improve productivity and quality. Like most tools in the market, it enforces standards, validates input, compares models/databases, and generates reports and Data Definition Language (DDL). We recognize the value the tool provides. But we discovered that by utilizing the supplied Application Programming Interface (API), we could improve productivity and quality even more. Using Microsoft Access and the API, we created our Data Modeling Toolbox.
In order to make best use of the tool, we realized additional features should be added to the Data Modeling Toolbox. The existing reports were numerous, but we wanted one that supplied the developer with the table/entity, attribute/column information, all relationships and indexes. We were able to create the report we wanted, including the ability to easily select the metadata that is displayed in the report. We also included the option to place the data in the report into a well formatted spreadsheet. This has made meeting our customers' reporting needs flexible and efficient.
Our data modeling tool has all the bells and whistles for entering information in its screens, but that's a lot of clicking and typing. Our requirements are supplied to us mainly via spreadsheets. So, we developed a program in our Data Modeling Toolbox to read the spreadsheet and insert the metadata into the data model for new tables or additional columns on an existing table. Once that is done, the data modeler draws the relationship lines and manually makes updates or deletes to existing columns. This eliminates errors caused by manual typing, and the time savings are significant.
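The spreadsheet-reading step might look something like the sketch below. It is written in Python for illustration rather than the Microsoft Access environment the Toolbox actually uses, and the column headings (`table`, `column`, `datatype`, `definition`) are assumptions about the spreadsheet layout. The call that would push each entry into the data model through the tool's API is deliberately left out, since that API is vendor-specific.

```python
import csv
import io
from collections import defaultdict


def load_column_specs(csv_text):
    """Group spreadsheet rows into {table_name: [column spec, ...]}.

    Expects a CSV export with the (assumed) headings:
    table, column, datatype, definition.
    """
    tables = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        tables[row["table"].strip().upper()].append(
            {
                "column": row["column"].strip().upper(),
                "datatype": row["datatype"].strip().upper(),
                "definition": row["definition"].strip(),
            }
        )
    return dict(tables)


if __name__ == "__main__":
    export = (
        "table,column,datatype,definition\n"
        "member,member_id,integer,Unique member key\n"
        "member,last_name,varchar(40),Member surname\n"
    )
    for table, columns in load_column_specs(export).items():
        for spec in columns:
            # A real toolbox would call the modeling tool's API here.
            print(table, spec["column"], spec["datatype"])
```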
The data modeling tool does have a spell check for each object or an option to export selected data to a spreadsheet, spell check the spreadsheet and import the metadata back. All of this is time intensive so we programmed our Data Modeling Toolbox to read in the metadata of interest from the data model, invoke Microsoft Office Spell Check, make the changes and put the changes back into the data model. Within a few minutes all entity/attribute names and definitions can be spell checked.
Our Data Modeling Toolbox has many other functions incorporated that help us do a better job. There is effort needed to learn how to use the API and program the functions desired. But the improvements to quality and productivity are worth the effort and allow us to fulfill the data modeling function with a relatively small staff.
There are many data modeling tools available to enterprise architects. However, it may be required to provide added functionality in order to maximize productivity.
DB2 DBMS AVAILABILITY, RECOVERABILITY, AND PERFORMANCE
Author: Jon Capelli, Senior Enterprise Architect
Availability, recoverability, and performance are key factors that must be considered and planned for in any DBMS (Database Management System). Each one is critical by itself, but database administrators are tasked with ensuring that all three are an integral part of database design. It doesn't suffice to have databases be recoverable if you can't access them, or if the performance of retrieving critical data is below customer expectations. How many times have you closed your internet browser because the response was too slow and gone to another website? It is frustrating for customers when this happens.
Availability refers to having the database and all the tables, indexes, and views associated with that database available to the user as much as possible. In today's world, where information is available on every device from cell phones to personal computers and everything in between, this presents some challenges from a database administration standpoint. Over time, tables within databases get fragmented from heavy workloads of updating, deleting and adding data. This fragmentation can hinder performance, so it's necessary to reorganize the data. Reorganization is accomplished by running database utilities against the database. In addition to that workload, there is a need to change the tables to accommodate new business. Over the years DB2 has grown into a DBMS that can support these demands. With every release of DB2, the product becomes more robust in allowing online changes and reorganizations to repair fragmentation.
Recoverability of data is another critical piece of database administration. If we lose the ability to recover data, we lose the confidence of our customers and eventually their business. There are two different flavors of recoverability: disaster recovery (DR) and local recovery. Disaster recovery refers to a situation where the data center and all the data stored there are no longer available due to a catastrophic event. Not just the data, but all the system components need to be backed up offsite in the event of a true disaster. Local recovery refers to data loss due to a hardware failure or application software issues. Years ago, hardware failure was the more common reason for recovery from local data loss. Now, it's usually associated with an application software issue. The good news with the latter is that most of this type of recovery can be accomplished without business interruption. DBAs can access logs to back out data changes that were in error or provide information to assist application areas with the correction of data.
Performance, including the review of access paths, is the final key component of DBMS design. Now that we have discussed the availability and recoverability of data, it's equally important that the data can be accessed and processed in a timely manner to meet the business needs of our customers. SQL (Structured Query Language) and DML (Data Manipulation Language) are used to interface with a relational database to retrieve and manipulate data. In order to satisfy the business needs of today, it's essential that access to this data performs like a finely tuned engine.
Availability, recoverability, and performance are all equally important on any DBMS platform; it's hard to say that one is more important than the other. Every new release of DB2 has been making these three key factors more robust and easier to maintain and achieve.
HADOOP DATA LAKES CAN BE SHARK-INFESTED WATERS
Author: Steve Swartzlander, Lead Architect
Many of today's enterprises are embarking on "big data" projects to realize the benefits of various types of analytics such as predictive modeling, machine learning, natural-language processing, and others. A key part of such efforts includes gathering volumes of data into a large distributed repository often called a "data lake" which is frequently hosted in a Hadoop cluster.
Such projects come with hazards lurking below the surface. Building a small Hadoop cluster in a lab-like environment for testing and experimentation is rather simple. Building one for a large enterprise that must be secure, stable, and highly available is another matter, particularly if one is in a highly-regulated industry that deals with sensitive data.
It is necessary to accept a higher-than-usual level of uncertainty when building out a Hadoop environment. Change is rapid across the ecosystem. New Hadoop-based projects and components are constantly being introduced. Most of them are open source. A production Hadoop cluster will often include key components that are still Apache incubator projects at a 0.x release level.
Not all answers will be found in a vendor's documentation. That's true of most things in IT, of course, but it is magnified with Hadoop. Online communities are critical, as is a certain willingness to dive into code, poke through obscure files, and experiment.
Security can be a particularly interesting challenge with Hadoop. Its authentication mechanisms use Kerberos, which can be implemented in a couple of major ways. Engaging the enterprise's IT security team early is important. They may be unfamiliar with the details of Kerberos and need to gain new skills themselves to be effective partners in the project.
A less technical challenge is third-party vendors. Word will get out that the enterprise is starting a big-data effort, and vendors will swarm. They will have both software and appliances with a big-data angle that promise to add value. Be wary of these until the intrinsic capabilities of the base components are well understood relative to project requirements. A particular area of concern is compatibility. Each third-party product added will complicate administration, may introduce constraints on upgrading to new versions of Hadoop components, and may also have limitations in regard to operation with secured clusters.
The bottom line is that going alone into a data lake project is inadvisable. Find a partner who's been there. Consult with those who have already navigated through the barriers. Build something small first, and learn as much as possible before committing to an enterprise-class implementation. Expect to iterate through architectural options before finding a combination that works well for your environment.
A data lake project can be a very rewarding and value-creating effort. Go in with eyes open, a willingness to adapt, a desire to learn, and some expert assistance, and the benefits can be significant.
TECHNICAL DEBT,...IT'S YOUR PROBLEM TOO
Author: Doug Riley, Sr. Architect
Debt is something most of us understand in a monetary sense.
In today's "everything was due yesterday" world, much technical debt is incurred by taking short cuts. Why? Because everything was due yesterday, yesterday was Sunday, and there was a great football game on TV. Monday the project manager stops by (or stands by if you are Agile) and helps you out by suggesting that everything is fine if you just cut a couple of corners here and there. I mean, really, testing (for example) is a waste of time. If there are problems in the code, the users will find them eventually and you can fix them then. One way of looking at technical debt is that you promised to deliver something, but what you delivered is less than what you promised. The difference between the two is a form of technical debt. Another common form of technical debt arises when older or existing technology is used because it is cheaper, already exists, or is easier to write. This is technical debt consciously created by development decisions. In an increasingly cost conscious world, technical debt is sometimes introduced when less experienced developers are given tasks that should have been given to more experienced (and expensive) developers. The result can be debt in the form of poorly written code.
However, there are almost as many ways to incur technical debt as there are stars in the sky. One example that comes to mind is the "killer application". These come along every few years and revolutionize the way computer code/systems/projects are accomplished. They have been amazing. They doubled or tripled our productivity, changed whole paradigms, made the rest of the world stand up and notice. The Borland C development environment, Lotus Notes, PowerBuilder, dBase IV, Java, just to name a few. Now many of them would be found on lists of technical debt. It is the way the world works.
The steam locomotive was an amazing piece of engineering. I live across from two sets of railroad tracks. I haven't seen any steam locomotives. But, I have seen plenty of PowerBuilder (PB) GUI screens. I am old enough to remember the first demo I saw for PB. When the power behind the product hit me it was monumental. This was a game changer. Soon knowing PB was a useful and marketable skill. Major systems were written with PB front ends. There was soon a bevy of competing products. The client server revolution was underway. I picked PB as an example because I remember its beginnings. I also remember why it is no longer the huge force it was. Does the phrase "fat client" ring a bell? Too much work on the client level that can be done better by an application server. A fat client produces code that runs on the client operating system. A Unix front end and a Windows front end had to use different code even if they looked the same. Then came the crushing blow; web browser front ends and Java. Write once run everywhere became the order of the day. There is nothing particularly bad about PB. I used it merely as an example of the kind of technical debt that started out as a revolutionary idea and now, while it still works fine, is rarely chosen for new projects. The skill pool is shrinking and many of the former PB gurus have moved on.
Remember! Only you can prevent technical debt. Well... OK, you can't really prevent it either. But make it an important part of your projects to try to dodge debt caused by short cuts and other avoidable ways of introducing technical debt into your environments. Plan for elimination of technical debt when it is discovered or knowingly created. As for those killer apps that ended up running out of steam, replace them as soon as it is practical. Everything has a cost. Keeping technical debt around might have a larger cost than you think.
INFORMATION TECHNOLOGY TRENDS IMPACTING HEALTH CARE
Author: Tim Barnickel, Lead Enterprise Architect
Information technology continues to evolve at an increasing rate. Four of the major trends currently underway are addressed in this white paper: Artificial Intelligence, Cloud Computing, Advanced User Experience/Interfaces, and Big Data and Advanced Analytics.
All of these trends currently impact, or will impact, health care from both provider and payer perspectives. It is important to note that there is significant synergy among these technologies; they are often used together to achieve an optimal solution.
BUSINESS RULES: "TO EMBED OR NOT TO EMBED"
Author: David Nowka, Senior EA Architect
Business rules are key in making logic decisions in business processes and applications. Rules provide a consistent answer based on the variable input passed to them. An example would be the need to brand a product (logo) based on the audience using the process or application. The business rule would have a set of input data supplied to it, and after running the data through the ruleset, it would supply back an answer of what brand to use.
Most times that same ruleset could be used elsewhere in the enterprise for branding. What happens quite often is that the developer ends up embedding the business rules in the business process or application. This does not allow for reuse and makes it difficult to change the rules without having to redeploy the business process. As a consequence, there ends up being more than one way to determine branding, as each business process or application has its own ruleset and doesn’t necessarily produce a consistent result across the enterprise. Embedding is fine if we have "one off" rules that will never be used by any other business process or application and are rarely changed. An example would be simple if-then logic that uses the same variable. If we determine that the business rule is changed often and/or will be needed in more than one place, exposing those rules as a Decision Service separate from the business process or application allows for reusability and easier maintainability: change it once, fix it everywhere.
Using a Business Rules Management System (BRMS) to implement a decision service allows the business rules to be encapsulated and exposed in the service layer. This allows for consumption by any business process or application that needs them, and gives us reusability and consistency. This also makes it possible to change the business rules as often as needed without having to redeploy any business process or business application that consumes them.
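To make the contrast with embedded rules concrete, here is a minimal sketch of the branding example as a standalone decision function. It is not a BRMS; a plain Python function stands in for the service, and the rule names, input fields (`channel`, `region`) and brand values are all hypothetical. The point is only that callers pass in facts and get a decision back, without knowing the rules.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Rule:
    name: str
    applies: Callable[[dict], bool]
    brand: str


# Hypothetical ruleset, evaluated top to bottom; the first match wins.
BRANDING_RULES = (
    Rule("partner-portal", lambda facts: facts.get("channel") == "partner", "PartnerBrand"),
    Rule("west-region", lambda facts: facts.get("region") == "west", "WestBrand"),
)
DEFAULT_BRAND = "CorporateBrand"


def decide_brand(facts, rules=BRANDING_RULES):
    """Return the brand to use and the name of the rule that decided it."""
    for rule in rules:
        if rule.applies(facts):
            return {"brand": rule.brand, "rule": rule.name}
    return {"brand": DEFAULT_BRAND, "rule": "default"}


if __name__ == "__main__":
    # Any process or application calls the same function with its own facts.
    print(decide_brand({"channel": "web", "region": "west"}))
    print(decide_brand({"channel": "partner", "region": "west"}))
```

Because the ruleset lives in one place, adding or reordering a rule changes the answer for every consumer at once, which is the reuse and consistency argument made above.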
How do I identify business rules that can be made into a decision service? Looking at your business processes and applications, there are always decision points within them that drive what happens next. Often the decision points will consist of all the business rules needed to make a decision. So a decision service is then a reusable implementation of one or more decision points. Identifying those decision points is important and will help in planning for rules discovery, analysis and implementation of a decision service. Business rules that return a consistent answer are essential to making decisions within business processes and applications. Separating that logic into a decision service may be a different way of thinking for most solution architects or developers but is vital when designing business processes or applications if we want to get the most reusability and ease of adaptability to keep up with our ever evolving business.
UTILIZING PROACTIVE, PREVENTATIVE, AND PREDICTIVE (3P'S) PROCESSES TO REDUCE DISRUPTION IN A DIGITAL WORLD
Author: Sam June, Director, IT
Consumers of IT services in today's digital world have high expectations; they expect reliable, available, and quality services. Many companies are consuming Infrastructure as a Service (IaaS), Software as a Service (SaaS), APIs, and microservices to provide business capabilities to their consumers. When there are failures, warnings or disruptions in these services, there is high demand to alert customers. All of this is occurring in real time with the expectation of highly accurate information. Dashboards, text alerts, message board postings, and automated error handling routines can be effective methods of delivering information. The challenge in delivering this information is capturing all the technical data available and translating it into business terminology that the customer can understand, so they, in turn, can inform their customers effectively. Simply put, "What does this mean to me and what is the impact to my business and partners?"
In order to meet the challenge of communicating information accurately and in a timely manner while reducing failures and disruptions, there should be a focus on three processes. I use the terms proactive, preventative and predictive (3P's) to describe this concept.
Proactive means knowing about a service issue before the customer does and informing them of status, impacts and workarounds in language that is not "technical jargon." A key to providing timely information is the integration of multiple monitoring solutions, alerts, robotics and error messages that map to configuration management databases (CMDB), which correlate and translate to business systems (order entry, claims, billing). Publish and Subscribe tools can be implemented to provide consistent and timely messages to consumers and customers. Common proactive measures that can take action and reduce service disruption are processes that trigger specific error messages or events.
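The mapping from a technical alert to a business-language message can be sketched as a pair of lookups. Everything here is hypothetical: the component names, the CMDB extract, the alert types and the message templates are invented for illustration, and a real CMDB would be queried rather than hard-coded.

```python
# Hypothetical CMDB extract: technical component -> business system it supports.
CMDB = {
    "mq-broker-02": "claims",
    "db-orders-01": "order entry",
    "app-billing-03": "billing",
}

# Hypothetical plain-language templates keyed by alert type.
MESSAGES = {
    "queue_backlog": "{system} requests are being processed more slowly than usual.",
    "host_down": "{system} is currently unavailable.",
}


def to_business_message(alert):
    """Translate a technical alert into customer-facing wording, or None."""
    system = CMDB.get(alert["component"])
    template = MESSAGES.get(alert["type"])
    if system is None or template is None:
        return None  # unmapped alerts are left for a person to review
    return template.format(system=system.capitalize())


if __name__ == "__main__":
    alert = {"component": "mq-broker-02", "type": "queue_backlog"}
    print(to_business_message(alert))
```

A publish and subscribe tool would then deliver the returned message to whoever has subscribed to that business system.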
Preventative concepts are meant to reduce and/or avoid disruptions. Preventative examples include: continuous process improvement programs, automation methods that fix before fail, elimination of single points of failure, chaos engineering principles, and shift left initiatives.
The predictive process method is the highest maturity level of the 3P's and the most active market trend capability. Emerging tools and processes such as predictive analytics, artificial intelligence, and machine learning can be leveraged to manage the mounds of log data and generated alerts. These tools can attempt to avoid disruption, reduce labor, forecast results, and provide conclusions. Think of it as an early warning system that foresees actions to be taken before an inevitable failure or disruption.
The 3P's concept is just as much a cultural shift as it is processes and tools. Service desks, IT Operations and Maintenance areas prioritize efforts with proactive initiatives. IT Architects, IT Engineers and App Developers focus on preventative and predictive initiatives. 3P's outcomes have the potential to be large contributors to any quality or customer communication strategy. The right communications, in conjunction with the 3P's, will help achieve the goal of reduced disruptions in the digital world.
HIGHLY AVAILABLE INFRASTRUCTURE
Author: John McDowell, Lead Enterprise Architect
When you pick up the phone, you want a dial tone every time. Business areas expect the same thing from an IT Infrastructure operation. When companies use hosted services, they expect them to work consistently. With a modern IT Infrastructure management environment, there are a variety of methods to keep systems operating continuously. These range from Uninterruptible Power Supplies (UPS) to more obscure features, such as the recovery time of modern data routing protocols.
As IT Infrastructure specialists, we need to build a highly available environment in a layered approach. It starts from the ground up with the selection of physical space. One needs to start by choosing a data center location that is not easily impacted by local events; the building shouldn't be sitting by a river that floods every spring. Next, layer on redundant power feeds combined with UPSs that will dual feed into every cabinet. The cabinets themselves need to be arranged in a manner consistent with proper air flow, keeping in mind that detailed heating/cooling plans are needed. If components start failing in August from overheating, then it is time to rethink the cooling plan.
Once the basic facilities are planned out, you move up the stack. The selection of telecommunications partners is critical to the success of any business. If the wrong partner is selected or the design/implementation of the services is done poorly, your business will be cut off at inopportune times. Losing communications and having service downtime will undermine credibility and is a strong driver for clients to look elsewhere. Choosing between multiple partners and a single source is an important decision. On one side you potentially avoid a vendor-wide outage; on the other you have to keep multiple services with varied implementations running optimally. A multi-vendor plan assumes diverse paths of entry; otherwise a single telephone pole event could take them both out at once.
From here, the decisions don't get any easier. The data networking team needs to select and implement the right routing protocols. If they don't understand your needs, the applications could flounder while the network is rebuilding itself after a minor event. The storage teams have significant decisions to make around the type of work required. Does the business need to do a lot of real-time transactions, historical analysis, fast intake of mass data, or immediate duplication? Understanding these requirements helps the storage team strike a balance between high-cost chip-speed memory and long-term storage, as well as decide on the placement of systems. They also need to manage replication of data to ensure your systems recover quickly following a component outage. Compute (CPU) resource services face similar concerns about what the application needs and where it is needed.
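The single-source versus multi-vendor trade-off can be illustrated with standard availability arithmetic. When every component is required, availabilities multiply; when any one of several redundant components is enough, it is the failure probabilities that multiply. The sketch below assumes failures are independent, which is exactly the "diverse paths of entry" condition noted above, and the percentages are illustrative figures, not measurements.

```python
from math import prod


def serial(*availabilities):
    """Every component is required, so availabilities multiply."""
    return prod(availabilities)


def redundant(*availabilities):
    """Any one component is enough, so failure probabilities multiply."""
    return 1 - prod(1 - a for a in availabilities)


if __name__ == "__main__":
    facility = serial(0.999, 0.999)        # e.g. power and cooling (illustrative)
    one_carrier = 0.99                     # illustrative figure only
    two_carriers = redundant(0.99, 0.99)   # assumes diverse paths of entry

    print(f"single carrier: {serial(facility, one_carrier):.4%}")
    print(f"two carriers:   {serial(facility, two_carriers):.4%}")
```

If both carriers share the same telephone pole, the independence assumption fails and the redundant figure overstates what the business will actually see.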
Once the essential resources (memory, CPU, network, power) are finalized, you enter the realm of application High Availability. Thanks to the world of virtualization, most operating systems are really application resources that run on abstracted hardware resource pools. Virtual operating systems can reserve resources on any hardware pool they are allowed to reach and reattach themselves to the storage pools. The teams that manage these virtual operating systems need to understand the designs of all the resources they depend on so as not to mismatch availability plans.
These virtual operating systems already offer a default level of High Availability based on the designs of all the resource areas they utilize. A virtual OS may be able to recover itself in real time or near real time as it transitions from a failing hardware resource pool to a healthy pool. At the application level, one may also incorporate multiple systems to handle load, High Availability pairs, live-live services, geo-redundancy, cloud-based recovery options, etc. Just like the rest of the stack, if the application owners make improper assumptions about the resources they depend on and resource plans are not aligned, there could be an outage. Cross-team planning is paramount to having a fully realized highly available solution.
So how do we, as Infrastructure specialists, best aid a business function in terms of High Availability? We design our services from the ground up to provide the highest level of availability by building reliable (quality components) and resilient (well designed) systems. We work with our business areas to understand their specific needs and do not try to apply a one-size-fits-all approach to applications. Above all, we never take any component for granted; they all count.
BUILDING A RELIABLE AND RESILIENT IT ENVIRONMENT THROUGH CULTURE CHANGE
Author: David Howie, Enterprise Architecture Technology
The way of architecting business applications has gone through multiple transformations in the last 30 years, from a centralized computing platform with dumb terminals, to a distributed model with desktop applications. The transformation continued with HTML generated by a collaboration of server side applications and a Service Oriented Architecture (SOA). The current paradigm consists of dynamic HTML in combination with a server side compute model, a vast SOA deployment, and the latest in REST/JSON implementation patterns.
The IT infrastructure has continued to transform as well, with the proliferation of firewalls, load balancers, web servers, web application servers, messaging products, Enterprise Service Buses, specialized appliances, multiple Database Management Systems, and so on.
Business models have transformed as well. Customers and business partners expect 24x7 availability. Business users may be spread across multiple time zones or located on the other side of the planet. Perhaps there is a market for the internal business applications and your organization is now a provider of Software as a Service (SaaS). And aside from all of that, the expectations of the internal business users have changed; unplanned downtime cannot be tolerated.
So what do you do when you find challenges with avoiding unplanned downtime? A common reaction may be thinking that more or better technology is needed. Although there may be opportunities to improve the technology footprint, this thinking may fall short. The IT culture may very well be where the primary focus should lie.
Have your employees adapted to the changing demands of the IT world? When an outage occurs, are they going beyond simply restoring service, driving to true root cause, and deploying solutions to prevent repeat occurrences? Do they recognize that the solution to prevent a repeat is not always a technical solution, but may be a gap in procedures, or a gap in the expectation for team members to follow procedures? Do they strive to understand the collaboration between all the moving parts in the environment, and how a change in one area may affect other areas? Are they figuring out ways to load and stress test the IT services they provide to the organization, or are they assuming such testing will be covered by someone else's efforts? Do they trust that the redundancy works as advertised, or are they finding innovative ways to simulate unexpected events in order to validate that everything behaves as expected when a failover occurs? If they're not doing these things, then they may not be aware of expectations.
How do you change the culture? The first step is communication, and lots of it. You need to communicate the objectives and the business imperatives: "why things must change." You need to communicate expectations relative to the objectives, particularly the behavioral expectations necessary to achieve the desired outcomes. There needs to be support from the top down. All leaders must be on board, continually pushing the future state vision downwards through their organizations. The message needs to be repeated often. One way to engage individuals is to create a short, catchy name that everyone readily associates with the effort and that succinctly drives home the objectives.
Beyond communication, metrics are needed that are well-aligned with the objectives and easy to understand. The metrics need to be put in front of everyone on a regular basis as a continual reminder. If there is a wide divide between current state and desired end state, then the metric goals may need to be adjusted over time, making each adjustment somewhat of a stretch to achieve while remaining reasonably attainable. Don't underestimate the effort needed to routinely gather, format, and publish the metrics. You need to deploy the people, processes, and automation necessary to make the metric reporting sustainable.
When embarking on an effort to improve the reliability and resiliency of the IT environment and business applications, the right people, processes, and technology are all key elements of long-term, sustained success.
STAYING SAFE ONLINE
Author: Jeff Howe, Senior EA Architect
Every day it seems there are new reports of cybersecurity breaches and online account thefts. October is National Cybersecurity Awareness Month, and along with this event come numerous articles on how to remain safe online. Below are a few practices that can help protect online accounts from being compromised by attackers.
Password managers have become an essential tool to help create strong passwords and manage access to online accounts. Key features include browser and mobile app integration to allow automatic capture of login details and the ability to log in automatically. Multiple device and platform support allows login details to be managed and accessed from all connected devices.
Another safeguard promoted by cybersecurity experts is enablement of multi-factor authentication. This process typically involves designating a secondary email address or device that receives a notification with a code, which must be entered before a sign-in or password reset is permitted. Enabling multi-factor authentication where supported can reduce the risk of an attacker gaining access and taking control of online accounts.
On mobile devices, security features should be enabled which require additional authentication before changes can be made to device settings. Do not reuse the password or passcode that unlocks the device itself. Enable features which automatically lock devices after a specified time period in order to reduce the likelihood that your device is compromised without your knowledge.
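To make the code-entry step concrete, here is a minimal sketch of how a service might issue and check a one-time code. It is an illustration only: the function names, the six-digit format, and the five-minute validity window are assumptions, not details of any particular provider.

```python
import hmac
import secrets
import time

CODE_TTL_SECONDS = 300  # assumed validity window; real services choose their own


def issue_code(now=None):
    """Return (code, expires_at) for a fresh six-digit one-time code."""
    now = time.time() if now is None else now
    code = f"{secrets.randbelow(10**6):06d}"  # cryptographically random, zero-padded
    return code, now + CODE_TTL_SECONDS


def verify_code(submitted, issued, expires_at, now=None):
    """Accept the submitted code only if it matches and has not expired."""
    now = time.time() if now is None else now
    if now > expires_at:
        return False
    # compare_digest takes the same time whether or not the codes match
    return hmac.compare_digest(submitted.encode(), issued.encode())
```

The code is sent to the secondary email address or device, so an attacker who knows only the password cannot complete the step.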
Security questions used to authenticate a password reset are another area of concern. Such questions are meant to ask for information that only the account holder would know. However, answers to questions such as "What city were you born in?" or "What is your oldest sibling's first name?" can be discovered through internet searches or by reviewing social media profiles. Using the same thought process as with passwords, provide password-like answers rather than the actual answers. Use a password manager to store the questions and answers.
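As a rough sketch of what "password-like answers" could look like, the snippet below generates a random answer for each security question. The function names and the 20-character length are assumptions chosen for illustration; most password managers offer an equivalent generator.

```python
import secrets
import string

ALPHABET = string.ascii_letters + string.digits


def random_answer(length=20):
    """Build an answer from cryptographically random letters and digits."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


def build_answers(questions, length=20):
    """Map each security question to its own unrelated random answer."""
    return {question: random_answer(length) for question in questions}


answers = build_answers([
    "What city were you born in?",
    "What is your oldest sibling's first name?",
])
```

Because the answers have nothing to do with the real facts, they cannot be found through a search or a social media profile, which is why they need to be stored in the password manager.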
Reviewing passwords and enabling additional security features is time consuming. However, consider the time spent in preparing a strong defense worthwhile when compared to the time spent recovering from an attack which compromises online accounts.
CLOUDS OF MANY COLORS
Author: Thad Henry, Lead EA Architect
Moving applications and/or processes to the cloud is a dream of many businesses. This move can bring cost reduction, greater mobility, and scalability that can be hard to achieve with current on-premises ("on-prem") infrastructure solutions. Unfortunately, many of these dreams have turned into nightmares. Every cloud platform is shaded in many colors, and every business needs to understand that they are not all white and fluffy. There are ways to make this process less daunting, but it does require you to do some homework first.
When trying to decide what is needed, evaluating the current topology is a great first step. Every map has a legend, and using colors to separate one important piece from another is helpful for guidance. Taking a critical piece of business and moving it to the cloud can have direct or indirect effects, so gaining better insight into this topology and how the applications interact with one another is essential to a successful implementation. Start by identifying the systems that feed, or are consumed by, the piece that is moving. Nothing beats an electronic scan to see how things interact, but a scan captures only part of the overall picture. Architectural resources are a good way to identify pieces that are known to the environment but may not be picked up on a scan. Even these resources are never 100% accurate and need to be supported by other information. It is critical to know that even with an on-premises solution the colors on the map vary greatly.
In addition to identifying integration points, you also need to consider intangibles such as administration of data, user connectivity, business partner impacts, security impacts and other key pieces that make the application work. If you move away from an on-premises solution, how will these interact or work in the future? Keep in mind that moving to the cloud does not make these go away; they are just handled differently, and how they are handled can differ with each cloud vendor.
Now that you know what you want your piece of the cloud to be colored, understand what it is that separates one color from another. This process can be time consuming and requires dedication to the details. The differences between cloud platforms, even though they seem small, can have a large impact on your business. Items such as user administration are vastly different and sometimes require a lot of data from the existing on-premises solution. These do take time to work through and coordinate.
Evaluating on-premises versus cloud is a key step when making platform investment decisions. If this is your first move to the cloud, it is highly recommended that you find an experienced partner to help you with the journey. They will make the journey easier and more bearable now and in the future.
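One simple way to picture this homework is as a set of data flows between systems, combining what a scan finds with what the architects know. The sketch below is only an illustration; the system names and the helper function are hypothetical.

```python
def integration_points(edges, system):
    """Return (inbound, outbound) neighbours of `system` in a set of (source, target) flows."""
    inbound = sorted({src for src, dst in edges if dst == system})
    outbound = sorted({dst for src, dst in edges if src == system})
    return inbound, outbound


# Flows found by an automated scan (hypothetical system names).
scanned = {("crm", "billing"), ("billing", "ledger")}
# Flows known to architects but invisible to the scan, e.g. a nightly file drop.
documented = {("billing", "partner_file_drop")}

inbound, outbound = integration_points(scanned | documented, "billing")
missed_by_scan = documented - scanned
```

Here the scan alone would have missed the partner file drop, which is the kind of gap that architectural knowledge is meant to close.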
As Gilbert Chesterton said, "There are no rules of architecture for a castle in the clouds."
TRANSPARENCY FOR HEALTH PLANS: CONSUMER ENGAGEMENT
Author: Dan Hatfield, Enterprise Architecture
Significant changes in the U.S. healthcare system have led consumers to take greater responsibility for their healthcare. One of the challenges of engaging consumers to undertake this responsibility is the complexity of the health care ecosystem. In responding to the need for better transparency and engagement, providers have increased their focus on patient education. Payers and providers have spent time and resources ensuring that consumers have several means to access information, such as online content, search tools, options for providers, etc. Much of the focus has been on price and metrics, and while these changes are vital, they are not the whole story. Just as important is engaging with consumers in ways that encourage them to manage their own health and well-being.
Health plans have looked to connect directly with consumers through enrollment, care management and healthy living offerings. Consumers interact with a dizzying array of family doctors, urgent care providers, pharmacists and specialists. The health plan ecosystem is also complex because it includes tiered networks; pharmacy, vision, and dental benefits; government and employer programs; and high deductible and health savings plans. In order to engage effectively with consumers and provide superior health care and service in the future, providers and plans must bring together the information and interactions that are scattered across health plans and providers. The disparate information must be gathered in a timely manner so it can be utilized to drive consumer engagement at the appropriate times. By doing so, consumers will experience health care services that anticipate their needs and simplify their interactions with the complex health care system.
In order to achieve the next level of integrated consumer engagement, health plans and providers are investing in event collection and analytics systems. The goal is to bring these disparate interactions together as they occur throughout provider and health plan systems. Once gathered, these interactions are run through new analytic capabilities that can identify consumer engagement opportunities. Using these information system investments to develop relationships with consumers is critical as health plans and providers seek to deliver superior health care and service.
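The gather-then-analyze flow can be sketched in a few lines. Everything specific here is an assumption made for illustration: the event fields, the event types, and especially the outreach rule, which stands in for whatever analytics an organization actually uses.

```python
import heapq
from collections import defaultdict


def merge_events(*sources):
    """Merge per-system event lists, each already sorted by timestamp."""
    return list(heapq.merge(*sources, key=lambda event: event["ts"]))


def timelines(events):
    """Group merged events into one time-ordered list per consumer."""
    by_consumer = defaultdict(list)
    for event in events:
        by_consumer[event["consumer_id"]].append(event)
    return dict(by_consumer)


def needs_outreach(timeline):
    """Hypothetical rule: enrolled, but no care-management contact since."""
    enrolled_at = next((e["ts"] for e in timeline if e["type"] == "enrollment"), None)
    if enrolled_at is None:
        return False
    return not any(
        e["type"] == "care_contact" and e["ts"] >= enrolled_at for e in timeline
    )


plan_events = [
    {"ts": 1, "consumer_id": "c1", "type": "enrollment"},
    {"ts": 4, "consumer_id": "c2", "type": "enrollment"},
]
provider_events = [
    {"ts": 2, "consumer_id": "c1", "type": "office_visit"},
    {"ts": 5, "consumer_id": "c2", "type": "care_contact"},
]
merged = merge_events(plan_events, provider_events)
opportunities = sorted(c for c, t in timelines(merged).items() if needs_outreach(t))
```

The point of the sketch is the shape of the pipeline: events from separate systems become a single timeline per consumer before any rule is applied.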
TESTING WITH DE-IDENTIFIED PROTECTED HEALTH INFORMATION (PHI)
Author: Wade Donahue, Lead EA Architect
Data breaches involving protected health information (PHI) seem all too frequent. This has resulted not only in a greater focus on data protection by regulatory agencies but also in consumers demanding greater levels of assurance that their personal information is being protected. As a result, many organizations are making significant people, process, and technology investments to reduce the risk of a data breach. One way to mitigate risk is to test with production data that has been de-identified.
De-identification is a process of taking data fields that contain PHI and either removing them or replacing the contents with other realistic data. For example, phone numbers are replaced with fictitious phone numbers. The HIPAA Safe Harbor method has become a benchmark for the field types to be de-identified. These field types include name, address, phone numbers, Social Security Numbers, etc.
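A minimal sketch of the replacement approach is shown below. It covers only the four field types named above, not the full Safe Harbor list, and the field names, placeholder values, and helper functions are all assumptions for illustration rather than the behavior of any de-identification product.

```python
import random


def fake_phone(rng):
    # 555-0100 to 555-0199 is the block conventionally reserved for fictional use
    return f"555-01{rng.randrange(100):02d}"


def fake_ssn(rng):
    # Area number 000 is never issued, so the result cannot be a real SSN
    return f"000-{rng.randrange(1, 100):02d}-{rng.randrange(1, 10000):04d}"


def deidentify(record, rng):
    """Return a copy of the record with PHI fields replaced by realistic fakes."""
    replacements = {
        "name": lambda: f"Test Patient {rng.randrange(10000):04d}",
        "address": lambda: "100 Example Street",
        "phone": lambda: fake_phone(rng),
        "ssn": lambda: fake_ssn(rng),
    }
    out = dict(record)  # leave the source record untouched
    for field, make_value in replacements.items():
        if field in out:
            out[field] = make_value()
    return out


record = {"name": "Jane Doe", "phone": "412-555-9876",
          "ssn": "123-45-6789", "claim_amount": 125.50}
safe = deidentify(record, random.Random(7))
```

Non-identifying fields such as the claim amount pass through unchanged, so the de-identified record still behaves like production data in a test.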
Moving to testing with de-identified data starts with communicating that goal to all levels of the organization. Make sure to communicate early and often using a variety of media. Now that everyone understands the goal, you need a software solution that provides a variety of methods for de-identifying data. Unless you have very clean data, make sure any solution chosen has custom logic capabilities to enable conditional de-identification.
As you are looking at solutions for de-identification, you'll find many of them provide capabilities to perform scans of your data and identify candidate data fields for de-identification. For example, the scan will identify a continuous string of nine numbers as an SSN. This type of scan only gets you started, as the software doesn't know what custom fields you may have created that could identify an individual. This is where data owners and subject matter experts come into play, as they need to verify and augment the scan findings. By the way, did I mention communicating early and often? Let them know this is coming. The results of the verification process provide you with the information you need to configure the de-identification software.
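To illustrate what such a scan does, and why it only gets you started, here is a small sketch that flags columns whose values are mostly nine-digit strings. The function name, the row format, and the 80% threshold are assumptions; commercial tools use their own, richer detection rules.

```python
import re

NINE_DIGITS = re.compile(r"\d{9}")


def candidate_ssn_columns(rows, threshold=0.8):
    """Flag columns where most non-empty values are exactly nine digits."""
    matches, totals = {}, {}
    for row in rows:
        for column, value in row.items():
            if value is None or value == "":
                continue
            totals[column] = totals.get(column, 0) + 1
            if NINE_DIGITS.fullmatch(str(value)):
                matches[column] = matches.get(column, 0) + 1
    return sorted(
        column for column in totals
        if matches.get(column, 0) / totals[column] >= threshold
    )


rows = [
    {"member_ref": "123456789", "zip": "15222", "note": "call back"},
    {"member_ref": "987654321", "zip": "15090", "note": ""},
]
flagged = candidate_ssn_columns(rows)
```

The scan flags `member_ref` purely on its shape. Whether that column really holds SSNs, and whether a free-text column such as `note` can identify someone, is exactly what data owners and subject matter experts have to confirm.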
Now that we have the software and have it configured, we are ready to start de-identifying data. It is important to allocate sufficient time to test the data de-identification process and applications because you will have unanticipated data issues to deal with. Unless you have very mature and automated test practices, be sure to allocate time in your plan to address your test case inventory to align the scenarios with the newly created set of de-identified data.
Moving away from testing with production data to testing with de-identified data is not only a technical problem; it also requires a major culture shift for everyone involved in the development, testing and support of your applications. By using de-identified data for testing you have taken one step in reducing the risk of a data breach and meeting the expectations of regulatory agencies and healthcare consumers.
WEB APPLICATION SECURITY
Author: Vince Crose, Manager IT
The protection of personal data has always been a concern of consumers in the health industry. The prevalence of recent security breaches has brought further focus to this important topic. Consumers are demanding greater levels of security and assurances that their data is protected. This has brought about a greater focus on the "Principle of Least Privilege" - allow only access to information and resources that are necessary for legitimate purposes. What this really means is that users should only have the privilege and access to resources that are essential for them to complete their work.
In the Web application space, this has highlighted the importance of "walling off" (firewalling) applications at the technology infrastructure level to prevent malicious users from exploiting any existing or future software vulnerabilities. While the use of firewalls for general web network traffic has been around for a long time, there is now an increased focus on application-specific firewalls. Web Application Firewalls (WAF) take the "walling off" one step further by plugging into the application layer and providing the ability to define rulesets and the roles that can access specific applications.
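The idea of a ruleset tied to roles can be sketched as a small default-deny check. This is not how any particular WAF product is configured; the rule format, paths, and role names are hypothetical and serve only to show the Principle of Least Privilege applied per application.

```python
# Each rule names an application path and the roles allowed to reach it.
RULES = [
    {"path_prefix": "/claims", "roles": {"claims_processor", "auditor"}},
    {"path_prefix": "/admin", "roles": {"administrator"}},
]


def is_allowed(path, role, rules=RULES):
    """Allow a request only if a rule covers the path and lists the role."""
    for rule in rules:
        prefix = rule["path_prefix"]
        # Match the application root or anything beneath it, not look-alike paths.
        if path == prefix or path.startswith(prefix + "/"):
            return role in rule["roles"]
    return False  # default deny: paths without a rule are not reachable
```

For example, `is_allowed("/claims/123", "auditor")` passes, while the same auditor is refused at `/admin` and at any path that no rule covers.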
In short, the protection of consumer data is going to require greater levels of controls around how data is accessed. The Web Application Firewall (WAF) technologies provide this ability at the infrastructure level to carry the Principle of Least Privilege beyond user access to the applications and software layers.
DATA PROTECTION: THE CENTER OF EVERYTHING WE DO
Author: Nate Dell, Manager, IT Infrastructure and Security
Defining Data Protection strategies is critical in today's digital and data-driven business. A data protection strategy can be broken down into three distinct groups or functions. The first is tagging or labeling data based on specific data elements. Every organization may place different amounts of value on different data elements. For example, a financial institution may place a high value on bank account numbers or credit card numbers, whereas a healthcare organization may care about member ID numbers or patient information. What is important is that data elements are simple and discoverable in both the unstructured and structured world.
Tagging data is not a trivial task, especially in the unstructured world. For example, let's take a look at everyone's favorite nine-digit number, the Social Security Number (SSN). Simply creating a regular expression that searches all ASCII file types for a nine-digit number will result in a large number of false positives. This rule must be coupled with identifying key common words (SSN, Social, Social Security, Security Number, Membership ID number, etc.) within proximity of that nine-digit number, for example within 4 Excel columns or in the first third of the page of a Word document. Tagging this information in the structured format is inherently less complicated because the data is structured and typically labeled within a database. By first identifying a keyword list for each data element, one can simply use that list to search database column names. Note that data tagging is not a "set it and forget it" type of activity; the technology used to identify this information will need constant minor adjustments as people develop new business document templates or bring new databases online.
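Both cases can be sketched briefly. The first function pairs the nine-digit pattern with a keyword check in the surrounding text; the second searches column names with the same keyword list. The 40-character window is an assumed stand-in for "within proximity", and the function names and shortened keyword list are illustrative only.

```python
import re

# Nine digits that are not part of a longer run of digits.
NINE_DIGITS = re.compile(r"(?<!\d)\d{9}(?!\d)")
KEYWORDS = ("ssn", "social security", "security number")
WINDOW = 40  # assumed: characters of context checked on each side of a match


def likely_ssns(text):
    """Return nine-digit numbers that have an SSN keyword nearby."""
    hits = []
    for match in NINE_DIGITS.finditer(text):
        start = max(0, match.start() - WINDOW)
        context = text[start:match.end() + WINDOW].lower()
        if any(keyword in context for keyword in KEYWORDS):
            hits.append(match.group())
    return hits


def tagged_columns(column_names):
    """Return column names that contain a keyword (the structured case)."""
    tagged = []
    for name in column_names:
        normalized = name.lower().replace("_", " ")
        if any(keyword in normalized for keyword in KEYWORDS):
            tagged.append(name)
    return tagged
```

With the keyword check in place, "Order 123456789 shipped" is ignored while "Member SSN: 987654321 on file" is tagged. The keyword list and window are the parts that need ongoing adjustment as new templates and databases appear.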
Once the data has been tagged and labeled across the organization, a huge feat in itself, it must be classified into buckets. Note that some data elements, like SSN, can be classified in multiple categories (for example, regulated and confidential) and by default should follow the most rigorous control. This classification scheme will drive how technology is deployed. Data is protected in the following three distinct states: at rest, in motion, and in use.
Once the data is tagged and classified, the organization must create processes and technical controls for how the data will be protected. For example, let's follow a document that can be found in most health care organizations: a Word document tagged with the data elements SSN, Member ID, First/Last Name, Address, and medical claims information. Now think about the life cycle of this document and how it must be protected in all three states.
- Data at Rest - Any document that contains these data elements must be housed on an encrypted storage medium. This means that regardless of where this document is stored (network file shares, laptops/desktops, databases, USB thumb drives, etc.), all drives must be encrypted.
- Data in Motion - Any time this document is transmitted, it must be encrypted, and the transmission can only be performed by authorized personnel. This means authorization is needed before the document is transmitted to a new location. Once that first hurdle is passed, the transfer must be restricted to approved encrypted communication channels (encrypted email, IM, web uploads, secure file transfer protocols, etc.).
- Data in Use - Only authorized personnel can read this document, based on a business reason and only on authorized systems. This means that rigorous access controls and monitoring need to be set up so that at any point the question of who/when/why can be answered.
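The three states above can be sketched as three checks on a single document. This is a simplified illustration, not a reference design: the class, the channel names, and the log format are hypothetical, and the authorized-systems check is left out for brevity.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

APPROVED_CHANNELS = {"encrypted_email", "sftp"}  # hypothetical channel names


@dataclass
class ProtectedDocument:
    name: str
    authorized_users: set
    access_log: list = field(default_factory=list)

    def may_store(self, volume_encrypted: bool) -> bool:
        """Data at rest: only encrypted storage is acceptable."""
        return volume_encrypted

    def may_transmit(self, user: str, channel: str) -> bool:
        """Data in motion: authorized sender first, then an approved channel."""
        return user in self.authorized_users and channel in APPROVED_CHANNELS

    def read(self, user: str, reason: str) -> bool:
        """Data in use: record who/when/why for every attempt, allowed or not."""
        allowed = user in self.authorized_users and bool(reason.strip())
        self.access_log.append((user, datetime.now(timezone.utc), reason, allowed))
        return allowed


doc = ProtectedDocument("claims.docx", {"alice"})
```

Logging refused attempts as well as successful ones is a deliberate choice here, since the who/when/why question is often asked about access that should not have happened.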
This data protection strategy is a powerful framework and, although it sounds simple, it is one of the most complex strategies to institutionalize. Data protection impacts everyone and everything.
USING INNOVATIVE TECHNOLOGY TO ENABLE ADVANCED MEDICAL PROVIDER SEARCH
Author: Tim Barnickel, Lead Architect, Enterprise Architecture HM Health Solutions
Emerging innovative technology can now be applied to help consumers easily search for medical providers. A core aspect of this new advanced search engine technology is its support for "natural language like" and auto-suggest capabilities, similar to what end users are accustomed to with web search engines such as Google. A key requirement of this new approach, versus existing medical provider search capabilities, is that the consumer is not forced into a rigid and complex structured user interface. As an example of the new approach, a consumer can now input "foot doctor" into a general search box and the system will search for podiatrists, using advanced synonym capabilities, while also leveraging geospatial technology to further guide the search.
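The "foot doctor" example can be approximated with a synonym lookup followed by a distance filter. This is a deliberately plain stand-in and does not reflect how any commercial search engine works internally; the synonym table, provider records, and 25 km radius are all assumptions.

```python
import math

# Hypothetical synonym table mapping everyday phrases to specialties.
SYNONYMS = {"foot doctor": "podiatrist", "heart doctor": "cardiologist"}


def normalize_query(query):
    """Lowercase, collapse whitespace, then map a known phrase to its specialty."""
    key = " ".join(query.lower().split())
    return SYNONYMS.get(key, key)


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points."""
    radius = 6371.0  # mean Earth radius in km
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return 2 * radius * math.asin(math.sqrt(a))


def search(query, providers, lat, lon, radius_km=25.0):
    """Return names of matching providers within the radius, nearest first."""
    specialty = normalize_query(query)
    nearby = []
    for provider in providers:
        if provider["specialty"] != specialty:
            continue
        distance = haversine_km(lat, lon, provider["lat"], provider["lon"])
        if distance <= radius_km:
            nearby.append((distance, provider["name"]))
    return [name for _, name in sorted(nearby)]


providers = [
    {"name": "Provider A", "specialty": "podiatrist", "lat": 40.44, "lon": -80.00},
    {"name": "Provider B", "specialty": "podiatrist", "lat": 40.50, "lon": -80.00},
    {"name": "Provider C", "specialty": "cardiologist", "lat": 40.44, "lon": -80.00},
    {"name": "Provider D", "specialty": "podiatrist", "lat": 41.50, "lon": -80.00},
]
results = search("Foot  Doctor", providers, lat=40.44, lon=-80.00)
```

The consumer types an everyday phrase and never sees the specialty code or the distance calculation, which is the contrast with a rigid structured search form.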
To complement advanced search engine technology, user interface (UI) design and technology has also rapidly evolved to enable consumers to conduct their search across devices with a wide range of capabilities and form factors, from smartphones through desktop computers. To provide a high quality UI, the "mobile first" design technique can be leveraged to optimize the consumer experience on a smartphone, while also leveraging responsive web design to ensure that the UI is rendered appropriately based upon the device's form factor. Furthermore, mobile phone geolocation capabilities assist with proximity based search. Advanced "open source" UI frameworks continue to rapidly evolve with new capabilities to facilitate the implementation of the UI layer.
HM Health Solutions has recently built a new provider search capability that leverages IBM Watson search technology, part of IBM's emerging "cognitive computing" platform. The UI was built with the popular open source framework, AngularJS from Google. Moving forward, HMHS will continue to evaluate advances in natural language search and other innovative technologies to further enhance provider search capabilities for its customers.