We have “Information Overload” in Clinical Guidelines.

There is an increasing push for physicians to practice “Evidence Based Medicine”. However, the “evidence” is getting harder and harder to come by. The creation of “Guidelines” by expert bodies may be of little help.  There are simply too many of them.

Alvin Feinstein, MD (1925-2001) and David Sackett, MD (1934-2015) began to consider how evidence might be applied to clinical decision-making (Clinical Judgment) in the late 1960s and early 1980s, respectively. One of Sackett’s students introduced the term EBM to the medical profession in the 1990s. Sackett graduated from the University of Illinois Medical School and practiced Internal Medicine and Clinical Epidemiology at McMaster University in Hamilton, Ontario.

In the 1970s, the randomized clinical trial (RCT) came into favor as the preferred way to determine the efficacy of treatments. Since then, the types of papers published have changed markedly. No longer do we have many case reports, small studies, or “reviews of the literature”; most medical publications are now replete with RCTs of various sizes and complexity. The number of papers published in biomedicine has increased five- to six-fold from the approximately 50,000 a year published in 1970. Keeping this published information straight has become harder and harder, some might say almost impossible. Sometimes a series of RCTs or observational studies is combined into a “Meta-Analysis” (MA). One might say that an MA is simply a more structured “review of the literature,” with a somewhat stronger mathematical bent.

Until the 1970s, guidelines rarely existed. Some documents proposed schemes to help with diagnosis; these were initially the work of a single expert clinician[i]. Today, guidelines are frequently produced to help clinicians in the care of chronic diseases[ii] or for use in imaging and diagnostic testing[iii].

One of the very first sets of standards for the provision of one form of patient care was the “Standards for cardiopulmonary resuscitation (CPR) and emergency cardiac care (ECC),” published in 1974[iv]. In the next edition[v], the guidelines were specifically designated NOT to be a legal document and were NOT to be construed as evidence in legal proceedings.[vi]

Dr. Sackett always regarded the evidence base as one of several components of patient care. He once said that EBM had “… three arms: very good evidence, seen by a very good clinician and integrated with patients’ expectations.”[vii]

A review of data at the National Guideline Clearinghouse (NGCH), a service of the Agency for Healthcare Research and Quality, revealed over 2,400 unique guideline sets (GS). There are 111 unique agencies that each participated in the development of more than 5 GS. There is frequently significant overlap, with several organizations collaborating on several sets of guidelines. The table shows that there are MANY sets of guidelines for 7 chronic cardiac conditions. It also illustrates how rapidly the guideline numbers are changing:

Table: Some Cardiac Conditions with Guidelines at NGCH

                                     Number of Guidelines
Condition                      September ‘15   October ‘15   November ‘15
Hypertension                        442             468            486
Heart Failure                       369             515            517
Myocardial Infarction               201             230            230
Peripheral vascular disease         151             151            188
Atrial Fibrillation                 108             108            129
Angina                               78              78             86
Aortic Aneurysm                      53              53             59

It strains credulity to imagine that any single physician could evaluate 442 sets of guidelines for the evaluation and management of hypertension, or that any heart failure clinic could have evaluated and combined the recommendations of 369 sets of guidelines for heart failure.

Looking at Heart Failure alone, I found 486 sets of guidelines that relate to heart failure. Of those, 113 have been published or revised since 2000. Of those 113, 30 are primarily related to the diagnosis or treatment of Heart Failure or CHF. The remainder contain statements regarding how the presence of heart failure may modify the recommendations in that set of guidelines.

The American College of Radiology leads the list, having collaborated in the creation of 239 GS. The National Institute for Health and Care Excellence (NICE) in Great Britain was the next most prolific organization, publishing 211 sets. One hospital has developed 112 GS, which it calls “best evidence statements” or BESTs.

In an effort to be thorough, medical organizations and societies may have actually complicated the job of the clinician in his/her quest to remain up to date on what is “the most appropriate” strategy for diagnosis and treatment.

This leads to the question, “Why is this guideline group promulgating this set of guidelines?” It would appear that at least some groups are trying to give a “distinctive voice” to a unique subset of stakeholders in any specific clinical condition.

What are we to do with this “Tower of Babel” of guidelines?
The clinical system or clinician who wants to use summaries of available evidence to help with clinical judgment would be well served to ask whether each source of available guidelines is relatively unbiased and associated with a nationally recognized leader in the field[viii]. Governmental sources are frequently relatively unbiased[ix]. National medical associations[x] are generally considered reputable organizations with minimal biases, and recommendations published in the journals they sponsor should be well rounded. Finally, another filter to help find reliable guidelines is where they are published: guidelines published in well-recognized journals with a rigorous peer-review process are more likely to be reliable. Whether a provider group should accept a published series of guidelines or try to synthesize an analysis of others may be up to the group. However, in spite of the best intentions of the guideline-writing groups, payers, lawyers, and quality review organizations most often refer to a published set of guidelines. This makes trying to create an individual set of guidelines often counterproductive.

[i] The Jones Criteria for the Diagnosis of Rheumatic Fever, published in 1944, are one example. The problem with such schemes is often that there was no “gold standard.”
[ii] Hypertension, Heart Failure, Diabetes, Asthma are some conditions for which guidelines may be necessary.
[iii] ACC – Appropriate Use Criteria – the first set of AUCs appeared in 2005
[iv] JAMA, 1974, 227 Suppl: 833-868
[v] JAMA,1980, 244, 453-509
[vi] ibid p 505.
[vii] J Undergraduate Life Sciences, 2010, 45, 66-67
[viii] In cardiovascular disease some relatively unbiased sources might include the American College of Cardiology, American Heart Association, and the European Society of Cardiology
[ix] United States Public Health Service (USPHS), the National Institute for Health and Care Excellence (NICE) in the United Kingdom are examples
[x] British Medical Association, American Medical Association, Massachusetts Medical Society among others

Posted in CV, Guidelines, Policy, Quality

Helping our Patients and Ourselves Navigate the Internet for Reliable Health Information.

In June 2015, Dr. Arthur Caplan opined on Medscape that physicians should be prepared to help patients in some way as they try to navigate the morass of medical information available on the Internet[i]. One oft-quoted study from the Pew Research Center (2013) suggests that over half of US adults have looked for health information on the Internet and that up to 80% begin at a commercial search engine. What is less clear is how many people get their attention waylaid by some other web-based source of medical information, but it is likely a large number. Caplan suggests that at least several of these sources may have “Evil People” behind them.

There is generally some skepticism about information on the Internet, but this skepticism is likely not as prevalent as would be desirable. In 2012, the State Farm Insurance Company sponsored a TV ad that included this exchange:

“ ‘Where did you hear that?’; ‘The Internet’; ‘And you believed it?’; ‘Yeah, they can’t put anything on the internet that isn’t true.’; ‘Where did you hear that?’; ‘The internet’.”                                                                                      


If we are to help our patients navigate the Internet, it would be helpful to determine what is easily available. There are several search engines, or search engine groups. One website listed at least 12 search engines in addition to Google, which seems to be the de facto market leader, with upwards of 65% of searches beginning there.

To see what was available, I looked for health-related information in the top 4 commercially available search engines (Google, Bing (a Microsoft site), Yahoo, and Ask), using the initial search term “Health Information Websites.” Each search engine identified well over 180,000,000 sites (Yahoo claimed over 480,000,000).

Some sites that may be helpful in finding reliable information include two on evaluating websites:

  1. https://nnlm.gov/outreach/consumer/evalsite.html accessed 7/6/15.

This one-page site, with an easy-to-follow process map for evaluating a webpage, is from the National Network of Libraries of Medicine and may be among the most important sites available.

 2. http://caphis.mlanet.org/consumer/index.html accessed 7/6/15

This is the website of the Consumer and Patient Health Information Section of the Medical Library Association, Inc. (https://www.mlanet.org/), which has an interesting “User’s Guide” for finding and evaluating health information on the web (https://www.mlanet.org/resources/userguide.html).

The site also has a list of the “top 100 health websites you can trust,” which refers to NIH and other governmental websites as well as the websites of many of the major disease-specific professional organizations. This may be a good approach, but it would likely be frustrating to a person who has a simple question to answer.

 Evaluating Information on the Web:

In looking at a website to estimate reliability, most sources suggest first looking at the site itself for its “professionalism.” Secondly, one should ask whether there is an evident bias in the material, especially if the site is trying to sell something or asking for money. Thirdly, a searcher should ask whether the information being presented as fact is corroborated by other sites. Finally, if the information on the site is supported by references or hyperlinks, it is more likely to be reliable. Evaluating these four characteristics (professionalism, bias, corroboration across several sites, and provision of references) should allow a person looking for health information to find reliable information upon which to make decisions.

Perhaps the easiest site to use and find reliable information is the Website of the National Institutes of Health: http://health.nih.gov/

My interpretation, after spending time on the site, is that it is well organized and has a robust search window, as well as directions to topics of interest such as clinical trials, Medicare, and others. It appeared in the top 3 results on all of the search engines.

Another Governmental Site that has links to many useful concepts is:  https://www.nlm.nih.gov/medlineplus/

Other sites that were common to all the search engines were:

There are many more, often included in the websites of universities, insurance companies, or other large healthcare providers, but if we send our patients, friends, and other associates to these six sites, they will have a very good head start on their journey through the minefields of getting health information.



[i] Caplan, AL: Are Evil People Influencing Your Patients? Medscape, Jun 24, 2015

Posted in General Interest, Health Information, Health Information Exchange, Literature

Dr. Gawande has done it again – almost – a review of “Being Mortal”; Gawande, Atul; Metropolitan Books; New York; 2014

This book is almost on track to be a potential game changer.

The title is engaging. However, on my first reading, I found the book a little difficult to follow. Dr. Gawande has essentially written about two distinct components of “the modern experience of mortality”: the first five chapters discuss aging and the optimization of the life experience of aging patients; the second portion of the book deals with care in advanced disease, mostly in the context of widespread cancer.

As has been his habit in his other books, Dr. Gawande tells stories of his own experiences and of interviews he has done with some innovators in the delivery of care. Keeping in mind that the plural of anecdote is not data, he uses stories to make his use of data more personal and meaningful to a lay reader (there are 12 pages of citations, pp. 265-277). As usual, Dr. Gawande is thoughtful. In this book he may be even more provocative than he has been before.

He has investigated the nursing home concept as it applies to the care of elders who have lost the capability of being fully independent because “things fall apart.” He makes a clear argument that aging is not a medical condition but the result of “the accumulated crumbling of one’s body systems,” including unsteadiness, loss of position sensation, loss of flexibility, and muscle weakness. He also notes that many conditions deteriorate over time, a course called the “natural history” of the disease; illnesses such as Heart Failure, Emphysema, and Atherosclerosis are examples. Some forms of arthritis are also often considered a natural component of aging. Dr. Gawande includes stories of many facilities that have improved the experience of living in older age by assisting with living, not assisting with dying. He makes the distinction between helping people live in old age and managing the dying experience. Where the goal is “patient safety,” elders often end up with a “life designed to be safe, but devoid of anything that they care about.” He quotes Bill Thomas, MD, who describes what he calls the three plagues of nursing homes for the aging person: “Boredom, Loneliness, and Helplessness.” There are stories of facilities for elders that help some seniors live better, including Park Place in Oregon, Chase in upstate NY, NewBridge on the Charles in Boston, and Peter Sanborn Place in Reading, MA, among others.

The second portion of the book addresses a concept that physicians often refer to as “futile care,” almost always related to diagnoses of cancer. He is not addressing things such as treating cancer when it is first diagnosed in its early stages, but rather the continuing use of newer therapies that may prolong life by a short period of time (often measured in days or weeks only) at the expense of the quality of life lost to the side effects of treatments. He uses the example of his own father’s tumor in the cervical spine, which his father lived with for a prolonged period because he still had life experiences that he wanted to accomplish. Dr. Gawande introduces us to Dr. Susan Block, who helped develop the concept of asking what is important to people who may have to make hard choices. Keeping the discussion in line with the individual patient’s values and goals, as a means of directing treatment decisions, should increase patients’ quality of life in the times of difficult conversations. He also discusses the benefits of hospice care and of making advance directives, using the experience of La Crosse, WI, where there was a concentrated effort to improve end-of-life discussions so that physicians knew what should be considered if and when a patient came for care. He also discusses several data sets suggesting that hospice care is associated with increased, not decreased, longevity in patients with advanced disease that is not responding to “modern medical therapy.” Gawande then points out that he is not suggesting giving up early, but that the physician directing care should be like an army general: “in a war that you can’t eventually win, you don’t want Custer. You want Robert E. Lee, someone who knows how to fight … and how to surrender when you can’t” win.

In several parts of the book, I teared up, but then I am a softy.

If I had my choice, I would have liked some help in keeping track of the characters in his stories and some of the concepts he discusses. I counted at least 12 different patient stories, which were sometimes scattered throughout the book, more than 8 physicians, and several other key people. An index might have made keeping up with them all a little easier. I would also have found the book easier to understand if Dr. Gawande had made explicit the two different segments of his argument.

Posted in General Interest

General Shinseki Needn’t Have Been Ousted – He Was Betrayed

At the end of May, after a series of exposés and congressional hearings, General Eric Shinseki was pressured to resign as Secretary of the Department of Veterans’ Affairs. The major reason for his departure was that the department, including up to 1,700 potential sites of care, couldn’t see veterans in a timely manner. These problems, including some misreporting of wait times, have been known to the VA since at least 2005 (OIG Report of 5/28/2014). At some point, a program was instituted to incentivize the CEOs of individual VA facilities to create a culture of rapid response to a request for an appointment: if appointments were reported to be available, the CEO and staff could receive a financial incentive. It should not come as a surprise, then, that intelligent people trained in a business model were able to find a way to “hide” the fact that many veterans (1,700 in Phoenix alone) were not getting appointments within 2-3 weeks of a request. A “ghost” waiting list was kept in at least two hospitals (Phoenix, AZ, and Hines, IL; interestingly, one common thread between those hospitals is that the CEO was the same person, first at Hines (Feb. 2010-2012) and then at Phoenix (2012-2014)). This CEO is reported to have received at least one significant financial reward for what appears to be misreporting the results of her administration. This is almost a perverted application of the concept “you get what you pay for, or what you measure.”

If Gen. Shinseki’s transgression was that patients were not being seen promptly, it might be that he believed what he was being told. General Shinseki certainly understood leadership. He had participated in writing a book (Be, Know, Do: Leadership the Army Way). He certainly was aware that more junior military officers were supervised and mentored to ensure that they understood ethics, how to adjust to stress, and how to work to achieve the commander’s intent. There are multiple reports (from the lay press, but buttressed by data from at least the Harvard Business School and Northwestern’s Kellogg School of Management, among others) suggesting that, in many business settings, CEOs with military experience tend to be highly ethical and generally able to lead civilian organizations to success. The bureaucratic leaders that Shinseki inherited in the DVA didn’t seem to embrace this ethical culture. He observed, “I can’t explain the lack of integrity among some (italics are mine) of the leaders of our healthcare facilities. This is something I rarely encountered in 38 years in uniform.” In addition, one might question whether the employees who staff the VA system understood the overall mission and goals of that system: “VA is committed to developing a culture that is advanced, forward-thinking and completely Veteran-focused.”

We might take as a lesson that, in large organizations, new leaders should entertain a healthy degree of skepticism about the ongoing conduct of the staff. General Shinseki certainly knew about leading by walking about: being visible to his subordinates and reinforcing the message of the mission. If he had been aware of the problem of having veterans seen promptly, could he have convinced his superiors (congressional committees) to increase funding for the VA for more providers? About 3 years ago, I volunteered to help in the clinics at a local VA hospital outpatient department. I was told that there was no need for more physician coverage. In retrospect, I doubt that. There may have been no budget for more physician coverage, but they might have accepted volunteer physician help.

Dealing with a civilian bureaucracy, with trade unions representing much of the workforce, was certainly something that military training may not have prepared a CEO to handle. This structure would have made it more difficult to reprimand recalcitrant staff than it would have been in the army. However, I would hope that a military leader could emulate General Marshall, who is reputed to have taken ineffective commanders out of their roles and then given them a second chance to acquire the skills to become real leaders. This would seem to be a better way to help a subordinate grow than simply removing or firing someone; those working within the framework of a second chance may be more motivated to embrace and encourage the desired culture. There are suggestions that ongoing culture and competence training of VA intake and appointment staff wasn’t continued. Many effective organizations (the Mayo Clinic, for example) have ongoing culture classes for all levels of the organization. In addition, a very clear but simple set of values and expected behaviors, promulgated prominently throughout the organization, should help improve honesty (Dan Ariely, in “Predictably Irrational” (2008), showed that this type of nudge can influence behavior).

Could the VA scandal have been prevented? In all likelihood, yes. Would it have been easy to prevent? No. How did it happen in an organization that was thought to be among the best in the late 1990s? The most likely answer is that someone took her/his foot off the gas that had kept ongoing training and culture intact and allowed the system to sink back into mediocrity. The new leadership didn’t go back to the beginning to ensure that meritocracy resurfaced.

Posted in General Interest, Leadership, Policy

Diagnosis may be the Achilles’ Heel of Incentive-Based Payment.

“Diagnosis is the mental act of selecting the one explanation most compatible with all the facts of clinical observation”.  – Raymond Adams in Harrison’s Principles of Internal Medicine – 4th edition

In almost all instances, government and other third-party payer incentives for improving performance in medicine rely on a clinical diagnosis upon which to judge performance. The data reported in Hospital Compare rely on an accurate and inclusive diagnosis of Myocardial Infarction, Congestive Heart Failure (CHF), and Pneumonia for hospital ratings. In each of these clinical conditions there are specific criteria for making the diagnosis, but in each, the diagnosis must be considered before the diagnostic criteria can be applied, and once applied, the criteria must be evaluated. There is no single agreed-upon set of criteria for the diagnosis of CHF, for example; there are several competing sets of diagnostic “criteria,” including proposals by the Framingham study group[1], a Harvard study group[2], and a group from the University of Virginia[3], with sensitivities ranging from 0.41 to 0.71 and specificities from 0.89 to 0.97[4]. The addition of BNP values doesn’t help much, especially when the diagnosis is not suspected. Estimates suggest that there is an error in diagnosis in somewhere between 10% and 15% of encounters[5],[6]. Many of these may never be detected (see our prior post on a diagnostic error, “Who worries about physician behavior …”). If a patient has a clinical condition but it is not appropriately diagnosed, then that patient never appears in any denominator of performance (in either process or outcome measures). Consider a hypothetical patient who is overweight (BMI 32), smokes, is mildly short of breath (SOB), coughs, and has mild ankle swelling. If this patient is diagnosed as being obese and having chronic pulmonary disease, then he will be looked at as if he might have COPD. Suppose this patient comes into the hospital with a mild fever and an increase in cough and SOB, is diagnosed as having an exacerbation of COPD, is treated with antibiotics, and recovers after 3 days.
Again, the performance measures are met for COPD. Three years later, the same patient comes in with orthopnea, PND, moderate ankle edema, and cardiomegaly, and only then is the diagnosis of CHF entertained. For those intervening years, this patient has been considered, by those looking at quality metrics, as having a condition that, in retrospect, was probably not correct. For those years, if there was P4P, P4Q, or some other reward system, the physician practice or health care system would have been the recipient of inappropriate incentive compensation.
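The practical impact of those sensitivity and specificity ranges can be made concrete with a little Bayesian arithmetic. The sketch below is my own illustration, not from any of the cited studies; the 10% prevalence figure is purely hypothetical:

```python
# Illustrative only: predictive values implied by the sensitivity (0.41-0.71)
# and specificity (0.89-0.97) ranges quoted above for CHF criteria sets.
# The assumed 10% prevalence is hypothetical, not from the cited studies.

def ppv(sens: float, spec: float, prev: float) -> float:
    """Probability that a patient meeting the criteria truly has CHF (Bayes)."""
    return sens * prev / (sens * prev + (1 - spec) * (1 - prev))

def miss_rate(sens: float) -> float:
    """Fraction of true CHF patients the criteria fail to identify."""
    return 1 - sens

prevalence = 0.10  # hypothetical
for sens, spec in [(0.41, 0.97), (0.71, 0.89)]:
    print(f"sens={sens:.2f}, spec={spec:.2f}: "
          f"PPV={ppv(sens, spec, prevalence):.2f}, "
          f"missed={miss_rate(sens):.0%} of true cases")
```

Even at the high end of sensitivity (0.71), roughly 29% of true CHF patients would be missed, and those patients never enter any performance denominator.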

There are other areas where an error in diagnosis is important: a diagnostic error can, as shown above, delay the initiation of appropriate treatment for a patient, and there is legal exposure for diagnostic errors. Some estimates suggest that over 29% of malpractice claims and judgments are for diagnostic errors[7].

Defining diagnostic errors themselves is difficult; the final arbiter may always be challenged. Coming up with the causes of diagnostic errors is even harder. Not all physicians are equally adept at arriving at a correct diagnosis (an experienced physician presented with the same clinical context a second time may not even arrive at his or her prior diagnosis). Certainly research and training of physicians should help in understanding the causes of error and increase vigilance to try to avoid such errors. Sometimes it may be enough to encourage a diagnostician to be aware of biases that may cloud judgment (one estimate is that there are over 15 potential biases that may impede accurate diagnosis)[8]. In other instances it may be enough to encourage diagnosticians to be aware that over-reliance on heuristics in approaching a patient can lead to errors in reasoning.

Expert diagnosticians in the past used to insist that, after a diagnosis was reached, the clinician keep an open mind by defining a minimum of 3 alternate explanations for the clinical presentation, called a “differential diagnosis.” Over-reliance on advanced diagnostic imaging may also lead the clinician astray. Many physicians and surgeons believe that advanced imaging techniques such as CT and MRI scans are a “gold standard” of anatomic diagnosis. However, almost every orthopedic surgeon has had more than several instances in which the MRI scan suggested a diagnosis that was either not confirmed or had nothing to do with the patient’s illness or complaints. One study suggests that for the diagnosis of meniscus tears, the MRI is accurate (compared to intraoperative findings) only approximately 75% of the time[9]. Another showed that operating on the back based on the findings of an MRI exam didn’t necessarily improve patient symptoms[10],[11].

The real gold standard may still be the autopsy, which has fallen out of favor as a check on our clinical diagnoses. William Osler considered the autopsy so important in his own and others’ education that he did his own. Richard Cabot brought the autopsy to the fore in the early 1900s when he proposed the Clinical Pathologic Conference (CPC) as a teaching tool. This became formalized in 1925, when the NEJM began publishing a weekly CPC under the rubric of Case Records of the Massachusetts General Hospital[12].

Diagnosis has taken a back seat to proceduralism today, partly because, as many other commentators have pointed out, there is little time or reward for non-procedural patient encounters. This leads to potential skimping on taking a thorough history, performing a complete physical exam, and coming up with a differential diagnosis, because this behavior is not rewarded in today’s fee-for-service system. In spite of the importance of reaching the correct diagnosis in directing the correct treatment and determining the appropriateness of pay for performance/quality, the outstanding diagnostician isn’t rewarded. Not all physicians today are outstanding diagnosticians, nor were they in the past, but in the past, going to see a “diagnostician” was something patients often valued. Then it was understood that a prerequisite to effective treatment was the right diagnosis.

[1] McKee NEJM, 1971 285, 1441

[2] Carlson: J Chron Dis, 1985, 38, 733

[3] Gheorghiade, M et al; Am J Card, 1983, 51, 1243

[4] The sensitivity and specificity values in all three were against an “expert” diagnostician or panel of expert diagnosticians.

[5] Graber, M: Joint Commission J on Quality and Patient Safety, 2005, 31, 106

[6] Berner, ES, Graber, M: Am J Med 2008, 121, S2

[7] Tehrani, ASS, et al: BMJ Qual Saf, 2013, 22,672

[8] Landro, L; Wall Street Journal 2013, Nov 17: http://online.wsj.com/news/articles/SB10001424052702304402104579151232421802264 Accessed 1/21/14  Subscription may be required

[9] Hardy, JC et  al: Sports Health 2012, 4, 222

[10] Deyo, RA et al: J Am Board Fam Med 2009, 22, 62-68

[11] http://www.npr.org/blogs/health/2014/01/13/255457090/pain-in-the-back-exercise-may-help-you-learn-not-to-feel-it accessed 1/21/14

[12] Roberts CS. The Case of Richard Cabot. In: Walker HK, Hall WD, Hurst JW, editors. Clinical Methods: The History, Physical, and Laboratory Examinations. 3rd edition. Boston: Butterworths; 1990. Available from: http://www.ncbi.nlm.nih.gov/books/NBK702/

Posted in General Interest, Policy, Quality, treatment options

What is Quality? It Depends on Who Does The Measurement

As I said last month, quality is difficult to define and is almost in the eye of the beholder. This is very much like Humpty Dumpty’s assertion that, “When I use a word, it means just what I choose it to mean – neither more nor less.” Perhaps, then, whatever a group says it is measuring is what defines quality. There are many groups that report quality metrics to the public in the “lay press”; more than a dozen sets of publicly reported quality metrics exist. These are, in no particular order:

  1. US News and World Report annual surveys
  2. Consumer Reports
  3. Hospital Compare (CMS website’s public reporting)
  4. National Quality Forum
  5. HealthGrades
  6. The Leapfrog Group
  7. Truven (formerly Thomson Reuters, which itself used to be Solucient)
  8. The Joint Commission (with its ORYX set)
  9. National Committee for Quality Assurance (with its HEDIS – Health Employer Data and Information Set (1998), or Healthcare Effectiveness Data and Information Set (2012))
  10. Premier Healthcare Alliance (with its QUEST – QUality Efficiency Safety & Transparency) reports
  11. Several non-governmental insurers, including:
    1. Blue Cross – Blue Shield  (with its Blue Distinction)
    2. United Health
    3. Humana
    4. Aetna

The number of metrics going into a hospital ranking is not consistent, ranging from approximately 8 to more than 80. In addition, a single ranking organization will often change its metrics from year to year. Many of these organizations use data from other sources to help with the quality rankings. One of the most often used outside sources is the Agency for Healthcare Research and Quality, which provides the Consumer Assessment of Healthcare Providers and Systems (CAHPS) scoring, first used in 1995. Many healthcare systems and providers also use the Press Ganey Company’s (founded in 1985) scoring of patient satisfaction.

It should not be surprising that no two of these organizations report on the same set of metrics, including the same process or outcome measures. I compared the top several hospitals in the Chicago area as ranked by each of the rating groups. None listed the same hospitals.
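The degree of disagreement between any two rating groups can be quantified very simply as the overlap between their “top hospital” lists. A minimal sketch follows; the hospital names and lists here are hypothetical placeholders, not the actual rankings discussed above:

```python
# Hypothetical "top five" lists from two rating groups (names are placeholders).
rater_a = ["Hospital A", "Hospital B", "Hospital C", "Hospital D", "Hospital E"]
rater_b = ["Hospital C", "Hospital F", "Hospital G", "Hospital A", "Hospital H"]

# Set intersection gives the hospitals both raters agree belong in the top five.
overlap = set(rater_a) & set(rater_b)
agreement = len(overlap) / len(rater_a)  # fraction of shared hospitals

print(sorted(overlap))  # ['Hospital A', 'Hospital C']
print(agreement)        # 0.4
```

With a dozen rating groups, the same calculation can be run pairwise; a consistently low agreement fraction is exactly the pattern described above for the Chicago-area hospitals.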

Are there potential unanticipated consequences of reporting quality? Some providers (practitioners and health care systems) may skew their behavior toward a quality metric that may or may not truly be associated with a desired outcome. One recently reported example is how physicians may provide unnecessary and potentially harmful services in order to boost their Press Ganey satisfaction scores[1].

In the mid-1990s one ranking agency listed a Chicago-area hospital in the top three in the region. The next year, the Health Care Financing Administration (the predecessor of the federal Centers for Medicare and Medicaid Services) determined that there were enough potential quality lapses that it began an investigation of that same hospital. Thirteen years later, the hospital closed. This example suggests that rankings by outside agencies may not always be a reliable indicator of a healthcare provider’s performance.

There are, however, data suggesting that healthcare systems that do better on any one scoring system tend to have better “hard” outcomes – lower readmission rates, lower hospital-acquired complication rates, and even lower mortality rates – than those that don’t rank well on any. These data are, however, likely skewed by reporting biases and are not necessarily embraced by everyone looking at the research.

What is one to do with this plethora of data?

Hospitals and systems might charge clinical leaders such as the Chief Medical Officer (CMO) and Chief Nursing Officer (CNO) with picking one or two of the major reporting groups to report to. Institutions can then market or advertise favorable results from that reporting group. Thus, I would posit that this initiative should be supported from the marketing department’s budget.

Consumers – patients, families and caregivers – should review their hospital’s website to see what it is reporting. They should also look at one or two sites that rank their local hospitals, in addition to Hospital Compare, and consider measures such as:

  • Readmission rates (higher readmission rates expose patients to lost time out of the hospital and an increased likelihood of hospital-acquired conditions such as infections and adverse reactions to medications, among others).
  • Rates of hospital-acquired conditions.
  • 30- to 90-day post-hospital mortality after common conditions such as heart attack, pneumonia, or surgery, among others.
  • Use of Electronic Medical Records (EMR) and connectivity to physician offices, as well as the ability to access “your” data.
  • Patient satisfaction scores (what percentage of patients would recommend the hospital/system to friends or relatives) – while these are potentially biased, they can, among other considerations, help suggest whether you may want to use that system.

To date there is minimal transparency of reliable data for patients to use to determine what real quality exists in their health care delivery system. Some of these considerations may help in choosing wisely.

[1] http://forbes.com/sites/kaifalkenberg/2013/01/02/why-rating-your-doctor-is-bad-for-your-health/ accessed 2/25/13

Posted in General Interest

What is Evidence Based Medicine?

One definition would be: delivery of medical care based on the results of the best available evidence. This usually means finding or relying upon data, some of which will be from outside one’s immediate memory, to help answer a clinical question. EBM began to be championed by investigators in the early 1990s[1].

Studies that comprise the evidence are often in the form of clinical trials. Clinical trials are a relatively recent addition to the armamentarium of tools available to clinicians – the first Randomized Clinical Trial (RCT) was done in 1952[2]. Trials should, but often don’t, look at how an innovation compares with prior knowledge, standards of care, or diagnostic accuracy. This is called comparative effectiveness (CE). In addition, trials often achieve “statistical significance” that may be of little pragmatic significance. The biological significance of much newer data is frequently incremental, not disruptive. Trials are almost never designed to look at cost, although some report cost considerations in analyses done after the original publication. When cost does become a consideration, political or other influences may challenge the focus, or reduce the funding, of whoever is doing the research. The dissolution of the US Office of Technology Assessment (OTA)[3], which functioned from 1972 to 1995, is an example of outside influences ending a body that had worked to improve care for the more than two decades it was in existence.

When evaluating evidence one needs to be aware of at least three considerations/tests:

  1. What is the source of the evidence (can you identify prospective biases; are the data reliable; have disconfirming data been looked for)?
  2. What is the strength of any recommendations?
  3. What is the strength of the data that underlie the recommendations?

The AHA/ACC construct for the strength of recommendations and the data supporting them is as good as any. Most other writing groups use some variation of it:

The strength of Recommendation is categorized as:

Class I   Benefits Markedly Outweigh Risk, so the procedure/treatment should be done
Class IIa Benefits Somewhat Outweigh Risk, and it is reasonable to do the procedure/treatment
Class IIb Benefits May Equal or Minimally Outweigh Risk and it MAY be considered
Class III There is No Benefit or Harm may ensue. The procedure/treatment should generally be avoided unless there is a good reason to use it.

The Strength of the data is categorized as:

Level A Data are from multiple Randomized Clinical Trials (RCTs) or Meta-analyses
Level B Data are from limited populations (single RCT or non randomized studies)
Level C Consensus expert opinion, “standard of current care”, case studies.

There are two additional ways to look at EBM:

EBID – Evidence Based Individual Decision
EBG – Evidence Based Clinical Guidelines

Each has its place in the practice of EBM. In neither case should the evidence be looked at in isolation. New evidence should always be evaluated in the context of prior information, even though prior concepts may not be as soundly evidence based as newer information. The concept of Bayesian analysis, where new evidence is weighed against prior probabilities, should be applied. If new evidence comes along that markedly contradicts common or personal practice, the reliability of the source should be closely examined.
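The Bayesian idea can be made concrete with a short sketch. The numbers below are hypothetical, chosen only for illustration: a 10% prior probability of disease, and a diagnostic test with 90% sensitivity and 80% specificity. Bayes’ theorem turns a positive result into a posterior probability:

```python
def posterior_prob(prior, sensitivity, specificity):
    """Probability of disease given a positive test, by Bayes' theorem:
    P(D|+) = P(+|D)*P(D) / [P(+|D)*P(D) + P(+|not D)*P(not D)]
    """
    true_pos = sensitivity * prior            # P(+|D) * P(D)
    false_pos = (1 - specificity) * (1 - prior)  # P(+|not D) * P(not D)
    return true_pos / (true_pos + false_pos)

# Hypothetical numbers: 10% prior, 90% sensitivity, 80% specificity.
print(round(posterior_prob(0.10, 0.90, 0.80), 3))  # 0.333
```

Note what the arithmetic shows: even a fairly good test raises a 10% prior only to about a one-in-three posterior, because false positives dominate when the prior is low. This is exactly why striking new evidence that contradicts a well-founded prior should prompt scrutiny of the source rather than immediate acceptance.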

Guidelines are not new. One of the early guidelines was developed in 1944 to help standardize and improve the diagnosis of rheumatic fever – the Jones Criteria, which were developed because, “From a study of the medical literature it is obvious that each observer has his (sic) own diagnostic criteria and these may differ widely”[4]. Thus, physicians couldn’t evaluate how therapies worked in this illness because they didn’t really have a common definition of the illness. The guideline gave a way to be more confident of homogeneity in the diagnosis. The paper also allowed “clinical judgment” to override the criteria if done for good reason. Cardiopulmonary resuscitation was codified in 1966 because “…clinical results vary widely and depend on the exact technique taught, the effectiveness of training…”[5]

Guidelines most often work for the majority of patients. When a patient or group of patients doesn’t respond to guideline-based treatment, the physician needs to determine why. Should the physician document that he/she has tried “standard or guideline-recommended care” and that it hasn’t worked? How much should the patient know about a physician embarking on non-EBM care for a specific clinical condition? How much should the patient participate in the “experiment” that such deviations from prior EBM/guidelines represent? How much responsibility should the physician then have for sharing his/her observations with others? Alvin Feinstein discusses this concept in Clinical Judgment[6].

Some guidelines have multiple sources. Cardiac guidelines are often prepared with input from multiple physicians and other stakeholders from the ACC/AHA and the European Society of Cardiology. Some confusion may arise because there are “too many” guidelines. For heart failure, for example, there are FIVE distinct sets of guidelines, which are not concordant. Clearly, guidelines should be reviewed before use by an individual or a group, and for use by an institution or system a workable set should be developed. This exercise often helps bring physician groups and practices together and helps them work better within an institution.

Some Sources of “Evidence”

Cochrane Collaboration

POEMS (Patient Oriented Evidence that Matters – Primary Care Oriented)

USPSTF (United States Preventive Services Task Force)

CADTH (Canadian Agency for Drugs and Technology in Health)

NICE (National Institute for Clinical Excellence – Great Britain)

SBU (Swedish Council on Health Technology Assessment)


[1] McMaster EBM working group; Evidence Based Medicine: A new Approach to Teaching the Practice of Medicine; JAMA 1992, 268, 2420-5

[2] IOM (Institute of Medicine). 2011. Engineering a learning healthcare system: A look at the future: Workshop summary. Washington, DC: The National Academies Press

[3] OTA – Office of Technology Assessment: referred to in O’Donnell, JC et al: Health Technology Assessment: Lessons Learned from Around the World-An Overview: Value in Health, 2009, 12, Suppl. 2, S1-S5

[4] Jones, TD; The Diagnosis of Rheumatic Fever; JAMA, 1944, 126, 481-4

[5] Ad Hoc Committee on Cardiopulmonary Resuscitation … National Research Council: Cardiopulmonary Resuscitation; JAMA, 1966, 198, 138-145

[6] Feinstein, A; Clinical Judgment, 1967, Williams & Wilkins, Baltimore

Posted in effectiveness/efficacy, General Interest, Quality, treatment options

Providers, Patient Care Delivery and Policy: Hospitalist story

There are often perverse incentives in health care. These incentives can, at times, create competing drives, where providers are encouraged to do things that directly increase the costs of care. Consider the reaction to the mandate to cut the hours that physicians in training work so that they can think more clearly. This contributed to the rise of the hospitalist movement[1], which was also a response to the Diagnosis-Related Group (DRG) concept signed into law by Ronald Reagan in 1983[2]. In the mid-1990s, hospitals began contracting with groups of physicians to care for hospitalized patients. Hospitalists began with an incentive to improve the efficiency of hospital care, with the inevitable shortening of the duration of hospitalization. Under DRG-based payment, hospitals that could provide care to Medicare beneficiaries at a cost lower than the DRG payment level could keep the difference to cover other expenses. This was intended to encourage hospitals to become operationally more efficient and to negotiate more vigorously with suppliers to decrease their costs.

There have been multiple examples of Hospitalists being more proactive in navigating a patient’s inpatient care, thus decreasing the duration and cost of a hospitalization. It quickly became evident to hospital executives and their boards of directors that hospitalists appeared to decrease the hospital’s costs of delivering care, thus improving their “bottom line”.

A recent study from the University of Texas, reported in the Annals of Internal Medicine[3], confirmed that hospitalists do, in fact, keep the cost of an individual hospitalization down. However, on average, the costs of post-hospital care are greater. Therefore, while the hospital spends less on an individual hospitalization, the payer ends up spending more on the same patient mix. This leads to competing incentives. The hospital is encouraged to use hospitalists because of the reduced cost of a single hospitalization; this increases the margin from the DRG payment and also adds potential revenue from subsequent hospitalizations[4]. Payers, on the other hand, may look at these data and question whether paying hospitalists for care gets the patient and payer better overall quality. On a population level, this might lead to an incremental health care cost of $1.1 billion a year.

The study suggests that the reason for higher total costs relates to differences in behavior between hospitalists and primary care physicians (PCPs). Patients cared for by hospitalists were less likely to be discharged to their homes. In addition, these patients were less likely to see their PCP shortly after discharge and more likely to be readmitted or to seek ER care than were patients cared for by their PCPs while in the hospital. These latter observations suggest that there may be a suboptimal hand-off of care from the hospitalist back to the PCP. There may be other, as yet undefined, reasons for the increase in post-hospital care as well.

For hospitals to improve overall quality, a culture of cooperation and communication between hospital based physicians and those in outpatient care (Primary Care) must be encouraged and enforced. This may in the short run interfere with profits, but will in the longer term improve patient and payer satisfaction.

[1] Wachter, RM, Goldman, L: The Emerging Role of “Hospitalists” in the American Health Care System: NEJM, 1996; 335, 514-17

[2] Part of Public Law 98-21 (http://thomas.loc.gov/cgi-bin/bdquery/z?d098:HR01900:@@@L&summ2=m& )

[3] Kuo, Y-F; Goodwin, JS: Association of Hospitalist Care with Medical Utilization After Discharge: Evidence of Cost Shift from a Cohort Study: Ann Intern Med, 2011, 155, 152-9

[4] Recent CMS policies to be implemented in 2012 will impose a penalty for readmissions that are potentially preventable. The effect of these proposed penalties on discouraging readmissions is not clear.

Posted in General Interest, Operational effectiveness, Policy, Quality

Why do Physicians Behave the Way They Do?

I believe that the vast majority of physicians do “the right thing” for their patients. I don’t think I’m being a Pollyanna. On the other hand, the “Tragedy of the Commons”, which describes behavior in many cultures, doesn’t bypass the rod of Asclepius. The story of the “Tragedy” reveals how individuals frequently behave in a manner that satisfies short-term personal goals rather than taking into account the longer-term outcomes of behavior that would benefit groups larger than a person’s close circle. This bears directly on how physicians behave in patient care.

There are several reasons that physicians “do things” for/to patients. The most common is that the individual physician honestly believes, based on his/her interpretation of the information in their construct[1], that a diagnostic test or therapy will help the patient get better. When an appropriate test/treatment is provided to a patient, an improvement in health is most often the outcome. If the test/treatment is not appropriate, frequently either no benefit or an adverse outcome will result. (See “Who is responsible”.) In addition to providing the right therapy, it should be in the right dose/form. It is not uncommon to see the phenomenon of the “risk-treatment paradox”, where physicians will apply an effective therapy or test to patients at low risk but avoid using the same procedure in sicker patients, who stand a greater absolute chance of benefiting. No one has adequately explained this behavior. In some instances it is related to contraindications to drug therapies in sicker patients. However, in many other instances there is no clear explanation (Peterson, P, et al: Circ Cardiovasc Qual Outcomes, 2010; 3: 309-315).

A second explanation for physician behavior relates to experience, embedded in the adage “to a man with a hammer, everything looks like a nail that needs to be pounded”. When physicians are trained to do something, they become familiar with it and tend to use it because they believe it is beneficial. Who hasn’t heard a surgeon say, “A chance to cut is a chance to cure”? Physicians or surgeons who may be suspected of doing unnecessary procedures most often actually believe, because of this familiarity bias, that they are doing “the right thing”. I believe this is partly why many physicians still perform angioplasty/stenting in patients with chronic stable angina before maximal medical therapy. They simply “believe” that angioplasty makes so much sense that it must “work”.

A third reason for physicians’ choice of therapies may be that they are afraid not to do something for fear of “liability exposure”. This has given rise to the malpractice debate. However, there are suggestions that even in regions with “tort reform”, physicians continue to perform procedures to avoid “malpractice risk”. Gawande points this out in his essay “The Cost Conundrum”. This behavior may reflect how badly physicians want to avoid legal confrontation, even when the likelihood that they will personally lose much money is remote.

The elephant in the room, however, is physicians doing procedures for personal gain. A physician may duplicate a diagnostic test or therapeutic procedure because he/she knows it will be paid for and that no one is going to check (how many of us read an explanation of benefits and understand it, much less inform our payer that a billed procedure wasn’t performed?). I have seen instances in which a physician group has a culture of repeating diagnostic tests exactly when payers say they will pay for them. There are also instances in which patients are kept in the hospital until the day insurance benefits run out, when miraculously the patient is “cured” or has received “maximal hospital benefit”. Many believe, even when it is difficult to prove, that this behavior is frequently applied to patients in psychiatric facilities.

Can unnecessary procedures, tests and therapies be identified and stopped? This will be difficult in a system whose culture is one of individual profit maximization, and in which patients are not responsible for approving a bill before payment. Our current system of Relative Value Units or Diagnosis Related Groups as the basis for physician payment has led to the concept of “productivity”, which is often narrowly defined as how much one can bill. This is akin to the legal profession’s “billable hours”. (Does anyone remember the story of St. Peter greeting a lawyer who died at 45? When asked why so soon, St. Peter said that, based on hours billed, he thought the man had lived to be 100.)

If this is the way we continue to pay (also called incentivize) physicians and other caregivers, we will probably have physicians work only on treating patients, doing as much as they can deliver. Taken to its reductio ad absurdum, payers would essentially be telling the physicians in our systems to do hands-on care 10-12 hours a day, keeping patients to themselves, not “sharing”. This would discourage working with “physician extenders”. It would also actively discourage physicians from keeping time for other activities that should improve medicine in general: working on their own continuing education or the continuing education of support staff, participating in committee activities (which should help improve the way the system is run), participating in registries, working on improving quality locally or in national committees, or participating in community outreach (“free” clinics or community education). There are, I’m sure, other things we would hope physicians would participate in. We can, unfortunately, be fairly confident that there is no incentive to participate in any of these activities.

[1] An individual’s construct will be influenced by prior teaching, “personal experience”, or exposure to expert opinion.

Posted in General Interest, Quality, treatment options

Registry Participation will help Develop Alignment and Improve Quality

Many hospitals and hospital systems are trying to ensure that they satisfy quality metrics, to help with accreditation and to confirm that they are fulfilling their mission and providing community benefit. Superior performance in clinical quality may allow an institution or system to gain a competitive advantage. However, the definition and measurement of excellence is clearly more credible when it comes from an independent source. In many ways, participation in a disease-based clinical registry is the optimal way for institutions to know what they are really doing with respect to clinical quality.

The most recent definition of what constitutes a credible registry is in the 2010 AHRQ publication on registries. AHRQ defines a registry as “an organized system that uses observational study methods to collect uniform data (clinical and other) to evaluate specified outcomes for a population defined by a particular disease, condition, or exposure.” Usually registry participation spans several systems (as in the New York State, Northern New England, or national medical association registries). In some very large health care delivery systems, the registry may be developed and maintained within the system (as at Intermountain, Kaiser, or Geisinger). Registries can be used to develop rankings on quality measures. In addition, they can help institutions or groups understand rare diseases/conditions, the outcomes of various therapies, or the results of implants. In almost all instances, registries reflect the real world rather than the somewhat artificial constraints of experiments (clinical trials); the size of effects will usually be less marked than in the initial clinical trials. Thus, participation in registries helps improve the real scientific basis of medicine (“Evidence Based Medicine”).

Most other reporting methods of institutional performance have major shortcomings.

Measures of the process of care for conditions such as acute myocardial infarction, heart failure and pneumonia, often called “core measures”, actually encourage a rush to mediocrity. They are generally summaries of what one of my colleagues is fond of calling “checking the box”: there is a place on a report form to say, “I probably thought of this”. However, checking the box does not ensure the robustness of the intervention that allowed the box to be checked. For example, everyone knows that smoking cessation is a valuable health outcome. The box on a discharge form can be checked if the patient is given a preprinted form that says, “stop smoking”. A more difficult, but probably more effective, intervention is to sit down with the patient who smokes, assess the smoking habit, assess willingness to quit, and then institute a comprehensive behavior-modification program, with or without pharmacologic adjuncts. Both allow the box to be checked.

CMS, on its Hospital Compare website, does report outcome data, which are more rigorous than process measures. However, providers frequently complain that these data, which are often approximately a year old, are dated. Everyone I have talked to has said to me, “Those are old numbers; we are now doing better than that.” Often, on review, the updated numbers show no real improvement.

Many other data sources do give “benchmarks” and allow one to rank oneself. One complaint is that providing data to these is expensive and may not have a real ROI (that is, it may not drive more paying patients to the hospital). On the other hand, participation in most registries satisfies many of the public reporting requirements of CMS and other payers. This benefit works for both employed and independent physicians, who will feel a greater sense of affiliation (alignment) with a system or institution because it helped them do a better job.

Rankings such as Thomson, HealthGrades, USNWR and others are often fraught with difficulties. In the early 1990s, one Chicago hospital system received a high grade from a national ranking organization, and the next year was investigated by HCFA (the predecessor to CMS) for financial and clinical irregularities. Most hospitals appear on someone’s “top 100 hospitals” list of some kind or other.

On the other hand, registry participation with groups such as the national cardiovascular and oncology systems has been a way to ensure that hospitals or systems are working toward optimal care. While the initial cancer cooperative groups were, and still are, a means of collecting large amounts of data on cancer, participation also helped move participants toward practicing to a national standard.

Participants in registries have almost uniformly improved performance and have seen provider and patient satisfaction increase, which makes for a marketing (public relations) bonanza. Helping with registry participation is a low-cost way for institutions/systems to improve alignment with physicians and to demonstrate superior performance to payers.

For a listing of some of the premier registries and their web address, send us an e-mail. We will send our version to you.

September 4, 2011

Recent reports from the European Society of Cardiology meeting held in August 2011 have confirmed the proposition that registries are going to be more and more important as we continue trying to improve medicine.

Posted in Competition, CV, effectiveness/efficacy, Quality, treatment options