How Can We Align Incentives As We Move From Volume to Value?

The adage that “One person’s cost is another’s revenue or income” comes to the fore when we discuss changing payment models for health care over the next decade.

As the 21st Century has begun, and health care costs in the US are escalating, Dr. Donald Berwick coined the concept of the “Triple Aim” in Healthcare: Better Care for Patients; Better Health for Communities and Lower Costs. These are laudable goals, but can they be achieved?

Currently, the majority of health care payments are based on “Fee for Service”. Everything that any provider does is paid for on a piecemeal basis. The incentive, even if not intended, then is for Hospitals, Physicians, Home Health Agencies, Drug Manufacturers, Device Manufacturers and others to do as much as possible. Payers, on the other hand, want to have less done (Medical Loss Ratio considerations may blunt this incentive, however). It has been variously estimated that there has been between $210 Billion ($210,000,000,000) and $750 Billion wasted on delivery of unnecessary or harmful care in the US annually.

There are initiatives in various legislative packages (PP-ACA – PL 111-148; MACRA – PL 114-10) to try to change financial incentives to improve the quality while reducing the cost of health care. One potential stumbling block in these initiatives is that if we spend less on health care, somebody is going to have their current income reduced. We must then ask how a reduction in income will change incentives. How will we “share” the savings in health care expenditure that could be realized from decreasing readmissions, shortening inpatient stays, and doing less “routine” testing, as suggested by the “Choosing Wisely” initiative that began in April 2012?

In the early 1900s, when the Ford automobile company was able to cut the cost of producing the Model T, it actually cut the price to the consumer by a significant amount. Ford and his employees still enjoyed a reasonable income and lifestyle. It is difficult to imagine that a modern hospital manager, after improving processes and potentially cutting staff in ways that decrease the cost of delivering care, would subsequently decrease billings. Physicians who stop doing some procedures might see their fee-for-service payments, and therefore their income, decrease. Would either the hospital employer or the insurance payer that they depend on reward them with a large enough proportion of the savings to the health care system to maintain their lifestyle, even though they might not have to work as many hours a week? If the costs of care decrease, will an insurance company cut staff (they would certainly lose in the process) and pass the savings to the purchasers of the insurance product, rather than “pocket” the savings and continue with business as usual? Providers of drugs and devices are expected by investors to grow their companies, to satisfy demands for return on investment from stockholders and other stakeholders.

In none of these brief examples is there any real incentive to continue to work to decrease the costs of providing medical care or treatments. The challenge to policy makers, health care systems, and physician groups will be to work together in order to ensure that none of the stakeholders in the healthcare system will feel unduly put upon. Can we do this?

Posted in General Interest, Operational effectiveness, Quality

How Do We Use Statistics?

How we utilize Statistical Inference is indeed a critical piece in the evaluation of new information in the Biomedical Literature.

Many researchers today believe that a “statistically significant” p-value is the primary justification for publication of the findings of their studies. In the first two weeks of March 2016, two articles on the use of the p-value were published. The first was highlighted in the British journal Nature, summarizing a statement released by the American Statistical Association (ASA) on the misuse of the p-value [i]. The ASA issued a strongly worded statement cautioning against overreliance on a “statistically significant” p-value in driving major changes in concepts or public policy. The second article, published in the American journal JAMA, discussed how the use of the p-value has increased in the medical literature over the last quarter of a century[ii]. Both articles caution that using a p-value, by itself, to drive a change in scientific thinking is potentially misleading. Indeed, a study with results that are “statistically significant”, as evidenced by the p-value, often cannot be reproduced.

A p-value is supposed to tell us whether an observed difference in ratios or other numbers could plausibly be due to chance or may be “real”, allowing one to reject the “null hypothesis” that the numbers are similar. It is easy to find a computer program that will do the arithmetic to calculate a mean and standard deviation, from which a p-value can be calculated. However, one would be well served to remember that many classical statistical tests assume “normally distributed” values, and that the sample must not be biased. If the collection of the sample to be studied is biased, then the “statistical significance” will also be biased. Finally, we must be aware of the “pragmatic significance” of a difference in means or ratios. Who cares if the average height of one group of 11-year-old children is 4 feet, 3.5 inches and that of another is 4 feet, 3.25 inches, even if the p-value is less than 0.001? This would be a case of a highly statistically significant, but pragmatically unimportant, difference.
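The height example can be made concrete with a small calculation. The sketch below is only an illustration under stated assumptions: the standard deviation of 2.5 inches, the group sizes, and the function name `two_sample_z` are all hypothetical, and a simple two-sided z-test with a known, shared standard deviation stands in for whatever test a real study would use.

```python
from math import sqrt
from statistics import NormalDist


def two_sample_z(mean_a, mean_b, sd, n_per_group):
    """Two-sided z-test for a difference in means.

    Assumes (for illustration only) a known standard deviation shared by
    both groups and the same number of subjects in each group.
    """
    standard_error = sd * sqrt(2.0 / n_per_group)
    z = (mean_a - mean_b) / standard_error
    p = 2.0 * (1.0 - NormalDist().cdf(abs(z)))
    return z, p


# Hypothetical inputs: 4 ft 3.5 in = 51.5 in, 4 ft 3.25 in = 51.25 in.
MEAN_A, MEAN_B, SD = 51.5, 51.25, 2.5

# With very large groups, a quarter-inch difference is "highly significant".
z_large, p_large = two_sample_z(MEAN_A, MEAN_B, SD, n_per_group=5000)

# With classroom-sized groups, the same difference is nowhere near significant.
z_small, p_small = two_sample_z(MEAN_A, MEAN_B, SD, n_per_group=20)

# The standardized effect size does not depend on the sample size at all.
effect_size = (MEAN_A - MEAN_B) / SD

print(f"n=5000 per group: z={z_large:.2f}, p={p_large:.2g}")
print(f"n=20 per group:   z={z_small:.2f}, p={p_small:.2g}")
print(f"effect size (difference / SD): {effect_size:.2f}")
```

Under these assumptions the p-value changes dramatically with the number of children measured, while the quarter-inch difference, and its practical importance, stays exactly the same.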

One really inopportune use of statistical inference is when the sample sizes are very small. In this instance it is indeed likely that the samples are not normally distributed and that they have a high likelihood of being biased. Here any p-value can potentially lead us down a garden path to nowhere.

The volume of research papers is increasing (from just over 400,000 in 1990 to almost 1,200,000 in 2014, almost a trebling over 25 years), and these papers often depend on “statistical significance” to get published. The exact proportion of irreproducible results among these millions of papers, unfortunately, isn’t clear.

One message that has been proposed is that, before we change our perspective, the first paper to show a difference or correlation really needs independent confirmation. Perhaps the second paper is potentially more important than the first. In addition, we might be well served to ask, “So what? Is this a meaningful result, regardless of the p-value?”
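One way to see why a confirming study matters is a simple Bayesian calculation of how likely a “significant” finding is to be real. The sketch below is an illustration only, not taken from either article: the prior probability of 10%, the power of 80%, the significance threshold of 0.05, and the function name `prob_true_given_significant` are all assumed values chosen for the example.

```python
def prob_true_given_significant(prior, power=0.8, alpha=0.05):
    """Probability that an effect is real, given one significant result.

    prior: assumed probability that the effect is real before the study
    power: assumed chance that a study detects a real effect
    alpha: chance of a "significant" result when there is no real effect
    """
    true_positives = power * prior
    false_positives = alpha * (1.0 - prior)
    return true_positives / (true_positives + false_positives)


# Hypothetical starting point: 1 in 10 tested hypotheses is actually true.
after_first_paper = prob_true_given_significant(prior=0.10)

# An independent confirmation starts from the updated probability.
after_second_paper = prob_true_given_significant(prior=after_first_paper)

print(f"after the first significant paper:  {after_first_paper:.2f}")
print(f"after an independent confirmation: {after_second_paper:.2f}")
```

With these assumed inputs, a single significant result leaves roughly a one-in-three chance that the finding is spurious, while an independent confirmation shrinks that chance to a few percent. Different assumptions would give different numbers, but the direction of the change is the same.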


[ii] Chavalarias, D; et al: Evolution of Reporting P Values in the Biomedical Literature, 1990-2015: JAMA, 2016, 315, 1141-1148

Posted in General Interest, Guidelines, Health Information, Statistics and Decision Making

What Happens to my medical information? Where is it?

In this week’s MedPage, a medical information website, there is a post by Dr. Leonard Lichtenfeld about sharing personal health data. He was asked to sign a consent for information sharing that may have essentially taken away any control he had over how his personal health information, or Protected Health Information (PHI)[1], could be shared. I too had a screening colonoscopy and did sign the consent, because I thought it was a necessary evil of getting the procedure done. This unfettered sharing of PHI may be frightening. On the other hand, not sharing information may hinder delivery of optimal care. Tomorrow, a relative of our family is going to a new physician for an evaluation. She has been getting care for urinary and skin infections at an urgent care center near her home. One of her caregivers asked me whether the new physician could access her records to see whether any of the meds that she is getting may relate to the recent deterioration in her health. This brought up the question of sharing health information and how it happens, or doesn’t.

When I was in training, each person who saw a patient did a History and Physical (H&P). The format was fairly well prescribed.[2] After doing my H&P, I went to a room and wrote my findings on paper that was incorporated into the hospital chart for that patient for that admission. We then wrote progress notes daily or more often. After the patient left the hospital, I was required to dictate a Discharge Summary that hopefully went to the PCP and/or referring physician. All the paper from that admission was bound into a hospital chart, some of which ended up being over a foot and a half tall. Many times, when I was seeing a patient and asking the same questions again, he/she would say something to the effect of, “Isn’t this all in my chart?” or “Get this from my chart”. I often would reply, “Yes, some of it is, but I just want to be sure that I get it all right”. In the early 1990s, my partner and I began to keep much patient information on our desktop computers in a flat file database (AppleWorks®)[3]. Our computer records migrated to disks that we would carry home, then to a laptop, and then to our own intranet site. Having these data available made phone calls easier, and often allowed us to rapidly answer consultants’ questions. In the mid 1990s, we were introduced to the EPIC® EMR. It seemed wonderful, even if it slowed us down a bit. We were able to access our complete records immediately in the office (no more asking someone to “pull a chart” from the storage area), from the hospital when we were making rounds, and even from home at night or on weekends. Having the chart available anywhere at any time allowed us to provide more comprehensive care. We felt that, while it wasn’t perfect, this was an improvement in the way we kept patient information for ourselves. Patients themselves couldn’t access what we had, though.

In his State of the Union Address on Jan 20, 2004, President Bush proposed computerizing medical records to help “avoid … medical mistakes, reduce costs, and improve care”[4]. There was little other substance at that time. However, later that year his White House page proposed an ambitious goal: “Within the next 10 years, electronic health records will ensure that complete health care information is available for most Americans at the time and place of care, no matter where it originates. Participation by patients will be voluntary.”[5]

The EMR was supposed to help us share information between health care providers more easily. To date this hasn’t happened.

2009 brought ARRA[6] and HITECH[7] to help with information sharing. Health Information Exchanges (HIEs) were supposed to allow sharing of information between physicians and other caregivers and between hospitals and physicians. There was a flurry of activity to try to establish HIEs in cities and states. As of December 2015, none of these have really succeeded. This is at least in part because of inertia, and also because there appears to be no incentive for providers to share medical information. There is no financial advantage to having institutions or systems share information. In fact, one motive not to participate in an HIE might be that keeping the information confined to one hospital or system will encourage the patient to come back to that site for care. A further disincentive is the privacy issue (often called HIPAA[8]). Sharing PHI is restricted by Federal mandate[9]. At the minimum, someone has to sign a consent or release for information to be shared (Dr. Lichtenfeld didn’t sign a blanket release).

Equally important, there are almost no data supporting HIE use in decreasing healthcare costs.

So, where does this leave the caregivers for my relative? Getting the information from the convenience clinic to the new provider will require her, or a person holding her Power of Attorney, to go to the clinic and sign a release of medical information. Hopefully the clinic will then release the data about her care so that any potentially important information can be shared with the new physician group, which will be evaluating an entirely different aspect of her illnesses. It is unlikely that this information will be electronically available.

For those who are inclined to be more proactive, keeping a Personal Health Record (PHR) on a mobile device[10] should allow better continuity of care. The PHR has the potential advantage that individuals can keep records themselves and can then choose which providers may access their PHI. There is a website that can help us all understand some of the issues around PHR activities[11]. Alternatively, many hospitals and physician practices have a patient portal that allows patients, or those designated by patients, to access most, if not all, of the data that exist in that provider’s EMR. These data may then be shared with other providers.

[1] PHI is defined in HIPAA (see below)
[2] In the 1970s the way that we acquired and recorded information was modified by Larry Weed:  Weed, LL: Medical Records that guide and Teach; N Engl J Med. 1968;278:593-600.
[3] This was in addition to our charting in the office paper chart.
[4] State of the Union address, published in Washington Post (
[6] American Recovery and Reinvestment Act of 2009
[7] Health Information Technology for Economic and Clinical Health – a component of the ARRA
[8] Health Insurance Portability and Accountability Act (Kennedy-Kassebaum Act of 1996)
[9] history of Kennedy Kasselbaum?)
[10] There are over 7 potential PHR apps:

1. Microsoft HealthVault (
2. WebMD Health Manager (
3. My Medical (
4. Capzule PHR (iOS only)
5. iBlue Button (
6. NoMoreClipboard (
7. Track My Medical Records


Posted in General Interest, Health Information Exchange, Medical Records, Personal Health Record

We have “Information Overload” in Clinical Guidelines.

There is an increasing push for physicians to practice “Evidence Based Medicine”. However, the “evidence” is getting harder and harder to come by. The creation of “Guidelines” by expert bodies may be of little help.  There are simply too many of them.

Alvin Feinstein, MD (1925-2001) and David Sackett, MD (1934-2015) began to consider how evidence might be applied to clinical decision-making (Clinical Judgment) in the late 1960s and early 1980s, respectively. One of Sackett’s students introduced the term EBM to the medical profession in the 1990s. Sackett graduated from the University of Illinois Medical School and practiced Internal Medicine and Clinical Epidemiology at McMaster University in Hamilton, Ontario.

In the 1970’s, the randomized clinical trial (RCT) came into favor as the preferred way to determine the efficacy of treatments. Since then the types of papers published have changed markedly. No longer do we have many case reports, small studies or “reviews of the literature”. Most medical publications are now replete with RCTs of various size and complexity. The number of papers published in biomedicine has increased by 5-6 times from the approximately 50,000 a year in 1970. Keeping this published information straight has become harder and harder, some might say almost impossible. Sometimes a series of RCTs or observational studies are combined into a “Meta Analysis”. One might say that a MA is simply a more structured “review of the literature”, with a somewhat stronger mathematical bent.

Until the 1970s, guidelines rarely existed. Some documents proposed schemes to help with diagnosis. These were initially the work of a single expert clinician[i]. Today guidelines are frequently produced to help clinicians in the care of chronic diseases[ii] or for use in imaging and diagnostic testing[iii].

One of the very first sets of standards for the provision of one form of patient care was the “Standards for cardiopulmonary resuscitation (CPR) and emergency cardiac care (ECC)”, published in 1974[iv]. In the next edition[v] the guidelines were specifically designated NOT to be a legal document and were NOT to be construed as evidence in legal proceedings.[vi]

Dr. Sackett always regarded the evidence base as one of several components of patient care. He once said that EBM had “… three arms: very good evidence, seen by a very good clinician and integrated with patients’ expectations.”[vii]

A review of data at the National Guideline Clearing House (NGCH), a service of the Agency for Health Care Research and Quality, revealed over 2,400 unique guideline sets (GS). There are 111 unique agencies that each participated in the development of more than 5 GS. There is frequently significant overlap, with several organizations collaborating on several sets of guidelines. The table shows that there are MANY sets of guidelines for 7 chronic cardiac conditions. It also illustrates how rapidly the guideline numbers are changing:

Table: Some Cardiac Conditions with Guidelines at NGCH

                                         Number of Guidelines
Condition                      September ‘15   October ‘15   November ‘15
Hypertension                        442            468            486
Heart Failure                       369            515            517
Myocardial Infarction               201            230            230
Peripheral vascular disease         151            151            188
Atrial Fibrillation                 108            108            129
Angina                               78             78             86
Aortic Aneurysm                      53             53             59

It may strain one’s credulity to imagine that any single physician would be able to evaluate 442 sets of guidelines for the evaluation and management of hypertension, or that any heart failure clinic would have evaluated and combined the recommendations of 369 sets of guidelines for heart failure.

Looking at Heart Failure alone, I found 486 sets of guidelines that relate to heart failure. Of those, 113 have been published or revised since 2000. Of those 113, 30 are primarily related to the diagnosis or treatment of Heart Failure or CHF. The remainder contain statements on how the presence of heart failure may modify the recommendations in that set of guidelines.

The American College of Radiology leads the list of GS, having collaborated in creation of  239 sets. The National Institute for Health and Care Excellence (NICE) in Great Britain was the next most prolific organization publishing 211 sets. One hospital has developed 112 GS, which they called “best evidence statement” or BEST.

In an effort to be thorough, medical organizations and societies may have actually complicated the job of the clinician in his/her quest to remain up to date on what is “the most appropriate” strategy for diagnosis and treatment.

This leads to the question, “Why is this guideline group promulgating this set of guidelines?” It would appear that at least some groups are trying to give a “distinctive voice” to a unique subset of stakeholders in any specific clinical condition.

What are we to do with this “Tower of Babel” of guidelines?
The clinical system or clinician who might want to use summaries of available evidence to help with clinical judgment would be well served to ask whether each source of available guidelines is relatively unbiased and associated with a nationally recognized leader in the field[viii]. Governmental sources are frequently relatively unbiased[ix]. National Medical Associations[x] are generally considered reputable organizations with minimal biases. Recommendations published in the journals sponsored by them should be well rounded. Finally, another filter to help find reliable guidelines is where they are published. Guidelines published in well-recognized journals with a rigorous peer review process are more likely to be reliable. Whether a provider group should accept a published series of guidelines or try to synthesize an analysis of others may be up to the group. However, in spite of the best intentions of the guideline writing groups, payers, lawyers and quality review organizations most often refer to a published set of guidelines. This often makes trying to create an individual set of guidelines counterproductive.

[i] The Jones’ Criteria for Diagnosis of Rheumatic Fever published in 1944 is one example   – the problem with these often is that there was no “gold standard”
[ii] Hypertension, Heart Failure, Diabetes, Asthma are some conditions for which guidelines may be necessary.
[iii] ACC – Appropriate Use Criteria – the first set of AUCs appeared in 2005
[iv] JAMA, 1974, 227 Suppl: 833-868
[v] JAMA,1980, 244, 453-509
[vi] ibid p 505.
[vii] J of Undergraduate Life sciences; 2010, 45, 66-67
[viii] In cardiovascular disease some relatively unbiased sources might include the American College of Cardiology, American Heart Association, and the European Society of Cardiology
[ix] United States Public Health Service (USPHS), the National Institute for Health and Care Excellence (NICE) in the United Kingdom are examples
[x] British Medical Association, American Medical Association, Massachusetts Medical Society among others

Posted in CV, Guidelines, Policy, Quality

Helping our Patients and Ourselves Navigate the Internet for Reliable Health Information.

In June 2015 Dr. Arthur Caplan opined on Medscape that physicians should be prepared to help patients in some way as they try to navigate the morass of medical information that is available on the Internet[i]. One oft-quoted study from the Pew Research Center (2013) suggests that over half of US adults have looked for health information on the internet and that up to 80% begin at a commercial search engine. What is less clear, however, is how many people have their attention waylaid by some other web-based source of medical information, but it is likely a large number. Caplan suggests that at least several of these sources may have “Evil People” behind them.

There is generally some skepticism about information on the Internet, but this degree of skepticism is likely not as prevalent as would be desirable. In 2012 the State Farm Insurance Company sponsored a TV ad that began with the skit that ran:

“ ‘Where did you hear that?’; ‘The Internet’; ‘And you believed it?’; ‘Yeah, they can’t put anything on the internet that isn’t true.’; ‘Where did you hear that?’; ‘The internet’.”


If we are to be able to help our patients navigate the Internet, it would be helpful to determine what is easily available. There are several search engines, or search engine groups. One website posited at least 12 sites in addition to Google, which seems to be the de facto market leader, with upwards of 65% of searches beginning there.

In order to see what was available, I looked for health related information, in the top 4 commercially available search engines (Google, Bing (a Microsoft site), Yahoo, and Ask). I used the initial search term “Health Information Websites”. There are well over 180,000,000 active sites identified by each of the search engine sites (Yahoo claimed over 480,000,000 sites).

Some sites that may be helpful in finding reliable information include two on evaluating websites:

  1. accessed 7/6/15.

This one page site with an easy to read and follow process map on evaluating a webpage, is from the National Network of Libraries of Medicine, and may be among the most important sites available.

 2. accessed 7/6/15

This is a website of the: Consumer and Patient Health Information Section of the Medical Library Association, Inc ( ) , which has an interesting “User’s Guide” for finding and evaluating health information on the web (

 They then have a list of “top 100 health websites you can trust”, which refers to NIH and other governmental websites as well as websites for many of the major disease specific professional organizations. This may be a good approach, but would likely be frustrating to a person who has a simple question to answer.

 Evaluating Information on the Web:

In looking at a website to estimate reliability, most sources suggest looking at the site itself for its “professionalism”. Secondly one should ask whether there is an evident bias in the material – especially if the site is trying to sell something or asking for money. Thirdly, a searcher should ask whether the information that is being purported as real, is backed up with information from other sites. Finally, if the information on the site is supported by references or hyperlinks, it is more likely to be reliable. Evaluating these four characteristics: Professionalism, Bias, Uniformity through several sites, and provision of References should allow a person looking for health information to find reliable information upon which to make decisions.

Perhaps the easiest site to use and find reliable information is the Website of the National Institutes of Health:

My interpretation after spending time on the site is that it is well organized and has a robust search window, as well as directions to topics of interest, such as clinical trials, Medicare, and others. It ranked in the top 3 results on all of the search engines.

Another Governmental Site that has links to many useful concepts is:

Other sites that were common to all the search engines were:

There are many more, often included in the website of universities, insurance companies or other large healthcare providers, but if we send our patients, friends and other associates to these six sites, they will have a very good head start on their journey through the minefields of getting health information.



[i] Caplan, A.L.: Are Evil People Influencing Your Patients? Medscape, Jun 24, 2015

Posted in General Interest, Health Information, Health Information Exchange, Literature

Dr. Gawande has done it again – almost – a review of “Being Mortal”; Gawande, Atul; Metropolitan Books; New York; 2014

This book is almost on track to be a potential game changer.

The title is engaging. However, on my first reading, I found the book a little difficult to follow. Dr. Gawande has essentially written about two distinct components of “the modern experience of mortality”. The first five chapters discuss aging and the optimization of the life experience of aging patients. The second portion of the book deals with care in advanced disease, mostly in the context of widespread cancer.

As has been his habit in his other books, Atul tells stories of his own experiences and of interviews he has done with some innovators in the delivery of care. Keeping in mind that the plural of anecdote is not data, he uses stories to make his use of data more personal and meaningful to a lay reader (there are 12 pages of citations – pp. 265-277). As usual, Dr. Gawande is thoughtful. In this book he may be even more provocative than he has been before.

He has investigated the nursing home concept as it applies to the care of elders who have lost the capability of being fully independent because “things fall apart”. He makes a clear argument that aging is not a medical condition, but the result of “the accumulated crumbling of one’s body systems”, including unsteadiness, loss of position sensation, loss of flexibility, and muscle weakness. He also notes that many conditions deteriorate with the passage of time, a course that is called the “natural history” of the disease. Illnesses such as Heart Failure, Emphysema, and Atherosclerosis are examples of these. Some forms of arthritis are also often considered a natural component of aging. Dr. Gawande includes stories of many facilities that have improved the experience of living in older age by assisting with living, not assisting with dying. He makes the distinction between helping people live in old age and managing the dying experience. In the dying experience, where the goal is “patient safety”, elders often end up with a “life designed to be safe, but devoid of anything that they care about.” He quotes Bill Thomas, MD, who describes what he calls the three plagues of nursing homes for the aging person: “Boredom, Loneliness, and Helplessness”. There are stories of facilities for elders that help some seniors live better, including Park Place in Oregon, Chase in upstate NY, NewBridge on the Charles in Boston, and Peter Sanborn Place in Reading, MA, among others.

The second portion of the book addresses a concept that physicians often refer to as “futile care”, almost always related to diagnoses of cancer. He is not addressing things such as treating cancer when it is first diagnosed in early stages, but rather the continuing use of newer therapies that may prolong life by a short period of time (often measured only in days or weeks) at the expense of quality of life because of the side effects of treatment. He uses the example of his own father, who lived for a prolonged period with a tumor in his cervical spine because he still had life experiences that he wanted to accomplish. Dr. Gawande introduces us to Dr. Susan Block, who had helped develop the concept of asking what is important to people who may have to make hard choices. Keeping the discussion in line with the individual patient’s values and goals, as a means of directing treatment decisions, should increase patients’ quality of life in times of difficult conversations. He also discusses the benefits of hospice care and of making advance directives, using the experience of La Crosse, WI, where there was a concentrated effort to improve end of life discussions so that physicians knew what should be considered if and when a patient came for care. He also discusses several data sets that suggest that hospice care is associated with increased, not decreased, longevity in patients with advanced disease that is not responding to “modern medical therapy”. Gawande then points out that he is not suggesting giving up early, but that the physician directing care should be like an army general … “in a war that you can’t eventually win, you don’t want Custer. You want Robert E. Lee, someone who knows how to fight … and how to surrender when you can’t” win.

In several parts of the book, I teared up, but then I am a softy.

If I had my choice, I would have liked some help in keeping track of the characters in his stories and some of the concepts that he is discussing. I counted at least 12 different patient stories, which were sometimes scattered throughout the book. There were over 8 physicians and several other key people. An index might have made keeping up with them all a little easier. Also, I would have found it easier to understand the book if Dr. Gawande made explicit the two different segments of his arguments.

Posted in General Interest

General Shinseki Needn’t Have Been Ousted – He Was Betrayed

At the end of May, after a series of exposés and congressional hearings, General Eric Shinseki was pressured to resign as Secretary of the Department of Veterans’ Affairs. The major reason for his departure was that the department, including up to 1,700 potential sites of care, couldn’t see Veterans in a timely manner. These problems, including some misreporting of wait times, have been known to the VA since at least 2005 (OIG Report of 5/28/2014). At some time a program was instituted to incentivize the CEOs of individual VA facilities to create a culture of rapid response to a request for an appointment. If appointments were reported to be available, the CEO and staff could receive a financial incentive. It should not come as a surprise, then, that intelligent people who were trained in a business model were able to find a way to “hide” the fact that many veterans (1,700 in Phoenix alone) were not getting appointments within 2-3 weeks of a request. A “ghost” waiting list was kept in at least two hospitals (Phoenix, Arizona, and Hines, IL – interestingly, one common thread between those hospitals is that the CEO was the same person – first at Hines (Feb. 2010-2012) and then at Phoenix (2012-2014)). This CEO is reported to have received at least one significant financial reward for what appears to be misreporting the results of her administration. This is almost a perverted application of the concept “you get what you pay for or what you measure”.

If Gen. Shinseki’s transgression was that patients were not being seen promptly, it might be that he believed what he was being told. General Shinseki certainly understood leadership. He had participated in writing a book (Be, Know, Do: Leadership the Army Way). He certainly was aware that more junior military officers were supervised and mentored to ensure that they understood ethics, how to adjust to stress and how to adapt in order to achieve the commander’s intent. There are multiple data/opinions (from the lay press, but buttressed by data from at least the Harvard Business School and Northwestern’s Kellogg School of Management among others) suggesting that, in many business settings, CEOs with military experience tend to be highly ethical and generally able to lead civilian organizations to success. The bureaucratic leaders that Shinseki inherited in the DVA didn’t seem to embrace this ethical culture. He observed, “I can’t explain the lack of integrity among some (italics are mine) of the leaders of our healthcare facilities. This is something I rarely encountered in 38 years in uniform.” In addition, one might question whether the employees who staff the VA system understand the overall mission and goals of the VA system – VA is committed to developing a culture that is advanced, forward-thinking and completely Veteran-focused.

We might take as a lesson that in large organizations new leaders should entertain a healthy degree of skepticism about the ongoing conduct of the staff. General Shinseki certainly knew about leading by walking about – being visible to his subordinates and reinforcing the message of the mission. If he had been aware of the problem of having veterans seen promptly, would he have been able to convince his superiors (congressional committees) to increase funding for the VA for more providers? About 3 years ago, I volunteered to help in the clinics at a local VA hospital outpatient department. I was told that there was no need for more physician coverage. In retrospect, I doubt that. There may have been no budget for more physician coverage, but they might have accepted volunteer physician help.

Dealing with a civilian bureaucracy, with trade unions representing much of the workforce, was certainly something that military training may not have prepared a CEO to handle. This structure would have made it more difficult to reprimand recalcitrant staff than it would have been in the army. However, I would hope that a military leader could emulate General Marshall, who is reputed to have taken ineffective commanders out of their role and then given them a second chance to acquire the skills to subsequently become real leaders. This would seem to be a better way to help a subordinate grow than simply removing or firing someone. Those working within the framework of a second chance may be more motivated to embrace and encourage the culture that we are looking toward. There are suggestions that ongoing culture and competence training of VA intake or appointment staff wasn’t continued. Many effective organizations (Mayo Clinic for example) have ongoing culture classes for all levels of the organization. In addition, a very clear but simple set of values and expected behaviors that is promulgated prominently throughout the organization should help improve honesty (Dan Ariely in “Predictably Irrational” (2008) showed that this type of nudge can influence behavior).

Could the VA scandal have been prevented? – In all likelihood yes. Would it have been easy to prevent? – No. How did it happen in an organization that was thought to be amongst the best in the late 1990’s? The most likely answer to that question is that someone took her/his foot off the gas that had kept ongoing training and culture intact and allowed the system to sink back into mediocrity. The new leadership didn’t go back to the beginning to ensure that meritocracy resurfaced.

Posted in General Interest, Leadership, Policy

Diagnosis may be the Achilles Heel of Incentive Based Payment.

“Diagnosis is the mental act of selecting the one explanation most compatible with all the facts of clinical observation”.  – Raymond Adams in Harrison’s Principles of Internal Medicine – 4th edition

In almost all instances, Government and other third party payer incentives for improving performance in medicine rely on a clinical diagnosis upon which to judge performance. The data reported in Hospital Compare rely on an accurate and inclusive diagnosis of Myocardial Infarction, Congestive Heart Failure (CHF), and Pneumonia for hospital ratings. In each of these clinical conditions there are specific criteria for making the diagnosis. However, also in each, the diagnosis must be considered before the diagnostic criteria can be applied. Once criteria are applied they must be evaluated. There is no single agreed set of criteria for the diagnosis of CHF, for example – there are several sets of diagnostic “criteria”, including proposals by the Framingham study group[1], a Harvard Study Group[2], and a group from the University of Virginia[3], which have sensitivities ranging from 0.41 to 0.71 and specificities from 0.89 to 0.97[4]. The addition of BNP values doesn’t help much, especially when the diagnosis is not suspected. Estimates of error in diagnosis suggest that there is an error in somewhere between 10% and 15% of encounters[5],[6]. Many of these may never be detected (see our prior post on a diagnostic error, “Who worries about physician behavior …”). If a patient has a clinical condition, but it is not appropriately diagnosed, then that patient never appears in any denominator of performance (in either process or outcome measures).

Consider a hypothetical patient who is overweight (BMI 32), smokes, is mildly short of breath (SOB), coughs and has mild ankle swelling. If this patient is diagnosed as being obese and having chronic pulmonary disease, then the patient will be looked at as if COPD were the problem. Suppose again that this patient comes into the hospital with a mild fever and an increase in cough and SOB, is diagnosed as having an exacerbation of COPD, is treated with antibiotics and recovers after 3 days. Again, the performance measures are met for COPD. Three years later, the same patient comes in with orthopnea, PND, moderate ankle edema and cardiomegaly. At this time, the diagnosis of CHF is entertained. For four years this patient has been considered, by those looking at quality metrics, as having a condition that, in retrospect, was probably not the correct diagnosis. For those years, if there was P4P, P4Q or some other reward system, the physician practice or health care system would have been the recipient of inappropriate incentive compensation.
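The denominator problem can be made concrete with a little arithmetic. The sketch below is illustrative only: the cohort size and the true prevalence of CHF are assumptions chosen for the example, not figures from any study. Only the sensitivity and specificity ranges (0.41 to 0.71 and 0.89 to 0.97) come from the criteria sets cited above, and pairing the low values together and the high values together is also a simplification.

```python
def two_by_two(cohort, prevalence, sensitivity, specificity):
    """Split a cohort into the cells of a 2x2 diagnostic table.

    cohort and prevalence are hypothetical inputs for illustration.
    """
    diseased = cohort * prevalence
    healthy = cohort - diseased
    true_pos = diseased * sensitivity          # correctly labelled CHF
    false_neg = diseased - true_pos            # CHF present but missed
    false_pos = healthy * (1.0 - specificity)  # labelled CHF in error
    labelled = true_pos + false_pos            # the metric's denominator
    return {
        "true_pos": true_pos,
        "false_neg": false_neg,
        "false_pos": false_pos,
        "labelled": labelled,
        "ppv": true_pos / labelled,  # share of the denominator that is real CHF
    }


if __name__ == "__main__":
    # Assumed: 1,000 patients, 10% of whom truly have CHF.
    for sens, spec in [(0.41, 0.89), (0.71, 0.97)]:
        cells = two_by_two(1000, 0.10, sens, spec)
        print(
            f"sens={sens}, spec={spec}: "
            f"missed={cells['false_neg']:.0f}, "
            f"mislabelled={cells['false_pos']:.0f}, "
            f"true share of denominator={cells['ppv']:.2f}"
        )
```

Under these assumed inputs, the patients counted in a CHF performance measure are a mix of true cases and mislabelled ones, while the missed cases never enter the measure at all.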

There are other areas where an error in diagnosis is important – a diagnostic error can, as shown above, delay initiation of appropriate treatment for a patient. There is legal exposure for diagnostic errors. Some estimates suggest that over 29% of malpractice claims and judgments are for diagnostic errors[7].

Defining diagnostic errors themselves is difficult – the final arbiter may always be challenged. Coming up with the causes of diagnostic errors is even harder. Not all physicians are equally adept at arriving at a correct diagnosis (sometimes even an experienced physician, when presented with the same clinical context, may not arrive at their own prior diagnosis). Certainly research and training of physicians should help in understanding causes of error and increase vigilance to try to avoid such errors. Sometimes it may be enough to encourage a diagnostician to be aware of biases that may cloud judgment (one estimate is that there are over 15 potential biases that may impede accurate diagnosis)[8]. In other instances it may be enough to encourage diagnosticians to be aware that overreliance on heuristics in approaching a patient can lead to errors in reasoning.

Expert diagnosticians in the past used to insist that, after a diagnosis was reached, the clinician keep an open mind by defining a minimum of 3 alternate explanations for the clinical presentation – called a “differential diagnosis”. Overreliance on advanced diagnostic imaging may also lead the clinician astray. Many physicians and surgeons believe that advanced imaging techniques such as CT and MRI scans are a “gold standard” of anatomic diagnosis. However, almost every orthopedic surgeon can recall several instances in which the MRI scan suggested a diagnosis that either was not confirmed or had nothing to do with the patient’s illness/complaints. One study suggests that for the diagnosis of meniscus tears the MRI is only accurate (compared to intraoperative findings) approximately 75% of the time[9]. Another showed that operating on the back based on the findings of an MRI exam didn’t necessarily improve patient symptoms[10],[11].

The real gold standard may still be the autopsy, which has fallen out of favor as a check on our clinical diagnosis. William Osler considered the autopsy so important in his and others’ education that he did his own. Richard Cabot brought the autopsy to the fore when in the early 1900s he proposed the Clinical Pathologic Conference as a teaching tool. This became formalized in 1925 when the NEJM began publishing a weekly CPC under the rubric of Case Records of the Massachusetts General Hospital[12].

Diagnosis has taken a back seat to proceduralism today, partly because, as many other commentators have pointed out, there is little time or reward for non-procedural patient encounters. This leads to potential skimping on taking a thorough history, performing a complete physical exam and coming up with a differential diagnosis, because this behavior is not rewarded in today’s fee for service system. In spite of the importance of reaching the correct diagnosis to direct the correct treatment and to determine the appropriateness of pay for performance/quality, the outstanding diagnostician isn’t rewarded. Not all physicians today are outstanding diagnosticians, nor were they in the past. In the past, going to see a “diagnostician” was something patients often valued. Then it was understood that a prerequisite to effective treatment was the right diagnosis.

[1] McKee NEJM, 1971 285, 1441

[2] Carlson: J Chron Dis, 1985, 38, 733

[3] Gheorghiade, M et al; Am J Card, 1983, 51, 1243

[4] The sensitivity and specificity values in all three were against an “expert” diagnostician or panel of expert diagnosticians.

[5] Graber, M: Joint Commission J on Quality and Patient Safety, 2005, 31, 106

[6] Berner, ES, Graber, M: Am J Med 2008, 121, S2

[7] Tehrani, ASS, et al: BMJ Qual Saf, 2013, 22,672

[8] Landro, L; Wall Street Journal 2013, Nov 17: Accessed 1/21/14  Subscription may be required

[9] Hardy, JC et al: Sports Health 2012, 4, 222

[10] Deyo, RA et al: J Am Board Fam Med 2009, 22, 62-68

[11] accessed 1/21/14

[12] Roberts CS. The Case of Richard Cabot. In: Walker HK, Hall WD, Hurst JW, editors. Clinical Methods: The History, Physical, and Laboratory Examinations. 3rd edition. Boston: Butterworths; 1990. Available from:

Posted in General Interest, Policy, Quality, treatment options

What is Quality? It Depends on Who Does The Measurement

As I said last month, quality is difficult to define and is almost in the eye of the beholder. This is very much like Humpty Dumpty’s assertion that, “When I use a word, it means just what I choose it to mean – neither more nor less”. Maybe, then, quality is simply whatever a group says it is measuring. There are many groups that report quality metrics to the public in the “lay press”. There are more than a dozen sets of publicly reported quality metrics. These are, in no particular order:

  1. US News and World Report annual surveys
  2. Consumer Reports
  3. Hospital Compare (CMS website’s public reporting)
  4. National Quality Forum
  5. HealthGrades
  6. The Leapfrog Group
  7. Truven (used to be Thomson Reuters which itself used to be Solucient)
  8. The Joint Commission (with its ORYX set)
  9. National Committee for Quality Assurance (with its HEDIS – Health Plan Employer Data and Information Set (1998), or Healthcare Effectiveness Data and Information Set (2012))
  10. Premier Healthcare Alliance (with its QUEST – QUality Efficiency Safety & Transparency) reports
  11. Several nongovernmental insurers including:
    1. Blue Cross – Blue Shield (with its Blue Distinction)
    2. United Health
    3. Humana
    4. Aetna

The number of metrics going into a hospital ranking is not consistent, ranging from approximately 8 to more than 80. In addition, a single ranking organization will often change the metrics from year to year. Many of these organizations use data from other sources to help with the quality rankings. One of the most often used outside sources is the Agency for Healthcare Research and Quality, which provides the Consumer Assessment of Healthcare Providers and Systems (CAHPS) scoring – first used in 1995. Many healthcare systems and providers also use the Press Ganey Company (founded in 1985) scoring of patient satisfaction.

It should not be surprising that none of these several organizations report on the same set of metrics including the same process or outcome measures. I compared the top several hospitals in the Chicago area by each of the rating groups. None listed the same hospitals.

Are there potential unanticipated consequences of reporting quality? Some providers (practitioners and health care systems) may tend to skew their behavior toward a quality metric that may or may not be really associated with a desired outcome. One that was recently reported was how physicians are potentially providing unnecessary and potentially harmful services in order to help with Press Ganey satisfaction scores[1].

In the mid 1990’s one ranking agency listed a Chicago area hospital in the top three in the region. The next year, the Health Care Financing Administration (the predecessor of the Centers for Medicare and Medicaid Services of the federal government) determined that there were enough potential quality lapses that it began an investigation of that same hospital. Thirteen years later, the hospital closed. This example suggests that rankings by outside agencies may not always be a reliable indicator of a healthcare provider’s performance.

There are, however, data that suggest that those healthcare systems that do better on any one scoring system tend to have better “hard” outcomes such as lower readmission rates, lower hospital acquired complication rates, and even lower mortality rates, than those that don’t rank well on any. These data are, however, likely skewed by reporting biases and are not necessarily embraced by all those who are looking at the research.

What is one to do with this plethora of data?

Hospitals and Systems might charge the clinical leaders such as the CMO & CNO with picking one or two of the major reporting groups to report to. Institutions should do well by marketing or advertising favorable results from a reporting group. Thus, I would posit that this initiative should be supported from the marketing department budget.

Consumers (patients, families and caregivers) should review their hospital’s web site to see what it is reporting. They should also look at one or two sites that rank their local hospitals, in addition to Hospital Compare, to determine whether those sites look at things like:

  • Re-admission rates (higher readmission rates expose patients to lost time out of hospital and increased likelihood of hospital acquired conditions such as infections and adverse reactions to medications, among others).
  • Rates of hospital acquired conditions.
  • 30-90 day post hospital mortality from common illnesses such as heart attack, pneumonia, or surgery among others.
  • Use of Electronic Medical Records (EMR) and connectivity to physician offices as well as ability to access “your” data.
  • Patient satisfaction scores (what percentage of patients would recommend the hospital/system to friends or relatives) – while these are potentially biased, they among other considerations can help suggest whether you may want to use that system.

To date there is minimal transparency of reliable data for patients to use to determine what real quality exists in their health care delivery system. Some of these considerations may help in choosing wisely.

[1] accessed 2/25/13

Posted in General Interest

What is Evidence Based Medicine?

One definition would be: Delivery of Medical Care based on results of best available evidence. This usually means finding or relying upon data, some of which will be from outside one’s immediate memory to help answer a clinical question. EBM began to be championed by investigators, probably in the early 1990’s[1].

Studies that comprise the evidence are often in the form of Clinical Trials. Clinical Trials are a relatively recent addition to the armamentarium of the tools available to clinicians – the first Randomized Clinical Trial (RCT) was done in 1952[2]. Trials should, but often don’t, look at how an innovation compares with prior knowledge, standards of care, or diagnostic accuracy. This is called comparative effectiveness (CE). In addition, trials often achieve “statistical significance”, which may, however, be of little pragmatic significance. The biological significance of much of the newer data is frequently incremental, not disruptive. Trials are almost never done to look at cost, although some report cost considerations in analyses that are done after the original publication. When cost does become a consideration, political or other influences challenge the focus or reduce the funding of whoever is doing the research. For instance, the US OTA[3], which functioned from 1972 to 1995, was dissolved as a result of outside influences, even though it had worked to improve care in the 23 years the office was in existence.
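The gap between statistical and pragmatic significance is easy to see with a worked example. The sketch below uses an entirely hypothetical trial (the arm sizes and event counts are invented for illustration) and a standard two-proportion z-test with a normal approximation. It then reports the absolute risk reduction and the number needed to treat, which are closer to what a clinician or payer cares about.

```python
import math


def two_proportion_summary(events_control, n_control, events_treated, n_treated):
    """Two-sided z-test for a difference in event rates, plus ARR and NNT."""
    rate_control = events_control / n_control
    rate_treated = events_treated / n_treated
    pooled = (events_control + events_treated) / (n_control + n_treated)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_control + 1 / n_treated))
    z = (rate_control - rate_treated) / std_err
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided, normal approximation
    arr = rate_control - rate_treated           # absolute risk reduction
    nnt = 1 / arr if arr > 0 else float("inf")  # number needed to treat
    return {"z": z, "p_value": p_value, "arr": arr, "nnt": nnt}


if __name__ == "__main__":
    # Hypothetical trial: 20,000 patients per arm, 5.0% vs 4.4% event rates.
    result = two_proportion_summary(1000, 20000, 880, 20000)
    print(f"p-value: {result['p_value']:.4f}")
    print(f"absolute risk reduction: {result['arr']:.3%}")
    print(f"number needed to treat: {result['nnt']:.0f}")
```

With a large enough trial, a difference this small clears the conventional p < 0.05 threshold, yet well over a hundred patients would need treatment to prevent one event. Whether that is worthwhile depends on cost and harm, which the p-value says nothing about.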

When evaluating evidence one needs to be aware of at least three considerations/tests:

  1. What is the source of the evidence? (Can you identify potential biases? Are the data reliable? Have non-confirming data been looked for?)
  2. What is the Strength of any recommendations?
  3. What is the Strength of the data that underlie the recommendations?

The AHA/ACC construct on strength of recommendations and the data supporting them is as good as any. Most other writing groups use some variations of those:

The strength of Recommendation is categorized as:

Class I    Benefits Markedly Outweigh Risk, so the procedure/treatment should be done
Class IIa  Benefits Somewhat Outweigh Risk, and it is reasonable to do the procedure/treatment
Class IIb  Benefits May Equal or Minimally Outweigh Risk and it MAY be considered
Class III  There is No Benefit or Harm may ensue. The procedure/treatment should generally be avoided unless there is a good reason to use it.

The Strength of the data is categorized as:

Level A Data are from multiple Randomized Clinical Trials (RCTs) or Meta-analyses
Level B Data are from limited populations (single RCT or non randomized studies)
Level C Consensus expert opinion, “standard of current care”, case studies.

There are two additional ways to look at EBM:

EBID – Evidence Based Individual Decision
EBG – Evidence Based Clinical Guidelines

Each has its place in the practice of EBM. In neither case should the evidence be looked at in isolation. New evidence should always be evaluated in the context of prior information, even though prior concepts may not be as soundly evidence based as newer information. The concept of Bayesian analysis, in which new evidence is weighed against what was believed beforehand, should be applied. If new evidence comes about which markedly contradicts common or personal practice, the reliability of the source should be closely examined.
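A minimal sketch of that Bayesian idea follows, using the odds form of Bayes’ rule. The prior probabilities and the likelihood ratio are hypothetical numbers picked for illustration; the point is only that the same piece of new evidence moves a skeptical reader much less than a receptive one.

```python
def posterior_probability(prior, likelihood_ratio):
    """Update a prior probability with new evidence via Bayes' rule (odds form).

    prior: probability held before the new evidence, strictly between 0 and 1.
    likelihood_ratio: how much more likely the evidence is if the claim is
    true than if it is false.
    """
    prior_odds = prior / (1.0 - prior)
    posterior_odds = prior_odds * likelihood_ratio
    return posterior_odds / (1.0 + posterior_odds)


if __name__ == "__main__":
    # Hypothetical: a new study carries a likelihood ratio of 5 in favor
    # of a treatment working.
    for prior in (0.50, 0.05):
        post = posterior_probability(prior, 5.0)
        print(f"prior {prior:.2f} -> posterior {post:.2f}")
```

When the prior is low because the new finding contradicts a large body of earlier experience, even fairly strong evidence leaves the claim more likely false than true, which is why the reliability of the source deserves the closer look.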

Guidelines are not new. One of the early guidelines was developed to help standardize and improve the diagnosis of Rheumatic Fever in 1944 – the Jones Criteria, which were developed because, “From a study of the medical literature it is obvious that each observer has his (sic) own diagnostic criteria and these may differ widely”[4]. Thus, physicians couldn’t evaluate how therapies worked in this illness, because they didn’t really have a common definition of the illness. The guideline gave a way to be more confident of homogeneity in the diagnosis. The paper also allowed “clinical judgment” to override the criteria if done for good reason. Cardiopulmonary Resuscitation was codified in 1966 because, “…clinical results vary widely and depend on the exact technique taught, the effectiveness of training…”[5].

Guidelines most often work for the majority of patients. When a patient or group of patients don’t respond to guideline based treatments, then a physician needs to determine why. Should the physician document that he/she has tried “standard or guideline recommended care” and that it hasn’t worked? How much should the patient know about a physician embarking on non-EBM care for this specific clinical condition? How much should the patient participate in any such “experiment” that such deviations from prior EBM/Guidelines represent? How much responsibility should the physician then have in sharing his/her observations with others? Alvin Feinstein in Clinical Judgment[6] discusses this concept.

Some Guidelines have multiple sources. Cardiac guidelines are often prepared with input from multiple physicians and other involved stakeholders from the ACC/AHA and the European Society of Cardiology. Some confusion may arise because there are “too many” guidelines. For heart failure, for example, there are FIVE distinct sets of guidelines, which are not concordant. Clearly for use by an individual or in groups, guidelines should be reviewed. For use by an institution or system a workable set should be developed.  This exercise often helps bring physician groups and practices together and helps them work better within an institution.

Some Sources of “Evidence”

Cochrane Collaboration

POEMS (Patient Oriented Evidence that Matters – Primary Care Oriented)

USPSTF (United States Preventive Services Task Force)

CADTH (Canadian Agency for Drugs and Technologies in Health)

NICE (National Institute for Clinical Excellence – Great Britain)

SBU (Swedish Council on Health Technology Assessment)


[1] McMaster EBM working group; Evidence Based Medicine: A new Approach to Teaching the Practice of Medicine; JAMA 1992, 268, 2420-5

[2] IOM (Institute of Medicine). 2011. Engineering a learning healthcare system: A look at the future: Workshop summary. Washington, DC: The National Academies Press

[3] OTA – Office of Technology Assessment: referred to in O’Donnell, JC et al: Health Technology Assessment: Lessons Learned from Around the World-An Overview: Value in Health, 2009, 12, Suppl. 2, S1-S5

[4] Jones, TD; The Diagnosis of Rheumatic Fever; JAMA, 1944, 126, 481-4

[5] Ad Hoc Committee on Cardiopulmonary Resuscitation … National Research Council: Cardiopulmonary Resuscitation; JAMA, 1966, 198, 138-145

[6] Feinstein, A; Clinical Judgment, 1967, Williams & Wilkins, Baltimore

Posted in effectiveness/efficacy, General Interest, Quality, treatment options