My Collected Blog Posts on Clinical Natural Language Processing

Short Link:

I created this post so I could tweet a link to more than one of my posts at a time. I’m sure there will be many more. I’ll add them here. By the way, the correct Twitter hashtag for natural language processing is #NLProc, not #NLP. Really! Click the previous links to compare for yourself.

Wordle Based on 40,000 Words in 500 Comments to NYT’s “Digital Shift on Health Data Swells Profits”

Short Link:

The recent New York Times article “A Digital Shift on Health Data Swells Profits in an Industry” generated quite a buzz in Health IT social media. Curious what would emerge, I grabbed 40,000+ words from 500+ comments and ran them through Wordle.


Yeah. Pretty darn unremarkable. You get a very different impression if you actually read the comments. I’m thinking about doing a sentiment analysis like I did for a previous NYT article about EHRs.

By the way, just to be clear, I am much in favor of digitizing health data. The problem is that we digitized the “Data-at-Rest” but not “Data-in-Motion” or “Data-in-Use.” Between an unequal playing field created by Meaningful Use and inadequate technology (see My Fixing Our Health IT Mess), we precipitated a “colossal strategic error” (in the words of our first health information czar).

Current EHRs and many HIT systems are still using mid-nineties non-process-aware technology. They’ve separated out data and user-interface (to some extent) but have not yet separated out process logic. Users hate their inflexible workflows.

How do we undo or minimize the damage of this colossal strategic error? We took the wrong fork in the road and we’re more than 10 billion steps down the wrong tine. It’s not really practical to go back and take the right turn, to rip-and-replace, so to speak. And it’s not possible to tunnel through a wormhole over to the other alternate reality, to turn current legacy EHRs into things they aren’t.

However, by hook or by crook, we need to move to modern process-aware EHR and HIT platforms. I refer to this as a shift from structured document management systems to structured workflow management systems. The best examples are business process management suites today.

Luckily, many technologies flooding into healthcare IT — social, mobile, analytics, cloud (so-called SMAC) — rely on workflow engines, process models, graphical editors, and other useful BPM-like capabilities.

We need to:

  • Influence the Influential: Use social media and complementary methods to get process-aware ideas and technology noticed, discussed, absorbed, and acted upon.
  • Highlight the Highlightable: Flush out hidden workflow engines, process definitions, and graphical editors among existing and new EHRs and HIT systems.
  • Reach out to the Reachable: Virtually every BPM professional I’ve met, in person or online, believes healthcare workflow is ripe for automation (where it can be automated) and support (where it can’t be automated) using modern BPM and case management platforms, systems, and expertise.

Clinical & Business Intelligence, Meet Process Mining (Submitted to #HIMSS13 Blog Carnival)


[3/1/13 Update: This blog was one of three chosen by HIMSS to highlight in the mobile and clinical & business intelligence portion of the #HIMSS13 Blog Carnival. I am honored!]

EHRs increasingly mediate patient care effectiveness, resource efficiency, and user happiness. EHR process mining is a new medical “imaging” technique, one that allows process diagnosticians to view workflow blockages, errant processes, and unused resources. Process mining promises to do for healthcare workflow what Röntgen’s invention of X-rays and radiography in 1895 did for medicine proper.


Today, EHR process mining can discover, monitor and improve evidence-based processes (not assumed processes) by extracting knowledge from event logs available in (or “generatable” from) today’s EHRs. Process mining can do three things for a hospital or clinic. It can show what is actually happening inside processes (Discovery). It can compare what is happening with what should be happening (Conformance, especially relevant to medical error and patient safety). It can suggest ways to improve healthcare process effectiveness, efficiency, and user and patient satisfaction (Enhancement).
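As a rough illustration of the conformance idea, here is a minimal Python sketch that checks whether observed sequences of activities respect an expected ordering. It is not the algorithm of any particular process mining tool; the activity names, the expected ordering, and the function name violations are all assumptions made up for this example.

```python
# Hypothetical rule set: each pair (before, after) means
# "before should occur earlier than after within a case".
EXPECTED_ORDER = [("Check Medications", "Patient Examination")]


def violations(trace, expected_order=EXPECTED_ORDER):
    """Return the (before, after) pairs that a single trace violates.

    A pair counts as violated only when both activities occur and the
    first occurrence of `after` precedes the first occurrence of `before`.
    Missing activities are not flagged (a deliberate simplification).
    """
    found = []
    for before, after in expected_order:
        if before in trace and after in trace:
            if trace.index(after) < trace.index(before):
                found.append((before, after))
    return found


# Hypothetical observed traces, one ordered list of activities per case.
observed = {
    "case-1": ["Get Patient", "Check Medications", "Patient Examination"],
    "case-2": ["Get Patient", "Patient Examination", "Check Medications"],
}

report = {case: violations(trace) for case, trace in observed.items()}
for case, problems in report.items():
    print(case, "OK" if not problems else problems)
```

Real conformance checking works against a full process model rather than a handful of ordering rules, but the principle is the same: compare what was logged with what was intended.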

Of particular note to anyone interested in applying process mining techniques in healthcare is the Process Mining Manifesto. It is not specifically about healthcare, though it does mention “paper-based medical records” as examples of poor event logs. The Manifesto is authoritative (co-authored by more than 75 process mining experts), timely (recent and relevant to problems facing EHR adoption), and accessible to health IT and process improvement professionals.


The credited “godfather” of process mining is Professor Wil van der Aalst, a Dutch computer scientist and mathematician. Professor van der Aalst notes that healthcare is notorious for dismayingly complex “spaghetti” processes. Nonetheless, process mining research can learn a lot from tackling the healthcare domain. On one hand there is great opportunity to learn from intuitively creative medical experts. On the other hand spaghetti processes often are the greatest process improvement opportunities.

Process Mining and Clinical & Business Intelligence

The process mining of EHR event log data is a form of clinical & business intelligence. What van der Aalst notes generally about business intelligence also applies to clinical & business intelligence:

“Business intelligence tools tend to be data-centric while providing only reporting and dashboard functionality.”

This describes many clinical & business intelligence tools.

“They can be used to monitor and analyze basic performance indicators (flow time, costs, utilization).”

These are the KPIs, or Key Performance Indicators, in clinical & business intelligence reports and dashboards.

“However, they do not allow users to look into the end-to-end process.”

If you cannot look “into the end-to-end process” you cannot, in an evidence-based way, determine what is wrong—and therefore what is to be done—for ineffective, inefficient workflows and their unhappy users.

“Moreover, despite the “I” in BI, most of the mainstream BI tools do not provide any intelligent analysis functionality.”

Again, most current clinical & business intelligence tools are reports or dashboards. Without access to detailed evidence-based representations of end-to-end processes, clinical & business intelligence reporting and dashboard systems can flag process problems, but cannot diagnose and solve them.

EHR Event Logs

An EHR event log is a record of named activities (“Check Medications”, “Patient Examination”) created as a byproduct of EHR use. Encounter, or case, IDs tie together collections of events. Events occur in an order relative to each other, usually represented by time stamps. Intervals between time stamps can be years, in long-running chronic conditions; hours or minutes, in patient encounters; or seconds or less between user clicks on a single EHR screen.

The first three columns in the following EHR event log extract—CaseID, Activity, and TimeStamp—are required for process mining to create a process map, or model, from event data. The column of “…”s to the right represents additional data not shown: UserRole (a user such as Dr. Smith or Dr. Jones, or Physician vs. Nurse), EncounterType (such as Sick vs. Well Checkup vs. Vaccination), and Facility (such as Facility 5, 7 or 9, see upcoming illustrations).

CaseID, Activity, TimeStamp, More columns→
7859, "Get Patient", 9/19/2011 15:44:27,
7859, "View Chart", 9/19/2011 15:53:58,
7859, "Current Meds", 9/19/2011 15:59:52,
7859, "Allergies", 9/19/2011 15:59:59,
7859, "Labs", 9/19/2011 16:00:27,
7859, "New Note", 9/19/2011 16:05:46,
7859, "Examination", 9/19/2011 16:17:01,
More rows↓

Table 1: Portion of An EHR Event Log
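To show why those three columns are enough, here is a minimal Python sketch that turns the rows of Table 1 into a “directly-follows” count, the raw material for a simple process map. This is an illustrative simplification, not how ProM or any other tool is implemented, and the function name directly_follows is my own.

```python
import csv
from collections import Counter, defaultdict
from datetime import datetime

# The rows of Table 1, reduced to the three required columns.
LOG = """7859,"Get Patient",9/19/2011 15:44:27
7859,"View Chart",9/19/2011 15:53:58
7859,"Current Meds",9/19/2011 15:59:52
7859,"Allergies",9/19/2011 15:59:59
7859,"Labs",9/19/2011 16:00:27
7859,"New Note",9/19/2011 16:05:46
7859,"Examination",9/19/2011 16:17:01"""


def directly_follows(log_text):
    """Count how often activity A is immediately followed by activity B."""
    cases = defaultdict(list)
    for row in csv.reader(log_text.splitlines()):
        if not row:
            continue  # tolerate blank lines
        case_id, activity, stamp = (field.strip() for field in row[:3])
        when = datetime.strptime(stamp, "%m/%d/%Y %H:%M:%S")
        cases[case_id].append((when, activity))

    pairs = Counter()
    for events in cases.values():
        events.sort()  # order each case by time stamp
        for (_, a), (_, b) in zip(events, events[1:]):
            pairs[(a, b)] += 1
    return pairs


pairs = directly_follows(LOG)
for (a, b), count in pairs.items():
    print(f"{a} -> {b}: {count}")
```

With one case every transition has a count of one. With thousands of cases, the counts become the edge weights of a process map.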

Optional additional columns, over and above case id, activity name, and time stamp, depend on what you want to compare, explain, understand, or predict about your processes. Do you want to understand processes of a poorly performing clinic or hospital relative to a better performing clinic or hospital? You need a facility column. Do you want to do the same for users? You need a user column. Or do you want to understand workflows for sick visits compared to well checkups? Add a column for that. This additional information allows you to filter an event log and ask different questions about logged processes.
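Filtering on those extra columns can be sketched in a few lines of Python. The events, column layout, and helper names (filter_log, traces) below are hypothetical, chosen only to illustrate the idea of asking different questions of the same log.

```python
from collections import defaultdict

# Hypothetical events: (case_id, activity, facility, encounter_type),
# already in time order within each case.
EVENTS = [
    (1, "Get Patient", "Facility 5", "Sick"),
    (1, "Examination", "Facility 5", "Sick"),
    (2, "Get Patient", "Facility 7", "Well Checkup"),
    (2, "Current Meds", "Facility 7", "Well Checkup"),
    (3, "Get Patient", "Facility 5", "Vaccination"),
]


def filter_log(events, facility=None, encounter_type=None):
    """Keep events matching the given values (None means no filter)."""
    return [
        e for e in events
        if (facility is None or e[2] == facility)
        and (encounter_type is None or e[3] == encounter_type)
    ]


def traces(events):
    """Group activities by case, preserving logged order."""
    by_case = defaultdict(list)
    for case_id, activity, *_ in events:
        by_case[case_id].append(activity)
    return dict(by_case)


facility_5 = traces(filter_log(EVENTS, facility="Facility 5"))
sick_visits = traces(filter_log(EVENTS, encounter_type="Sick"))
print(facility_5)
print(sick_visits)
```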

The bottom row of “…”s represents the many other rows, with different CaseIDs for separate process instances, usually required to generate useful process models. Healthcare processes generate a lot of time-stamped data that can result in large event logs. Process mining will be required to understand and leverage this “Big Data.”

Evidence-Based Process Maps

Below is a relatively unannotated (for example, no frequency or performance statistics) set of process models, or process maps, generated by ProM, a free and open source process mining tool. Even a simple example, with only five or six possible EHR activities, begins to look like the aforementioned pile of spaghetti.


The process model can be simplified using event log filtering techniques and by asking specific questions to narrow investigations. The next illustration shows process-mined process maps comparing the most common workflow from three similar medical practices.


Suppose you know some Key Performance Indicators (KPIs) for these facilities, such as patient throughput and cycle time, cost per encounter or encounter type, or perhaps even measures of user or patient satisfaction. Process mining can generate process models that you can compare to explain differences between KPIs. Traditional clinical & business intelligence report and dashboard software may tell you what the KPIs are and help benchmark them. However, to understand the likely causes of flagged KPIs, you need evidence-based process models such as process mining provides.
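A KPI such as cycle time can itself be derived from the same event log. The Python sketch below computes mean cycle time per facility from made-up data; the facilities, case numbers, time stamps, and the function name mean_cycle_minutes are all assumptions for illustration. The resulting numbers are what a dashboard would flag, and comparing process maps is how you would try to explain them.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean

# Hypothetical events: (facility, case_id, activity, timestamp).
EVENTS = [
    ("Facility 5", 101, "Get Patient", "2011-09-19 15:00:00"),
    ("Facility 5", 101, "Examination", "2011-09-19 15:30:00"),
    ("Facility 7", 201, "Get Patient", "2011-09-19 15:00:00"),
    ("Facility 7", 201, "Examination", "2011-09-19 15:50:00"),
    ("Facility 7", 202, "Get Patient", "2011-09-19 16:00:00"),
    ("Facility 7", 202, "Examination", "2011-09-19 16:30:00"),
]


def mean_cycle_minutes(events):
    """Mean minutes from first to last event of a case, per facility."""
    stamps = defaultdict(list)
    for facility, case_id, _activity, stamp in events:
        stamps[(facility, case_id)].append(datetime.fromisoformat(stamp))

    per_facility = defaultdict(list)
    for (facility, _case_id), times in stamps.items():
        minutes = (max(times) - min(times)).total_seconds() / 60
        per_facility[facility].append(minutes)
    return {facility: mean(values) for facility, values in per_facility.items()}


kpis = mean_cycle_minutes(EVENTS)
print(kpis)
```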

Summary and Conclusion

Process mining of event log data from electronic health records promises new methods to systematically improve EHR-mediated patient care processes and workflow usability. Process mining is part of a larger front of process-aware business process management (BPM) technology diffusing into the healthcare information technology industry.


Process mining can discover evidence-based process models, or maps, from time-stamped user and patient behavior; detect deviations from intended process models relevant to minimizing medical error and maximizing patient safety; and suggest ways to enhance healthcare process effectiveness, efficiency, and user and patient satisfaction.

There's a great fit between traditional clinical & business intelligence KPIs, dashboards and process mining. Process mining provides an “X-ray” of workflows that can explain clinical & business intelligence KPIs. KPI dashboards alert users to systematic problems, while process mining shows subsystem tasks and workflows driving them. Combined, clinical & business process intelligence addresses central issues of healthcare reform: identification of best practices, coordination of care among clinical staff, consistency across patient care processes, and efficient use of healthcare resources.

For more information about process mining, the best place to start is the Process Mining Manifesto. It even mentions medical records.

P.S. This blog post was submitted to the #HIMSS13 Blog Carnival.

#EHRbacklash Isn’t About Electronic Health Records; It’s About Kludgy, Standalone, Workflow-Oblivious #EHR Applications

Update 2/19/13:

I love this tweet!

It concatenates three themes I’ve addressed over-and-over in this blog of 200+ posts and my Twitter account of 15,000+ tweets: EHR usability, interoperability, and workflow.

On the surface, these concerns may seem unrelated. But there are deep and profound connections among them. Let’s start with the most interesting two words in this tweet: “workflow oblivious”.

What’s the Connection to Process-Awareness?

If “workflow-oblivious” is the devil, what is its opposite? It is “process-aware”, which I write and tweet about frequently. Workflow and process are approximate synonyms. (Yes, I am aware of potential different nuances in meaning. Make a comment and I’ll explore them together with you!). And “oblivious” and “aware” do seem diametrically opposite and antithetical.

So, I argue, condemning EHRs because they are workflow-oblivious is very close to, if not the same thing as, praising the alternative, EHRs that are process-aware.

What is process-aware? Check out My Rejected Presentation Proposal: Process-Aware Information Systems Come to Healthcare. It means software that can reason about workflow and act to make it faster, easier, and better. Bits-and-pieces include workflow engines, process definitions, and graphical editors to allow non-programmers, sometimes actual users, to create and edit their own workflows.

What’s the Connection to Usability?

Check out my five principles of usable EHR workflow:

  • Naturalness is the degree to which an application’s behavior matches task structure. In the case of workflow management, multiple task structures stretch across multiple EMR users in multiple roles. A patient visit to a medical practice office involves multiple interactions among patients, nurses, technicians, and physicians. Task analysis must therefore span all of these users and roles. Creation of a patient encounter process definition is an example of this kind of task analysis, and results in a machine executable (by the BPM workflow engine) representation of task structure.
  • Consistency is the degree to which an application reinforces and relies on user expectations. Process definitions enforce (and therefore reinforce) consistency of EMR user interactions with each other with respect to task goals and context. Over time, team members rely on this consistency to achieve highly automated and interleaved behavior. Consistent repetition leads to increased speed and accuracy.
  • Relevance is the degree to which extraneous input and output, which may confuse a user, is eliminated. Too much information can be as bad as not enough. Here, process definitions rely on EMR user roles (related sets of activities, responsibilities, and skills) to select appropriate screens, screen contents, and interaction behavior.
  • Supportiveness is the degree to which enough information is provided to a user to accomplish tasks. An application can support users by contributing to the shared mental model of system state that allows users to coordinate their activities with respect to each other. For example, since an EMR workflow system represents and updates task status and responsibility in real time, this data can drive a display that gives all EMR users the big picture of who is waiting for what, for how long, and who is responsible.
  • Flexibility is the degree to which an application can accommodate user requirements, competencies, and preferences. This obviously relates back to each of the previous usability principles. Unnatural, inconsistent, irrelevant, and unsupportive behaviors (from the perspective of a specific user, task, and context) need to be flexibly changed to become natural, consistent, relevant, and supportive. (From 2009’s EHR/EMR Usability: Natural, Consistent, Relevant, Supportive, Flexible Workflow but presented in EHR Workflow Management Systems in Ambulatory Care in the published proceedings of 2005 HIMSS Dallas conference.)

What’s the Connection to Interoperability?

There is an important level of interoperability above syntax and semantics. It could be called workflow interoperability, or pragmatic interoperability (like syntax and semantics, pragmatics is a subdiscipline within linguistics).

The essential point is this. You cannot have great workflow between healthcare organizations without great workflow within healthcare organizations. I know the temptation to divide and conquer. EHRs have lousy workflow, so let’s make up for it with great communication connections between healthcare organizations. Sorry, won’t work. Large virtual enterprises are made up of smaller virtual or non-virtual enterprises. If these smaller entities, hospitals, clinics, etc., have lousy workflows between inputs and outputs, then whatever we create out of their combination won’t be much, if any, better. I’m reminded of my statics and dynamics classes during the year I spent in undergraduate Civil Engineering: you can’t build a superior bridge out of inferior materials. You can’t build superior workflows out of inferior workflows.

So, in closing, let me simply repeat the tweet that made me write this blog post.

My Rejected Presentation Proposal: Process-Aware Information Systems Come to Healthcare

Short Link:

I’m fine! No problem. Only a small percentage of proposals to present at a certain major health IT conference are accepted. That conference shall remain nameless and blameless. High standards are part of the reason why its presentations are always so good. I’ve been lucky to present about a half-a-dozen times, batting about 500, which I understand is pretty good. And, regardless, I go!

This year, if anything like last year, and especially if year-to-year trends extrapolate, social media is gonna be big. Really big. And one of the great things about social media, especially the combo of blogs and live-tweeting, is you, me, anyone can self-publish directly into a roaring, swirling maelstrom of tweets, links, and even HatCam videos!

Therefore I am publishing, and tweeting, my own rejected conference presentation proposal. 🙂

Process-Aware Information Systems Come to Healthcare:
Business Process Management in Healthcare

Mobile, social, cloud, big data, etc. move over: PAIS.

Process-aware information systems (PAIS) ideas and technology — workflow management, business process management (BPM), and adaptive case management systems — are diffusing into healthcare from other industries. A Process-Aware Information System is “a software system that manages and executes operational processes involving people, applications, and/or information sources on the basis of process models.” The best known PAIS is a business process management system or suite. BPM suites include many of the following technologies:

  • Executable process models
  • Codeless development
  • Groupware-based collaboration
  • Event-driven processes
  • Process intelligence and monitoring
  • Simulation and optimization
  • Business rule management
  • Process component archives

Some of these technologies have counterparts in healthcare IT. Others are just beginning to appear. Regardless of maturity of individual technology, perhaps the BPM suite’s greatest value is as a model for how all of these technologies can fit together.

In some instances, process-aware ideas inform new solutions from health IT vendors. PAIS platforms developed outside healthcare are imported and adapted. Key PAIS components, such as workflow engines and process editors, are embedded in new health IT systems or retrofit to existing HIT systems. A cursory search for “workflow engine” AND “EMR” OR “EHR” in Google turns up increasing hits.

After a short history of process-aware systems, I’ll present a meta-analysis of twenty-five de-identified BPM in healthcare case studies. The de-identification prevents incidental, but unnecessary, commercialism. By “meta-analysis” I simply mean I’ll contrast and combine results from different case studies to identify patterns, disagreements, and aspects of interest to a health IT audience. Cases will be compared and contrasted relative to where in healthcare they occurred (hospital vs. ambulatory, back-office vs. point-of-care, and so on), who sponsored the case (academic vs. vendor), and claimed results (both qualitative and quantitative).

I’ll close with…

  • developing and deploying workflow via cloud (including Amazon and Google),
  • mobile workflow (including cross-platform),
  • structured versus unstructured data and workflow, and
  • how to incorporate social into the mix.

As healthcare avails itself of new platforms-as-services, it will find PAIS under the hood and along for the ride.

Takeaways include a powerful new idea (the executable process model), examples of successful applications of PAISs in healthcare, and a positive but skeptical attitude useful for further investigation. This presentation will be the first place many attendees will hear the next big idea in healthcare IT: process-aware information systems.

H-EHR-T Healthy EHR Doesn’t Give Physicians Heartburn! Happy Valentines Day!

Short Link:

H-EHR-T stands for Healthy Electronic Health Records Technology. I’m getting pretty good at catchy acronyms. Check out S.Y.S.T.E.M., which stands for Saves You Substantial Time, Effort, and Money. Anyway, I had a Twitter convo with @SmyrnaGirl leading to her EHR Valentines Day post. So I might as well write one too!


There’s not much to this post. It’s just a place to collect a bunch of tweets for future reference. Maybe I’ll RT one, once-in-a-while. If I use HootSuite, and schedule the retweets, I’m good for a decade of Valentines Days!

Happy Valentines Day! Don’t forget to take your A.N.T.A.C.I.D! 🙂

S.Y.S.T.E.M.: An EHR should… Save You Substantial Time, Effort, and Money

I’m reposting a blog post from EMR Thoughts that reposted my comment on EMR and EHR. (Now, if someone reposts this blog post, how confusing!).

In my post on points of differentiation for EHR companies, Charles Webster MD MSIE MSIS recently created an acronym to talk about how EHR’s should save you time, effort and money. The acronym is S.Y.S.T.E.M.

An EHR should…

Save You Substantial Time, Effort, and Money.


Minimize encounter length, wait times, staff idle time, mental and physical effort, and Total Cost of Ownership.


You serve your patients; your EHR should serve you. (OK, its portal serves your patients, too)


Lots of:


Save time: see another patient; spend more time with each patient; go home on time.


Minimize mental and physical effort to learn and use.


Time is money. Save time, save money. Shift tasks from expensive personnel to less expensive personnel (but monitor task progress so nothing falls between the cracks).

P.S. Apparently I’m getting pretty good with acronyms!

Meaningful Use: There was NO WARNING of its ARRIVAL! It had no MERCY! It gave NO QUARTER!

Short link:

Begin update 2/19/13 ↓

End update 2/19/13 ↑


Follow me to find out!

Feel free to contact me to purchase emblazoned T-shirts, aluminum water bottles, posters, business cards, coffee mugs, drinkware, magnets, mousepads, coasters, keepsake boxes, calendars and wall clocks. Concessions available. Feel free to embed this image in blogs, websites, etc., as long as you provide a link back to this post. Also feel free to avail yourself of the clever short URL (well, memorable, at least). Finally: feel free!

Disclaimer: No subliminal messaging or advertising techniques were intentionally or knowingly used in creating the above image or its surrounding text.

P.S. @ReasObBob came up with the #EHRbacklash hashtag.

@TechGuy wrote The Coming Physician EHR Revolt, sending me into a reverie resulting in the above artwork.

Workflow Tactics Deployed in Health Care Remain Stuck About 10-15 Years Behind the Times

Short Link:

Some of the best writing and insights on the web are comments. They get buried, below the fold, in web-speak borrowed from the newspaper world. For example, @bobbygvegas’s comment, about EHR workflow, on 6 Ways AHRQ Will Study EHRs, Workflow, struck a chord with me. (I wonder if Bobby being musically inclined has anything to do with this?). I’ve written similar laments. So I asked if I could repost his comment here as a blog post, the first time I have ever done such a thing (and I have over 200 blog posts about EHR workflow on this blog!). I’m working on my own blog post about the AHRQ study, but @bobbygvegas beat me to the punch. Good on you mate!

@bobbygvegas’s comment appears, in full, immediately after the Twitter exchange leading to this momentous first. It’s a well-written, slightly feisty (I like!), call to healthcare to catch up with the rest of the world in its thinking about workflow. Since it is, essentially, an editorial (with which I very much agree), I include some slides from a PowerPoint that @bobbygvegas mentions: “Workflow Demystified” (“Prepared by HealthInsight as part of our work as the Regional Extension Center for Nevada and Utah, under grant #90RC0033/01 from the Office of the National Coordinator, Department of Health and Human Services. 9SOW-UT-2010-00-112”). The diagrams do not correspond exactly to @bobbygvegas’s editorial paragraphs. But I include them to encourage you to download and study the entire slide set.

And then…

…and then to …

So, here, posted with permission, is @bobbygvegas’s complete comment.

[The following is reposted, with permission of its author @bobbygvegas from]

“4. Extract clinical data in logs and audit trails that have been time-stamped from the EHR to reconstruct clinical workflow related to the health IT system. This information validates and supplements the data recorded by human observers.”


Better late than never, one supposes. I’ve been arguing this for years. An EHR audit log is essentially an information workflow record that should be mined to analyze routine tasks times-to-completion and variability. Analysis can also reveal the “pain points,” i.e., iterative, recurrent “flow” barriers. You then couple these data with data taken regarding concomitant physical tasks to flesh out a more useful picture for systematic improvement activities.


The very word ‘workflow’ has become a cliche. Rolls readily off the tongue with little thought given to what it entails. A more apt analogy might be a traffic copter shot of the jerky stop-&-go freeway traffic of rush hour. In most clinics, it’s nearly ALWAYS rush hour.

I joked in one jpeg I did for my blog that this was my Primary’s office at 8:03 a.m.


See also (freely distributable)

A decade ago I was working in risk management in a relatively small private issuer credit card bank. I had free run of most of the internal network. I got to looking at our in-house developed collections call center system (~1,000 collectors working the phones every day), and knew the source language and data tables architecture, so I started importing the data into SAS and mining them.

I was able to rather quickly show management that their staffing deployment and call volumes were egregiously misaligned. We were typically spending $1,000 to collect $50 (or less). It was a lava flow of waste.

On the basis of these rather simple call log analytics we were able to save the bank about $5 million a year in Collections Ops cost, dragging the VP of Collections kicking and screaming all the way (his annual bonus was tied in part to his budget, which was the largest in the company).


“Workflow” tactics deployed in health care remain stuck about 10-15 years behind the times, as they don’t drill down into time consumed and error rates. Mining the audit logs might be of great utility here — though the datetime() stamps are gonna need to be more granular than just down to the second. SQL now supports time capture down to the microsecond, though tenths or hundreds would likely suffice.


Another barrier here in general might be “once you’ve seen one audit log data dictionary, you’ve seen one audit log data dictionary.” Recall that we have at this point nearly 1,800 “complete Certified EHR systems.”

Let’s hope this AHRQ study will move us usefully ahead.
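To make the comment’s point about times-to-completion, variability, and sub-second time stamps concrete, here is a minimal Python sketch. The audit-log rows, task names, time-stamp format, and the function name task_stats are invented for illustration; a real EHR audit log will have its own data dictionary.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean, pstdev

# Hypothetical audit-log rows: (task, start, end), with fractional seconds.
ROWS = [
    ("Sign Note", "2013-02-19 08:03:00.250000", "2013-02-19 08:03:02.750000"),
    ("Sign Note", "2013-02-19 08:10:00.000000", "2013-02-19 08:10:01.500000"),
    ("Refill Rx", "2013-02-19 08:12:00.000000", "2013-02-19 08:12:04.000000"),
]
FMT = "%Y-%m-%d %H:%M:%S.%f"  # %f parses up to microseconds


def task_stats(rows):
    """Return {task: (mean seconds, population std dev)} per task."""
    durations = defaultdict(list)
    for task, start, end in rows:
        delta = datetime.strptime(end, FMT) - datetime.strptime(start, FMT)
        durations[task].append(delta.total_seconds())
    return {task: (mean(d), pstdev(d)) for task, d in durations.items()}


stats = task_stats(ROWS)
for task, (avg, spread) in stats.items():
    print(f"{task}: mean {avg:.2f}s, std dev {spread:.2f}s")
```

If the log recorded only whole seconds, the two “Sign Note” durations above would be rounded and much of the variability would disappear, which is the granularity concern raised in the comment.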

PS (my, Chuck Webster’s, PS). I often use Google to search for new material about EHRs and workflow (or, egotistically, to see where my content ranks). Sometimes I’ll see a quote from a comment and get excited. Something new! Someone new! But when I click on it, it’s just me. It’s an example of jamais vu. Something that should seem old and familiar momentarily seems new and unfamiliar. @bobbygvegas’s comment had the opposite effect on me. It was classic deja vu. Something that I could not have seen before, because @bobbygvegas just wrote it, seemed like something I’d written and misplaced, only to be rediscovered. I hope others will feel, similar to me, that @bobbygvegas speaks the truth. I hope he speaks for a growing number of health IT professionals and EHR users with increasingly sophisticated understanding of workflow and its relevance to managing healthcare’s spaghetti processes.