New Process Mining Tool Debuts: Healthcare Opportunities Abound!

Process mining time-stamped data from electronic health records (EHRs) and other health IT systems promises new methods to systematically improve patient care processes and workflow usability. It can discover evidence-based process models, or maps, from time-stamped user and patient behavior. It can detect deviations from intended process models relevant to minimizing medical error and maximizing patient safety. It can suggest ways to enhance healthcare process effectiveness, efficiency, and user and patient satisfaction.
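
To make "discover a process map" and "detect deviations" a little more concrete, here is a minimal sketch in Python. It is not how ProM or Disco work internally. It only counts which activity directly follows which within each case (a "directly-follows" map, one of the simplest process mining building blocks) and then flags steps that an intended model does not allow. The event log, the activity names, and the `INTENDED` model are all invented for illustration.

```python
from collections import Counter, defaultdict
from datetime import datetime

# Hypothetical event log: (case id, activity, timestamp). Invented for illustration.
EVENT_LOG = [
    ("visit-1", "Check in",    "2012-05-01T09:00"),
    ("visit-1", "Take vitals", "2012-05-01T09:10"),
    ("visit-1", "Examine",     "2012-05-01T09:25"),
    ("visit-1", "Check out",   "2012-05-01T09:50"),
    ("visit-2", "Check in",    "2012-05-01T09:05"),
    ("visit-2", "Examine",     "2012-05-01T09:20"),
    ("visit-2", "Take vitals", "2012-05-01T09:40"),
    ("visit-2", "Check out",   "2012-05-01T10:00"),
    ("visit-3", "Check in",    "2012-05-01T10:00"),
    ("visit-3", "Take vitals", "2012-05-01T10:10"),
    ("visit-3", "Examine",     "2012-05-01T10:30"),
    ("visit-3", "Check out",   "2012-05-01T10:45"),
]

# Hypothetical intended process: the only steps the "official" workflow allows.
INTENDED = {
    ("Check in", "Take vitals"),
    ("Take vitals", "Examine"),
    ("Examine", "Check out"),
}


def traces(event_log):
    """Group events by case and order each case's activities by timestamp."""
    by_case = defaultdict(list)
    for case_id, activity, stamp in event_log:
        by_case[case_id].append((datetime.fromisoformat(stamp), activity))
    return {case: [a for _, a in sorted(events)] for case, events in by_case.items()}


def directly_follows(event_log):
    """Count how often activity A is immediately followed by activity B."""
    counts = Counter()
    for activities in traces(event_log).values():
        counts.update(zip(activities, activities[1:]))
    return counts


def deviations(event_log, allowed):
    """List (case, from, to) for every observed step the intended model lacks."""
    found = []
    for case, activities in sorted(traces(event_log).items()):
        for step in zip(activities, activities[1:]):
            if step not in allowed:
                found.append((case, *step))
    return found


if __name__ == "__main__":
    for (a, b), n in sorted(directly_follows(EVENT_LOG).items()):
        print(f"{a} -> {b}: {n}")
    print(deviations(EVENT_LOG, INTENDED))
```

The counts are the raw material for a process map: activities become nodes and each counted pair becomes a weighted arrow. In this toy log, visit-2 is the case that wanders off the intended path.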

Until now there was only one practical option for process mining EHR and HIT data: the free and open source ProM process mining tool. It’s a great tool. I used it for my presentation on EHR process mining at the recent Healthcare Systems Process Improvement Conference in Las Vegas (six-page paper, narrated slide video). But, as I predicted, new and notable commercial mining tools are emerging. The most recent is Disco (for “Discovery” of processes) from Fluxicon.

I caught up with Anne Rozinat, one of Fluxicon’s two co-founders, who agreed to my (increasingly) infamous One-Minute interview (though using Skype, not my HatCam).

(Link to interview on YouTube, viewable on smartphones)

If you’re interested in process mining, you’ve got lots of options, including:

  • Download a demo of Disco and then contact Fluxicon directly about purchasing a license or renting one month-to-month. Depending on the revenue and expense associated with the process in question, even a small improvement can pay for the software. The rest, as they say, is gravy.
  • Work with a consultant who uses Disco. If you describe your needs, Fluxicon can recommend someone. If you’re dealing with HIT or EHR event data and processes… (That’s a hint to consider moi!)
  • Get yourself to the Process Mining Camp this upcoming June 4th in Eindhoven, Netherlands. It’s free!
  • Download ProM, the free and open source process mining tool. That’s what I did for my presentation at HSPI in Las Vegas. I’ll even send you my forty-line demo event log. Between my paper and online narrated slides, you’ll be up and running lickety-split. And ready to graduate to Disco.
  • Prepare your event log data in the right format (instructions here, also see second page of my EHR process mining paper), zip it, password-protect it, and send it to me. I have permission from Fluxicon to turn the crank on Disco and send you back examples of process maps and related reports.

I’ve been promoting process-aware ideas and technology in healthcare for over a decade. Process mining can transform your process-unaware information system into a process-aware information system. You’re going to hear a lot more about process mining, so you might as well get started now.

Good luck to Anne Rozinat and Christian W. Günther of Fluxicon and their new process mining product Disco!

My Virtual Graduate Degree in Computational Linguistics and Natural Language Processing

Besides “real” degrees in Accountancy, Industrial Engineering, Intelligent Systems and Medicine (among which I have great fun exploring connections), I have an additional “virtual” graduate degree. I’ve not mentioned it here, because, well, until recently I hadn’t really noodled how it fits into this blog’s brand: EHR Workflow Management Systems. I think I figured it out. I’ll give it a try. Let’s see if it fits.

Language always fascinated me. My mom was a reading teacher who homeschooled me. Oh, I still went to regular school too. But that didn’t stop her. (Thank you Mom!) Eventually I was one of the first students in one of the first graduate programs in computational linguistics (I see lots, now). It eventually disbanded, though if you search for “Laboratory for Computational Linguistics” and “Carnegie Mellon” there are lots of references to it. I also found the following blurb in a local Pittsburgh paper.


I took all the courses necessary for a computational linguistics degree. Then I transferred to artificial intelligence to get a degree in intelligent systems. Along the way, I took (from memory and after a deep breath) linguistics (intro), phonetics, phonology, morphology, syntax (Chomsky), syntax (HPSG), semantics, logic, pragmatics, formal languages and automata, natural language processing I, II and III, knowledge representation, and natural language generation (plus electives like neurolinguistics and communication pathology).

And then I switched majors! Why? Because “All grammars leak” as linguist Edward Sapir wrote in 1921.

I found this out the hard way. I wrote a grammar to parse sentences for DARPA’s Pilot’s Associate program. Well, my grammar leaked big time! At first, writing it was easy. After a while, though, every time I tweaked a grammar rule to do something right, something else would go wrong. I’d fix the verbs; a noun would break. I’d fix the nouns, and a verb would be broken again. It reminded me of the joke about how many programmers it takes to change a light bulb (one, but in the morning the fridge and toilet are broken).


I’d hit the “All grammars leak” brick wall. I looked for a way around it, under it, or over it, but it was really wide, deep and tall. I’d invested all this time and effort, but my undergraduate degree in Accountancy saved me. The time and effort were sunk costs, irrelevant to prospective decision making. So I went off to artificial intelligence and cognitive science.

I did keep an eye on CL/NLP though. Gradually, over time, CL/NLP became less and less about symbols and rules and introspection and more and more about numbers and formulas and machine learning. (Interestingly, the pendulum may be swinging back, but that’s another blog post!)


(By the way, if you are knowledgeable about statistical grammars or process mining, don’t take those numbers too seriously! These are cartoons that allude to, not refer to, actually possible examples!)

Techniques I’d learned during my Industrial Engineering courses popped up all over the place in CL/NLP. Markov models, dynamic programming, and other operations research and mathematical programming techniques were adapted, with great success, to stop grammatical leaks. Today, combined with good software engineering and cheap hardware, computational linguistics and natural language processing appear to be on the verge of something big, though exactly what, only more time will tell. Google uses CL/NLP (I believe) to help understand what folks are searching for. IBM’s Watson uses CL/NLP to ask the right question (AKA an “answer” on Jeopardy). My guess is that CL/NLP (under the hood and perhaps not widely appreciated yet) already adds millions, if not billions, of dollars of value to our digital economy.
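
For readers who have not seen a Markov model and dynamic programming working together, here is a minimal sketch: the classic Viterbi algorithm picking the most probable part-of-speech tags for a sentence under a tiny hidden Markov model. Every number, tag, and word below is invented for illustration (in the same cartoon spirit as the figures in this post), and nothing here is taken from any particular NLP system. Instead of a rule that either fires or fails, each choice gets a probability, and unseen words get a small floor probability rather than breaking the parse.

```python
# Toy hidden Markov model. All probabilities are invented for illustration.
FLOOR = 1e-6  # tiny probability for words a tag has never emitted

START = {"DET": 0.6, "NOUN": 0.3, "VERB": 0.1}

TRANS = {  # TRANS[previous_tag][next_tag]
    "DET":  {"DET": 0.05, "NOUN": 0.90, "VERB": 0.05},
    "NOUN": {"DET": 0.10, "NOUN": 0.30, "VERB": 0.60},
    "VERB": {"DET": 0.50, "NOUN": 0.40, "VERB": 0.10},
}

EMIT = {  # EMIT[tag][word]
    "DET":  {"the": 1.0},
    "NOUN": {"patient": 0.4, "reports": 0.2, "pain": 0.4},
    "VERB": {"reports": 0.7, "examines": 0.3},
}


def viterbi(words):
    """Return the most probable tag sequence for words under the toy model."""
    if not words:
        return []
    tags = list(START)
    # best[i][tag] = (probability of the best path ending in tag at word i,
    #                 the tag that path used at word i - 1)
    best = [{t: (START[t] * EMIT[t].get(words[0], FLOOR), None) for t in tags}]
    for word in words[1:]:
        prev = best[-1]
        column = {}
        for t in tags:
            # Dynamic programming step: only the best way into each tag is kept.
            p, back = max((prev[s][0] * TRANS[s][t], s) for s in tags)
            column[t] = (p * EMIT[t].get(word, FLOOR), back)
        best.append(column)
    # Follow the back-pointers from the best final tag to recover the path.
    tag = max(tags, key=lambda t: best[-1][t][0])
    path = [tag]
    for column in reversed(best[1:]):
        tag = column[tag][1]
        path.append(tag)
    return path[::-1]


if __name__ == "__main__":
    print(viterbi("the patient reports pain".split()))
    print(viterbi("the reports".split()))
```

Note how "reports" comes out as a verb after "the patient" but as a noun right after "the": the surrounding context, expressed as numbers, does the disambiguating. A real implementation would work with log probabilities so that long sentences do not underflow.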

So, last month I attended an NIH workshop on “Natural Language Processing: State of the Art, Future Directions and Applications for Enhancing Clinical Decision-Making.” I’ll blog about that experience and my impressions later. However, the short version: The NLP/CDS workshop was great. I was impressed.


What motivated me to finally write about CL/NLP in this blog about EHR workflow? There are all sorts of interesting connections! I’ll list some below. I promise to write and tweet about them in the future.

  • Business process management and computational linguistics/natural language processing share an important problem representation. BPM often represents activities, and CL/NLP sometimes represents sentence structure, with state transition networks (transition networks decorate this blog post). This should not be surprising though. State transition networks are used to represent a wide variety of complicated sequential behavior, from genetic sequences to, well, workflow patterns and sentence structure.
  • Speech recognition promises ways to increase EHR usability, but workflow technology will be needed to optimally incorporate speech technology into EHR workflow. Process models, as they execute, also provide information about which words are most likely to be uttered where and when in the workflow.
  • Interpreting clinical text ultimately requires more than just sound, structure, and meaning. It involves goal, plan, and task recognition. Process definitions that workflow engines execute are, themselves, a prime source for an EHR to “understand” what its users are up to, so as to then stay out of their way or to proactively and appropriately help.
  • Clinical NLP, itself, is a complex set of “pipelined” tasks, getting from word to interpretation. Due to the complexity of these sequences, processing steps need to be modular so they can be easily swapped in or out to improve global performance. Systems such as UIMA, used by Watson, have workflow engine and process definition mechanisms to manage this processing complexity.
  • Finally, EHRs need to communicate with other EHRs and HIT systems. These interactions need to become more “conversational,” if they are to become more resistant to errorful interpretation. “Which patient are you referring to?” (reference resolution) “I promise to get back to you” (speech act) “Why did you ask about the status of that report?” (abductive reasoning) These interactions include issues of pragmatic interoperability (workflow interaction protocols over and above semantic and syntactic interoperability).
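
The shared representation in the first bullet can be sketched in a few lines of Python. The same small class below stands in for a workflow and for a sentence pattern, and its `expected_next` method hints at the speech recognition point: knowing where you are in the network narrows down what is likely to come next. The class, the state names, and both example networks are hypothetical and made up for illustration. They are not taken from any workflow engine or parser.

```python
class TransitionNetwork:
    """A minimal deterministic state transition network."""

    def __init__(self, start, finals, arcs):
        self.start = start
        self.finals = set(finals)
        self.arcs = arcs  # {state: {label: next_state}}

    def run(self, labels):
        """Return the state reached by following labels, or None if stuck."""
        state = self.start
        for label in labels:
            state = self.arcs.get(state, {}).get(label)
            if state is None:
                return None
        return state

    def accepts(self, labels):
        """True if labels lead from the start state to a final state."""
        return self.run(labels) in self.finals

    def expected_next(self, labels):
        """Labels that can legally come next after the given prefix."""
        state = self.run(labels)
        return sorted(self.arcs.get(state, {})) if state is not None else []


# Hypothetical outpatient visit workflow: labels are tasks.
workflow = TransitionNetwork(
    start="arrived",
    finals={"done"},
    arcs={
        "arrived":  {"check in": "waiting"},
        "waiting":  {"take vitals": "roomed"},
        "roomed":   {"examine": "examined"},
        "examined": {"order lab": "examined", "check out": "done"},
    },
)

# Hypothetical sentence pattern: labels are word categories.
sentence = TransitionNetwork(
    start="q0",
    finals={"q3", "q5"},
    arcs={
        "q0": {"DET": "q1", "NOUN": "q2"},
        "q1": {"NOUN": "q2"},
        "q2": {"VERB": "q3"},
        "q3": {"DET": "q4", "NOUN": "q5"},
        "q4": {"NOUN": "q5"},
    },
)

if __name__ == "__main__":
    print(workflow.accepts(["check in", "take vitals", "examine", "check out"]))
    print(workflow.expected_next(["check in", "take vitals", "examine"]))
    print(sentence.accepts(["DET", "NOUN", "VERB", "NOUN"]))
```

Real workflow engines and real grammars need far more than this (parallel branches, recursion, probabilities), but the common skeleton of states, labeled arcs, and "what can come next from here" is the connection I have in mind.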

I’m sure to think of other connections between computational linguistics and natural language processing on one hand, and EHR workflow management systems on the other. Stay tuned!