Communication Ethics in Healthcare and Health IT


Today’s #HCLDR (Healthcare Leadership) tweetchat topic, What to Say When the Wrong Thing Was Said, hosted by @researchmatters, reminds me of a paper I wrote and presented over two decades ago (in Hong Kong!): Communication Ethics and Human-Computer Cognitive Systems. In it I discuss communication ethics and its relevance to designing intimate human-technology interfaces. The paper is mostly about humans using and communicating with intelligent tools, from intelligent prostheses to smart robots. In this post I retrieve some of those ideas and apply them to ethical human-to-human communication.

Communication Ethics

“Communication ethics, traditionally, involves the nature of the speaker (such as their character, good or bad), the quality of their arguments (for example, logical versus emotional appeals), and the manner in which presentation contributes to long term goals (of the individual, the community, society, religious deities, etc.) (Anderson, 1991 [in Conversations on Communication Ethics]). These dimensions interact in complex ways.”


“Consider Habermas’s (1984) ideal speech…. Communication acts within and among cognitive systems should be comprehensible (a criterion violated by intimidating technical jargon), true (violated by sincerely offered misinformation), justified (for example, not lacking proper authority or fearing repercussion), and sincere (speakers must believe their own statements). These principles can conflict, as when an utterance about a technical subject is simplified to the point of containing a degree of untruth in order to be made comprehensible to a lay person. Thus, they exist in a kind of equilibrium with each other, with circumstances attenuating the degree to which each principle is satisfied.”

Medical Ethics

“Four principles—observed during ethically convicted decision making—have been influential during the last decade in theorizing about medical ethics (Beauchamp & Childress, 1994): beneficence (provide benefits while weighing the risks), non-maleficence (avoid unnecessary harm), self-autonomy (respect the client’s wishes), and justice (such as fairly distributing benefits and burdens, respecting individual rights, and adhering to morally acceptable laws). People from different cultures and religions will usually agree that these principles are to be generally respected, although different people (from different cultures or ethical traditions) will often attach different relative importance to them.”

Pragmatic Interoperability

In another series of posts (five parts! 10,000 words!) I wrote about the concept of Pragmatic Interoperability. Key to pragmatic interoperability is understanding goals and actions in context, and then communicating in a cooperative fashion. Healthcare professionals are ethically required to cooperate with patients. Implicature is part of the linguistic science of cooperative communication.

“We’ll start with implicature’s core principle and its four maxims.

The principle is:

“Be cooperative.”

The maxims are:

  • Be truthful/don’t say what you lack evidence for
  • Don’t say more or less than what is required
  • Be relevant
  • Avoid obscurity & ambiguity, be brief and orderly”

I think most, or all, of the above ideas are relevant to figuring out what to say next when the wrong thing was said. I will be looking for examples during the Healthcare Leadership tweetchat.

Healthcare Leadership Tweetchat Topics

T1 Beyond classical adverse events like wrong-site surgery or incorrect medication dose, adverse communication events can also occur in healthcare. What types of troubling or harmful communication issues have you experienced that affected your care?

T2 Perceptions vary. Patients may perceive something as a problem, whereas the healthcare team just sees business as usual. How can patients help clinicians understand that perceived problems are as important as actual problems?

T3 What steps can help (quickly) establish rapport between health care practitioners and patients so that if communication goes off-track, each is better equipped to address the problem or perceived problem?

T4 If nurses or other care team members observe poor communication between a physician and patient, what is their obligation–how should they attempt to address the situation?

Why ICD-10? The “Most of What Government Does Isn’t Cost Effective Anyway” Defense

A.S.S. (4/2/14) Needless to tweet (but I’m sure to do so anyway), this blog post generated a lot of disagreement on Twitter. I’m prepending the choicest here, in what is called an “antescript.” In contrast to a postscript, it occurs at the beginning of a document. (Skip to blog post.)

A.S. (3/31/14) Well, ICD-10 was delayed for a year, to 2015. I wrote the blog post below the day before the vote. Today tweets containing #ICD10, #ICDdelay, #nodelay and #SGR flew fast and furiously. I predicted the outcome before the vote and extracted what I believe is the fundamental lesson.

My original blog post….

I was a premed Accounting major (from the perennially ranked #1 University of Illinois Department of Accountancy). I believe in cost-justifying anything by anyone, from me to companies to the government. I’m against stuff that harms physician workflow, productivity, and professional satisfaction (best route to patient satisfaction with their physician). So anyway, I’ve been following the debate about ICD-10 and tweeted a link to Kyle Samani’s Why ICD-10?

My, my, my!

I think Kyle wins the debate hands down, but this is the quote from a comment counterargument that gobsmacked me.

“I’ve read all of Halamka’s posts. He’s a smart guy for sure. If you want to take an Expected Value approach to making decisions then probably 80% of the things we do and what the government mandates wouldn’t pass muster. IMO a weak argument.”

The crazy thing is I get the same basic argument from lots of people! That and apparent inability to understand the concept of sunk cost re the potential ICD-10 delay.

Normally I absolutely hate animated GIFs. However, this one for “puzzlement” has a big strong Expected Value!


From Syntactic & Semantic To Pragmatic Interoperability In Healthcare

Reviews of this article and my idea of pragmatic interoperability for healthcare are rolling in!

So, what the heck is Pragmatic Interoperability?

“Pragmatic interoperability (PI) is the compatibility between the intended versus the actual effect of message exchange.” (ref) The best current practical candidate for achieving pragmatic interoperability is workflow technology (AKA BPM or Business Process Management and related tech). A candidate for measuring progress toward pragmatic interoperability in healthcare is diffusion of workflow technology into healthcare.

Last year I searched every HIMSS13 exhibitor website (1200+) for evidence of workflow tech or ideas (AKA Business Process Management & Dynamic/Adaptive Case Management). About eight percent of websites had such evidence. I tweeted links to this material on the #HIMSS13 hashtag during the conference. This year, before HIMSS14, I searched again. More than sixteen percent of websites had qualifying material. I also tweeted this material during the HIMSS14 conference, on the #HIMSS14 hashtag.

  • What is pragmatic interoperability? (Beyond that initial quote).
  • How is workflow technology relevant to pragmatic interoperability?
  • What do I hope to see next year at HIMSS15?

What is pragmatic interoperability?

To understand pragmatic interoperability, you must also understand syntactic and semantic interoperability. Syntax, semantics, and pragmatics are ideas from linguistics. You can think of language as being built from successive layers of information processing: phonetics, phonology, morphology, syntax, semantics, pragmatics, and discourse (conversation). (All of which I’ve taken as undergraduate and graduate courses.) Turns out linguistics is relevant to communication among health IT systems too — who da thunk!

Between healthcare IT systems, as between people, interoperability is ultimately more about conversation than mere message-passing transactions. Think about it. Think about how self-correcting conversation is. How much conversation depends on shared understanding of shared context. In fact, twenty years ago Frisse, Schnase, and Metcalfe wrote the following:

Models for Patient Records

“When performance is defined as the result of collective efforts rather than as the result of the actions of an individual, software systems supporting these activities may be labeled under the popular rubric groupware….Although it is tempting to think of these activities as “transactions” it is equally valid to consider them “conversations” related to the solution of specific tasks….Using conversations as a central metaphor for handling patients’ records reflects workflow in a clinical setting”

Healthcare desperately needs usable interoperability. We need interoperability at the level of user interface, user experience, provider and patient experience and engagement, not just syntactic and semantic interoperability. The best metaphor at this level of interoperability is conversation, not transaction. But we need pragmatic interoperability to get to conversational interoperability. And workflow tech is the best way to engineer self-repairing conversations among pragmatically interoperable health IT systems. (More on this in my next section.)

Communication among EHRs and other health IT systems must become more “conversational” if these systems are to become more resistant to errorful interpretation. “Which patient are you referring to?” (reference resolution) “I promise to get back to you” (speech act) “Why did you ask about the status of that report?” (abductive reasoning) These interactions involve issues of pragmatic interoperability (workflow interaction protocols over and above semantic and syntactic interoperability).


By the way, context plays a very important role in pragmatics. If you are interested in context, contextual design, context-aware computing, etc. you ought to take a look at how linguistics views context. A good place to start is Birner’s Introduction to Pragmatics.

I am in no way diminishing the importance of syntactic and semantic interoperability. In fact, we need both to get to pragmatic interoperability. Much is made of the need for EHRs to interoperate with each other and other information systems (as well it should). Current efforts focus on syntactic and semantic interoperability. Syntactic interoperability is the ability of one EHR to parse (in the high school English sentence diagram sense) the structure of a clinical message received from another EHR (if you are a programmer think: counting HL7’s “|”s and “^”s, AKA “pipes” and “hats”).
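That pipes-and-hats parsing can be sketched in a few lines. This is a toy illustration only: the segment below is made up, and real HL7 v2 parsing must also handle escape sequences, field repetition (~), and subcomponents (&).

```python
# Toy illustration of syntactic interoperability: splitting a simplified,
# made-up HL7 v2 segment on its "pipes" (fields) and "hats" (components).
# Real parsers also handle escapes, repetition (~), and subcomponents (&).

def parse_segment(segment: str) -> list:
    """Split an HL7 v2 segment into fields (|), then components (^)."""
    fields = segment.split("|")
    return [field.split("^") for field in fields]

# A made-up PID segment fragment: patient ID plus a name in components.
segment = "PID|12345|Doe^Jane^Q"
parsed = parse_segment(segment)

print(parsed[0])  # segment type: ['PID']
print(parsed[2])  # name components: ['Doe', 'Jane', 'Q']
```

Counting and splitting delimiters is all syntax: the receiving system knows where the fields are, but not yet what any of them mean.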

Semantic interoperability is the ability for that message to mean the same thing to the target EHR as it does to the source EHR (think controlled vocabularies such as RxNorm, LOINC, and SNOMED).

Meaning is shared between two systems (human or computer) when the following occurs:

  • An external real world entity (drug, diagnosis) is referred to by an internal concept
  • The internal concept is encoded as a symbol (word, number, ICD-9 code)
  • The symbol is transmitted across a channel (air, paper, TCP/IP, string)
  • The symbol is decoded to an internal concept
  • The internal concept refers to the same external real world entity (drug, diagnosis)

Two systems that can do this are “semantically interoperable.”
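As a toy sketch of those five steps, consider two systems that share meaning only because each can map between its internal concept and a shared symbol. The code value and term maps below are invented placeholders, not real RxNorm entries.

```python
# Toy sketch of the five steps above: a source system encodes an internal
# concept as a shared symbol (code), "transmits" it, and a target system
# decodes it back. The code "C-1001" is an invented placeholder, not RxNorm.

SOURCE_ENCODING = {"acetaminophen 325mg tab": "C-1001"}  # concept -> symbol
TARGET_DECODING = {"C-1001": "acetaminophen 325mg tab"}  # symbol -> concept

def send(concept: str) -> str:
    """Steps 1-3: internal concept encoded as a symbol and transmitted."""
    return SOURCE_ENCODING[concept]

def receive(symbol: str) -> str:
    """Steps 4-5: symbol decoded back to an internal concept."""
    return TARGET_DECODING[symbol]

sent = "acetaminophen 325mg tab"
received = receive(send(sent))
assert received == sent  # both concepts refer to the same real-world drug
```

If the two maps disagree (different abstractions of the same drug, or different codes for the same concept), the assertion fails; that failure is exactly a semantic interoperability problem.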

Plug-and-play syntactic and semantic interoperability is currently the holy grail of EHR interoperability. We hear less about the next level up: pragmatic interoperability. As soon as, and to the degree that, we achieve syntactic and semantic interoperability, issues of pragmatic interoperability will begin to dominate. And they will manifest themselves as issues about coordination among EHR workflows. In fact, issues of pragmatic interoperability are already beginning to arise, although they are not always recognized as such.

Here are succinct descriptions of semantic versus pragmatic interoperability:

“Semantic interoperability is concerned with ensuring that a symbol has the same meaning for all systems that use this symbol in their languages. Symbols [represent] real world entities indirectly (i.e., through the concept they represent). Therefore, the semantic interoperability problems are caused either by different abstraction of the same real-world entities or by different representations of the same concepts….”

“Pragmatic interoperability is concerned with ensuring that the exchanged messages cause their intended effect. Often, the intended effect is achieved by sending and receiving multiple messages in specific order, defined in an interaction protocol.” (ref)

This last quote elaborates my first quote about pragmatic interoperability. At this point we must consider what specific technology is available to begin to create pragmatically interoperable health IT systems.

How is workflow technology relevant to pragmatic interoperability?

So, how does workflow technology tie into pragmatic interoperability? The key phrases linking workflow and pragmatics are intended effect and specific order.

Imagine a conversation between a primary care EHR workflow system and a specialty care EHR workflow system that goes like this:

  • EHR workflow systems (WfSs) will need to coordinate execution of workflow processes among separate but interacting EHR WfSs. For example, when a general practice EHR workflow system (GP EHR WfS) forwards (“Invoke”) a clinical document to a subspecialist who is also using an EHR workflow system (SS EHR WfS), the GP EHR WfS eventually expects a referral report back from the SS EHR WfS.
  • When the result arrives (“Result”), it needs to be placed in the relevant section in the correct patient chart and the appropriate person needs to be notified (perhaps via an item in a To-Do list).
  • If the expected document does not materialize within a designated interval (“Monitor”), the GP EHR WfS needs to notify the SS EHR WfS that such a document is expected and that the document should be delivered or an explanation provided as to its non-delivery. The SS EHR WfS may react automatically or escalate to a human handler.
  • If the SS EHR WfS does not respond, the GP EHR WfS may cancel its referral (“Control”) and also escalate to a human handler for follow up (find and fix a workflow problem, renegotiate or terminate an “e-Contract”).

The sequence of actions and messages — Invoke, Monitor, Control, Result — that’s the “specific order” of conversation required to ensure the “intended effect” (the Result). Interactions among EHR workflow systems, explicitly defined internal and cross-EHR workflows, hierarchies of automated and human handlers, and rules and schedules for escalation and expiration will be necessary to achieve seamless coordination among EHR workflow systems. In other words, we need workflow management system technology to enable self-repairing conversations among EHR and other health IT systems. This is pragmatic interoperability. By the way, some early workflow systems were explicitly based on speech act theory, an area of pragmatics.
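The Invoke-Monitor-Control-Result loop above can be sketched as a toy state machine. The class, state names, and deadline below are mine, purely illustrative, not any standard’s.

```python
# A minimal sketch (names invented for illustration) of the GP-side referral
# workflow: Invoke a referral, Monitor for the Result, escalate if overdue,
# and Control (cancel) if the specialist system never responds.

from dataclasses import dataclass, field

@dataclass
class Referral:
    patient_id: str
    state: str = "INVOKED"  # INVOKED -> RESULTED, or -> OVERDUE -> CANCELLED
    events: list = field(default_factory=list)

    def result_arrived(self):
        """Result: file the report and notify the right person."""
        self.state = "RESULTED"
        self.events.append("file report in chart; notify via To-Do list")

    def monitor(self, days_waiting: int, deadline: int = 14):
        """Monitor: flag the referral if no result within the deadline."""
        if self.state == "INVOKED" and days_waiting > deadline:
            self.state = "OVERDUE"
            self.events.append("notify specialist system; escalate to human")

    def control(self):
        """Control: cancel an overdue referral and escalate for follow-up."""
        if self.state == "OVERDUE":
            self.state = "CANCELLED"
            self.events.append("cancel referral; renegotiate e-Contract")

ref = Referral("12345")
ref.monitor(days_waiting=21)  # no result after three weeks
ref.control()
print(ref.state)  # CANCELLED
```

Each transition is one turn in the cross-system conversation; the escalation entries are the “self-repair” moves that keep the intended effect (the Result) from silently falling through the cracks.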

What do I hope to see next year at HIMSS15?

I answered the question, what I’d like to see next year at HIMSS15 regarding workflow and workflow technology, in an interview at HIMSS14.

Here is the transcript of the relevant portion of that interview:


What do [I] want to see coming out of HIMSS14 so [I] am even more excited next year (at HIMSS15)?


I am pretty monomaniacally interested in workflow. Because workflow is a series of steps, each of which consumes a resource (a cost) and achieves some goal (that’s the value). Health IT needs more success in this area.

I want to see that sixteen or twenty percent this year [of exhibitors emphasizing workflow], that was eight percent last year; I’d like to see that go to a third of the vendors. And I would also like to see big signs saying “Here’s our top ten workflow aware vendors,” that kind of marketing to reward the folks investing in this technology.

All-in-all, I was delighted to see a doubling of HIMSS14 exhibitors emphasizing workflow ideas and technology. And guess what? You ain’t seen nothing yet.

Whatever else Meaningful Use has done, many proponents and opponents would agree, it’s created a multi-billion dollar industry of EHR “work-arounds.” From EHR-extenders to speech recognition interfaces to full-blown workflow platforms riding on top of EHRs and related systems, workflow infrastructure, which ideally should have been implemented in the first place, is being added piecemeal. It’s flowing around legacy systems, talking to them through APIs when available, through reverse-engineered interfaces when not. Pragmatic interoperability provides users the “What’s next?” that many current systems can’t.

Social, mobile, analytics, and cloud (SMAC) get glory and credit. But often, under the hood, is workflow technology: workflow engines, process definitions, graphical editors and workflow analytics. Some EHRs will fare better than others. EHRs with open APIs will become plumbing. EHRs relying on workflow tech themselves will more naturally meet pragmatic interoperability half-way. Or one-quarter or three-quarter way, depending on their own degree of process-aware architecture and infrastructure.

As I wrote in another blog post about HIMSS14, software application architecture evolves through generations. In other industries, software designers are taking workflow out of applications, just as they moved data from apps to databases. Many of the usability and interoperability problems bedeviling health IT will become more manageable if we stop hardcoding workflow.

If you’ll recall, earlier I alluded to the relevance of pragmatic interoperability and workflow technology to usability and user experience. So, just to hammer home the idea that workflow technology is the most natural and practical way to deliver true “deep usability”, let me point out that back in 1999, the Workflow Management Coalition defined an eight-level model of workflow interoperability. It started with “Level 1: No interoperability” and went all the way up to “Level 8 – Common Look and Feel Utilities: Co-operating WFMS present a standard user interface.” (ref) Interoperable workflows and workflow usability are two sides of the same coin.

Let’s represent healthcare workflow so users can understand it and workflow engines can execute it. In doing so, we’ll finally make progress toward pragmatic interoperability and usable health IT systems.

I Interview New M*Modal Website(!) on Future of Language and Workflow in Healthcare


Last week, during #HIMSS13, I tweeted out individual questions and answers from the following interview with [wait for it!] the new M*Modal website. Here is the combined interview. I’ve included the original tweets so you can retweet answers to individual questions….

I’m going to try something a little different this week. I’ll talk to a website! I usually submit geeky questions about workflow or language technology to an industry expert, then top it off with a One-Minute Interview (on YouTube) embedded in the resulting blog post.

I recently interviewed M*Modal’s Chief Scientist Juergen Fritsch, Ph.D. Like any good interview, it left me wanting more. As smart as Juergen is, using a whole-is-greater-than-the-parts logic, M*Modal must be even smarter than he is. But I can’t interview 12,000 people. So I decided to have a conversation with M*Modal’s new website.


By the way, I’m aware of and concerned about walking the fine line between education and marketing (and have written about it). I am not endorsing any M*Modal product or service. However, I’ve written hundreds of thousands of words (and 15,000 tweets!) about workflow tech in healthcare. I look for confirmation wherever I can find it. 🙂 I certainly endorse the combination of workflow technology and language technology to help make EHRs and health IT systems more usable and useful. M*Modal is a leader in this area and I appreciate their cooperation in increasing public understanding of both workflow tech and language tech opportunities.

My interviews? They’re more like conversations in which I talk almost as much as the person (or, in this case, website) I’m talking to. I’ll mention earlier blog posts, quote from Wikipedia, even textbooks. I eventually do get to the point. I don’t think my interviewees mind. I’m not like Larry King, who reputedly never read the books before he interviewed their authors (“So, what’s your book about?”) I’m more like Brian Lamb on C-SPAN (“On page 582 you write, [Brian reads a couple paragraphs]. What did you mean by that?”)

So, M*Modal website, thank you for agreeing to this interview. Silence. Hmm.

I searched for “workflow”. I got 165 hits. I looked at each instance and context. Using the most interesting material I created 10 “answers.” Then I wrote the questions.

Let’s try again…

1. M*Modal website, in a nutshell, in words people who aren’t rocket scientists or computational linguists can understand, what problem are you trying to solve?

“Physicians are natural storytellers. They prefer to document the complete patient story by simply speaking and naturally capturing the full narrative. With Electronic Health Records (EHR) it’s not that simple. Clinicians have to change their behavior and use point-and-click into various templates that just can’t tell the whole story. Using EHRs, collaboration remains difficult, prone to errors and incomplete. Speech-based narrative documentation is workflow-friendly and permits the whole story to be told, and easily and more completely passed along, creating a much more collaborative sharing of intelligence from doctor to doctor.”

Nicely put! EHR usability is a big issue these days. “Clickorrhea” does seem part of the problem. Got it.

2. From your unique perspective, what is the connection between language tech and workflow tech?

“This is an absolute dead-on question, I’m so happy you asked it. The important connection is if we would just do speech-to-text transcription we wouldn’t affect anything. We’d just be creating a piece of text, without being able to drive actions. Ultimately we want to drive that action in the workflow – for example, have a physician create that order for a new medication. We want to make sure follow up happens and facilitate the workflow that enables that process from beginning to end. Also, healthcare is all about collaboration among providers. There is a lot of patient handoff and effective coordination of care doesn’t happen nearly as much as it should, and it only happens if proper workflow processes are in place. If we’re not trying to get involved in that process and drive more effective workflow processes, we’re not being successful in affecting change.”

(You’re right, the M*Modal website doesn’t have “This is an absolute dead-on question, I’m so happy you asked it.” on it anywhere. That would be a remarkable feat of dynamic natural language generation and extrasensory perception now, wouldn’t it! It’s Juergen’s answer to question 8 in that recent interview. However, the interview is noted on M*Modal’s website, with a link to the full interview on my blog.)

3. I’m especially interested in how workflow technology, combined with language technology, can improve efficiency and user experience. Could you expand a bit on those themes?

(“Certainly” I faintly hear.)

“By extracting, aggregating, analyzing and presenting clinical information based on business intelligence, M*Modal imaging solutions make sure that the right information is available at the right time for game-changing workflow management. Based on semantic understanding, M*Modal technology dynamically reacts to what is said and what is known from priors to automatically initiate a unique, information-driven, situationally-appropriate workflow. This content-based, real-time, corrective and pre-emptive physician feedback and decision support not only enhance efficiency and user experience, but also support downstream processes like compliance, coding and quality reporting.”

Wow! Now this is a lot more technical! However, I wrote a paper a couple years ago about using event processing and workflow engines to improve “EHR Productivity.” Let me go back and reread that…. OK…. yes, I do think we are speaking of similar ideas. I wrote about use of structured EHR data to trigger EHR workflows, not unstructured free text, but it’s a similar idea. Let me tease this apart.

  • Speech recognition turns sounds into free text.
  • Natural language processing turns free text into structured data.
  • Semantic understanding figures out what it means and which workflows to trigger.
  • So, based on what the physician says, within moments after it’s said, asking for clarification if necessary, tasks are automatically queued, executed, tracked, etc.

Am I right?

And then the strangest thing happened. The webpage refreshed and the following text appeared:

“In principle you’re right, except that this is not a sequential process where one technology works on the output of the previous one. Instead, we have tightly integrated speech recognition, natural language processing and semantic understanding in a way that they complement each other. For example, speech recognition accuracy is improved by leveraging some of the semantic understanding that would indicate that a physician is talking about patient problems, rather than patient medications. When you adequately combine all the technologies mentioned above, you get more out of it than just the sum of their individual capabilities.”

The M*Modal website has some seriously wicked tech to pull that off! Both natural language processing *and* natural language understanding, not to mention remote extrasensory perception!
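To make the distinction concrete, here is a toy contrast between my sequential reading and a feedback-loop design, where a topic hint from the semantic stage feeds back into recognition. All stage functions and lexicons below are invented for illustration; they are in no way M*Modal’s actual technology.

```python
# Toy contrast (stage functions and lexicons invented, not M*Modal's):
# a strictly sequential pipeline vs. one where a semantic topic hint
# feeds back to resolve an acoustically ambiguous token.

PROBLEM_TERMS = {"pneumonia", "hypertension"}

def recognize(audio_word: str, topic_hint: str = "") -> str:
    """Pretend recognizer: an ambiguous token is resolved by the hint."""
    if audio_word == "???":  # acoustically ambiguous token
        return "hypertension" if topic_hint == "problems" else "lisinopril"
    return audio_word

def classify_topic(words: list) -> str:
    """Pretend semantic stage: problem list or medication list?"""
    return "problems" if PROBLEM_TERMS & set(words) else "medications"

words = ["pneumonia", "???"]                                # raw "audio"
topic = classify_topic([recognize(w) for w in words])       # sequential pass
resolved = [recognize(w, topic_hint=topic) for w in words]  # feedback pass
print(resolved)  # ['pneumonia', 'hypertension']
```

The sequential pass alone mis-hears the ambiguous token as a medication; once the semantic stage reports that the dictation is about patient problems, the second pass resolves it as a problem term instead.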

4. I’m a visual kinda guy. At a high level, what does your language and workflow platform look like?


Thanks. Let’s unwind the workflow from the moment a physician says something to the moment it helps someone.

  • Real-Time Speech Recognition
  • Cloud
  • Automated transcription
  • Human post-editing?
  • Cloud
  • Natural language processing
  • Cloud
  • (Then, in parallel, no particular order)
    • Insert data into EHR
    • Submit codes to billing
    • Distribute management reports
    • Analyze data to improve effectiveness and efficiency

Speech Understanding, the small cloud on the upper right, is sort of a label for the entire cloud, including speech recognition, natural language understanding, and workflow orchestration, right?

How did I do?

“Very well!! And as I said in my previous comment, Speech Understanding represents a tight integration of various technologies, using them in a non-linear way.”

The M*Modal website is even diplomatic! That requires remarkable discourse processing technology.

5. It would be great if we could actually track a hypothetical phrase, from beginning to end, through what NLP engineers call a “linguistic pipeline.”

“While we could provide an example of that, it would look fairly generic and like any other NLP pipeline you may have seen before. The core differentiator of M*Modal’s speech understanding technology is that we don’t run a sequential pipeline, but that we have feedback loops and non-linear interactions between the individual stages of speech recognition, NLP, etc.”

“[F]eedback loops and non-linear interactions”, yes I’ve read about this. Speech and language understanding is a complex mixture of data-driven, bottom-up processing and context-driven, top-down processing. (Just think of how many times you don’t actually “hear” what’s said, but know it nonetheless purely from context.)

6. About that sub-cloud labeled “Workflow Orchestration”… Are we talking “workflow orchestration” in the same sense it is used in the workflow automation and business process management community?

From Wikipedia:

Workflow engines may also be referred to as Workflow Orchestration Engines.

“The workflow engines mainly have three functions:

    • Verification of the current status: Check whether the command is valid in executing a task.
    • Determine the authority of users: Check if the current user is permitted to execute the task.
    • Executing condition script: After passing the previous two steps, workflow engine begins to evaluate condition script in which two processes are carried out, if the condition is true, workflow engine execute the task, and if execution successfully complete, it returns the success, if not, it reports the error to trigger and roll back the change.

Workflow engine is the core technique for task allocation software application, such as BPM in which workflow engine allocates task to different executors with communicating data among participants. A workflow engine can execute any arbitrary sequence of steps. For example, a workflow engine can be used to execute a sequence of steps which compose a healthcare data analysis.”
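The three functions in that quoted list can be sketched as a toy engine. The roles and tasks below are invented examples, not from Wikipedia or M*Modal.

```python
# Toy engine implementing the three functions from the quote: verify the
# task's current status, check the user's authority, then evaluate a
# condition before executing. Roles and tasks are invented examples.

PERMISSIONS = {"transcriptionist": {"proofread"}, "coder": {"assign_codes"}}

def run_task(task: str, status: str, user_role: str, condition: bool) -> str:
    # 1. Verification of the current status
    if status != "ready":
        return "rejected: task not ready"
    # 2. Determine the authority of the user
    if task not in PERMISSIONS.get(user_role, set()):
        return "rejected: not permitted"
    # 3. Execute only if the condition script evaluates true
    if not condition:
        return "skipped: condition false"
    return f"executed: {task}"

print(run_task("proofread", "ready", "transcriptionist", condition=True))
# executed: proofread
print(run_task("assign_codes", "ready", "transcriptionist", condition=True))
# rejected: not permitted
```

In a real engine the condition would be an evaluated script and failures would trigger rollback, but the gatekeeping order — status, authority, condition — is the same.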

Do you use a workflow engine? Could you describe what we discussed earlier in terms of this engine?

“We do use a workflow engine in various of our solutions. In the case of clinical documentation services, it is used to orchestrate the processing, proofreading and distribution of millions of clinical documents per year, involving tens of thousands of users. In the case of coding and clinical documentation improvement workflows, it is used to orchestrate intricate workflows involving a combination of technology and humans, with lots of different users with different roles.”

That’s fantastic! I think healthcare needs more true workflow technology, such as what you describe. I increasingly prepend “Workflow engine sighting” to links I tweet from @EHRworkflow.

7. But I’d like to shift gears now, over to the computational linguistics and natural language processing side. Computational linguistics, the science behind the NLP engineering, includes conversation (discourse) and achieving goals (pragmatics), not just sounds, syntax, and semantics. Where do you see medical language technology going in this regard?

“Again, you hit it dead-on – in the past, people have ignored the pragmatics aspect. At M*Modal we have been focused on pragmatics since the very beginning. Where it’s all going is being able to understand the content of speech, using semantics and syntax to understand what people are really talking about. You are absolutely right that without pragmatics we’d never be able to accomplish what we’re trying to with NLP technology.”

8. I picked up a copy of Introduction to Pragmatics. It was a great review, since the last graduate course in pragmatics that I took was so long ago. And I read it! (I’m planning a blog post about the importance of pragmatics to EHR and HIT interoperability and usability.)

At the end of the book, in the summary, was this:

“Who could doubt that the world of artificial intelligence will soon bring us electronic devices with which we can hold a colloquial natural-language conversation? The problem, of course, is pragmatics. Not to slight the difficulties involved in teaching a computer to use syntax, morphology, phonology, and semantics sufficiently well to maintain a natural-sounding conversation, because these difficulties are indeed immense; but they may well be dwarfed by the difficulties inherent in teaching a computer to make inferences about the discourse model and intentions of a human interlocutor. For one thing, the computer not only needs to have a vast amount of information about the external world available (interpreting I’m cold to mean “close the window” requires knowing that air can be cold, that air comes in through open windows, that cold air can cause people to feel cold, etc.), but also must have a way of inferring how much of that knowledge is shared with its interlocutor.”


“Thus, the computer needs, on the one hand, an encyclopedic amount of world knowledge, and on the other hand, some way of calculating which portions of that knowledge are likely to be shared and which cannot be assumed to be shared – as well as an assumption (which speakers take for granted) that I will similarly have some knowledge that it doesn’t. Beyond all this, it needs rules of inference that will allow it to take what has occurred in the discourse thus far, a certain amount of world knowledge, and its beliefs about how much of that world knowledge we share, and calculate the most likely interpretation for what I have uttered, as well as to construct its own utterances with some reasonable assumptions about how my own inferencing processes are likely to operate and what I will most likely have understood it to have intended. These processes are the subject of pragmatics research.”

In his recent interview, Juergen said “At M*Modal we have been focused on pragmatics since the very beginning”. Could you expand on his comments?

You would be justified in suspecting that the answer to this question is not to be found on the M*Modal website. However, Wikipedia says “Pragmatics is a subfield of linguistics which studies the ways in which context contributes to meaning.”

“Context” occurs 48 times on the M*Modal website. For example:

  • “Healthcare Challenges and Context-Enabled Speech”
  • “the real context and meaning behind a physician’s observations”
  • “Context-specific patient information — from prior reports, EHRs, RIS, PACS, lab values, pathology reports, etc.”
  • “providing real understanding of context and meaning in the narrative – not simply term matching or tagging”
  • “combine […workflow management…] with Natural Language Understanding to bring context to text”
  • “Enabling physicians to populate the EHRs with color, context and reasoning without changing their established workflow”
  • “context-aware content that is codified to standardized medical lexicons, such as SNOMED® CT, ICD, RadLex®, LOINC, and others”

I love the connection between context and workflow. I’ve written about that too. But my point here is: if pragmatics is about context, and M*Modal is about context, then M*Modal is about pragmatics too. I won’t go any further into the subject of the importance of pragmatics to healthcare workflow. I’m planning a future blog post about the importance of discourse, reference, speech acts, implicature, intent, inference, relevance, etc. to EHR interoperability and usability.

In our interview, when Juergen said “You are absolutely right that without pragmatics we’d never be able to accomplish what we’re trying to with NLP technology,” what did he mean?

“The context of any natural language statement is extremely important for correct semantic understanding. It is not sufficient to identify a key clinical concept like ‘pneumonia’ in a statement like ‘Two months ago, the patient was diagnosed with pneumonia, which turned out to be a mis-diagnosis.’ Pragmatics (context, really) informs us that the statement is about the patient, that it is about something that occurred two months ago, and that it was a false diagnosis. Without a level of pragmatics, we would completely misinterpret that statement.”
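Juergen’s pneumonia example can be made concrete with a toy sketch. The code below is mine, not M*Modal’s: a crude, rule-based context annotator in the spirit of algorithms like ConText, which attaches pragmatic attributes (experiencer, time, assertion status) to a clinical concept found in a sentence. The cue lists are invented for illustration.

```python
import re

# Hypothetical cue lists -- real systems use far richer lexicons and rules.
NEGATION_CUES = ["mis-diagnosis", "misdiagnosis", "ruled out", "no evidence of"]
TEMPORAL_CUE = re.compile(
    r"\b(\d+|one|two|three|four|five|six)\s+(day|week|month|year)s?\s+ago", re.I)

def annotate(sentence: str, concept: str) -> dict:
    """Attach pragmatic attributes to a concept mentioned in a sentence."""
    text = sentence.lower()
    ago = TEMPORAL_CUE.search(sentence)
    return {
        "concept": concept if concept.lower() in text else None,
        "experiencer": "patient" if "patient" in text else "unknown",
        "when": ago.group(0) if ago else "current",
        # False => the diagnosis was negated or retracted
        "asserted": not any(cue in text for cue in NEGATION_CUES),
    }

note = ("Two months ago, the patient was diagnosed with pneumonia, "
        "which turned out to be a mis-diagnosis.")
print(annotate(note, "pneumonia"))
```

Without the temporal and negation attributes, a naive concept matcher would report this patient as currently having pneumonia, which is exactly the misinterpretation Juergen describes.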

9. By the way, while the web page didn’t come up in response to my “workflow” query, I stumbled across an M*Modal developer certification program. Which leads me to my final question. All of this workflow technology and language technology is for improving efficiency and user experience, right? How do I, as a developer (and I are one), harness what you have created?

HCIT vendors can take advantage of M*Modal’s free Partner Certification Program. M*Modal Fluency Direct speech-enables electronic health records (EHRs) and other clinical documentation systems by verbally driving actions normally associated with point-and-click, templated environments.

    • No cost to certify or for yearly recertification
    • Access to product development engineers
    • Access to product development documentation
    • Onsite engineering-focused, peer-to-peer training session
    • Featured on program website
    • Allowed to use a specialized certified logo
    • Co-marketing and marketing opportunities
    • Signage for tradeshows
    • Product labels and specialized documentation

How to Get Started

M*Modal has made certification as simple and smooth as possible. The certification process consists of an onsite Speech Enablement Workshop at no cost to the vendor. To get started, vendors simply register on the program website or email us. We will follow up with you and provide additional information to prepare you for the certification workshop.

Well now! I have to admit you nailed that last question. You even used bullet points. I’ve never, ever, had an interviewee who (er, which, that, you tell me, you’ve got all the grammar rules!) did that before.

I appreciate all the time you’ve spent with me. I hope I didn’t put too much of a strain on the web server. If anyone has any follow up questions, are you on Twitter?

Cool! I already follow you.

Well, that was my interview, about the future of language and workflow, with the website. I’m sure you’ll agree that it’s remarkable.

My Collected Blog Posts on Clinical Natural Language Processing


I created this post so I could tweet a link to more than one of my posts at a time. I’m sure there will be many more. I’ll add them here. By the way, the correct Twitter hashtag for natural language processing is #NLProc, not #NLP. Really! Click the previous links to compare for yourself.

Natural Language Processing and Healthcare Workflow: Interview with M*Modal’s Chief Scientist Juergen Fritsch, Ph.D.

Here’s a fantastic video and written interview with Juergen Fritsch, Ph.D. of M*Modal. M*Modal is a leading provider of clinical transcription services, clinical documentation workflow solutions, advanced cloud-based speech understanding technology and advanced unstructured data analytics. Juergen is Chief Scientist at M*Modal. (Sounds fun!)

Juergen’s insightful responses, to my in-the-weeds questions about language technology and healthcare workflow, hit so many nails on so many heads that I run out of metaphors. And his heartfelt thoughts on starting a company make me re-appreciate how lucky I am to live and work in the good-ol’ USA!


Here’s Juergen without that pesky YouTube play button
(see below) that turns everyone into an “Arrow Head”!

Juergen Fritsch on LinkedIn

Only one question (#8) is about workflow per se (though it’s implicit in others), but since that is what this blog is about, I telegraphed my punch in the title: Natural Language Processing and Healthcare Workflow…. EHRs and health IT need both workflow and language technology if they are to have a shot at making healthcare substantially more effective, efficient, and satisfying.

One-Minute Interview:
Juergen Fritsch on Founding of M*Modal and US Innovation
[Nota bene! I captured this video at two frames a second using Skype. Its quality, or lack thereof, is my sole responsibility!]

  1. Who is Juergen Fritsch?
  2. Structured vs. Unstructured Data
  3. Closed loop documentation with automated feedback
  4. “M*Modal”? “Multi-modal”?
  5. Juergen’s Ph.D. Thesis
  6. Does firing linguists improve speech recognition?
  7. Starting a company in the US
  8. The workflow tech/language tech connection
  9. Moving to pragmatics & discourse processing
  10. Medical equivalent of the HAL computer?

(By the way, while I used Skype for the One-Minute Interview segment of this post, at #HIMSS13 I’ll be out and about with my Infamous HatCam. Tweet me at @EHRworkflow if you’d like a shot at almost real-time stardom. I’ll record, edit, get your OK, upload to YouTube, and tweet your booth number and #HIMSS13 hashtag, all literally on the spot! Here’s an example from the #HIMSS12 exhibit floor.)

QUESTIONS – Answers from Juergen Fritsch, Chief Scientist, M*Modal

1. Who are you? Where do you work? What is your role?

I’m Juergen Fritsch and I serve as Chief Scientist at M*Modal. I’m responsible for all innovation activities around M*Modal’s speech understanding and clinical documentation workflow solutions.

2. In your recent AHIMA presentation, your closing slide included the following bullets:

  • Unstructured documentation not sufficient
  • Structured data entry via EHRs not sufficient

What do you mean by this?

Re: second point: The government is pushing for structured data entry, but that EHR paradigm is not sufficient because it forces physicians to abbreviate and be minimalistic in their approach to clinical documentation. Physicians no longer have time to tell the full patient story, leaving quality on the table and creating substandard clinical documentation.

Re: first point: Unstructured documentation is not sufficient because it’s a blob of text; although very valuable for the physician to read, it doesn’t allow a computer to read it and then drive action. The clinical facts are hidden in the unstructured blob of text and are not actionable as a result.

3. With respect to the same slide, what do you mean by “closed loop documentation with automated feedback”?

Closed loop means bidirectional. As a physician, when I do documentation I do not want to just provide input; I need to be able to get feedback, to hear back from the system, as in “yes, this is sufficient, you’ve given all the details needed.” In most cases there’s a lack of specificity in the documentation. Here’s an example:

If I’m documenting a patient with a hand fracture I will need to comply with ICD-10 and be able to provide detail as to which fingers, which arm, is it healing or not. It’s a lot of detail from a billing perspective that physicians may not even think to provide on their own. Closed loop documentation helps with that by constantly observing the information being provided and prompting the physician to fill in missing information or address a lack of specificity in the system so at the end you get the best possible documentation with the least amount of effort.
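The closed-loop idea above can be sketched in a few lines. This is my own toy illustration, not M*Modal’s implementation: the system watches what has been documented so far and feeds back a prompt for each piece of missing specificity. The required fields are invented for the example; real ICD-10 coding requirements are far richer.

```python
# Hypothetical specificity requirements for a hand-fracture diagnosis.
REQUIRED_FOR_HAND_FRACTURE = ["laterality", "finger", "healing_status"]

def feedback(doc: dict) -> list:
    """Return a prompt for each piece of specificity still missing."""
    return [f"Please specify: {field}"
            for field in REQUIRED_FOR_HAND_FRACTURE if field not in doc]

# The physician has documented the diagnosis and laterality so far:
note = {"diagnosis": "hand fracture", "laterality": "left"}
print(feedback(note))
```

Run after each dictated phrase, a loop like this “constantly observes the information being provided” and goes quiet only once the documentation is complete.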

4. When I look at “M*Modal”, I think “multimodal”. Right or wrong? How did you (or whoever) come up with the name? What does the “multi” refer to?

Great question. Yes, M*Modal refers to multimodal, and means that only one way of doing things is not sufficient. In other words, different physicians have different approaches, preferences and needs. For example, an ER physician may not have a hands-free environment, and may want to use a microphone when creating documentation, while a primary care physician can be in front of a computer and enter things right then and there with the patient. At M*Modal, we want to be multimodal and not force physicians into one way of doing things, but accommodate them, their needs and different ways of completing documentation.

5. The title of your Ph.D. thesis was “Hierarchical Connectionist Acoustic Modeling for Domain-Adaptive Large Vocabulary Speech Recognition.” In basic, non-mathematical terms, could you explain your research? Is it still relevant? How has speech recognition evolved since?

That thesis was about using artificial neural networks to do speech recognition. At the time there was not much research along those lines; the prevailing way of doing things used statistical methods. I tried to apply artificial neural networks and was quite successful. Interestingly, there was just recently a renaissance of that idea, and people have picked it up again with a slight twist. It has evolved, and I can provide more details via our video interview if you are interested. What’s exciting to me is that, after five to seven years, people are doing it again.

6. I studied computational linguistics back when one took courses in linguistics and GOFAI (Good Old Fashioned Artificial Intelligence). Statistical and machine learning approaches superseded that kind of NLP with considerable success. Will the pendulum continue to swing? In which direction? In other words, is “Every time I fire a linguist, the performance of our speech recognition system goes up.” still true?

Unfortunately, yes, this is still true – mostly because we have so much data available to us. Statistical methods have been so successful in replacing the old-school linguistic approaches because of good, plentiful data. The enormous amount of data available makes those methods difficult to replace.

7. You performed original research, founded a company, and continue to evolve those ideas and that product. This must be personally satisfying. Could you share some thoughts about science, innovation, jobs, and economic progress?

I found it extremely gratifying coming to the U.S. as a student, not having been born or raised here, and being able to work on challenging, cutting-edge research problems. Then getting the opportunity to form a company and how relatively easy it was to get started, and how much people gave a very small company with only about 10 people and not much revenue a chance, was also very gratifying. I would never have been able to do this in my home country – there would have been too many obstacles and people would not have been ready to bet on a startup as much as they do here. The American culture of giving the underdog a chance to try out new ideas as long as they are perceived to be valuable is very rewarding. I would encourage students of various disciplines to try the same things.

8. I write and tweet a lot about workflow management systems and business process management systems in healthcare. These include, at the very least, workflow engines and process definitions. To me, there do seem to be some similarities, or at least a complementary fit, between language technology and workflow technology. For example, on the M*Modal website is a short page where “workflow” is mentioned nine times, as well as “workflow orchestration” and a “workflow management module.”

From your unique perspective, what is the connection between language tech and workflow tech?

This is an absolute dead-on question, I’m so happy you asked it. The important connection is this: if we just did speech-to-text transcription, we wouldn’t affect anything. We’d just be creating a piece of text, without being able to drive actions. Ultimately we want to drive that action in the workflow – for example, have a physician create that order for a new medication. We want to make sure follow-up happens and facilitate the workflow that enables that process from beginning to end. Also, healthcare is all about collaboration among providers. There is a lot of patient handoff, and effective coordination of care doesn’t happen nearly as much as it should; it only happens if proper workflow processes are in place. If we’re not trying to get involved in that process and drive more effective workflow processes, we’re not being successful in effecting change.

9. As you know, computational linguistics, the science behind the NLP engineering, is about more than sound (phonetics and phonology), sentence structure (syntax), or even meaning (semantics). It’s also conversation (discourse) and achieving goals (pragmatics). Where do you see medical language technology going in this regard?

Again, you hit it dead-on – in the past, people have ignored the pragmatics aspect. At M*Modal we have been focused on pragmatics since the very beginning. Where it’s all going is being able to understand the content of speech, using semantics and syntax to understand what people are really talking about. You are absolutely right that without pragmatics we’d never be able to accomplish what we’re trying to with NLP technology.

10. How many years until we have the medical equivalent of the HAL computer depicted in the movie 2001?

Hopefully never! ☺ We are getting there in a different way but I don’t think that the computer will ever replace humans. The computer will provide information, guide and educate the user, but not replace the human decision-making process.

Nuance’s 2012 Understanding Healthcare Challenge: Natural Language Processing Meets Clinical Language Understanding

I’ve written a lot recently about natural language processing in healthcare.

Language technology and workflow technology have lots of interesting connections. As I previously discussed:

  • NLP and workflow often use similar representations of sequential behavior.
  • Speech recognition promises to improve EHR workflow usability but needs to fit into EHR workflow.
  • Workflow models can provide context useful for interpreting speech and recognizing user goals.
  • NLP “pipelines” are managed by workflow management systems.
  • Workflows among EHRs and other HIT systems need to become more conversational (“What patient do you mean?”, “I promise to get back to you”, “Why did you ask me for that information?”)

So, writing about NLP reflects the name of this blog: EHR Workflow Management Systems.

Therefore I was delighted when Nuance Communications, provider of medical speech recognition and clinical language understanding technology, approached me to highlight their 2012 Understanding Healthcare Challenge. An interview with Jonathon Dreyer, Director of Mobile Solutions Marketing, follows. In a postscript I add my impressions of going through the process of gaining access to Nuance’s speech recognition and clinical language understanding SDKs (Software Development Kits).

By the way, I think a vendor sponsoring a challenge and giving away developer support packages is one of the best ideas I’ve come across in quite a while. If anyone else decides to follow suit, I’d love to interview and highlight your SDK and related resources too!

Interview with Jonathon Dreyer, Director, Mobile Solutions Marketing, Healthcare Division at Nuance Communications, about the 2012 Understanding Healthcare Challenge

Jonathon, thanks for taking time out of your schedule to speak with me! I enjoyed interviewing Nuance’s Chief Medical Informatics Officer, Dr. Nick van Terheyden, and turning that into a blog post.

Video Interview and 10 Questions for Nuance’s Dr. Nick on Clinical Language Understanding

So I look forward to doing the same with you!

Dr. Nick did a great job of putting speech recognition and natural language understanding into clinical context. That interview was directed toward user-clinicians. Let’s focus this interview on developers. (As well as curious clinicians; lots of physicians are learning about IT these days!)

1. What’s your name and role at Nuance?

Jonathon Dreyer, Director, Mobile Solutions Marketing, Healthcare Division at Nuance Communications. I manage our 360 | Development Platform. That’s speech recognition and clinical language understanding in the cloud. It includes a variety of technologies for desktop, web and mobile, including speech-to-text, text-to-speech, voice control and navigation, and clinical fact extraction.

2. I understand Nuance is sponsoring a contest or challenge. What’s it called? What does it entail?

It’s called the 2012 Understanding Healthcare Challenge. The deadline is Friday, October 5th. Just go here…


…and fill out answers to some questions and submit them to Nuance.

About a year and a half ago we launched our speech recognition platform in the cloud. Earlier this year we sponsored a successful challenge in which several dozen developers participated. Recently we released our clinical language understanding (CLU) services platform and software development kit (SDK). The CLU engine can take unstructured free text from dictation (generated by our speech recognition engine), or from existing text documents, and extract a variety of data sets.

The key difference between the current 2012 Understanding Healthcare Challenge and the previous speech recognition challenge is that in the previous challenge developers integrated speech recognition into applications, while in this challenge we’re looking for great ideas. The 2012 Understanding Healthcare Challenge has a list of questions: What clinical data types would you use? What value is provided to end-users? And so on.

At the challenge application deadline, October 5th, we’ll select three winners. Each will get a free developer evaluation and a year of developer support. These packages are worth $5,000 each. Essentially, Nuance will help developers bring their ideas to life and then help market them.

[CWW: The 2012 Understanding Healthcare Challenge application form lists the following application areas: EMR/Point-of-Care Documentation, Access to Resources, Professional Communications, Pharm, Clinical Trials, Disease Management, Patient Communication, Education Programs, Administrative, Financial, Public Health, Ambulance/EMS, Body Area Network.]

3. Pretend I’m a programmer (which I occasionally am): how does Nuance work and how does Nuance work with mobile apps?

Our platform supports both speech and understanding. That’s both the speech-to-text service and then the text-to-meaning structured data service. A developer can sign up for one or both of these services. Depending on country and language they’ll access different relevant content and resources.

For example, a US developer can sign up for a ninety-day speech recognition evaluation (eval) account (360 | SpeechAnywhere Services), including SDKs and documentation, or the 30-day CLU eval (360 | Understanding Services). The developer portal has lots of educational and technical documentation, plus online forums and contacts for support. The SDKs are relatively simple to use. All you need is a couple of lines of code. Within an hour most developers are generating their first speech-to-text transactions.
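For readers who haven’t integrated a cloud speech SDK before, here is a purely hypothetical sketch of what a “couple of lines of code” integration typically looks like. The `CloudSpeechClient` class, its constructor arguments, and `recognize()` are invented stand-ins, NOT the actual 360 | SpeechAnywhere API (for which you would consult the developer portal documentation).

```python
class CloudSpeechClient:
    """Stand-in for a vendor-supplied SDK client (normally installed, not written)."""
    def __init__(self, app_id: str, app_key: str):
        self.app_id, self.app_key = app_id, app_key

    def recognize(self, audio: bytes) -> str:
        # A real client would stream the audio to the cloud service
        # and return the recognized text; this stub just simulates that.
        return "patient presents with chest pain"

# The integration itself really is about two lines:
client = CloudSpeechClient(app_id="demo-app", app_key="demo-key")
print(client.recognize(b"\x00\x01fake-audio-bytes"))
```

The point is the shape of the integration: the application hands audio to an authenticated client object and gets text back, with the SDK hiding all the acoustic and network machinery.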

[CWW: I signed up for access to both the speech recognition and clinical language understanding evaluation documentation, software, and services. I’ll tell you what I found in a postscript at the end of this post.]

4. I’m seeing more and more speech-enabled mobile apps in healthcare. It’s not always obvious which speech-engine or language technology powers them. Have any numbers you’d care to share?

We’ve had about 300 developers come through our program. Several dozen have reached commercial status and their products are commercially available today. To date, we’ve worked with a lot of startup vendors. But in the next few weeks we’ll also be announcing partnerships that focus on providing speech recognition mobility to a number of well-known EHR vendors. It’s safe to say we are powering a “fair number” of these mobile healthcare applications.

5. Is there a “Nuance Inside” option? (after “Intel Inside”)

The phrase we use is “Powered by Nuance Healthcare.” Plus there are a couple of visual indicators. In an iPhone or Android app, or in a web browser, there’ll be a little Dragon flame. It automatically appears in text fields when the speech recognition SDK is integrated into a product. And there’s a little button with the Dragon flame. Help menus also have a “Powered by Nuance Healthcare” badge.


6. How cross-platform is the technology? Does it rely on specific libraries compiled into iOS or Android apps?

SDKs include iPhone/iPad, Android, Web, and a .NET version for desktop Windows.

7. I’m @EHRworkflow on Twitter and my blog is called EHR Workflow Management System, where I talk about workflow, usability, and natural language processing. It seems to me that a bunch of interesting technologies are coming together, all of which potentially contribute to more usable EHR workflow. Here’s just a few of these ideas: workflow, process, flexibility, customization, context of use, user intent, intelligent assistants, etc. I know that was a long preamble for this question, but could you react to some of these topics with respect to speech and language technology?

Well, for example, the latest version of the speech engine has some conversational capability. It can also do text-to-speech, so the EHR can, potentially, speak up. Command-and-control functionality allows users to ask questions (such as “What are the vitals for my patient?”) and to navigate through an application.

The clinical language understanding engine is a different use case from the developer’s point of view, because it’s not directly dependent on an audio control. You send narrative text to the server and you get back a useful data set extracted from the text. So CLU depends on the use case and what our development partner is trying to accomplish.

[CWW: The 2012 Understanding Healthcare Challenge application form lists the following datasets: Problems/Diagnoses, Medications, Allergies, Procedures, Social History, Vital Signs.]

However, if you’re doing pure speech recognition, then you add a couple of lines of code and enable every text field. With our new command-and-control functionality, users can also directly address controls such as checkboxes (“Check this, check that,” etc.). You can also integrate a medical assistant, as in “Who are my patients for the day?”, “Show me patient Mary Smith”, or “When was her last visit?”

So we’re going beyond speech to text to more intelligent voice interactions. Speech recognition and clinical language understanding allow the intent of users to directly drive workflow or process.
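A tiny sketch makes the “intent drives workflow” idea concrete. This is illustrative only: the patterns and action names below are invented, not Nuance’s actual command grammar. A recognized utterance is matched against intent patterns, and the matched intent dispatches an application action; anything unmatched falls back to plain transcription.

```python
import re

# Hypothetical intent grammar: (pattern, action name) pairs.
INTENTS = [
    (re.compile(r"show me patient (?P<name>.+)", re.I), "open_chart"),
    (re.compile(r"what are the vitals", re.I), "show_vitals"),
    (re.compile(r"check (?P<box>.+)", re.I), "toggle_checkbox"),
]

def dispatch(utterance: str):
    """Map recognized speech to a workflow action plus its parameters."""
    for pattern, action in INTENTS:
        match = pattern.search(utterance)
        if match:
            return action, match.groupdict()
    # No command matched: treat the utterance as dictated text.
    return "dictate_text", {"text": utterance}

print(dispatch("Show me patient Mary Smith"))
print(dispatch("What are the vitals for my patient"))
```

The same utterance that a pure transcription engine would merely type into a field here becomes a parameterized action, which is the difference between speech-to-text and speech-to-workflow.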

We’ve leveraged our speech recognition and clinical language understanding technology to build our own workflow solutions, such as Dragon Medical 360 | M.D. Assist. It not only improves workflow from a technical perspective, streamlining and so forth, but also asks intelligent questions. So if a user mentions heart failure the system can check for specificity and ask the user for more information if needed.

Relative to workflow-related use cases, we’re seeing a lot of specialty-specific integrations, from general medicine and emergency medicine to dermatology and chiropractic. We’re also seeing a lot of EHR-agnostic front ends. These mobile workflow tools sit on top of legacy EHRs or even connect to multiple EHR applications. One example even uses location services to infer, from the clinic or facility location, which back-end EHR system to connect to. These systems intelligently recognize and reason from context to user-intended workflow. Adding speech recognition and clinical language understanding into this mix provides even more value. Every week I see something new and exciting from our development partners.

We have a base set of functionality we provide. If you want to do simple things, you can just add our code to your application. But you can also customize voice commands to work with your preferred keystrokes or macros.

We knew that users were going to be doing some form of touching, speaking, typing, swiping and so on. For example, natively in iOS and Android, if you swipe to the right that starts your dictation. If you swipe to the right again, it stops. If you swipe to the left it “scratches” the last phrase.

Here’s an interesting workflow. If you tap a sequence of text fields, and speak into each, you don’t need to wait until the text appears in one field before moving on to the next. All we need is a low-bandwidth connection; if the network is congested or slow for any reason, users can move on while the text catches up. We call this “Speak Ahead”. In other words, the user can forge ahead at their own pace and we’ll accommodate them.
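The “Speak Ahead” pattern is, at bottom, an asynchronous producer-consumer queue. Here is a minimal sketch of the idea (mine, under stated assumptions, not Nuance’s implementation): the user dictates into field after field without blocking, while a background worker fills fields in as recognition results arrive. `fake_transcribe` stands in for the real cloud speech-to-text call.

```python
import queue
import threading
import time

def fake_transcribe(audio: str) -> str:
    """Stand-in for a cloud speech-to-text call."""
    time.sleep(0.05)        # simulate network/recognition latency
    return audio.upper()    # stand-in for the recognized text

results = {}
jobs = queue.Queue()

def worker():
    while True:
        field, audio = jobs.get()
        if field is None:           # sentinel: no more dictation
            jobs.task_done()
            break
        results[field] = fake_transcribe(audio)
        jobs.task_done()

threading.Thread(target=worker, daemon=True).start()

# The "user" dictates into three fields back to back, never blocking:
for field, audio in [("hpi", "chest pain"), ("ros", "no fever"),
                     ("plan", "order ekg")]:
    jobs.put((field, audio))

jobs.put((None, None))
jobs.join()     # the text "catches up" once the worker drains the queue
print(results)
```

Because the queue decouples dictation from recognition, a slow network only delays when text appears, never when the user may speak.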

8. Tell me about how you support third-party developers.

[CWW: Jonathan and I spoke briefly about this. Since I personally registered as a developer, I’ll spill those beans in a postscript below.]

9. What are the ideal technical skill prerequisites for third-party developers?

The technical skills are not so much about our end. If developers are already building mobile apps for the iPhone and Android, they already have more than enough technical skill to integrate our technology. You’ll get more of an appreciation for this when you actually get a chance to see the SDKs. We tried very hard to make integration as easy and effortless as possible. The same goes for the current developer portal’s education and support resources, as you’ll see. We will also be updating the portal before the end of the year. We’ve had a lot of feedback from hundreds of developers, and we’re continuing to leverage this feedback to aim for rich educational content and a robust developer experience.

10. OK! Let’s close with the deadline for your 2012 Understanding Healthcare developer challenge and where folks go to apply.

Friday, October 5th. Go to


Thank you again, Jonathon!

Thank you Chuck!

Well, that’s my interview with Jonathon Dreyer, Director, Mobile Solutions Marketing, Healthcare Division at Nuance Communications. I certainly learned a lot. I hope you did too! And, please, get in there and apply for one of those $5,000 development support packages. Create something great. Then come back here and tell me about it!

Many thanks to Gordon Segersten, of Nuance Healthcare Business Development, for walking me through the application for access to the speech recognition and clinical language understanding evaluation materials and services.

PS In order to get my own impression of at least the first couple of steps of becoming a Nuance third-party developer, I went to Nuance’s 360 | Development Platform developer support site and signed up for 90-day and 30-day free evaluation access to the speech recognition and clinical language understanding SDK materials and support.

When I investigate integrating a third-party product or service, I start off with a short list of questions.

  1. Is there an SDK (Software Development Kit)?
  2. Does the SDK appear well documented? Lots of content? Well organized? Current? Etc.
  3. Is there sample code? In the right programming language? (That is, the language of the application you’re integrating the speech/language tech into; there are often workarounds if not.)
  4. Are data formats based on standards familiar to the developers in question? XML (eXtensible Markup Language), CDA (Clinical Document Architecture), etc.
  5. Are there support forums? Are they well populated with recent discussions: questions, answers from support, contributions from other users, etc?

After being accepted into the developer evaluation programs, I found that the answer to each of these questions was “Yes.” In fact, I quite liked what I saw!

PPS One additional postscript: Whenever I get an opportunity to look under the hood of an EHR or HIT system, I always look for the kind of workflow technology I tout in my blog and via my Twitter account. It’s not always obvious! But it is often the secret sauce that makes some systems more customizable than others. Intriguingly, I found what I was looking for:

“The data extraction platform contains a number of components that can be assembled in a pipeline to perform a specific extraction task. The actual execution of that task is performed by a workflow engine that takes this pipeline as a configuration parameter.”

Ha! Workflow engine sighting! I’ve written about NLP pipelines and workflow engines elsewhere in this blog, just in case you are interested!
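The quoted architecture is worth a toy rendering. In the sketch below (mine, not Nuance’s components), the extraction “pipeline” is just configuration, an ordered list of step names, and a generic workflow engine executes whatever pipeline it is handed as a parameter.

```python
# Invented stand-in pipeline steps; a real system would have many more.
def tokenize(doc):
    doc["tokens"] = doc["text"].split()
    return doc

def find_medications(doc):
    doc["meds"] = [t for t in doc["tokens"] if t in {"aspirin", "lisinopril"}]
    return doc

def to_structured(doc):
    return {"medications": doc["meds"]}

STEPS = {"tokenize": tokenize,
         "find_medications": find_medications,
         "to_structured": to_structured}

def run(pipeline_config, text):
    """The 'workflow engine': takes the pipeline as a configuration parameter."""
    doc = {"text": text}
    for step_name in pipeline_config:
        doc = STEPS[step_name](doc)
    return doc

print(run(["tokenize", "find_medications", "to_structured"],
          "patient takes aspirin daily"))
```

The attraction of this design is exactly what makes workflow engines appealing elsewhere: assembling a different extraction task means editing configuration, not code.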

EHR Usability, Workflow & NLP at AMIA 2012: Presentations You Don’t Want to Miss!

The two big health IT conferences I’ve attended, repeatedly, over the years are HIMSS and AMIA (even back when it was SCAMC). I always root around their online programs, looking for presentations about EHR and HIT usability, workflow, and natural language processing. This year AMIA is in Chicago, November 3-7. The City of Chicago has even declared it Chicago Informatics Week (love that logo, especially the skyline reflected in the lake).

Here’s today’s tweeted announcement of availability of the AMIA conference online program:

And here’s what I reeled in! If you click through to AMIA’s Itinerary Planner, you’ll find dates and times. In some cases, where I have a related blog post, I provide the link.

If you’re interested in EHR usability, workflow, and natural language processing, I hope you find the above list convenient and something of interest. Let me know if I missed anything!

Video Interview and 10 Questions for Nuance’s Dr. Nick on Clinical Language Understanding

Natural language processing (NLP) applied to medical speech and text, also known as Clinical Language Understanding (CLU), is a hot topic. It promises to improve EHR user experience and extract valuable clinical knowledge from free text about patients. In keeping with this blog’s theme, NLP/CLU can improve EHR workflow and sometimes uses sophisticated workflow technology to span between users and systems.


Here’s Dr. Nick without that pesky YouTube play button
(see below) that turns everyone into an “Arrow Head”!

Therefore I was so delighted to Skype an interview with Dr. Nick van Terheyden, Chief Medical Information Officer at Nuance Communications. Warning: several of my questions were a trifle long…sorry! But I did want to explain where I was coming from in several instances. In every case Dr. Nick, as he is known, broke down complicated NLP/CLU ideas to their essentials and explained how and why they are important to healthcare.

I’ll start with the tenth question first, since, well, I promised Dr. Nick I’d do so.

10. Most of my previous questions are pretty “geeky.” So, to compensate, from the point of view of a current or potential EHR user, what’s the most important advice you can give them?

To me, the core issue is usability and interface design. The interface and technology has struggled to take off in part because the technology has been complex, hard to master and in many instances has required extensive and repeated training to use. The combination of SR [speech recognition] and CLU technology offers the opportunity to bridge the complexity chasm, removing the major barriers to adoption by making the technology intuitive and “friendly”. We can achieve this with intelligent design that capitalizes on the power of speech as a tool to remove the need to remember gateway commands and menu trees and doesn’t just convert what you say to text but actually understands the intent and applies the context of the EMR to the interaction. We have seen the early stages of this with the Siri tool that offers a new way of interacting with our mobile phone, using the context of your calendar, the day and date, location and other information to create a more human-like technology interface that is intuitive and less intimidating.

1. Dr. van Terheyden, I see references to you as Dr. Nick. How do you prefer to be addressed?

Thanks for asking – Nick is fine but many folks refer to me as Dr. Nick…so much easier than “van Terheyden.”

2. Could you tell us a bit about your education, what you do now, and how you came to be doing it?

That’s a long story that started over 25 years ago, after I qualified at the tender age of 22 as a doctor in England. I practiced for a while in the UK and also in Australia but decided I wanted to try other things. My first step into the technology world was unrelated to medicine when I worked at Shell International as a computer programmer in their finance division on IBM mainframes programming in COBOL, JCL, CICS, DB2, TSO, CLIST and REXX, to name a few, for the financial returns for Shell operating companies. Check out the IBM Terminal behind my desk….!


I then used the skills I acquired in these roles when I transitioned my focus to some early development and incubator companies that emerged onto the scene with electronic medical record (EMR) type functions in Europe. During this time, I had the fortune of working at a greenfield site in Glasgow, Scotland that built one of the first paperless medical records – many of the discoveries and concepts developed there remain applicable today.

Dr. Nick and Holly (his Golden Lab!) on YouTube

(Note: Video interview contains different, and
funnier, content than these ten questions and answers.)

Then, my career took me to the Middle East, working in Saudi Arabia, and then on to the US where I have had the fortune of working in New York, California and Maryland with a number of companies in the healthcare technology space, including most of the clinical documentation providers and speech technology vendors.

3. Are there any stereotypes about speech recognition in general and medical speech recognition in particular? What is a more accurate or useful way to think about this technology?

Speech has been available commercially for a number of years in the healthcare space and to general consumers. For a number of years it struggled to deliver value – the technology suffered from general hardware challenges and some of the challenges that exist in consumer implementations with noisy and challenging environment (your car for instance is a difficult environment with many and varied background noises and poor quality audio recording). In the clinical setting we have similar problems with noise but with added complexity of clinical workflow and fitting into the busy and complex clinical setting. In some respects, we suffered a Hollywood effect where the industry painted a picture of speech recognition that was much more than it was capable of – not just recognizing words – but understanding them.

Speech in the consumer world has moved on from a pure recognition tool to integrating some Artificial Intelligence (AI) that not only understands what is said but puts this into context. For example, some of the Nuance commercial telephony solutions include voice analysis for stress, anger and other emotions as indicators to help manage and route calls more appropriately. In healthcare, we don’t just apply voice recognition but layer on clinical language understanding that not only grasps the meaning but also tags the information, turning clinical notes and documentation into medical intelligence and making this data truly semantically interoperable.

4. What are typical medical speech recognition error rates? Cause for concern or manageable?

That’s a common question and one that a single digit or rate does not really answer. That being said, for many people out-of-the-box recognition is in the high 90’s and in many cases in excess of 98-99 percent. Even for those who do not achieve that accuracy out-of-the-box, with training it is possible for most speakers to attain a high level of accuracy. With more dictation and correction to build a personalized or speaker-dependent model you can even customize the engine to individual speaking and pronunciation style.

Siri has achieved a level of accuracy and comprehension without having an individual voice and audio profile for the user. In the current version anyone can pick up an iPhone and interact with Siri. Siri has helped paint a clearer picture of what is possible with speaker independent (in other words speech recognition that does not require training) recognition that has achieved a very high rate of success and acceptability.

5. While attending the North American Association for Computational Linguistics meeting recently in Montreal (see my blog post) I noticed that Nuance Communications was a Platinum Sponsor, a higher level of commitment than even Google (Gold) or Microsoft (Silver). While NAACL2012 included some speech recognition research presentations, most presentations dealt with natural language processing further along the so-called NLP “pipeline”: morphology, syntax, semantics, pragmatics, and discourse. Where do you see Nuance going in these areas?

Nuance has a serious investment in Research and Development spread across many industries and part of the value we derive as an organization is from the cross fertilization of these efforts to different areas and verticals. The learning we derive from understanding a driver in their noisy car environment can be applied to the physician and their noisy Emergency Room department.

Applying understanding to the voice interaction opens up many avenues and we have seen this outside of healthcare (Dragon Go! for example). These principles have tremendous potential to simplify the physician interaction with technology and the complex systems they must master and use on a daily basis. I talked about this recently at a presentation I gave to the Boston AVIOS Chapter.

6. At the keynote for a workshop in biomedical NLP at NAACL2012, the concluding slide included the bullet: “NLP has potential to extend value of narrative clinical reports.” In light of your recent comments on the EMR and EHR blog I’m sure you’d agree. But, could you expand on those comments?

Nuance began investing in Clinical Language Understanding (CLU) technology, a clinical-specific form of NLP, over two years ago.

CLU is a foundational technology being applied to the other areas and applications (DM360 MD Assist and DM360 Analytics) that offers the ability to understand the free form narrative, extract discrete data, tag it, and link it to a number of structured medical vocabularies.

7. The first question asked after the keynote was “Will meaningful use drive [need for/use of] clinical natural language processing?” Is SR/NLP more important for some MU measures than others? If so, which ones and why?

The answer lies in the source of data and how it is captured not in the specific data elements in my mind.

So take one measure as an example:

Is an ACE inhibitor prescribed for a patient who is suspected or suffering from a Heart Attack?

The source of this data will come from different sources in different facilities. If you have a CPOE and prescribing system then that data already exists in the system in digital form as structured data – you can answer that question (and guide the clinicians to make sure they comply with best practices and high quality care) with the existing structured data.

However, if this is a patient being seen in the ED and they dictate their notes then that information will be locked away in their documentation (unless they use a digital structured form to create the medical history) and then NLP is needed to extract the information and SR may be needed to facilitate the creation of the note efficiently.
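To make the two paths concrete, here is a minimal sketch of answering the measure from structured CPOE data or, failing that, from the free text of the note. The drug list, function name, and note text are all my own illustrations, not a real formulary; a real system would also handle negation and context.

```python
import re

# Illustrative drug list, not a real formulary.
ACE_INHIBITORS = {"lisinopril", "enalapril", "captopril", "ramipril"}

def ace_inhibitor_prescribed(structured_meds, note_text=""):
    """True if an ACE inhibitor appears in structured CPOE data or,
    failing that, in the free-text note (which is where NLP comes in)."""
    combined = " ".join(structured_meds) + " " + note_text
    words = set(re.findall(r"[a-z]+", combined.lower()))
    return bool(words & ACE_INHIBITORS)

# Structured data answers the measure directly:
print(ace_inhibitor_prescribed(["Lisinopril 10 mg daily"]))                   # True
# Otherwise the answer is locked in the dictated ED note:
print(ace_inhibitor_prescribed([], "Started lisinopril for suspected MI."))   # True
```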

8. At a recent workshop on clinical decision support and natural language processing at the National Library of Medicine, clinical NLP researchers cited several concerns. Point-and-click interfaces threaten to reduce amount of free text in EHRs. Since modern computational linguistics algorithms rely on machine learning against large amounts of free text, this is a potential obstacle. Is there a “race” between data input modalities? If so, who’s winning? If not, why not?

True. But this battle has been going on for years. I have referred to Henry VIII’s medical record many times as a great example of why structured data entry will never fulfill the requirements:

If we rely on structured data entry that presupposes that we know everything we need to know then a structured form and selecting from a list will allow you to capture everything you need in the medical record. However, we do not know everything we need to know. By way of an example, Ground Glass opacities appeared in medical notes in narrative form before we knew what this radiological finding meant. If we had not captured this in the narrative there would have been no record of these findings since it was new at the time. We know now that these findings are linked to a number of diseases including pulmonary edema, ARDS, and viral, mycoplasmal, and pneumocystis pneumonias.

The narrative is, and will always remain, essential to a complete medical record – NLP, which just keeps getting better, will bridge the gap between the narrative and the structured data necessary to create semantically interoperable records.

9. Much of the success of the automated language processing that we take for granted today (for example, in Google and Apple products) is due to access to lots of annotated free text, called “treebanks” (“tree” for syntax tree). However, HIPAA requirements make sharing marked-up clinical text problematic. As Nuance moves from speech recognition to natural language processing, how will you deal with this constraint?

We take this issue very seriously. As the largest provider of medical transcription in the world, we have a good understanding of the issues and the importance of securing the data and ensuring patient confidentiality. I don’t have specific answers relative to how we handle this but our development and engineering teams have incorporated these concerns into our discussions and designs as we move this technology forward.

End of Interview.

What a great written interview! (Dr. Nick’s answers, not my questions.) We got into the weeds at the end, though. Please watch the video to compensate. If only to hear Holly’s “bark-on”.

On one hand there are lots of marketing-oriented white papers about clinical natural language processing. On the other hand there are lots of arcane academic papers about computational linguistics and medicine. Dr. Nick drove the ball right down the middle. People like Dr. Nick — clinician, programmer, journalist, entrepreneur, social media personality, and Golden Lab owner — are valuable bridges between communities who need to work together to digitize medicine.

Clinical NLP at 2012 NAACL Human Language Technology Conference

I livetweeted presentations about clinical and biomedical natural language processing and computational linguistics at last week’s meeting of the North American Chapter of the Association for Computational Linguistics (NAACL, rhymes with “tackle”): Human Language Technologies, in Montreal. This blog post embeds those tweets and adds lots of editorial and tutorial material (“Editorial”? Wait. How about “Editutorial”? I think I just coined a term!).

My goal was threefold:

  • Leverage social media content I went to some effort to create (the tweets).
  • Summarize current state-of-the-art clinical NLP research and directions.
  • Make it understandable to readers who are not computational linguists or NLP engineers.

By the way, what’s the difference between clinical NLP and biomedical NLP? Take, for example, the alarming, but misleading, headline “Will Watson Kill Your Career?” It has a great quote (my emphases):

“New information is added to Watson in several different ways. On the patient side, the healthcare provider adds their electronic records to the system. On the evidence side, it happens by accessing the latest medical sources such as journal articles or Web-based data and clinical trials.”

Clinical NLP? The patient side. Biomedical NLP? The evidence side. The former is free text about specific patients, such as is found in transcription systems and electronic health records. The latter is free text about biological theories and medical research. You’ll see this division in the classification of tweeted papers below.

To date, most NLP research and application has focused on the biomedical evidence side. As NLP becomes more practical and as electronic free text about (and by!) patients explodes, computational linguistics (the theory) and natural language processing (the engineering) inevitably will shift toward the patient side. We’ll need to combine both kinds of human language technology–patient and evidence–to create a virtuous cycle: mine patient data to create and test theories that will, in turn, come back to the point of care to improve patient care.

And, given my blog’s EHR-and-workflow brand, just think of all the complicated and interesting healthcare workflow issues! 🙂

By the way, sometimes I’ll editorialize, or explain ideas and terminology. What the speaker presented and my elaboration should be clear from the context. I’ll try to signal lengthy tangential discussions of my own thoughts about the subject at hand. Where they exist, I’ll provide links to related work. Where illuminating, I’ll quote an abstract or relevant paragraph or two. I tend not to name names (unless I follow them on Twitter, in which case I’ll often provide that link). I’d rather talk about ideas. I provide links to each paper associated with each presentation, where you can find the who and the where.

This blog post ended up being a lot longer than I intended or planned. But, the more I dug the more I found, and the more I found the more I dug. Current clinical NLP research and developments reflect a remarkable amount of accumulated knowledge, tools, community, ambition, momentum and, ideally, critical mass.

I expect great things. I hope the following conveys this excitement.

Before I livetweet I like to forewarn and apologize to any folks following me who are not interested in whatever I’m about to flood their tweetstream.

A photo to set the stage, so to speak! Note the highest paper review scores ever and trends toward diversity of clinical and biological topics.

You can see my format: “Listening to…” then title then link to actual paper. By the way, is my custom URL shortener and tweet archiver. Since tweets are limited to 140 characters in length, long URLs need to be converted to short URLs. Otherwise the entire tweet could be taken up by just a URL! And that’s no good.

I really liked this talk. The authors applied machine translation technology, the same techniques that allow you to read web pages in foreign languages, to classify transcriptions of patients retelling simple stories.

You can think of retelling a story as similar to translating a story from its original words into the patient’s own words. The bigger the difference, the larger the translation distance, and, possibly, the larger the cognitive impairment.

They showed that their automatic scoring system was usually as good as human judged comparisons of original to retold stories.
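The paper’s approach uses machine translation machinery; as a rough stand-in, this sketch scores a retelling with simple word overlap (Jaccard similarity), where a lower score suggests a larger “translation distance.” The stories and threshold-free comparison are invented for illustration.

```python
# Word-overlap similarity between an original story and a retelling.
# A crude proxy for the paper's translation-model scoring.
def overlap_similarity(original, retelling):
    a = set(original.lower().split())
    b = set(retelling.lower().split())
    return len(a & b) / len(a | b)  # Jaccard similarity in [0, 1]

story = "the boy walked his dog to the park"
good = "the boy took his dog to the park"
poor = "a man went somewhere"

# A faithful retelling overlaps more with the original than a poor one:
print(overlap_similarity(story, good) > overlap_similarity(story, poor))  # True
```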

The reason I liked this presentation so much is that it’s different. It’s not about extracting knowledge from genetics research papers or combing patient records for symptoms and diagnoses (those are impressive, but they are more plentiful). It’s about measuring something about patient-uttered language and helping diagnose potential medical problems. It’s using computational linguistics to create a virtual medical instrument, just as a stethoscope or X-ray machine is a medical instrument, to better see (or hear) potential medical conditions.

Here “Listening to…” precedes the title of a paper about constructing ontologies from medical research text using a system called NELL, for the Never Ending Language Learner. I presume there’s some, possibly implicit, connection to the similarly titled movie in which language learning is a central theme. (Of course, there’s also the Never Ending Story…)

The combined ontologies were:

  • Gene Ontology, describing gene attributes
  • NCBI Taxonomy for model organisms
  • Chemical Entities of Biological Interest, small chemical compounds
  • Sequence Ontology, describing biological sequences
  • Cell Type Ontology
  • Human Disease Ontology

See next tweet.

From the paper:

“BioNELL shows 51% increase over NELL in the precision of a learned lexicon of chemical compounds, and 45% increase for a category of gene names. Importantly, when BioNELL and NELL learn lexicons of similar size, BioNELL’s lexicons have both higher precision and recall.” (p. 18)

This time NLP is used to mine textual data about adverse drug events. Same or similar events can be described in different ways. To get a handle on the total number or frequency of different kinds of drug reactions we need to lump together similar events. From the paper’s intro (my emphases):

“When an automatic system is able to identify that different linguistic expressions convey the same or similar meanings, this is a positive point for several applications. For instance, when documents referring to muscle pain or cephalgia are searched, information retrieval system can also take advantage of the synonyms, like muscle ache or headache, to return more relevant documents and in this way to increase the recall. This is also a great advantage for systems designed for instance for text mining, terminology structuring and alignment, or for more specific tasks such as pharmacovigilance….”

”…if this semantic information is available, reports from the pharmacovigilance databanks and mentioning similar adverse events can be aggregated: the safety signal is intensified and the safety regulation process is improved.”
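A toy illustration of the lumping step (the synonym table is invented for this sketch; real systems map mentions to standard vocabularies such as MedDRA):

```python
from collections import Counter

# Illustrative synonym table mapping surface mentions to one concept.
SYNONYMS = {
    "muscle pain": "myalgia",
    "muscle ache": "myalgia",
    "cephalgia": "headache",
    "head ache": "headache",
}

def normalize(mention):
    """Map a mention to its canonical concept, if we know one."""
    m = mention.lower().strip()
    return SYNONYMS.get(m, m)

reports = ["Muscle Pain", "muscle ache", "cephalgia", "headache"]
counts = Counter(normalize(r) for r in reports)
print(counts["myalgia"], counts["headache"])  # 2 2
```

Aggregating over the normalized concepts (rather than raw strings) is what intensifies the safety signal the authors describe.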

Back to clinical NLP. Understanding the order of patient clinical events is crucial to reasoning about diagnosis and management. I also like this paper a lot. In fact, folks at #NAACL2012 must have too, since the presenter got to present twice on the same topic, once to the main conference and once in the BioNLP workshop.

Extracting this information from EHR free text is complicated by the fact that events are not mentioned in the same order that they originally occurred and due to inconsistencies.

Since it is difficult or maybe impossible to determine the exact date upon which an event happened, we need fuzzier labels. Helpfully, most events are described relative to an admission date included at the top of clinical notes.

Next tweet…

(To repeat my tweet:)

The fuzzy categories used were:

  • “way before admission,
  • before admission,
  • on admission,
  • after admission,
  • after discharge”
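The binning itself is simple once admission and discharge dates have been extracted from the note header; here is a sketch, with the 30-day threshold for “way before” being my own assumption rather than the paper’s:

```python
from datetime import date

def time_bin(event, admission, discharge):
    """Assign an event date to one of the paper's fuzzy categories.
    The 30-day 'way before' cutoff is an illustrative assumption."""
    if event < admission:
        return "way before admission" if (admission - event).days > 30 else "before admission"
    if event == admission:
        return "on admission"
    if event <= discharge:
        return "after admission"
    return "after discharge"

adm, dis = date(2012, 6, 1), date(2012, 6, 8)
print(time_bin(date(2012, 1, 15), adm, dis))  # way before admission
print(time_bin(date(2012, 6, 1), adm, dis))   # on admission
```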

I found this paper particularly interesting because I’ve written about process mining, which builds flowcharts from event data in EHR logs. These flowcharts represent typical temporal ordering of events. I think I’ve actually seen a paper about process mining applied to “timestamps” gleaned from clinical free text…(note to self, look and put link here; found this, but not it). I suspect that future uses of process mining in healthcare will combine fine grained event log data from EHRs with coarse grained time-bin-like data from free text.

Another temporal reasoning from clinical free text paper. I think this topic is especially important because it’s relevant to combing patient records to construct typical patient scenarios. Such information could be invaluable to creating care pathways and guidelines. These, in turn, are relevant to the EMR workflow management systems and EHR business process management suites I write about in this blog.

A medical condition’s assertion status refers to:

  • Negation Resolution (“There is no evidence for…”)
  • Temporal Grounding (current or historical condition)
  • Condition Attribution (experienced by the patient or someone else, such as family member)
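A minimal rule-based sketch in the spirit of trigger-phrase algorithms like ConText: scan a sentence for phrases that flag negation, historical status, or attribution to someone other than the patient. The trigger lists here are tiny illustrations, not the published ones (note that a phrase like “family history of” trips both the historical and experiencer rules, which is roughly right).

```python
NEGATION = ("no evidence of", "denies", "ruled out")
HISTORICAL = ("history of", "h/o", "previous")
OTHER_EXPERIENCER = ("family history of", "mother had", "father had")

def assertion_status(sentence):
    """Crude trigger-phrase scan returning three assertion flags."""
    s = sentence.lower()
    return {
        "negated": any(t in s for t in NEGATION),
        "historical": any(t in s for t in HISTORICAL),
        "experiencer_other": any(t in s for t in OTHER_EXPERIENCER),
    }

print(assertion_status("There is no evidence of pneumonia."))
# {'negated': True, 'historical': False, 'experiencer_other': False}
```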

I found some interesting related material:

A knowledge-based approach to medical records retrieval (Demner-Fushman et al., 2011)

ConText: An Algorithm for Identifying Contextual Features from Clinical Text (Chapman et al. 2007) (5M: 11th of 33 papers)

(plus found this) A Review of Negation in Clinical Texts

When livetweeting conferences, I like to signal the beginning and ending of tweetable events. If someone follows in realtime, they may decide to have a coffee break too, and not worry about missing anything. This may sound a little odd, but I know this is true, because I frequently listen to tweets at conferences I wish I was attending. Things “kicking off,” breaks for lunch, etc. provide mundane but useful local color.

By the way, I met @McCogley at the main conference during a “tweetup” I organized. Tweetups are gatherings of people on Twitter who use Twitter to suggest a time and place to rendezvous and talk about, well, Twitter of course, but also anything else in common (after all, we presumably share an interest in computational linguistics and natural language processing).

By the (second) way, while none of these papers that I’m livetweeting are about Twitter, there were some such at NAACL. Whenever I see a paper about Twitter, I try to look up the authors on Twitter, to see if they are active and to follow them. At #NAACL2012 there were six papers/posters about analyzing tweets. There were 14 co-authors. I could identify seven Twitter accounts (searching on Twitter and Google, confirming if name and affiliation matched, also taking into account NLP content in profiles, tweets, followers or followees). Three accounts actively tweet. Of course folks may have accounts I didn’t find. And you don’t need a Twitter account to read someone’s tweets. But still.

In this tweet I corrected the earlier incorrect #NAACL12 hashtag and indicated the coffee break is over.

The keynote is starting, so it’s worth setting the stage again, with a photo. The title makes me think it will be very interesting. What’s an “NLP Ecosystem”? Guess I’m in the right place to find out. By the way, there’s apparently no paper to which to link.

I found similar slides from a presentation given at The Second International Workshop on Web Science and Information Exchange in the Medical Web. I consulted these slides several times below to elaborate what might otherwise be somewhat cryptic tweets (expanding acronyms, providing additional context).

OK! Let’s start out with what almost seems like the proverbial elephant in the room. If clinical natural language processing is so great, has so much potential, why hasn’t it realized more of that potential by now? Reminds me of the old rejoinder, if you’re so smart, why aren’t you rich? That rejoinder really applies, too. (Mind you, the following is me ruminating.) If computational linguistics and natural language processing applied to medical data and processes is such a smart thing to do, why aren’t people doing so and making a lot of money at that?

Well, as a matter of fact, speech recognition-based dictation and NLP-based coding are burgeoning, though still nascent, industries. (This is still me ruminating.) But, relative to the potential promise of clinical NLP, the question “Why has clinical NLP had so little impact on clinical care,” is spot on.

“Sharing data is difficult” (quoting my tweet)

We need a bit of an explanation here. Modern computational linguistics is not your grandfather’s computational linguistics. In the old days (when I got my virtual graduate degree in computational linguistics) natural language processing systems were created by consulting linguistic theories and/or artificial intelligence researchers and writing programs to represent, process, and reason about natural language.

Today, natural language processing systems are created by “training” them on large amounts of “annotated” text.

What’s annotated text?

From Natural Language Annotation for Machine Learning (my links and emphases):

“It is not enough to simply provide a computer with a large amount of data and expect it to learn to speak–the data has to be prepared in such a way that the computer can more easily find patterns and inferences. This is usually done by adding relevant metadata to a dataset. Any metadata tag used to mark up elements of the dataset is called an annotation over the input. However, in order for the algorithms to learn efficiently and effectively, the annotation done on the data must be accurate, and relevant to the task the machine is being asked to perform. For this reason, the discipline of language annotation is a critical link in developing intelligent human language technologies.”

What’s training on annotated text?

“Machine learning is the name given to the area of Artificial Intelligence concerned with the development of algorithms which learn or improve their performance from experience or previous encounters with data. They are said to learn (or generate) a function that maps a particular input data to the desired output. For our purposes, the “data” that a machine learning (ML) algorithm encounters is natural language, most often in the form of text, and typically annotated with tags that highlight the specific features that are relevant to the learning task. As we will see, the annotation schemas discussed above, for example, provide rich starting points as the input data source for the machine learning process (the training phase).

When working with annotated datasets in natural language processing, there are typically three major types of machine learning algorithms that are used:

  • Supervised learning – Any technique that generates a function mapping from inputs to a fixed set of labels (the desired output). The labels are typically metadata tags provided by humans who annotate the corpus for training purposes.
  • Unsupervised learning – Any technique that tries to find structure from an input set of unlabeled data.
  • Semi-supervised learning – Any technique that generates a function mapping from inputs of both labeled data and unlabeled data; a combination of both supervised and unsupervised learning.”
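To make “training on annotated text” concrete, here is a toy supervised example: a pure-Python Naive Bayes classifier learned from four hand-labeled (annotated) sentences, then used to label new text. The tiny corpus and labels are invented for illustration; real systems train on thousands of annotated documents.

```python
import math
from collections import Counter, defaultdict

# Annotated training data: (text, human-supplied label).
train = [
    ("no evidence of fracture", "NEGATED"),
    ("patient denies chest pain", "NEGATED"),
    ("chest pain present on exam", "AFFIRMED"),
    ("fracture seen on x ray", "AFFIRMED"),
]

word_counts = defaultdict(Counter)
label_counts = Counter()
for text, label in train:
    label_counts[label] += 1
    word_counts[label].update(text.split())

vocab = {w for c in word_counts.values() for w in c}

def predict(text):
    """Naive Bayes with add-one smoothing over the toy corpus."""
    def score(label):
        total = sum(word_counts[label].values())
        s = math.log(label_counts[label])
        for w in text.split():
            s += math.log((word_counts[label][w] + 1) / (total + len(vocab)))
        return s
    return max(label_counts, key=score)

print(predict("denies fracture"))  # NEGATED
```

The human annotations are doing the heavy lifting here, which is exactly why sharable annotated corpora matter so much.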

“Treebanks” are databases of free text marked up as syntax trees. Here’s an example of treebank annotation guidelines for biomedical text: grueling!

Sharing databases of free text marked up in a standard way has been incredibly important to the progress of natural language technology to date. By sharing annotated text computational linguists don’t have to reinvent the wheel and can focus on creating better NLP machine learning techniques. Shared databases of annotated free text also play an important role in comparing these techniques. Contests between centers of NLP excellence to see who can do better and then to learn from each other’s successes would not be possible without shared annotated natural language text.
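For the curious, here is what a tiny treebank entry looks like, in Penn Treebank-style bracketed notation (the sentence and tree are invented for illustration), along with a one-liner that counts noun phrases:

```python
# A bracketed parse: S = sentence, NP = noun phrase, VP = verb phrase,
# DT/NN/VBZ = part-of-speech tags. Invented example sentence.
tree = "(S (NP (DT the) (NN patient)) (VP (VBZ denies) (NP (NN pain))))"

def count_label(bracketed, label):
    """Count constituents with the given label in a bracketed parse."""
    return bracketed.count("(" + label + " ")

print(count_label(tree, "NP"))  # 2
```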

Which gets us to the number one reason (in my opinion, and others’ as well) that computational linguistics on the “patient side” has not taken off. Privacy and security concerns such as those encoded into law by HIPAA make it difficult to share, and learn from (in both the computerized and professional senses of “learn”), annotated free text.

Lack of annotation standards contribute too. But, even the creation of desirable annotation standards runs afoul of privacy concerns. After all, how do you create data standards without data?

OK, back to the presentation.

All of this (the above described issues), and more, lead to a perception that clinical NLP is too expensive. You need to employ a Ph.D. in computational linguistics to reinvent the wheel for each healthcare organization.

Which brings us to: is there any way to use technology to reduce the expense of NLP?

iDash is one such initiative. You can read more about it at the link in the tweet. But there is also a brief communication in JAMIA: iDASH: integrating data for analysis, anonymization, and sharing. Here’s the abstract (my emphasis):

iDASH (integrating data for analysis, anonymization, and sharing) is the newest National Center for Biomedical Computing funded by the NIH. It focuses on algorithms and tools for sharing data in a privacy-preserving manner. Foundational privacy technology research performed within iDASH is coupled with innovative engineering for collaborative tool development and data-sharing capabilities in a private Health Insurance Portability and Accountability Act (HIPAA)-certified cloud. Driving Biological Projects, which span different biological levels (from molecules to individuals to populations) and focus on various health conditions, help guide research and development within this Center. Furthermore, training and dissemination efforts connect the Center with its stakeholders and educate data owners and data consumers on how to share and use clinical and biological data. Through these various mechanisms, iDASH implements its goal of providing biomedical and behavioral researchers with access to data, software, and a high-performance computing environment, thus enabling them to generate and test new hypotheses.

One way to create sharable clinical text is to de-identify it. This means to remove any material from the text that explicitly identifies a patient. By the way, de-identification is not the same as anonymization (a point made by a later presenter, who has written a review of de-identification of clinical free text). The latter means to make it impossible for anyone to figure out who the patient was. De-identification (in at least some opinions) does not go that far.
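A deliberately oversimplified sketch of the idea: mask a known patient name and any dates with placeholder tags, leaving the clinical content intact. Real de-identification is far harder (names aren’t known in advance, and many other identifier types exist), so treat this only as an illustration of replacing identifiers while preserving the narrative.

```python
import re

def deidentify(text, patient_name):
    """Mask a known patient name and simple numeric dates.
    Toy illustration only -- not real de-identification."""
    text = re.sub(re.escape(patient_name), "[PATIENT]", text, flags=re.IGNORECASE)
    text = re.sub(r"\b\d{1,2}/\d{1,2}/\d{2,4}\b", "[DATE]", text)
    return text

note = "John Smith admitted 6/1/2012 with chest pain."
print(deidentify(note, "John Smith"))
# [PATIENT] admitted [DATE] with chest pain.
```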

NLP systems typically rely on complicated “pipelines” starting from the original free text and passing through further stages of processing (see below for examples). Setting up the individual software systems for the individual pipeline steps and then connecting them all up to work together correctly is difficult and expensive. So, why not put them all together in a virtual machine, which folks can download and use almost immediately (after just a bit of configuration to handle their specific needs)?

You can think of an NLP system for free text in an EHR as a set of NLP subsystems among which information must flow correctly in order for the entire NLP system to work correctly. Just as the human body has organs, such as heart, lungs, and kidneys, each of which has a specialized purpose and all must work together, NLP systems have subsystems, NLP organs, to continue the medical analogy.

There are NLP modules that determine where words begin and end. There are modules that find where sentences begin and end. Others locate beginnings and endings of paragraphs or clinical subsections. There are entity recognizers and recognizers of relations between entities. There are event recognizers and text entailers (recognizing/inferring what text implies, even if not stated explicitly). All of these subsystems must work together via workflows between them. The order of this workflow is sometimes referred to as a “pipeline”.
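The organ/pipeline metaphor can be sketched directly: each stage is a small function, and the pipeline is just their composition, which is also why an early mistake (a bad sentence split, say) propagates to everything downstream. The stages and entity lexicon here are toy stand-ins for real NLP components.

```python
# Toy entity lexicon: surface form -> entity type.
ENTITY_LEXICON = {"mi": "CONDITION", "aspirin": "DRUG"}

def split_sentences(text):
    """Stage 1: naive sentence splitter (splits on periods)."""
    return [s.strip() for s in text.split(".") if s.strip()]

def tokenize(sentence):
    """Stage 2: naive whitespace tokenizer."""
    return sentence.lower().split()

def recognize_entities(tokens):
    """Stage 3: lexicon lookup as a stand-in for entity recognition."""
    return [(t, ENTITY_LEXICON[t]) for t in tokens if t in ENTITY_LEXICON]

def pipeline(text):
    """Compose the stages; errors in early stages propagate forward."""
    results = []
    for sentence in split_sentences(text):
        results.extend(recognize_entities(tokenize(sentence)))
    return results

print(pipeline("Suspected MI. Gave aspirin."))
# [('mi', 'CONDITION'), ('aspirin', 'DRUG')]
```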


Above is an example of an NLP “pipeline” from the recent NIH workshop on
“Natural Language Processing: State of the Art, Future Directions and
Applications for Enhancing Clinical Decision-Making.” One of the problems with NLP pipelines is that mistakes made in earlier modules and steps tend to propagate forward, causing a lot of problems. If you can’t figure out whether a string is a noun (“Part-of-Speech Tagging” in the diagram), you’re not likely to be able to recognize which real-world entity it refers to (“Named Entity Recognition”). And if you can’t do that, well…

For example, entities such as patients, conditions, drugs, etc. must be recognized. Co-reference must be detected: this “heart attack” and that “MI” refer to the same clinical entity. Three-year-old “John Smith” and his 73-year-old grandfather “John Smith” do not refer to the same entity. In fact, the latter (JS2) has a grandfather relation to the former (JS1): grandfather(JS2, JS1). And the fact that JS2 is the grandfather of JS1 implies (entails) that JS1 is the grandson of JS2.
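The grandfather/grandson entailment can be sketched as a tiny inverse-relation inference step. The INVERSES table and the fact tuples are my own illustrative notation, not any particular system’s representation:

```python
# Toy relational inference: some relations entail an inverse relation.
# (Assumes a male grandchild, as in the John Smith example above.)
INVERSES = {"grandfather": "grandson", "grandson": "grandfather"}

facts = {("grandfather", "JS2", "JS1")}  # grandfather(JS2, JS1)

def entailed(facts):
    """Return the input facts plus any facts entailed by inverse relations."""
    inferred = set(facts)
    for rel, a, b in facts:
        if rel in INVERSES:
            inferred.add((INVERSES[rel], b, a))
    return inferred

print(entailed(facts))
```

Even this two-line rule is a form of textual entailment: deriving a statement, grandson(JS1, JS2), that the text never makes explicitly.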

We could keep going, from words to sentences to semantics, on to pragmatics and discourse. I think this is exactly where clinical NLP needs to go if EHRs are to become truly useful but unobtrusive helpmates to their users. I touch on this at the end of this blog post, in an epilogue. However, practically speaking, due to the nature of the traditional NLP pipeline, earlier stages of processing need to be mastered before later stages. (Hmm. Ontology recapitulates phylogeny?)

I’m starting to ramble, so back to the presentation!

eHOST is described in the poster paper (my emphases) A Prototype Tool Set to Support Machine-Assisted Annotation. eHOST stands for Extensible Human Oracle Suite of Tools.

Manually annotating clinical document corpora to generate reference standards for Natural Language Processing (NLP) systems or Machine Learning (ML) is a time-consuming and labor-intensive endeavor. Although a variety of open source annotation tools currently exist, there is a clear opportunity to develop new tools and assess functionalities that introduce efficiencies into the process of generating reference standards. These features include: management of document corpora and batch assignment, integration of machine-assisted verification functions, semi-automated curation of annotated information, and support of machine-assisted preannotation. The goals of reducing annotator workload and improving the quality of reference standards are important considerations for development of new tools. An infrastructure is also needed that will support large-scale but secure annotation of sensitive clinical data as well as crowdsourcing which has proven successful for a variety of annotation tasks. We introduce the Extensible Human Oracle Suite of Tools (eHOST) that provides such functionalities that when coupled with server integration offer an end-to-end solution to carry out small or large scale as well as crowd sourced annotation projects.

I’m impressed with the goals of eHOST and the described vehicle for achieving them.

TextVect is described in more detail [I’ve inserted the bracketed material]:

TextVect is a tool for extracting features from textual documents. It allows for segmentation of documents into paragraphs, sentences, entities, or tokens and extraction of lexical, syntactic, and semantic features for each of these segments. These features are useful for various machine-learning tasks such as text classification, assertion classification, and relation identification. TextVect enables users to access these features without installation of the many necessary text processing and NLP tools.

Use of this tool involves three stages as shown in Fig below: segmentation, feature selection, and classification. First, the user specifies the segment of text for which to generate the features: document, paragraph or section, utterance, or entity/snippet. Second, the user selects the types of features to extract from the specified text segment. Third, the user can download the vector of features for training a classifier. Currently, TextVect extracts the following features:

  • unigrams and bigrams [One and two-word sequences. An n-gram of size 1 is referred to as a “unigram”; size 2 is a “bigram” (or, less commonly, a “digram”); size 3 is a “trigram”. Wikipedia]
  • POS tags [POS stands for Part-of-Speech: noun, verb, adjective]
  • UMLS concepts [Unified Medical Language System: “key terminology, classification and coding standards, and associated resources to promote creation of more effective and interoperable biomedical information systems and services, including EHRs”]
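The first of those feature types, unigrams and bigrams, is simple enough to sketch directly. This is a generic n-gram extractor of my own, not TextVect’s actual code:

```python
def ngrams(tokens, n):
    """Return the list of n-grams (as tuples) over a token sequence."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

tokens = "patient denies chest pain".split()
print(ngrams(tokens, 1))  # unigrams: [('patient',), ('denies',), ('chest',), ('pain',)]
print(ngrams(tokens, 2))  # bigrams:  [('patient', 'denies'), ('denies', 'chest'), ('chest', 'pain')]
```

A feature vector for a text segment is then typically built by counting (or flagging the presence of) each n-gram, POS tag, or UMLS concept in that segment.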

There’s also an IE tool (IE means Information Extraction, not Internet Explorer). I wasn’t able to find a link to KOS-IE cross-indexed to iDASH, but KOS likely means Knowledge Organization System, whose purposes are described here (though not in a BioNLP context):

  • “translation of the natural language of authors, indexers, and users into a vocabulary that can be used for indexing and retrieval
  • ensuring consistency through uniformity in term format and in the assignment of terms
  • indicating semantic relationships among terms
  • supporting browsing by providing consistent and clear hierarchies in a navigation system
  • supporting retrieval”

I found the following paper about cKASS:

Using cKASS to facilitate knowledge authoring and sharing for syndromic surveillance

The introduction (no abstract, my emphasis):

Mining text for real-time syndromic surveillance usually requires a comprehensive knowledge base (KB), which contains detailed information about concepts relevant to the domain, such as disease names, symptoms, drugs and radiology findings. Two such resources are the Biocaster Ontology (1) and the Extended Syndromic Surveillance Ontology (ESSO) (2). However, both these resources are difficult to manipulate, customize, reuse and extend without knowledge of ontology development environments (like Protege) and Semantic Web standards (like RDF and OWL). The cKASS software tool provides an easy-to-use, adaptable environment for extending and modifying existing syndrome definitions via a web-based Graphical User Interface, which does not require knowledge of complex, ontology-editing environments or semantic web standards. Further, cKASS allows for–indeed encourages–the sharing of user-defined syndrome definitions, with collaborative features that will enhance the ability of the surveillance community to quickly generate new definitions in response to emerging threats.

I found a description of Common Evaluation Workbench Requirements and this description of its Background and Business Case (brackets and emphasis are mine):

“Many medical natural language extraction systems can extract, classify, and encode clinical information. In moving from development to use, we want to ensure that we have our finger on the pulse of our system’s performance and that we have an efficient process in place for making outcome-driven changes to the system and measuring whether those changes contributed to the desired improvements.

Our goal is to develop a quality improvement cycle and a tool to support that cycle of assessing, improving, and tracking performance of an [Information Extraction] system’s output.

The workbench should

  1. be compatible with any NLP system that is willing to generate output in a standard output format
  2. compare two annotation sets against each other
  3. produce standard evaluation metrics
  4. interface with at least one manual annotation tool
  5. allow exploration of the annotations by drilling down from the confusion matrix to reports to individual annotations and their attributes and relationships
  6. provide a mechanism for categorizing errors by error type
  7. provide the option to track performance over time for a system or annotator
  8. allow users to record types of changes made between versions of annotated input so that changes in performance over time are linked to specific changes in guidelines (in the case of human annotations) or changes in the system (in the case of automated annotations)”
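Requirements 2 and 3 above, comparing two annotation sets and producing standard evaluation metrics, can be sketched in a few lines. The span representation and the exact-match criterion are simplifying assumptions of mine; real workbenches also support partial-match scoring and per-category confusion matrices:

```python
def compare_annotations(system, reference):
    """Compare two sets of annotations, e.g. (start, end, label) spans,
    using exact matching, and return standard evaluation metrics."""
    tp = len(system & reference)  # true positives: spans in both sets
    precision = tp / len(system) if system else 0.0
    recall = tp / len(reference) if reference else 0.0
    f1 = (2 * precision * recall / (precision + recall)) if (precision + recall) else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}

reference = {(0, 9, "problem"), (15, 17, "problem"), (20, 27, "drug")}
system = {(0, 9, "problem"), (20, 27, "drug"), (30, 35, "drug")}
print(compare_annotations(system, reference))
```

The same function works whether the two sets come from two human annotators (measuring agreement) or from a system and a gold standard (measuring performance), which is exactly why requirement 2 is phrased neutrally as “two annotation sets.”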

See next tweet…

I found the next section especially interesting. It deals with workflow and usability! I usually focus on EHR workflow. That’s sort of this blog’s brand. My Twitter account is even @EHRworkflow. I’ve a degree in Industrial Engineering where I studied workflow (and stochastic processes, dynamic programming, and mathematical optimization, all relevant to modern computational linguistics, but that’s another blog post).

So, just look at the above tweet! Tasks, workflow, cognitive load… Even 60-page annotation guidelines ring a bell (for example, I’ve written about the 200 pages about EHR workflow, in one EHR user manual, instructing EHR users about what to click, what to click on… etc.)

Here’s an interesting notion. Apparently there’s been some success getting lots of people to annotate free text over the Web. Might it work for medical record text?

Obviously there are, again, HIPAA-related issues. But perhaps these can be dealt with.

See next tweet…

For more information:

Annotation Admin is a web-based service for managing annotation projects. The tool allows you to

  1. create an annotation schema comprised of entities, attributes, and relationships
  2. create user profiles for annotators
  3. assign annotators to annotation tasks
  4. define annotation tasks
  5. determine batch size
  6. sync with annotation tool (currently syncs with eHOST) to send schema and batches to valid annotators and to collect the annotations when finished
  7. keep track of progress of annotation project

The following is my reaction to the above, in light of this blog’s theme: workflow automation in healthcare:

For EHRs to fulfill their full potential they will need to not just interoperate with a wide variety of other systems, but manage those interactions as well. Some of those key interactions will be with descendants of the kinds of NLP tools being discussed here. Workflow engines and process definitions and process mining will be the glue to connect it all up to operate efficiently, effectively and satisfactorily for EHR and NLP system users.

Back to the presentation…

If you want to learn more, there’s a free workshop this fall at the link in the tweet.

This was a good conclusion slide so I tweeted it.

In light of “More demand for EHR data”

“NLP has potential to extend value of narrative clinical reports”

Key developments include

  • Common annotation conventions (annotations needed to learn)
  • Privacy algorithms (to de-identify, anonymize)
  • Shared datasets (to compare and improve NLP systems)
  • Hosted environments (deliver tools, manage workflows, etc.)

Two of the best questions were about meaningful use and capturing data about users to improve usability.

Stages of meaningful use will likely drive demand for clinical NLP. There’s too much relevant information locked in EHR free text.

(Slight tangent ahead…)

Now, if you’ve read a few of my blog posts or follow me on Twitter you’ll know I am not a fan of keyboards (one way to create free text). Contrary to complaints about traditional EHR user interfaces, with all those dropdown menus and tiny little checkboxes, a well-designed (read: well-designed workflow) point-and-click EHR user interface can outperform alternatives for routine patient care. If you’re a pediatrician, which is faster? Saying “otitis media” and then making sure it was recognized correctly? Or just touching a big button labeled “otitis media”, which is, as the military says, fire-and-forget.

However, EHRs really need to be multimodal, accepting input in whatever form users prefer. If you like to type (yuck), by all means have at it. Or, as automated speech recognition gets better and better, dictate. As non-routine phrases become routine, I still see both styles of data entry giving way to quicker (and not-requiring-post-editing) fire-and-forget canned text strings (increasingly mapped to standardized codes).

In the meantime, there is an ocean of legacy free text to sail and we need means to traverse it. So, yes, meaningful use will drive demand for clinical NLP.

What about “instrumenting” annotation software to capture information about user behavior and use machine learning to improve usability (how I interpreted the question). Since I’ve suggested doing similar for EHRs, I might as well extend the suggestion to annotation software as well. One approach is to use process mining to learn process maps from time stamped user event data. Cool stuff, that. Can’t wait.
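The core of process mining, learning a process map from timestamped event data, can be sketched by counting “directly follows” transitions between user actions. The event data and action names below are hypothetical; real process-mining tools go much further (filtering, concurrency detection, conformance checking):

```python
from collections import Counter

# Hypothetical timestamped user events: (timestamp, user, action).
events = [
    (1, "ann1", "open_doc"), (2, "ann1", "highlight"), (3, "ann1", "assign_label"),
    (4, "ann1", "highlight"), (5, "ann1", "assign_label"), (6, "ann1", "save"),
]

def directly_follows(events):
    """Count action->action transitions per user; the weighted edges
    form a simple process map of observed user behavior."""
    edges = Counter()
    last_action = {}
    for _, user, action in sorted(events):
        if user in last_action:
            edges[(last_action[user], action)] += 1
        last_action[user] = action
    return edges

print(directly_follows(events))
```

Heavily-traveled edges (here, highlight followed by assign_label) point at the workflows a tool should streamline; rare or backtracking edges point at usability problems.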

Here’s another paper about using natural language processing of biomedical text to match abbreviations (such as MAR) with their expanded definitions (Mixed Anti-globulin Reaction test/not text!). Well, it’s by the same authors as the first paper on this topic.
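A classic heuristic for this kind of matching checks whether the abbreviation’s letters line up with the initial letters of the expansion’s words. The sketch below is my simplification, not the paper’s algorithm; published approaches use more flexible alignment and statistical scoring:

```python
def matches_abbreviation(abbrev, expansion):
    """Heuristic: the abbreviation's letters appear, in order, at the starts
    of the expansion's content words (a simplification of real algorithms)."""
    stopwords = ("of", "the", "and")
    words = [w for w in expansion.lower().split() if w not in stopwords]
    letters = abbrev.lower()
    if len(letters) != len(words):
        return False
    return all(w.startswith(c) for c, w in zip(letters, words))

print(matches_abbreviation("MAR", "mixed anti-globulin reaction"))  # True
print(matches_abbreviation("MAR", "medication administration"))     # False
```

In clinical text the hard part is not the matching itself but disambiguation: MAR might equally well expand to “medication administration record,” and only context decides.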




Break for lunch, spelling #NAACL2012 correctly this time!

Interesting work comparing English vs Swedish taxonomies of certainty and negation in clinical text.

“[A]nnotators for the English data set assigned values to
four attributes for the instance of pneumonia:

  • Existence(yes, no): whether the disorder was ever present
  • AspectualPhase(initiation, continuation, culmination, unmarked): the stage of the disorder in its progression
  • Certainty(low, moderate, high, unmarked): amount of certainty
    expressed about whether the disorder exists
  • MentalState(yes, no): whether an outward thought or feeling
    about the disorder’s existence is mentioned

In the Swedish schema, annotators assigned values to two attributes:

  • Polarity(positive, negative): whether a disorder mention is in the
    positive or negative polarity, i.e., affirmed (positive) or negated
  • Certainty(possibly, probably, certainly): gradation of certainty
    for a disorder mention, to be assigned with a polarity value.”
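One way to think about comparing the two schemas is as a crosswalk: can each English annotation be projected onto a Swedish one? The mapping below is my own plausible guess at such a correspondence, purely for illustration; the paper’s actual alignment may differ:

```python
# Hypothetical crosswalk from the English four-attribute schema to the
# Swedish two-attribute schema (illustrative only, not from the paper).
def to_swedish_schema(existence, certainty):
    polarity = "positive" if existence == "yes" else "negative"
    certainty_map = {"low": "possibly", "moderate": "probably",
                     "high": "certainly", "unmarked": "certainly"}
    return polarity, certainty_map[certainty]

print(to_swedish_schema("yes", "moderate"))  # ('positive', 'probably')
```

The English schema’s AspectualPhase and MentalState attributes have no Swedish counterpart at all, which is exactly the kind of asymmetry a cross-language comparison surfaces.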

The best thing about comparing English vs Swedish taxonomies of uncertainty and negation? Getting to go to Sweden to do so! Jag är väldigt säker på! (I am very certain! By the way, these were my thoughts, though enthusiasm for visiting Sweden was indeed expressed.)

Back to “Listening to…”

Automatic de-identification of textual documents in the electronic health record: a review of recent research

Multiple methods were combined and compared to alternatives.

The described hybrid approach did very well. Most of the strings it identified as patient names were indeed patient names (precision), and most of the patient names in the corpus were detected (recall).

”Coreference resolution is the task of determining linguistic expressions that refer to the same real-world entity in natural language” (See later tweet)

”Active Learning (AL) is a popular approach to selecting unlabeled data for annotation (Settles, 2010) that can potentially lead to drastic reductions in the amount of annotation that is necessary for training an accurate statistical classifier.”

As noted, behind a paywall, which is too bad! But, here’s the abstract:

“Coreference resolution is the task of determining linguistic expressions that refer to the same real-world entity in natural language. Research on coreference resolution in the general English domain dates back to 1960s and 1970s. However, research on coreference resolution in the clinical free text has not seen major development. The recent US government initiatives that promote the use of electronic health records (EHRs) provide opportunities to mine patient notes as more and more health care institutions adopt EHR. Our goal was to review recent advances in general purpose coreference resolution to lay the foundation for methodologies in the clinical domain, facilitated by the availability of a shared lexical resource of gold standard coreference annotations, the Ontology Development and Information Extraction (ODIE) corpus.”

The penultimate paper is up! I didn’t prepend a “Listening to…” because the paper title was so long and I couldn’t figure out how to shorten it without changing its meaning: computational linguists ought to study this, after getting a Twitter account :-))

Protein interactions!

Suppose you wanted to look up all the papers that mention a specific disease? Could you do so? Not unless alternative ways to refer to the same disease are somehow aggregated (recall aggregating adverse drug events?).
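Aggregating alternative names usually means normalizing surface forms to a canonical concept identifier, the way the UMLS maps synonyms to a single concept. The dictionary below is a toy illustration (the CUI-style codes are examples, not guaranteed current), and real normalizers handle spelling variants and context-dependent ambiguity:

```python
# Toy term normalization: map surface forms to one canonical identifier.
# The synonym table and codes are illustrative.
SYNONYMS = {
    "heart attack": "C0027051",
    "myocardial infarction": "C0027051",
    "mi": "C0027051",
    "high blood pressure": "C0020538",
    "hypertension": "C0020538",
}

def normalize(mention):
    """Return the canonical concept ID for a mention, or None if unknown."""
    return SYNONYMS.get(mention.lower())

papers = [("paper1", "myocardial infarction"), ("paper2", "heart attack"),
          ("paper3", "hypertension")]
hits = [pid for pid, term in papers if normalize(term) == "C0027051"]
print(hits)  # ['paper1', 'paper2']
```

With normalization in place, “look up all papers mentioning disease X” becomes a query over concept IDs rather than a hopeless string search.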

Back to the problem of annotation here. This time, since the data is in medical research text, there are no HIPAA issues.

I’ve livetweeted events before. Each time I do it a bit differently. The first time I didn’t even have a smartphone Twitter client, so I texted to a special number. Since then I livetweeted a health IT conference and a government workshop. Only recently did I realize I could embed tweets from Twitter as I did above (No more copy and paste. Now, Twitter, don’t go away!). There are advantages and disadvantages. It’s a lot of work. I only do it when I am really interested in the subject. It keeps me focused during presentations so that I don’t miss the perfect slide bullet or speaker quote to summarize the entire presentation. What’s really fun, though, what I don’t show above, are all the retweets of my tweets by others, as well as others finding the #NAACL2012 hashtag and chipping in with their own, often funny, comments.

I walked out of the #NAACL2012 BioNLP workshop and this is what I saw!


Computational linguistics and natural language processing (the former the theory and the latter the engineering) are about to transform healthcare. At least some people think so. There’s certainly a lot of buzz in health IT traditional and social media about medical speech recognition and clinical language understanding.

Coverage can be pretty superficial. Watson will, or won’t, replace clinicians. Siri will, or won’t, replace traditional EHR user interfaces. It comes with the territory. CL and NLP are full of dauntingly abstract concepts and complicated statistical mathematics. However, there is an idea, among philosophers, that science is really just common sense formalized. If so, maybe the science of CL/NLP can be “re-common-sense-ized”, at least for the purpose of looking under the hood of what makes these clever language machines possible.

Looking further ahead, where I’d really like to see clinical NLP go is toward conversational EHRs. A bit like Siri, or at least the way Siri is portrayed in ads, only a lot more so. To get there EHRs will need to become intelligent systems, not just converting compressions and refractions of air molecules into transcribed tokens to be passed on to pipelines and become ICD-9 or -10 codes. They will need to “understand” the ebb and flow of medical workflow and, like the hyper-competent operating room nurse, do the right thing at the right time with the right person for the right reason, without having to be explicitly told or triggered to do so. This is where this blog’s brand, EHR+plus+workflow, comes together with thinking and language.

Thanks for reading! I learned a lot!