I’ve written a lot recently about natural language processing in healthcare.
Language technology and workflow technology have lots of interesting connections. As I previously discussed:
- NLP and workflow often use similar representations of sequential behavior.
- Speech recognition promises to improve EHR workflow usability but needs to fit into EHR workflow.
- Workflow models can provide context useful for interpreting speech and recognizing user goals.
- NLP “pipelines” are managed by workflow management systems.
- Workflows among EHRs and other HIT systems need to become more conversational (“What patient do you mean?”, “I promise to get back to you”, “Why did you ask me for that information?”)
So, writing about NLP reflects the name of this blog: EHR Workflow Management Systems.
Therefore I was delighted when Nuance Communications, provider of medical speech recognition and clinical language understanding technology, approached me to highlight their 2012 Understanding Healthcare Challenge. An interview with Jonathon Dreyer, Director, Mobile Solutions Marketing, follows. In a postscript I add my impressions of going through the process of gaining access to Nuance’s speech recognition and clinical language understanding SDKs (Software Development Kits).
By the way, I think the idea and model of a vendor sponsoring a challenge and giving away developer support packages is one of the best ideas I’ve come across in quite a while. If anyone else decides to follow suit, I’d love to interview you and highlight your SDK and related resources too!
Interview with Jonathon Dreyer, Director, Mobile Solutions Marketing, Healthcare Division at Nuance Communications, about the 2012 Understanding Healthcare Challenge
Jonathon, thanks for taking time out of your schedule to speak with me! I enjoyed interviewing Nuance’s Chief Medical Informatics Officer, Dr. Nick van Terheyden, and turning that into a blog post.
Video Interview and 10 Questions for Nuance’s Dr. Nick on Clinical Language Understanding
So I look forward to doing the same with you!
Dr. Nick did a great job of putting speech recognition and natural language understanding into clinical context. That interview was directed toward user-clinicians. Let’s focus this interview on developers. (As well as curious clinicians; lots of physicians are learning about IT these days!)
1. What’s your name and role at Nuance?
Jonathon Dreyer, Director, Mobile Solutions Marketing, Healthcare Division at Nuance Communications. I manage our 360 | Development Platform. That’s speech recognition and clinical language understanding in the cloud. It includes a variety of technologies for desktop, web and mobile, including speech-to-text, text-to-speech, voice control and navigation, and clinical fact extraction.
2. I understand Nuance is sponsoring a contest or challenge. What’s it called? What does it entail?
It’s called the 2012 Understanding Healthcare Challenge. The deadline is Friday, October 5th. Just go here…
…and fill out answers to some questions and submit them to Nuance.
About a year and a half ago we launched our speech recognition platform in the cloud. Earlier this year we sponsored a successful challenge in which several dozen developers participated. Recently we released our clinical language understanding (CLU) services platform and software development kit (SDK). The CLU engine can take unstructured free text from dictation (generated by our speech recognition engine), or existing text documents, and extract a variety of data sets.
The key difference between the current 2012 Understanding Healthcare Challenge and the previous speech recognition challenge is that in the previous challenge developers integrated speech recognition into applications, but in this challenge we’re looking for great ideas. The 2012 Understanding Healthcare Challenge has a list of questions: What clinical data types would you use? What value is provided to end-users? And so on.
At the end of the challenge application deadline, October 5th, we’ll select three winners. Each will get a free developer evaluation and a year of developer support. These packages are worth $5,000. Essentially Nuance will help developers bring their idea to life and then help market it.
[CWW: The 2012 Understanding Healthcare Challenge application form lists the following application areas: EMR/Point-of-Care Documentation, Access to Resources, Professional Communications, Pharm, Clinical Trials, Disease Management, Patient Communication, Education Programs, Administrative, Financial, Public Health, Ambulance/EMS, Body Area Network.]
3. Pretend I’m a programmer (which I occasionally am): how does Nuance work and how does Nuance work with mobile apps?
Our platform supports both speech and understanding. That’s both the speech-to-text service and then the text-to-meaning structured data service. A developer can sign up for one or both of these services. Depending on country and language they’ll access different relevant content and resources.
For example, a US developer can sign up for a 90-day speech recognition evaluation (eval) account (360 | SpeechAnywhere Services), including SDKs and documentation, or the 30-day CLU eval (360 | Understanding Services). The developer portal has lots of educational and technical documentation, plus online forums and contacts for support. The SDKs are relatively simple to use. All you need is a couple of lines of code. Within an hour most developers are generating their first speech-to-text transactions.
[CWW: I signed up for access to both the speech recognition and clinical language understanding evaluation documentation, software, and services. I’ll tell you what I found in a postscript at the end of this post.]
4. I’m seeing more and more speech-enabled mobile apps in healthcare. It’s not always obvious which speech-engine or language technology powers them. Have any numbers you’d care to share?
We’ve had about 300 developers come through our program. Several dozen have reached commercial status and their products are commercially available today. To date, we’ve worked with a lot of startup vendors. But in the next few weeks we’ll also be announcing partnerships that focus on providing speech recognition mobility to a number of well-known EHR vendors. It’s safe to say we are powering a “fair number” of these mobile healthcare applications.
5. Is there a “Nuance Inside” option? (after “Intel Inside”)
The phrase we use is “Powered by Nuance Healthcare.” Plus there are a couple of visual indicators. In an iPhone or Android app, or in a web browser, there’ll be a little Dragon flame. This automatically appears in text fields when the speech recognition SDK is integrated into a product. And there’s a little button with the Dragon flame. Help menus also have a “Powered by Nuance Healthcare” badge.
6. How cross-platform is the technology? Does it rely on specific libraries compiled into iOS or Android apps?
SDKs include iPhone/iPad, Android, Web, and a .NET version for desktop Windows.
7. I’m @EHRworkflow on Twitter and my blog is called EHR Workflow Management Systems, where I talk about workflow, usability, and natural language processing. It seems to me that a bunch of interesting technologies are coming together, all of which potentially contribute to more usable EHR workflow. Here are just a few of these ideas: workflow, process, flexibility, customization, context of use, user intent, intelligent assistants, etc. I know that was a long preamble for this question, but could you react to some of these topics with respect to speech and language technology?
Well, for example, the latest version of the speech engine has some conversational capability. It can also do text-to-speech, so an EHR can, potentially, speak up. Command-and-control functionality allows users to ask questions (such as “What are the vitals for my patient?”) and to navigate through an application.
The clinical language understanding engine is a different use case, from the developer’s point of view, because it’s not directly dependent on an audio control. You send narrative text to the server and you get back a useful data set extracted from the text. So CLU depends on the use case and what our development partner is trying to accomplish.
[CWW: The 2012 Understanding Healthcare Challenge application form lists the following datasets: Problems/Diagnoses, Medications, Allergies, Procedures, Social History, Vital Signs.]
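To make the "text in, structured data out" idea concrete, here is a toy sketch. It is not Nuance's API or algorithm: the `LEXICON`, the `ClinicalFact` type, and `extract_facts` are all hypothetical names invented for illustration, and naive phrase matching stands in for a real clinical language understanding engine. Only the category names come from the application form.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical mini-lexicon mapping phrases to data set categories.
# A real CLU engine would use far richer linguistic and clinical models.
LEXICON = {
    "heart failure": "Problems/Diagnoses",
    "lisinopril": "Medications",
    "allergic to penicillin": "Allergies",
    "appendectomy": "Procedures",
}


@dataclass(frozen=True)
class ClinicalFact:
    category: str  # data set name, e.g. "Medications"
    text: str      # the phrase as it appears in the narrative
    start: int     # character offset of the phrase in the narrative


def extract_facts(narrative: str) -> List[ClinicalFact]:
    """Return facts found by case-insensitive phrase matching (first hit only)."""
    lowered = narrative.lower()
    facts = []
    for phrase, category in LEXICON.items():
        pos = lowered.find(phrase)
        if pos != -1:
            original = narrative[pos:pos + len(phrase)]
            facts.append(ClinicalFact(category, original, pos))
    # Report facts in the order they occur in the text.
    return sorted(facts, key=lambda fact: fact.start)
```

Calling `extract_facts("Patient with heart failure, on Lisinopril.")` would return one `Problems/Diagnoses` fact followed by one `Medications` fact, each with its character offset, which is the general shape of result an application could drop into structured EHR fields.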
However, if you’re doing pure speech recognition, then you add a couple of lines of code and enable every text field. With our new command-and-control functionality users can also directly address such controls as checkboxes: “Check this, check that,” etc. You can also integrate a medical assistant, as in “Who are my patients for the day?” or “Show me patient Mary Smith”, “When was her last visit?”
So we’re going beyond speech to text to more intelligent voice interactions. Speech recognition and clinical language understanding allow the intent of users to directly drive workflow or process.
We’ve leveraged our speech recognition and clinical language understanding technology to build our own workflow solutions, such as Dragon Medical 360 | M.D. Assist. It not only improves workflow from a technical perspective, streamlining and so forth, but also asks intelligent questions. So if a user mentions heart failure the system can check for specificity and ask the user for more information if needed.
Relative to workflow-related use cases, we’re seeing a lot of specialty-specific integrations, from general medicine and emergency medicine to dermatology and chiropractic. We’re also seeing a lot of EHR-agnostic front ends. These mobile workflow tools sit on top of legacy EHRs or even connect to multiple EHR applications. One example even uses location services to reason from clinic or facility location to which back-end EHR system to connect. These systems intelligently recognize and reason from context to user-intended workflow. Adding speech recognition and clinical language understanding into this mix provides even more value. Every week I see something new and exciting from our development partners.
We have a base set of functionality we provide. If you want to do simple things, you can just add our code to your application. But you can also customize voice commands to work with your preferred keystrokes or macros.
We knew that users were going to be doing some form of touching, speaking, typing, swiping and so on. For example, natively in iOS and Android, if you swipe to the right that starts your dictation. If you swipe to the right again, it stops. If you swipe to the left it “scratches” the last phrase.
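The swipe behavior described here amounts to a small state machine. The sketch below is an assumption-laden illustration, not SDK code: the `DictationField` class and its method names are hypothetical, and recognized phrases are simply passed in as strings.

```python
class DictationField:
    """Toy model of a text field driven by swipe gestures (hypothetical)."""

    def __init__(self):
        self.recording = False  # True while dictation is active
        self.phrases = []       # recognized phrases, in order

    def swipe_right(self):
        """First swipe right starts dictation; the next one stops it."""
        self.recording = not self.recording

    def swipe_left(self):
        """'Scratch' (discard) the most recent phrase, if there is one."""
        if self.phrases:
            self.phrases.pop()

    def hear(self, phrase):
        """Accept a recognized phrase only while dictation is active."""
        if self.recording:
            self.phrases.append(phrase)

    @property
    def text(self):
        """Current contents of the field."""
        return " ".join(self.phrases)
```

For example, swiping right, dictating two phrases, swiping left, and swiping right again leaves only the first phrase in the field, with dictation stopped.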
Here’s an interesting workflow. If you tap a sequence of text fields, and speak into each, you don’t need to wait until the text appears in one text field before moving on to the next. All we need is a low-bandwidth connection, but if the network is congested or slow for any reason, users can move on while text catches up. We call this “Speak Ahead”. In other words, users can forge ahead at their own pace and we’ll accommodate them.
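One way to picture “Speak Ahead” is as asynchronous form filling: recognition for each field is started immediately, and each field is populated whenever its result comes back. The following is a minimal sketch under that assumption, using Python's `asyncio`. The `recognize` function is a stand-in for a network round trip (it just sleeps and upper-cases its input), and none of these names come from Nuance's SDK.

```python
import asyncio


async def recognize(audio: str, delay: float) -> str:
    """Stand-in for a slow recognition service (hypothetical)."""
    await asyncio.sleep(delay)   # simulated network latency
    return audio.upper()         # pretend this is the transcript


async def speak_ahead(dictations):
    """Dictate into every field without waiting for earlier results.

    `dictations` is a list of (field_name, audio, simulated_delay) tuples.
    Returns the filled form and the order in which results arrived.
    """
    form = {}
    arrival_order = []

    async def fill(field, audio, delay):
        form[field] = await recognize(audio, delay)
        arrival_order.append(field)

    # The "user" moves through all fields right away; text catches up later.
    tasks = [asyncio.create_task(fill(f, a, d)) for f, a, d in dictations]
    await asyncio.gather(*tasks)
    return form, arrival_order
```

Running `asyncio.run(speak_ahead([...]))` with a slow first field and faster later fields shows every field eventually filled, even though the results arrive in a different order from the one in which the user spoke.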
8. Tell me about how you support third-party developers.
[CWW: Jonathon and I spoke briefly about this. Since I personally registered as a developer, I’ll spill those beans in a postscript below.]
9. What are the ideal technical skill prerequisites for third-party developers?
The technical skills are not so much about our end. If developers are already building mobile apps for the iPhone and Android then they already have more than enough technical skill to integrate our technology. You’ll get more of an appreciation for this when you actually get a chance to see the SDKs. We tried very hard to make integration as easy and effortless as possible. Relative to the current developer portal education and support resources, you’ll see that too. But we will also be updating the portal before the end of the year. We’ve had a lot of feedback from hundreds of developers and we’re continuing to leverage this feedback to aim for rich educational content and a robust developer experience.
10. OK! Let’s close with the deadline for your 2012 Understanding Healthcare developer challenge and where folks go to apply.
Friday, October 5th. Go to
Thank you again, Jonathon!
Thank you Chuck!
Well, that’s my interview with Jonathon Dreyer, Director, Mobile Solutions Marketing, Healthcare Division at Nuance Communications. I certainly learned a lot. I hope you did too! And, please, get in there and apply for one of those $5,000 development support packages. Create something great. Then come back here and tell me about it!
Many thanks to Gordon Segersten, of Nuance Healthcare Business Development, for walking me through the application for access to the speech recognition and clinical language understanding evaluation materials and services.
PS In order to get my own impression of at least the first couple of steps of becoming a Nuance third-party developer, I went to Nuance’s 360 | Development Platform developer support site at and signed up for 90-day and 30-day free evaluation access for speech recognition and clinical language understanding SDK material and support.
When I investigate integrating a third-party product or service, I start off with a short list of questions.
- Is there an SDK (Software Development Kit)?
- Does the SDK appear well documented? Lots of content? Well organized? Current? Etc.
- Is there sample code? In the right programming language? (i.e., the language of the application into which you’re integrating the speech/language tech, though there are often workarounds if not)
- Are data formats based on standards familiar to the developers in question? XML (eXtensible Markup Language), CDA (Clinical Document Architecture), etc.
- Are there support forums? Are they well populated with recent discussions: questions, answers from support, contributions from other users, etc?
After being accepted into the developer evaluation programs, I observed that the answer to each of these questions was “Yes.” In fact, I quite liked what I saw!
PPS One additional postscript: Whenever I get an opportunity to look under the hood of an EHR or HIT system, I always look for workflow technology such as I tout in my blog or via my Twitter account. It’s not always obvious! But it is often the secret in the sauce that makes some systems more customizable than others. Intriguingly I found what I was looking for:
“The data extraction platform contains a number of components that can be assembled in a pipeline to perform a specific extraction task. The actual execution of that task is performed by a workflow engine that takes this pipeline as a configuration parameter.”
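For readers who want to see what “a workflow engine that takes this pipeline as a configuration parameter” could look like, here is a deliberately tiny sketch. It is my own illustration, not Nuance’s implementation: the component names, the `COMPONENTS` registry, and `run_pipeline` are all hypothetical, and the “extraction” is just a word lookup.

```python
from typing import Callable, Dict, List


def split_sentences(doc: dict) -> dict:
    """Component: split the raw text into sentences on periods."""
    doc["sentences"] = [s.strip() for s in doc["text"].split(".") if s.strip()]
    return doc


def find_medications(doc: dict) -> dict:
    """Component: pick out words that appear in a toy medication list."""
    known = {"aspirin", "metformin"}
    doc["medications"] = [
        word
        for sentence in doc["sentences"]
        for word in sentence.lower().split()
        if word in known
    ]
    return doc


# Registry of available components, keyed by the names used in configuration.
COMPONENTS: Dict[str, Callable[[dict], dict]] = {
    "split_sentences": split_sentences,
    "find_medications": find_medications,
}


def run_pipeline(pipeline: List[str], text: str) -> dict:
    """The 'engine': execute the named components in the configured order."""
    doc = {"text": text}
    for name in pipeline:
        doc = COMPONENTS[name](doc)
    return doc
```

The point of the design is that the engine itself knows nothing about extraction. Changing the task means changing the list passed in, for example `run_pipeline(["split_sentences", "find_medications"], note_text)`, rather than changing the engine’s code.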
Ha! Workflow engine sighting! I’ve written about NLP pipelines and workflow engines elsewhere in this blog, just in case you are interested!