Tag: Speech

Pages, news, and videos


Pages

AITopics/Speech
AAAI's AITopics explores speech -- recognition, understanding, synthesis. Computers that converse with people must interpret and generate the acoustic signals of spoken language.

News

News Allscripts Broadens Speech Recognition Choices In EHRs
Allscripts has entered into a reseller and development agreement with M*Modal Inc., which calls for the integration of M*Modal's speech and language understanding technology into Allscripts' acute and ambulatory electronic health records (EHRs). M*Modal's technology, based on its Speech Understanding and Natural Language Understanding (NLU) platform, will help clinicians create voice-driven narrative patient documentation in Allscripts' EHRs. Michelson said adding M*Modal's technology will help Allscripts gain a competitive advantage over other EHR technologies on the market that don't incorporate speech recognition technology. M*Modal also recently announced a strategic agreement with Merge Healthcare to integrate its speech- and language-understanding technology across Merge's portfolio including its imaging and radiology offerings. (more)
News Speech recognition trial uses DS consoles to help children with hearing ...
Nintendo is helping to implement the use of speech recognition software in Japanese schools, in partnership with telecom company NTT. As part of a project currently being trialed, speech can be captured from a classroom teacher and relayed as text on a student's DS handheld console. Nintendo's handheld console is no stranger to classrooms in Japan, with it already being used in educational settings for a variety of purposes. (more)
News Voice recognition software helps decode data from Yellowstone geyser basin
The Norris Geyser Basin has spoken to Phil Dawson after he figured out how to use voice recognition software to decode "noisy" monitoring data gathered in 2003. "Who would have thought that voice recognition software could be applied to this kind of problem?" Voice recognition software has made a splash most recently in the new iPhone from Apple. Dawson, a geophysicist with the U.S. Geological Survey in California, said he happened by chance upon the discovery that voice recognition could be applied to seismic data. (more)
News Nuance Buys Transcription And Speech Editing Company Transcend For $300M In Cash
Nuance has just announced that it is acquiring Transcend, a company that provides medical transcription and speech editing services, for approximately $300 million in cash, or $29.50 per Transcend share. Transcend utilizes a combination of its proprietary Internet-based voice and data distribution technology, customer-based technology, and home-based medical language specialists to convert physicians' voice recordings into electronic documents. The company's solutions are used every day by people and businesses for tasks and services, such as requesting account information from a phone-based self-service solution, dictating records, searching the mobile Web by voice, entering a destination into a navigation system, or working with PDF documents. (more)
News Software Translates Your Voice into Another Language
Researchers at Microsoft have made software that can learn the sound of your voice, and then use it to speak a language that you don't. In a demonstration at Microsoft's Redmond, Washington, campus on Tuesday, Microsoft research scientist Frank Soong showed how his software could read out text in Spanish using the voice of his boss, Rick Rashid, who leads Microsoft's research efforts. In English, a synthetic version of Mundie's voice welcomed the audience to an open day held by Microsoft Research, concluding, "With the help of this system, now I can speak Mandarin." Individual sounds used by the first model to build up words using a person's voice in his or her own language are carefully tweaked to give the new text-to-speech model a full ability to sound out phrases in the second language. (more)
News Natural Language Processing Takes Center Stage In EHRs
All of these companies are applying voice recognition coupled with NLP to their ambulatory-care EHRs, and Allscripts is doing the same with its Sunrise acute-care products, including Sunrise Mobile MD II. The company has already integrated M*Modal's NLP software into its Sunrise EHR, and the two companies are developing the application on the ambulatory care side. Vern Davenport, chairman and CEO of M*Modal, told InformationWeek Healthcare that his company's product can convert voice to text and do "context enablement" to create discrete data elements that go into EHR templates. (Because of the limitations of iPad keyboards, NLP will be needed to aid this process.) (more)
News The disruptive power of gesture and voice recognition
Similarly, Apple's Siri virtual assistant has taught manufacturers and software developers that voice recognition has moved beyond recognition and into comprehension. And that's happening largely as a consequence of the rapid increase in microchip processing power, said Aviad Maizels, founder and president of PrimeSense, which designed the Kinect's chips. They've since crossed that threshold, and continued improvements in processing power are enabling more sophisticated gesture recognition tools. The increase in processing power has also helped improve speech-recognition software, said Richard H. The next wave, represented by Nuance's Dragon TV and the forthcoming Vlingo TV app, will help people search through program guides, answer questions about shows and exchange messages with friends while they watch TV. (more)
News Siri, a voice-recognition iPhone application, introduces a new way to interact ...
There's something about Siri - the new, smart-talking celebrity of the virtual world - that could be changing the way we interact with computers forever. Siri is a speech-recognition application installed on the new iPhone 4S that answers questions with a human-like voice and a seemingly human emotional intelligence and sense of humor. In short, Siri is changing everyday computer usage into a relationship most people can enjoy, said Chris Harrison, a doctoral student at the Human-Computer Interaction Institute at Carnegie Mellon University and editor-in-chief of "XRDS," a student magazine for an educational and scientific computing society called ACM. Not only is it a useful, everyday application, but Siri also has a manner and a way about it that people can relate to. (more)
News Dr. Mac: Speech recognition getting better and better
A big improvement: Take, for example, Siri on the iPhone 4S. Poor man's Siri: The bad news is that Siri, which is still in beta, by the way, is only available on the iPhone 4S. The good news for other iPhone users is that the Dragon Dictation app I mentioned last week, plus the Dragon Go app, which are both free, are the poor man's Siri. This dynamic duo from Dragon isn't as smart as Siri (yet), but if your iPhone is older, it's a winning combination. (more)
News Google to Launch Siri Rival for Android
Google plans to launch a voice assistant for Android to rival Apple's Siri, integrating voice recognition with its established search capabilities. The Mountain View, Calif.-based company is rumored to be working on the voice assistant software at its secret laboratory, Google X, according to the site Android and Me, and may launch the service within the next few weeks. Codenamed Majel after the late actress Majel Barrett-Roddenberry, who provided the computerized voice of the Starship Enterprise in Star Trek, the technology will reportedly upgrade Android's Voice Actions app, which lets users make calls, get directions and perform searches on Android smartphones using voice commands. However, Voice Actions only responds to preset voice commands. (more)
News Ask Ziggy for Windows Phone to Rival Apple's Siri
An independent Windows Phone developer has created an app for Microsoft's mobile platform that will attempt to outmatch Apple's intelligent voice-controlled assistant Siri. Ask Ziggy uses Speech Recognition to translate human speech into transcribed text, which is displayed in a speech bubble. The Nuance-based voice recognition part of the software allows Ask Ziggy to translate speech to text and, just like Apple's Siri, the Windows Phone app can accept queries, solve math problems, perform other tasks, and talk back to the user. Ask Ziggy is still under development as Leib plans to add multilingual support, language translations and expand speech grammar before submitting the app to the Windows Marketplace. (more)
News Nuance Aims iPhone Siri-Type Speech At TVs, Cars
Ricci says much improved voice recognition technology is on the way soon, though he says it's too early to throw away those computer keyboards and TV remotes. In an interview with IBD, Ricci talks about the potential of speech recognition in living rooms, cars, health care and social media applications. IBD: How long until voice commands or voice recognition technology become ubiquitous in smart phones, TVs and other products? (more)
News AT&T Spills Details Of New Watson Speech Recognition App Platform
"In June, we plan to launch several AT&T Watson SM Speech application programming interfaces (APIs) that developers can access to quickly create great new apps and services with voice recognition and transcription capabilities," wrote John Donovan, an AT&T senior VP of technology and network operations, in a blog post. Watson, which was developed by AT&T Labs, the telecom giant's research arm, is already live in its automated customer service phone banks and AT&T Translator and Yellow Pages mobile (YPMobile) iPhone apps. Following AT&T's announcement, some key questions remained, such as: How much the program will cost developers, whether AT&T will take a cut of sales, and just which specific languages the technology will support. "Regarding the cost, there is a registration charge of $99, which will allow developers to use all AT&T APIs, including speech, without a per transaction charge through 2012," AT&T's spokesperson told TPM. (more)
News TCS Associates Expands Speech Recognition Solutions for Disabled To Include ...
TCS Associates has a solid and successful history of providing voice recognition products and services to persons with disabilities, so it seemed a natural move to offer those same cutting-edge technologies to medical professionals through products like Dragon Medical Practice Edition. TCS Associates is now helping physicians (in addition to lawyers, social workers and other professionals) and medical practices realize the significant time and cost savings of transitioning from manual to voice entry of data with the use of Dragon Medical Practice Edition. Dragon Medical Practice Edition is a powerful and configurable voice recognition solution for small practices with 24 physicians or fewer, and greatly facilitates the entry of patient notes, coding or prescription information into a practice's Electronic Health Record (EHR) software. For medical practices that already have an Electronic Health Record system in place, TCS supports the integration and customization of programs such as Dragon Medical Practice Edition into their existing office IT infrastructure. (more)
News Voice Recognition Saves The Day!
The ability to type has become somewhat paramount in our digital world, and for those who spend their days in front of a computer, being able to use both hands on the keyboard is practically a necessity. The hunt-and-peck method of typing has never worked for me, and perhaps because I spent so much time in front of my keyboard I can envision the QWERTY layout when I close my eyes. While many students taking their first typing class may bemoan the layout, I am one who says kudos to Christopher Latham Sholes, the newspaperman who came up with the idea of a multi-row keyboard. I am now continuing to do my job as a writer, both by typing one-handed and with the help of voice recognition software. (more)
News Scribe Healthcare Interactive Includes Customizable Cloud Features for Greater ...
Scribe Healthcare Technologies Inc. Since physicians and healthcare personnel need to be adaptable to cost-efficient ways of practicing healthcare, Scribe Interactive is one tool that can immediately generate transcription layout from a dictation. With Scribe Interactive, Scribe's built-in M*Modal Speech Recognition Engine leverages existing voice profiles to accomplish this. Even without a recognized voice profile, Scribe Interactive allows users to create verbal snippets for efficiency. (more)
News Siri's Voice Can Be Heard In Nuance's Guidance
The brightest light in Apple's ( AAPL ) quarterly earnings report came from sales of its iPhone 4S, whose most defining feature is the voice recognition assistant Siri. The phone's eye-popping growth, particularly in Asia where it launched in January, was likely behind Nuance's ( NUAN ) upside guidance issued on April 26th. The engine behind Siri: And, while suppliers like Nuance can't refer directly to Apple, the company was born out of DARPA funding to SRI International, just like Siri. In it SRI writes, "Siri has also partnered with Nuance Communications to power its robust speech recognition capabilities - the same technology behind the successful Dragon Dictation and Dragon Search and Apps for iPhone." (more)
News We need to talk about speech recognition
Look under the skin of the Siri software and you will find that the speech recognition is provided by industry stalwart Nuance Communications. I've just tried it on my Windows 7 machine and it was quite fun, except for having to spell out and correct Nuance and Google spelling, then sorting out capitalisation, and now I've given up already, back to typing! I think Siri has shown us that the voice-recognition technology isn't the most important thing here but the application software layer that the users go through to use the speech recognition. But the application built to use the Nuance speech recognition isn't programmed to interpret all the ways you might say, for example, "No, thank you". (more)
News Smarter Voice Capabilities Will Transform Medical Documentation
In simple terms, that's one of the dilemmas that persist when clinicians interact with clinical information systems. Most provider organizations are familiar with Nuance's Dragon Medical, for instance, which lets clinicians dictate their notes directly into an EHR instead of sending them to first be transcribed into text. The major problem with this approach has been that it provides a huge repository of free text narrative that can't populate all the structured components of a hospital's clinical information system, leaving a potential treasure chest of valuable insights and facts in limbo. According to Chris Spring, M*Modal's VP of health IT, the platform "listens" to a clinician's dictation in real time and tells her if she's missing any vital information already in the patient's chart. (more)
News What the Voice-Recognition Industry Needs Most
I'm a big believer that voice-recognition technology will play an increasingly prominent role in how we interact with technology -- so much that I've made a bet on Nuance Communications ( NAS: NUAN ) accordingly as the clear technological leader in the field. Datria is a small private player with about 50 employees, and it resells Nuance's speech engine while also counting software giant SAP ( NYS: SAP ) as an investor. That being said, Datria uses a plug-and-play model, so it could theoretically swap out the engine if needed, but Datria has been reselling Nuance's engine for 14 years. Many of those companies license Nuance's engine, and the recognition side works great, but if the application (which is frequently built in-house) isn't programmed to interpret all the ways you can say "yes," then it might not realize that "absolutely" means the same thing. (more)
News Future of voice recognition: Assistants that learn from you
Voice-activated assistants are playing an increasingly prominent role in the technology world, with Apple's introduction of Siri for the iPhone 4S and Google's (rumored) work on a Siri competitor for Android phones. Siri is taking steps toward providing a natural, conversation-like experience with voice-activated assistants, as Jacqui Cheng noted in the Ars iPhone 4S review. "When given direct and clear tasks, Siri performs well, and it's nice not having to memorize a strict list of commands," Cheng wrote. "The best part about Siri is the fact that you can (or should be able to, anyway) speak to it like you would speak to a person without having to conform to a special speaking syntax, the number one turn-off for 'regular' people using voice control features." (more)
News The Human Voice, as Game Changer
Here, Mr. Sejnoha, the company's chief technology officer, and other executives are plotting a voice-enabled future where human speech brings responses from not only smartphones and televisions, cars and computers, but also coffee makers, refrigerators, thermostats, alarm systems and other smart devices and appliances. Today, voice technology is a fixture of many companies' customer-service operations, albeit an occasionally maddening one. But now the race is on to make the voice the sought-after new interface between us and our technology. No player is bigger in voice technology than Nuance, of Burlington, Mass., an industry pioneer that has acquired more than 40 companies in the field and today employs 7,300 people. (more)
News Speech Recognition Leaps Forward
During Interspeech 2011, the 12th annual Conference of the International Speech Communication Association being held in Florence, Italy, from Aug. 28 to 31, researchers from Microsoft Research will present work that dramatically improves the potential of real-time, speaker-independent, automatic speech recognition. Dong Yu, researcher at Microsoft Research Redmond, and Frank Seide, senior researcher and research manager with Microsoft Research Asia, have been spearheading this work, and their teams have collaborated on what has developed into a research breakthrough in the use of artificial neural networks for large-vocabulary speech recognition. The notion of using ANNs to improve speech-recognition performance has been around since the 1980s, and a model known as the ANN-Hidden Markov Model (ANN-HMM) showed promise for large-vocabulary speech recognition. The new project applied CD-DNN-HMM models to speech-to-text transcription and was tested against Switchboard, a highly challenging phone-call transcription benchmark recognized by the speech-recognition research community. (more)
News Speech Recognition Tool Comes Up 'Speechless'
The breast-imaging reports, which were reviewed from January 2009 to April 2010, were almost evenly divided into two categories. In one, 307 reports used conventional dictation transcription in which the radiologist dictates the report and a team transcribes and reviews the report. The other 308 reports used automatic speech recognition (ASR) in which the radiologist dictates the report and software immediately transcribes the report onto a computer screen. Dictation was conducted using a handheld speech microphone, the Pro-Plus LFH5276 from Philips Healthcare. (more)
News Advancements in Speech Recognition Set to Improve IVR
As much as we like automated systems, we also like to use voice to move through steps and complete interactions. This recent Plum Voice blog focused on the advancements in IVR, thanks to improvements in speech recognition. Increases in computer processing speeds have enabled speech recognition developers to create more natural, accurate speech recognition software. Susan J. Campbell is a contributing editor for TMCnet and has also written for eastbiz.com. (more)
News Getting your mobile to listen to you: trends in voice recognition
Indeed, most people have gotten used to these tools operating on mobile devices, helping people to control smartphones and navigation systems. Modern programmes like Dragon NaturallySpeaking 11.5 are designed for users who occasionally have to draft a document or want to quickly throw a note up on Facebook. It continued its work into the 1980s and 90s with ViaVoice, making it one of the pioneers of voice recognition programmes, focusing on commercial applications like call centres. Along with its Windows-based Dragon NaturallySpeaking programmes, Nuance has also started releasing systems for Apple Dragon Dictate 2.5. (more)
News Speech Recognition Tips for the Busy Radiologist
- While sometimes speech recognition systems may seem downright dullard, they are learning all the time, and learn best when exposed to continuous phrases. Also, when the software makes a mistake, try to correct the entire phrase rather than the offending word, particularly if you're using an older system. - Pick away at your system's errors. Correct a single word or phrase per reading session and in a month you'll have eliminated dozens of potential errors and boosted accuracy without feeling like you're spending undue time doing software calibration. (more)
News Why Watson Can't Talk to Siri
On Tuesday night, I was schooled by Watson on playing Jeopardy in an exhibition match at the Computer History Museum. David Ferrucci, the guy at IBM behind Watson's creation, explained during a conversation before the match that as intuitive as the interactions with Siri or Watson appear to us, they are fundamentally task-oriented. Watson's tasks are thus to figure out the context associated with a question, determine which answer is the likeliest based on that context, and then reckon if it's confident enough in the probabilities to bother answering. Watson's Greater Firepower: Siri, on the other hand, does two important things: It recognizes speech (Watson actually doesn't understand speech, but is fed a text version of the question) and it can figure out what steps to take in a limited number of applications, once it understands the words in a natural language process related to the process by which Watson functions. (more)
News Dragon Express 1.0: An inexpensive way to discover speech recognition
Dragon Express is an easy and fun speech recognition utility that introduces OS X Lion customers to voice recognition for the Mac. It's fast and easy to place your text wherever you need it: transfer icons within the Dragon Express window include the active application (such as Microsoft Word or Text Edit) as well as popular applications such as Mail, Facebook and Twitter. Dragon Express includes the ability to select and delete text by voice as well as the convenient "scratch that" command that can be used when you change your mind. Dragon Express Knows When to Listen: Dragon Express can be used with the internal microphone of your Mac, but a USB headset is recommended. (more)
News Voice input for medical apps to trend?
That's the mindset of Nuance's Jonathon Dreyer, senior manager of mobile solutions marketing at the company's healthcare division: "I definitely think voice will be the primary form of input into these mobile devices," Dreyer told MobiHealthNews recently. "The biggest thing holding up app developers [for our platform] is approval of their apps [from Apple]," he said. The main types of apps using Nuance are point of care and reference, while other categories include pharma, clinical trials, education programs, patient communications, and disease management apps. Dreyer believes that these categories will eventually change: "These things will morph over time, and we'll see new categories emerge, as well as categories we thought were categories turn out to not be categories." (more)
News Yap Isn't Much Like Siri. So Why Does Amazon Want It?
CLT Blog's Justin Ruckman decoded SEC filings to turn up an intriguing recent Amazon acquisition: Yap, a Charlotte-based speech-recognition startup best known for its recently shuttered voicemail transcription app and backend services for some of Microsoft's voice-to-text application. So far, Amazon hasn't publicly commented on or even confirmed Yap's acquisition, and didn't immediately respond to our attempts to find out what it plans to do with the company. What Yap does do, though, and does very well, is cloud-based voice transcription, i.e., literal, word-for-word rendering of speech into text, at very high volume with very high accuracy but at very low cost. The closer analog to Yap, then, isn't Siri, but Nuance, the company behind Dragon's collection of voice applications for desktop and mobile, and whose engine powers the speech-to-text component of, you guessed it, Siri. (more)
News Voice Recognition Software in Medical Imaging Continues to Evolve
Voice recognition software has been shown to reduce report turnaround time and holds promise for populating and mining structured reports, but not all radiologists are convinced. Many users still find the software cumbersome and error prone, as seen in a recent informal Diagnostic Imaging poll where 80 percent of respondents said they use it, but 30 percent of them reported frustration with the software. That's the main feature where the products compete, he said, in the time it takes to make the fixes. They also are competing based on how good the product is right out of the box, taking into account training time and accuracy, said Tim Kearns, GE Healthcare's product manager. (more)
News What Makes Siri Special?
If you ask Siri, the virtual personal assistant on the iPhone 4S, why it's so great, it answers with disarming humility: "I am what I am." Siri goes well beyond voice recognition, they say, by applying powerful artificial intelligence and statistical analysis to decipher the meaning behind questioners' sometimes jumbled sentences. Add to that Siri's dry wit and you have the kind of breakout hit that will propel new uses of similar technology on your phone, tablet, and even your PC, experts say. When you ask Siri to find a nearby restaurant, Siri doesn't just use speech recognition to deal with the request; Services like Siri are "natural language processing" apps that use statistical models to figure out what you probably meant to say when your pronunciation or word choice is garbled. (more)
News Siri-like voice recognition coming to cars
Smartphones raised the bar for hands-free voice activated technology, and consumers are starting to expect the same level of intuitive usability in their cars. The Detroit Free Press reported that at the Nuance Automotive Forum in Detroit this week, speech-recognition company Nuance said an auto manufacturer will integrate advanced voice command technology into its vehicles next year. However, it's getting to the point that even 10,000 voice commands isn't enough, especially when you want the system to be able to look up directions, suggest restaurants, or shop for you. Siri-like voice systems in vehicles are desperately needed for automakers to keep up with consumer expectations. (more)

Videos

Video Computer Chronicles: Robots - Japanese Style (1985).
Host Stewart Cheifet visits High Tech Expo '85 in Tsukuba (Science City), Japan, to look at the latest computer and robotics technology. 1985. (more)
Video Computer Chronicles: Speech Synthesis (1984).
A look at speech synthesis and speech recognition technologies and applications with host, Stewart Cheifet, co-host, Herb Lechner (SRI International), and guests Carl Berne (Speech Plus), and Ron Stevens (Votan). Products discussed and demonstrated during the program include Minolta's Talking Camera, Texas Instruments' Speak & Spell, a text-to-speech system from Speech Plus, and the Votan V-5000 speech recognition system. The program also includes two "Computer Principles" lessons from Herb Lechner as well as his Chronicles' Summary. 1984. (more)
Video Handsfree Decision Support - Full version.
Demonstration of handsfree decision support systems for trauma care. The system executed on a portable computer, employing speech recognition coupled with a Bayesian inference and decision making system. After the demo, details of the system construction and use are presented. 1995. (more)
Video Hear Here.
Raj Reddy, et al. Video made in 1969, showing work from 1968. (more)
Video IBM Demonstrates Speech-to-Sign Language Translation System.
"Say It Sign It" avatar translates the spoken word into British Sign Language. Sept. 18, 2007. (more)
Video Listen.
Demo of Project Listen for helping kids with reading. Computer listens to kids reading sentences of a story and gives feedback. 1994. (more)
Video Scientific American Frontiers with Alan Alda: Cars That Think.
3 segments: Part 1 - Watch the Road. Alan rides in a vehicle that recognizes road signs and hazards – and warns the driver to slow down. Part 2 - Hold the Phone! Alan 'drives' the Ford VIRTTEX simulator that researchers use to investigate how distractions like cell phone calls or drowsiness affect driver safety. Part 3 - Smart Passenger. A virtual smart passenger named Sally listens in to the driver's speech at all times and responds appropriately. January 26, 2005. (more)
Video The Age of Intelligent Machines: The Film. By Raymond Kurzweil.
From the original video notes: A survey of Artificial Intelligence showing AI at work and under development. The paradoxes, promise and challenges of advanced computer science, with authorities Marvin Minsky, Roger Schank, Raj Reddy and other leaders in the field. 1987. (more)
Video The Sounds of Speech.
“The Sounds of Speech” (Rubin, 2002) is a segment excerpted from the Reading Rocket series commissioned by the United States Department of Education. It shows the Reading Tutor developed by Carnegie Mellon University’s Project LISTEN (www.cs.cmu.edu/~listen), directed by Jack Mostow. The Reading Tutor uses automatic speech recognition to listen to children read aloud, and responds with recorded human speech and graphical feedback modeled after reading experts but adapted to the affordances and limitations of speech technology. July 14, 2008. (more)
Page last modified on May 16, 2012, at 11:20 AM