Monday 17 November 2008

Total Voice Recognition

I have firmly chosen my idea: it is to be Total Voice Recognition.

Recent technologies in the area of voice-activated computers are much better than they used to be. Talk-to-type systems like 'Dragon Naturally Speaking 10' have a much higher success rate than before, with now just 1% of words being misheard. There is also a limited increase in the Artificial Intelligence of the software; you can give instructions such as "bold that" or "left align that", and the software can tell the difference between dictation and instruction based on tone of voice.
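To make the dictation/instruction split concrete, here is a minimal sketch. It does not use tone of voice at all; as a stand-in it simply checks whether the whole utterance matches a known command phrase. The command list and the function name `interpret` are my own assumptions for illustration, not how Dragon actually works.

```python
# Hypothetical command phrases mapped to hypothetical editor actions.
COMMANDS = {
    "bold that": "bold",
    "left align that": "align_left",
}

def interpret(utterance):
    """Treat an utterance as an instruction only if it is exactly a known command.

    This is plain phrase matching, a stand-in for the tone-of-voice
    detection described above.
    """
    phrase = utterance.strip().lower().rstrip(".")
    if phrase in COMMANDS:
        return ("instruction", COMMANDS[phrase])
    # Anything else is passed through unchanged as text to be typed.
    return ("dictation", utterance)
```

Saying "Bold that" on its own would be treated as an instruction, while "I want to bold that word" would be typed out as dictation.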

There is also software available for Mac called Dictate, which goes a step further and works across the operating system, to open software or even select tools in Photoshop. It is combined with the computer's own speech software, which means you can ask the time and be told by the machine, using nothing but a microphone.

My idea has been partly covered already, and so I must think of the next stage in this speech control evolution. Multi-Touch systems combined with Total Voice Recognition would remove the need for a mouse or keyboard altogether, but it could go further and limit the number of software interfaces needed. The biggest and most obvious example would be the internet: if you need a quick piece of information such as a train time, you could simply ask the computer, which would understand the question, search the internet and respond as quickly as possible. You could set up a system of favourites (e.g. railenquiries.com) so that the search, and therefore the response you are after, is instant.
You: "Computer, when is the next train from Brighton to Victoria?"
Computer: "The next train from Brighton to Victoria is at 3.40."
You: "Thanks"
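
The favourites idea above can be sketched roughly as follows. This is only an illustration under my own assumptions: the `FAVOURITES` table, the function name `parse_question` and the simple "from X to Y" pattern are all hypothetical, and the sketch stops at deciding which site to ask. It does not search the internet or speak the answer.

```python
import re

# Hypothetical favourites: topic keyword -> site to try first for that topic.
FAVOURITES = {
    "train": "railenquiries.com",
}

def parse_question(question):
    """Work out the topic, the favourite site and the journey in a question."""
    text = question.lower()
    # The first favourite keyword found in the question decides the topic.
    topic = next((word for word in FAVOURITES if word in text), None)
    # Very simple pattern: only handles single-word place names.
    journey = re.search(r"from (\w+) to (\w+)", text)
    origin, destination = journey.groups() if journey else (None, None)
    return {
        "topic": topic,
        "source": FAVOURITES.get(topic),
        "origin": origin,
        "destination": destination,
    }
```

For the question in the conversation above, this would pick railenquiries.com as the source and Brighton to Victoria as the journey. A question with no matching favourite comes back with no source, which is where a general internet search would have to take over.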

The result of this service would be that individual websites of data could be avoided; instead, people would get all of their information via one source. Potentially, if the software was intelligent enough, blind internet users could get information as quickly and easily as everyone else.

The progression from there would be that you have a separate machine, a computer with no display, that works only on a speech input and speech output system, searching the internet for information and relaying it back to you. It would have to work 100% of the time, or else not be worth having, so the intelligence of the searches would have to be outstanding. This is currently not likely to happen, but is certainly possible in the future.

Dissertation

This morning we ran through initial ideas for our dissertations. I hadn't thought about it until asked this morning, but I had two thoughts, both of which might be problematic. I thought maybe I'd write about mathematics, especially phi in design of all kinds, but this idea has already been taken by somebody else. I also considered writing about TV, especially American TV series such as HBO's The Wire, The Sopranos and Deadwood. I have loads of ideas on this subject, but the problem is that I wrote an essay on this topic in the first year of the course, and I might plagiarise myself.

This means I either have to come up with a new question and set of research, or think of a new topic altogether.

Tuesday 11 November 2008

I haven't updated for a while, but here are my most recent activities.

There were 6 ideas:

1. Social TV - Chat whilst flicking through channels in case you find something good. See what your friends are watching.

2. Digital Odours - Smells can be captured by a device, then emailed and reproduced by others. Could also be used in cinema, TV or consoles to aid immersion in the media.

3. Total Voice Recognition - Operate the entire computer with multi-touch and voice, or voice alone.

4. 4D imagery - Although impossible for humans to see, 4 dimensional imagery could be generated by a computer.

5. Future Internet Dating - Virtual reality dates, using face-mapping to ease cyber-couples into real life meetings.

6. Retro Websites - A future movement by those tired of web 2.0's domination, bringing back cloud wallpapers, frames and animated gifs.


Of these 6, I have put the most thought into Digital Odours and Total Voice Recognition. I currently want to go with Total Voice Recognition, as Digital Odours have already been launched and failed (see DigiScents iSmell and Smell-O-Vision). I also see Total Voice Recognition as both the most interesting and the most likely of all my ideas.

I need to think about the AI that would be essential in getting the technology to work. This means I need to look at the way people type or write, and how that is different from how they speak, and how a system could work that speeds up rather than slows down the process.

I should not get caught up in the technical/electronic side of things, as this technology is currently unavailable and I am not an inventor. That being said, I have thought of a solution to the technical problem of many people in an office/class talking at once: use vocoder technology, like in the Talk Box, which requires you only to mouth the words rather than speak them aloud.

It can also be used to create funky sounds.
http://uk.youtube.com/watch?v=RQfZRRRo_8A&feature=related