However, I recently decided, "Screw that, it's my blog, and I can ramble about whatever topics I like." I'll just preface some of my ideas with a disclaimer: I'm not an expert in this field, I'm just playing with ideas in my head, and I hope people who know more will expand and expound on what I'm explaining.
On 'seeing' for the blind: my previous post alluded to a device in which a blind person hooked a camera up to her forehead and a bunch of piezoelectric rods to her chest (or feet or tongue, maybe), connected the two, and could "see." This has already been done, as I suspected, but I have been unable to determine whether my twists have been implemented (I've actually done a bit of research here, and written to one of the leading scientists, who is on the road and can't answer), so I'll explain my whole idea and place it in the public domain.
The full idea is that you run an edge-detection algorithm on the image from the camera, in real time, so that instead of trying to present the full data you present only the edges, which is what the human eye is good at detecting anyway (cf. "mach banding"). I believe that, using this system, it would be very easy to detect, say, a post two feet in front of your eyes: if you moved your head back and forth five or six inches, the edges of the post would whizz across your chest in a very noticeable fashion. Even though the chest may not be innervated enough to detect subtle differences in a static field of poking rods representing complete image data, it can certainly feel something brushing across it if you use edge detection.
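To make the edge-only idea concrete, here's a minimal sketch of an edge detector in pure Python. It's a toy gradient filter, not whatever the real devices use (a production system would run something like Sobel or Canny on actual camera frames); the scene values and threshold are invented for illustration.

```python
# Toy edge detector: mark pixels where brightness changes sharply
# compared to the pixel to the right or below. Only these "edge"
# rods would poke the wearer; the rest of the field stays still.

def detect_edges(image, threshold=50):
    """image is a 2D list of grayscale values (0-255)."""
    rows, cols = len(image), len(image[0])
    edges = [[0] * cols for _ in range(rows)]
    for r in range(rows - 1):
        for c in range(cols - 1):
            dx = abs(image[r][c + 1] - image[r][c])  # horizontal change
            dy = abs(image[r + 1][c] - image[r][c])  # vertical change
            if max(dx, dy) > threshold:
                edges[r][c] = 1  # this rod pokes
    return edges

# A bright post against a dark background: only its two vertical
# edges fire, so the chest feels two lines, not a wall of data.
scene = [
    [10, 10, 200, 200, 10, 10],
    [10, 10, 200, 200, 10, 10],
    [10, 10, 200, 200, 10, 10],
]
for row in detect_edges(scene):
    print(row)
```

Shift the bright columns one pixel over (i.e., move your head) and the 1s jump with them, which is exactly the brushing-across-the-chest sensation described above.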
Thus, the emphasis would be on detecting motion, not depth or color or hue or brightness or anything like that. Many animals have relatively poor eyesight but are extremely good at detecting motion (cf. owls), and they do OK. Hell, owls can fly.
There are a couple of cool twists you could add to this system. One would be using an infrared camera and adding a bright infrared LED offset from the camera by a couple of feet. This would provide 'hatchet lighting' in low-light or high-ambient-light conditions, casting very dramatic shadows on close objects, which would enhance the edge-detection algorithm and make motion even easier to detect for objects within about 15 feet. (You can also imagine military uses for this system: attach a back-facing camera to a special forces dude, put a piezo panel on his back, and he's got eyes in the back of his head.)
The other twist would be to add real-time optical character recognition and barcode reading to the system, so that every n frames you grab the scene and check whether any words or codes are recognizable; if so, you read them aloud to the wearer while simultaneously vibrating the area in which you found the word or code. In this way, once a blind person found, say, a post at a street corner, she could look up at it and have the street sign read to her. Or, if she were pawing through her cabinets, she could look at boxes and cans and have the items read to her.
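The every-n-frames loop is simple enough to sketch. Here `recognize_text` is a hypothetical stand-in for a real OCR or barcode library (e.g. Tesseract or ZBar), and frames are just dicts, so the control flow runs on its own:

```python
# Sketch of the every-n-frames scanning loop. recognize_text is a
# stub; a real system would run OCR / barcode decoding on the frame.

def recognize_text(frame):
    # hypothetical recognizer: returns text found in the frame, or None
    return frame.get("visible_text")

def scan_stream(frames, every_n=30):
    """Check one frame in every_n; collect (frame_index, text) hits."""
    hits = []
    for i, frame in enumerate(frames):
        if i % every_n != 0:
            continue  # skip frames between scans to save CPU
        text = recognize_text(frame)
        if text:
            hits.append((i, text))  # speak the text, vibrate at its location
    return hits

# A second of blank video, then the camera lands on a street sign:
frames = [{"visible_text": None}] * 30 + [{"visible_text": "ELM ST"}] * 30
print(scan_stream(frames))  # the sign is caught on frame 30
```

The skip-most-frames structure is the point: full OCR on every frame would be wasteful, but a hit every 30 frames is still roughly once a second, plenty fast for reading a sign.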
I know real-time barcode reading is possible (because, uh, I invented it), and I suspect real-time OCR is possible as well, given that we've had basic OCR forever.
I've been thinking about how to work towards teaching computers to "understand" languages, with the goal of creating a simple universal translator. (Simple!)
What I'd like to see is a system where we can translate sentences from any language into a neutral format, and then translate that neutral format into words in the target language, but NOT try to put them together into grammatically correct structures. This is why I call it a simple translator.
I don't think this is a particularly hard problem, really; I think it's mainly tedious. I've been trying to think of how to divide up the work: possibly by creating a wiki or some other kind of publicly supported site so we can build up a universal knowledge base once and for all, much like Wikipedia.
My thinking is: we need to assign a unique code (call it a number) to every concept in the entirety of our existence. (Simple!) For example, look at the sentence, "Mary had a little lamb." "Had" as in "owned" would be, say, concept number 23. But "had" can also mean "ate," which would be concept number 2,031. "Had", as in knew carnally, would be concept number 65,312.
So, the English word "had" would have a strong association with 23, and weaker associations with 2,031 and 65,312. We might throw out the weaker associations if they aren't reinforced by the associations found in the rest of the piece; there appears to be no further mention of Mary eating her lamb or loving it in a way that would make John Cornyn excited, so we can probably discount those associations.
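A toy version of that weighting-and-reinforcement step, using the concept numbers from the sentence above. The prior weights, "fields," and context words are all invented for illustration; the idea is just that a weak sense only wins when the rest of the piece reinforces it:

```python
# Word-sense picker over numbered concepts. Each word maps to
# candidate concept numbers with prior weights; context words that
# share a sense's field reinforce that sense.

SENSES = {
    "had": [
        (23,    0.7, "possession"),  # "had" as in owned
        (2031,  0.2, "eating"),      # "had" as in ate
        (65312, 0.1, "carnal"),      # "had" as in knew carnally
    ],
}

# Which field a context word hints at (illustrative only):
CONTEXT_FIELDS = {"fleece": "possession", "followed": "possession",
                  "mint": "eating", "sauce": "eating"}

def pick_sense(word, context):
    scores = {}
    for concept, prior, field in SENSES[word]:
        boost = sum(0.5 for w in context if CONTEXT_FIELDS.get(w) == field)
        scores[concept] = prior + boost
    return max(scores, key=scores.get)

print(pick_sense("had", ["mary", "a", "little", "lamb"]))    # → 23 (owned)
print(pick_sense("had", ["mary", "lamb", "mint", "sauce"]))  # → 2031 (ate)
```

With no reinforcement, "owned" wins on its prior alone; add "mint" and "sauce" to the piece and the "ate" sense overtakes it, which is the discounting described above running in reverse.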
We end up with a list of concepts for each word (or possibly phrase, in the case of idioms) in the source text. We can take it further by recognizing parts of speech and building our little concept list into a tree, sixth-grade sentence-diagram style. (E.g., prepositions and clauses start their own little trees, and adjectives hang off of the nouns they modify.)
Now, what I'm curious about is just translating these raw trees into another language and then outputting them as trees, not as sentences. That is, I don't want to create a grammatically correct English sentence from a Japanese one; I want to see how the Japanese sentence was structured. I am sure I could figure out its original meaning, and it would also be a fascinating learning experience. It would also introduce fewer translation errors into the equation to simply require people to learn the structure of other languages (which is pretty trivial, really, n'est-ce pas?).
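Tree-out-not-sentence-out is easy to sketch: walk the concept tree, swap each concept number for a target-language word, and print the shape with indentation instead of reassembling grammar. The concept numbers (other than 23 for "owned") and the romanized-Japanese lexicon here are invented for illustration:

```python
# "Translate the tree, not the sentence": look up each concept number
# in a target-language lexicon and emit the tree shape as-is, with no
# grammatical reassembly.

# Hypothetical lexicon mapping concept numbers to Japanese (romanized):
LEXICON_JA = {101: "Mary", 23: "motsu", 301: "chiisai", 302: "kohitsuji"}

# "Mary had a little lamb" as a concept tree: the verb (23, own) is
# the root, with the subject (101, Mary) and object (302, lamb) below
# it, and the adjective (301, little) hanging off the noun it modifies.
tree = (23, [(101, []), (302, [(301, [])])])

def render(node, lexicon, depth=0):
    """Return the tree as indented lines, one concept per line."""
    concept, children = node
    lines = ["  " * depth + lexicon[concept]]
    for child in children:
        lines.extend(render(child, lexicon, depth + 1))
    return lines

print("\n".join(render(tree, LEXICON_JA)))
```

The reader sees the source sentence's structure directly (verb at the root, modifier under its noun) and does the final grammatical assembly in her own head, which is the whole "simple translator" bargain.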
The things we could do with a system like this are amazing. Imagine playing World of Warcraft and being able to talk to people from all over the world, in their language, automatically. Imagine hooking this up to iChat so that you could iChat with anyone in real-time.
But once we had this system, we could start working on teaching the computer to actually understand our speech, in limited amounts. We could teach the computer facts in its own language (that is, the numbered concept language) instead of trying to teach them in some arbitrary human language. For instance, we could teach it that Granny Smith apples are green when ripe, and it would really know this. We'd have to be smart about associating concepts with each other; e.g., the concept for "green" would have a link to its class, which would be the concept for "color". But once we did this, we could ask for the "color" of any object in the fact-base, and it'd work. In any language.
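Here's a minimal sketch of that fact-base, with all concept numbers and names invented. The trick the paragraph describes, the class link, is what lets one generic query answer "color of X" for any object:

```python
# Toy fact base in the numbered concept language. Facts relate
# concept numbers; each attribute concept links to its class, so
# asking for the "color" (class 50) of anything just works.

CONCEPTS = {40: "apple", 41: "green", 42: "red", 50: "color", 60: "fire truck"}
CLASS_OF = {41: 50, 42: 50}   # green and red are both colors
FACTS = [(40, 41), (60, 42)]  # apple is green; fire truck is red

def attribute_of(thing, attr_class):
    """Find a fact about `thing` whose attribute is in `attr_class`."""
    for subject, attribute in FACTS:
        if subject == thing and CLASS_OF.get(attribute) == attr_class:
            return CONCEPTS[attribute]
    return None

print(attribute_of(40, 50))  # color of apple → "green"
print(attribute_of(60, 50))  # color of fire truck → "red"
```

Nothing here is tied to English: swap `CONCEPTS` for a lexicon in any other language and the same facts and the same query answer in that language, which is the "in any language" payoff.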
This is a huge undertaking, and I know that parts of this have been done before, but never on the right scale. The idea I'm playing with is how we can use the net to have people add concepts to the initial language database (leaving aside the knowledge base, which I see as a second step, after we have the concepts in), and how to make something useful from that before we tackle associating the concepts with each other.
Ceci n'est pas une pipe!
Labels: random ideas