Type the text in this box!

  • Type the text in this box!

    You've all seen it. You need to verify that you're a real person and not a T-1000. So you see a box with a bunch of random letters and numbers, or sometimes a made-up word or sometimes a real word, and you have to repeat what you see. I guess it's a good idea, although I don't always see the point as far as what they are protecting from the vicious cyborgs.

    What I don't get is why they go find an epileptic 3-year-old to actually draw the symbols. Sometimes all you see is like a random line and a circle, and you're somehow supposed to figure out what it means? It seems pointless - if the robots they are trying to thwart can figure out what the text is if it's written too neatly, surely they can still figure it out no matter how messily you write it.

  • #2
    The captchas that Google (and Yahoo Groups) use are notorious for being unreadable at times. I realize that the point is to make sure a human is parsing the 'words', but if a human can't even make out what's supposed to be there...

    A site I've been on (can't remember what it is now) uses "CAT-cha" where users have to pick out the three kitten photos from a group of six, and it's always randomized.
    "Any state, any entity, any ideology which fails to recognize the worth, the dignity, the rights of Man...that state is obsolete."

    • #3
      Originally posted by DrFaroohk
      I guess it's a good idea, although I don't always see the point as far as what they are protecting from the vicious cyborgs.
      As to what they are protecting? Everything. If it's a forum, then they are protecting the users from spam. If it's a mailing list, the same thing. If it's a corporate site, then they're trying to prevent a distributed denial-of-service (DDoS) attack, by making sure that the person entering the code can't just sign up for any number of accounts and then start downloading the entire site (including whatever really big files they might have) over and over.

      If I've missed a scenario, let me know what you saw being protected, and I'll tell you what they were protecting from.

      Originally posted by DrFaroohk
      It seems pointless - if the robots they are trying to thwart can figure out what the text is if it's written too neatly, surely they can still figure it out no matter how messily you write it.
      Actually, no, they can't. Computers are very good at seeing straight lines in a picture. They can easily follow them. Once they know where the straight lines are, they can then easily do pattern matching, and figure out the characters. In fact, this is the basis of optical character recognition, or OCR. Most scanners come with some form of OCR software, and even with how easy this process is (in comparison to understanding captchas), they still make mistakes.

      As a result, deliberate distortion of letters results in something that is very hard for a computer to automatically decipher.
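To make the pattern-matching point concrete, here is a toy sketch in Python. Everything in it is an assumption for illustration: the 5x5 glyphs and the `score`, `classify`, and `shift_left` helpers are made up, and real OCR is far more sophisticated than counting matching pixels. It only shows how a matcher that compares an image against clean templates can be thrown off once a character is moved or bent.

```python
# Toy template matcher: every glyph is a 5x5 grid, '#' = ink, '.' = blank.
# The glyph shapes and helper names are illustrative assumptions.
TEMPLATES = {
    "I": ["..#..", "..#..", "..#..", "..#..", "..#.."],
    "L": ["#....", "#....", "#....", "#....", "#####"],
    "T": ["#####", "..#..", "..#..", "..#..", "..#.."],
}

def score(image, template):
    """Count the cells where the image and the template agree (max 25)."""
    return sum(
        a == b
        for img_row, tpl_row in zip(image, template)
        for a, b in zip(img_row, tpl_row)
    )

def classify(image):
    """Return the template letter with the highest pixel agreement."""
    return max(TEMPLATES, key=lambda letter: score(image, TEMPLATES[letter]))

def shift_left(image, n):
    """Crude 'distortion': rotate every row n cells to the left."""
    return [row[n:] + row[:n] for row in image]

clean = TEMPLATES["I"]
nudged = shift_left(clean, 2)

print(classify(clean))   # I
print(classify(nudged))  # L
```

Here the "distortion" is only a two-pixel sideways nudge, and that is already enough to make the `I` score higher against the `L` template than against its own. Real captchas warp, overlap, and add noise on top of that.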

      Newer systems are in development, but one thing that has not been determined as yet is how to beat a different problem that has come up: Spammers are now re-using captchas on porn sites, and having a human answer the captcha for them.

      The way it works is this: Spammer sets up a porn site, and requires the user to answer a question to gain access. Whenever a user attempts to gain access, spammer's code goes and tries to gain an account (or make a post) somewhere, and receives the captcha from the target site. It then gives the user the captcha, who then dutifully enters the captcha. The response is sent to the target site, and when that site responds to the spammer's site, his code knows whether or not it succeeded. Based on that response, it either grants or denies the user access to the porn site.

      In other words: Porn surfers are helping the spammers by cracking captchas for them.

      All of this is a race to try to control the scumbags using the internet to bilk money out of unsuspecting people. So far, the good guys are barely holding even. And probably not for much longer.

      Ironically, the bad guys are pushing artificial intelligence further than the good guys because the bad guys have greater need for it. If they can get a true AI going that can read anything, then they don't have to have any other systems running; they can just go to town getting at everything they can without bothering with secondary sites trying to get users to crack data for them.

      It's not a pretty sight out there. And I'm not convinced it's going to continue working this well for much longer.

      • #4
        I understand why they do it, and I'm fine with it, but I frequently have to do it a couple times before I manage it. I'm fine at reading letters, but two things I am NOT fine at are:

        1. Number recognition. Even if it's written clearly. Muss up the character, and I'm doomed.

        2. Deciphering bad handwriting. I sometimes wonder if that's part of the dyscalculia situation, but I should be able to decipher mussed-up letters, at least. But I really don't do so well at that, either.

        I cringe when I see those things. I can do it, but it's not easy for me. I pity any dyslexics trying to navigate that crap.

        The worst one was the one for the Dyscalculia forum I'm on. I thought it was a very, very sick joke that we have to type in a numeric code to initially create an account on that site. It's funny....now.

        • #5
          Some of the ones I've seen just don't look like they do anything. One was a forum that kicked in the captcha thing whenever you just did a forum search. Want to find all posts about Back to the Future? Gotta decipher the Egyptian code first!

          Some other ones are movie/tv sites I go to. Just to watch a streaming video you have to do one of those. It seems pointless though, because I don't understand what they are protecting. Are they afraid the robot is going to watch an episode of Burn Notice!?!?! OH NOES!

          Why don't they do it like this: Simply type a regular word, in regular font, that people can read, and then to get by it you have to type the word BACKWARDS. Or like an unscramble type thing. Would the OCR even be able to comprehend the fact that it has to do such a thing? I suppose if they were all the same you could tell it to do so, but that's why you mix it up some. Type this word backwards. Unscramble these letters. Type the first half backwards and the last half forward.

          • #6
            Originally posted by DrFaroohk
            Some of the ones I've seen just don't look like they do anything. One was a forum that kicked in the captcha thing whenever you just did a forum search. Want to find all posts about Back to the Future? Gotta decipher the Egyptian code first!

            Some other ones are movie/tv sites I go to. Just to watch a streaming video you have to do one of those. It seems pointless though, because I don't understand what they are protecting. Are they afraid the robot is going to watch an episode of Burn Notice!?!?! OH NOES!
            That would be the DDoS fear. Running repeated searches can bog down a database and eat up CPU cycles, and some hosting services will nail you for using more than your share. As for the videos, a bot might be wasting bandwidth or simply harvesting media.

            I like your ideas about backwards, or unscramble, or such (though unscrambling might be difficult, as you might get multiple words from a scrambled set, or someone might not know the word).
            Any comment I make should not be taken as an absolute, unless I say it should be. Even this one.

            • #7
              Originally posted by DrFaroohk
              One was a forum that kicked in the captcha thing whenever you just did a forum search. Want to find all posts about Back to the Future? Gotta decipher the Egyptian code first!
              Broom is right: This is about preventing a denial of service attack. As a non-programmer, you're not likely to know just how easy it is to create a very small program that can bring a forum to its knees through a simple search. Based on what I've done to myself, I could write a program in about 25 (maybe 30) lines of code that would execute a couple hundred simultaneous searches. Randomize one word in the search for each one, and I manage to make the database work even harder to accommodate all of them. Basically, for almost any forum out there, I can bring it to its knees with very little code and only one computer.

              And yes, there are malicious people like this out there. Add in a random element that the user has to do, and you make my 25 line program much more difficult to accomplish successfully. The forum gets protected.
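For the defensive side of that, here is a rough sketch in Python of gating each search behind a freshly solved challenge. It is only an illustration under assumptions: `SearchGate`, `issue_challenge`, and `run_search` are hypothetical names, not any forum software's actual API, and a real site would render the answer as a distorted image instead of handing it back as text.

```python
import secrets

def run_search(query):
    """Stand-in for the real (expensive) database search."""
    return [f"result for {query!r}"]

class SearchGate:
    """Require one freshly solved challenge per search (hypothetical design)."""

    def __init__(self):
        self._pending = {}  # token -> expected answer

    def issue_challenge(self):
        token = secrets.token_hex(8)
        # A real site would show this answer only as a distorted image;
        # it is returned here as plain text purely to keep the sketch short.
        answer = secrets.token_hex(3)
        self._pending[token] = answer
        return token, answer

    def search(self, token, answer, query):
        # pop() makes every challenge single-use, so one solved captcha
        # cannot be replayed for a couple hundred searches.
        expected = self._pending.pop(token, None)
        if expected is None or answer != expected:
            return None
        return run_search(query)
```

With this in place, a script that fires off a couple hundred searches has to solve a couple hundred separate challenges first, which is the whole point of the extra step.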

              Originally posted by DrFaroohk
              Some other ones are movie/tv sites I go to. Just to watch a streaming video you have to do one of those. It seems pointless though, because I don't understand what they are protecting. Are they afraid the robot is going to watch an episode of Burn Notice!?!?! OH NOES!
              Actually, I work with someone who has written a video grabber. He's got it completely automated, and uses it for his own video site. Someone submits a URL to a video, and his script retrieves the video from the remote server, stores it on his own local server, and then does view counts, advertising, etc, using that video. Add in an actual captcha to view a video, and that grabber script becomes useless.

              So, there's preventing illegal copying, and preventing a DDoS (download the same video a hundred times from a hundred different computers). I think that's quite sufficient reason to protect your data, and your bandwidth bill, using a captcha.

              Originally posted by DrFaroohk
              Why don't they do it like this: Simply type a regular word, in regular font, that people can read, and then to get by it you have to type the word BACKWARDS. Or like an unscramble type thing. Would the OCR even be able to comprehend the fact that it has to do such a thing? I suppose if they were all the same you could tell it to do so, but that's why you mix it up some. Type this word backwards. Unscramble these letters. Type the first half backwards and the last half forward.
              Those actually sound like decent ideas at first, until you realize that the web page you are viewing has actual structure to it. Once you realize that, you can (quite easily) write code that looks over the structure, and looks for the piece of the page that is visible to the user that provides the instructions on how to decode it. From there, a simple keyword match reveals what has to be done to get past the captcha, breaking them totally.
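As a rough illustration of how little code that takes, here is a sketch in Python using only the standard library. The page markup, the wording of the instructions, the tiny `WORDS` list, and the `visible_text` and `solve` helpers are all assumptions made up for the example; a real page would be messier, but the idea is the same.

```python
from html.parser import HTMLParser

# Hypothetical challenge page; the markup and wording are assumptions.
PAGE = """
<form>
  <p id="challenge">Type this word backwards: <b>kitten</b></p>
  <input name="answer">
</form>
"""

WORDS = {"kitten", "forum", "robot"}  # stand-in dictionary for unscrambling

class TextCollector(HTMLParser):
    """Collect the text nodes of a page, ignoring the tags."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)

def visible_text(page):
    """Return the page's text with runs of whitespace collapsed."""
    collector = TextCollector()
    collector.feed(page)
    collector.close()
    return " ".join(" ".join(collector.chunks).split())

def solve(page):
    """Keyword-match the instruction, then apply it to the word after the colon."""
    text = visible_text(page).lower()
    instruction, _, word = text.rpartition(":")
    word = word.strip()
    if "backwards" in instruction:
        return word[::-1]
    if "unscramble" in instruction:
        for candidate in sorted(WORDS):
            if sorted(candidate) == sorted(word):
                return candidate
    return None

print(solve(PAGE))  # nettik
```

Mixing up the instructions only means adding one more `if` per variant, so rotating between "backwards" and "unscramble" does not buy much.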

              Right now, the best methods we have of validating that a human is entering something all use some sort of graphical data presentation. Pictures are hard for a computer to understand. They don't see the same way we do. Where we see patterns, they see a collection of 1s and 0s. We have to teach them to see patterns, and to recognize objects within those patterns. What is surprising is the complexity of doing that work.

              If you would like to get an idea of how difficult the process is, have a friend blindfold you, and then drive you to some place random, and take an unusual route to get there. Once there, get out of the car, and spin in place until you're dizzy. Take off the blindfold, and time how long it takes for you to recognize where you are.

              After doing that, consider this: You were given a huge number of disadvantages in the process, and still managed to figure out where you were fairly quickly by looking around and identifying the objects surrounding you.

              Computers, right now, cannot do that. They do not have the capacity to actually recognize objects in general. In certain limited cases, they can do so. For instance, an object they've been trained to recognize, when presented in the way they've been trained to recognize it, is likely to be recognized. But change the background, or turn the object sideways, or upside down, or change the color contrasts sufficiently, and you'll find the computer failing to recognize something.

              Vision is hard. And that is why it is chosen for captchas. Computers, while better than they used to be, still don't do it very well.

              • #8
                I completely agree! I sometimes have to do those two or three times because the letters are so weird. I really hate the ones where you get locked out if you get it wrong more than three times, and have to wait 24 hours or contact a server admin.

                • #9
                  It doesn't stop all spambots tho; not all spambots are actually, well, bots. Some are people who get on, post as a user for a few posts, then post a thread that's actually crammed full of advertising links. There was one on a site I admin a week or so ago; I deleted the thread right away and told them to knock it off or else. They haven't been back since. People like that make me so angry, cuz there's a good chance that the links lead to spyware and shit, and there are often kiddies surfing sites who might end up clicking.
                  "Oh wow, I can't believe how stupid I used to be and you still are."

                  • #10
                    What bugs me is when they offer the option to hear the letters if they are too hard to read, and my slight hearing problem means that I don't understand the electronic voice from the speakers either.
                    Point to Ponder:

                    Is it considered irony when someone on an internet forum makes a post that can be considered to look like it was written by a 3rd grade dropout, and they are poking fun of the fact that another person couldn't spell?
