Monday, December 18, 2006

CAPTCHA introduction (cont. 6)

For the second research topic, we divide the process of breaking a CAPTCHA into two steps: segmentation and recognition. In the segmentation step, every character in the CAPTCHA must be extracted completely; the recognition step then identifies each extracted character. As noted in [1], segmentation is far harder than recognition. With current techniques, recognition can be handled by machine learning methods, but segmentation remains a problem that computers have yet to overcome. The reason is that CAPTCHA designers improve security by warping or distorting the text and by adding clutter lines to the image. As the distortion increases, the extraction regions of several characters may overlap; as the number of clutter lines increases, both segmentation and recognition are disrupted, so the characters can no longer be extracted correctly. We are currently working on breaking the Yahoo and YouTube CAPTCHA systems, and we plan to break more CAPTCHA systems in the near future.
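To make the segmentation difficulty concrete, here is a minimal sketch of a naive segmenter. It is not the method we use on the Yahoo or YouTube CAPTCHAs; it is a simple vertical-projection splitter, swapped in only for illustration, and the function names (`column_profile`, `segment_columns`, `parse`) and the toy images are assumptions made up for this example. It cuts a binary image wherever a column contains no ink, which works on clean text and collapses as soon as one clutter line joins the characters.

```python
def column_profile(image):
    """Count ink pixels (value 1) in each column of a binary image."""
    return [sum(col) for col in zip(*image)]


def segment_columns(image, min_ink=1):
    """Return (start, end) column spans separated by columns with too little ink.

    Hypothetical helper: `end` is exclusive, and `min_ink` is the number of
    ink pixels a column needs before it counts as part of a character.
    """
    spans = []
    start = None
    for x, ink in enumerate(column_profile(image)):
        if ink >= min_ink and start is None:
            start = x                      # a character span begins
        elif ink < min_ink and start is not None:
            spans.append((start, x))       # a blank column closes the span
            start = None
    if start is not None:
        spans.append((start, len(image[0])))
    return spans


def parse(rows):
    """Turn rows of '#' (ink) and '.' (background) into a 0/1 matrix."""
    return [[1 if ch == "#" else 0 for ch in row] for row in rows]


# Toy image with three well-separated glyphs.
clean = parse([
    "##..#..###",
    "#...#..#.#",
    "##..#..###",
])

# Same glyphs with one horizontal clutter line drawn through them.
cluttered = parse([
    "##..#..###",
    "##########",
    "##..#..###",
])

print(segment_columns(clean))      # three spans, one per glyph
print(segment_columns(cluttered))  # a single span covering the whole image
```

In the cluttered image every column contains ink, so the splitter finds no gap at all. Raising `min_ink` happens to recover the three spans in this toy case, but it would also cut through thin strokes, which is one reason segmentation stays harder than recognition.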
