Tuesday, January 16, 2007

明慧--The proposed CAPTCHA mechanism

2.The proposed CAPTCHA mechanism
The proposed visual CAPTCHA mechanism is relatively simple. First, a collection of images is prepared. To generate a test, an image is selected from the database at random. In this image, two non-overlapping blocks of the same size are chosen and their contents are exchanged. We will name this procedure the EXCHANGE-2 algorithm. Fig.2 depicts an image before and after applying the EXCHANGE-2 routine. To pass the test, the user needs to use the pointing device to click on the switched regions. In the above example, it should not require much effort to identify these areas, owing to the size of the exchanged blocks.

在這裡提出的視覺CAPTCHA方法相當簡單,首先將影像準備好,然後隨機選擇一張影像作測試用。在這張影像中,選擇兩個沒有重疊的區塊作交換,此方法在此命名為EXCHANGE-2演算法。如Fig.2所示,一張為改變前的影像,一張為套用EXCHANGE-2演算法之後的影像。若要通過此測試,使用者必須在交換的區塊上做點擊的動作,而在上例中,他應該不需要花費很多心力去辨識所交換的區塊。
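
The EXCHANGE-2 procedure described above can be sketched roughly as follows. This is only a minimal illustration, not the authors' implementation: the image is assumed to be a plain list of pixel rows, the blocks are assumed to be square, and the names `blocks_overlap` and `exchange_2` are hypothetical.

```python
import random

def blocks_overlap(a, b, size):
    """True if two size x size blocks with top-left corners a and b overlap."""
    return abs(a[0] - b[0]) < size and abs(a[1] - b[1]) < size

def exchange_2(image, size, rng=random):
    """Swap two random non-overlapping size x size blocks.

    `image` is a list of rows (assumed representation). Returns the new
    image and the top-left corners of the two exchanged blocks.
    """
    h, w = len(image), len(image[0])
    # Two disjoint blocks fit only if one dimension holds two blocks side by side.
    if size > h or size > w or (2 * size > h and 2 * size > w):
        raise ValueError("image too small for two non-overlapping blocks")
    while True:  # rejection sampling: redraw until the blocks are disjoint
        a = (rng.randrange(h - size + 1), rng.randrange(w - size + 1))
        b = (rng.randrange(h - size + 1), rng.randrange(w - size + 1))
        if not blocks_overlap(a, b, size):
            break
    out = [row[:] for row in image]  # leave the original image untouched
    for dr in range(size):
        for dc in range(size):
            out[a[0] + dr][a[1] + dc] = image[b[0] + dr][b[1] + dc]
            out[b[0] + dr][b[1] + dc] = image[a[0] + dr][a[1] + dc]
    return out, (a, b)
```

The returned corners would be kept on the server side so that a click can later be checked against the two exchanged regions.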


The original methodology can be generalized as follows: randomly select K non-overlapping regions in the image and perform content reassignment. A user will be granted access if he/she can correctly identify J exchanged blocks, where K>=J>=2. Intuitively, the larger K and J are, the more difficult it is for both humans and computers to pass the test. On the other hand, increasing K also increases the probability of overlap, hindering the test generation process. Therefore, proper choices of K and J must be made to allow smooth operation of this EXCHANGE-K-J algorithm.

原始的方法論可能被推斷如下:在一張影像中隨機選擇K的區塊,然後將其重新配置,如果能正確辨認出J塊已交換區塊,而這J塊區塊介於2塊到K塊之間,則使用者將認同此方法是可行的。在直覺上,隨機選擇的區域K及辨認出來的區域J越多,就越難通過人機辨識系統。就另一方面而言,增加K塊區域且增加重疊的可能性,也阻擋了測試生成過程。所以,適當的選擇K和J,可讓EXCHANGE-K-J演算法更流暢的操作。
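
A rough sketch of the two parts of EXCHANGE-K-J that the paragraph above mentions, placing K non-overlapping blocks and checking whether J of them were identified, might look like this. Several things here are assumptions rather than details from the paper: blocks are square, placement uses rejection sampling with a retry limit (`max_tries`), and only the first J clicks are counted. The function names are hypothetical.

```python
import random

def blocks_overlap(a, b, size):
    """True if two size x size blocks with top-left corners a and b overlap."""
    return abs(a[0] - b[0]) < size and abs(a[1] - b[1]) < size

def place_blocks(h, w, size, k, rng=random, max_tries=1000):
    """Pick k non-overlapping block corners in an h x w image.

    A larger k makes collisions more likely, so the retry budget (an
    assumed safeguard) can run out before k blocks are placed.
    """
    corners = []
    for _ in range(max_tries):
        cand = (rng.randrange(h - size + 1), rng.randrange(w - size + 1))
        if all(not blocks_overlap(cand, c, size) for c in corners):
            corners.append(cand)
            if len(corners) == k:
                return corners
    raise RuntimeError("could not place k non-overlapping blocks")

def block_hit(click, corner, size):
    """True if a (row, col) click falls inside the block at `corner`."""
    return (corner[0] <= click[0] < corner[0] + size
            and corner[1] <= click[1] < corner[1] + size)

def passes_test(clicks, corners, size, j):
    """Grant access if the first j clicks hit j distinct exchanged blocks."""
    considered = clicks[:j]  # assumption: extra clicks are ignored
    hit = {c for c in corners for p in considered if block_hit(p, c, size)}
    return len(hit) >= j
```

Capping the number of counted clicks at J is one way to keep a bot from simply clicking everywhere; the paper may handle wrong clicks differently.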


The proposed visual CAPTCHA is designed to enable quick inspection and identification of the switched regions by a human user. At the same time, these regions should not be easily detected by a computer algorithm. To this end, we must look into two important issues. The first one has to do with the size of the exchanged blocks. The larger the size is, the easier it is for a user, and perhaps the computer, to locate the exchanged blocks. If the block is too small, even users with acute eyesight will have difficulty finding these regions, as shown in Fig.3, where the block size is 0.05x0.05 of the original image. The second issue concerns the image collection. What type of images should be stored in the database? Does the choice of a specific type of image affect the performance? These are the questions we attempt to answer in this paper.

在此提出的視覺CAPTCHA是設計成讓使用者能夠快速檢查及辨識交換的區塊,同時,這些區塊也不能讓電腦程式容易破解。最後,我們必須調查兩個重要的問題。第一個問題為交換區塊的大小,size越大越容易讓使用者辨識,也或許讓電腦更容易破解,若交換的區塊太小,使用者較難找出這些區塊,像Fig.3,區塊大小僅0.05x0.05。第二個問題則為影像匯集。什麼樣的影像該被存放在資料庫中?在本文中我們嘗試著回答具體類型的影像所影響的效能。

Monday, December 25, 2006

明慧--Introduction(cont. 2)

The textured-textual-image-based mechanism we proposed is more difficult to defeat since it involves two challenging problems in computer vision, namely, image segmentation and texture analysis. It cleverly makes use of the unique capabilities of the human visual system, such as filling-in of contours. The result is a more robust and tamper-resistant scheme for access control.

我們所提出的以影像貼圖為基礎的手法,自從包含影像分割及貼圖分析這兩種問題後,已經變得較難破解了,他是利用人類視覺系統的獨特能力去填滿影像,此結果在存取控制上為一份更具磨練性的計畫。


In this paper, we follow the same principle, Gestalt theory, to devise another visual CAPTCHA which is also easy to generate, yet tough to defeat. The test in the newly developed method is formed simply by exchanging non-overlapping blocks in an image. Passing the test requires clicking on the switched regions with the pointing device, a more natural way to interact than keyboard entry if the user is mostly browsing.

在這篇論文中,我們持續在相同的原則上:型態理論,構想另外一種視覺上的CAPTCHA,容易產生更容易破解。這項測試在近年來的發展是在一張影像當中交換非重複的區塊,通過在被要求點擊交換區塊的測試,而不是使用鍵盤輸入,為一個使用者在瀏覽網頁時更自然的互動方式。


The rest of this paper is organized as follows. In Section 2 we formally present the algorithm to generate the test and justify its efficacy in telling humans and machines apart. We will also discuss issues regarding the choice of parameters and image database. Section 3 describes possible ways to defeat the proposed CAPTCHA and our corresponding counter-measures. Section 4 presents the experimental results of applying several image segmentation techniques to identify the exchanged blocks. Section 5 concludes this paper with a brief summary and outlines possible improvements and future developments.

以下介紹這篇文章的架構。在第2章我們完整地描述演算法去測試這實驗,並且證明在人機辨識系統中式有效的,也討論關於特徵及影像資料庫選擇的問題。在第3章描述可能破解我們所提出的CAPTCHA方法和對應的相關措施。第4章提出幾個影像分割技術的實驗結果,用來辨識已經交換的區塊。第5章簡單敘述結論,以及提出未來可能的發展和改進的方向。

Thursday, December 21, 2006

明慧--Introduction (cont.)

In [2], Ahn et al. proposed a method, CAPTCHA, which stands for “Completely Automated Public Turing test to tell Computers and Humans Apart”, to separate human users from bots. It later became probably the most common method of limiting access to services made available over the Web. Most CAPTCHAs are in the form of visual verification of a bitmapped image, although alternative solutions such as audio and logic puzzles have also been explored.

Ahn et al.提出一個方法:CAPTCHA,用來分辨使用者和機器人程式,後來變成最常使用的方法,常用來限制網路上一些經常使用的服務。而大部分的CAPTCHA是在二為影像中利用視覺來做確認,現在已經有一些聲音及拼圖的辨識也被使用。


Visual CAPTCHA mechanisms based on textual images are now frequently seen in the comment areas of message boards and personal blogs due to the low cost of generating such a test. However, it is also relatively easy to defeat a textual-image-based CAPTCHA. In [3], Mori and Malik developed a robust character recognition algorithm using shape context and achieved 93% accuracy on a set of images generated by EZ-Gimpy. A more difficult test named ‘Gimpy-r’ has also been broken [4]. It is only a matter of time before bots with these capabilities become widespread. Consequently, more effective solutions must be sought.

(這段我看不太懂...)現在比較常見的視覺CAPTCHA是用在留言板及個人的部落格,他就像是個測驗,但也很容易將以貼圖影像為基礎的CAPTCHA破解。在“Recognizing Objects in Adversarial Clutter: Breaking a Visual CAPTCHA”這篇論文中,Mori和Malik發明一種特徵辨識的演算法,利用形狀互換及藉由EZ-Gimpy一堆的影像生成達到93%的正確率,而更難的Gimpy-r測驗也被破解了。機器人程式的功能自然而然的已經被廣泛流傳,而更有效的方法也不斷在發掘當中。


The textual-image-based approach reported in [5] marks our first effort to address the problem raised above. Instead of using distorted characters with random backgrounds, we filled both foreground and background layers with random textures. An example is shown in Fig. 1.

在“Embedding Information within Dynamic Visual Patterns”這篇論文中指出我們第一個以貼圖影像為基礎的作品,已經將上面的問題解決了,不是利用背景的特徵作扭曲,而是在前景及背景兩部分都做隨機貼圖。

Monday, December 18, 2006

CAPTCHA introduction (cont. 6)

在第二個研究議題上,我們將破解Captcha的程序分為兩大步驟,分別是「切割」(Segmentation)以及「辨識」(Recognition)。在「切割」的過程中,我們必須將Captcha內每個字元完整的提取出來。再透過「辨識」的步驟,將提取的字元辨識出來。在[1]中有提到,「切割」的困難度遠高於「辨識」,就目前的技術來說,「辨識」的工作可以透過機器學習(Machine learning)的方法完成,但「切割」的問題電腦卻仍有待克服,因為在Captcha設計的過程當中,為了增加Captcha的安全性,會將文字的部分進行扭曲或變形,並在圖形中加入雜線,當文字扭曲或變形的程度提高時,有可能將多個字元的提取區域重疊在一起,當雜線的數量增加時,就會混亂Captcha的切割與辨識,使得Captcha內的文字無法決定正確的擷取。目前我們正著手進行Yahoo以及YouTube的Captcha系統的破解,並規劃在不久的將來破解更多的Captcha系統。

In the second area, we divide the process of breaking a CAPTCHA into two steps: segmentation and recognition. In the segmentation step, every character in the CAPTCHA must be extracted intact; the recognition step then identifies the extracted characters. As noted in [1], segmentation is considerably harder than recognition. Recognition can be handled with machine learning techniques, but segmentation still poses difficulties that have not been solved. For example, some CAPTCHA systems improve security by adding clutter or warping the characters. Clutter confuses both segmentation and recognition, and warped characters make segmentation harder as well, because the regions of several characters may end up overlapping. We are currently trying to break the CAPTCHA systems of YouTube and Yahoo, and plan to break other CAPTCHA systems in the future.
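
To illustrate why clutter makes segmentation hard, here is a deliberately naive segmentation sketch. It is not the method used in this work: it assumes the CAPTCHA has already been binarized into rows of 0/1 values and simply splits the image wherever a column contains no ink. The function name `segment_columns` is hypothetical.

```python
def segment_columns(bitmap):
    """Split a binary bitmap into (start, end) column spans of connected ink.

    A span ends at the first column that contains no ink. This naive
    rule is an assumed stand-in for a real segmentation algorithm.
    """
    width = len(bitmap[0])
    ink = [any(row[c] for row in bitmap) for c in range(width)]
    spans, start = [], None
    for c, has_ink in enumerate(ink):
        if has_ink and start is None:
            start = c                   # a new character span begins
        elif not has_ink and start is not None:
            spans.append((start, c))    # an empty column closes the span
            start = None
    if start is not None:
        spans.append((start, width))    # span running to the right edge
    return spans
```

With two well-separated glyphs this returns two spans, but a single clutter line drawn across the image fills every column with ink, so the whole image collapses into one span and the recognition step never receives individual characters.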

Sunday, December 17, 2006

CAPTCHA introduction (cont. 5)

Captcha的研究分為兩大主軸:一是設計Captcha,另一是破解Captcha,這也是本論文的兩大研究議題[1]。針對第一個研究議題,設計Captcha應注意的重點是所設計的Captcha需方便人類閱讀(Friendly)並且不易被電腦破解(Security)。實際上這兩個需求是互相牴觸的,以無名小站的Captcha為例(如圖1.6 (a.)),它很Friendly,人們可以輕易的辨識出6185,同時電腦也可以辨識出他是6185(Security不足),所以這不是一個好的Captcha。再以Hotmail的Captcha為例(如圖1.6 (b.)),這組Captcha雖然不易被電腦辨識出來(Security不足),但因雜線過多,人們也搞不清楚其內容(不Friendly)。所以要設計一個好的Captcha需在Friendly與Security這兩個需求間取得一個平衡,特別是要了解人類視覺辨識能力的優勢在哪裡,電腦辨識能力的缺點又是在何處,這樣才能設計出一個理想的Captcha。

Research on CAPTCHAs falls into two areas: designing a system and breaking an existing one. In the first area, a new design must be both human-friendly and secure: people should be able to answer the question easily, while a computer should not be able to find the answer automatically. In practice, these two requirements conflict with each other. For example, Fig. **.** shows a CAPTCHA used by www.wretch.cc. A human can easily read this picture as 6185, but so can a computer. By contrast, Fig. **.** shows a sample from the Hotmail CAPTCHA system. This CAPTCHA is highly secure, but the heavy clutter makes it hard for people to read. Because the two requirements are hard to balance, we should identify where human recognition ability has the advantage and where computer recognition algorithms fall short. On that basis, we can look for a better way to design a new system.

Thursday, December 14, 2006

小和--Introduction

With the prevalence of digital communication networks, an increasing amount of information is being generated and exchanged electronically. To effectively address the growth in data quantity, automatic processing ‘bots’ or ‘agents’ have been developed to aid human users in filtering or summarizing raw materials. With the introduction of new web standards such as XML and semantic web [1], it is expected that search, exchange, and interaction across different platforms will become more convenient and precise.

While we celebrate and enjoy the convenience of automation, it should be cautioned that abuse of technology can result in undesirable consequences. For example, scripts to register for free e-mail accounts in quantity have proliferated to such an extent that major portal sites have adopted anti-automation strategies to counter-attack. Automated playing ‘bots’ in the form of plug-ins can seriously interfere with the normal operation of online games and thus take the fairness and fun away from the whole experience. A simple and effective approach to automatically tell humans and machines apart will prove valuable for the aforementioned scenarios.

隨著網路通訊技術的普及,數位化的資訊也越來越豐富。為了幫助人們能更有效率的分析與整合這些資料,所謂的"機器人"或是"代理人"程式也因此應運而生。透過XML或是語意網路等新一代的網路技術,不同平台間的搜尋、交流以及互動將會更加方便,也更加準確。

當我們對自動化的方便感到慶幸的同時,不可諱言的,我們也必須要正視自動化的濫用將可能導致的恐怖後果。舉例來說,利用自動化程式大量註冊免費郵件帳號,導致入口網站禁不起流量的激增,只好設計出反自動化程式加以反制。作為外掛的自動化遊戲機器人,則會妨礙正常的遊戲方式,進而嚴重影響遊戲的公平性與趣味性。最簡單而有效的方法,便是提出一個能夠自動分辨人類與電腦的程式,來防止上述情況的發生。

明慧--Introduction

With the prevalence of digital communication networks, an increasing amount of information is being generated and exchanged electronically. To effectively address the growth in data quantity, automatic processing ‘bots’ or ‘agents’ have been developed to aid human users in filtering or summarizing raw materials. With the introduction of new web standards such as XML and semantic web [1], it is expected that search, exchange, and interaction across different platforms will become more convenient and precise.

隨著網路上數位溝通的流行,藉由電子方法的產生和交換增加資訊。在資料量中有效位址的成長,全自動化程序的機器人或代理人程式已經發展成幫助使用者過濾或概括原始資料。隨著新網路的引進,像是XML和語義網路,期待於搜尋、交換、及不同平台間的互動能夠更方便且明確。


While we celebrate and enjoy the convenience of automation, it should be cautioned that abuse of technology can result in undesirable consequences. For example, scripts to register for free e-mail accounts in quantity have proliferated to such an extent that major portal sites have adopted anti-automation strategies to counter-attack. Automated playing ‘bots’ in the form of plug-ins can seriously interfere with the normal operation of online games and thus take the fairness and fun away from the whole experience. A simple and effective approach to automatically tell humans and machines apart will prove valuable for the aforementioned scenarios.

當我們讚頌和享受自動化方便的同時,應該小心謹慎科技的濫用所導致不受歡迎的後果。舉例來說,免費的e-mail帳戶所登記的數量已經擴散到一個範圍,讓主要部門已經採納可信的反自動化對策去破解。而自動化機器人程式以外掛的形式利用正常指令嚴重介入於線上遊戲,且將公平性和娛樂性遠離於所有的經驗。在上述的方案中,一個簡單而有效的通道自動證實人機辨識是有價值的。

CAPTCHA introduction (cont. 4)

(1.) 閱讀式Captcha(Reading-based Captcha)

我們在上一章所討論的Captcha是一種要求使用者辨識一張圖片當中的文字,這種類型的Captcha就是屬於閱讀式Captcha(如圖1.5)。此種Captcha的優勢在於,文字原本的設計目的就是為了要讓人類使用,而且絕大多數的人從小就開始學習它。在語言的隔閡上,若使用的是英文字母與阿拉伯數字,幾乎全世界的鍵盤上都有英文字母及阿拉伯數字,使用者就算看不懂符號所代表的意思,也可以對照著鍵盤上的符號輸入答案。在安全上,相較於圖形式Captcha可能只有數十種答案,使用大寫英文字母,以及阿拉伯數字的Captcha,三個字母長度的字串就有約四萬六千組可能的解答。最後,又因為光學文字辨識以及人類視覺的認知方法都是相當著名的研究領域,Captcha的設計者比較容易根據相關的資料找出分辨人類及電腦的方法[4]。但是,在光學文字辨識的領域不斷進步下,此種類的Captcha也勢必將面臨到嚴峻的考驗。

Reading-based CAPTCHA

In the previous chapter, we showed a CAPTCHA consisting of character images and a text box; this kind of CAPTCHA is called a reading-based CAPTCHA (as shown in Fig. 1.5). Its advantage is that written characters were designed for human use, and most people have been learning them since childhood. As for the language gap, if English letters and Arabic numerals are used, even a user who does not know English can match the symbols against those on the keyboard to enter the answer. As for security, whereas an image-based CAPTCHA may offer only a limited number of answers, a reading-based CAPTCHA that is three characters long has more than 40,000 possible answers. Finally, because OCR is a well-known research field, CAPTCHA designers can draw on that literature to design more secure systems. However, continued progress in OCR will also pose a serious challenge to this kind of CAPTCHA.
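
The "more than 40,000" figure can be checked with a short calculation. The sketch below assumes the alphabet named in the Chinese text above, namely 26 uppercase English letters plus 10 Arabic digits, with every string equally likely; `answer_space` is a hypothetical helper name.

```python
import string

# Assumed alphabet: uppercase English letters plus Arabic digits (36 symbols).
ALPHABET = string.ascii_uppercase + string.digits

def answer_space(alphabet_size, length):
    """Number of distinct strings of the given length over the alphabet."""
    return alphabet_size ** length

print(answer_space(len(ALPHABET), 3))  # 36 ** 3 = 46656
```

A bot guessing uniformly at random would therefore succeed about once in 46,656 tries under these assumptions, compared with the handful of choices an image-based CAPTCHA may offer.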

Sunday, December 10, 2006

CAPTCHA introduction (cont. 3)

聲音式Captcha(Sound-based Captcha)

由電腦發出聲音,要求使用者進行聲音的辨識。通常是播放英文字母,以及英文的阿拉伯數字,這對某些在色彩或形狀上有知覺障礙的人無疑是項福音,但是這類的Captcha卻仍舊無法避免語言上的隔閡,當問題不是以熟悉的語言描述時,使用者有可能無法順利通過此類Captcha的測試。雖然聲音的Captcha無法以圖片方式呈現在論文中,但圖1.4中箭頭所指的符號,便是Google所使用的聲音式Captcha按鈕。

Sound-based CAPTCHA:

This kind of CAPTCHA plays a sound sequence and asks the user to recognize the numbers and words in it. In general, the sequence uses numbers only. This kind of CAPTCHA undoubtedly helps people with visual impairments, but it cannot bridge the language gap either: when the sequence is spoken in a language the user does not know, the user may not be able to pass the test. A sound sequence cannot be reproduced in a paper, but the mark indicated in Fig. 1.4 is the button for Google's sound-based CAPTCHA.

明慧--Abstract

Abstract
The need to tell humans and machines apart has surged due to abuse of automated 'bots'. However, several textual-image-based CAPTCHAs have been defeated recently, calling for the development of new anti-automation schemes. In this paper, we propose a simple yet effective visual CAPTCHA test by exchanging the content of non-overlapping regions in an image. We give in-depth analysis regarding the choice of parameters and image database during the test generation phase. We also contemplate possible ways, including 1) random guess, 2) collect and match, 3) image segmentation, to defeat the proposed test and provide counter-measures when necessary. Preliminary experimental results have validated the efficacy of the proposed CAPTCHA, although we expect that a large-scale experiment to collect and analyze user responses will contribute to optimal parameter settings.

由於全自動化機器人程式的濫用,人機辨識的需求已經暴增。然而,近年來一些以貼圖影像為基礎的CAPTCHA已經被反自動化組合的發展所取代。在這篇論文當中,我們提出一個簡單而有效的視覺CAPTCHA方法,它是在一張影像中,選出沒有影像重疊的部分做交換。我們在產生CAPTCHA的階段,根據參數的選擇及影像資料庫作深入的分析,同時也考量一些可能的方法去破解所提出的CAPTCHA,並且提供數據做為參考,而破解的方法包括1)隨機猜測,2)收集和比較,3)影像分割。初步實驗結果已經有效證實所提出的CAPTCHA,而我們期望有一個大規模的實驗來收集和分析使用者的使用結果,進而提供最佳的參數環境。

CAPTCHA introduction (cont. 2)

目前網路上較常見的Captcha又可以大略的分成以下三種類型:
圖形式Captcha(Image-based Captcha):
圖1.3是此種Captcha的一個範例,它會顯示一至數張圖片,要求使用者回答相關的問題。較簡單的作法是將具有某樣特定景物的圖片做輕微的扭曲,之後詢問使用者:這是張什麼東西的照片?或是將數張具有共同元素的圖片放在一起,詢問使用者:這些圖片的共通點在哪裡?雖然電腦很難去分辨一張照片裡到底是什麼東西,但是這種Captcha卻會受到語言的隔閡。舉例來說:當系統要求使用者輸入英文時,對於某些不太懂英文的使用者來說,語言就是道很深的鴻溝。除此之外,同樣的東西在每個人的主觀認定上也可能有不同的結果,我認為那是塊牛排(Steak),你可能認為那只是塊肉(Meet)。因此,圖形式Captcha在系統的設計上,通常會增加一個下拉式選單讓使用者選取其中的標準答案,但如果系統只有三十個標準答案,那電腦就有1/30通過Test的機會,這將大幅降低Captcha的安全性。


Three types of CAPTCHA are commonly found on the Internet.
Image-based CAPTCHA:
Fig. 1.3 shows a sample of this kind of CAPTCHA. It presents one or more pictures and asks a question about them. A simple method is to warp an image slightly and ask the user to identify what it shows. Another method presents several pictures that share a common element and asks the user what they have in common. Although this kind of CAPTCHA is hard for a computer to solve, a human user may be stopped by the language gap. For example, when the system asks an American who does not know Chinese to enter a Chinese word, the user may fail the test. In addition, different people may give different answers for the same picture: you may call it meat, while I call it a steak. For this reason, the system may add a combo box from which the user selects the correct answer. However, the limited number of answers reduces security: if the combo box contains only 30 answers, a bot has a 1/30 chance of passing the test by guessing.
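
The 1/30 figure above also shows how quickly repeated guessing pays off. The sketch below assumes that each attempt is an independent, uniformly random pick from the combo box and that the bot is allowed to retry; neither assumption comes from the text, and `guess_success` is a hypothetical name.

```python
def guess_success(num_answers, attempts=1):
    """Probability that random guessing passes at least once.

    Assumes independent attempts, each a uniform pick among
    `num_answers` choices.
    """
    miss = 1 - 1 / num_answers     # chance that a single guess is wrong
    return 1 - miss ** attempts    # complement of failing every attempt

# One guess against a 30-entry combo box succeeds with probability 1/30.
print(guess_success(30))
# After 30 independent tries, the chance of at least one success is much higher.
print(guess_success(30, attempts=30))
```

Under these assumptions, limiting the number of retries per client matters as much as the size of the answer list.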