But I know it when I see it...
A Swedish investigative journalism "show" called "Insider" is currently running an investigation into the money behind the porn industry. I call the program a "show" because it is not exactly unbiased journalism. The first episode gave a little background on the history and started to follow the money. One of the points was that Internet Service Providers (ISPs) make a bundle out of this. I could almost buy their point if it wasn't for their "expert": Per Hellquist. I'll translate a quote for you: "It is possible to screen pictures for skin colored pixels, and from there determine if a picture is pornographic or not!". This was said in the context of it being feasible for ISPs to intercept porn traffic by this method. This is so wrong on so many levels.
To start off, Per works at Symantec in Sweden. Nothing wrong with working at Symantec (or is there?), but it raises the question of how he got his job in the first place. He lacks some fundamental knowledge about how the internet works, and let's not forget some common sense.
Let's try to break down what is wrong with the statement.
1. When you transfer a file on the internet, the file is divided into chunks called packets and routed through the network!
This means that to analyze a picture, all packets of that picture must be captured (and a packet must be requested again if it was damaged during transport) and then assembled into the file. Only after that can you start your analysis of the picture. But if you merely capture the packets in transit, they have already been forwarded to the customer, so by the time the analysis is done it is too late to stop the transfer! To solve these problems we must somehow proxy the transfer, holding the whole picture back until it has been checked. If by some luck we succeed in capturing and analyzing the content of the image, the next problem rears its ugly head.
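To make the reassembly point concrete, here is a minimal sketch in Python. It is not how a real TCP stack or ISP proxy works; the function names `split_into_packets` and `reassemble` are made up for this illustration, and real traffic adds retransmission, compression and encryption on top.

```python
import random


def split_into_packets(data: bytes, size: int):
    """Cut data into (offset, chunk) pairs, a very simplified stand-in for packets."""
    return [(offset, data[offset:offset + size])
            for offset in range(0, len(data), size)]


def reassemble(packets, total_length: int):
    """Return the original bytes, or None while any chunk is still missing."""
    received = {}
    for offset, chunk in packets:
        received[offset] = chunk  # a duplicate packet simply overwrites itself
    data = b"".join(received[offset] for offset in sorted(received))
    return data if len(data) == total_length else None


# A fake 1024-byte "picture", sent as 100-byte packets that arrive out of order.
image = bytes(range(256)) * 4
packets = split_into_packets(image, 100)
random.Random(0).shuffle(packets)

# With one packet still in flight there is nothing to analyze yet.
print(reassemble(packets[:-1], len(image)) is None)
# Only once every packet has arrived do we get the picture back.
print(reassemble(packets, len(image)) == image)
```

The filter cannot say anything about the picture until the last packet is in, which is exactly why sniffing traffic in passing is not enough and something has to sit in the middle and buffer the whole file.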
2. Picture analysis is not cheap!
It takes enormous amounts of memory and processing power to keep up with just the HTTP traffic of a connection. But these companies are usually rich, and they could actually charge a slightly higher fee for a slower connection just to filter it for the customers. Which customer would not like to pay more and get less content???
3. Picture analysis is hard!
Really, that's why it is so computationally expensive. First of all, what is skin color? Skin tones differ from person to person and with race. That makes it a little harder, but let's assume that we can solve this problem. How would the computer determine whether an image comes from a lingerie webshop or a porn page? By the number of pixels that are skin colored? The problem here is actually very well known. A human can tell what is smut and what is not, but even then it is a matter of personal taste. A computer has a very hard time "seeing the difference".
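Just to show how little "count the skin colored pixels" tells you, here is a rough sketch of such a filter in Python. Everything in it is an assumption made for illustration: the RGB rule of thumb in `looks_like_skin`, the 50% threshold, and the three synthetic "pictures" are mine, not anything Per or Symantec described.

```python
def looks_like_skin(r: int, g: int, b: int) -> bool:
    """Crude RGB rule of thumb for 'skin colored'; an assumption, not a standard."""
    return (r > 95 and g > 40 and b > 20
            and max(r, g, b) - min(r, g, b) > 15
            and abs(r - g) > 15 and r > g and r > b)


def skin_ratio(pixels) -> float:
    """Fraction of pixels that the rule above considers skin colored."""
    pixels = list(pixels)
    if not pixels:
        return 0.0
    return sum(looks_like_skin(*p) for p in pixels) / len(pixels)


def naive_is_porn(pixels, threshold: float = 0.5) -> bool:
    """The 'expert' method: flag a picture when enough pixels look like skin."""
    return skin_ratio(pixels) >= threshold


# Synthetic 100-pixel "pictures" with made-up colors.
LIGHT_SKIN, DARK_SKIN = (224, 172, 105), (60, 40, 30)
SAND, BLUE_SKY = (210, 180, 140), (30, 60, 200)

passport_photo = [LIGHT_SKIN] * 70 + [BLUE_SKY] * 30  # an innocent close-up of a face
desert_holiday = [SAND] * 80 + [BLUE_SKY] * 20        # no people in it at all
dark_skinned = [DARK_SKIN] * 70 + [BLUE_SKY] * 30     # same framing, darker skin

print(naive_is_porn(passport_photo))  # flagged
print(naive_is_porn(desert_holiday))  # flagged
print(naive_is_porn(dark_skinned))    # sails straight through
```

A face and a sand dune get blocked while the same picture with darker skin passes untouched, and none of it says anything about what the picture actually shows.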