More answers from Phorm

Phorm have finally got back to me with answers to most of the remaining 33 questions that I asked.

I’m going to add a few notes below various answers; they’ll be in brackets in bold.

Q20. The report states that your system ignores “form fields” yet you claim that you will be collecting information regarding what people search for on the internet via search engines. The box people write their search queries in is a “form field” which appears to contradict the claim in your privacy audit report, can you clarify this situation?

A20. This could be clearer: we obtain search terms from GET submissions to known search engines. All other form fields are ignored.
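As a rough illustration of what A20 describes, here is a minimal Python sketch of pulling a search phrase out of a GET request to a known search engine and ignoring everything else. The host list, the query parameter names and the function name are all my own assumptions for illustration; Phorm have not published theirs.

```python
from urllib.parse import urlsplit, parse_qs

# Hypothetical list: the real set of "known search engines" is not public.
# Maps a host name to the query parameter assumed to carry the search phrase.
KNOWN_SEARCH_ENGINES = {
    "www.google.com": "q",
    "search.yahoo.com": "p",
}


def extract_search_terms(method, url):
    """Return the search phrase for a GET to a known search engine, else None."""
    if method.upper() != "GET":
        return None  # POSTed form fields are ignored entirely
    parts = urlsplit(url)
    param = KNOWN_SEARCH_ENGINES.get(parts.hostname)
    if param is None:
        return None  # not a known search engine: ignore the query string
    values = parse_qs(parts.query).get(param)
    return values[0] if values else None
```

Note that under this reading a search box is only "ignored" when its form uses POST or belongs to a site that is not on the list.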

Q21. The report states that data will be immediately purged from the system but “Research and debug logs may be kept on a separate system for a maximum of 14 days”. What is the nature of this separate system? As Phorm have stated that all the kit will be located within the ISP’s infrastructure, how is this data transferred externally for research and debugging, or, if this separate system remains located within the ISP’s infrastructure, what is the arrangement for allowing representatives of Phorm to access it?

A21. This item is a hangover from the previous attestation and needs updating. Logs contain only system health and error information. Browsing data is not included.

Q22. It has been claimed that this system would have the ability to throttle an internet user’s connection if they had opted out of the service, but the report claims you “Do not tie into the authentication systems of our ISP partners”. If this is the case, then how could a user’s connection be throttled if the system didn’t know, through the ISP’s authentication systems, which users had not subscribed to it?

A22. We do not tie into the ISP’s authentication system. Phorm cannot and has not claimed to throttle data. This was a speculation in an article, not originated by Phorm or the ISPs.

Q23. The report states “We offer an easy, anonymous method for users to opt out of Phorm’s systems if they would rather not receive targeted advertising and content. For as long as a user retains the Phorm opt-out cookie, the system will not collect or store data on their browsing behavior.” However, if this is the case, then how does the ISP know whose online browsing to send to the proxy server for scanning and whose not to, if it isn’t in any way tied to the authentication systems of the ISP?

A23. The ISP’s system inspects the cookie and handles the user accordingly. This is browser-based and does not require integration with the authentication system.

Q24. Following on from this, if a user has an “opt-out” cookie, does this mean that somewhere along the line, at the ISP level, the system checks to see if this “opt-out” cookie is present? If so, what would happen if a user had simply barred all cookies from OIX.NET as per the instructions on your website?

A24. If you block cookies from webwise.net (renamed from oix.net) you will be treated as if opted out. We are advertising this method as one that survives cookie cleaning, but it is not supported in all browsers.

Q25. The report states “Because of inherent limitations in controls, error or fraud may occur and not be detected. Furthermore, the projection of any conclusions, based on our findings, to future periods is subject to the risk that the validity of such conclusions may be altered because of changes made to the Service or controls, the failure to make needed changes to the Service or controls, or a deterioration in the degree of effectiveness of the controls.” It is of course standard practice to drop a get-out clause into any evaluation of a system, and it’s fair to say that no system can ever 100% guarantee that things can’t go wrong. However, if something did go wrong, what is the nature of your liability and/or insurance to compensate those affected?

A25. This is the auditor’s disclaimer, not ours – and I am not a lawyer! But seriously, it’s hard to imagine an event that would require us to compensate anyone. If, for example, someone hacked into our system to get access to the data, they would be very disappointed: we simply don’t have the data — only product categories, a timestamp and a random number. Our safe, as it were, is empty. The AOL / Netflix accidental disclosure of masses of personal data could not happen with our system.

Q26. A quick break and sartorial comment, not intended in any way as implied criticism of Phorm: why does such a respected company as Ernst & Young, who go around headhunting top graduates, employ someone with such poor English grammar as to start a sentence with the word “because”?

A26. Hmm. Perhaps grammar is different in the US? ‘Think Different’ (adverb vandalism!) springs to mind.. (btw we all use Macs so no slight intended to the great Mr Jobs)..

Q27. According to the report, the system employed by Phorm does not store data with a sequence of numbers of more than three, to avoid picking up credit card details. However, many URLs contain more than three sequential numbers. As these URLs are passed to Phorm’s proxy server, it will store them, no matter for how short a period, even if they are only held in RAM. So does the system ignore URLs with sequences of three or more numbers?

A27. Would you say the data was stored on an ethernet cable (for no matter how short a time)? We are clear that this kind of raw data is never stored on disk and is deleted from memory in real time. The system is not proxy-based – data capture is offline.

(A bit of a master class in avoiding the actual question. No, data isn’t stored in an ethernet cable; don’t be daft. We’re not talking about length of storage, we’re talking about whether the system ignores URLs with a sequence of three or more numbers. A simple yes or no answer would have been sufficient.)
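For what it’s worth, the check being asked about is not complicated. Below is a minimal Python sketch of a filter that flags any text, a URL included, containing a run of more than three digits, which is how I read the rule in the audit report. The function name and the exact threshold are my assumptions; whether Phorm apply anything like this to URLs is precisely what remains unanswered.

```python
import re

# "More than three" digits in a row, i.e. four or more (assumed reading of the report).
LONG_DIGIT_RUN = re.compile(r"\d{4,}")


def contains_long_number(text):
    """Return True if the text contains a run of four or more consecutive digits."""
    return LONG_DIGIT_RUN.search(text) is not None
```

A dated article URL such as one ending in /20080318 would be flagged by this rule, which is why a yes or no matters: either such URLs are dropped, or the digit rule does not apply to URLs at all.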

Q28. Can you explain how this relationship with the cookies really works? In the case of an opt-out cookie provided by Phorm, does the ISP actively scan for the presence of this cookie on an individual user’s PC, and if it does, how can it differentiate, in terms of service provision, between that PC on a network and another PC that may have an opt-in cookie?

A28. The ISP does not scan the PC: the cookie is detected in HTTP requests sent by the PC. On this basis the requests can be handled differently. This is browser-specific and so the sharing of a network is not important.
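To make A28 (and A24) concrete, here is a minimal Python sketch of deciding how to handle a request purely from the Cookie header the browser sends, with no reference to who the subscriber is. The cookie names and the function name are hypothetical, and treating a request with no Phorm cookie at all as opted out is my reading of A24 rather than a documented rule.

```python
from http.cookies import SimpleCookie

# Hypothetical cookie names; the real ones are not given in these answers.
OPT_OUT_COOKIE = "OPTED_OUT"
PROFILE_COOKIE = "UID"


def should_profile(headers):
    """Decide from the request headers alone whether to profile this request."""
    cookie_header = headers.get("Cookie")
    if not cookie_header:
        return False  # cookies blocked or absent: treated as opted out (per A24)
    jar = SimpleCookie()
    jar.load(cookie_header)
    if OPT_OUT_COOKIE in jar:
        return False  # explicit opt-out cookie present
    return PROFILE_COOKIE in jar  # only profile browsers carrying the random ID
```

Because the decision hangs on what each browser sends, two browsers on the same PC, or two PCs behind the same router, can end up being treated differently.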

(I’m curious on this point. If the network is truly irrelevant and this is wholly browser-based, are Phorm saying that even in, for example, companies or homes with more than one computer, each of which may run more than one web browser, every individual browser will have to be configured manually? If so, what additional costs are going to be incurred by companies and organisations to send techies round to block every single browser? Or is this not quite factually accurate, in that cookies could be barred at either the router or server level and all computers behind that would be protected?)

Q29. Does the system scan all unencrypted HTTP requests, including online e-mail services and private social networking sites such as Facebook, and if it doesn’t, what is the system in place to allow it to differentiate between these sites and other HTTP sites?

A29. We maintain a list of webmail sites and we do not analyze their pages. In any case the content of all sites is protected by the way the system works:
it takes a ‘top 10’ of the repeated keywords from the page and matches them against a list of advertising categories, then throws the keywords away. The categories (“Channels”) are policed to ensure they do not contain personal information or match sensitive behaviours such as medical or porn. This means that unless a word from a page is a) repeated b) is one of the top 10 and c) is found in a legitimate list of advertising keywords, then it is ignored. This means that personal information cannot be matched and it passes unnoticed by the system.
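The three conditions in A29 can be sketched in a few lines of Python. This is only my illustration of the description given: the channel names, keyword lists and function name are invented, and a real system would presumably also strip common words such as “the” before counting.

```python
import re
from collections import Counter

# Hypothetical channels and keyword lists, purely for illustration.
CHANNELS = {
    "travel": {"flights", "hotel", "holiday"},
    "cars": {"car", "diesel", "hatchback"},
}


def match_channels(page_text, top_n=10):
    """Return the channel names matched by the top repeated words on a page."""
    words = re.findall(r"[a-z]+", page_text.lower())
    counts = Counter(words)
    # a) the word must be repeated, and b) it must be among the top_n repeated words
    repeated = [word for word, count in counts.most_common() if count > 1][:top_n]
    # c) the word must appear in an advertising keyword list
    matched = {name for name, keywords in CHANNELS.items() if keywords & set(repeated)}
    return matched  # the words themselves are discarded; only channel names remain
```

In this sketch a repeated name such as “john” never leaves the function, because it is not in any keyword list. The protection is therefore only as good as the policing of those lists.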

(This is interesting. The basic answer is no, the system can’t differentiate between private areas of non-encrypted HTTP sites, so it will scan people’s web-based e-mails, private areas on social networking sites and, from my own personal perspective, the backend area of my blog. Having a list of sites not to scan, presumably popular ones like Google Mail, Hotmail and Yahoo, is no guarantee, because the number of web-based e-mail systems, forums, social networking sites, web-based company intranets and blogs runs into the millions. At the very least this list should be made public and anyone should be able to add their site, forum, company intranet or blog to it. That does of course mean additional costs in terms of time for the people who have to do this. I wonder if there’ll be any compensation forthcoming?)

Q30. Leading on from that, the architecture of the system appears to suggest that when an HTTP request is sent to the ISP it is then passed on to a proxy server for analysis. Is the content of the URL entered passed on too, given that in the case of various community sites it may contain personally identifiable information?

A30. Please see A29 above.

Q31. At which point in the ISP’s system is the HTTP request passed on to the proxy server? In particular, is it at the Domain Name Server stage, and if so could an end user change the default settings on their router to use another DNS not from their ISP, like OpenDNS, to avoid the sites that they visit being scanned?

A31. The system is not proxy-based – data capture is by traffic mirroring, so changing DNS will have no effect.

(So this pretty much seals the argument that there is in effect nothing whatsoever that an end user can do to stop this system apart from blocking/using an opt-out cookie).

Q32. With this system there are various “bits” of data flying backwards and forwards in addition to the normal data flow across an ISP’s network, specifically on the connection between the end user and the ISP. For people on fixed limit connections, will these packets of data be discounted from the limits agreed in their contracts or comprise part of their monthly allowance?

A32. That is a matter for your ISP, but the amount of data is tiny.

Q33. As far as can be concluded from the technical data available, when a website is returned from the ISP to the end user it will have custom Javascript embedded into it to update information on the cookie held on the user’s computer. Speaking as a web publisher myself, have you had any kind of evaluation undertaken as to the legal position regarding copyright? Although people may see exactly the same page, they will be receiving code that the original author did not intend. Are you going to offer an opt-out system for web publishers that do not wish for this code to be embedded into returns from their sites?

A33. The only way that Phorm javascript (aka an ad tag) will appear in a publisher’s page is if they have put it there because they are working with us.

(Just a note, and I’m open to a categoric denial that this will be included within the envisaged system planned with ISPs, but this is the relevant bit of the patent:
“[0035] At 206, the method includes ISP-initiation of context reading of the response data received in response to web page requests. The ISP-initiation of the context reading function may be performed by causing the context reader to be applied from the ISP to requested web page data. In particular, in FIG. 3, context reader 40 may be stored in a memory location at ISP 14, for example on a server (e.g., a proxy server) or network appliance that manages traffic through the ISP. In the present example, context reader 40 is a javascript that is embedded or injected by the ISP into response data 122, for example by the proxy server. As a result, the javascript (context reader 40) is embedded into web page 34. In typical implementations, the script is embedded into each of a plurality of pages that are requested by the client device.”)

Q34. In the case of partner advertising companies that will have Javascript embedded into their sites to search for profiled data from the cookie located on an end user’s computer, how can you prevent that site from linking up the profiled data from the cookie with the user’s IP address if they run another statistical package that logs IP addresses? That would allow others to link profiled data to IP addresses, when keeping the two apart is one of the claimed privacy gold standards of your system.

A34. Need to clarify the question: is it about advertisers or publishers? If you browse to a website you give them your IP address directly.

(There are two elements to this question. The first is quite simple and is based on this particular part of the patent description of the technology: “[0028] Regardless of the particular data in browsing information 42, or the manner in which it is collected, the browsing information may be reported out to advertising server system 18 via Internet 12. System 18 is configured to receive browsing information 42 and use such browsing information to select context-specific advertising content 80 (such as advertisement 82) to be returned to the browser that generated the browsing information.” It means that on encountering a site with Phorm’s javascript embedded into it, such as a partner advertiser, that Javascript will take the profiled information from the cookie and send it to an advertising server somewhere. As it says “via Internet”, this model appears to suggest the server is not within the ISP, although I’m happy to accept you’ve dropped this approach, but people will simply have to trust Phorm’s word on that. That advertising server then sends profiled advertising to the site for the end user to see. So at this point, what is to stop a partner advertiser running malicious code to extract the profiled data and then hook this information up with the end user’s IP address? It must be noted that Phorm’s system has been specifically designed to take the IP address out of the loop, so why leave this possibility open?)

Q35. You state that the only information that will be collected are search term phrases and categories but according to the technical aspects of the patent application for your technology it allows for the collection of almost any kind of information including IP addresses. To what extent has the system been modified to disallow it from collecting such information that it is capable of and how can you guarantee that in the future it may not be modified to do so?

A35. The patent envisages many applications, most of which have not been implemented. The current system has no disabled functions waiting to be enabled, and your best guarantee about future systems is that they will be handled with the same transparency as this.

(I’m not going to be sarcastic but I’m sure some people may possibly find the statement on transparency amusing)

Q36. In the case of categories, the patent application states that innumerable categories and sub-categories thereof can be created. You give examples of things like travel, sport, cars etc. Do you intend to openly publish the categories and sub-categories thereof that your system is scanning people’s web browsing for?

A36. Some categories (“Channels”) will be private for reasons of commercial confidentiality, but many will be open (and created under a wikipedia-like environment). However, ALL channels will be vetted for compliance, and will not contain personally-identifiable information or sensitive material.

Q37. What is the geographic location of the proxy server? Is it located within the ISP’s network or externally?

A37. The system is not proxy-based – data capture is offline. Browsing data is all processed within the ISP network.

Q38. If the proxy server is located externally, where is it (nearest town will do, or in the case of there being more than one, nearest towns)?

A38. Please see Q37 above.

Q39. If the proxy server is located within the ISP’s network, then what is the procedure for updates and reconfiguration, or for fixing it if it goes wrong? In particular, will someone from Phorm have the ability to remotely connect to this proxy server to change settings, or will Phorm have staff based within every ISP “minding the box” who will make changes and fix things as and when they arise?

A39. Support arrangements will depend on the ISP contract.

(I think it’s important to note that this question hasn’t been answered. It is of course highly important how, and in what way, Phorm are able to access, change or reconfigure their equipment within ISPs. A simple “we’re going to have remote access” or “we’re not” would be handy. Or “we’re going to have our own staff based within the ISP to do this under their supervision” or “we’ll have no one at all with access to this kit in the ISPs and we’ll just advise their staff” would be far more enlightening.)

Q40. If the proxy server exists outside of the ISP’s network and data is merely transmitted to it so that it can analyse web pages and return custom Javascript, how does this conform with the provisions of RIPA?

A40. Please see A37 above.

Q41. So we can have an understanding of the capabilities of your system, can you tell us what make and model of hardware is going to form the proxy server set-up?

A41. No, sorry, that would be commercially confidential.

Q42. If it has, why has this service been set up as an “opt-out” service rather than an “opt-in” service? If the benefits to the consumer were so compelling then surely everyone would wish to “opt-in” to it, wouldn’t they?

A42. We are offering users a choice. They can opt out or in at any time. It’s worth noting that the very first thing you will see when you go online after the technology has been deployed is a full-page notice and at that point you can decide to opt out. In line with our commitment to transparency, you will see banner ads saying that Webwise is on. So if you don’t want it, you will be able to click on these ads and switch them off.

(Just to note, that didn’t actually answer the question of why this is not ‘opt-in’ by default.)

Q43. Some of the ISPs already quoted by your company as having signed up to this service have issued statements on their sites pointing to the benefits of the anti-phishing technology of the system in making the internet safer for users. Can you tell us what additional protection against phishing Phorm’s technology gives the end user that is not already present in the two most commonly used browsers, Internet Explorer and Firefox?

A43. Being network based, it covers people who do not have the latest browser versions, or have not enabled the anti-phishing features, or have misconfigured it.

(OK, I know this is sarcastic but the answer is presumably sod all benefit to the end user whatsoever unless you’re completely stupid in which case you’re probably best off not using the internet in the first place).

Q44. Leading on from the last question, if Phorm or its partners have additional knowledge of phishing sites that the maintainers of Firefox and Internet Explorer do not, then why do Phorm and their partners, in the altruistic nature of trying to make the internet safer for everyone, not simply hand over this knowledge to Microsoft or Mozilla, or indeed try and sell it directly?

A44. We use commercial providers for our anti-phishing feeds. Some are the same as those used by Google and Microsoft, some are different and have different coverage.

(Note: no answer to the question of why, if you have additional knowledge of phishing sites, you don’t just give it away or sell it on.)

Q45. The patent-pending application for the technology behind this system gives an example in which, if an end user wants to download a large file, say a music file, the system has the ability to send an advertisement, presumably a pop-up akin to a television advertisement, before the download takes place, thus attempting to extract advertising revenue to offset the higher bandwidth that the user may be consuming. Can you confirm that such a capability of this system will not be implemented?

A45. The system does not have the capability at the moment, and if the ISPs are able to gain a reasonable revenue from participating in the online ad market through Phorm, then it should never be necessary.

(If this was a politician’s answer then it would be ripped apart. The key phrase is ‘at the moment’, akin to the often stated ‘we have no plans to’ or ‘I cannot envisage a situation where we might do this’. OK, to be fair to Phorm they’ve got to leave themselves open to doing more things in the future, but it seems fairly clear from the second part of the answer that if the ISPs don’t make enough out of the current arrangement we could well be seeing adverts before downloading files, or the other possibility laid out in the patent application, pop-up advertising between page loads. Pop-ups that presumably could not be stopped.)

Q46. When I inadvertently left a bit of Javascript active in my blog post when I copied and pasted the technical elements of your system, I noticed some interesting behaviour when my page loaded. When the page was loading it looked elsewhere for information. Although I accept it was probably a test server and will not have the capabilities of a production grade operation, it delayed the load time of my site. With that in mind, as far as I can tell it was looking for information from an external source. If this was a working system in place and I came across a partner advertiser’s site with your system’s Javascript embedded, where would it look for information? My reading of this is that the Javascript would look for profiled information on the cookie on my computer, then port that information to another server which would then provide the targeted advertising and insert it into the page that is loading. If this is the case, where is this server going to be located? And if it is a core Phorm server, how does this not both send information to an external location outside of my ISP and, if the connection is direct between my browser and this server, allow the profiled data on my cookie and my IP address to be put together?

A46. We’ll get back to you this week with an answer.

(Just to note, it’s very similar to question 34, for which additional clarification was requested.)

Q47. Will Javascript be embedded into every page that I load irrespective of whether I opt out of this system or simply block cookies from OIX.NET?

A47. Phorm ad tags contain javascript but they will only appear in a page where the website has placed them there. If you are opted out, you will not see a relevant ad, but you are likely to be shown the original, probably less relevant ad by the website anyway.

Q48. Can you confirm or categorically deny that your system was trialled in 2007 with BT?

A48. No.

(I’m not sure whether this is a ‘no we didn’t trial the system with BT in 2007’ or ‘no we can’t confirm or deny it’. Anyway, according to here, here, here, and here, the answer is yes: it was trialled with BT in 2007 without customers being informed, which is presumably why some of them are now planning to sue BT.)

Q49. Can you tell us when this system is due to go live with the three ISPs already mentioned on your site as having signed up for this?

A49. No, the ISPs will be communicating directly with their customers so look out for the messages…

Q50. Can you tell us when the Javascript will begin to be embedded on your partner advertisers’ sites (Guardian and Financial Times)?

A50. No, but it’s worth pointing out the “javascript” is nothing more sinister than an ad tag, similar to most others on the market. The difference is in Phorm’s ability to serve a relevant ad into the space on the page.

Q51. Will you be making public and publishing a list of partner advertising sites?

A51. The PR team may!

Q52. Can you tell us at what time various ISPs will be running trials on this system prior to full scale implementation?

A52. No, they will be communicating directly with their own customers – does this mean you have more than one ISP yourself?

(Not quite sure what the question about me having more than one ISP is about but no, why would I?)

18th March 2008 in Techie Stuff

2 Responses to “More answers from Phorm”

  1. Jonah responded on 20 Jan 2009 at 9:50 pm #


    A50. No, but it’s worth pointing out the “javascript” is nothing more sinister than an ad tag, similar to most others on the market. The difference is in Phorm’s ability to serve a relevant ad into the space on the page.

    Because Phorm & the ISP are combining & colluding in a “Man in the Middle Attack” on both the Web User & the Website Owner!

    And Javascript of any sort “is active code” & in theory it is as adaptable as our DNA Structure!

  2. Bart Lansing responded on 29 Jul 2009 at 2:55 pm #

    Alright, I know I am necro’ing this conversation but I’ve just tripped over it. Apologies, etc.

    A Slashdot conversation brought another point which you may or may not have touched on elsewhere…copyright infringement by Phorm. One opinion in the Slashdot discussion holds that by inserting ads into requested pages Phorm would be creating a derivative work and profiting from that action. Is Phorm only doing ad inserts into pages from sites they are partnered with?
