jamo

ColdFusion & jsoup HTML Whitelisting/Cleaning

Apr 3rd, 2012
<!--- Requires jsoup - http://jsoup.org/ (the jar must be on the ColdFusion classpath) --->
<!--- Loops over an RSS query object and sanitizes the HTML in a field named "Description" --->
<!--- Note: newer jsoup releases rename Whitelist to org.jsoup.safety.Safelist --->
<CFSET jsoup = CreateObject("java", "org.jsoup.Jsoup")>
<CFSET Whitelist = CreateObject("java", "org.jsoup.safety.Whitelist")>

<CFOUTPUT QUERY="Feed">
    <cfscript>
    // Parse the field into a jsoup Document
    TheHTML = jsoup.parse(Description);
    // Remove the first advertisement block in the feed item, if there is one.
    // first() returns null when nothing matches, so check before chaining.
    AdLinks = TheHTML.select('p a[href*=doubleclick]');
    if (NOT AdLinks.isEmpty()) {
        AdLinks.first().parent().remove();
    }
    HTMLFragment = TheHTML.body().html();
    // Sanitize the HTML using the "basic" whitelist
    HTMLFragment = jsoup.clean(HTMLFragment, Whitelist.basic());
    // Update the query row inline
    Feed.Description[CurrentRow] = HTMLFragment;
    </cfscript>
</CFOUTPUT>

<cfdump var="#Feed#">
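
<!--- A minimal sketch, not part of the original paste: running one string through several of jsoup's built-in whitelists to see how strict each is. --->
<!--- Assumptions: SampleHTML and the Results struct are made up for illustration, and the jsoup jar is on the ColdFusion classpath. --->
<CFSET jsoup = CreateObject("java", "org.jsoup.Jsoup")>
<CFSET Whitelist = CreateObject("java", "org.jsoup.safety.Whitelist")>

<cfscript>
// Hypothetical input containing a script tag and an inline event handler
SampleHTML = '<p>Hello <b>world</b> <script>alert(1)</script><a href="http://example.com/" onclick="x()">link</a></p>';

Results = StructNew();
// none(): text only, all tags stripped
Results.None = jsoup.clean(SampleHTML, Whitelist.none());
// simpleText(): only simple text formatting such as b, em, i, strong, u
Results.SimpleText = jsoup.clean(SampleHTML, Whitelist.simpleText());
// basic(): the whitelist used in the paste above (text formatting, links, lists, and so on)
Results.Basic = jsoup.clean(SampleHTML, Whitelist.basic());
// relaxed(): a wider set of structural tags, including tables and images
Results.Relaxed = jsoup.clean(SampleHTML, Whitelist.relaxed());
// isValid() reports whether the input already conforms to a whitelist, without cleaning it
Results.IsValidForBasic = jsoup.isValid(SampleHTML, Whitelist.basic());
</cfscript>

<!--- Dump the results side by side to pick the least permissive whitelist that still keeps the markup you need --->
<cfdump var="#Results#">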