Monday, January 05, 2015

Find Orphan pages on website

You have a website and want to be sure it has no orphan pages (pages that no other page links to), but you are not sure how to check? Then you have come to the right place :). Let's talk about the most important steps.
I assume you already have a list of all the pages you want to verify (if not, building that list is your first task).

Here is the logic, with snippets:
  1. We need functionality that can extract HTML from a web page. Later we will scan it and get internal links.
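    // needs java.io.BufferedReader, java.io.InputStreamReader, java.net.URL
    private String getPageContent(String pageurl) throws Exception {
       StringBuilder buf = new StringBuilder();
       URL url = new URL(pageurl);
       // try-with-resources closes the reader (and its stream) even if reading fails
       try (BufferedReader reader = new BufferedReader(
             new InputStreamReader(url.openConnection().getInputStream()))) {
          String line;
          while ((line = reader.readLine()) != null) {
             // readLine() strips the line break; put one back so that
             // markup split across lines is not glued together
             buf.append(line).append('\n');
          }
       }
       return buf.toString();
    }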
  2. Add logic that works with the DOM. I use jsoup to manipulate HTML and really recommend it (it is easy and fast). The method below selects all links whose href begins with baseurl (the domain of your website); that way we drop all external links and keep only the internal ones.
    private List<String> getAllInternalLinks(String html, String baseurl) throws Exception {
       List<String> res = new ArrayList<String>();
       // attribute-prefix selector: <a> elements whose href starts with baseurl
       // (relative hrefs such as /about do not match it)
       String select = "a[href^="+baseurl+"]";
       org.jsoup.nodes.Document dom = Jsoup.parse(html);
       Elements links = dom.select(select);
    
       for (Element link : links) {
          String src = link.attr("href");
          res.add(src);
       }
       return res;
    }
  3. Now we must build a List with all internal links from all pages of the website.
    List<String> alllinks = getAllInternalLinks(html_from_all_pages, baseurl);
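  4. We need to check whether the page URL occurs more often in alllinks than in pagelinks (comparing the two counts avoids the case where a page links only to itself). If it does, some other page links to it and it is not an orphan.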
    private boolean isOrphan(List<String> pagelinks, List<String> alllinks, String url) throws Exception {
       // more occurrences site-wide than on the page itself means another page links here
       if (Collections.frequency(alllinks, url) > Collections.frequency(pagelinks, url)) {
          return false;
       }
       return true;
    }
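
To see how steps 3 and 4 fit together, here is a minimal sketch that runs without any network access or jsoup. It assumes the internal links of every page have already been extracted into a map (page URL to the links found on that page). The class name OrphanCheckSketch, the methods findOrphans and sampleSite, and the example.com URLs are made up for illustration; only the frequency comparison comes from the snippets above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class OrphanCheckSketch {

    // Same rule as in step 4: a page is an orphan when every occurrence of its
    // URL in the site-wide list comes from the page itself.
    static boolean isOrphan(List<String> pagelinks, List<String> alllinks, String url) {
        return Collections.frequency(alllinks, url) <= Collections.frequency(pagelinks, url);
    }

    // Hypothetical helper: linksByPage maps a page URL to the internal links
    // found on that page. Returns the URLs of all orphan pages.
    static List<String> findOrphans(Map<String, List<String>> linksByPage) {
        // Step 3: one list with the internal links of all pages.
        List<String> alllinks = new ArrayList<>();
        for (List<String> links : linksByPage.values()) {
            alllinks.addAll(links);
        }
        // Step 4: check every page against the site-wide list.
        List<String> orphans = new ArrayList<>();
        for (Map.Entry<String, List<String>> entry : linksByPage.entrySet()) {
            if (isOrphan(entry.getValue(), alllinks, entry.getKey())) {
                orphans.add(entry.getKey());
            }
        }
        return orphans;
    }

    // Made-up three-page site used only for this illustration.
    static Map<String, List<String>> sampleSite() {
        Map<String, List<String>> site = new LinkedHashMap<>();
        site.put("http://example.com/", Arrays.asList("http://example.com/about"));
        site.put("http://example.com/about", Arrays.asList("http://example.com/"));
        // This page links to itself and to the home page, but nothing links to it.
        site.put("http://example.com/old",
                Arrays.asList("http://example.com/old", "http://example.com/"));
        return site;
    }

    public static void main(String[] args) {
        System.out.println(findOrphans(sampleSite()));
    }
}
```

In this sample only http://example.com/old is reported: its URL appears once in the site-wide list, and that single occurrence is its own self-link.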

Wednesday, October 29, 2014

Formatting a URL link in Drupal using function l

Another nice function in Drupal is l().
It formats a URL link as an HTML anchor tag.
I know it looks really simple, but it is nice to have after building such links by hand many times.
$option = array();
$option['attributes'] = array('title' => $label);
// l() returns the markup: <a href="$uri-path" title="$label">$label_trimmed</a>
$link = l($label_trimmed, $uri['path'], $option);

Tuesday, October 28, 2014

Truncating a string on a word boundary in Drupal

Typical task: "truncate a string only on a word boundary and append a suffix to the result". Every developer has done it many times, I'm sure. Here is how Drupal solves it with its API function views_trim_text (provided by the Views module):
define('MAX_LENGTH', 30);

$alter = array(
    'max_length' => MAX_LENGTH,
    'word_boundary' => TRUE,
    'ellipsis' => TRUE
);

$mystring = "truncate a string only on a word boundary and add a some postfix to resulting string";
$mystring = views_trim_text($alter, $mystring); //result is 'truncate a string only on a...'

Monday, September 08, 2014

Issues when importing WSDL files into Web Service Consumer

Recently I ran into a WSDL that I couldn't import into a Web Service Consumer. Our consumer had worked well for the last 5 years, but that is a long time, and our service provider had been updated a lot in the meantime, so we decided to update our consumer as well. Guess whether everything went fine?
No WSDL was returned from the URL
I simply created a new Consumer in Domino Designer, set the URL to our WSDL, picked Java and clicked OK. Oops...
---------------------------
Domino Designer
---------------------------
No WSDL was returned from the URL:
https://api.ourserver.com/secure/api1/WebService?WSDL
---------------------------
OK
---------------------------
The requested operation failed: no import files
Wow, I thought :) Let's try importing the WSDL as LotusScript then (just to see whether the problem is related to Java).
---------------------------
IBM Domino Designer
---------------------------
The requested operation failed: no import files
---------------------------
OK   
---------------------------
Name too long
Hey, what? This WSDL is used by many other applications without any issues, so what is going on!? I downloaded the WSDL as a file to my local PC and tried to import it as LotusScript again. This time it went fine (apart from the Name too long issues). Well, great news anyway: at least everything works when the WSDL is a local file.


The Web Service implementation code generated from the provided WSDL could not be compiled, so no design element was created
OK, it worked for LotusScript, so now let's pick Java...
---------------------------
IBM Domino Designer
---------------------------
The Web Service implementation code generated from the provided WSDL could not be compiled, so no design element was created.  Please correct the WSDL and try again.  The errors are located in the following file:: C:\Users\dpa\AppData\Local\Temp\notes90C43B\47238811.err
---------------------------
OK   
---------------------------
OK, it's time to blame Designer and IBM! Why is it so difficult just to import a WSDL? None of the other applications that use the WSDL from our server had such issues. It's just not fair :). I found the error file, and in it a quite typical line: java.lang.OutOfMemoryError: Java heap space. I knew what to do: I increased HTTPJVMMaxHeapSize and JavaMaxHeapSize to 512M, restarted Designer/Notes and tried again. It worked well! After that I restored the original values of HTTPJVMMaxHeapSize and JavaMaxHeapSize.
The system is out of resources.
Consult the following stack trace for details.
java.lang.OutOfMemoryError: Java heap space
at com.sun.tools.javac.util.Position$LineMapImpl.build(Position.java:151)
at com.sun.tools.javac.util.Position.makeLineMap(Position.java:75)
at com.sun.tools.javac.parser.Scanner.getLineMap(Scanner.java:1117)
at com.sun.tools.javac.main.JavaCompiler.parse(JavaCompiler.java:524)
at com.sun.tools.javac.main.JavaCompiler.parse(JavaCompiler.java:562)
at com.sun.tools.javac.main.JavaCompiler.parseFiles(JavaCompiler.java:816)
at com.sun.tools.javac.main.JavaCompiler.compile(JavaCompiler.java:739)
at com.sun.tools.javac.main.Main.compile(Main.java:365)
at com.sun.tools.javac.main.Main.compile(Main.java:291)
at com.sun.tools.javac.main.Main.compile(Main.java:282)
at com.sun.tools.javac.Main.compile(Main.java:99)
at lotus.notes.internal.IDEHelper.compile(Unknown Source)
A simple thing, but it cost me a few hours. I hope it saves some time for other people.

Tuesday, July 08, 2014

Track events using google analytics via hitCallback

If you use Google Analytics to track clicks/events, at some point you may want to track form submits. The usual way to do that is the hitCallback function. It is easy to use, but many people forget to handle the case when the Google Analytics library is blocked (for example by extensions such as AdBlock or Ghostery). In that case hitCallback is never called, so a form that waits for it is simply never submitted.

Google analytics classic
jQuery(".form").on("submit", function(f) {
  var form = f.target; // the <form> element being submitted
  _gaq.push(['_set','hitCallback',function() {
    // native submit() does not trigger this handler again
    form.submit();
  }]);
  _gaq.push(['_trackEvent', '/signup']);
  // if ga.js is not loaded (e.g. blocked), return true so the form submits normally;
  // otherwise return false and let hitCallback submit it
  return !window._gat;
});
Google analytics universal
jQuery(".form").on("submit", function(f) {
  var form = f.target; // the <form> element being submitted
  ga('send', 'pageview', '/signup', {
    'hitCallback': function() {
      // native submit() does not trigger this handler again
      form.submit();
    }
  });
  // if analytics.js is not loaded (e.g. blocked), return true so the form submits normally;
  // otherwise return false and let hitCallback submit it
  return !(ga.hasOwnProperty('loaded') && ga.loaded === true);
});