Single Source Of Truth

The biggest lesson I’ve learnt in my short time in analytics is to stick vigilantly to a single source of truth. Every extract should be run once, cleansed once, and then put into production. There should not be multiple extracts for the same ‘theme’, as this only confuses other people and yourself (at least in my experience). Strive for simplicity; having one go-to source achieves that. [Read More]

Open Source Data Analytics

Recently, I’ve been exploring other programming languages and ideas, in particular open source data analytics and d3. This space has really broadened my mind about what can be achieved outside the boundaries of SAS in data mining, especially with zero-installation modules and applications. Python has been my main weapon: modules such as pyodbc, pyper, and networkx have allowed me to extract data (pyodbc), perform analysis (pyper/R), and visualize it effectively (networkx). [Read More]

Prototypes

Too often, prototype creep occurs. Prototypes end up being the production version, regardless of whether that is appropriate. “We don’t have enough time” is the most common excuse for not creating a production copy. However, we really should plan accordingly. There are many reasons for creating production versions from scratch. One being: once in production, a prototype will never die. This is why “we don’t have enough time. [Read More]