About

My name is Mark, and Ex machina, data is my website for publishing my data analytics projects.  I like writing about these projects and hope to showcase Jupyter notebooks for them on GitHub at some point.  The last time I had a blog was in the days of Joomla 1.0 (I had barely started secondary school), so please forgive me if the website feels like a work in progress!

Economics is my academic background, and my interest in data analytics arose mainly from seeking to model real-world problems and from applications of econometrics.  I have experience with software such as Stata and SPSS for econometric modelling.  However, since leaving university I have not had access to expensive proprietary software, so I spent time self-studying SQL and Python to give myself a broad analytics toolkit that is free and draws primarily on open-source software.  I also have experience with data visualisation software such as Tableau, and I am keen to piece together many tools for various purposes, e.g. data cleaning, exploration and visualisation.

It is fascinating to discover new approaches to problem solving and apply them to unstructured inputs, whether that involves adding a new library to my Python distribution or learning new concepts.

I like multidisciplinary projects and enjoy learning from people with different skills, so it would be great to hear from you about your experience with analytics.

 

I named my website Ex machina, data (Data from the machine) as it encapsulates the modernity of data analytics and draws from my perspectives on data:

  • Machine learning is currently at the forefront of analytics and is revolutionising human activity.  I am passionate about open education and one of the most enjoyable courses I have taken was Andrew Ng’s course on machine learning.  As it happens, I was writing this page not long after reading a news article on how machine learning could bolster open education.
  • I try to be sceptical when doing data analysis and other research.  Even in academic environments where researchers have a solid grounding in statistical theory, issues such as publication bias are problematic and research projects can operate as if in a silo.  Data can tell us a lot about a topic, but it can be tempting to keep fine-tuning a model to justify the effort already invested.  When analytical methods are used, it is important to avoid a deus ex machina that sweeps away all the problems of a data pipeline or a theory:
    • As an economist, I am all too aware that theories and research can rely on a large web of assumptions.  Quite often, analysts can be wedded to unrealistic assumptions or methods that bring caveats.
    • Betting is an area of interest for me, particularly in politics and sport.  It is astonishing to think about how wrong the ‘conventional wisdom’ was in predicting events and phenomena between 2015 and 2017 in particular, e.g. UK elections, Leicester City, the US presidential election.  As quoted in the film The Big Short: “It ain’t what you don’t know that gets you into trouble. It’s what you know for sure that just ain’t so”.
  • Data can be unwieldy and messy.  Some languages, e.g. Latin, do not have a fixed word order, which enables all sorts of literary devices (e.g. chiasmus).  ‘Ex machina, data’ twists the syntax by placing ‘data’ at the end of the phrase, in order to draw attention to that word.  With data analysis, it is so rewarding to start with a messy dataset and work on it to gain insights.

 

The header for my blog is adapted from Pingiivi under CC BY 2.0.  In time, I may produce my own web graphics.