Jack McKew's Blog

Define Functions Iteratively With Python

Posted by Jack McKew on Fri 01 January 2021 in Python • Tagged with python, software-development • 4 min read

An interesting problem came up recently: there was a piece of code absolutely full of the same function calls over and over again, meaning that if anything ever needed to change, it would have to be changed in over 500 places. Not ideal. Thoughts go back to single responsibility, and don't …

Differential Privacy

Posted by Jack McKew on Fri 18 December 2020 in Data Science • Tagged with datascience, python • 4 min read

It's quite clear in today's age that the biggest companies in the world make most of their profits from harvesting and productionalising their users' data. With privacy becoming more and more of a concern in everyday life as we become more connected, it's almost becoming a human right for our …

Property Based Testing in Python

Posted by Jack McKew on Fri 16 October 2020 in Python • Tagged with python, software-development • 4 min read

Building software can be challenging, but maintaining it after that initial build can be even more so. Being able to test the software to verify that it behaves as expected is crucial in building robust applications that users depend upon; being able to automate this testing is even …

Shallow vs Deep Copy in Python

Posted by Jack McKew on Fri 09 October 2020 in Data Science • Tagged with datascience, python • 4 min read

One of the most crucial parts of any programming language is maintaining variables. We create, modify, compare and delete variables to build more complex systems that eventually make up the software we use. This is typically done by using the = operator (e.g. x = 5 …

Types of Averages (Means)

Posted by Jack McKew on Fri 21 August 2020 in Data Science • Tagged with datascience, python • 4 min read

The most common analytical task is to take a bunch of numbers in a dataset and summarise them with fewer numbers, preferably a single number. Enter the 'average': sum all the numbers and divide by the count of the numbers. In mathematical terms this is known as the 'arithmetic mean', and …

Dataclasses vs Attrs vs Pydantic

Posted by Jack McKew on Fri 07 August 2020 in Data Science • Tagged with datascience, python • 6 min read

Python 3.7 introduced dataclasses, a handy decorator that can make creating classes much easier and more seamless. This post compares a regular class, a 'dataclass' and a class using attrs. Dataclasses were based on attrs, which is a Python package that also aims to make creating …

Generators in Python

Posted by Jack McKew on Thu 30 July 2020 in Data Science • Tagged with datascience, python • 4 min read

Generators are a special type of function in Python that let you 'lazy load' data; a function becomes a generator when it uses the yield statement. Lazy loading is when you access just the portion of a data set that you are interested in (e.g. the part you are working with), as …

Street Suffix Analysis & Colouring with Python

Posted by Jack McKew on Fri 24 July 2020 in Data Science • Tagged with datascience, python • 5 min read

Street Suffix Visualisation with Python

Ever thought about how roads and streets are named where you live? How many are roads versus how many are streets? Is there a specific pattern to it where you live or is it just random? This post is going to go into how to …

Sentiment Analysis & Text Cleaning in Python with Vader

Posted by Jack McKew on Fri 17 July 2020 in Data Science • Tagged with datascience, python • 8 min read

Sentiment Analysis in Python with Vader

Sentiment analysis is the interpretation and classification of emotions (positive, negative and neutral) within text data using text analysis techniques. Essentially, it tries to judge the amount of emotion in the written words and determine what type of emotion it is. In this post we'll go into how …

How Pandas_Alive was Made

Posted by Jack McKew on Fri 26 June 2020 in Python • Tagged with Python • 5 min read

Pandas-Alive is an open source Python package for making animated charts from Pandas dataframes. This project was first inspired by a very specific COVID-19 visualisation, so I set out to make this visualisation a reality.

This visualisation consisted of a bar chart race showing regions, a line chart showing new …
