Australia Post Codes Connected

Posted on Mon 29 May 2023 in Python • 4 min read

Have you ever wondered what order postcodes run in? Ever wondered which postcode is your postcode's neighbour? I definitely have, so I've built a visualisation that shows exactly that, and made it interactive for everybody to play around with!

Before we get into any of the processing, let's take a look at the interactive map!

For those on mobile, make sure to press the full screen button underneath the layers button!

In [14]:
# This is to display the map at the beginning of the blog post
m
Out[14]:
[Interactive map]

For those that aren't able to see the interactive version, here's a photo :)

[Image: postcodes connected]

First of all, we need a dataset of every postcode in Australia with a latitude/longitude for each, so that we can process and plot it. A big thank you to Matthew Proctor, who has already done this data gathering for us!

Now let's look at which fields come with this dataset.

In [6]:
import geopandas as gpd
import pandas as pd

# Load the postcode dataset and preview the first five rows and six columns
df = pd.read_csv('../data/australian_postcodes.csv')
df.iloc[:5, :6]
Out[6]:
id postcode locality state long lat
0 230 200 ANU ACT 149.11900 -35.277700
1 21820 200 Australian National University ACT 149.11890 -35.277700
2 232 800 DARWIN NT 130.83668 -12.458684
3 24049 800 DARWIN CITY NT 130.83668 -12.458684
4 233 801 DARWIN NT 130.83668 -12.458684

Looks like a single postcode can have several rows (one per locality), so before we do any cleaning, let's plot the data to see its geographic extent.

In [7]:
gdf = gpd.GeoDataFrame(
    df, geometry=gpd.points_from_xy(df.long, df.lat), crs="EPSG:4326"
)
gdf.plot()
Out[7]:
<Axes: >

Woah, it looks like the data even includes coordinates for all the territories of Australia (there are 10 of them!). Looking at the preview above, there are also lots of duplicate locations and postcodes, so let's clean these up.

We do this in two steps:

  1. Remove any duplicate postcodes (we keep the first)
  2. Remove any duplicate geometries (keeping the first again)

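This should ensure that we have unique postcodes, with unique geometries. Note that the second step drops any postcode whose coordinates match an earlier one (in the preview above, 801 has the same coordinates as 800).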

In [8]:
# Keep the first row for each postcode, then the first row for each geometry
gdf = gdf.drop_duplicates(subset=['postcode']).drop_duplicates(subset=['geometry'])
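
To see what the two steps do in practice, here's a minimal sketch using just the five rows previewed earlier. It uses plain pandas, with the long/lat pair standing in for the geometry column (an assumption made to keep the example small):

```python
import pandas as pd

# The five rows shown in the dataset preview
rows = pd.DataFrame(
    {
        "postcode": [200, 200, 800, 800, 801],
        "locality": [
            "ANU",
            "Australian National University",
            "DARWIN",
            "DARWIN CITY",
            "DARWIN",
        ],
        "long": [149.11900, 149.11890, 130.83668, 130.83668, 130.83668],
        "lat": [-35.277700, -35.277700, -12.458684, -12.458684, -12.458684],
    }
)

# Step 1: keep the first row per postcode
# Step 2: keep the first row per coordinate pair (stand-in for geometry)
deduped = rows.drop_duplicates(subset=["postcode"]).drop_duplicates(
    subset=["long", "lat"]
)

print(deduped["postcode"].tolist())  # [200, 800]
```

Postcode 801 disappears because it shares its coordinates with 800, so the cleaned dataset is closer to "one postcode per unique location" than "every postcode".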

Next up, since we aren't overly concerned about the territories of Australia, and are more interested in the mainland postcodes, let's filter our points so that only those inside a bounding box around Australia remain.

In [9]:
# http://bboxfinder.com/#-45.559737,109.346924,-10.454727,157.862549

def filter_gdf_by_bounding_box(
    gdf, minimum_longitude, minimum_latitude, maximum_longitude, maximum_latitude
):
    # .cx slices by coordinates: x (longitude) first, then y (latitude)
    return gdf.cx[minimum_longitude:maximum_longitude, minimum_latitude:maximum_latitude]

filtered_gdf = filter_gdf_by_bounding_box(gdf, 109.346924, -45.559737, 157.862549, -10.454727)

filtered_gdf.plot()
Out[9]:
<Axes: >
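
As a rough illustration of what the bounding box filter keeps, here's a plain-Python sketch of the same check for a single point. The helper name `in_bounding_box` is made up for this example, and the Norfolk Island coordinates are approximate:

```python
# Same bounding box as above: (min lon, min lat, max lon, max lat)
BBOX = (109.346924, -45.559737, 157.862549, -10.454727)


def in_bounding_box(longitude, latitude, bbox=BBOX):
    """Return True if the point lies inside (or on the edge of) the box."""
    min_lon, min_lat, max_lon, max_lat = bbox
    return min_lon <= longitude <= max_lon and min_lat <= latitude <= max_lat


darwin = (130.83668, -12.458684)   # from the dataset preview
norfolk_island = (167.95, -29.03)  # approximate, for illustration only

print(in_bounding_box(*darwin))          # True
print(in_bounding_box(*norfolk_island))  # False
```

Note the argument order: longitude comes first, matching the x-then-y order that `.cx` expects.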

Next up is the part everyone has been waiting for! We want to do a few things here before creating the output dataset:

  1. We want to draw a line between each pair of consecutive postcodes
  2. For each line we want to embed:
    • starting postcode & name
    • ending postcode & name
    • the state the ending postcode is in

In [10]:
from shapely.geometry import LineString

def check_geometry_valid(geometry):
    # A zero longitude or latitude is assumed to be a placeholder, not a real location
    return geometry.x != 0 and geometry.y != 0

def convert_gdf_to_linestring(gdf):
    gdf = gdf.reset_index()
    linestring_gdf = gpd.GeoDataFrame()

    for index, row in gdf.iterrows():
        current_row = row
        # The first row has no previous postcode to connect to
        if index > 0:
            previous_row = gdf.iloc[index - 1, :]

            # Ensure geometry valid
            if check_geometry_valid(current_row.geometry) and check_geometry_valid(
                previous_row.geometry
            ):
                # Create a linestring connecting the previous postcode to the current one
                linestring = LineString([previous_row.geometry, current_row.geometry])
                new_line = gpd.GeoDataFrame(
                    [
                        [
                            f"{previous_row['postcode']}-{current_row['postcode']}",
                            f"{previous_row['locality']}-{current_row['locality']}",
                            current_row["state"],
                        ]
                    ],
                    geometry=[linestring],
                    columns=["Postcodes","Localities", "State"],
                    crs="EPSG:4326",
                )
                # Build resulting dataframe line by line
                linestring_gdf = gpd.pd.concat([linestring_gdf, new_line])

    return linestring_gdf


convert_gdf_to_linestring(gdf).head()
Out[10]:
Postcodes Localities State geometry
0 200-800 ANU-DARWIN NT LINESTRING (149.11900 -35.27770, 130.83668 -12...
0 804-810 PARAP-ALAWA NT LINESTRING (130.87331 -12.42802, 130.86624 -12...
0 810-812 ALAWA-ANULA NT LINESTRING (130.86624 -12.38181, 130.89047 -12...
0 812-813 ANULA-KARAMA NT LINESTRING (130.89047 -12.39125, 130.91610 -12...
0 813-815 KARAMA-CHARLES DARWIN UNIVERSITY NT LINESTRING (130.91610 -12.40478, 130.86900 -12...
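
Concatenating one single-row GeoDataFrame per segment works, but it gets slower as the number of postcodes grows. As an alternative sketch (not what this post uses), the pairing step can be done by zipping the rows with themselves offset by one, then building the table in one go. Plain pandas is used here with coordinate tuples in place of real geometries; each `coords` pair could be passed to `LineString` afterwards. The four rows are taken from the output above, and the zero-coordinate check is left out for brevity:

```python
import pandas as pd

# Four consecutive postcodes taken from the output above
points = pd.DataFrame(
    {
        "postcode": [804, 810, 812, 813],
        "locality": ["PARAP", "ALAWA", "ANULA", "KARAMA"],
        "state": ["NT", "NT", "NT", "NT"],
        "long": [130.87331, 130.86624, 130.89047, 130.91610],
        "lat": [-12.42802, -12.38181, -12.39125, -12.40478],
    }
)

# Pair every row with the row that follows it
pairs = zip(
    points.itertuples(index=False),
    points.iloc[1:].itertuples(index=False),
)

# Build all segments as records first, then create the table once
segments = pd.DataFrame(
    [
        {
            "Postcodes": f"{prev.postcode}-{curr.postcode}",
            "Localities": f"{prev.locality}-{curr.locality}",
            "State": curr.state,
            "coords": [(prev.long, prev.lat), (curr.long, curr.lat)],
        }
        for prev, curr in pairs
    ]
)

print(segments["Postcodes"].tolist())  # ['804-810', '810-812', '812-813']
```

Building the list of records first means the table is only created once, and the resulting index runs 0, 1, 2 rather than repeating 0 for every row.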

Now we should be able to plot this, coloured by state, and see how the postcodes connect!

In [11]:
convert_gdf_to_linestring(filtered_gdf).plot('State')
Out[11]:
<Axes: >

Finally, let's use folium to make this interactive.

In [13]:
import folium
from folium import plugins
import random

# Create an interactive Map instance
m = folium.Map(location=[-27, 135], zoom_start=4, control_scale=True)

# Build the line segments connecting consecutive postcodes
converted_gdf = convert_gdf_to_linestring(filtered_gdf)

# Generate a random colour for each state
states = converted_gdf["State"].unique()
state_colours = {state: f"#{random.randint(0, 0xFFFFFF):06x}" for state in states}


def style_function(feature):
    state = feature["properties"]["State"]
    color = state_colours.get(state, "blue")
    return {"color": color, "fillColor": color}


# Add one GeoJSON layer per state so that each state can be toggled separately
for state in states:
    folium.GeoJson(
        converted_gdf[converted_gdf["State"] == state].to_json(),
        name=state,
        style_function=style_function,
        tooltip=folium.GeoJsonTooltip(fields=["Postcodes", "Localities", "State"]),
    ).add_to(m)

# Add a layer control to toggle each state's layer, plus a full screen button
folium.LayerControl().add_to(m)
plugins.Fullscreen(
    position="topright",
    title="FULL SCREEN ON",
    title_cancel="FULL SCREEN OFF",
    force_separate_button=True,
).add_to(m)
Out[13]:
<folium.plugins.fullscreen.Fullscreen at 0x167fd3f90>
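
One thing to be aware of is that `random.randint` picks new colours on every run, and two states can end up with very similar ones. If stable colours are preferred, a small fixed palette is one option. The palette values and the `assign_state_colours` helper below are assumptions for illustration, not part of the original notebook:

```python
# Hypothetical fixed palette of visually distinct hex colours
PALETTE = [
    "#e41a1c",
    "#377eb8",
    "#4daf4a",
    "#984ea3",
    "#ff7f00",
    "#a65628",
    "#f781bf",
    "#999999",
]


def assign_state_colours(states):
    """Map each state to a palette colour, in alphabetical order of state."""
    return {
        state: PALETTE[i % len(PALETTE)]
        for i, state in enumerate(sorted(states))
    }


state_colours = assign_state_colours(["NT", "ACT", "NSW"])
print(state_colours["ACT"])  # #e41a1c
```

The resulting dictionary can be used wherever `state_colours` is looked up in `style_function`.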