washed_pfafhydrocode_example

Pfafhydrocode¶

A python module to decode the hydrological Pfafstetter Coding System. The module is published on github.com/steinmnn/pfafhydrocode and is a major part of my Watershed Map Project.

It contains the functions upstream and downstream, which help to identify watersheds or affected areas of river manipulations. With the optional parameter oddOrZero=True the modified version of the Pfaffstetter code used in the HydroBASINS dataset can be decoded.

In the following I present the workflow for decoding the HydroBASINS and get the watershed of any given point on earth. However, there are some issues with Pfafstetter levels greater than 8. As for now, I will present a dirty workaround for this. Have fun reading!

Workflow¶

First we clone the repo with git clone https://github.com/steinmnn/pfafhydrocode and import the module and a couple of other packages in our script:

import geopandas as gpd
import matplotlib.pyplot as plt
from matplotlib.patches import Patch

from pfafhydrocode import pfafhydrocode as pc

Let us load the example dataset within the data/hybas_eu_lev00_v1c_rhine.csv file from the repo which contains a subset of the HydroBASINS dataset (Rhine basin). Then, we select a shed on the Pfaffstetter level 8.

df = gpd.read_file('pfafhydrocode/data/hybas_eu_lev00_v1c_rhine.csv')

#plotting
fig, axs = plt.subplots(1,2, figsize=(12,5))
axs[0].set_title('Subset (data/hybas_eu_lev00_v1c_rhine.csv)')
df.plot(ax=axs[0])
axs[1].set_title('Selected shed (PFAF_8=23267093)')
df.plot(ax=axs[1])
df[df.PFAF_8 == '23267093'].plot(ax=axs[1], color='red')
axs[1].set_ylim(46.5,48.5)
axs[1].set_xlim(6.33,9)

(6.33, 9.0)

Let us apply pc.downstream() and pc.upstream() once on Pfafstetter level 8 and once on 12. Let us use a fast list comprehension for that.

# Pfafstetter level 8
a = df.loc[df.PFAF_8 == '23267093','PFAF_8'].iat[0]
mask = [pc.downstream(a,b,oddOrZero=True) for b in df['PFAF_8']]
df_dwnstr = df[mask]
mask = [pc.upstream(a,b,oddOrZero=True) for b in df['PFAF_8']]
df_upstr = df[mask]

# Pfafstetter level 12
a12 = df.loc[df.PFAF_12 == '232670930000','PFAF_12'].iat[0]
mask = [pc.downstream(a12,b,oddOrZero=True) for b in df['PFAF_12']]
df_dwnstr12 = df[mask]
mask = [pc.upstream(a12,b,oddOrZero=True) for b in df['PFAF_12']]
df_upstr12 = df[mask]

fig

Level 8 on the left works as expected.

Issue: The missing sheds on levels > 8¶

Level 12 on the right does not contain all sheds, as you can see from the interruptions along the orange course of the river. Let us check what happens if we take a missing shed as initial.

# Pfafstetter level 12 with missing shed
a12 = df.loc[df.PFAF_12 == '232670913400','PFAF_12'].iat[0]
mask = [pc.downstream(a12,b,oddOrZero=True) for b in df['PFAF_12']]
df_dwnstr12 = df[mask]
mask = [pc.upstream(a12,b,oddOrZero=True) for b in df['PFAF_12']]
df_upstr12 = df[mask]

fig

As you can see, the watershed area could not be determined and only the initial shed is found.

I could not wrap my head around why this happens and similar packages like HydroCode and pfaf_decode have similar issues. In the following I present a dirty workaround to fix this issue.

Workaround: determine watershed with missing initial shed¶

Basically, if the resulting df_downstr contains only one shed, we recalculate with the next downstream basin. With the NEXT_DOWN attribute which refers to the unique HYBAS_ID we can identify the next downstream basin and perform the method on them. After that we remove this next downstream basin again.

a12 = df.loc[df.PFAF_12 == '232670913400','PFAF_12'].iat[0]
mask = [pc.downstream(a12,b,oddOrZero=True) for b in df['PFAF_12']]
df_dwnstr12 = df[mask]

if (len(df_dwnstr12) == 1):
    print('Warning! Only 1 shed is found, so make dirty hack and try with next down.')
    id_nextdown = df.loc[df.PFAF_12 == a12,'NEXT_DOWN'].iat[0]
    a_nextdown = df.loc[df.HYBAS_ID == id_nextdown, 'PFAF_12'].iat[0]
    
    mask = [pc.downstream(a_nextdown,b,oddOrZero=True) for b in df['PFAF_12']]
    df_dwnstr12_hack = df[mask]

    if len(df_dwnstr12) != 2: #if not just 2 sheds
        #new watershed without nextdown shed
        df_dwnstr12 = df_dwnstr12_hack[df_dwnstr12_hack.PFAF_12 != a_nextdown]
    else:
        print('Warning! Dirty hack failed, watershed only contains inital shed.')

Warning! Only 1 shed is found, so make dirty hack and try with next down.

fig

Conclusion¶

In this and most cases this works totally fine and the watershed is determined correct. Unfortunately, this sometimes create bigger sheds as the next down initial shed is sometimes downstream of multiple tributary streams. But this is rare phenomenon.